Skip to content
Snippets Groups Projects
Forked from Grant Weddell / LDI
This fork has diverged from the upstream repository.
Wanxin Li's avatar
Wanxin Li authored
b82756ee
History

Overview

The usual way of defining a thread behavior in cases where performance is critical is in terms of a collection of n+1 files containing code in the C programming language. In particular, there are n "module" files with a ".c" suffix that contain the bodies of C functions and procedures, and one additional shared file with a ".h" suffix that contains definitions of C global variables and types that are used by each of the module files. Roseseed is a system that makes it possible to define performance critical thread behaviors in another way. First, the ".h" file is replaced with a relational database schema that abstracts all data relevant to the thread behavior. This now includes all local heap-memory data with the low level interface defined by the ".h" file. And second, each of the ".c" module files is replaced with a ".sql" version in which all data manipulation in the bodies of C functions and procedures is re-coded by using a variant of the static embedded SQL protocol.

Roseseed is itself a collection of performance critical thread behavior definitions that we call atomic tools. Thus, each atomic tool comprising Roseseed will ultimately be coded in this fashion and serve as part of the benchmark for evaluating Roseseed. (Of course, this raises the issue of how one bootstraps the initial Roseseed implementation; more on this below.)

SQLPtoSQL

Here, we introduce the abstract relational model (ARM) and its SQLP interface. The former is a simple extension to the relational model (RM) and the latter is a simple extension to SQL that admits attribute paths as a shorthand for a sequence of unary foreign key joins.

The first atomic tool comprising Roseseed, called SQLPtoSQL, translates SQLP queries over a given ARM schema S to an equivalent formulation in SQL over a corresponding RM schema T. The correspondence between S and T is defined by a so-called referring expression type assignment (RET) for each table defined in S. (More on this below.)

(outline for the bootstraping coding of SQLPtoSQL)

We have started to implement a bootstraping version of SQPtoSQP that uses Flex and Bison to accomplish all token scanning and parsing for SQLP.

To run the SQLP parser, run make then run ./SQLPParser. An example of a test input is

select s.name from student s where s.name = "John" and s.num = 10;.

Currently a syntax error will be reported when the semi-colon is read, but the parse tree that is output is correct.

The grammar is located in SQLPGrammar.y, and non-keyword types (such as INTEGER, IDENTIFIER, STRING) are defined in SQLPScanner.l. util.c contains the functions needed to build the con cell structure.

For parsing any general string into a tree of cons_cells, run make then run ./Parser. An example of a test input is

PROJ (AS (SPCOL (VAR (s) PF (ATTR (name)))) ATOM (TABLE (student) VAR (s))).

Every clause encosed in brackets is nested within a child cons_cell, and any two adjacent terms are connected via the next pointer.

NOTE OF THINGS TO BE CHANGED/CAREFUL ABOUT:

In SQLPGrammar.y currently we reuse create_rename_operator in select_list section, however this needs to be changed in the future as this is incorrect. Instead this should be used when we rename what is selected, for example

select s.t as st ...

should become something like

PROJ (RENAME (s.t (st)) ....

Currently we don't support this type of query with as in it even though it's valid.

Also the Makefile only has two commands, make and make clean. make will create both the SQLPParser and Parser programs to do what was described above. Maybe this should be changed in the future to be separate but we'll leave that discussion for the future.

Added in Spring 2019:

Entity descriptions for algebra in lisp form. There are two sorts of operators: (1) operators producing normal tables with an ordered finite sequence of attribute names, and (2) operators producing “column tables”, where tuples are a possibly infinite set of “SQLP column name”/“constant” pairs.

(1) ( select (as-list … ) ) ( union (query-list … ) )

distinct all ( limit ) ( no-limit ) ( as ) (2) ( atom ) ( natural-join-list … ) ( sub-query ) ( equal ) ( less-than ) ( less-than-or-equal ) ( exists ) ( not ) ( and-list … ) ( or-list … ) ( integer ) ( string ) ( column ( attribute-name-list … ) )