Wanxin Li
Roseseed

Repository



Overview
The usual way of defining a thread behavior in cases where performance is
critical is in terms of a collection of n+1 files containing code in the C
programming language. In particular, there are n "module" files with a ".c"
suffix that contain the bodies of C functions and procedures, and one additional
shared file with a ".h" suffix that contains definitions of C global variables
and types that are used by each of the module files. Roseseed is a system that
makes it possible to define performance critical thread behaviors in another
way. First, the ".h" file is replaced with a relational database schema that
abstracts all data relevant to the thread behavior. This now includes all local
heap-memory data with the low level interface defined by the ".h" file. And
second, each of the ".c" module files is replaced with a ".sql" version in which
all data manipulation in the bodies of C functions and procedures is re-coded by
using a variant of the static embedded SQL protocol.
Roseseed is itself a collection of performance critical thread behavior
definitions that we call atomic tools.  Thus, each atomic tool comprising
Roseseed will ultimately be coded in this fashion and serve as part of the
benchmark for evaluating Roseseed. (Of course, this raises the issue of how one
bootstraps the initial Roseseed implementation; more on this below.)
SQLPtoSQL
Here, we introduce the abstract relational model (ARM) and its SQLP interface.
The former is a simple extension to the relational model (RM) and the latter
is a simple extension to SQL that admits attribute paths as a shorthand for a
sequence of unary foreign key joins.
The first atomic tool comprising Roseseed, called SQLPtoSQL, translates SQLP
queries over a given ARM schema S to an equivalent formulation in SQL over a
corresponding RM schema T. The correspondence between S and T is defined by a
so-called referring expression type assignment (RET) for each table defined in
S. (More on this below.)
(outline for the bootstraping coding of SQLPtoSQL)
We have started to implement a bootstraping version of SQPtoSQP that uses
Flex and Bison to accomplish all token scanning and parsing for SQLP.
To run the SQLP parser, run make then run ./SQLPParser. An example of a
test input is
select s.name from student s where s.name = "John" and s.num = 10;.
Currently a syntax error will be reported when the semi-colon is read, but the
parse tree that is output is correct.
The grammar is located in SQLPGrammar.y, and non-keyword types (such as
INTEGER, IDENTIFIER, STRING) are defined in SQLPScanner.l. util.c
contains the functions needed to build the con cell structure.
For parsing any general string into a tree of cons_cells, run make then run
./Parser. An example of a test input is
PROJ (AS (SPCOL (VAR (s) PF (ATTR (name)))) ATOM (TABLE (student) VAR (s))).
Every clause encosed in brackets is nested within a child cons_cell, and any
two adjacent terms are connected via the next pointer.
NOTE OF THINGS TO BE CHANGED/CAREFUL ABOUT:
In SQLPGrammar.y currently we reuse create_rename_operator in select_list
section, however this needs to be changed in the future as this is incorrect.
Instead this should be used when we rename what is selected, for example
select s.t as st ...
should become something like
PROJ (RENAME (s.t (st)) ....
Currently we don't support this type of query with as in it even though it's
valid.
Also the Makefile only has two commands, make and make clean. make will
create both the SQLPParser and Parser programs to do what was described above.
Maybe this should be changed in the future to be separate but we'll leave that
discussion for the future.
Added in Spring 2019:
Entity descriptions for algebra in lisp form. There are two sorts of operators:
(1) operators producing normal tables with an ordered finite sequence of attribute
names, and (2) operators producing “column tables”, where tuples are a possibly
infinite set of “SQLP column name”/“constant” pairs.
 (1)
( select   (as-list  … )  )
( union   (query-list  … ) )

	distinct
	all

	( limit  )
	( no-limit )

	( as   )
 (2)
	
	( atom   )
	( natural-join-list  … )
	( sub-query   )

	( equal   )
	( less-than   )
	( less-than-or-equal   )
	( exists  )
	( not  )
	( and-list  … )
	( or-list  … )

	
	( integer  )
	( string  )

	( column  ( attribute-name-list  … ) )