Modify your code from Part 3 (runSQL) to support
select-from-where queries involving joins between exactly
two tables. The two tables to be joined may be distributed
tables with fragments residing on multiple nodes. The input
to runSQL consists of two filenames (stored in variables
clustercfg
and sqlfile
) passed in as commandline arguments.
The file clustercfg
contains access information for the
catalog DB. The file sqlfile
contains the SQL terminated by
a semi-colon to be executed. The join query needs to be
executed on multiple threads; However, you should develop
the sequential non-multi-threaded version first
and convert the join QEP into a
multi-threaded one.
runSQL
should output the rows retrieved to the standard
output on success or report failure.
You may assumed that the SQL queries only operate on at most two tables and do not contain any nested subqueries.
You should consider using the ANTLR compiler compiler to generate a SQL parser that you can use to extract the table names and to detect the join conditions. Only equi-joins need to be supported.
Policy on grammar sharing: You may discuss, share, and/or develop the grammar with other groups.
You may test your program on a single computer by using different databases to simulate the multiple computers.
clustercfg
file
catalog.driver=com.ibm.db2.jcc.DB2Driver
catalog.hostname=jdbc:db2://10.0.0.3:50001/mycatdb
catalog.username=db2inst1
catalog.passwd=mypasswd
localnode.driver=com.ibm.db2.jcc.DB2Driver
localnode.hostname=jdbc:db2://10.0.0.3:50001/mydb1
localnode.username=db2inst1
localnode.passwd=mypasswd
localnode
identifies the node in the cluster that received
the query and that will co-ordinate the execution of the
query.
sqlfile
file
SELECT S.sname
FROM Sailors S, Reserves R
WHERE S.sid=R.sid and R.day='2009-12-21'
;
./run3.sh ./cluster.cfg ./sailors.sql
Ramakrishnan
Silberstein
[jdbc:db2://10.0.0.3:50001/mydb2]: ./sailors.sql success.
[jdbc:db2://10.0.0.3:50001/mydb1]: ./sailors.sql failed.
It is recommended that you develop your program in the following steps
Integrate all the code from all the previous programming
assignments into one program runSQL
. Fix any defects from
previous assignments.
For the load case, if the config file contains tablename
,
runSQL
behaves like a loader and the sqlfile
(second
commandline argument) is treated as the csv file.
Otherwise, the operation is detected from the SQL statement
in sqlfile
.
Please ensure that query output is in pipe delimited format. No trailing whitespace for char/varchar data types should be printed in the query output. Floating point numbers should be printed up to 3 decimal spaces.
Write at least one testcase for each of the following functionality of your final program in Part 6. (There should be at least 4 testcases in total).
Each testcase must test a non-trivial feature or code path including exception handling. Repeating the same create table statement test with different table names is considered trivial. Your testcases need to conform to the format of the testcases in /home/db2inst1/tests/program[1-2]/*. You should name your testcase using the format:
test<function_num>-<userid>-<testnum>.{cfg,sql,csv,...}
Suppose your user name is jane and your first testcase attempts to test the create table functionality from Assignment 1, then your testcase would consist of the following files:
test1-jane-1.cfg
: config filetest1-jane-1.sql
: sql or csv filetest1-jane-1.pre
: shell script that sets up the required
tables for the testtest1-jane-1.post
: shell script that queries the
databases after the testtest1-jane-1.post.exp
: expected output of previous sql
scripttest1-jane-1.grep
: alternative to exp file. the grep file
contains a list of patterns that must be present in the
output of the program.test1-jane-1.txt
: a description of what the test is testing.Each testcase would be executed in the following sequence of shell commands:
bash $ ./test1-jane-1.pre
bash $ ./run.sh test1-jane-1.cfg test1-jane-1.sql | sort > test1-jane-1.out
bash $ ./test1-jane-1.post | sort > test1-jane-1.post.out
bash $ diff ./test1-jane-1.post.out ./test1-jane-1.post.exp
When the output to be compared is in test1-jane-1.out (eg. for the SQL queries), then the diff is performed on test1-jane-1.out and test1-jane-1.post.exp
1) Writing: submit a link to the google doc in Laulima->Assignments. Please ensure that you share your google doc with the instructor. Every individual should make a submission.
2) Code: You will submit your code by uploading your code onto the
submission server (you can also clone it from github).
Create a directory called hw3
in your home directory and
put your code in that directory. You need to ensure that
your code compiles and runs on the submission server. If you
need any additional libraries, please contact the TA to
install the libraries. Every group needs to make one
submission.
Access the submission server at IP address: 35.185.202.213
Example of test1-jane-1.pre (assume password for MySQL is XXXXXX)
#!/bin/bash
mysqladmin -u root -pXXXXXX drop lydb1
mysqladmin -u root -pXXXXXX create lydb1
mysql -u root -pXXXXXX lydb1 < test1-jane-1.pre.sql
mysql -u root -pXXXXXX lydb2 < test1-jane-1.pre.sql
Example of test1-jane-1.pre.sql
drop table XXTNXX;
Example of test1-jane-1.cfg
catalog.driver=com.mysql.jdbc.Driver
catalog.hostname=jdbc:mysql://localhost:3306/lydb1
catalog.username=root
catalog.passwd=XXXXXX
numnodes=2
node1.driver=com.mysql.jdbc.Driver
node1.hostname=jdbc:mysql://localhost:3306/lydb1
node1.username=root
node1.passwd=XXXXXX
node2.driver=com.mysql.jdbc.Driver
node2.hostname=jdbc:mysql://localhost:3306/lydb2
node2.username=root
node2.passwd=XXXXXX
Example of test1-jane-1.sql
create table XXTNXX (col1 int, col2 char(13), col3 decimal);
Example of test1-jane-1.post
#!/bin/bash
mysql -u root -pXXXXXX lydb1 < test1-jane-1.post.1.sql
mysql -u root -pXXXXXX lydb2 < test1-jane-1.post.1.sql
mysql -u root -pXXXXXX lydb1 < test1-jane-1.post.2.sql
Example of test1-jane-1.post.1.sql
select * from XXTNXX;
Example of test1-jane-1.post.2.sql
select nodedriver, nodeurl, nodeuser
from dtables
where tname='XXTNXX';
You will need to generate an expected output file that you manually check to be correct.