ICS 421 - Spring 2010 - Programming Assignment 1


You may work in a team of two students, but each student must make a submission. You are encouraged to engage in general discussions with other teams regarding the assignment, but the specific details of a solution, including the solution itself, must always be the team's own work. Teammates may submit the same code.

Part 1: Distributed DDL Processor (60 pts)

Write a program runDDL that executes a given DDL statement on a cluster of computers, each running an instance of a DBMS. The input to runDDL consists of two filenames (stored in variables clustercfg and ddlfile) passed in as command-line arguments. The file clustercfg contains access information for each computer in the cluster. The file ddlfile contains the DDL statement to be executed, terminated by a semicolon. The runDDL program will execute the same DDL statement on the database instance of each computer in the cluster concurrently using threads. One thread should be spawned for each computer in the cluster.

runDDL should report success or failure of executing the DDL for each node in the cluster to standard output.
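The one-thread-per-node requirement and the per-node report can be sketched as follows. This is only a minimal illustration, not a reference solution: it uses Python's standard sqlite3 module as a stand-in for the DBMS, and the names run_on_node, run_ddl, the nodes dictionary (node name to database file path), and the printed message format are all hypothetical, since the clustercfg format and the database driver are not assumed here.

```python
import sqlite3
import threading


def run_on_node(node_name, db_path, ddl, results):
    """Execute ddl on one node and record True (success) or False (failure)."""
    try:
        # Each thread opens its own connection; connections are not shared.
        conn = sqlite3.connect(db_path)
        try:
            conn.execute(ddl)
            conn.commit()
        finally:
            conn.close()
        results[node_name] = True
    except sqlite3.Error:
        results[node_name] = False


def run_ddl(nodes, ddl):
    """Run ddl on every node concurrently, one thread per node.

    nodes maps a node name to a SQLite file path (a hypothetical
    stand-in for the access information read from clustercfg).
    """
    results = {}  # each thread writes only its own key
    threads = [
        threading.Thread(target=run_on_node, args=(name, path, ddl, results))
        for name, path in nodes.items()
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait for every node before reporting

    # Report per-node status to standard output (format is illustrative).
    for name in nodes:
        status = "success" if results.get(name) else "failed"
        print(f"[{name}]: {status}")
    return results
```

Replacing the thread creation with a plain for-loop that calls run_on_node directly gives the non-threaded version, which can be a convenient first step before converting to threads.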

You may test your program on a single computer by using different databases to simulate the multiple computers.

You may first write a non-threaded program that executes the DDL on each computer in the cluster in a for-loop and then convert the program to a multi-threaded version.

Part 2: Catalog for Distributed Tables (40 pts)

Modify your program from Part 1 so that it stores metadata about the DDL being executed in a catalog database. The access information of the catalog database will be provided in the clustercfg file as well. The metadata should be stored in a table


where

If this table does not exist in the catalog database, your program should create it. The field tname should be obtained by a simple parse of the DDL for the keyword TABLE, which precedes the table name (we will switch to a more sophisticated SQL parser later). The fields partmtd, partcol, partparam1, and partparam2 should be left as null for this assignment. The table should be updated only after successful execution of the DDL. For a CREATE TABLE statement, entries should be added to this table; for a DROP TABLE statement, the relevant entries should be deleted. This operation need not be multi-threaded.
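The simple parse for the keyword TABLE and the resulting catalog update might look like the sketch below. It rests on several assumptions: sqlite3 stands in for the catalog DBMS, the table name catalog_demo and the node column are hypothetical placeholders (they are not the schema required by the assignment), and only the fields named in the text (tname, partmtd, partcol, partparam1, partparam2) are taken from the description above.

```python
import re
import sqlite3

# Matches the identifier that follows the keyword TABLE, allowing an
# optional IF EXISTS / IF NOT EXISTS clause in between.
_TABLE_RE = re.compile(
    r"\bTABLE\s+(?:IF\s+(?:NOT\s+)?EXISTS\s+)?([A-Za-z_][A-Za-z0-9_]*)",
    re.IGNORECASE,
)


def parse_ddl(ddl):
    """Return (action, table_name), e.g. ("CREATE", "BOOKS"), or None."""
    match = _TABLE_RE.search(ddl)
    if match is None:
        return None
    action = ddl.strip().split(None, 1)[0].upper()
    return action, match.group(1)


def update_catalog(conn, ddl, nodes):
    """Record or remove catalog entries for the nodes where ddl succeeded.

    catalog_demo and its node column are hypothetical; the partition
    fields are left as NULL, as the assignment text requires.
    """
    parsed = parse_ddl(ddl)
    if parsed is None:
        return None
    action, tname = parsed
    conn.execute(
        "CREATE TABLE IF NOT EXISTS catalog_demo ("
        "tname TEXT, node TEXT, partmtd INTEGER, "
        "partcol TEXT, partparam1 TEXT, partparam2 TEXT)"
    )
    rows = [(tname, node) for node in nodes]
    if action == "CREATE":
        conn.executemany(
            "INSERT INTO catalog_demo (tname, node) VALUES (?, ?)", rows
        )
    elif action == "DROP":
        # Exact, case-sensitive match on tname in this sketch.
        conn.executemany(
            "DELETE FROM catalog_demo WHERE tname = ? AND node = ?", rows
        )
    conn.commit()
    return parsed
```

Passing only the nodes on which the DDL succeeded keeps the catalog consistent with the requirement that it is updated only after successful execution.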

General Requirements

Submission Procedure

Since Part 1 and Part 2 are cumulative, you should submit only one program, i.e., the final code for Part 2.

See Submission Instructions


Sample contents of a clustercfg file


Sample contents of a ddlfile file


Sample Output of `./run.sh ./cluster.cfg ./books.sql`