Computer Science 3510
Fall 1999
Programming Assignment 1

Due date:
First solution: Tuesday, Sep 14, 5:00 pm
Second solution: **Fri Oct 8, 5:00pm

START EARLY ON THIS ASSIGNMENT. DO NOT WAIT

If you experience difficulty, ask for help early. Do not wait until the last minute to ask for help.

This is mainly an exercise in the use of dynamic memory allocation. It also uses a cgi program to process a web page query.

Hash Tables

Accompanying this assignment is an implementation of hash tables. This is not an application, but is a library module that might be used in many applications. You can find the implementation in directory /export/stu/3510Abrahamson/prog1 on the Sun or Dell machines (under Linux) in Austin 320, in files table.cc and table.h. Copy those files so that you can work on them.

Hash tables are described in file table.cc.

Hash Table Modification 1.

The implementation has the difficulty that it has a fixed size table, and the table can fill up. Your assignment is to modify the table implementation so that the table does not fill up.

It is not acceptable merely to increase the fixed size to some very large value. That would be wasteful of memory when few things are actually put into the table. Instead, your new implementation must reallocate the table when it runs out of memory. Do this by adding another field to the table type, indicating the physical size of the array --- that is, how many slots the cell array currently has. When the cell array becomes nearly full, allocate a new, larger array, and copy everything that was in the old array into the new array. Be sure to update the field that indicates the physical size of the array of cells.

Watch out. In a hash table, the position of an item is computed based on a hash function, and the position computation depends on the physical size of the hash table. So when the physical size changes, the positions of entries also change. To copy the old values into the new array, you will need to do hashInsert calls, so that the positions are set correctly. Think about this.

When you allocate a new array, don't get just one more cell. If you do, then a growing hash table becomes very expensive, since each insertion needs to make a copy of the entire table. One idea is to double the size of the table when it is reallocated. At any rate, increase the size by some factor that is greater than one.

You will need to modify function hashFull. Do not delete it, since it is part of the data type, even if now a useless one. (It is possible, as far as you know, that someone has already written a program that uses these hash tables, and that employs function hashFull to ask whether the table is full. You need for such a program to continue to work with the new table implementation.) Your hash tables never fill up. Think carefully about the correct implementation of hashFull. Read the contract for hashFull to see what its advertised behavior is. How might some other program be using hashFull?

Hash Table Modification 2.

A second difficulty is that this implementation allows the table to become almost completely full. A hash table of the kind implemented here that is almost completely full is very inefficient to search. It is better to keep at least 1/4 of the cells empty. Modify the implementation to reallocate the table when it becomes 3/4 full. Be careful: if you use integer arithmetic, you might get incorrect results. For example, in C++, 3/4 is 0. How can you test whether a > (3/4) b using only integer arithmetic? Can you rearrange this inequality so that it does not require division?

A CGI Program

World wide web pages are not necessarily fixed. They can be computed on the fly, either on the client computer or on the server computer. CGI is a method of allowing the server to build dynamic web pages.

File /export/stu/3510Abrahamson/prog1/form.html creates a form that can be passed to a cgi program. When the user hits the submit button, information from the form is passed to the indicated program (formquery.cgi) for processing. The information is put in the standard input of the program. For example, if you enter a name of Hilbert and a password of Topology, then the standard input to formquery.cgi will contain

    LASTNAME=Hilbert&PASSWORD=Topology
Program formquery.cgi is supposed to look up information in a file called grades. The lines are of the following form.
    name password info
    name password info
    ...
For example, file grades might contain the following.
    Hilbert     Topology A
    Grothendiek xxxxx    B
    Dahl        r:z$$    I am not finished grading your work.  Sorry.
The cgi program works by writing, to the standard output, a preamble describing what the form of the output is, followed by the page to show. To cause the web browser to display a new Html page, the preamble can be
Content-type = text/html

The Html page to be displayed follows the preamble in the standard output. (There is additional information provided to the cgi program. Some environment variables are set. For example, environment variable REQUEST_METHOD is set to POST and variable CONTENT_LENGTH is set to the number of characters passed to the cgi program in its standard input. The values of environment variables can be obtained from function getenv. We do not use these variables in this simple program.)

What to Do

Get file formquery.cc from directory /export/stu/3510Abrahamson/prog1. Also get all of the html files there. Most of it is written for you. You need to write the main program. Write it. You will need to read the contracts of the prewritten functions. The main program should not be very long. I wrote it in 27 noncomment lines. That does not mean that you should not provide any comments! Also, do not be worried if your line count is not exactly 27. If your noncomment line count for the main program is 100, you are doing something wrong.

Testing

First test the hash tables. Do not just make the modifications to the table.cc and table.h file and then start on the cgi program without testing the table implementation!!!!!!!!!!!!!!!!

Write a program to do the testing of the hash table implementation. Test things so that you think all aspects of the implementation have been tested. Do not be lulled into a false sense of security just because your program passes a few very simple tests. In the past, I have had assignments turned in whose tests did not even put enough things in the hash table to trigger a reallocation. As a result, the newly added code was never even run during the tests. Do not make that mistake. How can you be absolutely sure that you are running your new code?

After you are satisfied that the table implementation is working correctly, work on the cgi program. Compile it so that the executable code is put in file formquery.cgi. For example, a short compile line would be

  g++ -g -Wall -o formquery.cgi formquery.cc table.cc
(Of course, using a makefile is easier than typing this over and over.)

You can test it without using a web browser. Create your own grade file. Write some data files that are what the web browser might provide to your program. Run the program, redirecting the standard input to those files. For example, if you create a data file called testdata containing

LASTNAME=Hilbert&PASSWORD=Topology
then run your program using command
formquery.cgi <testdata
The program should print the Html page that would be sent to the browser for display.

When your cgi program works to your satisfaction, try it with a browser. Create a directory called 3510_html in your home directory. Create a subdirectory of that called prog1, or whatever else you want to call it. Put, in that directory, all of the html files that are part of this program, as well as formquery.cgi and the grades file. Make sure the permissions are correct. (formquery.cgi needs to be world executable, and the html files need to be world readable. Use ls -l to see the permissions, and chmod to change them.) Now start a web browser, and visit page

  http://darla.math.ecu.edu/~myid/prog1/form.html
where you should replace myid by your login id. (Note: By placing the grades file in the 3510_html directory, you lose control over it, since it will be readable by everybody. That is a security issue that we do not attempt to deal with here.)

Turning in your solution

Do all modications for the first pass. The second pass is to correct mistakes that are pointed out in the first pass.

Email me your solution, as described in the syllabus. Attach all three files table.h, table.cc and formquery.cc. Please do not rename them. Be sure that your name is in every file. Do not attach your test program. Do not put any part of your program in the body of the message.

I will compile test your solution. I will also test your hash table implementation separately with another tester. Be sure that your solution is prepared to link with my tester. It must, therefore, have the same function prototypes as the original implementation. If you change the function prototypes, it will be inconsistent with what my test program expects.

Note: Your solution is expected to compile. Solutions that do not compile will receive no credit, but will receive some feeback.