Computer Science 3510
Summer 2001
Programming Assignment 1

Due date:
First solution: Thursday, June 28
Second solution: Friday, July 6

START EARLY ON THIS ASSIGNMENT. DO NOT WAIT

If you experience difficulty, ask for help early. Do not wait until the last minute to ask for help.

Try to turn in a high quality solution for your first version, so that the second version is just touch up. Do not skip the first version.

This is mainly an exercise in the use of dynamic memory allocation. It also uses a cgi program to process a web page query.


Hash Tables

Accompanying this assignment is an implementation of hash tables. This is not an application, but is a library module that might be used in many applications. You can find the implementation in directory /export/stu/3510Abrahamson/prog1 on the Sun or Dell machines (under Unix) in Austin 320, in files table.cc and table.h. Copy those files so that you can work on them. Those files are also available by following these links.

  • table.cc
  • table.h
  • Hash tables are described in file table.cc. Read that description.


    Hash Table Modification 1.

    Hash tables use arrays to hold data. In table.cc, the array is called the cell array, and is pointed to by a pointer called cells.

    The implementation in table.cc has the difficulty that it has a fixed size array, and the table can fill up. Your assignment is to modify the table implementation so that the table does not fill up.

    It is not acceptable merely to increase the fixed size to some very large value. That would be wasteful of memory when few things are actually put into the table. Instead, your new implementation must reallocate the table when it runs out of memory. Do this by adding another field, or variable, to the table type, indicating the physical size of the array --- that is, how many slots the cell array currently has. When the cell array becomes nearly full, allocate a new, larger array, and copy everything that was in the old array into the new array. Be sure to update the field that indicates the physical size of the array of cells.

    Watch out. In a hash table, the position of an item is computed based on a hash function, and the position computation depends on the physical size of the hash table. So when the physical size changes, the positions of entries also change. To copy the old values into the new array, you will need to do hashInsert calls, so that the positions are set correctly. Think about this.

    When you allocate a new array, don't get just one more cell. If you do, then a growing hash table becomes very expensive, since each insertion needs to make a copy of the entire table. One idea is to double the size of the table when it is reallocated. At any rate, increase the size by some factor that is greater than one.

    You will need to modify function hashFull. Do not delete it, since it is part of the data type, even if now a useless one. (It is possible, as far as you know, that someone has already written a program that uses these hash tables, and that employs function hashFull to ask whether the table is full. You need for such a program to continue to work with the new table implementation.) Your hash tables never fill up. Think carefully about the correct implementation of hashFull. Read the contract for hashFull to see what its advertised behavior is. How might some other program be using hashFull?

    Hash Table Modification 2.

    A second difficulty is that this implementation allows the table to become almost completely full. A hash table of the kind implemented here that is almost completely full is very inefficient to search. It is better to keep at least 1/4 of the cells empty. Modify the implementation to reallocate the table when it becomes 3/4 full. Be careful: if you use integer arithmetic, you might get incorrect results. For example, in C++, 3/4 is 0. How can you test whether x > (3/4)y using only integer arithmetic? Can you rearrange this inequality so that it does not require division?


    A CGI Program

    World wide web pages are not necessarily fixed. They can be computed on the fly, either on the client computer or on the server computer. CGI is a method of allowing the server to build dynamic web pages, so it is a form of server-side computation.

    File form.html is a form that can be filled out. When the submit button is pushed, the values that were put into the form are passed to a cgi program called formquery.cgi. There is nothing really special about a cgi program; it is an ordinary program. The information on the form is sent to the cgi program as its standard input. For example, if you enter a name of Hilbert and a password of Topology, and push the submit button, then the standard input to formquery.cgi will contain

        LASTNAME=Hilbert&PASSWORD=Topology
    
    Program formquery.cgi is supposed to look up information in a file called grades. The lines are of the following form.
        name password info
        name password info
        ...
    
    For example, file grades might contain the following.
        Hilbert     Topology A
        Grothendiek xxxxx    B
        Dahl        r:z$$    I am not finished grading your work.  Sorry.
    

    The cgi program works by writing, to the standard output, a preamble describing what the form of the output is, followed by the page to show. To cause the web browser to display a new Html page, the preamble can be

    Content-type = text/html
    
    
    The Html page to be displayed follows the preamble in the standard output.

    What to Do

    Get file formquery.cc Also get all of the html files there. They can also be obtained from the links below.
  • formquery.cc
  • formquery.html
  • form.html
  • badinfo.html
  • badname.html
  • badpassword.html
  • Most of formquery.cc is written for you. You need to write the main program. Write it. You will need to read the contracts of the prewritten functions. The main program should not be very long, and is very simple. Roughly 25 to 30 noncomment lines should suffice. That does not mean that you should not provide any comments! I am just not counting them.

    Use the functions provided. Read in the information from the web browser into a hash table (one or two lines). Then read in the information from the grades file (one or two lines). Then get the name and password from the hash table where the information from the web browser was put. (This involves using the hash table functions.) Then look up the name in the hash table that holds information from the grades file. Once you have the information, you can check the password. Finally, assuming all is looking right, you write the result form to the standard output (by calling build_form). You will need to check for various conditions, such as a bad name or a bad password. See the html forms that are provided for handling those situations.


    Testing

    First test the hash tables. Do not just make the modifications to the table.cc and table.h file and then start on the cgi program without testing the table implementation!!!!!!!!!!!!!!!!

    Write a program to do the testing of the hash table implementation. Test things so that you think all aspects of the implementation have been tested. Do not be lulled into a false sense of security just because your program passes a few very simple tests. In the past, I have had assignments turned in whose tests did not even put enough things in the hash table to trigger a reallocation. As a result, the newly added code was never even run during the tests. Do not make that mistake. How can you be absolutely sure that you are running your new code?

    After you are satisfied that the table implementation is working correctly, work on the cgi program. Compile it so that the executable code is put in file formquery.cgi. For example, a short compile line would be

      g++ -g -Wall -o formquery.cgi formquery.cc table.cc
    
    (Of course, using a makefile is easier than typing this over and over.)

    You can test your program without using a web browser. Create your own grade file. Call it grades. The assignment says that the grade file will be called grades. Write some data files that are what the web browser might provide to your program. Run the program, redirecting the standard input to those files. For example, if you create a data file called testdata containing

    LASTNAME=Hilbert&PASSWORD=Topology
    
    then run your program using command
    formquery.cgi <testdata
    
    The program should print the Html page that would be sent to the browser for display. It will print the html source; it will not display the page using a browser.

    When your cgi program works to your satisfaction, try it with a browser. Do the following.

    1. Compile your program on one of the Sparc machines. This is necessary, because you will be running it on darla, which is a Sparc machine.

    2. Create a directory called public_html in your home directory. Create a subdirectory of that directory called prog1, or whatever else you want to call it. Put, in that directory, all of the html files that are part of this program, as well as formquery.cgi and the grades file. Make sure the permissions are correct. (formquery.cgi needs to be world executable, and the html files need to be world readable. Use ls -l to see the permissions, and chmod to change them. A command that will work is

        chmod a+r *
       
      See the unix notes on the course web page.)

    3. Start a web browser, and visit page

        http://darla.csci.ecu.edu/~myid/prog1/formquery.html
      
      where you should replace myid by your login id.

    4. Try the form.

    (Note: By placing the grades file in the public_html directory, you lose control over it, since it will be readable by everybody. Anybody can get it directly using a web browser. That is a security issue that we do not attempt to deal with here. There are ways to solve it.)


    Turning in your solution

    Write your first version. It should, ideally, be the full program, and it will be graded on the assumption that you should be turning in the full, completed program. If you have not completed the whole thing, turn in what you have done. After receiving feedback, write the second version, also implementing the full assignment. See the syllabus for how to turn in the assignment.

    I will compile and test your solution. I will also test your hash table implementation separately with another tester. Rest assured that I will insert many things into the table, enough to cause more than one reallocation. I will create more than one hash table as well, so your implementation must handle each hash table independently of any others.

    Be sure that your solution is prepared to link with my tester. It must, therefore, have the same function prototypes as the original implementation. If you change the function prototypes, it will be inconsistent with what my test program expects.

    Note: Your solution is expected to compile. Solutions that do not compile will receive no credit, but will receive some feeback for the first version.