Computer Science 3300
Fall 2006
Section 001
Programming Assignment 2

Assigned: Monday, September 11
First version due: Thursday, September 21, 11:59pm
Second version due: Monday, October 2, 11:59pm


Read the entire assignment before you begin to work on it.

Table of contents

  1. Requirements
  2. Design issues
  3. Algorithmic issues
  4. Implementation issues and hints
  5. Testing
  6. Debug prints
  7. A plan
  8. Submitting your work
  9. Frequently asked questions
  10. Makefiles


Requirements

Write a program that takes two arguments on the command line. The first should be a string (call it str) and the second the name of a file (another string). Your program should print, on the standard output (cout), every line of the given file that contains str as a substring. Each line must be numbered, starting with line 1. For example, if your executable program is called showlines, the command

  showlines fish myfile.txt
will print each line in file myfile.txt that contains "fish". The output might be as follows.
3. The terns eat fish that they catch
10. fishermen see the terns migrating
Note that the string being sought is not required to be a full word. It can be part of a word, for example.

If the file to be searched does not exist, your program must print a message that the file does not exist, and immediatly stop.

Your program can presume that each line of text has no more than 255 characters.


Design Issues

This program must be written according to the following design guidelines. Programs that do not follow these guidlines will receive very low scores.

  1. This program must be written in two modules.

  2. One module, which must be called showlines.cpp, must contain the main program. That module must also contain a comment describing what the program does.

  3. The other module, which must be called search.cpp, must contain a function called search that takes two parameters: (1) an array pattern of characters telling the string to search for and (2) an array text of characters to search. Both arrays are assumed to be null-terminated arrays of characters. That is, there is a null character ('\0') in the array that indicates the end of the string. (The null character is not part of the string; it is only an end marker.) The search function must produce a boolean result: true if the pattern occurs in the text string, and false if not. For example, search("dog", "the dog jumped on the couch") yields true. The heading for the search function must be

       bool search(char pattern[], char text[])
    

  4. File search.cpp must contain another function that is used to help in implementing function search. That function takes three parameters: a pattern, a text and an integer k. It returns a boolean result, telling whether the pattern occurs in the text starting at index k. For example, if this function is called occursAt, then occursAt("the", "the dog jumped on the couch", 0) returns true, but occursAt("dog", "the dog jumped on the couch", 0) returns false since "dog" does not occur at the beginning of the text (index 0). But occursAt("dog", "the dog jumped on the couch", 4) returns true.

    You may have other helper functions in search.cpp. Provide a clear and concise contract for every function.

  5. There must be a third file, search.h, that contains a prototype for the search function. (The prototype is just the function heading followed by a semicolon.) Both showlines.cpp and search.cpp must include search.h. That is, they will have a line that says

    #include "search.h"
    

    File search.h must have a clear and concise contract describing what your search function does. The contract must clearly describe the role of each parameter, and the meaning of the result. The contract must not describe irrelevant things, such as the method used, or other details of the function body.

    File search.h must not contain prototypes or comments about any functions other than search. Module showlines.cpp must not include search.cpp. So, if you write
    #include "search.cpp"
    
    you are doing something wrong.

  6. The main program must use the search function from the search.cpp module to check a line of text. It cannot do the check directly itself.


Algorithmic Issues

The main program should perform a loop.

  linenumber = 1.
  Loop
    Read a line into an array.  Determine how many characters are in the line.
    If the end of file was reached then exit the loop.
    If the pattern occurs in the line then print the line, numbered by linenumber.
    Add 1 to linenumber
  End loop

The search function should also contain a loop

  k = 0
  While there are still enough characters left in the text array from index k to the end do
    If the pattern occurs in the text at index k
      return true
    Add 1 to k.
  End loop
  Return false.


Implementation Issues and hints

Implementation requirements

  1. The standard library contains a function called strstr that checks to see whether one string is contained in another. For this assignment you may not use that function, or any similar function. Do you own search. You may use library functions strlen and strcpy.

  2. Your program must not use any global variables. All communication between functions must be by parameter passing.

  3. Your program must be well indented and readable.

  4. Your program must contain debug prints, as described below.

  5. You must turn in your testers for occursAt and for search.

Implementation hints

You can get the command line parameters as follows. The heading for the main function should be

   int main(int argc, char* argv[])
Then the pattern is argv[1] and the file to read is argv[2]. Each of those is an array of characters that ends on a null character ('\0'). For example, to open the file, you can write
   ifstream inf(argv[2]);
You can test whether the file existed by asking whether the file opened correctly.
  if(inf.fail( )) 
  {
    ...
  }
tests whether the last operation on inf (including opening the file) failed.

Now you can use inf to read the file. Suppose that Line is an array of 256 characters. The following statement reads one line of text into array Line.

  inf.getline(Line, 256);

You can test whether you are at the end of the file using expression inf.eof( ). For example, you can exit a loop when at the end of the input file as follows.

  if(inf.eof( )) break;
Be careful about eof( ). It only returns true if you have tried and failed to read a character beyond the end of the file. If you are at the end of the file, but have not attempted to read something past the end, eof( ) returns true. (What were the people who designed this thinking?)

Compiling and running the program

Put your program in a separate directory, as you did for the first assignment. Create a file called Makefile to build your program. Here is a sample Makefile. (Just copy it and paste.)

CC=g++ -g -Wall -Wshadow -Wpointer-arith -O
showlines: search.o showlines.o
	$(CC) -o showlines search.o showlines.o
test1: search.o test1.o
	$(CC) -o test1 search.o test1.o
test2: search.o test2.o
	$(CC) -o test2 search.o test2.o
dotest1: test1
	./test1
dotest2: test2
	./test2
search.o: search.cpp
	$(CC) -c search.cpp
showlines.o: showlines.cpp
	$(CC) -c showlines.cpp
test1.o: test1.cpp
	$(CC) -c test1.cpp
test2.o: test2.cpp
	$(CC) -c test2.cpp
The blue spaces must be tabs. (See below for information on the structure of a makefile.) Command
g++ -c search.cpp
in Makefile compiles search.cpp, making a partial program called search.o that can be combined with other partial programs to make an executable file. Command
g++ -o showlines search.o showlines.o
combines (or links) search.o and showlines.o into an executable program, called showlines. Putting it together, all you need to type is
make
to build your showlines program.

To look for pattern "fish" in file "info.txt", use command

  showlines fish info.txt


Testing

Test each function separately. Write a main program that tests your occursAt function by trying it on a few fixed strings. Here is an example tester. The prototype for occursAt is needed because that function will not have a prototype in search.h.

  #include "search.h"
  #include <iostream>
  using namespace std;
  bool occursAt(char pattern[], char text[], int k);

  void test(int n, char pattern[], char text[], int k)
  {
    cout << "Test " << n << ": ";
    cout << occursAt(pattern, text, k);
    cout << "\n";
  }

  int main()
  {
    cout << "The first three tests should yield 1\n\n";

    test(1, "the dog", "the dog jumped on the couch", 0);
    test(2, "the couch", "the dog jumped on the couch", 18);
    test(3, "jumped", "the dog jumped on the couch", 8);

    cout << "\nThe next five tests should yield 0\n\n";

    test(4, "dog", "the dog jumped on the couch", 0);
    test(5, "dirigible", "the dog jumped on the couch", 4);
    test(6, "dolly", "the dog jumped on the couch", 4);
    test(7, "make", "the dog jumped on the couch", 3);
    test(8, "jumped", "the dog jumped on the couch", 9);
    return 0;
  }
If you call this test1.cpp, then, using the makefile shown above, command
make dotest1
will build the tester and run it.

Write another main program that tests your search function. You are not required to write contracts for functions that perform this "unit testing".

You are required to turn in your testers for the occursAt and search functions. Do not skip them, and do not throw them away. Call the program that tests occursAt test1.cpp, and the function that tests search test2.cpp.

Now write a main program that implements the program according to the requirements. You can feel confident that the search tool that it uses is correct.

Test your program. Test that it does something sensible when the file that it is supposed to search does not exist.


Debug prints

Large pieces of software make it possible to ask them to show what they are doing, so that, if something goes wrong, you can turn on the lights and see details of the process, and find out where the mistake is occurring.

The idea is to put extra prints in your program, but not to get rid of them when things appear to be working. Leave them in place, in case you need them again later. You can do that as follows. Surround each debug print (or section of debugging code) as in the following example.

#    ifdef DEBUG
       cout << "pattern = \"" << pattern << "\"\n";
       cout << "text    = \"" << text << "\"\n";
       cout << "index   = " << k << "\n";
#    endif
Now, at the top of your program, write
#define DEBUG
to switch on debugging. Just comment this out to turn off debugging. For example, write
//#define DEBUG
Then any parts between #ifdef DEBUG ... #endif will be ignored by the compiler, as if they were not written. You do not need to remove them.

Your program is required to provide debug prints in the occursAt function. When debugging is turned on, at the start of that function, it must show the pattern, the line and the index (k) that it is given. When that function ends (and debugging is turned on), it must report its answer. Make sure the prints are clear and readable. Never print raw numbers or string, without saying who is doing the print and what the numbers and strings are.


A Plan

Here is a suggestion of steps to get this done.

  1. Write the occursAt function. Include a clear contract. Do a hand simulation with small strings to see that it appears to be right. Be sure to keep it well-indented and readable.

    Create a main program module, test1.cpp, whose only job is to test the occursAt function. It should contain line

    bool occursAt(char pattern[], char text[], int k);
    
    to allow it to use the occursAt function. Run the test, to see whether occursAt appears to be working. Does it work? Inexperienced programmers tend to undertest their tools. Do more testing than you think you need. Always test extreme or unusual things. For example, what if the string occurs at the end of the string? What if the index k is too large for the pattern even to fit at the end?

  2. Regardless of whether occursAt is working, add debug prints to it. If it is not working, you can use them to help debug it. If it already works, you still might want the debug prints later. Compile and run this, with debugging turned on. Then compile and run it with debugging turned off.

  3. Write the search function. Keep it well indented and readable. Do a hand simulation to see whether it appears to work.

    Now create another test file, test2.cpp, that tests the search function by trying it on a few fixed inputs. Test it. If it does not work, fix it. Do not undertest it.

  4. Now write the main program, showlines.cpp. Add some debug prints. Before opening the file, show the name of the file, for example. Test your program.

    Be sure that all debug prints are either turned off or removed. Submit all of your files (search.cpp, search.h, showlines.cpp, test1.cpp and test2.cpp).

  5. After you get feedback on your first version, you might want to make some improvements. Make them, and turn in the second version.


Submitting Your Work

To turn in this program, log into one of the Unix computers in the lab. (You can log in remotely.) Ensure that your files are there. Change to the directory that contains those files. Then issue one of the following commands.

First version
~karl/3300/bin/submit 2a search.cpp search.h showlines.cpp test1.cpp test2.cpp
Second version
~karl/3300/bin/submit 2b search.cpp search.h showlines.cpp test1.cpp test2.cpp

Frequently Asked Questions

Makefiles

A simple makefile starts with definitions, such as

CC=g++ -g -Wall -Wshadow -Wpointer-arith -O
which says that $(CC) should be replaced by g++ -g -Wall -Wshadow -Wpointer-arith -O. After that are parts of the form
  target: dependencies
	command
	...
	command
where target is the name of something to build, dependencies is a list of things that need to be built before building the target, and the commands are a sequence of commands to do to cause the target to be built. Each command is preceded by a tab character.