Debugging

Measure Twice, Cut Once

Since computer programs are easy to change and to run, it is tempting to throw something together and begin testing and debugging right away. If you do that, you will spend far too long debugging. Try to be cautious about how you program. After you have written a function definition, for example, reread the definition, and convince yourself anew that the function is written correctly. Look for errors, such as memory allocation errors. Shoot for having the function work the first time, without going to extremes of hand checking that take too long.

The Best Laid Plans...

In spite of your best planning, you will make mistakes, and you should plan for them. To fix your program, you will need to use debugging strategy and tactics.

Debugging strategy tells you the large steps involved in debugging. The steps are as follows.

1. Determine that something is wrong. Simplify if possible to a small input that the program gets wrong.
2. Isolate the error as closely as possible within the program. For example, you might determine which function is at fault.
3. Determine exactly what is wrong.
4. Determine what can be done to fix the program.

Debugging tactics are particular ways to accomplish the goals of a strategy. Usually, isolating the error is the hardest thing to do. Here are some tactics for isolating errors. They also help in step 3, determining exactly what is wrong. Most of the time, however, once you have isolated the error to a sufficiently small part of the program, it is clear what is wrong.

Tactic 1: Successive Refinement

Successive refinement is a method of developing a program gradually. After each stage, you have a working program that does part of the job, or that at least incorporates some of the parts that the full program will hold.

To develop a program by successive refinement, start with a refinement plan. Think about the parts, and how they can be tested. Write each part, and test each before moving on. A big advantage of this is that isolation of errors is usually easy. The error is usually in the (small) part that you have just written.
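
As a rough illustration of working in stages, here is a minimal sketch in which one small part is written and tested before the next part is built on it. The function names (sum_array, average, and the two test functions) are hypothetical, chosen only for this example.

```cpp
#include <cassert>
#include <cstddef>

// Stage 1: the first small part, written and tested on its own.
// Hypothetical helper: returns the sum of the first n entries of A.
long sum_array(const int A[], std::size_t n) {
    long total = 0;
    for (std::size_t i = 0; i < n; i++) {
        total += A[i];
    }
    return total;
}

// Test for stage 1. Run this before writing stage 2.
void test_sum_array() {
    const int A[] = {3, -1, 4};
    assert(sum_array(A, 3) == 6);
    assert(sum_array(A, 0) == 0);   // an empty prefix sums to 0
}

// Stage 2: builds on stage 1, added only after test_sum_array passes.
double average(const int A[], std::size_t n) {
    assert(n > 0);                  // the average of nothing is undefined
    return static_cast<double>(sum_array(A, n)) / static_cast<double>(n);
}

// Test for stage 2.
void test_average() {
    const int A[] = {2, 4, 6, 8};
    assert(average(A, 4) == 5.0);
}
```

If test_average fails after test_sum_array has passed, the error is most likely in average, the part that was just written.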

Tactic 2: Code Inspections

One way to find an error is to read your program carefully to see if it looks correct. Try hand executions. For some errors, this approach will show you the error faster than any other method. But be sure not to give up if code inspection does not reveal to you what is wrong. You can fall back on tracing or running a debugger.

Tactic 3: Tracing

Add print statements to your program to show what is happening. Direct the output to a file, not to the standard output. You might let the main program open a file, and make that file available to all modules using an extern declaration. Do not print raw, unlabeled data. Each added print should show who did the print (which function), where in that function the print is, and what is being printed. For example, if you insert a print at the top of a loop body in function copy, and you are printing the value of an integer variable n, you might write

   fprintf(trace_file, "copy: Loop top: n = %d\n", n);

if you are using C style output, or

   traceout << "copy: Loop top: n = " << n << endl;

if you are using C++ style output.
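
One possible way to set up the shared trace file is sketched below. The names open_trace, close_trace and copy_ints are hypothetical, as is the header name mentioned in the comments. The call to fflush is an addition of this sketch: it is there so that trace lines already printed are not lost in a buffer if the program crashes later.

```cpp
#include <cassert>
#include <cstdio>

// Declaration: in a multi-file program this line would go in a shared
// header (trace.h is a hypothetical name) included by every module.
extern std::FILE* trace_file;

// Definition: appears in exactly one source file, for example the one
// that holds main.
std::FILE* trace_file = nullptr;

// Opens the trace file for writing. Returns true on success.
bool open_trace(const char* path) {
    trace_file = std::fopen(path, "w");
    return trace_file != nullptr;
}

// Closes the trace file if it is open.
void close_trace() {
    if (trace_file != nullptr) {
        std::fclose(trace_file);
        trace_file = nullptr;
    }
}

// Copies the n integers in src to dest, tracing at the top of the loop body.
void copy_ints(int dest[], const int src[], int n) {
    assert(n >= 0);
    while (n > 0) {
        if (trace_file != nullptr) {
            std::fprintf(trace_file, "copy_ints: Loop top: n = %d\n", n);
            std::fflush(trace_file);  // keep this line even if a crash follows
        }
        n--;
        dest[n] = src[n];
    }
}
```

The main program would call open_trace once near the start and close_trace before exiting. Because each trace is skipped when trace_file is null, the same code also runs with tracing turned off.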

If you are printing an array A of n integers at the beginning of function insert, you might write

  {int i;
   fprintf(trace_file, "insert: Begin, A =\n");
   for(i = 0; i < n; i++) {
     fprintf(trace_file, "A[%d] = %d\n", i, A[i]);
   }
  }

Better yet, write a small function that does this. It can then be used more than once, to print the contents of an array.
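
Such a function might look like the following sketch. The name trace_int_array and its parameter list are assumptions made for this example. The caller passes its own name and position, so that every trace still says who printed it and where.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical helper: writes the n integers in A to the stream out.
// who is the name of the calling function, and where is a short
// description of the position within that function.
void trace_int_array(std::FILE* out, const char* who, const char* where,
                     const int A[], int n) {
    assert(out != nullptr);
    std::fprintf(out, "%s: %s, A =\n", who, where);
    for (int i = 0; i < n; i++) {
        std::fprintf(out, "A[%d] = %d\n", i, A[i]);
    }
    std::fflush(out);  // assumption: flush so a later crash does not lose the trace
}
```

At the beginning of function insert, the earlier loop would then shrink to a single call such as trace_int_array(trace_file, "insert", "Begin", A, n).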

Tactic 4: Using a Debugger

A debugger can be a powerful tool if used correctly. There are many different debuggers, each with its own characteristics. Typical things that a debugger will do for you are

Show where a program is when an error occurs.
Show the values of variables.
Allow you to stop the program at selected places.
Allow you to step through a program.

Debuggers rely on extra information being put into executable files, telling the names of functions and variables. When you compile a file, you should be sure to tell the compiler to include debugging information. In Unix, for example, you use option -g on the compile command line. For example, you might say

	g++ -g -o program program.cc

to compile C++ program program.cc, and put the executable code into file program.

An example of a debugger is the GNU debugger (gdb). To debug a program using gdb, you type

	gdb program

where program is the name of the executable file. The debugger has its own language. Use the following with gdb.

Command run runs the program. You can follow the run command by command-line arguments for your program if desired.
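Command step runs the program until it gets to the next source line, entering any function that the current line calls. (Command next is similar, but it runs called functions to completion instead of entering them.)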
Command finish runs the program until the current stack frame finishes.
Command bt shows the run-time stack, most recent frame first, when the program encounters an error or pauses.
Command print x prints the value of variable x. You can only print variables that are in scope in the current stack frame (initially, the top frame).
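Command up moves one frame toward main, to the frame of the function that called the current one, so that you can print the caller's variables. Because bt lists the most recent frame first, this moves down the printed list. (Yes, command up moves down the list.) Command down moves one frame back toward the most recent call. Command up 3 moves three frames toward main, etc.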
Command break name causes the program to pause, and return control to the debugger, when the program enters function name.
Command cont continues running the program after a pause.
Command help provides information about more commands. Type help stack to find out about commands for examining the run-time stack.
Command quit exits the debugger.
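
To practice these commands, a small recursive function is a convenient target, because several stack frames are active at once. The function sum_to below is a hypothetical example, not part of any particular program.

```cpp
#include <cassert>

// Hypothetical practice target: returns 1 + 2 + ... + n, computed
// recursively so that a call with n = 3 builds up a stack several
// frames deep.
int sum_to(int n) {
    assert(n >= 0);
    if (n == 0) {
        return 0;             // deepest frame
    }
    int rest = sum_to(n - 1); // one new stack frame per call
    return n + rest;
}
```

Suppose that main calls sum_to(3) and that the program was compiled with option -g. After starting gdb on the executable, you might try the following commands in order.

	break sum_to
	run
	cont
	bt
	print n
	up
	print n
	quit

The program pauses each time it enters sum_to. After up, print n shows the value of n in the calling frame, which should be one larger than the value printed before.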

Which Tactic Should I Use?

Always use successive refinement. Even then, you will encounter errors that require other methods.

You will need to develop, through experience, a feel for which is more useful in each situation where you get an error: a debugger, tracing or code inspection. Here are a few suggestions on how to choose. Use them only as a starting point. Through experience, you will develop your own guidelines.

If you get a memory protection fault, such as a segmentation violation, a debugger will very quickly tell you where the fault is. Run the program under the debugger, and look at the run-time stack when the error occurs.
If you get an apparent infinite loop, run the program with a debugger. Break the program (using a ctrl-C in Unix) to stop it. Look at the run-time stack to see where the program is stuck. Look at some variables. If you still do not see why the program is in an infinite loop, let it go a bit longer, and stop it again. See what changed from one snapshot to the next.
If you have a function that you are suspicious of, you can break the program in that function and step through the function.
If your bug seems to be a subtle one, use tracing. Print things as the program goes. Print the values of variables whose values you think you know. Remember, since the program does not work, you must be mistaken about what it is doing, so do not trust your feelings too much. Shine some light on the execution, and look at what is going on.
If your program seems to be doing things that are impossible, the problem is sometimes an uninitialized variable, and sometimes stems from deleting memory before you are done with it. Do your best to isolate the problem, either using a debugger or by tracing. Find out where things go wrong. Then rely on code inspections of the suspicious code. Deleting memory prematurely is among the most difficult kinds of bugs to find, because its symptoms sometimes show up far from the place in the program where the error occurs. If you suspect this, look carefully at your program, searching for premature deletions.
Suppose that you have a program that is working, and you make a small change to it. If the change causes the program to stop working, you can be fairly sure that the problem is in the code that you just added or changed. Inspect it, to see if it looks reasonable.
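
As an illustration of what to look for when searching for premature deletions, here is a sketch of one common case: deleting the nodes of a linked list. The Node type and the function names are assumptions made for this example. The order of the two marked lines matters. If the node were deleted first and its next field read afterward, the program would be using memory that it had already deleted.

```cpp
#include <cassert>

// Hypothetical list node type, used only for this illustration.
struct Node {
    int value;
    Node* next;
};

// Hypothetical helper: returns a new list with value in front of rest.
Node* push_front(int value, Node* rest) {
    return new Node{value, rest};
}

// Deletes every node in the list and returns how many were deleted.
int delete_list(Node* p) {
    int count = 0;
    while (p != nullptr) {
        Node* next = p->next;  // read the link while *p still exists
        delete p;              // only now is it safe to delete the node
        p = next;
        count++;
    }
    return count;
}

// Small self-check: builds a three-node list and deletes it.
void check_delete_list() {
    Node* list = push_front(1, push_front(2, push_front(3, nullptr)));
    int deleted = delete_list(list);
    assert(deleted == 3);
}
```

During a code inspection, every use of a pointer that comes after a delete of the same memory deserves a second look.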