2. General Advice

One of the goals of this course is to teach you techniques for developing large-scale software. But it is not feasible to ask you to write large-scale software, so you will be expected to use large-scale techniques on small-scale software.

Even small-scale software can be formidable to write if not approached in a sensible way. Without discipline, you will be tempted to take what appear to be shortcuts, but that actually take you into a vast swamp where your feet get stuck in the mud. Instead of making progress on your project, you will spend all of your time trying to get out of the swamp.

The smart thing to do is to resist the temptation to go into the swamp, and instead to adhere to sensible software development principles.


Some general principles for software development

  1. Strive for simplicity, clarity and elegance. Many of the errors that students make arise because they take complicated approaches to simple problems.

  2. Embrace the importance of documentation. A certain faculty member who did not believe in documentation spent a long time writing a piece of software. He put the software aside over the summer and came back to it in the fall. When he looked at it again, he said that he could not understand any of it. He could not even reverse-engineer it. He had no choice but to throw it away and start over.

    You are required to write documentation for this course. Since you have to write it, it makes sense to write it early.

    I am often asked, "why doesn't my function work?" Naturally, I ask what the function is intended to do. I can't help much without knowing that. Most of the time, the student does not know what it is intended to do. How can you solve a problem if you don't know what the problem is?

    Without well-written documentation, you find yourself spending a lot of time reverse-engineering what you wrote earlier. You also waste time trying to write a function when you have only a vague notion of what the function is supposed to do. You are stuck in the swamp.

  3. Use examples and pictures. Before you write any code, work out algorithms by trying them on examples. If your code uses data structures, draw pictures of those data structures. Trying to keep everything in your head is a huge mistake.

  4. Use top-down design. When you are working out a definition of function F, you often find yourself thinking "it would be nice if I had a function G that did …"

    Top-down design tells you to work as if you have such a function G. Give it a name and write a contract for it. Use it. Don't go implement it right away. You will lose your train of thought about F.

    Once you are happy with your definition of F, write a definition of G. You can use top-down design for that too.

    If you need two or three functions, fine. Use them, then write them. You won't run the program until the necessary pieces are all written, so the order in which you write those pieces does not matter.
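
    As a minimal sketch of this order of work, suppose that F is a hypothetical function isPerfect and that G is a hypothetical helper sumProperDivisors. Neither name comes from any assignment; they are chosen only for illustration. The helper gets a prototype and a contract first, F is written as if the helper already exists, and the helper is defined last.

```c
#include <assert.h>
#include <stdbool.h>

/* Contract for G, written before G is implemented.
   (sumProperDivisors is a hypothetical helper.)
   sumProperDivisors(n) returns the sum of the positive
   divisors of n that are less than n.  It requires n >= 1. */
int sumProperDivisors(int n);

/* F: isPerfect(n) returns true if n is equal to the sum of
   its proper divisors.  It requires n >= 1.
   It uses G as if G were already written. */
bool isPerfect(int n)
{
   assert(n >= 1);
   return sumProperDivisors(n) == n;
}

/* G is defined only after F is finished. */
int sumProperDivisors(int n)
{
   assert(n >= 1);
   int sum = 0;
   for(int d = 1; d < n; d++)
   {
      if(n % d == 0)
      {
         sum += d;
      }
   }
   return sum;
}
```

    With both pieces written, isPerfect(6) and isPerfect(28) yield true and isPerfect(12) yields false.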

  5. Develop incrementally: Use successive refinement.

    Beginners typically imagine that the best way to write a program is to write the entire thing and then to begin testing it. But doing that takes you into the swamp.

    There will inevitably be many errors. It will be difficult to determine where in your program each error occurs. Sometimes, two errors work together to make it difficult to find either one.

    Experts know to write software in small increments. Write a small part, then test that. If it does not work (which it almost surely won't), fix it before moving on.

    Each increment should lead to a version that does something that can be tested. If you need to write two or three functions in order to be able to run a test, fine. But don't write more than you have to in a single increment.

    This approach is called successive refinement. The most important thing about it is that, when you discover a mistake, it is usually in the part that you have just added. So you can localize your search for the error without much thought at all.
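
    Here is one way that two increments might look, assuming a hypothetical word-counting task. The names isWordChar, countWords, testIncrement1 and testIncrement2 are invented for this sketch. The first increment is written and tested on its own before the second increment is started.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Increment 1 (hypothetical): isWordChar(c) returns true if
   c is a letter or a digit. */
bool isWordChar(char c)
{
   return isalnum((unsigned char) c) != 0;
}

/* Test for increment 1.  Run it before writing anything else. */
void testIncrement1(void)
{
   assert(isWordChar('a'));
   assert(isWordChar('7'));
   assert(!isWordChar(' '));
}

/* Increment 2 (hypothetical): countWords(s) returns the number
   of maximal runs of word characters in string s.
   It requires s to be non-NULL. */
int countWords(const char* s)
{
   assert(s != NULL);
   int  count  = 0;
   bool inWord = false;
   for(size_t i = 0; s[i] != '\0'; i++)
   {
      bool wordChar = isWordChar(s[i]);
      if(wordChar && !inWord)
      {
         count++;      /* a new word starts here */
      }
      inWord = wordChar;
   }
   return count;
}

/* Test for increment 2.  If it fails, the error is most likely
   in countWords, since isWordChar has already been tested. */
void testIncrement2(void)
{
   assert(countWords("") == 0);
   assert(countWords("one") == 1);
   assert(countWords("  two words ") == 2);
}
```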

  6. Turn on the lights. Use traces to diagnose an error. Make the program write out what it is doing.

    Never show raw data in a trace. Always say who's talking (the function where the trace is) and clearly label information that is shown. For example,

      printf("spotter: Trying to find globule %d\n", toFind);
    
    shows who is talking (spotter) and what it is trying to do.

    If there is important information, show it. Show functions' parameters and results. Clearly label them. A trace such as

      printf("I am here");
    
    is worse than useless. (When I see that, I am reminded of a person I once knew who wrote "taken last thursday" on the backs of photographs.)

    Experts prepare their software for testing by adding traces as they write the software, not as an afterthought when an error shows up. They enclose the tracing code in tests so that it can easily be switched on and off. For example,

      if(traceLevel > 0)
      {
         printf("getWidgetSize: finding the size of widget ");
         printWidget(widget);
         printf("\n");
      }
    
    creates a trace that can be turned on and off by setting the value of global variable traceLevel.
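
    The following sketch puts those pieces together in one small function. The function gcd is only an illustration, not part of any assignment. Each trace says who is talking, and the parameters and the result are labeled.

```c
#include <assert.h>
#include <stdio.h>

/* Tracing is on when traceLevel > 0. */
int traceLevel = 0;

/* gcd(a, b) returns the greatest common divisor of a and b.
   It requires a >= 0 and b >= 0, with a and b not both 0.
   (gcd is a hypothetical example function.) */
int gcd(int a, int b)
{
   assert(a >= 0 && b >= 0 && (a > 0 || b > 0));

   if(traceLevel > 0)
   {
      printf("gcd: called with a = %d, b = %d\n", a, b);
   }

   int result = (b == 0) ? a : gcd(b, a % b);

   if(traceLevel > 0)
   {
      printf("gcd: returning %d for a = %d, b = %d\n", result, a, b);
   }
   return result;
}
```

    Setting traceLevel to 1 before calling gcd(12, 18) makes every call report its parameters and its result. Setting traceLevel back to 0 silences the traces without removing them from the code.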

  7. Use a debugger where appropriate. A debugger can be a useful tool in certain circumstances. It can quickly show you where any of the following are happening.

    • Infinite loop.
    • Infinite recursion.
    • Memory fault.

    For more subtle problems, tracing is usually your best tool.

  8. Test thoroughly. Experts test their software thoroughly. They try to find errors. They torture their software.

    Beginners undertest software. It might be because they had to work hard to get the software to do one test correctly, and seeing it fail on another test is just too stressful. It might be because the inexperienced programmer imagines that if the software handles one test correctly, it handles others too.

    Whatever the reason, most submissions that I see don't work. That shows up in scores. Assignment scores show up in grades. If you want to get a good grade, do more tests.
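
    As a sketch of what torturing a function can look like, here is a hypothetical function largest with a test function that tries several different kinds of input. Both names are invented for this example. A version of largest that started with best = 0 would pass the first four tests and fail on the array of negative numbers, which is exactly the kind of error that a single test does not reveal.

```c
#include <assert.h>

/* largest(a, n) returns the largest of a[0], ..., a[n-1].
   It requires n >= 1.  (largest is a hypothetical example.) */
int largest(const int a[], int n)
{
   assert(n >= 1);
   int best = a[0];
   for(int i = 1; i < n; i++)
   {
      if(a[i] > best)
      {
         best = a[i];
      }
   }
   return best;
}

/* testLargest() tries largest on a variety of arrays. */
void testLargest(void)
{
   int one[]       = {5};
   int atFront[]   = {9, 2, 4};
   int atBack[]    = {2, 4, 9};
   int ties[]      = {4, 4, 4};
   int negatives[] = {-3, -1, -7};

   assert(largest(one, 1) == 5);          /* just one value    */
   assert(largest(atFront, 3) == 9);      /* largest is first  */
   assert(largest(atBack, 3) == 9);       /* largest is last   */
   assert(largest(ties, 3) == 4);         /* all values equal  */
   assert(largest(negatives, 3) == -1);   /* no value above 0  */
}
```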


Some apparent shortcuts that only take you into the swamp

  1. "All I need is a working program. I don't really need to know why it works."

    For beginners, it is tempting to concentrate on the end product. Since I am not going to submit the paperwork, the beginner thinks, why bother with paperwork? I will sit down at the computer and start typing.

    This is just about the worst thing that you can do. What you type will not work. Since you do not really understand what you have typed (and since you don't really see a need to understand it), you try to fix it by making random changes. You can spend the rest of your life doing that without success. If any part of your program was correct, you will only ruin that.

    This is a sure path into the swamp. You will never get out. A student once showed me his attempt to write one small function. It was nonsense. We worked it out and wrote it in about 20 minutes. The student was amazed. He said that what we got in the end looked similar to what he had written to begin with, though there were slight differences. But after writing that first version, he had spent 8 hours making random changes. When I asked why he had done that, he said that he didn't think that he had time to plan first.

    Always plan before you type. When fixing an error, always diagnose the problem before making any changes.

  2. Develop and use common sense. Beginners tend to get the idea that computer software is cryptic and does not need to make sense. Nothing could be further from the truth.

    Make sure that what you write is sensible. If something is a vertex, don't call it an edge. If a function writes one edge, don't call it WriteEdges. If it writes several edges, don't call it WriteEdge.

    Beginners often end up violating common sense during the process of debugging a program. Suppose that your program has a variable called numVertices that holds the number of vertices in a graph. You write something like

      int info[numVertices];
    
    to create an array of information about vertices. But then you notice that you really needed an array whose size is one greater than the number of vertices. A sensible thing to do is to change the line that creates the array.

      int info[numVertices + 1];
    
    What students frequently do is, if there are 10 vertices, to store 11 into numVertices. But then numVertices is no longer the number of vertices. Doing that kind of thing is a very bad idea. It is confusing. Trickery like that tends to be lethal to the correctness of large pieces of software.

    When you write a function to print a graph, don't make it destroy the graph. Would you consider it reasonable to destroy a house in order to take photographs of it?

    Let common sense be your guide.
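
    Here is a sketch that keeps numVertices honest. It assumes, for illustration only, that vertices are numbered from 1 to numVertices, which is why one extra array slot is needed. The type Edge and the function vertexDegrees are hypothetical. The sketch uses calloc instead of a local array because the array is returned to the caller. The const on the edges parameter promises that the function does not modify the edges that it reads, in the same spirit as printing a graph without destroying it.

```c
#include <assert.h>
#include <stdlib.h>

/* An Edge connects vertex from to vertex to.
   (Edge is a hypothetical type for this sketch.) */
typedef struct
{
   int from;
   int to;
} Edge;

/* vertexDegrees(numVertices, edges, numEdges) returns a newly
   allocated array degree of size numVertices + 1, where
   degree[v] is the number of edge ends at vertex v, for
   v = 1, ..., numVertices.  Index 0 is unused.

   numVertices really is the number of vertices.  The extra
   slot is requested where the array is created.

   The caller must free the result.  The result is NULL if
   memory cannot be allocated. */
int* vertexDegrees(int numVertices, const Edge edges[], int numEdges)
{
   assert(numVertices >= 0 && numEdges >= 0);

   int* degree = calloc((size_t) numVertices + 1, sizeof(int));
   if(degree == NULL)
   {
      return NULL;
   }

   for(int i = 0; i < numEdges; i++)
   {
      assert(1 <= edges[i].from && edges[i].from <= numVertices);
      assert(1 <= edges[i].to   && edges[i].to   <= numVertices);
      degree[edges[i].from]++;
      degree[edges[i].to]++;
   }
   return degree;
}
```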

  3. Proofread. Give yourself time to proofread your work. Pay attention to documentation. I frequently see documentation with spelling, punctuation and grammatical errors.

    If you cannot make sense out of your function definitions, nobody else will be able to either.

    Look for nonsense in your code. One student submitted an assignment that said, if the queue is empty, then remove the first thing from the queue. Of course, an empty queue does not have a first thing to remove.
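
    By way of contrast, here is a sketch of a removal function that makes sense: it removes the first item only when the queue is not empty. The Queue type and the function names are hypothetical, and the fixed capacity of 8 is an arbitrary choice for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_CAPACITY 8

/* A small queue of integers stored in a circular array.
   (Queue is a hypothetical type for this sketch.) */
typedef struct
{
   int items[QUEUE_CAPACITY];
   int front;   /* index of the first item        */
   int size;    /* number of items in the queue   */
} Queue;

/* initQueue(q) makes q empty. */
void initQueue(Queue* q)
{
   assert(q != NULL);
   q->front = 0;
   q->size  = 0;
}

/* isEmpty(q) returns true if q has no items. */
bool isEmpty(const Queue* q)
{
   assert(q != NULL);
   return q->size == 0;
}

/* insertLast(q, x) adds x to the end of q and returns true.
   If q is full, it leaves q unchanged and returns false. */
bool insertLast(Queue* q, int x)
{
   assert(q != NULL);
   if(q->size == QUEUE_CAPACITY)
   {
      return false;
   }
   q->items[(q->front + q->size) % QUEUE_CAPACITY] = x;
   q->size++;
   return true;
}

/* removeFirst(q, out): if q is NOT empty, removes the first
   item, stores it in *out and returns true.  If q is empty,
   there is no first item, so q and *out are left unchanged
   and the result is false. */
bool removeFirst(Queue* q, int* out)
{
   assert(q != NULL && out != NULL);
   if(isEmpty(q))
   {
      return false;
   }
   *out = q->items[q->front];
   q->front = (q->front + 1) % QUEUE_CAPACITY;
   q->size--;
   return true;
}
```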

  4. Stay within the subset of the language that you know. A common blunder is to try inventing language features of your own. Usually, they are gibberish, as far as a compiler is concerned. Sometimes, though, they have a meaning, just not the one that you think they have. It can take a lot of debugging time to realize that your guess was not right.

    But do learn to generalize in the right way. For example, where you can write one kind of statement, you can write any kind of statement. Where you can write one expression of type T, you can write any expression of type T.


Other advice

The programming assignments are designed to teach fundamental concepts. If you don't bother to read the assignment, or if you try to avoid doing what the assignment tells you to do, you will be avoiding those fundamental concepts, and you will receive little or no credit.

  1. Start early. If you start early, you will probably be willing to try the methods summarized above, and you will develop your software efficiently.

    If you start late, you will probably decide that you do not have time to do those things, and that you have no choice but to try to run across the swamp, hoping not to get stuck in the mud. That won't work. The mud will get you every time.

  2. Never submit software that you have not tested. Never make a change, no matter how trivial, and then submit your work without testing it again.

  3. Do your own work. If you do not do the work, then you will not get the education that you are here to get. When you graduate, you will regret that.

    Plagiarism is not limited to copying an entire document. If a significant part of your submission was written by somebody else, you have plagiarized. Plagiarism is much more obvious than you might think, and it is not acceptable behavior.

  4. Lastly, from Mark Twain: Always do right. This will gratify some people and astonish the rest.