Computer Science 3300
Section 001
Fall 2005
Programming Assignment 4

First version due: Nov. 15, 11:59pm.
Second version due: Nov. 28, 11:59pm

Preliminaries

A graph is a collection of vertices connected by a collection of edges. An edge connects exactly two vertices to one another. You can draw a picture of a graph by showing the vertices and the edges connecting them. Here is an example. The vertices are shown as circles with numbers in them and the edges are lines connecting the vertices.

A weighted graph is a graph in which each edge has a number attached to it, called the weight of the edge. Here is a picture of a weighted graph.

A spanning tree of a graph is obtained by deleting as many edges as possible, without making it impossible to get from any vertex to any other by following a sequence of edges. For example, the following is a spanning tree of the above weighted graph.

The weight of a spanning tree is the sum of the weights of its edges. For example, the weight of the above spanning tree is 59. Here is another spanning tree for the same graph. Its weight is 48.

Obviously, some spanning trees have smaller weight than others. A minimal spanning tree is a spanning tree with the smallest possible weight.

Motivation for computing minimal spanning trees (you can skip this)

Suppose that we have a collection of towns, and we must build railroad tracks so that it is possible for a train to get from any town to any other town. Our budget for building tracks is small, so we choose to build as little track as possible. One approach to deciding which tracks to build is to construct a weighted graph, where the vertices are the towns, and the weight of the edge from town A to town B is the length of the railroad track that would need to be built to connect towns A and B directly. Then a minimal spanning tree of that graph would be a good choice for connecting the towns, since it is the cheapest way to connect all of the towns using railroad tracks between towns.

A similar problem occurs when a house is being wired for electricity. All of the outlets on a given circuit need to be connected to the circuit panel, and to each other. To connect them with a minimum amount of wire, you might build a graph having a vertex for each outlet and for the circuit panel, with the weight between two vertices being the wiring distance between those vertices. Then get a minimal spanning tree for the graph.

(A minimal spanning tree is not always the best solution to either of these problems. You can often do better by introducing new railroad track junctions or wiring junctions, called Steiner points. For example, if you have three towns in a triangle, you can connect them together by sending tracks to a point in the middle of the triangle. The middle point is a Steiner point. But good placements of Steiner points are difficult to find, so a minimal spanning tree is a reasonable compromise.)

The assignment

Write a program that reads a description of a weighted graph, and prints the edges that are part of a minimal spanning tree of that graph, and also prints the total weight of the minimal spanning tree.

The input starts with a line that tells how many vertices the graph has. If there are five vertices, then those vertices have numbers 1, 2, 3, 4 and 5. In general, if there are n vertices, then they are numbered 1...n.

Following the first line are the edges, one per line. Each edge line has three integers on it. Line

2 4 50

indicates that there is an edge between vertices 2 and 4, and that its weight is 50. The end of the input is signaled by a line that contains just a 0. An input describing graph

might look like this.

The output of your program for this input might look like this.


The input graph has 5 vertices, and its edges are as follows.

  vertices    weight
  1   2            9
  1   3           12
  2   4           18
  2   3            6
  2   5           20
  3   5           15
  
A minimal spanning tree uses the following edges.

  vertices    weight
  2   3            6
  1   2            9
  3   5           15
  2   4           18

The total weight of the spanning tree is 48.

Your program is not required to insist that the three numbers that describe an edge are on one line, but it can if it wants to. For example, if your program allows an input like the following, that is acceptable. But your program is not required to allow such inputs. (You will find it easier to allow these additional input forms.)

5
1 2  
9
1 
3 12
2 4 18  2 3  6
2 5 20
3 5 15
0

The only difference between this and the previous input is the placement of line breaks.

Important note. The first number in the input is the number of vertices, not the number of edges. The only way you can determine the number of edges is by reading them and counting them.

Important note. The last line has only one number on it, not three.

Data structures

When designing a program, you need to decide how to represent your data. Typically, you draw diagrams showing arrays, structures, pointers, etc. to understand the data representation.

Create a structure type Graph. You can store a graph as an array of edges, where each member of the array is a triple giving two vertex numbers and a weight. A triple is just a structure with three variables in it. Of course, you must also store the number of edges and the number of vertices, in addition to the array. Store those numbers once for the graph, not once for each edge. So the graph itself will be a structure holding the number of vertices, the number of edges and an array of edges.

You may presume that there will be a maximum of 100 edges in the graph, but you must design your program so that it is very easy to change the maximum number of edges to some other number.

Use sensible names for things. If something is an edge, don't call it a graph. Call it an edge. Do not call a vertex and edge, or a graph a vertex. Call things what they are.

Computing minimal spanning trees

Here is a simple algorithm for computing a minimal spanning tree of a weighted graph. The algorithm starts without any edges, and adds edges one at a time until it has built the tree.

Create a new Merge object. The items in the sets are the vertices of the graph. Two items are in the same set if they are connected, either directly or indirectly, using the edges that are currently in the tree. Since the tree initially has no edges, no two vertices are in the same set.
Create an array that will end up holding the edges of the spanning tree. Initially, it has no edges in it, but you gradually add more edges. Also create a count, initially 0, of the total weight of the edges in the spanning tree array. You will find it convenient to make the spanning tree description have type Graph. The spanning tree is a graph, after all.
If the array of edges of the graph is called G.edges, then sort the edges in G.edges by their weights, with smaller weights earlier in the array. (You can use function qsort, described below, to do this.)
Scan through the array G.edges in increasing order of weight. (That is, do the following in a loop.) When you see edge (i,j), with weight w, ask the Merge object whether vertices i and j are already connected. (They are connected if they are together.) If they are connected, then just skip this edge, and go to the next edge. If they are not connected then add edge (i,j) to the spanning tree, add w to the total weight of the tree, and tell the Merge object to merge i and j. Then do the next edge.

When you are finished, you will have created an array holding the edges of a minimal spanning tree of the original graph, and you will have computed the total weight. Print them all.

Using qsort

There is a standard library function called qsort that implements a variant of the quicksort algorithm for sorting arrays. You should include header file <cstdlib> to use qsort. Qsort is a general sorting function, designed to be able to sort any type of array into any desired order. In order to achieve this degree of generality, qsort needs information about the array and how to sort it. It also needs for you to get perform some conversions that, in general, would be questionable, but that work correctly here.

The prototype for qsort is as follows.

     void qsort(void *base, 
                int nel, 
                int width,
                int (*compar) (const void *, const void *));

Parameter base is a pointer to the array that you want to sort. Notice that its type is void*. That just means that qsort knows that base is a pointer, but does not know what type of thing it points to.

Parameter nel is the number of elements in the array, and parameter width is the number of bytes occupied by each element.

Parameter compar is a function. (C++ allows you to pass a function as a parameter to another function.) Compar is responsible for telling qsort the order in which you want the array sorted. Qsort will sort into ascending order, according to what function compar says is ascending order. Function compar takes two parameters, which will be pointers to particular members of the array. compar(A,B) should return

a negative number	if A < B
0	if A = B
a positive number	if A > B

according to the order in which you want your array to be sorted.

Here is what you can do to use qsort to sort an array of edges, assuming each member of the array has type Edge, and Edge is defined as follows.

   struct Edge
   {
     int v1,v2,weight;
   };

First, define a comparison function.

  int compareEdges(const Edges* A, const Edge* B)
  {
    return A->weight - B->weight;
  }

Define a new type, QSORT_COMPARE_TYPE, as follows. It is the type of the compar parameter that qsort expects to receive.

   typedef int (*QSORT_COMPARE_TYPE)(const void*, const void*);

Your function, compareEdges, does not have type QSORT_COMPARE_TYPE, since compareEdges takes two parameters of type Edge*, and qsort says that it should take two parameters of type void*. But you can tell the compiler to ignore this, and to let you pass your compareEdges function to qsort, by doing a cast, which tells the compiler to treat a value of one type as if it is a value of a different type. This is, in many cases, a very dangerous thing to do, but you have to do it to use qsort.

To sort an array Arr of n edges, you use statement

   qsort((void*) Arr, n, sizeof(Edge), (QSORT_COMPARE_TYPE)compareEdges);

We have passed qsort the array to sort, the number of elements, the number of bytes occupied by each element (given by sizeof(Edge), since each element has type Edge) and the comparison function, cast to the appropriate type.

Note. If you wrote compareEdges as follows instead, qsort would sort into descending order by weight. You have control over the order.

  int compareEdges(const Edges* A, const Edge* B)
  {
    return B->weight - A->weight;
  }

Requirements

Place the Merge abstract data types in files called merge.h and merge.cpp. Those file should only hold the Merge data type and its functions. Do not add anything to those files that is specific to solving the minimal spanning tree problem.

Call the program that performs the minimal spanning tree computations span.cpp.

Your program should read the graph description from the standard input, and write the results to the standard output.

Implementing the program: a refinement plan

Here is a suggestion for how to implement this program. This is a refinement plan. It tells how to produce more and more of the program, gradually refining it until eventually it does the whole job. It is critical that a refinement plan have tests along the way, so that the part written so far is known to work. That way, if you get an error, it is almost always in the part of the program that was most recently written, and finding bugs is relatively easy.

You should have already implemented the Merge type. If you do things intelligently, you will have kept the original, working versions that does not have the improvements done. (Did you do that? Once you have something that works, keep it, even if you think you are replacing it with something better. Just keep an archive of module implementations.) Use the basic version of Merge. It is much more likely to be correct than the more efficient version.
Implement the part of the program that reads in and prints out the input graph. Test it. It is a good idea to use functions. You will need to print other graphs later, so make a function that prints a given graph instead of writing the code to print a graph inline inside the main function.
Implement the beginnings of the minimal spanning tree algorithm by sorting the edges and printing them after sorting. Be sure that this is working.
Implement the rest of the minimal spanning tree algorithm. Test it. Look at the results. Are they correct? You will have to solve an example by hand to make sure.
Create a test script. The script will run your program on several input files automatically. Here is a sample test script for Unix. It writes all of the output to a file called out, and also to the screen.
```
    #!/bin/sh
    g++ merge.cpp span.cpp
    echo "Test 1" | tee out.txt
    a.out <data1  | tee -a out.txt
    echo "\nTest 2" |tee -a out.txt
    a.out <data2  | tee -a out.txt
```
You can see how to add more tests. The tee command copies its input to two places, the standard output and the given file. Option -a indicates that new information should be added to the end of the file.

To run the script, put it into a file, such as runtest. Then execute command
```
    chmod u+x runtest
```
to give yourself permission to run file runtest as a command. (You only need to do that once.) Now, to run the tester, just do command
```
     ./runtest
```
You should see the output show up on the terminal, and it is also written into file out.txt.
Make the program's output look nice, if necessary. Test it again, using your automatic tester.
Now that you have a working program, remove the basic merge module and replace it with the improved module. You should not need to change a single line of your program. It is just a matter of compiling a different merge.cpp file with your program. Test it again, using your automatic tester. Is it still producing the correct output? An easy way to do the test is as follows. Run the tester using the basic Merge implementation. (You still have it, don't you?) Now rename file out.txt, so that you can hold onto it. Unix command
```
    mv out.txt basicout.txt
```
will change the name of out.txt to basicout.txt. Now run the tester using the improved version of Merge. You will get file out.txt holding the results. Now compare the two output files. Command
```
    diff basicout.txt out.txt
```
shows the differences between basicout.txt and out.txt, by giving a sequence of editor commands that will change one into the other. If there are no differences, which is what you want, diff will not say anything. No news is good news here.

Keep the program well-commented and well-indented throughout. Make sure every function has a clear contract. Do not plan to add in the comments after the program is finished.

When you do a test, if things don't work, then use the debugger or add some prints to see what is going on. Do not try to work in the dark. Before you try to fix the problem, understand what is wrong. After you have diagnosed the problem, fix it. Do not move on until everything that you have written so far is working.

What to turn in

Turn in all of your source files. Be sure that your name is in every source file. Turn in a Makefile for the program. Use the submit program. Version 1 is assignment 2a, and version 2 is assignment 2b. So to turn in version 1, you would use command

  ~karl/3510/bin/submit 2a merge.h merge.cpp mst.cpp Makefile

Asking questions

If you have questions, be sure to ask, and do so as soon as possible, so that you have time to deal with the response.

Computer Science 3300 Section 001 Fall 2005 Programming Assignment 4