Computer Science 3300
Section 001
Fall 2015
Programming Assignment 4

Assigned: Tuesday, September 29
Due: Friday, October 9, 11:59pm

Table of contents

  1. Background: graphs and spanning trees
  2. Program requirements
  3. An algorithm to find a minimal spanning tree
  4. Functions and types
  5. Compiling, linking and running
  6. Warnings and additional requirements
  7. Submitting your work


Background: Graphs and Spanning Trees

graph is a collection of vertices connected by a collection of edges. An edge connects exactly two different vertices to one another. You can draw a picture of a graph by showing the vertices as circles and the edges as lines connecting them. Here is an example.

A path from vertex v1 to vertex vk is a sequence of vertices (v1, v2, …, vk) such that there is an edge between vi and vi+1, for i = 1, …, n − 1.

We will only be concerned with connected graphs. A graph is connected if there is a path from every vertex to every other vertex, following edges. For example, in the above graph, there is a path from 3 to 4 that goes (3, 1, 2, 4). There are other paths from 3 to 4 as well. For any two vertices u and v, there is a path from u to v.

A weighted graph is a graph in which each edge has a number attached to it, called the weight of the edge. Here is a picture of a weighted graph.

A spanning tree of a graph is obtained by deleting as many edges as possible without making the graph so that it is no longer connected. That is, we still need to have a path from each vertex to each other vertex. For example, the following is a spanning tree of the above weighted graph.

The weight of a spanning tree is the sum of the weights of its edges. For example, the weight of the above spanning tree is 59. Here is another spanning tree for the same graph. Its weight is 48.

Obviously, some spanning trees have smaller weight than others. A minimal spanning tree is a spanning tree with the smallest possible weight.


Program Requirements

Write a program that reads a description of a weighted graph with integer weights and prints the original graph and the edges that are part of a minimal spanning tree of that graph. It should also print the total weight of the minimal spanning tree. The input should be read from the standard input, and the output should be written to the standard output.

Start from the standard template. Call your program mst.cpp.

Input format

The input starts with a line that tells how many vertices the graph has. If there are five vertices, then those vertices have numbers 1, 2, 3, 4 and 5. In general, if there are n vertices, then they are numbered 1, …, n.

Following the first line are the edges, one per line. Each edge line has three integers on it. Line

2 4 50
indicates that there is an edge between vertices 2 and 4, and that its weight is 50. The end of the input is signaled by a line that contains just a 0. An input describing graph

might look like this.

5
1 2  9
1 3 12
2 4 18
2 3  6
2 5 20
3 5 15
0

You can assume that there are no more than 100 edges, but that number must be easy to change. To increase this to 200, for example, should only require changing one line of your program.

Output format

The output of your program for this input might look like this.


The input graph has 5 vertices, and its edges are as follows.

  vertices    weight
  1   2            9
  1   3           12
  2   4           18
  2   3            6
  2   5           20
  3   5           15
  
A minimal spanning tree uses the following edges.

  vertices    weight
  2   3            6
  1   2            9
  3   5           15
  2   4           18

The total weight of the spanning tree is 48.


An algorithm to find a minimal spanning tree

Kruskal's algorithm

There is a well-known algorithm, called Kruskal's algorithm, for computing a minimal spanning tree of a weighted graph. It goes as follows.

  1. Sort the edges from small weight to larger weight. Call the sorted list of edges e0, …, em−1. So, for example, e0 is the edge with the smallest weight. (See below for an easy way to sort an array of edges.)

  2. Start with an empty graph K. It has all of the vertices, but no edges yet. Edges will be added to it until, at the end, it is the minimal spanning tree.

  3. For i = 0, …, m−1 do

    1. Look at edge ei. Suppose that it connects vertices u and v.
    2. If there is not already a path in K between vertices u and v, then add edge ei to K. Otherwise, do not add it to K. (How to determine whether there is already a path is discussed below.)

When the algorithm is done, K is a minimal spanning tree for your graph. See below for an argument that Kruskal's algorithm works.

Kruskal's algorithm is an example of a greedy algorithm, which is an algorithm that tries to optimize on a large scale by optimizing on a small scale. When Kruskal's algorithm chooses an edge to connect to groups of vertices, it chooses the edge with the smallest weight that does the job (because it looks at the edges in increasing order of weight). There is no thought about whether that decision will be a good one in the long run. It turns out that Kruskal's algorithm works, and always finds the minimal weight spanning tree. Many other greedy algorithms, however, appear reasonable but fail to produce the correct result.

Determining connections

For two vertices u and v, say that u ~ v just when there is a path between u and v. Then ~ is an equivalence relation. (~ is reflexive: the path from u to u is just (u). It should be easy to see that ~ is also symmetric and transitive.)

Use your equivalence manager from the previous assignment to keep track of the equivalence classes of ~. Initially, each vertex is in an equivalence class by itself. When Kruskal's algorithm calls for adding an edge between u and v, tell the equivalence manager to combine u and v. To ask if there is a path between u and v, just ask if u and v are in the same equivalence class.

Do not copy your equivalence manager into your file that implements Kruskal's algorithm. Keep it in separate file equiv.cpp. Just include equiv.h in mst.cpp.

Storing the edges

Store each edge as a structure holding two vertices and a weight. Store the edges of a graph as an array of edge structures.

Sorting an array of edges

You will need to sort the array of edges according to the weights of the edges. There is a function in the C standard library called qsort that will sort an array. It implements a variant of the Quicksort algorithm. You should include header file <cstdlib> to use qsort.

Qsort is a general sorting function, designed to be able to sort any type of array into any desired order. In order to achieve that degree of generality, qsort needs information about the array and how to sort it. It also needs for you to perform some conversions that, in general, would be questionable, but that work correctly here.

Qsort takes four parameters. In general, you write

   qsort(base, numElements, elementSize, compare);
where

To run qsort, define a new type, QSORT_COMPARE_TYPE, as follows. It is the type of the compare parameter that qsort expects to receive.

   typedef int (*QSORT_COMPARE_TYPE)(const void*, const void*);
Now statement
   qsort((void*) Arr, n, sizeof(Edge), (QSORT_COMPARE_TYPE) compareEdges);
will sort array of edges Arr according to weights, with smaller weights toward the beginning of the array.

Warning. Your function, compareEdges, does not have type QSORT_COMPARE_TYPE, since compareEdges takes two parameters of type const Edge*, and qsort says that it should take two parameters of type void*. But you can tell the compiler to ignore this, and to let you pass your compareEdges function to qsort, by doing a cast, which tells the compiler to treat a value of one type as if it is a value of a different type. In most cases, this is a very dangerous thing to do, but you have to do it to use qsort.


Functions and types

Represent a graph as a structure holding four things: (1) the number of vertices, (2) an array that holds the edges, (3) the current number of edges, and (4) the total number of slots for edges that are available in the edge array.

Break your program up into functions, including the following.

  1. A function that adds an edge (given by two vertices and a weight) to a given graph. Just add it to the end of the occupied section of the array. If the array is full, it should refuse to add the edge.

  2. A function that reads information about a graph, in the format shown in the requirements section, and that returns the graph. You will not know in advance how many edges there are. Put the following in your program.

       const int maxEdges = 100;
    
    The reading function can create an array of size maxEdges. It must call the array size maxEdges, not 100. A change to the definition of maxEdges must allow a different number of edges.
  3. A function to print a graph by telling how many vertices it has, how many edges it has, and then showing the edges. It should not print any specific heading. For example, it should not say that it is printing the input graph. Just provide information about the graph.

  4. A function to compute the total weight of a graph by adding up the weights of the edges. It should return the weight, and should not write anything.

  5. A function that takes a graph G as a parameter and returns another graph that is a minimal spanning tree of G. This function contains an implementation of Kruskal's algorithm.


Warnings and additional requirements

edge vs vertex edge array starts at 0. First vertex number is 1.
  1. Do not confuse edges with vertices. Keep track of the number of vertices in a graph separately from the number of edges. Remember that the first number in the input is the number of vertices, not the number of edges.

  2. The vertices are numbered 1, …, n. You will need to have an array of edges. It is a good idea to start numbering the edges from 0, not from 1. So put the first edge that you read at index 0 in the array.

  3. Your mst.cpp file should not make use of the fact that type EM is a pointer or an array. Just use the functions defined in equiv.cpp on EM objects.

  4. Every function is required to have a clear, concise and precise contract.

  5. Each function can have at most one loop in its body.

  6. A function body must not change the value of a call-by-value parameter. Pay attention to this one. In the past, many students have violated this requirement.

  7. Make sure that each function does its whole job., not just part of it job.

  8. Avoid code duplication.

  9. Do not use global or static variables. This is another one to watch out for. Using global or static variables will cost you a very large number of points.

  10. Every body of an if-statement, loop, etc. must be a compound statement.

  11. If code is only performed at the end of the last iteration of a loop, then it should be written after the loop, not inside the loop.


Compiling, linking and running

A Makefile is provided for you. If you put it in the directory that you have created for assignment 4, then you can use the following.

make mst
Just compile mst.cpp and equiv.cpp, if necessary, and link them to create executable file mst.
make test
Do make mst then run the program.
make debug
Do make mst then run the program via the gdb debugger.
make clean
Remove all machine-generated files.

Submitting your work

To turn in your work, log into the Linux system, change your directory for the one for assignment 4, use the following command.

  ~abrahamsonk/3300/bin/submit 4 mst.cpp equiv.cpp equiv.h
After submitting, you should receive confirmation that the submission was successful. If you do not receive confirmation, assume that the submission did not work.

Command

  ~abrahamsonk/3300/bin/submit 4
will show you what you have submitted for assignment 4.

You can do repeated submissions. New submissions will replace old ones.

Late submissions

Late submissions will be accepted for one day after the due date. If you miss a late submission deadline by a microsecond, your work will not be accepted.