Computer Science 2530
Spring 2017
Programming Assignment 3

Assigned: Thursday, February 9
Due: Monday, February 20, 11:59pm

Table of Contents

  1. Purpose of this assignment
  2. Equivalence relations
  3. An equivalence relation manager
  4. Implementation of an equivalence relation manager
  5. Linkage and testing
  6. Things to watch out for
  7. Submitting your work
  8. Asking questions
  9. Extra credit: improving the equivalence manager.


Purpose of This Assignment

This assignment exercises your knowledge of arrays and functions. It also gives you experience implementing a module that is not an application but is intended to be part of an application.

It is important for you to follow the design described here. Do not make up your own algorithm. Implement exactly the functions that are indicated. Keep the parameter order as shown here. If you change the parameter order, your module will not compile correctly with my tester.


Equivalence Relations

An equivalence relation ≡ is a relation with the following properties for all x, y and z.

  1. ≡ is reflexive: xx.
  2. ≡ is symmetric: If xy then yx.
  3. ≡ is transitive: If xy and yz then xz.

An equivalence relation on a set always partitions the set into equivalence classes, where members of a given equivalence class are equivalent to one another and members of different equivalence classes are not equivalent. For example, suppose that ≡ is a relation on set {1, 2, 3, 4, 5, 6} with equivalence classes {1, 3, 4}, {2, 6}, {5}. Then 1 ≡ 4 and 2 ≡ 6 but 2 is not equivalent to 3.


An Equivalence Relation Manager

Your goal for this assignment is to create a tool that manages an equivalence relation that can be changed by the program. The equivalence relation is always over a set of integers {1, 2, 3, …, n} for some n.

This tool is not a complete program. It is intended to be part of a larger program. It must not have a main function.

For this assignment, an equivalence relation has type ER. A module that uses this tool can create an equivalence relation called e by saying

  ER R = newER(n);
where n is an integer. Initially, each number is in its own equivalence class; the equivalence classes are {1}, {2}, …, {n}. There are two operations that can be performed.

  1. equivalent(R, x, y) yields true if x and y are currently in the same equivalence class in equivalence relation R.

  2. merge(R, x, y) modifies equivalence relation R by making x and y equivalent. It combines the equivalence class that contains x with the equivalence class that contains y. The merge function does not yield an answer.

For example, suppose that n = 7. The following shows a sequence of operations and shows the equivalence classes after each merge operation.

ER R = newER(7)
R = {1} {2} {3} {4} {5} {6} {7}
merge(R, 1,5)
R = {1, 5} {2} {3} {4} {6} {7}
merge(R, 2,7)
R = {1, 5} {2, 7} {3} {4} {6}
equivalent(R, 1,5) yields true
equivalent(R, 1,7) yields false
merge(R, 5,7)
R = {1, 2, 5, 7} {3} {4} {6}
equivalent(R, 2,5) yields true
equivalent(R, 2,3) yields false
merge(R, 2,3)
R = {1, 2, 3, 5, 7} {4} {6}
equivalent(R, 3,7) yields true
equivalent(R, 4,7) yields false
merge(R, 2,3)
R = {1, 2, 3, 5, 7} {4} {6}

As you can see from the last step, it is allowed to merge two values that are already equivalent. That should cause no change.


Implementation of an Equivalence Relation Manager

You will not store the equivalence classes directly. Instead, you will store them implicitly, using the following ideas. You are required to implement an equivalence manager in this way. You will receive no credit for a module that does not follow this algorithm.

  1. Each equivalence class has a leader, which is one of the members of that equivalence class. You will create a function leader(R, x) that yields the current leader of the equivalence class that contains x in equivalence relation R. Two values are equivalent if they have the same leader.

  2. There is another idea that is similar to a leader, but not exactly the same. Each value has a boss, which is a value in its equivalence class. Let's write boss[x] for x's boss.

    If x is the leader of its equivalence class then boss[x] = 0, indicating that x does not have a boss. If x is not the leader then boss[x] ≠ x and boss[x] is closer to the leader. If you look at the values x, boss[x], boss[boss[x]], boss[boss[boss[x]]], … then you will eventually encounter x's leader.

Use an array to store the bosses. Write the following functions.

  1. newER(n) returns an array of n+1 integers. Allocate the array in the heap. This array will be used to store the bosses. So, if R has type ER then R[x] is x's boss. (If you prefer, you can call an equivalence relation boss. Then boss[x] is x's boss.)

    In C++, arrays start at index 0. You will use indices 1, … n, so you need to allocate n+1 slots. (Index 0 will not be used.)

    Initialize the array so that each value is a leader of its own group. That is, boss[x] = 0 for x = 1, …, n.

  2. leader(R, x) returns the leader of x in equivalence relation R. To compute x's leader, just follow the bosses up to the leader. You are required to use this function any time you want to find the leader of a value.

  3. equivalent(R, x, y) returns true if x and y are currently in the same equivalence class in equivalence relation R. (They are in the same equivalence class if they have the same leader.)

  4. merge(R, x, y) merges the equivalence classes of x and y in R as follows. First, it finds the leaders x′ and y′ of x and y. If x′ and y′ are different (so x and y are not already in the same equivalence class) then y′ becomes the new boss of x′. (How can you changes x′'s boss?) So y′ becomes the leader of the combined equivalence class.

  5. destroyER(R) deallocates R.

  6. showER(R, n) prints the entire contents of array R (of size n) in a readable form for debugging.

Important note. It is crucial that your program never change the boss of a value that is not a leader.


Linkage and Testing

Create a directory to hold assignment 3. Put your function definitions in a file called equiv.cpp. Use the standard template. Notice that equiv.cpp must not contain a main function. It is intended to be a tool for a larger program to use, not a separate application.

Create another file called equiv.h (a header file) containing the following, except that you should edit the comment.

// CSCI 2530
// Assignment: 3
// Author:     ***
// File:       equiv.h
// Tab stops:  ***

// These #ifndef and #define lines make it so that, if this file is
// read more than once by the compiler, its body is skipped on all
// but the first time it is read.

#ifndef EQUIV_H
#define EQUIV_H

// An equivalence relation is an array of integers.
// So ER abbreviates int*.  

typedef int* ER;

// Public functions

ER   newER      (int n);
void destroyER  (ER e);
bool equivalent (ER e, int x, int y);
void merge      (ER e, int x, int y);

// The following is advertised here solely for debugging.  These must
// only be used for debugging.

void showER(ER e, int n);
int  leader(ER e, int x);

#endif

Testing Your Equivalence Manager

Create another file containing a main function that tests your equivalence relation manager. It should have the following line near its top.

#include "equiv.h"
That will allow the tester to use the functions from equiv.cpp. Test everything thoroughly, to the point that you are sure that your equivalence relation manager works. Do not be satisfied with a lazy test.

A Makefile is provided that assumes your tester is called testequiv.cpp. You can use the following commands with this Makefile.

make equiv.o
Just compile equiv.cpp, if it needs compiling.

make testequiv.o
Just compile the tester, if it needs compiling.

make testequiv
Compile both files, if they need compiling, and link them to create a executable file called testequiv

make test
Do make testequiv and then run the tester.

make debug
Do make testequiv then run the tester via the gdb debugger.

make clean
Remove all machine-generated files.

Things to Watch Out for

Your equiv.cpp and equiv.h function must follow the coding standards, which include the following.

  1. An equivalence relation has type ER. Any time you need to write the type of an equivalence relation, write ER, not int*.

  2. Every function is required to have a clear, concise and precise contract. Pay attention to this. In the past, students have lost a lot of points because they did not put enough care into writing clear, understandable contracts.

  3. Indent well. Put a blank line before and after each contract.

  4. Avoid long lines. Limit lines to about 80 characters.

  5. Each function can have at most one loop in its body.

  6. A function body must not change the value of a call-by-value parameter. Pay attention to this one. In the past, many students have lost points for violating this requirement.

  7. Avoid code duplication.

  8. Do not use global or static variables. This is another one to watch out for. Using global or static variables will cost you a very large number of points.

  9. The body of every if-statement, loop, etc. must be a compound statement.

  10. Do not use redundant tests in if-statements.

  11. You can use a boolean value like any other value. Do not force a boolean value to be a condition in an if-statement where that is not appropriate.

  12. Be cautious not to use a nonexistent index of an array. If array A has size n, then A[n] does not exist.


Submitting Your Work

You must submit your program using the following method. Email submissions will not be accepted. An excuse that you do not know how to use Linux will not be accepted.

To turn in your work, log into the Linux system, change your directory for the one for assignment 3, use the following command.

  ~abrahamsonk/2530/bin/submit 3 equiv.cpp equiv.h
After submitting, you should receive confirmation that the submission was successful. If you do not receive confirmation, assume that the submission did not work.

Command

  ~abrahamsonk/2530/bin/submit 3
will show you what you have submitted for assignment 3.

You can do repeated submissions. New submissions will replace old ones.

Late Submissions

Late submissions will be accepted for 24 hours after the due date. If you miss a late submission deadline by a microsecond, your work will not be accepted.


Asking questions

To ask a question about your program, first submit it, but use assignment name q3. For example, use command

  ~abrahamsonk/2530/bin/submit q3 equiv.cpp equiv.h
Then send me an email with your question. Do not expect me to read your mind. Tell me what your questions are. I will look at the files that you have submitted as q3. If you have another question later, resubmit your new file as assignment q3.


Extra Credit: Improving the Equivalence Relation Manager.

For extra credit (up to 20 points), implement two improvements on the equivalence manager algorithm.

But be careful. Make sure that they work and that they do not wreck the equivalence relation manager. You will get no extra credit for an improvement that does not work. If your "improvements" wreck the equivalence relation manager, you will received less credit than you would for a working equivalence relation manager without the improvements.

You can only receive extra credit if your program passes the tests that I run on it.

The First Improvement: Improving the Leader Function (5 pts)

The first improvement involves a change to the leader function. After the leader function scans through a chain of boss links to find the leader of a value, it should go back through the chain and put the current leader in the boss array for every number that was looked at in the chain. That way, subsequent leader computations will reach the leader much more quickly. For example, if the R array contains

  R[1] = 2
  R[2] = 4
  R[3] = 3
  R[4] = 6
  R[5] = 3
  R[6] = 6
  R[7] = 4
  R[8] = 5
  R[9] = 1
then computing the leader of 1 requires chaining through 1, 2, 4 and 6, stopping at 6. The improvement changes the contents of the array to the following, by installing the correct leader (6) of each of the numbers that was looked at.
  R[1] = 6
  R[2] = 6
  R[3] = 3
  R[4] = 6
  R[5] = 3
  R[6] = 6
  R[7] = 4
  R[8] = 5
  R[9] = 1
You can do this with another loop that rescans through the chain, the same way the chain was scanned the first time, but now putting the leader into the array as you go. Alternatively, make the leader function be recursive, and just change the boss after each recursive call.

Notice that we have not scanned the entire array from beginning to end! R[8] is still 5, even though the leader of 8 is 3. R[9] is still 1, since 9 was not encounted in the scan from 1 to 6. Only the numbers that were looked at in the original scan have their boss values changed. If you try to change everything in the array, you make the implementation slower, not faster.

Also notice that it was not just the boss of 1 that was changed. All of the numbers that were examined in the chain have their bosses set to their leaders. For example R[2] was changed too.

Be sure to test your improved leader function. It is easy to write it incorrectly. Try it by hand to see whether it seems to do the right thing.

The Second Improvement: Improving Merge (15 pts)

To do this improvement you will need to use structures, which we will cover a little later.

Each number that is a leader has a collection of constituents, namely the members of its equivalence class. For example, in the above array, number 3 is a leader, and the constituents of 3 are 3, 5 and 8, so 3 has three constituents, counting itself.

When doing a merge operation, you find two values s and t that are leaders. You can then either change the boss of s to t (so s is no longer a leader) or change the boss of t to s (so t is no longer a leader). Either one will accomplish the goal of merging the two sets, but the choice of which to do influences the efficiency of the implementation. The best choice is to change the boss of the number that has the fewest constituents. That tends to keep the lengths of the boss chains up to the leader short.

Modify your ER type so that it is an array of structures. Each structure holds a boss and a constituent count. So R[k].boss is the boss of k in equivalence relation R. If k is a leader then R[k].numConstituents is the size of the equivalence class that k leads. (If k is not a leader then R[k].numConstituents should be 0, since a nonleader has no constituents.)

A picture of the initial array, before any merges have been done, might look like this. Notice that each number has one constituent, itself.

  index    boss  numConstituents
    1        1           1
    2        2           1
    3        3           1
    4        4           1
    5        5           1
    6        6           1
    7        7           1

If you do merge(R, 3, 5) with the above array, you might arbitrarily decide to make the boss of 3 be 5, since they have the same constituent count. Then the array looks like this.

  index   boss  numConstituents
    1        1           1
    2        2           1
    3        5           0
    4        4           1
    5        5           2
    6        6           1
    7        7           1
If you now do merge(R, 5, 1), you must change the boss of 1, since it has fewer constituents than 5. The array ends up looking like this.
  index   boss  numConstituents
    1        5           0
    2        2           1
    3        5           0
    4        4           1
    5        5           3
    6        6           1
    7        7           1
As before, only change the boss of a number that is currently a leader. If you now do merge(R, 2, 1), you must realize that you are really being asked to merge 2 and 5, since 5 is the leader of 1. Since 5 has more constituents, you change 2's boss, yielding
  index   boss  numConstituents
    1        5           0
    2        5           0
    3        5           0
    4        4           1
    5        5           4
    6        6           1
    7        7           1
As you can see, this improvement tends to lead to shorter chains of bosses before the leader is found.

Suppose you continue by combining 6 and 7. You might get the following.

  index   boss  numConstituents
    1        5           0
    2        5           0
    3        5           0
    4        4           1
    5        5           4
    6        7           0
    7        7           2
Now merge 1 and 6. Their leaders are 5 and 7. Since 5 has more constituents, change the boss of 7 to be 5. The new information is as follows.
  index   boss  numConstituents
    1        5           0
    2        5           0
    3        5           0
    4        4           1
    5        5           6
    6        7           0
    7        5           0
Now 5 has six constituents. Although 6 is one of 5's constituents, its boss is still 7. The boss chains are shorter, but you still need to do the leader calculation using a loop (or recursion). A value's boss is not necessarily its leader.

Efficiency Before and After Improvements

If you do neither improvement, then the equivalence relation manager can take time proportional to n2 to process n merge and equivalent requests. With both improvements, the time is no worse than proportional to n α(n) where α(n), the inverse of Ackerman's function, is a very slowly growing function of n, so slow that, for all remotely practical values of n, α(n) ≤ 6.

(In the past, most students who did this improvement using a loop got it wrong, so be careful. Most students who did it with recursion got it right.)