Computer Science 3300
Section 001
Fall 2015
Programming Assignment 3

Assigned: Monday, September 21
Due: Wednesday, September 30, 11:59pm

Table of contents

  1. Equivalence relations
  2. An equivalence relation manager
  3. Implementation of an equivalence manager
  4. Linkage and testing
  5. Warnings
  6. Submitting your work
  7. Extra credit: improving the equivalence manager.


Equivalence relations

An equivalence relation is a relation, which can have any name but we call ≡, with the following properties for all x, y and z.

  1. ≡ is reflexive: xx.
  2. ≡ is symmetric: If xy then yx.
  3. ≡ is transitive: If xy and yz then xz.

An equivalence relation on a set always partitions the set into equivalence classes, where members of a given equivalence class are equivalent to one another and members of different equivalence classes are not equivalent. For example, suppose that ≡ is a relation on set {1, 2, 3, 4, 5, 6} with equivalence classes {1, 3, 4}, {2, 6}, {5}. Then 1 ≡ 4 and 2 ≡ 6 but 2 is not equivalent to 3.


An equivalence relation manager

Your goal for this assignment is to create a tool that manages a changing equivalence relation. The equivalence relation is always over a set of integers {1, 2, 3, …, n} for some n. Initially, each number is in its own equivalence class; the equivalence classes are {1}, {2}, …, {n}. There are two operations that you can perform.

  1. together(x, y) yields true if x and y are currently in the same equivalence class.

  2. combine(x, y) makes x and y be equivalent by combining the equivalence class that contains x with the equivalence class that contains y. The combine function does not yield an answer.

For example, suppose that n = 7. The following shows a sequence of operations and shows the equivalence classes after each combine operation.

{1} {2} {3} {4} {5} {6} {7}
combine(1,5)
{1, 5} {2} {3} {4} {6} {7}
combine(2,7)
{1, 5} {2, 7} {3} {4} {6}
together(1,5) yields true
together(1,7) yields false
combine(5,7)
{1, 2, 5, 7} {3} {4} {6}
together(2,5) yields true
together(2,3) yields false
combine(2,3)
{1, 2, 3, 5, 7} {4} {6}
together(3,7) yields true
together(4,7) yields false
combine(2,3)
{1, 2, 3, 5, 7} {4} {6}

As you can see from the last step, it is allowed to combine two values that are already equivalent.


Implementation of an equivalence manager

You will not store the equivalence classes directly. Instead, you will store them implicitly, using the following ideas. You are required to implement an equivalence manager in this way.

  1. Each equivalence class has a leader, which is one of the members of that equivalence class. You will create a function leader(e, x) that yields the current leader of x (in equivalence manager e). Two values are equivalent if they have the same leader.

  2. Each value has a boss, which is a value in its equivalence class. Let's write boss[x] for x's boss. If x is the leader of its equivalence class then boss[x] = x. If x is not the leader then boss[x] ≠ x and boss[x] is closer to the leader, so that if you look at the values x, boss[x], boss[boss[x]], boss[boss[boss[x]]], … then you will eventually encounter the leader (which is its own boss).

Use an array to store the bosses. Write the following functions. EM stands for equivalence manager.

  1. newEM(n) returns an array of n+1 integers. Allocate the array in the heap. This array will be used to store the bosses. In C++, arrays start at index 0. You will use indices 1, … n, so you need to allocate n+1 slots.

    Initialize the array so that each value is a leader of its own group.

    The array returned by newEM is called an equivalence manager.

  2. leader(e, x) is the leader of x in equivalence manager e. (Just follow the bosses up to the leader. You can recognize the leader because its boss is itself.) You are required to use this function any time you want to find the leader of a value.

  3. together(e, x, y) returns true if x and y are currently in the same equivalence class in equivalence manager e. (They are in the same equivalence class if they have the same leader.)

  4. combine(e, x, y) combines the equivalence classes of x and y in e as follows. First, it finds the leaders x′ and y′ of x and y. If x′ and y′ are different (so x and y are not already in the same equivalence class) then y′ becomes the new boss of x′. So y′ becomes the leader of the combined equivalence class.

  5. destroyEM(e) deallocates e.

  6. showEM(e, n) prints the entire contents of array e (of size n) in a readable form for debugging.

It is crucial that you program never change the boss of a value that is not a leader.


Linkage and testing

Put your functions in a file called equiv.cpp. Use the standard template. Notice that equiv.cpp should not contain a main function. It is intended to be a tool for a larger program to use, not a separate application.

Create another file called equiv.h (a header file) containing the follows, except that you should edit the comment.

// CSCI 3300
// Assignment: 3
// Author:     ***
// File:       equiv.h
// Tab stops:  ***

// These #ifndef and #define lines make it so that, if this file is
// read more than once by the compiler, its body is skipped on all
// but the first time it is read.

#ifndef EQUIV_H
#define EQUIV_H

// An equivalence manager is an array of integers.  So EM abbreviates int*.  
typedef int* EM;

EM   newEM    (int n);
void destroyEM(EM e);
bool together (EM e, int x, int y);
void combine  (EM e, int x, int y);

// The following is advertised here solely for debugging.  These must
// only be used for debugging.

void showEM(EM e, int n);
int  leader(EM e, int x);

#endif

Testing your equivalence manager

Create another file containing a main function that tests your equivalence manager. It should have the following line near its top.

#include "equiv.h"
That will allow you to use the functions from equiv.cpp in the tester. Test everything thoroughly, to the point that you are sure that your equivalence manager works.

A Makefile is provided that assumes your tester is called testequiv.cpp. You can use the following commands with this Makefile.

make equiv.o

Just compile the equivalence manager, if it needs compiling.

make testequiv.o

Just compile the tester, if it needs compiling.

make testequiv

Compile both files, if they need compiling, and links them to create a executable file called testequiv

make test

Do make testequiv and then run the tester.

make debug

Do make testequiv then run the tester via the gdb debugger.

make clean

Remove all machine-generated files.

For example, use command

make equiv.o
to compile your equivalence manager.


Warnings

  1. Every function is required to have a clear, concise and precise contract.

  2. Each function can have at most one loop in its body.

  3. A function body must not change the value of a call-by-value parameter. Pay attention to this one. In the past, many students have violated this requirement.

  4. Make sure that each function does its whole job., not just part of it job.

  5. Avoid code duplication.

  6. End the last line of output.

  7. Do not use global or static variables. This is another one to watch out for. Using global or static variables will cost you a very large number of points.

  8. Every body of an if-statement, loop, etc. must be a compound statement.

  9. Do not use redundant tests in if-statements.

  10. If code is only performed at the end of the last iteration of a loop, then it should be written after the loop, not inside the loop.


Submitting your work

You must submit your program using the following method. Email submissions will not be accepted. An excuse that you do not know how to use Linux will not be accepted.

To turn in your work, log into the Linux system, change your directory for the one for assignment 3, use the following command.

  ~abrahamsonk/3300/bin/submit 3 equiv.cpp equiv.h
After submitting, you should receive confirmation that the submission was successful. If you do not receive confirmation, assume that the submission did not work.

Command

  ~abrahamsonk/3300/bin/submit 3
will show you what you have submitted for assignment 3.

You can do repeated submissions. New submissions will replace old ones.

Late submissions

Late submissions will be accepted for one day after the due date. If you miss a late submission deadline by a microsecond, your work will not be accepted.


Extra credit: improving the equivalence manager.

For extra credit (up to 15 points), implement two improvements on the equivalence manager algorithm. But be careful. Make sure that they work and that they do not wreck the equivalence manager. You will get no extra credit for an improvement that does not work.

Improving the leader function

The first improvement involves a change to the leader function. After the leader function scans through a chain of boss links to find the leader of a value, it should go back through the chain and put the current leader in the boss array for every number that was looked at in the chain. That way, subsequent leader computations will reach the leader much more quickly. For example, if the boss array contains

  boss[1] = 2
  boss[2] = 4
  boss[3] = 3
  boss[4] = 6
  boss[5] = 3
  boss[6] = 6
  boss[7] = 4
  boss[8] = 5
  boss[9] = 1
then computing the leader of 1 requires chaining through 1, 2, 4, 6, stopping at 6. The improvement changes the contents of the boss array to the following, by installing the correct leader (6) of each of the numbers that was looked at.
  boss[1] = 6
  boss[2] = 6
  boss[3] = 3
  boss[4] = 6
  boss[5] = 3
  boss[6] = 6
  boss[7] = 4
  boss[8] = 5
  boss[9] = 1
You can do this with another loop that rescans through the chain, the same way the chain was scanned the first time, but now putting the leader into the array as you go. Alternatively, modify the leader function to be recursive, then just change the boss after each recursive call. (In the past, most students who did this improvement using a loop got it wrong, so be careful. Most students who did it with recursion got it right.)

Notice that we have not scanned the entire boss array from beginning to end! Boss[8] is still 5, even though the leader of 8 is 3. Boss[9] is still 1, since 9 was not encounted in the scan from 1 to 6. Only the numbers that were looked at in the original scan have their boss values changed. If you try to change everything in the array, you make the implementation slower, not faster.

Also notice that it was not just the boss of 1 that was changed. All of the numbers that were examined in the chain have their bosses set to their leaders. For example boss[2] was changed too.

Be sure to test your improved leader function. It is easy to write it incorrectly. Try it by hand to see whether it seems to do the right thing.

The second improvement: improving combine

To do this improvement you will need to use structures, which we will cover a little later.

Each number that is a leader has a collection of constituents, namely the members of its equivalence class. For example, in the above array, number 3 is a leader, and the constituents of 3 are 3, 5 and 8, and 3 has three constituents, counting itself.

When doing a combine operation, you find two values s and t that are leaders. You can then either change the boss of s to t (so s is no longer a leader) or change the boss of t to s (so t is no longer a leader). Either one will accomplish the goal of combining the two sets, but the choice of which to do influences the efficiency of the implementation. The best choice is to change the boss of the number that has the fewest constituents. That tends to keep the lengths of the boss chains up to the leader short.

Modify your EM type so that it is an array of structures. Each structure holds a boss and a constituent count. So e.boss[k] is the boss of k in equivalence manager e. If k is a leader then e.numConstituents[k] is the size of the equivalence class that k leads. (If k is not a leader then e.numConstituents[k] should be 0, since a nonleader has no constituents.

A picture of the initial array, before any combines have been done, might look like this. Notice that each number has one constituent, itself.

  index    boss  numConstituents
    1        1           1
    2        2           1
    3        3           1
    4        4           1
    5        5           1
    6        6           1
    7        7           1

If you do combine(e,3,5) with the above array, you might arbitrarily decide to make the boss of 3 be 5, since they have the same constituent count. Then the array looks like this.

  index   boss  numConstituents
    1        1           1
    2        2           1
    3        5           0
    4        4           1
    5        5           2
    6        6           1
    7        7           1
If you now do combine(e,5,1), you must change the boss of 1, since it has fewer constituents than 5. The array ends up looking like this.
  index   boss  numConstituents
    1        5           0
    2        2           1
    3        5           0
    4        4           1
    5        5           3
    6        6           1
    7        7           1
As before, only change the boss of a number that is currently a leader. If you now do combine(e,2,1), you must realize that you are really being asked to combine 2 and 5, since 5 is the leader of 1. Since 5 has more constituents, you change 2's boss, yielding
  index   boss  numConstituents
    1        5           0
    2        5           0
    3        5           0
    4        4           1
    5        5           4
    6        6           1
    7        7           1
As you can see, this improvement tends to lead to shorter chains of bosses before the leader is found.

Suppose you continue by combining 6 and 7. You might get the following.

  index   boss  numConstituents
    1        5           0
    2        5           0
    3        5           0
    4        4           1
    5        5           4
    6        7           0
    7        7           2
Now combine 1 and 6. Their leaders are 5 and 7. Since 5 has more constituents, change the boss of 7 to be 5. The new information is as follows.
  index   boss  numConstituents
    1        5           0
    2        5           0
    3        5           0
    4        4           1
    5        5           6
    6        7           0
    7        5           0
Now 5 has six constituents. Although 6 is one of 5's constituents, its boss is still 7. The boss chains are shorter, but you still need to do the leader calculation using a loop (or recursion). A value's boss is not necessarily its leader.

Efficiency before and after improvements

If you do neither improvement, then the equivalence manager can take time proportional to n2 to process n combine and together requests. With both improvements, the time is no worse than proportional to n α(n) where α(n), the inverse of Ackerman's function, is a very slowly growing function of n, so slow that, for all remotely practical values of n, α(n) ≤ 6.