Computer Science 2530
Spring 2017
Programming Assignment 7

Assigned: Monday, March 27
Due: Tuesday, April 11, 11:59pm

Table of Contents

  1. Purpose of this assignment
  2. Longest increasing sublists
  3. An algorithm to compute longest increasing sublists
  4. Storing lists backwards
  5. The assignment
  6. Compiling, linking and running
  7. Suggestions and additional requirements
  8. Memory sharing and reference counts
  9. Things to watch out for
  10. Submitting your work
  11. Asking questions


Purpose of This Assignment

This assignment involves linked lists with memory sharing. It also requires you to develop your program with less hand holding. You will need to draw pictures and think things out.

Because you have more flexibility in the design of this program, you are not required to follow the design precisely. But use it as advice.

An algorithm is described here, and you are required to use that algorithm. Do not invent a new algorithm. It is fairly simple but is probably not one that you would intuitively find.

It is crucial that you follow guidelines. Stay out of the swamp.


Longest Increasing Sublists

Suppose that s is a list of values. A sublist of s is a list that you can obtain by removing zero or more values from s. You cannot reorder values. For example, if s is [3, 5, 4, 8, 2, 9, 6] then [3, 5], [4, 2, 9] and [3, 4, 8, 9] are all sublists of s. Notice that the members of a sublist are not required to contiguous in the original list. ("Contiguous" means without gaps.)

An increasing sublist of s is a sublist of s that is in strictly ascending order.

A longest increasing sublist of s is an increasing sublist of s that has the greatest length among all increasing sublists of s. For example, if s is [3, 5, 4, 8, 2, 9, 6] then a longest increasing sublist of s is [3, 4, 8, 9].

There can be more than one longest increasing sublist of a list. For example, [3, 5, 8, 9] is also a longest increasing subsequence of [3, 5, 4, 8, 2, 9, 6]. They are considered equally good longest increasing sublists, and we are only interested in finding one of them.


An Algorithm to Compute Longest Increasing Sublists

Naive approaches to finding longest increasing sublists do not work. In order to get an algorithm that works for every list, we employ a powerful algorithm design technique called dynamic programming.

Best increasing sublists

For an integer k and list s, say that a best increasing sublist of length k is an increasing sublist of s having length k that ends on the smallest possible value.

For example, if s is [3, 5, 4, 8, 2, 9, 6] then a best increasing sublist of s of length 1 is [2]. A best increasing sublist of s of length 2 is [3, 4].

An example illustrating the algorithm

Our algorithm does a scan of s, looking at longer and longer prefixes of s, until it has looked at the entire list.

For each prefix of s, the algorithm computes a best increasing sublist of each length from 1 to the longest achievable length. It stores them all so that, when the length of the prefix is increased by adding one value, updating the information is easy.

Let's illustrate with list [3, 5, 4, 8, 2, 9, 6]. Term bis abbreviates best increasing sublist.

Prefix Bis's
[]
k bis of length k
0 []
[3]
k bis of length k
0 []
1 [3]
[3, 5]
k bis of length k
0 []
1 [3]
2 [3, 5]
[3, 5, 4]
k bis of length k
0 []
1 [3]
2 [3, 4]
[3, 5, 4, 8]
k bis of length k
0 []
1 [3]
2 [3, 4]
3 [3, 4, 8]
[3, 5, 4, 8, 2]
k bis of length k
0 []
1 [2]
2 [3, 4]
3 [3, 4, 8]
[3, 5, 4, 8, 2, 9]
k bis of length k
0 []
1 [2]
2 [3, 4]
3 [3, 4, 8]
4 [3, 4, 8, 9]
[3, 5, 4, 8, 2, 9, 6]
k bis of length k
0 []
1 [2]
2 [3, 4]
3 [3, 4, 6]
4 [3, 4, 8, 9]

Observe that, each time the prefix length is increased by one

  1. Sometimes, one of the existing bis's needs to be changed. But you never need to change more than one bis.

  2. Sometimes, none of the existing bis's needs to be changed. In that case, a new bis is added to the end, increasing the maximum bis length.

Also notice that the last values in the bis's strictly increase as the length k increases. For example, in the last row of the table, for prefix [3, 5, 4, 8, 2, 9, 6], the last members of the bis's are 2, 4, 6, 9, in that order. That must always be the case.

Determining how to do an update to the bis array

It is easy to find out what needs to be done. When adding new value x to the end of the current prefix of s, do the following. We illustrate for adding 4 to prefix [3, 5] to get new prefix [3, 5, 4].

  1. Try to find the longest currently stored bis B that ends on a value that is less than x, if there is one, and let be its length. For example, we are adding 4, and the currently stored bis's are

    k bis of length k
    1 [3]
    2 [3, 5]

    The longest stored bis that ends on a value that is less than 4 is [3], of length 1. So B = [3] and is 1.


    If none of the stored bis's end on a value that is less than x, then B = [ ] and = 0.


  2. Create a new list Q by adding x to the end of B. Store list Q as the new bis of length +1. If we already had a bis of length +1, list Q replaces it. If we did not already have one, then list Q is added as a new bis of length +1.

The algorithm goes through all of the prefixes until it has reached the entire list. Then it selects the bis of the longest length from the collection of stored bis's. That bis is a longest increasing sublist.

The algorithm does not need to store the prefix explicitly. All the algorithm needs at a given step is the new value x that is being added to the prefix. So it just does a loop, looking at each value in the original sequence.


Storing Lists Backwards

The algorithm actually stores all of the bis's backwards. There is a big advantage to doing that. When lists are stored in forward order, you update a list by adding a value to the end of a list. That is expensive. When bis's are stored backwards, you update a list by adding a value to its beginning. That is cheap.

Let's use bbis to mean a backwards bis. Here is the preceding example, [3, 5, 4, 8, 2, 9, 6], using bbis's.

Prefix Bbis's
[]
k bbis of length k
0 []
[3]
k bbis of length k
0 []
1 [3]
[3, 5]
k bbis of length k
0 []
1 [3]
2 [5, 3]
[3, 5, 4]
k bbis of length k
0 []
1 [3]
2 [4, 3]
[3, 5, 4, 8]
k bbis of length k
0 []
1 [3]
2 [4, 3]
3 [8, 4, 3]
[3, 5, 4, 8, 2]
k bbis of length k
0 []
1 [2]
2 [4, 3]
3 [8, 4, 3]
[3, 5, 4, 8, 2, 9]
k bbis of length k
0 []
1 [2]
2 [4, 3]
3 [8, 4, 3]
4 [9, 8, 4, 3]
[3, 5, 4, 8, 2, 9, 6]
k bbis of length k
0 []
1 [2]
2 [4, 3]
3 [6, 4, 3]
4 [9, 8, 4, 3]

The Assignment

Claude is a mountain climber. He has a book of mountains; each mountain has a name (a string) and an elevation (an integer). He wants to climb some of the mountains listed in the book, with the following requirements.

  1. He wants to climb mountains in increasing order of elevation.

  2. He wants to climb mountains in the same order in which they are listed in the book (which is not necessarily in increasing order by elevation).

  3. He wants to climb as many mountains as possible, subject to requirements (1) and (2).

Notice that constraints 1 and 2 indicate that the list of mountains to climb must be ordered simulataneously by two different orderings.

Input format and source

The input is required to come from the standard input.

The input has one line for each entry in the book, in the same order as the book. Each line contains a mountain name and an elevation. The mountain name is a string that does not contain any spaces and is no more than 50 characters long. For example, the input might be as follows.

Eiger            3970
Everest          8848
Denali           5500
Fuji             3776
GangkharPuensum  7570
K2               8611
Kilamanjaro      5895
Matterhorn       4478
Olympus          2917
Tocopuri         5808
Tupungato        6570
Whitney          4421

Note. The input ends at the end of the file. But remember that the program reads from the standard input. Do not make it read from a fixed file. I recommend that you redirect the standard input to a file. If you want to type in an input, type control-D for end-of-file.

You can test for end-of-file in your program using feof(stdin), which is true if there has been a previous attempt to read beyond the end of the file. Be careful to test this after trying to read something.

Output format

List the mountains to climb, in order, giving each in a format similar to the input. For example, the output might be as follows.

Fuji             3776
Matterhorn       4478
Tocopuri         5808
Tupungato        6570

Line up the names and the elevations in columns.


Compiling, Linking and Running

Call your program lis.cpp. A Makefile is provided for you. You can use the following commands with it.

make lis
Just compile lis.cpp, if necessary, to create executable file lis.

make test
Do make lis then run it.

make debug
Do make lis then run it via the gdb debugger.

make clean
Remove all machine-generated files.


Suggestions and Additional Requirements

Read the following. Follow these guidelines.

You will need to work out how to break your program into sensible functions. Do not jump into coding. Work out the design first. What functions will you write? Keep them short and simple. Keep their jobs focused.

Suggestions and requirements

1. Storing the book of mountains

Store the mountain names and elevations in an array of structures. While solving the longest increasing sublist problem, store the indices of the mountains in the best increasing sublists.

2. Computing Longest Increasing Sublists

Keep one array of linked lists to store the bbis's. Store NULL in all of the variables in that array. Also keep the length of the longest bis sequence that is currently stored in the array. It is initially 0.

Follow the longest increasing sublist algorithm above.

You will want a constructor for your ListCell type that plays to role of cons. Use that to add a value to the beginning of a list.

You would do well to write a function, lowerElevation(i, j, mountains), that returns true if mountains[i] has a lower elevation than mountains[j].

Do not try to delete any list cells. That is discussed below.

I strongly recommend a separate function to find the length of the bbis that needs to be extended. Be careful to remember that the mountain that you are considering might be the lowest one seen so far, which means that you extend the list of length 0.

3. Show the Solution

Write a function that shows the mountains in the longest bis. Remember that the list is backwards. Show the mountains in increasing order of elevation (and in forward order according to the book).


Memory Sharing and Reference Counts

Do not proceed to this part until your program is working. Make a copy of your program, so that, if you spoil it in this part, you still have a working copy.

You do not need to copy a bbis list to extend it. Just add the new value to the beginning of the linked list, using cons, or a constructor that plays the same role.

You find that you are using memory sharing. Memory sharing makes it tricky to know when you no longer need a list cell so that you can delete it.

To know when to delete a cell, store a reference count in each list cell. The reference count tells how many pointers are pointing to the list cell. Each time you create a pointer to a list cell, add 1 to that cell's reference count. Each time a pointer to a list cell is lost, for any reason, subtract 1 from that cell's reference count.

If the reference count becomes 0, you can delete the cell. (But remember that a cell contains a pointer to another cell, and that pointer is about to disappear. So deleting one cell involves decreasing the reference count on another cell, if the 'next' pointer is not NULL.)

Adding reference count management

4. Add a reference count field

Add a reference count field to your ListCell type.

5. Add function bump

Define a function bump(L) that adds 1 to the reference count of the cell pointed to by L, provided L is not empty. Make bump(L) do that, and only that.

You will want to use bump in the ListCell constructor. So put a prototype for bump before the definition of ListCell. You will want to say

  struct ListCell;
before the definition of type ListCell so that you can use type ListCell* before type ListCell is defined.

6. Make your ListCell constructor handle reference counts

Suppose list L is not empty. When you create new ListCell(x, L), list L gets another pointer to its first cell. Make the constructor bump that cell's reference count. (Remember that bump knows what to do with an empty list. Trust it.)

The cell that new ListCell(x, L) creates needs to have its reference count set. Set its reference count to 0. The reason why is explained below.

7. Add function drop

Write function drop(L). If L is not empty, it decreases the reference count of the first cell in list L by 1.

Let's write C for the first cell of list L. If the reference count of C becomes 0, then drop(L) must

  1. drop the tail of L, since the pointer stored in C is about to be destroyed.
  2. delete cell C.

Add a trace print to drop so that you can ask drop to tell you when it deletes a cell.

Also add a global variable to control deletion. Drop should only do a delete if that variable is true. Otherwise, drop should not perform the delete.

8. Add function setlist

Add another function, setlist(A, B), that changes list variable A to equal B. (So A must be a reference parameter.) Setlist(A, B) simulates statement A = B;, but it manages reference counts, as follows.

  1. Bump pointer B, to account for the new pointer to B that will be stored into A.
  2. Drop A, since its current value is about to lose a pointer to it (the one stored in variable A.)
  3. Set A = B.

9. Put your functions to work

You will need to use discipline to do this step.

  1. Whenever a variable of type ListCell* is created, set it to NULL.

  2. Whenever a variable of type ListCell* has its value changed, use setlist. So replace

      X = E;
    
    by
      setlist(x, E);
    
    That holds for arrays too. If you see
      A[n] = E;
    
    replace it by
      setlist(A[n], E);
    

10. Test your program

Does you program still produce correct results? Is any cell ever deleted?


Things to Watch Out for

  1. Call your program lis.cpp.

  2. Choose a sensible design, with sensible functions that are short, simple and well motivated. Do not regress to the way you wrote programs before taking this course.

    You will lose points for poor designs.

  3. Test your program on more than one input. In the past, many students have turned in programs that did not produce correct results. Be sure to choose one input that is in ascending order by elevation, so that the longest increasing sublist is the entire list.

  4. Contracts are extremely important for this assignment. They explain your choice of functions. When I grade your program, I will read each function's contract to see what you intend for that function to do.

    In the past, students have lost a lot of points because of poor or missing contracts. If I cannot figure out what a given function is trying to do, I will not give you a good grade for that function.

    Make sure that each function does just what its contract says it does, no less and no more. Make sure that all requirements on the parameters are noted in the contract.

  5. Also document defined types clearly.

  6. This program is supposed to read its input from the standard input. Do not submit an assignment that reads from a file. (I am always perplexed when a student submits a program that reads from a fixed file that only exists on his or her computer.)

  7. As always, follow the coding standards. Do not forget about them. Use the template.

  8. Do not duplicate sections of code unnecessarily. Instead, make a function holding the code and call it twice.

  9. Choose sensible names for types, functions and variables.


Submitting Your Work

To turn in your work, log into the Linux system, change your directory for the one for assignment 7, use the following command.

  ~abrahamsonk/2530/bin/submit 7 lis.cpp
After submitting, you should receive confirmation that the submission was successful. If you do not receive confirmation, assume that the submission did not work.

Command

  ~abrahamsonk/2530/bin/submit 7
will show you what you have submitted for assignment 7.

You can do repeated submissions. New submissions will replace old ones.

Late Submissions

Late submissions will be accepted for 24 hours after the due date. If you miss a late submission deadline by a microsecond, your work will not be accepted.


Asking questions

To ask a question about your program, first submit it, but use assignment name q7. For example, use command

  ~abrahamsonk/2530/bin/submit q7 lis.cpp
Include a data file if appropriate. Then send me an email with your question. Do not expect me to read your mind. Tell me what your questions are. I will look at the files that you have submitted as q7. If you have another question later, resubmit your new file as assignment q7.