10.2.4. Tables and Sets


Tables

Imagine that you want to create a lookup table, such as a telephone book, in your program. You might want to look up a name and find the associated telephone number, or look up a word and find how many times you have seen that word. The value being looked up is called a key.

Typical operations on tables are:

For a given key there can only be one (key, value) pair in the table with that key. After all, in a telephone book, there should only be one entry for a particular name. The insert operation can replace an entry, but it cannot add a second entry for a key. There is no such requirement for the values. Several keys can have the same associated value.


Implementation of tables as arrays

An obvious way to implement a table is to store it in an array. For example, a table whose keys and associated values are both strings can be represented as an array of TableEntry's, where type TableEntry is defined as follows.

  struct TableEntry
  {
    char* key;
    char* value;
  };
At any given point, only part of the array is used, with a vacant part after that for insertions. So the array's logical size can be smaller than its physical size.

But using an array to represent a table has some serious efficiency issues.

Using a linked list would appear to help to do an insertion without moving things around. But binary search requires an array, so that rules out using a linked list.

In the next few pages we explore a way to combine the best features of linked lists and ordered arrays to achieve an implementation of tables that takes time Θ(log2(n)) to look up a key, insert an entry or remove an entry, in the worst case.


Sets

The main issue in a table is keeping track of the keys. The associated values are just attached to them by being in the same structure. If we can find an efficient way to store just the keys, then it is easy to modify that to store (key, value) pairs.

Since storing just the keys is simpler, we look at the problem of storing a set of values. The operations are:

Also, it makes the presentation a little simpler to look at sets of integers instead of sets of strings. Handling strings is just a matter of changing types and using a function such as strcmp to compare strings. You also have to be careful about memory management of the strings in the table, which is not an issue with integers.