Imagine that you want to create a lookup table, such as a telephone book, in your program. You might want to look up a name and find the associated telephone number, or look up a word and find how many times you have seen that word. The value being looked up is called a key.
Typical operations on tables are:
An obvious way to implement a table is to store it in an array. For example, a table whose keys and associated values are both strings can be represented as an array of TableEntry's, where type TableEntry is defined as follows.
struct TableEntry { char* key; char* value; };At any given point, only part of the array is used, with a vacant part after that for insertions. So the array's logical size can be smaller than its physical size.
But using an array to represent a table has some serious efficiency issues.
Suppose that the entries are stored in an arbitrary order. Then adding a (key, value) pair is easy: add it to the end of the used part. But to look up a key requires searching the array. If there are n entries currently in the table then the time required in the worst case is Θ(n). (Because the time is a linear function of n, this algorithm is called linear search.) Removing a key is also difficult, since it requires finding the key first.
Suppose that the entries are stored in order by key. For example, when the keys are strings, the entries can be stored in alphabetical order. That makes lookup much more efficient since you can use binary search, which only takes time Θ(log2(n)) in the worst case. But now adding an entry is slow because you need to move other entries in order to make room for the new entry. The time to insert or remove a value is Θ(n) in the worst case.
Using a linked list would appear to help to do an insertion without moving things around. But binary search requires an array, so that rules out using a linked list.
In the next few pages we explore a way to combine the best features of linked lists and ordered arrays to achieve an implementation of tables that takes time Θ(log2(n)) to look up a key, insert an entry or remove an entry, in the worst case.
The main issue in a table is keeping track of the keys. The associated values are just attached to them by being in the same structure. If we can find an efficient way to store just the keys, then it is easy to modify that to store (key, value) pairs.
Since storing just the keys is simpler, we look at the problem of storing a set of values. The operations are: