10.3.5. Heaps


Heaps

A heap is a data structure that is useful for implementing a priority queue efficiently. There are actually two versions, min-heaps and max-heaps. We will start with min-heaps.

A heap is a binary tree. Each node in the tree holds an item and a priority. But it is simpler to concentrate on only one of those, the priority. Imagine the item to be attached to it but not written down. For simplicity we assume that the priorities are integers. They could just as well be real numbers or even strings.

There are two requirements that a min-heap must meet.

  1. (Ordering requirement) The priority in a node must be no larger than the priority in each of that node's children (if it has any children). Duplicate priorities are allowed.

  2. (Structural requirement) The levels of a tree are defined by their distance from the root. The root is the only node in level 0. The children of the root form level 1. The nodes at distance 2 from the root form level 2, etc. The structural requirement is that each of the following is true.

    1. Every level of the tree must be completely full, having the maximum number of nodes that it can possibly have, with the possible exception of the highest numbered level, which we call the bottom level.
    2. Nodes in the bottom level must be as far to the left as possible. There must be no gaps between nodes.

For example, the following is a min-heap. (Only the priority in each node is shown.)

               5
              / \
             /   \
           10     8
          /  \   /
         15  30 10
Notice that it obeys both the ordering and the structural requirement.


Implementing a heap as an array

The rigid structural requirement of a heap allows us to store a heap in an array. Number the nodes (from 0) starting at the root and working left-to-right across each level. The number is called the index of the node. The following shows indices in a small heap.

                 0
                / \
               /   \
              1     2
             / \   / \
            3   4 5   6
           / \
          7   8 
The idea is to store the node with index i of a heap at index i in an array. That allows us to move up and down in the heap easily, using the following functions.
  //===================================================
  //                leftchild
  //===================================================
  // leftchild(i) is the index of the left child of the
  // node with index i.
  //===================================================

  int leftchild(int i)
  {
    return 2*i + 1;
  }

  //===================================================
  //                rightchild
  //===================================================
  // rightchild(i) is the index of the right child of the
  // node with index i.
  //===================================================

  int rightchild(int i)
  {
    return 2*(i + 1);
  }

  //===================================================
  //                parent
  //===================================================
  // parent(i) is the index of the parent of the node
  // with index i.
  //===================================================

  int parent(int i)
  {
    return (i-1)/2;
  }


The PriorityQueue type

Our goal is to implement a priority queue using a heap. Here is a beginning, where a priority queue holds an array with its physical and logical sizes.

  const int initHeapSize = 20;

  struct ItemAndPriority
  {
    int item;
    int prioirty;
  };

  struct PriorityQueue
  {
    ItemAndPriority* A;  // The heap
    int load;            // The number of items in the heap
    int physicalSize;    // The physical size of A.

    PriorityQueue()
    {
      load         = 0;
      physicalSize = initHeapSize;
      A            = new ItemAndPriority[initheapSize];
    }
  };


Inserting into a heap

There are two steps to inserting something into a min-heap.

  1. Add a new node at the leftmost unoccupied position in the bottom level and put the new priority and item in that node. (If the bottom level is full, add the new node at the leftmost position in a new bottom level.)

  2. After inserting the new node, the ordering requirement might not be met. Perform a reheapUp operation to move the inserted item up the tree as high as it needs to go. (If the new value is smaller than its parent's value, swap with the parent and continue to move it upwards. It cannot move above the root.)

For example, here is the process of inserting priority 4 into a heap. The reheapUp operations includes all of the swap steps.

               5                         5                           5                          4
              / \                       / \                         / \                        / \
             /   \    insert 4         /   \     swap 4 with       /   \    swap 4 with       /   \
           10     8   ------------>  10     8    ------------>   10     4   ------------>   10     5
          /  \   /    into bottom   /  \   / \   its parent     /  \   / \  its parent     /  \   / \
         15  30 25    level        15  30 25  4                15  30 25  8               15  30 25  8
A reheapUp operations does not always move a value all the way to the root. For example, here is a similar insertion that inserts 6 instead of 4.
               5                         5                           5
              / \                       / \                         / \
             /   \    insert 6         /   \     swap 6 with       /   \
           10     8   ------------>  10     8    ------------>   10     6
          /  \   /    into bottom   /  \   / \   its parent     /  \   / \
         15  30 25    level        15  30 25  6                15  30 25  8

Implementation of reheapUp and insert are fairly simple.

  //===========================================
  //                 swap
  //===========================================
  // swap(x,y) swaps variables x and y.
  //===========================================

  void swap(ItemAndPriority& x, ItemAndPriority& y)
  {
    ItemAndPriority t = x;
    x = y;
    y = t;
  }

  //===========================================
  //              reheapUp
  //===========================================
  // reheapUp(i,A) does a reheapUp operation on
  // the heap represented by array A,
  // starting at index i.
  //===========================================

  void reheapUp(int i, ItemAndPriority A[])
  {
    int j = i;
    while(j > 0)
    {
      int p = parent(j);
      if(A[p].priority > A[j].priority)
      {
        swap(A[p], A[j]);
        j = p;
      }
      else
      {
        break;
      }
    }
  }

  //===========================================
  //            enlarge
  //===========================================
  // Move q into a larger array.
  //===========================================

  void enlarge(PriorityQueue& q)
  {
    int              n    = q.physicalSize;
    ItemAndPriority* newA = new ItemAndPriority[2*n];

    for(int i = 0; i < q.load; i++)
    {
      newA[i] = q.A[i];
    }
    delete[] q.A;
    q.A            = newA;
    q.physicalSize = 2*n;
  }

  //===========================================
  //            insert
  //===========================================
  // Insert item x with priority p into
  // priority queue q.
  //===========================================

  void insert(int x, int p, PriorityQueue& q)
  {
    if(q.load == q.physicalSize)
    {
      enlarge(q);
    }

    int n = q.load;
    q.A[n].priority = p;
    q.A[n].item     = x;
    q.load          = n+1;
    reheapUp(n, q.A);
  }


Removal from a heap

Recall that we can only remove the value with the smallest priority from a priority queue. Since the smallest priority is in the root of a heap, it is the value in the root that we want to remove. But it is not possible to remove the root node. Instead, find the rightmost node on the bottom level. Remove that node, and move its contents into the root, replacing the former contents of the root. But now the heap will not obey the ordering requirement. Push the value in the root downwards by swapping it with whichever of its children has a smaller priority. Keep doing this until this value either reaches a leaf or it reaches a node where the values beneath it are no smaller than it. Here is an example of a removal process.

               5                         25                          10                          10
              / \                       / \                         / \                         / \
             /   \    move 25 to       /   \     swap 25 with      /   \     swap 25 with      /   \
           10     12  ------------>  10     12   ------------>   25     12   ------------>   15     12
          /  \   /    the root      /  \         10             /  \         15             /  \
         15  30 25                 15  30                      15  30                      25  30
The swaps are called a reheapDown operation. Implementation of reheapDown is a little tricky because there are three kinds of nodes: those with two children, those with just a left child and those with no chilren.
  //===================================================
  //                  reheapDown
  //===================================================
  // reheapDown(i, A, n) does a reheapDown operation on
  // the heap represented by array A, starting at
  // index i.  Parameter n is the number of values
  // currently in the heap.
  //===================================================

  void reheapDown(int i, ItemAndPriority A[], int n)
  {
    int p = A[i].priority;
    int j = i;

    while(leftchild(j) < n)  // j is not a leaf
    {
      int lc    = leftchild(j);
      int rc    = rightchild(j);
      int pleft = A[lc].priority;

      if(rc < n)  // j has two children
      {
        int pright = A[rc].priority;
        int m      = min(p, min(pleft, pright));
        if(m == p)
        {
          break;
        }
        else if(m == pleft)
        {
          swap(A[j], A[lc]);
          j = lc;
        }
        else
        {
          swap(A[j], A[rc]);
          j = rc;
        }
      }
      else  // j has only a left child.
      {
        if(p > pleft)
        {
          swap(A[j], A[lc]);
        }
        break;
      }
    }
  }

  //===================================================
  //                  remove
  //===================================================
  // remove(x, p, q) removes the pair with the smallest
  // priority from priority queue q.  It stores the
  // item and priority removed into out-parameters
  // x and p.
  //
  // If q is empty, remove sets x and p each to -1.
  //===================================================

  void remove(int& x, int& p, PriorityQueue& q)
  {
    int n = q.load;

    if(n == 0) 
    {
      x = p = -1;
    }
    else
    {
       x      = q.A[0].item;
       p      = q.A[0].priority;
       q.A[0] = q.A[n-1];
       q.load = n-1;
       reheapDown(0, q.A, q.load);
    }
  }


Cost of heap operations

Each level in a heap has twice as many nodes as the level above it. Imagine a heap whose bottom level has depth d. If that level is full, then it contains 2d nodes. About half the total number of nodes in the heap are in the bottom level, when the bottom level is full. So if the heap has n nodes, then n is about 2d+1. That is, d is about log2(n) − 1.

Even if the bottom level is not full, it is of little consequence since that only adds one more to the depth. So we can realistically approximate the depth by log2(n).

The time required for insertion and deletion are dominated by the time required to do the reheapUp or reheapDown operations, whose cost is, in the worst case, proportional to the total depth. So insertion and deletion are both done in time O(log2(n)).


Exercises

  1. What min-heap do you get if you insert 9 into the following min-heap?

              7
             / \
            /   \
          15     8
         /  \   / \
        20  16 50  11
    
    Answer

  2. What min-heap do you get if you insert 5 into the following min-heap?

              7
             / \
            /   \
          15     30
         /  \
        20  16
    
    Answer

  3. What min-heap do you get if you remove the smallest value from the following min-heap?

              7
             / \
            /   \
          15     8
         /  \   / \
        20  16 50  11
    
    Answer

  4. What heap do you get if you remove the smallest value from the following min-heap?

              5
             / \
            /   \
           8     9
          / \   /
        20  10 50
    
    Answer