42B. Merge Sort


An efficient sorting algorithm: merge sort

We can do much better. To sort a list L of length n > 1, break L into halves, where L1 is the first half and L2 is the second half. Sort lists L1 and L2. (A list of length 0 or 1 is already sorted, so it needs no work.) Now we have two sorted lists. All that is left to do is to merge those lists into a single sorted list, taking advantage of the fact that they are already sorted. The following diagram illustrates.
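The split, sort, and merge steps can be sketched in C++ on a std::vector rather than a linked list. This is only an illustration of the idea, not the implementation developed in these notes; the names mergeSorted and mergeSortVector are made up for the example.

```cpp
#include <cstddef>
#include <vector>

// mergeSorted(a,b) requires that a and b are already sorted.
// It returns a new sorted vector holding all members of a and b.
// (Hypothetical helper, for illustration only.)
std::vector<int> mergeSorted(const std::vector<int>& a,
                             const std::vector<int>& b)
{
  std::vector<int> out;
  out.reserve(a.size() + b.size());

  std::size_t i = 0, j = 0;
  while(i < a.size() && j < b.size())
  {
    // Take the smaller front member.  Only the fronts need to be
    // compared because both inputs are already sorted.
    if(a[i] <= b[j]) out.push_back(a[i++]);
    else             out.push_back(b[j++]);
  }
  while(i < a.size()) out.push_back(a[i++]);
  while(j < b.size()) out.push_back(b[j++]);
  return out;
}

// mergeSortVector(v) returns a sorted copy of v: split v into
// halves, sort each half, then merge the two sorted halves.
std::vector<int> mergeSortVector(const std::vector<int>& v)
{
  if(v.size() <= 1)
  {
    return v;           // already sorted
  }
  const std::size_t mid = v.size() / 2;
  const std::vector<int> first(v.begin(), v.begin() + mid);
  const std::vector<int> second(v.begin() + mid, v.end());
  return mergeSorted(mergeSortVector(first), mergeSortVector(second));
}
```

For example, mergeSortVector({5,2,3,6,4}) splits the input into {5,2} and {3,6,4}, sorts them into {2,5} and {3,4,6}, and merges those into {2,3,4,5,6}.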

This algorithm, called Merge Sort, can be shown to take time Θ(n log₂(n)) in the worst case.

Θ(n log₂(n)) is much better than Θ(n²). If n is 10,000, n² is 100,000,000, but n log₂(n) is only about 130,000. (You should be able to show that log₂(10,000) ≈ 13. Use the fact that log₂(1000) is very close to 10.)
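The comparison is easy to check numerically. The two functions below are a small sketch (the names quadraticSteps and linearithmicSteps are made up here) that evaluate n² and n log₂(n) for a given n. They ignore the constant factors that Θ notation hides, so they only indicate rates of growth, not actual running times.

```cpp
#include <cmath>

// quadraticSteps(n) returns n * n, the growth rate of a
// Θ(n²) algorithm, ignoring constant factors.
double quadraticSteps(double n)
{
  return n * n;
}

// linearithmicSteps(n) returns n * log2(n), the growth rate of
// Merge Sort, ignoring constant factors.  Requires n > 0.
double linearithmicSteps(double n)
{
  return n * std::log2(n);
}
```

For n = 10,000, quadraticSteps(n) is 100,000,000 and linearithmicSteps(n) is roughly 132,877, which agrees with the estimate of about 130,000.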


Implementation of merge sort for a linked list

The following is an implementation of Merge Sort. It uses a little trick to avoid the inconvenience of cutting the list in half before sorting the halves. Instead, it sorts only a prefix of L of a given length, and tells its caller what the suffix after that prefix is. For example, if L has length 20, then MergeSort(L, 10) returns a sorted version of the first 10 members of L and sets L to point to the remaining 10 members. Then it is up to the caller to sort those last 10 members.

If you have trouble understanding the code, do a hand simulation with a linked list diagram.

/*======================================================*
 *			length				*
 *======================================================*
 * length(L) returns the length of list L.              *
 *======================================================*/

int length(ConstList L)
{
  int count = 0;

  for(ConstList p = L; p != NULL; p = p->tail) 
  {
    count++;
  }
  return count;
}

/*======================================================*
 *			merge				*
 *======================================================*
 * merge(A,B) requires that lists A and B are already   *
 * in nondescending order.  It merges them into a single*
 * list in ascending order and returns that list.       *
 *							*
 * NOTE: This function does not allocate new list cells.*
 * It reorders the cells that make up lists A and B.    *
 * After merge(A,B), lists A and B have been destroyed  *
 * to make the sorted list.				*
 *======================================================*/

List merge(List A, List B)
{
  if(A == NULL)
  {
    return B;
  }
  else if(B == NULL)
  {
    return A;
  }
  else if(A->head < B->head)
  {
    A->tail = merge(A->tail, B);
    return A;
  }
  else
  {
    B->tail = merge(A, B->tail);
    return B;
  }
}
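
The recursive merge above makes one nested call per cell, so its call stack grows in proportion to the combined length of A and B. For very long lists, a loop avoids that. The following is one possible loop-based variant, offered as a sketch rather than as part of the original implementation. The name mergeIterative is made up, and the ListCell definition is an assumption, since these notes use head and tail without showing the type.

```cpp
#include <cstddef>

// Assumed cell layout; the notes do not show this definition.
struct ListCell
{
  int       head;   // first member of the list
  ListCell* tail;   // rest of the list, or NULL
};
typedef ListCell* List;

// mergeIterative(A,B) requires that lists A and B are already
// sorted.  Like merge, it relinks the existing cells into one
// sorted list and returns that list.  It creates no new cells.
List mergeIterative(List A, List B)
{
  List  result = NULL;
  List* last   = &result;  // the pointer that receives the next cell

  while(A != NULL && B != NULL)
  {
    // Same comparison as merge: on a tie, the cell from B goes first.
    if(A->head < B->head)
    {
      *last = A;
      A     = A->tail;
    }
    else
    {
      *last = B;
      B     = B->tail;
    }
    last = &((*last)->tail);
  }

  // One list is exhausted; attach whatever remains of the other.
  *last = (A != NULL) ? A : B;
  return result;
}
```

Because it follows the same comparison rule as merge, mergeIterative should produce the same list, using constant extra space instead of one stack frame per cell.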

/*======================================================*
 *			MergeSort			*
 *======================================================*
 * MergeSort(L,n) reorders the first n members of list  *
 * L into ascending order, and returns that list.  It   *
 * sets parameter L to point to the remainder of the    *
 * original list L, after the first n members.		*
 *							*
 * For example, if L = [5,2,3,6,4], then		*
 *   MergeSort(L,3)					*
 * returns [2,3,5] and sets L = [6,4].			*
 *							*
 * NOTE: This function rearranges the first n list     *
 * cells.  It does not create new cells.                *
 *======================================================*/

List MergeSort(List& L, const int n)
{
  if(n == 0)
  {
    return NULL;
  }
  else if(n == 1)
  {
    List p = L;

    L = L->tail;
    p->tail = NULL;
    return p;
  }
  else
  {
    int  m = n/2;
    List A = MergeSort(L, m);
    List B = MergeSort(L, n-m);
    return merge(A,B);
  }
}

/*======================================================*
 *			SortList			*
 *======================================================*
 * SortList(L) sorts list L into ascending order by	*
 * rearranging the cells.				*
 *======================================================*/

void SortList(List& L)
{
  L = MergeSort(L, length(L));
}
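
To try these functions, the types List and ConstList need definitions, which this section does not show. The following is a minimal sketch of what they might look like, together with two helper functions for building and inspecting lists. The ListCell layout and the helpers listFromVector and listToVector are assumptions made for this example, not part of the original code.

```cpp
#include <cstddef>
#include <vector>

// Assumed definitions; the section uses these names without
// showing them.
struct ListCell
{
  int       head;   // first member of the list
  ListCell* tail;   // rest of the list, or NULL

  ListCell(int h, ListCell* t) : head(h), tail(t) {}
};
typedef ListCell*       List;
typedef const ListCell* ConstList;

// listFromVector(v) returns a newly allocated linked list that
// holds the members of v, in the same order.  The caller owns
// the cells.  (Hypothetical helper.)
List listFromVector(const std::vector<int>& v)
{
  List L = NULL;
  for(std::size_t i = v.size(); i > 0; i--)
  {
    L = new ListCell(v[i-1], L);   // build from back to front
  }
  return L;
}

// listToVector(L) returns the members of list L, in order.
// (Hypothetical helper.)
std::vector<int> listToVector(ConstList L)
{
  std::vector<int> v;
  for(ConstList p = L; p != NULL; p = p->tail)
  {
    v.push_back(p->head);
  }
  return v;
}
```

With these definitions placed ahead of length, merge, MergeSort and SortList, a hand simulation can be checked against the code: after List L = listFromVector({5,2,3,6,4}), calling MergeSort(L, 3) should return [2,3,5] and leave L = [6,4], as the comment on MergeSort describes.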