Sorting an Array


The sorting problem

Imagine that you have an array that is in some arbitrary order, and you need to sort it into ascending order (possibly so that you can use binary search on it). For simplicity, let's think about an array of integers. For example, the contents of the array might be as shown below before and after sorting.

   Index   Before   After
     0       35        8
     1       16       16
     2       74       19
     3        8       22
     4       22       35
     5       19       55
     6       55       74

Exchange sort

A simple idea is to sweep through the array, looking at consecutive values in the array. If two consecutive values are out of order, then they are swapped.

Unfortunately, one sweep through the array is not enough to sort it. (Try it on the array shown above.) So do another sweep, and another, until the array is finally sorted. It suffices to keep a flag that says whether the most recent sweep found the job already finished. Here is a definition of method exchangeSort(a), which sorts the entire array (using its physical size also as its logical size). Notice that the inner loop only goes to k = a.length - 2, so the last two values to be compared are a[a.length - 2] and a[a.length - 1].

  public static void exchangeSort(int[] a)
  {
    boolean done = false;
    while(!done)
    {
      done = true;
      for(int k = 0; k < a.length - 1; k++)
      {
        // If a[k] and a[k+1] are out of order, swap and
        // record not yet done.
       
        if(a[k] > a[k+1]) 
        {
          int temp;
          temp   = a[k];
          a[k]   = a[k+1];
          a[k+1] = temp;
          done   = false;
        }
      }
    }
  }

Exchange sort must work because (1) each time it does a swap, it gets closer to the array being sorted, and (2) it only stops when it has done a full sweep of the array and found that all consecutive values are in the correct order.
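To see exchange sort in action, here is a small self-contained test class (the class name ExchangeSortDemo is ours) that runs exchangeSort on the array from the example at the top of this section.

```java
import java.util.Arrays;

public class ExchangeSortDemo
{
  public static void exchangeSort(int[] a)
  {
    boolean done = false;
    while(!done)
    {
      done = true;
      for(int k = 0; k < a.length - 1; k++)
      {
        if(a[k] > a[k+1])
        {
          int temp = a[k];   // swap a[k] and a[k+1]
          a[k]   = a[k+1];
          a[k+1] = temp;
          done   = false;    // a swap happened, so sweep again
        }
      }
    }
  }

  public static void main(String[] args)
  {
    int[] a = {35, 16, 74, 8, 22, 19, 55};
    exchangeSort(a);
    System.out.println(Arrays.toString(a));  // [8, 16, 19, 22, 35, 55, 74]
  }
}
```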


The cost of Exchange sort

It is worth thinking about how much time exchange sort takes, in terms of the size n of the array. In the worst case, the array is initially backwards. Let's look at an example with n = 5, and look at the array after each sweep.

   Index   Initially   After sweep 1   After sweep 2   After sweep 3   After sweep 4
     0        50             40              30              20              10
     1        40             30              20              10              20
     2        30             20              10              30              30
     3        20             10              40              40              40
     4        10             50              50              50              50

Notice that the largest value makes it to the bottom after the first sweep. That is because, after comparing the first two values a[0] and a[1] and swapping them, 50 ends up in a[1]. After that swap, a[1] and a[2] are compared. But then a[1] is 50, and 50 is moved into a[2]. 50 keeps moving down as the sweep progresses.

But also notice that the smallest value, 10, only moves up one place per sweep. Since it needs to move all the way to the beginning, 4 sweeps are required before the array is sorted, followed by a fifth sweep to recognize that it is sorted. In general, n total sweeps are needed. Since each sweep requires looking at everything in the array, the time per sweep is proportional to n. So the total time for all n sweeps is proportional to n·n = n².

You can improve exchange sort by noticing that, after k sweeps, the last k values are in their correct places, so they no longer need to be looked at. That shortens the sweeps. The first sweep takes time n, the next time n-1, the next n-2, and so on, until the last sweep only takes time 1. That reduces the time from n·n to (1 + 2 + ... + n). Unfortunately, (1 + 2 + ... + n) is close to n²/2, so that trick only reduces the time by a factor of 2; the time is still proportional to n², just with a smaller constant of proportionality.
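The shortened-sweep version described above can be sketched as follows (the method name shortSweepSort is ours). Each sweep stops one position earlier than the one before it, and the done flag still allows an early exit when a full sweep makes no swaps.

```java
import java.util.Arrays;

public class ShortSweepDemo
{
  // Exchange sort with shrinking sweeps: after sweep number s, the
  // last s values are known to be in their final positions, so each
  // sweep stops one slot earlier than the previous one.
  public static void shortSweepSort(int[] a)
  {
    for(int end = a.length - 1; end > 0; end--)
    {
      boolean done = true;
      for(int k = 0; k < end; k++)
      {
        if(a[k] > a[k+1])
        {
          int temp = a[k];
          a[k]   = a[k+1];
          a[k+1] = temp;
          done   = false;
        }
      }
      if(done) break;  // a full sweep with no swaps: already sorted
    }
  }

  public static void main(String[] args)
  {
    int[] a = {50, 40, 30, 20, 10};   // the worst-case example above
    shortSweepSort(a);
    System.out.println(Arrays.toString(a));  // [10, 20, 30, 40, 50]
  }
}
```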


Quicksort

Exchange sort can be improved on a lot. One popular sorting algorithm is called Quicksort. An important feature of Quicksort is that it sorts part of an array, not necessarily the entire array. Like binary search, Quicksort is told low and high indices, and only sorts the array between those indices.

To run Quicksort on a section of an array that has at least two values in it, you begin by running a partition algorithm to do some rearrangement. This section describes the partition algorithm, and the next section uses it to finish Quicksort.

The partition algorithm selects the first value in the section that is being partitioned, and calls it the pivot. Then it rearranges the array so that it is broken into three parts. The first part consists of values that are less than the pivot, the second part holds only the pivot, and the third part contains values that are greater than the pivot. (If there are two or more occurrences of the pivot value, then the extra occurrences are put into the first part.)

The partitioning step is done as follows. First, say that a value is small if it is less than or equal to the pivot, and large if it is greater than the pivot. Imagine taking the pivot value out and setting it aside, leaving behind (conceptually) a hole at the beginning of the section. Go to the end of the section and do a backward scan (toward the beginning), looking for a small value. When one is found, move it into the hole, leaving behind a new hole near the end of the section. Now go to just after where the first hole was and do a forward scan until you see a large value. Move it into the new hole, leaving behind yet another hole. Then continue the backward scan from just before the hole that was just filled. Bounce back and forth between backward and forward scans until the two scans crash into one another. That leaves a hole. Fill in the hole with the pivot value.

The following table shows an example of a partition, where the low index is 10 and the high index is 20. The hole is shown as ( ). Each column shows the contents of the section just before the next scan begins; the last two columns show the section after the two scans meet and after the pivot is put into the final hole.

            Before    Before    Before    Before    Before    Before
           backward   forward  backward   forward  backward   forward   After   Put pivot
   Index    scan 1    scan 1    scan 2    scan 2    scan 3    scan 3    scans    in hole
    10       ( )         5         5         5         5         5        5         5
    11        62        62       ( )        30        30        30       30        30
    12        21        21        21        21        21        21       21        21
    13         8         8         8         8         8         8        8         8
    14        77        77        77        77       ( )        12       12        12
    15        90        90        90        90        90        90      ( )        40
    16        12        12        12        12        12       ( )       90        90
    17        30        30        30       ( )        77        77       77        77
    18        59        59        59        59        59        59       59        59
    19         5        ( )       62        62        62        62       62        62
    20        82        82        82        82        82        82       82        82

The first backward scan finds the small value 5 at index 19. It moves 5 into the hole at index 10 (leaving behind a hole where 5 used to be), and the forward scan is set to start just after index 10.

The first forward scan immediately finds the large value 62 at index 11. It moves 62 into the hole at index 19 (leaving behind a hole where 62 used to be), and the next backward scan is set to start just before index 19.

The following is a definition of partition. Variable i is the current position of the forward scan and variable j is the current position of the backward scan. Calling partition(a, low, high) partitions array a from index low to index high, including the end indices. (So it partitions a[low], a[low+1], ..., a[high].) Not only does it do the partitioning, but it also returns the index where the pivot is after the partitioning. In the example above, that would be 15, the index where the pivot 40 ends up.

  public static int partition(int[] a, int low, int high)
  {
    int pivot = a[low];
    int i = low;
    int j = high;
    while(i < j)
    {
      // Backward scan.

      while(i < j && a[j] > pivot) j--;
      if(i < j)
      {
        a[i] = a[j];
        i++;
      }

      // Forward scan.

      while(i < j && a[i] <= pivot) i++;
      if(i < j) 
      {
        a[j] = a[i];
        j--;
      }
    }
    a[i] = pivot;
    return i;
  }
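As a check, the following driver (the class name PartitionDemo is ours) runs partition on the same values as in the table above, stored at indices 0..10 rather than 10..20; the pivot index it returns corresponds to 15 in the table.

```java
import java.util.Arrays;

public class PartitionDemo
{
  public static int partition(int[] a, int low, int high)
  {
    int pivot = a[low];
    int i = low;
    int j = high;
    while(i < j)
    {
      // Backward scan.
      while(i < j && a[j] > pivot) j--;
      if(i < j) { a[i] = a[j]; i++; }

      // Forward scan.
      while(i < j && a[i] <= pivot) i++;
      if(i < j) { a[j] = a[i]; j--; }
    }
    a[i] = pivot;
    return i;
  }

  public static void main(String[] args)
  {
    // The values from the partitioning example above.
    int[] a = {40, 62, 21, 8, 77, 90, 12, 30, 59, 5, 82};
    int k = partition(a, 0, a.length - 1);
    System.out.println("pivot index: " + k);  // pivot index: 5
    System.out.println(Arrays.toString(a));
    // [5, 30, 21, 8, 12, 40, 90, 77, 59, 62, 82]
  }
}
```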

Quicksort from partition

When Quicksort has an array section of no more than one value to sort, it just does nothing, since an array of size 0 or 1 is obviously already sorted.

When Quicksort has at least two things to sort, it begins by partitioning them. That gives three parts: those that are less than the pivot, the pivot itself, and those that are greater than the pivot. To finish the job, it suffices to sort the first and third parts. Quicksort does that job using Quicksort! Here is a definition of Quicksort in Java.

  public static void Quicksort(int[] a, int low, int high)
  {
    if(low < high)
    {
      int k = partition(a, low, high);
      Quicksort(a, low, k-1);
      Quicksort(a, k+1, high);
    }
  }

To see how this works, let's do an example. The array is shown after partitioning and then after completing the sorts of the first and third parts, without looking at the details of how those recursive sorts work. (Remember that, when you are checking a recursive function, you just presume that the recursive calls to the function do their job correctly.)

   Index   Before   After partitioning   After sorting the
                                         first and third parts
    10       40             5                      5
    11       62            30                      8
    12       21            21                     12
    13        8             8                     21
    14       77            12                     30
    15       90            40                     40
    16       12            90                     59
    17       30            77                     62
    18       59            59                     77
    19        5            62                     82
    20       82            82                     90
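Putting partition and Quicksort together, this small driver (the class name QuicksortDemo is ours) sorts the example array and prints the final result.

```java
import java.util.Arrays;

public class QuicksortDemo
{
  public static int partition(int[] a, int low, int high)
  {
    int pivot = a[low];
    int i = low;
    int j = high;
    while(i < j)
    {
      while(i < j && a[j] > pivot) j--;   // backward scan
      if(i < j) { a[i] = a[j]; i++; }
      while(i < j && a[i] <= pivot) i++;  // forward scan
      if(i < j) { a[j] = a[i]; j--; }
    }
    a[i] = pivot;
    return i;
  }

  public static void Quicksort(int[] a, int low, int high)
  {
    if(low < high)
    {
      int k = partition(a, low, high);
      Quicksort(a, low, k-1);
      Quicksort(a, k+1, high);
    }
  }

  public static void main(String[] args)
  {
    // Same values as in the example table (stored at indices 0..10).
    int[] a = {40, 62, 21, 8, 77, 90, 12, 30, 59, 5, 82};
    Quicksort(a, 0, a.length - 1);
    System.out.println(Arrays.toString(a));
    // [5, 8, 12, 21, 30, 40, 59, 62, 77, 82, 90]
  }
}
```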

The cost of Quicksort

It is not at all obvious whether Quicksort is a fast algorithm. (Don't be fooled by the name. After all, Greenland is called Greenland.)

It turns out that Quicksort does its fastest work when the pivot value ends up close to the middle of the section after partitioning. So, for simplicity, let's imagine a situation where the pivot ends up right in the middle every time, which is the best that Quicksort can hope for. This is overly optimistic, but let's deal with that later.

Because Quicksort is recursive, many calls to Quicksort are used to sort one array. We can add up the time taken by all of them using an accounting trick. We draw a diagram where Q(n) stands for a call to Quicksort on a section of size n. As long as n > 1, that call to Quicksort will call partition. We charge the cost of doing that partition to that call of Quicksort. So each call to Quicksort has its own expense account, and it is just a matter of adding up all of the expense accounts.

When the partition method is asked to partition an array section of size n, it takes time proportional to n to do the job. (It compares each value in the section with the pivot once.) So, for simplicity, we charge n units of time to Q(n). The total amount charged to each call to Quicksort is shown next to it, in brackets. In the case where the pivot always lands in the middle, the calls form a tree like this.

                                  Q(n)  [n]
                      Q(n/2)  [n/2]      Q(n/2)  [n/2]
                 Q(n/4) [n/4]    ...    ...    Q(n/4) [n/4]
                                   ...
                      Q(1)    Q(1)    ...    Q(1)

Notice that the sum of the costs at each level is n. So the total cost is just n times the total number of levels. Each level has a section size that is half the size of the level above it. As the sizes are cut in half, eventually the size reaches 1, and Quicksort does not do anything at all. As we saw for binary search, the total number of levels must be about log₂(n). So, in this good case where the pivot always ends up in the middle after partitioning, the time used by Quicksort on an array section of size n is proportional to n·log₂(n).

That is the best possible case for Quicksort. But what about the worst case? It turns out that the worst case for Quicksort occurs when the array starts out already sorted! After partitioning, the pivot stays where it was, at the beginning of the section. In that case, each recursive call only reduces the array section size by 1, giving a chain of calls.

          Q(n)  [n]
            Q(n-1)  [n-1]
              Q(n-2)  [n-2]
                ...
                  Q(1)

Now adding up all of the costs gives n + (n-1) + (n-2) + ... + 1 = n(n+1)/2. That is proportional to n².

So in the best case, Quicksort's time is proportional to n·log₂(n), but in the worst case the time is proportional to n². What about the average case? It turns out that, when the initial array is in a random order, the average time taken by Quicksort is proportional to n·log₂(n). Exchange sort takes time proportional to n² even in the average case.
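One way to see the gap between the cases empirically is to count comparisons with the pivot. The sketch below (the class and helper names are ours, and the fixed random seed is an arbitrary choice) runs Quicksort on an already-sorted array and on a shuffled one, and reports how many comparisons each took; the already-sorted input costs about n²/2 comparisons, while the shuffled one costs far fewer.

```java
import java.util.Random;

public class QuicksortCostDemo
{
  // Counts how many times a value is compared with the pivot.
  static long comparisons = 0;

  // Same comparison as in partition, but counted.
  static boolean isLarge(int x, int pivot)
  {
    comparisons++;
    return x > pivot;
  }

  static int partition(int[] a, int low, int high)
  {
    int pivot = a[low];
    int i = low;
    int j = high;
    while(i < j)
    {
      while(i < j && isLarge(a[j], pivot)) j--;   // backward scan
      if(i < j) { a[i] = a[j]; i++; }
      while(i < j && !isLarge(a[i], pivot)) i++;  // forward scan
      if(i < j) { a[j] = a[i]; j--; }
    }
    a[i] = pivot;
    return i;
  }

  static void quicksort(int[] a, int low, int high)
  {
    if(low < high)
    {
      int k = partition(a, low, high);
      quicksort(a, low, k-1);
      quicksort(a, k+1, high);
    }
  }

  static long countFor(int[] a)
  {
    comparisons = 0;
    quicksort(a, 0, a.length - 1);
    return comparisons;
  }

  public static void main(String[] args)
  {
    int n = 1000;
    int[] sorted = new int[n];
    for(int k = 0; k < n; k++) sorted[k] = k;

    int[] shuffled = sorted.clone();
    Random rand = new Random(42);          // fixed seed for repeatability
    for(int k = n - 1; k > 0; k--)
    {
      int r = rand.nextInt(k + 1);
      int temp = shuffled[k]; shuffled[k] = shuffled[r]; shuffled[r] = temp;
    }

    System.out.println("already sorted: " + countFor(sorted) + " comparisons");
    System.out.println("random order:   " + countFor(shuffled) + " comparisons");
  }
}
```

On a sorted array, each partition of a section of size m makes m-1 comparisons, so the total is exactly n(n-1)/2.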


Improving Quicksort

You can improve Quicksort by selecting a pivot that is more likely to end up in the middle after partitioning. A common way to do that is to get three values, a[low], a[high] and a[mid], where mid = (low + high)/2. Select the median of those three values (the one that is between the other two). Swap it with a[low], so that it will be used as the pivot. That simple change makes Quicksort take its best time when the array starts out being sorted, and tends to make Quicksort very fast in practice.
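A sketch of the median-of-three selection step (the helper name medianOfThreeToFront is ours) might look like this. It moves the median of the three sample values into a[low], after which the existing partition method can be used unchanged.

```java
public class MedianOfThree
{
  // Pick the median of a[low], a[mid], a[high], where
  // mid = (low + high)/2, and swap it into a[low] so that it
  // will be used as the pivot.
  public static void medianOfThreeToFront(int[] a, int low, int high)
  {
    int mid = (low + high) / 2;
    // Sort the three sample values in place so a[mid] holds the median.
    if(a[mid]  < a[low]) swap(a, mid, low);
    if(a[high] < a[low]) swap(a, high, low);
    if(a[high] < a[mid]) swap(a, high, mid);
    swap(a, low, mid);   // move the median to the front
  }

  private static void swap(int[] a, int i, int j)
  {
    int temp = a[i];
    a[i] = a[j];
    a[j] = temp;
  }

  public static void main(String[] args)
  {
    // The array from the partitioning example: the three samples are
    // a[0] = 40, a[5] = 90 and a[10] = 82, whose median is 82.
    int[] a = {40, 62, 21, 8, 77, 90, 12, 30, 59, 5, 82};
    medianOfThreeToFront(a, 0, a.length - 1);
    System.out.println(a[0]);  // 82
  }
}
```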


Problems

  1. [solve] How long does Exchange sort take, in terms of n, to sort an array of size n?

  2. [solve] How long does Quicksort take, in terms of n, to sort an array of size n, on the average?

  3. [solve] How long does Quicksort take, in terms of n, to sort an array of size n in the worst case?


Summary

Exchange sort sorts an array of size n in time proportional to n².

Quicksort sorts an array of size n in time proportional to n·log₂(n) in the average case, but takes time proportional to n² in the worst case.