8.3. Recursion


Recursion

A function is allowed to call itself. But recall that, whenever a function is called, it gets a new frame in the run-time stack. So each call has its own variables. Effectively, when a function calls itself, it actually calls a copy of itself. When a function calls itself, it is said to be recursive, or to use recursion.

Recursion is generally combined with solution by cases; in some cases, the function calls itself, while in other cases it does not. Suppose, for example, that you want to implement the writeInteger(i) function from the preceding page. There are two natural cases.

  1. If i < 10 then i is just a single digit. Just write that digit and return 1 (since the function returns the total number of characters written).

  2. If i ≥ 10 then more than one character is needed. It is easy to break i down into two parts by dividing it by 10, getting the quotient and the remainder. Here are a few examples.

     i  i/10 i%10
    35 3 5
    9341 934 1
    720 72 0
    Evidently, writing i in this case is a two-step process.
    1. Write i/10
    2. Write i%10
    For example, to write 9341, first write 934 then write 1. The total number of characters written is the sum of the number of characters written in step 1 and the number of characters written in step 2 (which will always be one character).

    The inspiration needed to finish this case is simple. We have a function that writes an integer. It is called writeInteger! Use it to do each of the steps.

We use an if-statement to decide which case we are dealing with, and write the solution of each case inside the if-statement.

  int writeInteger(int i)
  {
    if(i < 10)
    {
      putchar(i + '0');
      return 1;
    }
    else
    {
      int a = writeInteger(i/10);
      int b = writeInteger(i%10);
      return a + b;
    }
  }

Recursion is a substitute for a loop

Recursion is a substitute for a loop. Do not try to use both recursion and a loop in the same function until you are an expert. Use one or the other. Every job that can be done with a loop can also be done with recursion and every job that can be done with recursion can be done with a loop. But one is often much easier than the other. You will find that loops work well for some problems, but they are somewhat inflexible, and require you to do complicated workarounds for some problems. Recursion is more flexible and often gives you short and easy solutions to problems that are difficult to solve with a loop.


Recursive thinking

Here are a few hints that help you write recursive function definitions.

  1. Always have a clear and precise contract before you embark on writing a definition of a recursive function. Put this contract into your toolbox right away so that you can consider using the function that you are writing.

  2. Break the problem down into cases. Solve easy cases in appropriate ways. For more complicated cases, try to break the problem down into pieces.

  3. When solving the pieces of more complicated cases, imagine that the function that you are working on is already available, and assume that it works according to its contract. Do not think about how the recursive calls work. The contract is all you need.


Making sure that recursion stops

A danger with recursion is that your function might not ever produce an answer. As an extreme example, suppose that you need to define a function f(n). Since you imagine that f is available to you and that it works, you write the definition as follows.

  int f(int n)
  {
    return f(n);
  }
The problem is that f(2) does a call to f(2). That call does another call to f(2). Each call involves creating a frame in the run-time stack. The frames will just pile up until you run out of memory in the run-time stack, leading to an infinite recursion. (The function cannot run forever because it runs out of memory, but it would run forever if the memory were unlimited.)

To avoid infinite recursion, do the following.

  1. Make sure that, if f(n) calls f(x), then x is smaller than n. For example, there is no trouble if f(2) calls f(1).

  2. Make sure that the parameter cannot keep getting smaller forever. For example, if the parameter must be a nonnegative integer, then you know that it cannot keep getting smaller and smaller without end.

  3. Be sure that your recursive calls respect the requirements of your function. For example, if the function requires its parameter to be a nonnegative integer, be sure that every call to the function, including recursive calls, pass it a nonnegative integer.

If a function has more than one parameter, you typically concentrate on one of the parameters, and make sure that it is smaller at the recursive calls, and cannot keep getting smaller forever.


Another example

Let's look at the problem of computing xk (x to the k-th power) where x is a real number and k is a positive integer. Breaking this down into cases yields a simple case and a more complicated case.

  1. If k = 1 then xk = x.

  2. If k > 1 then it is clear that xk = x*xk−1. We can use our function to compute xk−1.

That leads to the following definition of power(x,k).

  // power(x,k) returns x to the k-th power.
  //
  // Requirement: k > 0.

  double power(double x, int k)
  {
    if(k == 1)
    {
      return x;
    }
    else {
      return x * power(x, k-1);
    }
  }
That works, but we can do better. Imagine computing x8. Our algorithm does it as follows.
  x8 = x * x7
     = x * x * x6
     = x * x * x * x5
     = x * x * x * x * x4
     = x * x * x * x * x * x3
     = x * x * x * x * x * x * x2
     = x * x * x * x * x * x * x * x1
     = x * x * x * x * x * x * x * x
It ends up doing 7 multiplications. But we can compute x8 more efficiently as follows.
  x8 = (x4)*(x4)
  x4 = (x2)*(x2)
  x2 = (x1)*(x1)
The advantage is that each right-hand side consists of a value multiplied by itself. We only need to compute that value once, and we end up computing x8 using only 3 multiplications. The same idea allows us to compute x1024 using only 10 multiplications. In general, we use the rule that, if k is even then xk = xk/2*xk/2. (We need k to be even to ensure that k/2 is an integer.) Here is the improved function definition.
  // power(x,k) returns x to the k-th power.
  //
  // Requirement: k > 0.

  double power(double x, int k)
  {
    if(k == 1)
    {
      return x;
    }
    else if(k % 2 == 0)
    {
      double p = power(x, k/2);
      return p*p;
    }
    else 
    {
      return x * power(x, k-1);
    }
  }
That is a much more efficient algorithm than the previous one. For example, to compute x9, it uses the rule
  x9 = x * x8
then it uses the efficient algorithm for x8 shown above. To compute x18, it uses
  x18 = x9 * x9
and we have seen that x9 is computed efficiently. In fact, the new function computes xk using no more that 2log2(k) multiplications.


Duplicated recursive calls

What if the definition of power were written as follows?

  // power(x,k) returns x to the k-th power.
  //
  // Requirement: k > 0.

  double power(double x, int k)
  {
    if(k == 1)
    {
      return x;
    }
    else if(k % 2 == 0)
    {
      return power(x, k/2) * power(x, k/2);
    }
    else 
    {
      return x * power(x, k-1);
    }
  }
Notice that, in the middle case, power(x, k/2) is computed twice. Does that make the function take twice as long? Actually, it is much worse than that. When it comes to efficiency of algorithms, recursion acts as an amplifier. If you do something a little bit good (as we did with the improved power algorithm), recursion amplifies that and makes it very good. But if we do something a little bit bad, recursion amplifies that too and makes it very bad. The definition that does two calls to power(x, k/2) uses k−1 multiplications to compute xk. That is much worse than using 2log2(k) multiplications.

Suppose that you have an array A of integers of size n (where n > 0), and you want to find the largest value in the array. If n = 1, then the largest value is just A[0]. If n > 1 then the largest value is the larger of A[n−1] and the largest of the first n−1 values in the array. Here is a definition of largest for consideration.

  // largest(A,n) yields the largest of
  // A[0], A[1], ..., A[n-1].

  int largest(int A[], int n)
  {
    if(n == 1)
    {
      return A[0];
    }
    else if(A[n-1] > largest(A, n-1))
    {
      return A[n-1];
    }
    else
    {
      return largest(A, n-1);
    }
  }
That will compute the correct answer, but it is very slow. The problem is that, when largest(A, n−1) > A[n−1], this function computes largest(A, n−1) twice, once in the test and once in the return statement. Recursion amplifies that. If the largest value in the array is in A[0], then this function takes at least 2n steps to find the largest value. If n = 200, then 2n is much larger than the number of protons in the earth!

It is easy to fix the problem. Just make sure not to compute largest(A, n−1) more than once.

  // largest(A,n) yields the largest of
  // A[0], A[1], ..., A[n-1].

  int largest(int A[], int n)
  {
    if(n == 1)
    {
      return A[0];
    }
    else
    {
      return max(A[n-1], largest(A, n-1));
    }
  }


Variables and recursion

If you examing the recursive function definitions made above, you will notice that none of them changes a variable, once that variable has a value. All of the variables are actually constants. That can come as a surprise to someone who has become accustomed to using loops. How can a repetition be accomplished if no variables ever change?

The answer to that question is simple. Because each call to a function creates a new frame for that function, recursion can create a lot of different variables. The variation needed for repetition is not accomplished by changing the value of one variable, but rather by creating many variables with different values.

That probably sounds extravagent. And, in some cases, it is. But computer resources are relatively cheap, while human resources are much more expensive. What people have found is that, for a lot of problems, recursion takes less development time that looping, and it leads to function definitions that work the first time, with no debugging, more often than you think you have a right to. So recursion can reduce utilization of the resource that really matters: your time.


Exercises

  1. What is the value of g(4), given the definition of g below? (Hint. Work out g(1), then g(2), then g(3), then g(4), in that order. Keep your work organized.)

      int g(int n)
      {
        if(n == 1) return 2;
        else return g(n-1) + 3;
      }
    
    Answer

  2. Write a C++ definition of function sum(a, b) that returns a + (a+1) + ... + b. For example, sum(2,5) = 2 + 3 + 4 + 5 = 14. More precisely, sum(a, b) returns the sum of all integers that greater than or equal to a and less than or equal to b. For this question do not use any kind of loop. Use recursion instead. Answer

  3. Write a recursive definition of the factorial function factorial(n) = 1*2*…*n. Assume that the parameter is a nonnegative integer of type long and the result has type long. By definition, factorial(0) = 1. Answer

  4. Write a function that takes an nonnegative integer argument n and writes the binary representation of n. For example, if n is 6 it writes 110 and if n is 13 it writes 1101. Answer

  5. Suppose the power function is written as follows. Notice that it uses power to do the squaring in the second case. Does it work?

      // power(x,k) returns x to the k-th power.
      //
      // Requirement: k > 0.
    
      double power(double x, int k)
      {
        if(k == 1)
        {
          return x;
        }
        else if(k % 2 == 0)
        {
          return power(power(x, k/2), 2);
        }
        else 
        {
          return x * power(x, k-1);
        }
      }
    
    Answer

  6. Suppose the power function is written as follows. Does it work?

      // power(x,k) returns x to the k-th power.
      //
      // Requirement: k > 0.
    
      double power(double x, int k)
      {
        if(k == 1)
        {
          return x;
        }
        else if(k % 2 == 0)
        {
          return power(x*x, k/2);
        }
        else 
        {
          return x * power(x, k-1);
        }
      }
    
    Answer

  7. Write a program that reads a decimal integer and writes the equivalent binary integer. (See this prior problem for a discussion of binary notation.) Both the input and the output are in standard order, starting with the highest-order digit. The input comes from the standard input and the output goes to the standard output.

    Hints.

    1. You will want to work at the low-order end of a number. If n > 1 then the binary representation of n is the binary representation of n/2 followed by n%2. For example, if bin(n) is the binary string that represents integer n, then

        bin(1)  = "1".
        bin(3)  = "11"    = bin(1)  followed by 1.
        bin(6)  = "110"   = bin(3)  followed by 0.
        bin(12) = "1100"  = bin(6)  followed by 0.
        bin(25) = "11001" = bin(12) followed by 1.
      

    2. If you use these ideas with a loop, you will get the bits in reverse order. The easy way to handle that is to use recursion.

    Answer