Quiz 7, question 7

60mph is one mile per minute. If you average 60mph for two miles, then it takes you two minutes to go the two miles.

If you travel the first mile at 30mph, it takes you two minutes. So you have used up your entire time on the first mile, and you need to travel the second mile infinitely fast.

When you talk about an average, it is important to know what you are averaging over. Average speed is an average over time. If you imagined that it was an average over distance, then you would conclude that you could travel the second mile at 90mph.

We talk about average times for algorithms, but the averages are not always over the same things. Here are some examples.

We say that Quicksort takes average time O(n log₂(n)). In this case, the average is taken over all possible ways in which the array could have been ordered. So it is an average over the possible inputs. This is the fishiest kind of average for algorithms. What makes me believe that the person asking me to sort an array hands me a randomly chosen array?
Some algorithms (called Monte Carlo algorithms) employ coin tosses. When you talk about their averages, you talk are averaging over the possible coin toss outcomes. That has nothing to do with the data, and you do have a reason to believe that the coin tosses are random, so this is much less fishy.
Some algorithms have amortized cost guarantees. That means you are averaging the cost per operation over the sequence of operations. That has nothing to do with how the operations are chosen (by whoever wants to do them). So it is also quite reasonable. It is a worst case over the input, but an average over the sequence of operations.

Some Monte Carlo algorithms occasional make mistakes, meaning they give the wrong answer. You estimate the probability of a mistake, again by looking at the coin tosses, not by looking at the input. If a Monte Carlo algorithm gives a mistake on input X, you can run it again on the same input. It will probably produce the correct answer the second time. There is a popular Monte Carlo algorithm for telling whether a number is prime. Sometimes it gives the wrong answer, but you can make the probability of that happening as low as you like.

Here is a problem that has a good Monte Carlo algorithm. You are given a list of equations, where the ith equation either has the form

X_i = c

where c is an integer constant, or has the form

X_i = X_j op Xk

where j < i and k < k, and op is one of the operators +, - or *. The problem is to say whether the value produced by the last line is 0. You have to do mathematically correct operations on integers. It is not acceptable, for example, to limit the size to no more than some predetermined limit. You will initially think that there is a trivial algorithm, but remember that as numbers become larger, arithmetic on them becomes more expensive. You can get the answer (with some possibility of making a mistake) by carefully choosing a modulus m and doing all of the arithmetic modulo m.