3.3. Demonstrating that a Language Is Not Regular


Naive arguments

The argument on the previous page that W is not regular has the following form, if painted with a broad brush.

  1. I have thought of a way to solve W.

  2. I notice that my way of solving W is not finite state.

  3. Therefore, nobody will ever find a finite-state machine for W.

Now, you might say that I am being unfair to the argument. After all, it did say that it appears that any other algorithm would need to count the number of a's.

But is that really so different? It still says that I cannot see how anybody else could come up with a clever scheme to avoid counting the a's.

It is very important for you to recognize naive arguments for what they are.

Later in this course, you will be tempted to use or to accept a naive argument as valid. Resolve now not to do that.


Proving that W is not regular

It is always logically valid to assume the negation of what you want to prove. (It is called proof by contradiction.) So assume that W is regular.

Now, what does that assumption do for us? Well, since W is regular, there must be a DFA M so that L(M) = W. (That is just going back to the definition of a regular language.)

Our goal is to show that M does not work. That is, L(M) ≠ W. That contradicts the assumption that M solves W.

To do that, I will describe an algorithm that finds an input on which M gives the wrong answer.

Let's try to catch M remembering a lot of information.

For every n > 0, run M on string an, and write down the state δ*(q0, an) on which M ends. Your table might begin

s δ*(q0, s)
a q2
aa q9
aaa q4
aaaa q6

Since M only has finitely many states, eventually you will write down a state that you have written down before. Stop at that point, and look at the two lines that have the same state.

s δ*(q0, s)
ai q
aj q

So strings ai and aj take M to the same state. Intuitively, that means that M cannot distinguish between ai and aj, since all M can remember is its state. We can exploit that to show that M does not work.

Think about what happens when we run M on input bi.

  ai
q0 q q'
  aj   bi

Because M ends on the same state q on string ai as on string aj it must also end on the same state q' = δ*(q, bi) on the two strings aibi and ajbi.

Notice that aibiW but ajbiW. If state q' is a rejecting state then M gives the wrong answer on input aibi. If state q' is an accepting state then M gives the wrong answer on input ajbi. In any event, L(M) ≠ W.