|
Most compilers translate a program from one language (the source language) to another (the target language).
The target language can vary greatly. Samples are:
But a compiler does not restrict itself to a single target language.
Instead, the compiler translates to intermediate code, a simplified machine language. Multiple back ends are written, each converting from the intermediate code into a particular target language.
Retargeting the compiler is just a matter of selecting (or writing) a particular back end.
Compilers with multiple back ends can also be used as cross-compilers; a cross compiler runs on one kind of machine and produces code for a different kind of machine.
3-address intermediate code is a very simple artificial machine language.
Real processors have instructions of more than one size. 3-address code instructions are all the same size.
Each 3-address instruction has 4 parts:
an instruction number telling what kind of instruction it is;
up to three "addresses" that are usually numbers or variable/function names.
The Dragon Book describes 3-address codes in section 6.2. Here are some of the instructions that the book describes.
Binary operator instructions x = y op z, where x is a variable name and y and z are variable names or numbers. Operator op might be + or *, for example. It is the instruction kind.
Unary operator instructions x = op y, where x is a variable name and and y is a variable name or number.
Copy instructions x = y.
Label instructions Label L. (The Dragon Book does not use these, but puts the label on the side.)
Branch instructions
where L is a label.
Comparison instructions if x op y goto L, where op is a comparison such as = or <.
Procedure calls. The following sequence of instructions calls p(x, y, z).
param x param y param z call p, 3
Array instructions x[i] = y and x = y[i].
Address and pointer instructions
Code fragment
do i = i + 1; while(a[i] < v);
might be translated to the following sequence of 3-address codes. Variables t1, etc. are temporaries.
Label L t1 = i + 1 i = t1 t2 = i*8 t3 = a[t2] if t3 < v goto L
Note the second and third instructions, which together perform i = i + 1. Typical intermediate code generators create code that does a lot of unnecessary copies. Those are fixed up in a later phase of the compiler, so they don't worry us here.
A single 3-address code instruction is typically represented as a structure with 4 fields: the instruction kind and 3 integers or strings.
Instruction "if x < y goto L" would be represented as follows.
< | x | y | L |
But it is more convenient to write them in symbolic form.
|