Description of an Abstract Machine and its Interpreter


Introduction

This describes a simple byte-code language designed as a target language for compilers. This is not intended to be a general purpose byte code. The languages that can be compiled into this byte code are simple, and the programs must be fairly small. (For example, there can be no more than 256 integer constants or global variables.)

Nonetheless, this abstract machine provides some features that are not used by the C- language. For example, it handles real numbers, which C- does not support.


Architecture

Global variables

This abstract machine maintains an array of global variables. Each member of the array occupies a given number of consecutive words. An integer occupies M_SIZE_INTEGER words, a real number occupies M_SIZE_REAL words and an array occupies M_SIZE_ARRAY words. The variables of the array are accessed by indicating a word number, starting at 0. For example, if the array starts with a real number, then an integer and then another integer, the third variable is found at index M_SIZE_REAL + M_SIZE_INTEGER.

Constant tables

The abstract machine has two constant tables, one holding integer constants and one holding real constants. Constants in the tables are referred to by their index, starting at 0. Each table has room for 256 constants.

The stack

The primary data structure is a stack. Operations such as addition and subtraction work on the stack. For example, the addition instructions pop two numbers from the stack and push their sum. Parameters and local variables are also stored in the stack.

Each function has a frame in the stack holding its local information. The frame has the following form.

Parameters and local variables occupy an amount of space according to their sizes, and are obtained by word number, starting at 0 for the first one. For example, if the first local variable is an array and the second is a real number, then the third local variable is found at index M_SIZE_ARRAY + M_SIZE_REAL.

Input flag

A flag tells whether the most recent input instruction failed. It can be tested by a conditional branch instruction.

Strings

In some places instructions must include null-terminated strings. That is a sequence of bytes ended by a byte that holds 0 (the null character).


Program structure

A program is a sequence of sections, each consisting of some number of bytes. Each section is either a constant section, a function section or a variable section. Each instruction that starts a section has a name that begins with MS.

Start function

You must indicate which function the interpreter should call initially.

Indicate that the interpreter should start with the function whose name is fname.
MS_START (one byte)
fname the start function name, a null-terminated string.

Constant sections

A constant section is used to define an integer or real constant. It has one of the following forms.

Create an integer constant n and put it into the integer constant table at the next available index.
MS_INTEGER_CONSTANT (one byte)
n an integer as a null-terminated string in decimal, possibly with a '-' sign as first character.

Create a real constant r and put it into the real constant table at the next available index.
MS_REAL_CONSTANT (one byte)
r a real constant as a null-terminated string in decimal, in a form suitable for conversion to real by atof

Function sections

A function section is used to define a function. It has the following form.

Create a function called fname.
MS_FUNCTION (one byte)
fname null-terminated string, the name of the function
body A sequence of instructions that are performed by the function, starting with the first of those instructions.
MS_END (one byte)

Global variable sections

Global variable sections are used for defining global variables or arrays.

Create an integer global variable. It is added to the end of the current global variables and occupies M_SIZE_INTEGER words.
MS_INTEGER_GLOBAL (one byte)

Create a real global variable. It is added to the end of the current global variables and occupies M_SIZE_REAL words.
MS_REAL_GLOBAL (one byte)

Create a global array of integers. It is added to the end of the current global variables and occupies M_SIZE_ARRAY words. (The array is stored on the side, not directly in the array of variables. Only a pointer to the array is stored in the array of variables.)
MS_INTEGER_ARRAY_GLOBAL (one byte)
k (one byte, the index in the integer constant table of the number of integers in the array)

Create a global array of reals. It is added to the end of the current global variables and occupies M_SIZE_ARRAY words. (The array is stored on the side, not directly in the array of variables. Only a pointer to the array is stored in the array of variables.)
MS_REAL_ARRAY_GLOBAL (one byte)
k (one byte, the index in the integer constant table of the number of reals in the array)



Executable instructions

These instructions are used in the body of a function. The one-byte instructions consist of only the operation code, one byte long. The two-byte instructions have an operation code followed by a one-byte parameter.

One byte instructions: Stack manipulation
Instruction Stack before Stack after Remark
M_POP_INTEGER
x
  Pop one integer from the stack.
M_POP_REAL
x
  Pop one real number from the stack.
M_POP_ARRAY
x
  Pop one array from the stack.

One byte instructions: Integer arithmetic
Instruction Stack before Stack after Remark
M_INTEGER_ADD
n
m
m+n
Pop two integers and push their sum.
M_INTEGER_SUBTRACT
n
m
m-n
Pop two integers and push their difference.
M_INTEGER_MULTIPLY
n
m
m*n
Pop two integers and push their product.
M_INTEGER_DIVIDE
n
m
m/n
Pop two integers and push their integer quotient. See M_INTEGER_MOD for details. It is an error if n = 0.
M_INTEGER_MOD
n
m
m mod n
Pop two integers m and n and push the remainder when m is divided by n. It is an error if n = 0. If this instruction produces remainder r and the integer divide instruction produces quotient q then, m = qn + r. If n > 0 then 0 <= r < n. If n < 0 then n < r <= 0.

One byte instructions: Real arithmetic
Instruction Stack before Stack after Remark
M_REAL_ADD
y
x
x+y
Pop two real numbers and push their sum.
M_REAL_SUBTRACT
y
x
x-y
Pop two real numbers and push their difference.
M_REAL_MULTIPLY
y
x
x*y
Pop two real numbers and push their product.
M_REAL_DIVIDE
y
x
x/y
Pop two real numbers and push their quotient. It is an error if y = 0.0.

One byte instructions: Comparisons
Instruction Stack before Stack after Remark
M_COMPARE_INTEGERS
n
m
c Compare integers m and n and push -1 if m < n, 0 if m = n and 1 if m > n. The value c pushed is an integer.
M_COMPARE_REALS
y
x
c Compare integers x and y and push -1 if x < y, 0 if x = y and 1 if x > y. The value c pushed is an integer.

One byte instructions: Arrays
Instruction Stack before Stack after Remark
M_MAKE_INTEGER_ARRAY
n
A
Pop integer n and push a new array A of n integers.
M_MAKE_REAL_ARRAY
n
A
Pop integer n and push a new array A of n real numbers.
M_DELETE_ARRAY
A
  Pop array A and return the memory that it occupies to the free space pool.
M_INDEX
n
A
A[n]
Pop array A and index n and push A[n] where indexing is from 0. (The first member of A is A[0].) The value pushed is an integer if A is an array of integers, and is a real number if A is an array of reals.
M_STORE_INTEGER_INDEXED
x
n
A
  Store A[n] = x, where x is an integer and A is an array of integers. Indexing is from 0. (The first member of A is A[0].)
M_STORE_REAL_INDEXED
x
n
A
  Store A[n] = x, where x is a real number and A is an array of reals. Indexing is from 0. (The first member of A is A[0].)

One byte instructions: Control
Instruction Stack before Stack after Remark
M_RETURN_INTEGER
n
  Return to the calling function, with return value n (an integer).
M_RETURN_REAL
r
  Return to the calling function, with return value r (a real number).
M_RETURN     Return to the calling function, without a return value r.

One byte instructions: Input and output
Instruction Stack before Stack after Remark
M_READ_INTEGER  
v
Read an integer v from standard input and push it. On success set the input failure flag false. On failure set the input failure flage true and push 0.
M_READ_REAL  
v
Read a real number v from standard input and push it. On success set the input failure flag false. On failure set the input failure flage true and push 0.0.
M_READ_CHAR  
v
Read a character from standard input and push its ascii code v (an integer). At end of file push 0. Set the input failure flag false on success and true if a real number could not be read.
M_WRITE_INTEGER
v
  Pop integer v and write it in decimal to standard output.
M_WRITE_REAL
v
  Pop real number v and write it in decimal to standard output.
M_WRITE_CHAR
v
  Pop v and write the character whose ascii code is v to standard output.

Two byte instructions
Instruction Stack before Stack after Remark
M_PUSH_INTEGER
n
 
n
Push integer n onto the stack.
M_PUSH_INTEGER_CONSTANT
k
 
n
Push integer constant number k onto the stack.
M_PUSH_REAL_CONSTANT
k
 
r
Push real constant number k onto the stack.
M_ALLOC
k
    Push k empty words onto the stack.
M_DEALLOC
k
    Pop k words from the stack.
M_FETCH_LOCAL_INTEGER
k
  n Push the value n of the integer variable located at offset k in the local variable section of the current stack frame.
M_FETCH_LOCAL_REAL
k
  r Push the value r of the real variable located at offset k in the local variable section of the current stack frame.
M_FETCH_LOCAL_ARRAY
k
  A Push the array A stored at at offset k in the local variable section of the current stack frame.
M_STORE_LOCAL_INTEGER
k
n   Pop integer n from the stack and store n into the local variable at offset k in the local variable section of the current stack frame.
M_STORE_LOCAL_REAL
k
r   Pop real number r from the stack and store r into the real variable at offset k in the local variable section of the current stack frame.
M_STORE_LOCAL_ARRAY
k
A   Pop array A from the stack and store A into the local variable at offset k in the local variable section of the current stack frame.
M_FETCH_PARAM_INTEGER
k
  n Push the value n of the integer variable located at offset k in the parameter section of the current stack frame.
M_FETCH_PARAM_REAL
k
  r Push the value r of the real variable located at offset k in the parameter section of the current stack frame.
M_FETCH_PARAM_ARRAY
k
  A Push the array A stored at at offset k in the parameter section of the current stack frame.
M_STORE_PARAM_INTEGER
k
n   Pop integer n from the stack and store n into the variable at offset k in the parameter section of the current stack frame.
M_STORE_PARAM_REAL
k
r   Pop real number r from the stack and store r into the real variable at offset k in the parameter section of the current stack frame.
M_STORE_PARAM_ARRAY
k
A   Pop array A from the stack and store A into the variable at offset k in the parameter section of the current stack frame.
M_FETCH_GLOBAL_INTEGER
k
  n Push the value n of the integer variable located at offset k in the global variable array.
M_FETCH_GLOBAL_REAL
k
  r Push the value r of the real variable located at offset k in the global variable array.
M_FETCH_GLOBAL_ARRAY
k
  A Push the array A stored at at offset k in the global variable array.
M_STORE_GLOBAL_INTEGER
k
n   Pop integer n from the stack and store n into the variable at offset k in the global variable array.
M_STORE_GLOBAL_REAL
k
r   Pop real number r from the stack and store r into the real variable at offset k in the global variable array.
M_STORE_GLOBAL_ARRAY
k
A   Pop array A from the stack and store A into the variable at offset k in the global variable array.
M_LABEL
k
    Label the next non-label instructions as label k. See the branch instructions (M_GOTO, etc.)
M_GOTO
k
    Jump to label k. The label is marked using a M_LABEL instruction. The jump can be either forward or backward.
M_GOTO_IF_ZERO
k
n   Jump to label k if the top n of the stack is 0. The stack must have an integer on top.
M_GOTO_IF_NOT_ZERO
k
n   Jump to label k if the top n of the stack is not 0. The stack must have an integer on top.
M_GOTO_IF_POSITIVE
k
n   Jump to label k if the top n of the stack is a positive integer.
M_GOTO_IF_NOT_POSITIVE
k
n   Jump to label k if the top n of the stack is not a positive integer.
M_GOTO_IF_NEGATIVE
k
n   Jump to label k if the top n of the stack is a negative integer.
M_GOTO_IF_NOT_NEGATIVE
k
n   Jump to label k if the top n of the stack is not a negative integer.
M_GOTO_IF_FAILED
k
    Jump to label k if the last input instruction failed.
M_GOTO_IF_EOF
k
    Jump to label k if the standard input is at the end of file.

Other instructions
Instruction Stack before Stack after Remark
M_CALL
k
name
param n
...
param 1
result

Value k is one byte and name is a null-terminated string. Call the function whose name is name. That function can be defined anywhere in the byte code, possibly after the function that performs this call.

k must be the number of words in the list of parameters. The immediate consequence of this instruction is to push a new stack frame onto the stack and begin running function name. When the function returns, the parameters (k words) are popped and the result of the function (if there is one) is left on top of the stack.


Using the interpreter

The interpreter is available as a C program called machine.c. If the executable code for machine.c is in file machine, then command line

  machine myfile.m
runs the byte-code program found in file myfile. An assembler and disassembler are also available. Get the following files.