13.1. Garbage Collection


Mark/sweep garbage collectors

Your interpreter will need to create new AST nodes as it goes. Many of those nodes will become inaccessible, and should be recycled so that the interpreter can reuse them.

A mark/sweep garbage collector recovers inaccessible nodes in two phases.

  1. Each node has a mark bit, which is 0 when the garbage collector is not running.

    The mark phase traverses all memory that is in use and sets the mark bit of each node that it encounters to 1.

  2. The sweep phase looks at all of the space for nodes. It recycles each node that has its mark bit set to 0, and sets all mark bits to 0 as it goes.


The free-space list

You will need a global array of nodes, the node space, that you use for allocating nodes. All nodes that you use will be in this array.

Choose a reasonable size for the node space array, such as 10,000. It is not necessary to reallocate it if you run out of space.

Initially, create a linked list, the free-space list, of all of the nodes in the node array, by linking nodes through their s1 fields.

Write a function that allocates a node by removing the first node in the free-space list. Use that to allocate nodes.

Test your interpreter with that node allocation function before writing the garbage collector.


The in-use list

Create a global list of nodes (linked through the s1 field) that keeps track of nodes that are currently in use inside functions.

Whenever a function stores a pointer to a tree into one of its local variables, it must add that pointer to the in-use list.

Whenever a function returns, it must remove each of the pointers that it added to the in-use list.

Please create functions for those jobs.

Important. Make sure that you do not change the value of any type AST variable. If you do, you will need to remove its old value from the in-use list and add its new value.


The mark phase

The mark phase needs to look at every node that is in use.

Write a function markTree(T) that marks all nodes in tree T by traversing T.

Create a function, markEverything, that marks all trees that are currently in use. That includes

  1. All trees in the symbol table.

  2. All trees in the in-use list.

Important. When you mark a tree, mark the entire tree, not just its root.


The sweep phase

Write a function sweep() that loops through the node space. It should rebuild the free-space list as it goes, adding a node to the front of that list if the node's mark bit is 0.

If a node's mark bit is 1, sweep() should set just the mark bit to 0.


Garbage collection

Write a function that does garbage collection by marking then sweeping.

If the node allocator sees that there are no more nodes in the free-space list, it should call the garbage collection function and then try again.

If there are no free nodes even after garbage collection, the interpreter should say what happened and abort.