Writing a BrainF**k Interpreter

Recently while going through learxinyminutes I stumbled upon the (very) short documentation of BrainF**k programming language. I’ve known about this esoteric language for a long time but never took out time to read more about it. Looking into the documentation it seemed that the language has just a handful of commands and would be a great starting point for writing interpreters.

In this case the documentation itself spells out everything we need to create our own BrainF**k interpreter. I’ll be doing this in Python, but feel free to use any programming language you’re comfortable in. Making this interpreter could also help you in getting comfortable with the language you’re currently learning.

Copying directly from the learnxinyminutes website, here is the list of eight commands we need to handle:

+ : Increments the value at the current cell by one
- : Decrements the value at the current cell by one
> : Moves the data pointer to the next cell (cell on the right)
< : Moves the data pointer to the previous cell (cell on the left)
. : Prints the ASCII value at the current cell (i.e. 65 = 'A')
, : Reads a single input character into the current cell
[ : If the value at the current cell is zero, skips to the corresponding ] . Otherwise, move to the next instruction
] : If the value at the current cell is zero, move to the next instruction. Otherwise, move backwards in the instructions to the corresponding [

The runtime also requires an array of 30000 cells all initialised to zero and a data pointer pointing towards the first cell of this array.

Apart from the data pointer, we’ll use another variable called program counter which will point towards the current instruction we need to execute, just like how the CPU operates.

Let’s start with the pointers and array, we’ll also create a variable named program which will accept the BrainF**k program as user input.

# initialize dc array with 30000 "0"s
data_array = [0]*30000

# make a list of commands
program = input().strip()

# program counter
pc = 0

# data counter
dc = 0

One more thing that we need, and what actually makes this interpreter something other than easy, is the loop stack. The square brackets [ and ] effectively represent while loops. One thing you would have noticed while looking at the definition for ] is that it can jump towards the corresponding [. To be able to do this we need to save the position of [, so that we can set our program counter to it. But, we can have nested loops, and thus whenever we get a ], we need to jump to the last [ we found. This can be achieved by using a stack data structure (python lists can be easily used as stacks).

# stack for storing loop counter
loop = []

Now let’s actually start interpreting programs, our program variable string contains the individual commands we need to execute. Instead of iterating character by character through the program string, we’ll create an infinite while loop and iterate through the program manually through our pc variable. This is done because if we use an indexed for loop, it would become difficult to jump through instructions (Can be done, but not required). To break out of the program, we’ll just rely on catching the IndexError exception in Python which will tell us that we have effectively reached the end of program string.

# run forever
# we can have potential infinite loops
while (1):
    try:
        cmd = program[pc]
    except IndexError:
        # we have reached the end
        break

Now the only thing left is actually executing the commands, we’ll do this using a series of if-elif statements. Depending upon the command, we’ll either increase/decrease the data counter, or increase/decrease the actual cell pointed by the data counter. The two command we need to focus on are [ and ], rest are pretty self-explanatory.

For the opening square bracket, we need to skip the loop if the current cell pointed to by the data pointer is zero, for achieving this we’ll just skip towards the closing square bracket using a while loop. If the value pointed to is not zero, we’ll store the current loop pointer and start executing the next instruction.

For the closing square bracket, if the current cell pointed to by the data pointer is not zero, then we have to jump to the corresponding opening bracket, this we’ll get from our loop stack, if not then we just continue with the next instruction (but we still need to pop the stack)

    if cmd == '+':
        data_array[dc] += 1
    elif cmd == '-':
        data_array[dc] -= 1
    elif cmd == '>':
        dc += 1
    elif cmd == '<':
        dc -= 1
    elif cmd == '.':
        asc = chr(data_array[dc])
        print (asc, end='', flush=True)
    elif cmd == ',':
        ch = input().strip()
        data_array[dc] = ord(ch)
    elif cmd == '[':
        if data_array[dc] == 0:
            while (program[pc] != ']'):
                pc += 1
            continue
        else:
            # store the loop program counter
            # maintain a stack for nested loops
            loop.append(pc)
    if cmd == ']':
        if data_array[dc] != 0:
            pc = loop.pop()
            continue
        loop.pop()

    # move to next command
    pc += 1

And that’s it. Here’s the whole program in its entirety :

And here’ the script running: