[SUMMARY] Chip-8 (#88)

As some of you may know, the annual ICFP contest was just two weekends ago and
this year's problem involved the creation of a small virtual machine, much as
this quiz does. We didn't plan that, but it turned out to be a pleasant
synergy.

The ICFP machine was really a non-ideal environment for Ruby. The performance
issues there really called for C speed. In contrast, Chip-8 programs seem to
fit Ruby emulation nicely.

I'm not going to cover it below, but do spend a little time playing with Sander
Land's complete emulator if you have a Windows box handy. It runs games from
the quiz-linked site and isn't much more code than some of the other solutions.

I do want to take a look at Boris Prinz's solution though, so let's get to it.
This first chunk of code isn't the actual quiz solution, but it's a wonderful
tool for seeing how solutions process Chip-8 files. Here's Boris's
disassembler:

  class Chip8Disassembler
    CODES = {
      /0000/ => 'exit',
      /1(...)/ => 'goto \1',
      /3(.)(..)/ => 'skip next if V\1 == 0x\2',
      /6(.)(..)/ => 'V\1 = 0x\2',
      /7(.)(..)/ => 'V\1 = V\1 + 0x\2',
      /8(.)(.)0/ => 'V\1 = V\2',
      /8(.)(.)1/ => 'V\1 = V\1 | V\2',
      /8(.)(.)2/ => 'V\1 = V\1 & V\2',
      /8(.)(.)3/ => 'V\1 = V\1 ^ V\2',
      /8(.)(.)4/ => 'V\1 = V\1 + V\2',
      /8(.)(.)5/ => 'V\1 = V\1 - V\2',
      /8(.)06/ => 'V\1 = V\1 >> 1',
      /8(.)(.)7/ => 'V\1 = V\2 - V\1',
      /8(.)0E/ => 'V\1 = V\1 << 1',
      /C(.)(..)/ => 'V\1 = rand() & 0x\2',
    }
  
    def self.code2text hexcode
      CODES.each do |re, subs|
        if hexcode =~ re
          return hexcode.sub(re, subs)
        end
      end
      '???'
    end
  
    def self.dump binary
      opcodes = binary.unpack "n*"
      opcodes.each_with_index do |code, waddr|
        code_hex = "%04X" % code
        printf("%03X: [%s] %s\n", waddr*2, code_hex, code2text(code_hex));
      end
    end
  end
  
  binary = File.read(ARGV[0])
  Chip8Disassembler.dump(binary)

If you skip to the end, you can see that this code just slurps the Chip-8
program passed as an argument and feeds the whole thing to dump().

The dump() method separates the program into opcodes with a simple call to
unpack(). The "n*" pattern for unpack() has it read two characters at a time as
an unsigned integer. It also dictates that the number is in "network
byte-order" which avoids endian issues on different architectures. From there
each opcode is traversed, turned into a hex code, and handed off to the
code2text() method for translation.

Inside code2text(), the passed hexcode is compared to each pattern of the CODES
Hash in turn. When a match is found, the key and value Hash pair are used to
perform a regular expression conversion from code to source statement and that
statement is returned.

Each of those statements is decorated with the code_hex and the address of the
instruction in the program file, then printed. The end result looks like this,
for the quiz test program:

  000: [6177] V1 = 0x77
  002: [6245] V2 = 0x45
  004: [7101] V1 = V1 + 0x01
  006: [8320] V3 = V2
  008: [8121] V1 = V1 | V2
  00A: [8122] V1 = V1 & V2
  00C: [8233] V2 = V2 ^ V3
  00E: [8134] V1 = V1 + V3
  010: [8235] V2 = V2 - V3
  012: [8106] V1 = V1 >> 1
  014: [8327] V3 = V2 - V3
  016: [830E] V3 = V3 << 1
  018: [64FF] V4 = 0xFF
  01A: [C411] V4 = rand() & 0x11
  01C: [32BB] skip next if V2 == 0xBB
  01E: [1000] goto 000
  020: [0000] exit

The hex codes from the quiz description are in brackets in the middle and you
can see the operations just to the right of that. Hopefully that makes it clear
how you break down instructions and what they are suppose to do.

Now we should be well equipped to read Boris's actual solution. Here's how it
begins:

  class Chip8Emulator
    attr_accessor :register
  
    VF = 15 # index of the carry/borrow register
  
    def initialize
      @register = [0] * 16 # V0..VF
    end
  
    # ...

I doubt there is anything tricky there. The registers are just an Array of
Integers which are made available through the accessor. We also see a
convenient VF constant defined for accessing that register.

Here's the main event loop of the emulator:

    # ...
    
    def exec(program)
      opcodes = program.unpack('n*')
      current = 0 # current opcode index
      loop do
        opcode = opcodes[current]
        
        # these are needed often:
        x = opcode >> 8 & 0x0F
        y = opcode >> 4 & 0x0F
        kk = opcode & 0xFF
        nnn = opcode & 0x0FFF
        
        case opcode >> 12 # first nibble
        when 0 then return
        when 1 then current = (nnn) / 2 and next
        when 3 then current += 1 if @register[x] == kk
        when 6 then @register[x] = kk
        when 7 then add(x, kk)
        when 8
          case opcode & 0x0F # last nibble
          when 0 then @register[x] = @register[y]
          when 1 then @register[x] |= @register[y]
          when 2 then @register[x] &= @register[y]
          when 3 then @register[x] ^= @register[y]
          when 4 then add(x, @register[y])
          when 5 then subtract(x, x, y)
          when 6 then shift_right(x)
          when 7 then subtract(x, y, x)
          when 0xE then shift_left(x)
          else raise "Unknown opcode: " + opcode.to_s(16)
          end
        when 0xC then random(x, kk)
        else raise "Unknown opcode: " + opcode.to_s(16)
        end
        current += 1 # next opcode
      end
    end
    
    # ...

This entire method basically comes right out of the quiz. The program is first
unpack()ed into opcodes, as we saw with the disassembler and a current opcode
pointer is initialized.

Next the code enters into an infinite loop() of opcode processing. Inside that
loop(), the current opcode is pulled and then split into the x, y, kk, and nnn
variables the quiz discusses.

The big case statement after that is the opcode instruction processor. Most of
those are trivial implementations of the quiz descriptions, but you can see that
the more complex operations are delegated to helper methods, which we will
examine in just a moment.

The last thing of note here is the jump implementation with the `and next` code
on the end of it. That forces the next iteration of the loop() to start
immediately, bypassing the current opcode pointer increment after the case
statement.

Let's see those helper methods:

    # ...
    
    def add(reg, value)
      result = @register[reg] + value
      @register[reg] = result & 0xFF
      @register[VF] = result >> 8 # carry
    end
  
    def subtract(reg, a, b)
      result = @register[a] - @register[b]
      @register[reg] = result & 0xFF
      @register[VF] = - (result >> 8) # borrow
    end
  
    def shift_right(reg)
      @register[VF] = @register[reg] & 0x01
      @register[reg] >>= 1
    end
  
    def shift_left(reg)
      @register[VF] = @register[reg] >> 7
      @register[reg] = (@register[reg] << 1) & 0xFF
    end
  
    def random(reg, kk)
      @register[reg] = rand(256) & kk
    end
    
    # ...

These are mainly just the operations that affect the VF register. Because they
require multiple steps, it was easier to move these into separate methods and
leave the event loop case statement clear. These definitions still come right
out of the quiz.

Finally, we have the last bit of code needed to dump registers and kick-off the
program run in the first place. Here's the code:

    # ...
    
    # show all registers
    def dump
      0.upto(VF) do |reg|
        printf("V%1X:%08b\n", reg, @register[reg])
      end
    end
  end
  
  if $0 == __FILE__
    ARGV.each do |program|
      emu = Chip8Emulator.new
      emu.exec(File.read(program))
      emu.dump
    end
  end

The dump() method is a simple print routine for register names and values,
iterating over all the registers. Below that we have the code the executes and
dumps each program provided as an argument, using the methods we have been
examining.

Boris's code did include unit tests. Each opcode was tested against a hand
crafted mini-program to check functionality. Here's the subtraction test, for
example:

    # ... other test code not shown ...
    
    def test_subtract
      @emu.exec("\x60\x00" + "\x61\x01" + "\x80\x15" + "\0\0")
      assert_equal [255, 1] + [0]*13 + [1], @emu.register
    end
    
    # ... other test code not shown ...

Don't miss the other solutions folks. Take some time to look them over. All
are quite clever. You will even find elements not covered in this summary, like
Alexandru E. Ungur's assembler!

My thanks to all who took tried their hand at building their own emulator. I
hope you found it a lot easier than I did building a good ICFP emulator.

In tomorrow's quiz, we will see just how far a simple method like upcase() can
really go in terms of useful application...