[SUMMARY] Verbal Arithmetic (#128)

I'm always impressed by the creativity of the solutions, but I think this week
stands out even more than usual. I literally spent hours going through the
solutions and learned some really great tricks from them. I wish I could take
you on the same tour of the code, but that would take this summary into the
range of a full text book in length.

Because I'm going to miss all of the following, let me point out some highlights
for your own explorations:

  * Though the brute-force solutions are slow, most of them handle any
    math equations Ruby can. That is an interesting advantage.
  * Andreas Launila sent in a fun preview of his Google Summer of Code
    project that looks to simplify many of these search problems we
    commonly use as quizzes.
  * Glen's solution is a nifty metaprogramming solution that customizes
    itself to the equation entered. It's lightning quick too.
  * Morton Goldberg solved the quiz with some genetic programming and
    that code is still quite a bit zippier than a brute-force search.

The solution I will show is from Eric I. It has an interesting state machine
design that tries to fail fast in an attempt to aggressively prune the search
space. It too finds solutions quite rapidly, though it only works for addition
problems.

Eric's code breaks the equation down into a small series of steps. Instead of
searching for a match for all numbers and then checking the result, this
solution checks as many little sub-criteria as possible. Does just this column
add up correctly, given what we know at this point? Is this digit a zero,
because it starts a term somewhere else?

These smaller tests lead to failures that allow the search to skip large groups
in the set of possible solutions. For example, if S can't be seven in just one
column, it's impossible to have any scenario where S is seven and all such
attempts can be safely skipped. That allows the code to zoom in on a correct
answer faster.

Now that we understand the logic, let's start tackling the code:

  require 'set'
  
  # State represents the stage of a partially solved word equation. It
  # keeps track of what digits letters map to, which digits have not yet
  # been assigned to letters, and the results of the last summed column,
  # including the resulting digit and any carry if there is one.
  class State
    attr_accessor :sum, :carry
    attr_reader :letters
  
    def initialize()
      @available_digits = Set.new(0..9)
      @letters = Hash.new
      @sum, @carry = 0, 0
    end
  
    # Return digit for letter.
    def [](letter)
      @letters[letter]
    end
  
    # The the digit for a letter.
    def []=(letter, digit)
      # if the letter is currently assigned, return its digit to the
      # available set
      @available_digits.add @letters[letter] if @letters[letter]
  
      @letters[letter] = digit
      @available_digits.delete digit
    end
  
    # Clear the digit for a letter.
    def clear(letter)
      @available_digits.add @letters[letter]
      @letters[letter] = nil
    end
  
    # Return the available digits as an array copied from the set.
    def available_digits
      @available_digits.to_a
    end
  
    # Tests whether a given digit is still available.
    def available?(digit)
      @available_digits.member? digit
    end
  
    # Receives the total for a column and keeps track of it as the
    # summed-to digit and any carry.
    def column_total=(total)
      @sum = total % 10
      @carry = total / 10
    end
  end
  
  # ...

This State object tracks progress through the equation, which will be solved
column by column. It has operations to track what each letter is currently
assigned to, assign letters as they are determined, examine which digits have
and have not been used, and track the sum of the last column plus any value
carried over to the next column. There's nothing too tricky in this data
structure code.

What we need to go with this, is an algorithm that drives this State object to a
solution. That code begins here:

  # ...
  
  # Step is an "abstract" base level class from which all the "concrete"
  # steps can be derived. It simply handles the storage of the next
  # step in the sequence. Subclasses should provide 1) a to_s method to
  # describe the step being performed and 2) a perform method to
  # actually perform the step.
  class Step
    attr_writer :next_step
  end
  
  # ...

This base Step is about as simple as things get. I merely provides a means of
storing the next step in the process.

Note that this class's abstract status and the required implementation for
subclasses are all handled through the documentation. That's perfectly
reasonable in a dynamic language like Ruby where we can count on duck typing to
resolve to the proper methods when the search is actually being performed.

Let's advance to a concrete implementation of the Step class:

  # ...
  
  # This step tries assigning each available digit to a given letter and
  # continuing from there.
  class ChooseStep < Step
    def initialize(letter)
      @letter = letter
    end
  
    def to_s
      "Choose a digit for \"#{@letter}\"."
    end
  
    def perform(state)
      state.available_digits.each do |v|
        state[@letter] = v
        @next_step.perform(state)
      end
      state.clear(@letter)
    end
  end
  
  # ...

This ChooseStep handles the digit guessing. It is created for some letter and
when perform() is triggered, it will try each unused in turn digit in that
position. After a new guess is set, the ChooseStep just hands off to a later
step to verify that the current guess works.

Here's another Step subclass:

  # ...
  
  # This step sums up the given letters and changes to state to reflect
  # the sum. Because we may have to backtrack, it stores the previous
  # saved sum and carry for later restoration.
  class SumColumnStep < Step
    def initialize(letters)
      @letters = letters
    end
  
    def to_s
      list = @letters.map { |l| "\"#{l}\"" }.join(', ')
      "Sum the column using letters #{list} (and include carry)."
    end
  
    def perform(state)
      # save sum and carry
      saved_sum, saved_carry = state.sum, state.carry
  
      state.column_total =
        state.carry +
        @letters.inject(0) { |sum, letter| sum + state[letter] }
      @next_step.perform(state)
  
      # restore sum and carry
      state.sum, state.carry = saved_sum, saved_carry
    end
  end
  
  # ...

This SumColumnStep will be added whenever guesses had been made for an entire
column. It's job is to add up that column and update the State with this new
total. You can see that it must save old State values and restore them when
backtracking.

Once we know a column total, we can use that to set a letter from the solution
side of the equation:

  # ...
  
  # This step determines the digit for a letter given the last column
  # summed. If the digit is not available, then we cannot continue.
  class AssignOnSumStep < Step
    def initialize(letter)
      @letter = letter
    end
  
    def to_s
      "Set the digit for \"#{@letter}\" based on last column summed."
    end
  
    def perform(state)
      if state.available? state.sum
        state[@letter] = state.sum
        @next_step.perform(state)
        state.clear(@letter)
      end
    end
  end
  
  # ...

This AssignOnSumStep is added for letters in the solution of the equation. It
will set the value of that letter to the calculated sum of the column, provided
that is a legal non-duplicate digit choice.

When we have assigned that letter, we need to verify that the whole column makes
sense mathematically:

  # ...
  
  # This step will occur after a column is summed, and the result must
  # match a letter that's already been assigned.
  class CheckOnSumStep < Step
    def initialize(letter)
      @letter = letter
    end
  
    def to_s
      "Verify that last column summed matches current " +
        "digit for \"#{@letter}\"."
    end
  
    def perform(state)
      @next_step.perform(state) if state[@letter] == state.sum
    end
  end
  
  # ...

Now, if we did all the guessing, summing, and assigning everything probably adds
up. But as we continue through the equation, some numbers will already be
filled in. Sums created using those may not balance with the total digit. This
CheckOnSumStep watches for such a case.

If the sum doesn't check out, this class causes backtracking. Note that all it
has to do is not forward to the following steps which will cause recursion to
unwind the stack until it has another option.

One last check can trim the search space further:

  # ...
  
  # This step will occur after a letter is assigned to a digit if the
  # letter is not allowed to be a zero, because one or more terms begins
  # with that letter.
  class CheckNotZeroStep < Step
    def initialize(letter)
      @letter = letter
    end
  
    def to_s
      "Verify that \"#{@letter}\" has not been assigned to zero."
    end
  
    def perform(state)
      @next_step.perform(state) unless state[@letter] == 0
    end
  end
  
  # ...

This CheckNotZeroStep is used to ensure that a leading letter in a term is
non-zero. Again, it fails to forward calls when this is not the case.

One more step is needed to catch correct solutions:

  # ...
  
  # This step represents finishing the equation. The carry must be zero
  # for the perform to have found an actual result, so check that and
  # display a digit -> letter conversion table and dispaly the equation
  # with the digits substituted in for the letters.
  class FinishStep < Step
    def initialize(equation)
      @equation = equation
    end
  
    def to_s
      "Display a solution (provided carry is zero)!"
    end
  
    def perform(state)
      # we're supposedly done, so there can't be anything left in carry
      return unless state.carry == 0
  
      # display a letter to digit table on a single line
      table = state.letters.invert
      puts
      puts table.keys.sort.map { |k| "#{table[k]}=#{k}" }.join(' ')
  
      # display the equation with digits substituted for the letters
      equation = @equation.dup
      state.letters.each { |k, v| equation.gsub!(k, v.to_s) }
      puts
      puts equation
    end
  end
  
  # ...

This method first ensures that we are successful by validating that we have no
remaining carry value. If that's true, our equation balanced out.

The rest of the work here is just in printing the found result. Nothing tricky
there.

We're now ready to get into the application code:

  # ...
  
  # Do a basic test for the command-line arguments validity.
  unless ARGV[0] =~ Regexp.new('^[a-z]+(\+[a-z]+)*=[a-z]+$')
    STDERR.puts "invalid argument"
    exit 1
  end
  
  # Split the command-line argument into terms and figure out how many
  # columns we're dealing with.
  terms = ARGV[0].split(/\+|=/)
  column_count = terms.map { |e| e.size }.max
  
  # Build the display of the equation a line at a time. The line
  # containing the final term of the sum has to have room for the plus
  # sign.
  display_columns = [column_count, terms[-2].size + 1].max
  display = []
  terms[0..-3].each do |term|
    display << term.rjust(display_columns)
  end
  display << "+" + terms[-2].rjust(display_columns - 1)
  display << "-" * display_columns
  display << terms[-1].rjust(display_columns)
  display = display.join("\n")
  puts display
  
  # AssignOnSumStep which letters cannot be zero since they're the first
  # letter of a term.
  nonzero_letters = Set.new
  terms.each { |e| nonzero_letters.add(e[0, 1]) }
  
  # A place to keep track of which letters have so-far been assigned.
  chosen_letters = Set.new
  
  # ...

This code validates the input and breaks it into terms. After that, the big
chunk of code here displays the equation in a pretty format, like the examples
from the quiz description.

The rest of the code begins to divide up the input as needed to build the proper
steps. The first tactic is to locate and letters that must be nonzero, because
they start a term. A set is also prepared to hold letters that have be given
values at any point in the process.

Here's the heart of the process code:

  # ...
  
  # Build up the steps needed to solve the equation.
  steps = []
  column_count.times do |column|
    index = -column - 1
    letters = [] # letters for this column to be added
  
    terms[0..-2].each do |term| # for each term that's being added...
      letter = term[index, 1]
      next if letter.nil? # skip term if no letter in column
      letters << letter # note that this letter is part of sum
  
      # if the letter does not have a digit, create a ChooseStep
      unless chosen_letters.member? letter
        steps << ChooseStep.new(letter)
        chosen_letters.add(letter)
        steps << CheckNotZeroStep.new(letter) if
          nonzero_letters.member? letter
      end
    end
  
    # create a SumColumnStep for the column
    steps << SumColumnStep.new(letters)
  
    summed_letter = terms[-1][index, 1] # the letter being summed to
  
    # check whether the summed to letter should already have a digit
    if chosen_letters.member? summed_letter
      # should already have a digit, check that summed digit matches it
      steps << CheckOnSumStep.new(summed_letter)
    else
      # doesn't already have digit, so create a AssignOnSumStep for
      # letter
      steps << AssignOnSumStep.new(summed_letter)
      chosen_letters.add(summed_letter)
  
      # check whether this letter cannot be zero and if so add a
      # CheckNotZeroStep
      steps << CheckNotZeroStep.new(summed_letter) if
        nonzero_letters.member? summed_letter
    end
  end
  
  # ...

This code breaks down the provided equation into the steps we've seen defined up
to this point. Though it's a fair bit of code, it's pretty straightforward and
very well commented. In short:

  1. Values are selected for the numbers in each column as needed.
  2. Columns are summed
  3. Sums are assigned and or validated as needed.

With the setup complete, here's the code that kicks the solver into action:

  # ...
  
  # should be done, so add a FinishStep
  steps << FinishStep.new(display)
  
  # print out all the steps
  # steps.each_with_index { |step, i| puts "#{i + 1}. #{step}" }
  
  # let each step know about the one that follows it.
  steps.each_with_index { |step, i| step.next_step = steps[i + 1] }
  
  # start performing with the first step.
  steps.first.perform(State.new)

Here the FinishStep is added, all steps are linked, and the perform() call is
made to get the ball rolling. You can uncomment the second chunk of code to
have a human-readable explanation of the steps added to the output.

My thanks to all the super clever solvers who tackled this problem. I was blown
away with the creativity.

Tomorrow we will put Ruby Quiz to work helping some friends of ours...