What's the deal? Is it Sodoku, Sudoku, Su Doku, or what?
I wasn't aware of all the name variations when I wrote this quiz. It looks like
the actual number puzzle is usually called Sudoko, now that I've looked into it.
However, some seem to consider Sodoku a slightly different form of the game
which can have multiple solutions, while "true" Sudoku should have exactly one.
This fact complicated solutions a little, so I thought it was worth bringing up
before we dig into them.
The solutions to this quiz are quite varied and interesting. We have raw speed,
clever algorithms, java ports, and even not-quite-correct approaches. I learned
a lot digging trough the code and I encourage others to do the same.
I would say the overriding theme this time though was the sheer amount of code
that came in. Most of the solutions were over two hundred lines, a few even
over three hundred. This probably tells us that the problem was tricky, but I
also found some plenty wordy expressions in there. Let's look at some of those,
since I believe that rethinking code can be very instructive. Here's a method
from the first submission of David Brady's solution:
# Loads the @board array from a string matching the example above.
def load(str)
line_num = 0
str.each_line do |line|
line.gsub! '+', ''
line.gsub! '-', ''
line.gsub! '|', ''
line.gsub! ' ', ' '
line.gsub! '_', '0'
line.strip!
if line.length > 0
l = line.split
fail "Line length was #{l.length}" unless l.length == 9
@board[line_num] = l.collect {|x| x.to_i}
line_num += 1
end
end
fail "Board is not valid." unless self.valid?
end
This code parses the puzzle input format and uses it to load the initial @board,
which is just and Array (rows) of Arrays (columns). It's currently doing this
by cleaning up the lines and reading each integer. Can we smooth that out a
little?
Often with Ruby, having to use an index is a sign that you're not attacking the
problem in the easiest way. Each index is being used to completely replace a
line, so let's see if we can just append them directly:
# Loads the @board array from a string matching the example above.
def load(str)
@board = Array.new
str.each_line do |line|
line.gsub! '+', ''
line.gsub! '-', ''
line.gsub! '|', ''
line.gsub! ' ', ' '
line.gsub! '_', '0'
line.strip!
if line.length > 0
l = line.split
fail "Line length was #{l.length}" unless l.length == 9
@board << l.collect {|x| x.to_i}
end
end
fail "Board is not valid." unless self.valid?
end
That only saved one line I guess, but conceptually it's getting easier for me to
follow already. That's always a win, I think.
Let's tackle the text processing. My first instinct was:
# Loads the @board array from a string matching the example above.
def load(str)
@board = Array.new
str.each_line do |line|
line.delete! '|+-'
line.tr! '_', '0'
line.squeeze!
line.strip!
if line.length > 0
l = line.split
fail "Line length was #{l.length}" unless l.length == 9
@board << l.collect {|x| x.to_i}
end
end
fail "Board is not valid." unless self.valid?
end
Again, only two lines trimmed, but I'm actually helping myself to understand
what exactly this code is doing and that's far more important to me.
I now see that we're stripping non-digit or underscore characters. We're
leaving the spaces however, so we can later split() on whitespace to get each
cell. That gets me thinking: We don't have to split() on whitespace. If I can
get it down to just what I'm after, we can split() on characters:
# Loads the @board array from a string matching the example above.
def load(str)
@board = Array.new
str.each_line do |line|
line.delete! '^0-9_'
if line.length > 0
l = line.split('')
fail "Line length was #{l.length}" unless l.length == 9
@board << l.collect {|n| Integet(n) rescue 0 }
end
end
fail "Board is not valid." unless self.valid?
end
Note that I did change the collect() to be more obvious, since I removed
translation of '_' to '0'. This isn't strictly needed, as String.to_i() will
return 0 if it can't form a number, but I'm trying not to damage the readability
of this code with my changes.
We're getting close, I can tell, but there's another trick or two left. We have
to check the line length to see if it's one of the lines that contains cells or
just a border row. If we skip border rows altogether, we could the switch our
focus from deleting the unwanted data to grabbing wanted data. (Gavin Kistner
originally pointed this out on Ruby Talk.) Let's see what that does for us:
# Loads the @board array from a string matching the example above.
def load(str)
@board = Array.new
str.each_line do |line|
next unless line =~ /\d|_/
@board << line.scan(/[\d_]/).collect {|n| Integet(n) rescue 0 }
fail "Length was #{@board.last.length}" unless @board.last.length == 9
end
fail "Board is not valid." unless self.valid?
end
Moving the line verification to after I load @board also allowed me to do away
with the extra variable, as you can see.
To me, that removes a lot of the wordiness of the original code, without
sacrificing clarity or functionality. It's probably a touch more efficient too,
since we trimmed quite a few operations.
I believe Adam Shelly's parse routine could benefit from similar simplifications
if you want to try your own hand at a little refactoring.
Here's another chunk of code (from Horndude77's solution) just crying out for
some help:
def solve
#is there a better way to do this? it seems messy
# and redundant.
changed = true
while(changed && @unknown.length>0)
changed = false
changed = eliminateall ? true : changed
changed = checkallrows ? true : changed
changed = eliminateall ? true : changed
changed = checkallcols ? true : changed
changed = eliminateall ? true : changed
changed = checkallboxes ? true : changed
end
puts self
if(@unknown.length>0)
puts "I can't solve this one"
end
end
I told myself that was too easy and had the following knee-jerk reaction:
def solve
changed = true
while(changed && @unknown.length>0)
changed = eliminateall || checkallrows || eliminateall ||
checkallcols || eliminateall || checkallboxes
end
puts self
if(@unknown.length>0)
puts "I can't solve this one"
end
end
That's cute, and may even work if the underlying algorithms aren't sensitive to
the call order, but it is not identical in function to the original code. The
original method calls all of those methods and just tracks to see if any one of
them returned true. The second version will short-circuit the call chain as
soon as a method returns true. We'll have to be a bit more clever to avoid
that:
def solve
changed = true
while(changed && @unknown.length>0)
changed = %w{ eliminateall checkallrows eliminateall
checkallcols eliminateall checkallboxes }.map do |m|
send(m)
end.include?(true)
end
puts self
if(@unknown.length>0)
puts "I can't solve this one"
end
end
That should be equivalent to the original, I believe, minus some repetition.
My point in showing the above examples wasn't to pick on anyone and I apologize
if I gave any offense. I just wanted to explore a little idiomatic Ruby through
some examples.
Back to Sudoku itself. Let's look at a solution. Here's the beginning of
Dominik Bathon's solver class:
class SudokuSolver
# sudoku is an array of arrays, containing the rows, which contain the
# cells (all non valid entries are interpreted as open)
def initialize(sudoku)
# determine @n / @sqrt_n
@n = sudoku.size
@sqrt_n = Math.sqrt(@n).to_i
raise "wrong sudoku size" unless @sqrt_n * @sqrt_n == @n
# populate internal representation
@arr = sudoku.collect { |row|
# ensure correct width for all rows
(0...@n).collect { |i|
# fixed cell or all values possible for open cell
((1..@n) === row[i]) ? [row[i]] : (1..@n).to_a
}
}
# initialize fix arrays
# they will contain all fixed cells for all rows, cols and boxes
@rfix=Array.new(@n) { [] }
@cfix=Array.new(@n) { [] }
@bfix=Array.new(@n) { [] }
@n.times { |r| @n.times { |c| update_fix(r, c) } }
# check for non-unique numbers
[@rfix, @cfix, @bfix].each { |fix| fix.each { |x|
unless x.size == x.uniq.size
raise "non-unique numbers in row, col or box"
end
} }
end
# ...
This constructor takes an Array of Arrays, which is simply the board setup read
from the input file. After finding the board size, you can see the method
builds its internal Array (rows) of Arrays (columns) of Arrays (possible numbers
for that cell). Known cells are set to a one element member with the known
value, while other cells are set to an Array of all the possible numbers.
Next, we see that the code also builds representations for rows, columns, and
boxes and repeatedly calls update_fix(), we assume to populate them.
The method ends with a puzzle validation check, ensuring that there are no
duplicate numbers in rows, columns or boxes.
Jumping a little out of order now, let's examine the private methods used by the
constructor:
# ...
private
# returns the box index of row r and col c
def rc2box(r, c)
(r - (r % @sqrt_n)) + (c / @sqrt_n)
end
# if row r, col c contains a fixed cell, it is added to the fixed arrays
def update_fix(r, c)
if @arr[r][c].size == 1
@rfix[r] << @arr[r][c][0]
@cfix[c] << @arr[r][c][0]
@bfix[rc2box(r, c)] << @arr[r][c][0]
end
end
# ...
From here we can see that the rows, columns, and boxes tracking variables only
receive a number that has been narrowed down to a single possibility. Because
of that simplification, these are just two dimensional Arrays. Note that each
new-found number is just appended to the Array. These will not be in the same
order as they really appear in the puzzle, but since they're just used to verify
uniqueness it doesn't matter.
The first method, rc2box() just uses math to locate which box we're in, given a
row and column.
Back to the public methods:
# ...
public
# returns the internal representation as array of arrays
def to_a
@arr.collect { |row| row.collect { |x|
(x.size == 1) ? x[0] : nil
} }
end
# returns a simple string representation
def to_s
fw = @n.to_s.size
to_a.collect { |row| row.collect { |x|
(x ? x.to_s : "_").rjust(fw)
}.join " " }.join "\n"
end
# returns whether the puzzle is solved
def finished?
@arr.each { |row| row.each { |x| return false if x.size > 1 } }
true
end
# ...
The above methods allow you to query the solver for an Array representation, a
String representation, or just to find out if it is finished being solved yet.
Starting with to_a(), you can see that it basically just flattens the third
dimension of Arrays either into a known number choice, or nil for unknowns. The
next method, to_s(), calls to_a(), stringifies, and join()s the results.
On to the actual solving code:
# ...
# for each cell remove the possibilities, that are already used in the
# cell's row, col or box
# return if successful
def reduce
success = false
@n.times { |r| @n.times { |c|
if (sz = @arr[r][c].size) > 1
@arr[r][c] = @arr[r][c] -
(@rfix[r] | @cfix[c] | @bfix[rc2box(r, c)])
raise "impossible to solve" if @arr[r][c].empty?
# have we been successful
if @arr[r][c].size < sz
success = true
update_fix(r, c)
end
end
} }
success
end
# ...
This method is a simple, but important, piece of the solving task. It simply
walks cell by cell reducing the possibilities by what we already know. It uses
the Array union operator (|) to combine all known numbers for the row, column
and box of this cell. All of those numbers are then removed from the
possibilities using the Array difference operator (-). When any cell shrinks in
choices, update_fix() is called again to notify row, column, and box of the
change. As long as a single cell lost a single possibility this method returns
true to report progress.
Here's another solving method:
# ...
# find open cells with unique elements in their row, col or box
# return if successful
# reduce must return false when this method is called (if the
# possibilities aren't reduced, bad things may happen...)
def deduce
success = false
[:col_each, :row_each, :box_each].each { |meth|
@n.times { |i|
u = uniqs_in(meth, i)
unless u.empty?
send(meth, i) { |x|
if x.size > 1 && ((u2 = u & x).size == 1)
success = true
u2
else
nil
end
}
# change only one row/col/box at a time
return success if success
end
}
}
success
end
# ...
Another way to be sure of a cell is to find a unique possibility in the row,
column, or box. In other words, if two is a possibility in the fifth cell of a
row, but not a possibility in any other cell of the row, we know it belongs in
the fifth cell and we can place it.
This code hunts for that using iterators to get all the cells in a row, column,
or box and the helper method uniqs_in(), which performs the search I just
explained. When a unique option is found, the code places it and returns true
to indicate progress.
Here are all four private helper methods:
# ...
private
# yields each cell of row r and assigns the result of the yield unless
# it is nil
def row_each(r)
@n.times { |c|
if (res = yield(@arr[r][c]))
@arr[r][c] = res
update_fix(r, c)
end
}
end
# yields each cell of col c and assigns the result of the yield unless
# it is nil
def col_each(c)
@n.times { |r|
if (res = yield(@arr[r][c]))
@arr[r][c] = res
update_fix(r, c)
end
}
end
# yields each cell of box b and assigns the result of the yield unless
# it is nil
def box_each(b)
off_r, off_c = (b - (b % @sqrt_n)), (b % @sqrt_n) * @sqrt_n
@n.times { |i|
r, c = off_r + (i / @sqrt_n), off_c + (i % @sqrt_n)
if (res = yield(@arr[r][c]))
@arr[r][c] = res
update_fix(r, c)
end
}
end
# find unique numbers in possibility lists of a row, col or box
# each_meth must be :row_each, :col_each or :box_each
def uniqs_in(each_meth, index)
h = Hash.new(0)
send(each_meth, index) { |x|
x.each { |n| h[n] += 1 } if x.size > 1
nil # we didn't change anything
}
h.select { |k, v| v == 1 }.collect { |k, v| k }
end
# ...
The iterators are pretty obvious. The only gotcha to their use is that the
block is expected to return true or false, indicating if the cell was updated.
This allows the iterator to call update_fix() and keep the internal
representations in sync.
The uniqs_in() method just uses those iterators to fill a Hash with seen counts
and then returns all keys that were only seen once.
Finally, we start to see it all come together with the next method:
# ...
public
# tries to solve the sudoku with reduce and deduce
# returns one of :impossible, :solved, :unknown
def solve
begin
until finished?
progress = false
while reduce
progress = true
end
progress = true if deduce
return :unknown unless progress
end
:solved
rescue
:impossible
end
end
# ...
This method just combines calls to to the previously seen reduce() and deduce()
to see if it can use process of elimination to solve the problem. It loops as
long as either method reports some progress. It will eventually return :solved,
if finished?() declares the puzzle done, or :unknown if it runs out of
reductions and deductions. :impossible is returned in the event of a problem.
The above can solve some puzzles quickly and efficiently, but it's not a
complete solution. When it won't go any father, it's time for some guess work:
# ...
# solves the sudoku using solve and if that fails, it tries to guess
# returns one of :impossible, :solved, :multiple_solutions
def backtrack_solve
if (res = solve) == :unknown
# find first open cell
r, c = 0, 0
@rfix.each_with_index { |rf, r|
break if rf.size < @n
}
@arr[r].each_with_index { |x, c|
break if x.size > 1
}
partial = to_a
solutions = []
# try all possibilities for the open cell
@arr[r][c].each { |guess|
partial[r][c] = guess
rsolver = SudokuSolver.new(partial)
case rsolver.backtrack_solve
when :multiple_solutions
initialize(rsolver.to_a)
return :multiple_solutions
when :solved
solutions << rsolver
end
}
if solutions.empty?
return :impossible
else
initialize(solutions[0].to_a)
return solutions.size > 1 ? :multiple_solutions : :solved
end
end
res
end
end
# ...
Note that this method begins by calling solve(). If it yields a complete
solution, the rest of the work can be skipped. Even if it doesn't though, it
should have reduced the possibilities, making the coming job easier.
The next bit of code locates a cell to start guessing with. Rows are scanned to
find one that hasn't been completely filled in, then columns are scanned to find
a cell with more than one possibility. Note how those two iterations
purposefully clobber the local variables (r and c), so they will hold the final
address of the cell when scanning is done.
Finally, we're to the guess work. An Array is prepared to hold solutions and
the current known cells are retrieved with a call to to_a(). Then, each
possibility for the selected cell is inserted and a new solver is built and run.
This amounts to recursion of the entire process. The results of these guesses
are examined by a case statement and added to the solutions Array when found.
The case statement ignores :impossible returns, since these are just wrong
guesses.
Finally the method checks to see if any solutions were found, and returns the
proper Symbol for the results.
The last little chunk of code handles input and output for the solution:
# ...
if $0 == __FILE__
# read a sudoku from stdin
sudoku = []
while sudoku.size < 9
row = gets.scan(/\d|_/).map { |s| s.to_i }
sudoku << row if row.size == 9
end
# solve
begin
solver = SudokuSolver.new(sudoku)
puts "Input:", solver
case solver.backtrack_solve
when :solved
puts "Solution:"
when :multiple_solutions
puts "There are multiple solutions!", "One solution:"
else
puts "Impossible:"
end
puts solver
rescue => e
puts "Error: #{e.message}"
end
end
Look at that four line input read up there. Its similar to what we reduced the
other code to at the beginning of this summary. The output is very straight
forward. It just makes good use of to_s() in the solver object to print the
board before and after.
My thanks to all who sent in their solutions, sometimes many, many times.
Next week's Ruby Quiz is shamelessly stolen from another source of Ruby
challenges. Stay tuned to see the kind of problems Dave Thomas cooks up...