What's the deal? Is it Sodoku, Sudoku, Su Doku, or what?

I wasn't aware of all the name variations when I wrote this quiz. It looks like

the actual number puzzle is usually called Sudoku, now that I've looked into it.

However, some seem to consider Sodoku a slightly different form of the game

which can have multiple solutions, while "true" Sudoku should have exactly one.

This fact complicated solutions a little, so I thought it was worth bringing up

before we dig into them.

The solutions to this quiz are quite varied and interesting. We have raw speed,

clever algorithms, Java ports, and even not-quite-correct approaches. I learned

a lot digging through the code and I encourage others to do the same.

I would say the overriding theme this time though was the sheer amount of code

that came in. Most of the solutions were over two hundred lines, a few even

over three hundred. This probably tells us that the problem was tricky, but I

also found some rather wordy expressions in there. Let's look at some of those,

since I believe that rethinking code can be very instructive. Here's a method

from the first submission of David Brady's solution:

    # Loads the @board array from a string matching the example above.
    def load(str)
      line_num = 0
      str.each_line do |line|
        line.gsub! '+', ''
        line.gsub! '-', ''
        line.gsub! '|', ''
        line.gsub! '  ', ' '
        line.gsub! '_', '0'
        line.strip!
        if line.length > 0
          l = line.split
          fail "Line length was #{l.length}" unless l.length == 9
          @board[line_num] = l.collect {|x| x.to_i}
          line_num += 1
        end
      end
      fail "Board is not valid." unless self.valid?
    end

This code parses the puzzle input format and uses it to load the initial @board,

which is just an Array (rows) of Arrays (columns). It's currently doing this

by cleaning up the lines and reading each integer. Can we smooth that out a

little?

Often with Ruby, having to use an index is a sign that you're not attacking the

problem in the easiest way. Each index is being used to completely replace a

line, so let's see if we can just append them directly:

    # Loads the @board array from a string matching the example above.
    def load(str)
      @board = Array.new
      str.each_line do |line|
        line.gsub! '+', ''
        line.gsub! '-', ''
        line.gsub! '|', ''
        line.gsub! '  ', ' '
        line.gsub! '_', '0'
        line.strip!
        if line.length > 0
          l = line.split
          fail "Line length was #{l.length}" unless l.length == 9
          @board << l.collect {|x| x.to_i}
        end
      end
      fail "Board is not valid." unless self.valid?
    end

That only saved one line I guess, but conceptually it's getting easier for me to

follow already. That's always a win, I think.

Let's tackle the text processing. My first instinct was:

    # Loads the @board array from a string matching the example above.
    def load(str)
      @board = Array.new
      str.each_line do |line|
        line.delete! '|+-'
        line.tr! '_', '0'
        line.squeeze!
        line.strip!
        if line.length > 0
          l = line.split
          fail "Line length was #{l.length}" unless l.length == 9
          @board << l.collect {|x| x.to_i}
        end
      end
      fail "Board is not valid." unless self.valid?
    end

Again, only two lines trimmed, but I'm actually helping myself to understand

what exactly this code is doing and that's far more important to me.
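
A small aside on why those calls sit on separate lines: the bang versions of delete, tr, squeeze, and strip return nil when they make no change, so chaining them is risky. Here's a minimal sketch of the same cleanup using the non-bang forms. The clean_line name and the sample row are made up for illustration, and I pass ' ' to squeeze so that only runs of spaces are collapsed:

```ruby
# Hypothetical helper mirroring the cleanup steps with non-bang methods.
def clean_line(line)
  # drop border characters, turn blanks into zeros, collapse runs of spaces
  line.delete('|+-').tr('_', '0').squeeze(' ').strip
end

sample = "| _ 6 _ | 1 _ 4 | _ 5 _ |"   # made-up input row
p clean_line(sample)                   # => "0 6 0 1 0 4 0 5 0"

# The bang form reports "nothing changed" by returning nil.
p "no borders here".dup.delete!('|+-') # => nil
```

The non-bang chain always hands back a String, which is what makes it safe to string together.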

I now see that we're stripping non-digit or underscore characters. We're

leaving the spaces however, so we can later split() on whitespace to get each

cell. That gets me thinking: We don't have to split() on whitespace. If I can

get it down to just what I'm after, we can split() on characters:

    # Loads the @board array from a string matching the example above.
    def load(str)
      @board = Array.new
      str.each_line do |line|
        line.delete! '^0-9_'
        if line.length > 0
          l = line.split('')
          fail "Line length was #{l.length}" unless l.length == 9
          @board << l.collect {|n| Integer(n) rescue 0 }
        end
      end
      fail "Board is not valid." unless self.valid?
    end

Note that I did change the collect() to be more obvious, since I removed

translation of '_' to '0'. This isn't strictly needed, as String.to_i() will

return 0 if it can't form a number, but I'm trying not to damage the readability

of this code with my changes.
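
To make the difference between the two conversions concrete, here's a tiny sketch (the strict variable is just a name for the demo). to_i() quietly falls back to 0, while Integer() raises an ArgumentError that the rescue modifier turns into 0:

```ruby
p "7".to_i        # => 7
p "_".to_i        # => 0, with no complaint
p "12abc".to_i    # => 12, trailing junk is ignored

strict = Integer("_") rescue 0
p strict          # => 0, but only because the ArgumentError was rescued
```

Either form gives 0 for an underscore; the Integer() version just spells out that a non-number is expected here.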

We're getting close, I can tell, but there's another trick or two left. We have

to check the line length to see if it's one of the lines that contains cells or

just a border row. If we skip border rows altogether, we could then switch our

focus from deleting the unwanted data to grabbing wanted data. (Gavin Kistner

originally pointed this out on Ruby Talk.) Let's see what that does for us:

    # Loads the @board array from a string matching the example above.
    def load(str)
      @board = Array.new
      str.each_line do |line|
        next unless line =~ /\d|_/
        @board << line.scan(/[\d_]/).collect {|n| Integer(n) rescue 0 }
        fail "Length was #{@board.last.length}" unless @board.last.length == 9
      end
      fail "Board is not valid." unless self.valid?
    end

Moving the line verification to after I load @board also allowed me to do away

with the extra variable, as you can see.

To me, that removes a lot of the wordiness of the original code, without

sacrificing clarity or functionality. It's probably a touch more efficient too,

since we trimmed quite a few operations.
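
If you'd like to try the final version outside of the solver class, here's a self-contained sketch. The parse_board name is my own, it returns the board instead of setting @board, and it skips the valid?() check since that method isn't shown here. The input is just a sample in the quiz's format:

```ruby
# Hypothetical standalone version of the final load(); returns the board.
def parse_board(str)
  board = []
  str.each_line do |line|
    next unless line =~ /\d|_/   # border rows hold no digits or underscores
    board << line.scan(/[\d_]/).collect { |n| Integer(n) rescue 0 }
    fail "Length was #{board.last.length}" unless board.last.length == 9
  end
  board
end

# Sample input; the cell values are only for illustration.
input = <<~BOARD
  +-------+-------+-------+
  | _ 6 _ | 1 _ 4 | _ 5 _ |
  | _ _ 8 | 3 _ 5 | 6 _ _ |
  | 2 _ _ | _ _ _ | _ _ 1 |
  +-------+-------+-------+
  | 8 _ _ | 4 _ 7 | _ _ 6 |
  | _ _ 6 | _ _ _ | 3 _ _ |
  | 7 _ _ | 9 _ 1 | _ _ 4 |
  +-------+-------+-------+
  | 5 _ _ | _ _ _ | _ _ 2 |
  | _ _ 7 | 2 _ 6 | 9 _ _ |
  | _ 4 _ | 5 _ 8 | _ 7 _ |
  +-------+-------+-------+
BOARD

board = parse_board(input)
p board.size    # => 9
p board.first   # => [0, 6, 0, 1, 0, 4, 0, 5, 0]
```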

I believe Adam Shelly's parse routine could benefit from similar simplifications

if you want to try your own hand at a little refactoring.

Here's another chunk of code (from Horndude77's solution) just crying out for

some help:

    def solve
      #is there a better way to do this? it seems messy
      # and redundant.
      changed = true
      while(changed && @unknown.length>0)
        changed = false
        changed = eliminateall ? true : changed
        changed = checkallrows ? true : changed
        changed = eliminateall ? true : changed
        changed = checkallcols ? true : changed
        changed = eliminateall ? true : changed
        changed = checkallboxes ? true : changed
      end
      puts self
      if(@unknown.length>0)
        puts "I can't solve this one"
      end
    end

I told myself that was too easy and had the following knee-jerk reaction:

    def solve
      changed = true
      while(changed && @unknown.length>0)
        changed = eliminateall || checkallrows || eliminateall ||
                  checkallcols || eliminateall || checkallboxes
      end
      puts self
      if(@unknown.length>0)
        puts "I can't solve this one"
      end
    end

That's cute, and may even work if the underlying algorithms aren't sensitive to

the call order, but it is not identical in function to the original code. The

original method calls all of those methods and just tracks to see if any one of

them returned true. The second version will short-circuit the call chain as

soon as a method returns true. We'll have to be a bit more clever to avoid

that:

    def solve
      changed = true
      while(changed && @unknown.length>0)
        changed = %w{ eliminateall checkallrows eliminateall
                      checkallcols eliminateall checkallboxes }.map do |m|
          send(m)
        end.include?(true)
      end
      puts self
      if(@unknown.length>0)
        puts "I can't solve this one"
      end
    end

That should be equivalent to the original, I believe, minus some repetition.
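
The difference between the last two versions is easy to see in isolation. In this sketch, step_a, step_b, and step_c are hypothetical stand-ins for the solver's methods; each one records that it ran:

```ruby
# Hypothetical stand-ins for the solver's steps; each records that it ran.
CALLS = []
def step_a; CALLS << :a; true;  end
def step_b; CALLS << :b; false; end
def step_c; CALLS << :c; true;  end

# || stops at the first truthy result, so step_b and step_c never run.
CALLS.clear
changed = step_a || step_b || step_c
short_circuit_calls = CALLS.dup
p changed, short_circuit_calls   # prints true, then [:a]

# map runs every step first; include?(true) only inspects the results.
CALLS.clear
changed = %w{ step_a step_b step_c }.map { |m| send(m) }.include?(true)
full_calls = CALLS.dup
p changed, full_calls            # prints true, then [:a, :b, :c]
```

Note that include?(true) only matches a literal true, so this form assumes the steps return real booleans rather than merely truthy values.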

My point in showing the above examples wasn't to pick on anyone and I apologize

if I gave any offense. I just wanted to explore a little idiomatic Ruby through

some examples.

Back to Sudoku itself. Let's look at a solution. Here's the beginning of

Dominik Bathon's solver class:

    class SudokuSolver

      # sudoku is an array of arrays, containing the rows, which contain the
      # cells (all non valid entries are interpreted as open)
      def initialize(sudoku)
        # determine @n / @sqrt_n
        @n = sudoku.size
        @sqrt_n = Math.sqrt(@n).to_i
        raise "wrong sudoku size" unless @sqrt_n * @sqrt_n == @n

        # populate internal representation
        @arr = sudoku.collect { |row|
          # ensure correct width for all rows
          (0...@n).collect { |i|
            # fixed cell or all values possible for open cell
            ((1..@n) === row[i]) ? [row[i]] : (1..@n).to_a
          }
        }

        # initialize fix arrays
        # they will contain all fixed cells for all rows, cols and boxes
        @rfix=Array.new(@n) { [] }
        @cfix=Array.new(@n) { [] }
        @bfix=Array.new(@n) { [] }
        @n.times { |r| @n.times { |c| update_fix(r, c) } }

        # check for non-unique numbers
        [@rfix, @cfix, @bfix].each { |fix| fix.each { |x|
          unless x.size == x.uniq.size
            raise "non-unique numbers in row, col or box"
          end
        } }
      end

      # ...

This constructor takes an Array of Arrays, which is simply the board setup read

from the input file. After finding the board size, you can see the method

builds its internal Array (rows) of Arrays (columns) of Arrays (possible numbers

for that cell). Known cells are set to a one-element Array holding the known

value, while other cells are set to an Array of all the possible numbers.

Next, we see that the code also builds representations for rows, columns, and

boxes and repeatedly calls update_fix(), presumably to populate them.

The method ends with a puzzle validation check, ensuring that there are no

duplicate numbers in rows, columns or boxes.
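
Here's a minimal sketch of that three-level structure on a made-up 4x4 grid, using plain local variables (rows, n, candidates) in place of the solver's instance variables:

```ruby
# A made-up 4x4 grid; 0 marks an open cell (anything outside 1..n would do).
rows = [
  [1, 0, 0, 4],
  [0, 4, 1, 0],
  [4, 0, 0, 1],
  [0, 1, 4, 0]
]
n = rows.size

# Known cells become one-element Arrays, open cells get every possibility.
candidates = rows.collect do |row|
  (0...n).collect do |i|
    (1..n) === row[i] ? [row[i]] : (1..n).to_a
  end
end

p candidates[0]   # => [[1], [1, 2, 3, 4], [1, 2, 3, 4], [4]]
```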

Jumping a little out of order now, let's examine the private methods used by the

constructor:

      # ...

      private

      # returns the box index of row r and col c
      def rc2box(r, c)
        (r - (r % @sqrt_n)) + (c / @sqrt_n)
      end

      # if row r, col c contains a fixed cell, it is added to the fixed arrays
      def update_fix(r, c)
        if @arr[r][c].size == 1
          @rfix[r] << @arr[r][c][0]
          @cfix[c] << @arr[r][c][0]
          @bfix[rc2box(r, c)] << @arr[r][c][0]
        end
      end

      # ...

From here we can see that the rows, columns, and boxes tracking variables only

receive a number that has been narrowed down to a single possibility. Because

of that simplification, these are just two dimensional Arrays. Note that each

new-found number is just appended to the Array. These will not be in the same

order as they really appear in the puzzle, but since they're just used to verify

uniqueness it doesn't matter.

The first method, rc2box(), just uses math to locate which box we're in, given a

row and column.
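
The arithmetic is easier to trust once you see it laid out. Rounding the row down to a multiple of the box width gives the index of the first box in that band, and the integer division picks the box within the band. A small sketch, with box_index and SQRT_N as demo names standing in for rc2box() and @sqrt_n:

```ruby
SQRT_N = 3  # box width for a standard 9x9 board

# Same arithmetic as rc2box(), written as a free function for the demo.
def box_index(r, c)
  (r - (r % SQRT_N)) + (c / SQRT_N)
end

# Print the box number of every cell, one board row per line.
9.times do |r|
  puts((0...9).map { |c| box_index(r, c) }.join(" "))
end

p box_index(4, 7)   # => 5
```

Boxes come out numbered 0 through 8, left to right and top to bottom.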

Back to the public methods:

      # ...

      public

      # returns the internal representation as array of arrays
      def to_a
        @arr.collect { |row| row.collect { |x|
          (x.size == 1) ? x[0] : nil
        } }
      end

      # returns a simple string representation
      def to_s
        fw = @n.to_s.size
        to_a.collect { |row| row.collect { |x|
          (x ? x.to_s : "_").rjust(fw)
        }.join " " }.join "\n"
      end

      # returns whether the puzzle is solved
      def finished?
        @arr.each { |row| row.each { |x| return false if x.size > 1 } }
        true
      end

      # ...

The above methods allow you to query the solver for an Array representation, a

String representation, or just to find out if it is finished being solved yet.

Starting with to_a(), you can see that it basically just flattens the third

dimension of Arrays either into a known number choice, or nil for unknowns. The

next method, to_s(), calls to_a(), stringifies, and join()s the results.

On to the actual solving code:

      # ...

      # for each cell remove the possibilities, that are already used in the
      # cell's row, col or box
      # return if successful
      def reduce
        success = false
        @n.times { |r| @n.times { |c|
          if (sz = @arr[r][c].size) > 1
            @arr[r][c] = @arr[r][c] -
              (@rfix[r] | @cfix[c] | @bfix[rc2box(r, c)])
            raise "impossible to solve" if @arr[r][c].empty?
            # have we been successful
            if @arr[r][c].size < sz
              success = true
              update_fix(r, c)
            end
          end
        } }
        success
      end

      # ...

This method is a simple, but important, piece of the solving task. It simply

walks cell by cell reducing the possibilities by what we already know. It uses

the Array union operator (|) to combine all known numbers for the row, column

and box of this cell. All of those numbers are then removed from the

possibilities using the Array difference operator (-). When any cell shrinks in

choices, update_fix() is called again to notify row, column, and box of the

change. As long as a single cell lost a single possibility this method returns

true to report progress.
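
The two Array operators do all the heavy lifting there. A quick sketch with made-up values for a single open cell on a 9x9 board (the variable names are mine, standing in for the @rfix, @cfix, and @bfix entries):

```ruby
# Made-up state for one open cell on a 9x9 board.
row_fixed = [5, 3]
col_fixed = [3, 7]
box_fixed = [9, 5]
cell      = (1..9).to_a

used = row_fixed | col_fixed | box_fixed   # union drops the repeated 3 and 5
cell -= used                               # difference removes what is taken

p used   # => [5, 3, 7, 9]
p cell   # => [1, 2, 4, 6, 8]
```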

Here's another solving method:

      # ...

      # find open cells with unique elements in their row, col or box
      # return if successful
      # reduce must return false when this method is called (if the
      # possibilities aren't reduced, bad things may happen...)
      def deduce
        success = false
        [:col_each, :row_each, :box_each].each { |meth|
          @n.times { |i|
            u = uniqs_in(meth, i)
            unless u.empty?
              send(meth, i) { |x|
                if x.size > 1 && ((u2 = u & x).size == 1)
                  success = true
                  u2
                else
                  nil
                end
              }
              # change only one row/col/box at a time
              return success if success
            end
          }
        }
        success
      end

      # ...

Another way to be sure of a cell is to find a unique possibility in the row,

column, or box. In other words, if two is a possibility in the fifth cell of a

row, but not a possibility in any other cell of the row, we know it belongs in

the fifth cell and we can place it.

This code hunts for that using iterators to get all the cells in a row, column,

or box and the helper method uniqs_in(), which performs the search I just

explained. When a unique option is found, the code places it and returns true

to indicate progress.

Here are all four private helper methods:

      # ...

      private

      # yields each cell of row r and assigns the result of the yield unless
      # it is nil
      def row_each(r)
        @n.times { |c|
          if (res = yield(@arr[r][c]))
            @arr[r][c] = res
            update_fix(r, c)
          end
        }
      end

      # yields each cell of col c and assigns the result of the yield unless
      # it is nil
      def col_each(c)
        @n.times { |r|
          if (res = yield(@arr[r][c]))
            @arr[r][c] = res
            update_fix(r, c)
          end
        }
      end

      # yields each cell of box b and assigns the result of the yield unless
      # it is nil
      def box_each(b)
        off_r, off_c = (b - (b % @sqrt_n)), (b % @sqrt_n) * @sqrt_n
        @n.times { |i|
          r, c = off_r + (i / @sqrt_n), off_c + (i % @sqrt_n)
          if (res = yield(@arr[r][c]))
            @arr[r][c] = res
            update_fix(r, c)
          end
        }
      end

      # find unique numbers in possibility lists of a row, col or box
      # each_meth must be :row_each, :col_each or :box_each
      def uniqs_in(each_meth, index)
        h = Hash.new(0)
        send(each_meth, index) { |x|
          x.each { |n| h[n] += 1 } if x.size > 1
          nil # we didn't change anything
        }
        h.select { |k, v| v == 1 }.collect { |k, v| k }
      end

      # ...

The iterators are pretty obvious. The only gotcha to their use is that the

block is expected to return the cell's new contents, or nil to leave it alone.

This allows the iterator to call update_fix() and keep the internal

representations in sync.

The uniqs_in() method just uses those iterators to fill a Hash with seen counts

and then returns all keys that were only seen once.
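
To see the counting trick and the placement step together, here's a sketch on a single made-up row of a 4x4 board. It works on a plain Array (unit) rather than going through the solver's iterators, so it only shows the idea, not the real call path:

```ruby
# Made-up possibility lists for the cells of one row (4x4 board for brevity).
unit = [[1, 2], [1, 2, 4], [3], [1, 2]]

# Count how often each number appears among the open cells.
counts = Hash.new(0)
unit.each { |cell| cell.each { |n| counts[n] += 1 } if cell.size > 1 }
uniques = counts.select { |_, seen| seen == 1 }.keys
p uniques   # => [4]

# An open cell sharing exactly one number with uniques can be fixed to it.
unit.map! do |cell|
  hit = cell & uniques
  (cell.size > 1 && hit.size == 1) ? hit : cell
end
p unit      # => [[1, 2], [4], [3], [1, 2]]
```

Only the second cell can hold the 4, so it gets placed there even though that cell still listed three possibilities.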

Finally, we start to see it all come together with the next method:

      # ...

      public

      # tries to solve the sudoku with reduce and deduce
      # returns one of :impossible, :solved, :unknown
      def solve
        begin
          until finished?
            progress = false
            while reduce
              progress = true
            end
            progress = true if deduce
            return :unknown unless progress
          end
          :solved
        rescue
          :impossible
        end
      end

      # ...

This method just combines calls to the previously seen reduce() and deduce()

to see if it can use process of elimination to solve the problem. It loops as

long as either method reports some progress. It will eventually return :solved,

if finished?() declares the puzzle done, or :unknown if it runs out of

reductions and deductions. :impossible is returned if an error is raised along the way, as reduce() does when a cell runs out of possibilities.

The above can solve some puzzles quickly and efficiently, but it's not a

complete solution. When it won't go any farther, it's time for some guess work:

      # ...

      # solves the sudoku using solve and if that fails, it tries to guess
      # returns one of :impossible, :solved, :multiple_solutions
      def backtrack_solve
        if (res = solve) == :unknown
          # find first open cell
          r, c = 0, 0
          @rfix.each_with_index { |rf, r|
            break if rf.size < @n
          }
          @arr[r].each_with_index { |x, c|
            break if x.size > 1
          }
          partial = to_a
          solutions = []
          # try all possibilities for the open cell
          @arr[r][c].each { |guess|
            partial[r][c] = guess
            rsolver = SudokuSolver.new(partial)
            case rsolver.backtrack_solve
            when :multiple_solutions
              initialize(rsolver.to_a)
              return :multiple_solutions
            when :solved
              solutions << rsolver
            end
          }
          if solutions.empty?
            return :impossible
          else
            initialize(solutions[0].to_a)
            return solutions.size > 1 ? :multiple_solutions : :solved
          end
        end
        res
      end

    end

    # ...

Note that this method begins by calling solve(). If it yields a complete

solution, the rest of the work can be skipped. Even if it doesn't though, it

should have reduced the possibilities, making the coming job easier.

The next bit of code locates a cell to start guessing with. Rows are scanned to

find one that hasn't been completely filled in, then columns are scanned to find

a cell with more than one possibility. Note how those two iterations

purposefully clobber the local variables (r and c), so they will hold the final

address of the cell when scanning is done.
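
One caveat if you run this on a newer interpreter: the clobbering trick depends on Ruby 1.8 behavior. From Ruby 1.9 on, block parameters are always local to the block, so the outer r and c would be left at their initial 0. Here's a minimal sketch of an equivalent search that doesn't rely on it. The first_open_cell name is hypothetical, and it takes a plain Array of rows in place of @arr and @rfix:

```ruby
# Hypothetical helper: returns [row, col] of the first cell that still has
# more than one candidate, or nil when every cell is decided.
def first_open_cell(candidates)
  candidates.each_with_index do |row, r|
    c = row.index { |cell| cell.size > 1 }
    return [r, c] if c
  end
  nil
end

# Made-up 2x2 candidate grid, just to exercise the helper.
grid = [
  [[1], [2]],
  [[2], [1, 2]]
]
p first_open_cell(grid)   # => [1, 1]
```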

Finally, we're to the guess work. An Array is prepared to hold solutions and

the current known cells are retrieved with a call to to_a(). Then, each

possibility for the selected cell is inserted and a new solver is built and run.

This amounts to recursion of the entire process. The results of these guesses

are examined by a case statement and added to the solutions Array when found.

The case statement ignores :impossible returns, since these are just wrong

guesses.

Finally the method checks to see if any solutions were found, and returns the

proper Symbol for the results.

The last little chunk of code handles input and output for the solution:

    # ...

    if $0 == __FILE__
      # read a sudoku from stdin
      sudoku = []
      while sudoku.size < 9
        row = gets.scan(/\d|_/).map { |s| s.to_i }
        sudoku << row if row.size == 9
      end
      # solve
      begin
        solver = SudokuSolver.new(sudoku)
        puts "Input:", solver
        case solver.backtrack_solve
        when :solved
          puts "Solution:"
        when :multiple_solutions
          puts "There are multiple solutions!", "One solution:"
        else
          puts "Impossible:"
        end
        puts solver
      rescue => e
        puts "Error: #{e.message}"
      end
    end

Look at that four line input read up there. It's similar to what we reduced the

other code to at the beginning of this summary. The output is very

straightforward. It just makes good use of to_s() in the solver object to print the

board before and after.

My thanks to all who sent in their solutions, sometimes many, many times.

Next week's Ruby Quiz is shamelessly stolen from another source of Ruby

challenges. Stay tuned to see the kind of problems Dave Thomas cooks up...