[SUMMARY] Crosswords (#10)

The summary for this week's quiz should be:

  Go read all submitted solutions until you understand them.

However, since I would probably be lynched for that, I'll try to be more
specific. Here's some of what you're missing out on if you stop at just this
summary.

  Markus Koenig wrote a very nice solution in 20 minutes. The code
  is very approachable and worth a look.
    
  Jamis Buck gave his standard great solution, along with a tricky
  test file to try solutions on.
  
  T. Onoma passively builds puzzles, much like cellular automata
  work. The solution is commented out and thus pretty approachable.
  
  Andrew Johnson provided a very compact, yet not overly
  complicated solution.
  
  Clark took a different approach than most, building up the
  display normally, and then removing the double borders.
  
  Brian Schroeder went above and beyond the call, as usual. (I'll be
  blamed for that.) He actually attempted to make this puzzle into
  something useful, by fitting words into the layout to make an
  actual crossword challenge.
  
  Pit Capitain did some heavy modification of the core classes to
  setup a solution. Of particular note are the methods Array#weave()
  and String#gub2!(), the latter of which handles 2D Regexps.
  
  Finally, Harry Ohlsen chimed in with something that wasn't
  actually a solution, but was an interesting example of real
  world usage.

Do look these over, if you haven't already.

Now, let's break one down. Here's the setup from Jim Freeze's solution:

  #!/usr/bin/env ruby
  
  class CrossWordPuzzle
    CELL_WIDTH = 6
    CELL_HEIGHT = 4
  
    attr_accessor :cell_width, :cell_height
  
    def initialize(file)
    @file = file
    @cell_width = CELL_WIDTH
    @cell_height = CELL_HEIGHT
  
    build_puzzle
    end

···

#######
  private
  #######
  
    def build_puzzle
    parse_grid_file
    drop_outer_filled_boxes
    create_numbered_grid
    end
  
    # ...

Nothing tricky there. First, initialize some constants and variables. After
that, the private method build_puzzle() outlines the process. Let's dig deeper
into each of those steps:

    # ... private methods continued ...
    
    def parse_grid_file
    @grid = File.read(@file).split(/\n/)
    @grid.collect! { |line| line.split }
    @grid_width = @grid.first.size
    @grid_height = @grid.size
    end
  
    # ...

Step one. Again, pretty simple. Read the layout file. Break it down by row at
each "\n" character and by square at each space. (Note: This solution does
require the spaces from the quiz description.) Find the dimensions of the
puzzle.

    # ... private methods continued ...
    
    def drop_outer_filled_boxes
    loop {
      changed = 0
      changed += _drop_outer_filled_boxes(@grid)
      changed += _drop_outer_filled_boxes(t = @grid.transpose)
      @grid = t.transpose
      break if 0 == changed
    }
    end
  
    def _drop_outer_filled_boxes(ary)
    changed = 0
    ary.collect! { |row|
      r = row.join
      changed += 1 unless r.gsub!(/^X|X$/, ' ').nil?
      changed += 1 unless r.gsub!(/X | X/, ' ').nil?
      r.split(//)
    }
    changed
    end
  
    # ...

These two methods handle step two, dropping filled border squares. Working here
in what is still the character-by-character layout makes things easier. Jim
uses a simple transpose() to make essentially 2D search and replaces. More than
one submission capitalized on this technique, but it still wows dummies like me.

The search and replace logic is two-fold: Turn all Xs at the beginning or end
of the line into spaces and turn all Xs next to spaces into spaces. Repeat this
until there are no more changes. This causes the edges to creep in until all
filled border squares have been eliminated.

    # ... private methods continued ...
    
    def create_numbered_grid
    mark_boxes(@grid)
    grid_prime = @grid.transpose
    mark_boxes(grid_prime)
  
    count = '0'
    @numbered_grid = []
    @grid.each_with_index { |row, i|
      r = []
      row.each_with_index { |col, j|
      r << case col + grid_prime[j][i]
         when /#/ then count.succ!.dup
         else col
         end
      }
      @numbered_grid << r
    }
    end
  
    # place '#' in boxes to be numbered
    def mark_boxes(grid)
    grid.collect! { |row|
      r = row.join
      r.gsub!(/([X ])([\#_]{2,})/) { "#{$1}##{$2[1..-1]}" }
      r.gsub!(/^([\#_]{2,})/) { |m| m[0]=?#; m }
      r.split(//)
    }
    end
  
    # ...

Here's the third step, numbering squares. The approach here is much the same as
step two. A combination of transpose() and gsub!() are used to mark squares at
the beginning of words with a # character. Words are defined as a run of #
and/or _ characters at the beginning of a line or after a filled box or open
space.

With #s in place, it's a simple matter to replace them with an actual number.
The code is a little arcane there, because #s have to be checked in the normal
"@grid" and in "grid_prime", the transposed grid.

Now that the grid has been doctored into the desired format, we need to wrap
cells in borders and space, then stringifying them. Here's the code for that:

    # ... private methods continued ...
    
    def cell(data)
    c = []
    case data
    when 'X'
      @cell_height.times { c << ['#'] * @cell_width }
    when ' '
      @cell_height.times { c << [' '] * @cell_width }
    when /\d/
      tb = ['#'] * @cell_width
      n = sprintf("\#%-*s\#", @cell_width-2, data).split(//)
      m = sprintf("#%-#{@cell_width-2}s#", ' ').split(//)
      c << tb << n
      (@cell_height-3).times { c << m }
      c << tb
    when '_'
      tb = ['#'] * @cell_width
      m = ['#'] + [' ']*(@cell_width-2) + ['#']
      c << tb
      (@cell_height-2).times { c << m }
      c << tb
    end
    c
    end
  
    def overlay(sub, mstr, x, y)
    sub.each_with_index { |row, i|
      row.each_with_index { |data, j|
      mstr[y+i][x+j] = data unless '#' == mstr[y+i][x+j]
      }
    }
    end
  
    def to_s
    puzzle_width = (@cell_width-1) * @grid_width + 1
    puzzle_height = (@cell_height-1) * @grid_height + 1
  
    s = Array.new(puzzle_height) { Array.new(puzzle_width) << [] }
  
    @numbered_grid.each_with_index { |row, i|
      row.each_with_index { |data, j|
       overlay(cell(data), s, j*(@cell_width-1), i*(@cell_height-1))
      }
    }
    s.collect! { |row| row.join }.join("\n")
    end
    public :to_s
  
  end#class CrossWordPuzzle

The method to_s() drives the conversion process. It walks the doctored up grid
calling cell() to do the formatting and overlay() to place it in the puzzle.

cell() just adds # borders and space as defined by the quiz, based on the cell
type it is called on.

overlay() is clever. It just happily draws cells. However, it's called with
placements close enough together to "overlay" the borders, reducing them to a
single line.

(Note: This "collapsing borders" technique is common in many aspects of
programming. Examine the output of the mysql command line tool, GNU Chess, or a
hundred other tools. It's also common for GUI libraries to combine borders of
neighboring elements.)

With an array of the entire puzzle assembled, to_s() finishes with few calls to
join().

The trivial "main" program combines build and display:

  cwp = CrossWordPuzzle.new(ARGV.shift)
  puts cwp.to_s

Jim Mernard's solution was remarkably similar to the solution I just showed,
which just proves that all guys named Jim program the same. That's why I have
to go by James, my solution was terrible.

My thanks go out to all people puzzled by this quiz, especially those who felt
the need to do something about it.

Tomorrows quiz involves building a "WOPR" and teaching it not to launch nuclear
missiles...

Hello James,

Nice summary, thanks you again. Did you benchmark the solutions, I would be
interested in how fast/slow some approaches where. Especially those that rely
heavily on string manipulation, vs. those that build up the layout in a
streaming fashion.

(Note: This "collapsing borders" technique is common in many aspects of
programming. Examine the output of the mysql command line tool, GNU Chess,
or a hundred other tools. It's also common for GUI libraries to combine
borders of neighboring elements.)

One remark here, is that it is common for filling algorithms not to draw each
border twice, but to draw only borders on two sides of the elements. Especially
if you have a line drawing algorithm (e.g. Bresenham) and you draw adjacent
lines of polygons twice, it is 1. a waste of energy and 2. rounding errors may
lead to jitter, such that the lines do not overlap perfectly.

(On a second read I'm not that shure that I understood you perfectly, so if I
was ranting about something you already said, I'm sorry for the noise.)

Regards,

Brian

···

--
Brian Schröder
http://ruby.brian-schroeder.de/

Hello James,

Nice summary, thanks you again. Did you benchmark the solutions, I would be
interested in how fast/slow some approaches where. Especially those that rely
heavily on string manipulation, vs. those that build up the layout in a
streaming fashion.

I didn't benchmark this time no. Everything I ran was "faster than I blink" zippy. :wink:

(Note: This "collapsing borders" technique is common in many aspects of
programming. Examine the output of the mysql command line tool, GNU Chess,
or a hundred other tools. It's also common for GUI libraries to combine
borders of neighboring elements.)

One remark here, is that it is common for filling algorithms not to draw each
border twice, but to draw only borders on two sides of the elements. Especially
if you have a line drawing algorithm (e.g. Bresenham) and you draw adjacent
lines of polygons twice, it is 1. a waste of energy and 2. rounding errors may
lead to jitter, such that the lines do not overlap perfectly.

(On a second read I'm not that shure that I understood you perfectly, so if I
was ranting about something you already said, I'm sorry for the noise.)

I should have written that note more clearly. I wasn't trying to say that drawing the borders twice was the technique to always use (though it works fine in many cases). I was trying to say that the need to collapse borders like that comes up a lot, however you accomplish it. My bad.

James Edward Gray II

···

On Dec 9, 2004, at 8:55 AM, Brian Schröder wrote:

Out of interest, I downloaded the zipfile of solutions from the Quiz
site:

  http://www.grayproductions.net/ruby_quiz/quiz10_sols.zip

and modified the 10 solutions to iterate over ARGV 25.times generating
puzzles for each file in the argument list. Each was called as:

  time ruby crossword.rb ../../files/* > OUT

where ../../files contained the following layouts (from the zipfile):

  big_layout.txt size => 15x15
  ruby-quiz.layout size => 18x15
  simple_layout.txt size => 7x6

and ruby is ruby 1.8.2 (2004-07-20) [i686-linux] with oniguruma.

The results tabulate as follows (this is not a speedy machine):

                           real user sys
  Markus_Koenig 3.028 3.000 0.030
  Andrew_Johnson 4.062 4.040 0.020
  T._Onoma 5.175 5.130 0.040
  James_Edward_Gray_II 5.674 5.590 0.080
  Jim_Menard 6.485 6.450 0.040
  Brian_Schroeder 7.425 7.210 0.040
  Jamis_Buck 8.378 8.350 0.030
  Jim_Freeze 11.989 11.970 0.020
  Pit_Capitain 13.949 13.910 0.040
  Clark 15.304 15.270 0.030

Note: speed was in no way a criteria/goal of the exercise, and this
is only a *very* rudimentary benchmark --- large grids (100x100), or grids
with more (or less) complicated fill patterns may very well alter the
ordering significantly. Any attempts to derive meaning from these results
will be met with a withering stare and quite possibly a rolling of the
eyes. You have been warned.

regards,
andrew

···

On Thu, 9 Dec 2004 23:55:37 +0900, Brian Schröder <ruby@brian-schroeder.de> wrote:

Hello James,

Nice summary, thanks you again. Did you benchmark the solutions, I would be
interested in how fast/slow some approaches where. Especially those that rely
heavily on string manipulation, vs. those that build up the layout in a
streaming fashion.

Well, you're right. I can't imagine anybody wants to solve a crossword of the
size whose layout would take noteworthy time to calculate. I only got the
feeling that all this transpose, gsub, etc... stuff should take a lot of time.
On the other hand ruby performance is nearly always different from what I
expect.

Regards,

Brian

···

On Fri, 10 Dec 2004 00:47:09 +0900 James Edward Gray II <james@grayproductions.net> wrote:

On Dec 9, 2004, at 8:55 AM, Brian Schröder wrote:

> Hello James,
>
> Nice summary, thanks you again. Did you benchmark the solutions, I
> would be
> interested in how fast/slow some approaches where. Especially those
> that rely
> heavily on string manipulation, vs. those that build up the layout in a
> streaming fashion.

I didn't benchmark this time no. Everything I ran was "faster than I
blink" zippy. :wink:

--
Brian Schröder
http://ruby.brian-schroeder.de/

Thanks Andrew. I appreciate this. My solution faired a good bit better than I
thought it would. That's encouraging, not only are there some optimizations
that could easily be done, but now I know that the tactic I employed here may
actually be a good approach for other things in the future.

Double Thanks,
T.

···

On Thursday 09 December 2004 02:47 pm, Andrew Johnson wrote:

Out of interest, I downloaded the zipfile of solutions from the Quiz
site:

  http://www.grayproductions.net/ruby_quiz/quiz10_sols.zip

and modified the 10 solutions to iterate over ARGV 25.times generating
puzzles for each file in the argument list. Each was called as:

  time ruby crossword.rb ../../files/* > OUT

where ../../files contained the following layouts (from the zipfile):

  big_layout.txt size => 15x15
  ruby-quiz.layout size => 18x15
  simple_layout.txt size => 7x6

and ruby is ruby 1.8.2 (2004-07-20) [i686-linux] with oniguruma.

The results tabulate as follows (this is not a speedy machine):

                           real user sys
  Markus_Koenig 3.028 3.000 0.030
  Andrew_Johnson 4.062 4.040 0.020
  T._Onoma 5.175 5.130 0.040
  James_Edward_Gray_II 5.674 5.590 0.080
  Jim_Menard 6.485 6.450 0.040
  Brian_Schroeder 7.425 7.210 0.040
  Jamis_Buck 8.378 8.350 0.030
  Jim_Freeze 11.989 11.970 0.020
  Pit_Capitain 13.949 13.910 0.040
  Clark 15.304 15.270 0.030

Note: speed was in no way a criteria/goal of the exercise, and this
is only a *very* rudimentary benchmark --- large grids (100x100), or grids
with more (or less) complicated fill patterns may very well alter the
ordering significantly. Any attempts to derive meaning from these results
will be met with a withering stare and quite possibly a rolling of the
eyes. You have been warned.

I don't expect mine to be anywhere near "fast" considering the approach I
took. But just for the sake of inquiry I put a quick benchmark together.
Should be trivial for others to fill in with their own. (see attached)

My Results:

  Tom's Solution 1000 runs:
                       user system total real
  Default Layout: 3.390000 0.210000 3.600000 ( 3.716914)
    Jim's Layout: 21.490000 1.470000 22.960000 ( 23.339115)

Of course this will very from machine to machine.

Perhaps this will make it easier to throw together some comparison benchmarks?

T.

benchit.rb (1.04 KB)

···

On Thursday 09 December 2004 10:52 am, Brian Schröder wrote:

Well, you're right. I can't imagine anybody wants to solve a crossword of
the size whose layout would take noteworthy time to calculate. I only got
the feeling that all this transpose, gsub, etc... stuff should take a lot
of time. On the other hand ruby performance is nearly always different from
what I expect.

trans. (T. Onoma) wrote:

···

On Thursday 09 December 2004 10:52 am, Brian Schröder wrote:
> Well, you're right. I can't imagine anybody wants to solve a crossword of
> the size whose layout would take noteworthy time to calculate. I only got
> the feeling that all this transpose, gsub, etc... stuff should take a lot
> of time. On the other hand ruby performance is nearly always different from
> what I expect.

I don't expect mine to be anywhere near "fast" considering the approach I took. But just for the sake of inquiry I put a quick benchmark together. Should be trivial for others to fill in with their own. (see attached)

My Results:

  Tom's Solution 1000 runs:
                       user system total real
  Default Layout: 3.390000 0.210000 3.600000 ( 3.716914)
    Jim's Layout: 21.490000 1.470000 22.960000 ( 23.339115)

Ouch. Of course, I don't plan to generate 1,000 crossword puzzles any time soon. I focused on elegant code, not needless optimizations :slight_smile:

Jim
--
Jim Menard, jimm@io.com, http://www.io.com/~jimm

I suspect we're misreading those results. I believe that's only T.'s solution on two different layouts. Since neither Jim submitted a layout, I'll assume that's a typo meant to be Jamis.

James Edward Gray II

···

On Dec 9, 2004, at 10:36 AM, Jim Menard wrote:

My Results:
  Tom's Solution 1000 runs:
                       user system total real
  Default Layout: 3.390000 0.210000 3.600000 ( 3.716914)
    Jim's Layout: 21.490000 1.470000 22.960000 ( 23.339115)

Ouch. Of course, I don't plan to generate 1,000 crossword puzzles any time soon. I focused on elegant code, not needless optimizations :slight_smile:

Nay, nay. That's _my_ solution running your layout. Not your solution.

Feel better? :slight_smile:

T.

···

On Thursday 09 December 2004 11:36 am, Jim Menard wrote:

trans. (T. Onoma) wrote:
> On Thursday 09 December 2004 10:52 am, Brian Schröder wrote:
> > Well, you're right. I can't imagine anybody wants to solve a crossword
> > of the size whose layout would take noteworthy time to calculate. I
> > only got the feeling that all this transpose, gsub, etc... stuff should
> > take a lot of time. On the other hand ruby performance is nearly always
> > different from what I expect.
>
> I don't expect mine to be anywhere near "fast" considering the approach I
> took. But just for the sake of inquiry I put a quick benchmark together.
> Should be trivial for others to fill in with their own. (see attached)
>
> My Results:
>
> Tom's Solution 1000 runs:
> user system total real
> Default Layout: 3.390000 0.210000 3.600000 ( 3.716914)
> Jim's Layout: 21.490000 1.470000 22.960000 ( 23.339115)

Ouch. Of course, I don't plan to generate 1,000 crossword puzzles any time
soon. I focused on elegant code, not needless optimizations :slight_smile:

Oops. Yes, I did mean Jamis' layout.

T.

···

On Thursday 09 December 2004 11:42 am, James Edward Gray II wrote:

On Dec 9, 2004, at 10:36 AM, Jim Menard wrote:
>> My Results:
>> Tom's Solution 1000 runs:
>> user system total real
>> Default Layout: 3.390000 0.210000 3.600000 ( 3.716914)
>> Jim's Layout: 21.490000 1.470000 22.960000 ( 23.339115)
>
> Ouch. Of course, I don't plan to generate 1,000 crossword puzzles any
> time soon. I focused on elegant code, not needless optimizations :slight_smile:

I suspect we're misreading those results. I believe that's only T.'s
solution on two different layouts. Since neither Jim submitted a
layout, I'll assume that's a typo meant to be Jamis.