[SUMMARY] Scrabble Stems (#12)

This quiz wasn't too tough compared to last week's quiz, but I think that's a
plus. It's good to have an easy problem now and then.

What makes this problem simple is the usual algorithm tradeoff, we can sacrifice
memory (in some cases it's speed, or even both) for an easy to code solution.
There are a lot of stems, but we can generate them all and store them with 50
MBs of RAM, which is not too rare these days.

All submitted solutions made this trade. Here's an easy to follow version by
Carlos:

  DICT = "/usr/share/dict/words"
  CUTOFF = ARGV[0].to_i
  
  STEMS = {}
  
  File.open(DICT) do |f|
    f.each do |word|
      word.chomp!
      next if word.length != 7
      word.downcase!
      letters = word.split(//).sort!
      uniques = letters.uniq
      word = letters.join
      uniques.each do |letter|
        stem = word.sub(letter, "")
        (STEMS[stem] ||= {})[letter] = 1
      end
    end
  end
  
  result = STEMS.delete_if { |k,v| v.size < CUTOFF }.
      sort_by { |k,v| v.size }.
      reverse!.
      collect! { |k,v| [k, v.size] }
  
  result.each do |stem, combining| puts "#{stem} #{combining}" end

The code starts by setting up a DICTIONARY, CUTOFF, and a place to store the
STEMS we find.

After that, we have a line by line read of the dictionary, which is where most
of the work is done. The code inside "each" drops the newline, checks to make
sure we only keep seven letter words, and normalizes case. The next step is to
split the words into letters and sort them to ease comparisons. Then we
generate all the unique letters and remove those from the word one at a time to
find all the stems, adding each of those to STEMs.

At this point we've found all the stems in the dictionary for all seven letter
words. The next chunk of code removes results below the cutoff we wanted, sorts
the results and prepares them for display.

The final line of the program writes out the results, line by line.

Of course, we can't always afford to sacrifice the RAM. And this problem could
be solved without as much memory. The trick for that is to generate and verify
stems one at a time. This might be needed if the search space was larger.

As always, the solutions were varied and educational. Some highlights:

  Glenn Parker's solution focuses on the more general problem
  of generating stems, instead of just the Scrabble usage of
  finding bingos.
  
  Jannis Harder submitted a solution for finding "nonsense"
  stems, not in the dictionary.
  
  Dennis Ranke sent in a tricky solution that uses bit shifting
  to track stem counts while it works.

But they're all worth a look, I think.

My thanks the stem solvers and to everyone who is helping Ruby Quiz stay so darn
interesting. I could never keep up if I wasn't getting such great help from the
community.

No quiz this week, as I'll be spending a little time off with the family
enjoying the holidays. I hope everyone else can do the same, holiday or no.
The quiz will be back next week to kick off a cryptic new year...