Here's my solution. I used the sowpods wordlist available here:
http://www.isc.ro/commands/lists.html (preprocessed to be all-lowercase,
though it doesn't really matter as long as the case is consistent).
It takes entirely too long to run (12 seconds on my P4/3GHz); I'll poke
at it some more if I get the time and see what's slowing it down.
···
--------------------------------------------------------------------------
$stems = {}
$seen = {}
$cutoff = ARGV[0].to_i
IO.foreach("sowpods.txt") {|word|
next if word.length != 8
word.chomp!
letters = word.split(//).sort
alpha = letters.join
next if $seen[alpha]
$seen[alpha] = true
remove = letters.uniq
remove.each {|i|
stem = alpha.sub(i, '')
$stems[stem] ||= {}
$stems[stem][i] = true
}
}
$stem_length = {}
$stems.each {|k,v| $stems[k] = v.keys.sort}
$stems.reject! {|k,v|
n = v.length
$stem_length[k] = n
n < $cutoff
}
results = $stems.keys.sort_by {|i| -$stem_length[i]}
results.each {|s| p [s, $stem_length[s], $stems[s].join]}
Here's mine ... doesn't seem to fair too badly. Using the YAWL list
(264,061 words), it finds 25 bingo-stems that combine with n >= 20 letters.
(about 16 seconds on a 666MHz).
regards,
andrew
#!/usr/bin/ruby -w
n = ARGV[0].to_i
wfile = ARGV[1]
w7sigs = {}
w6sigs = {}
File.open(wfile,'r') do |f|
f.each do |line|
next unless line.size == 8
line.chomp!
line.downcase!
sig = line.unpack('aaaaaaa').sort.join("")
next if w7sigs[sig]
w7sigs[sig] = 1
end
end
w7sigs.each_key do |k|
7.times do |i|
ns = k.dup
ll = ns.slice!(i,1)
w6sigs[ns] ||= []
w6sigs[ns] << ll
end
end
w6sigs.each {|k,v| w6sigs[k].uniq!}
w6sigs.reject!{|k,v|v.size < n}
w6sigs.sort_by{|a|a[1].size}.reverse.each do |it|
puts "#{it[0]} #{it[1].size}"
end
__END__
Hi,
Nothing special here: http://www.dave.burt.id.au/ruby/scrabble_stems.rb
It's quite ridiculous, actually. It makes a tree of about 45MB out of a
2.6MB input file, so it can tell you a word each stem can make with each
letter you can add to it.
Takes a minute for words above 20 on my machine (1.8GHz) using the
dictionary I mentioned. YAWL or something.
Just thought I might weigh in.
Season's greetings,
Dave
Here's my solution. It runs in 4 seconds on my
3932 bogomips machine, using /usr/share/dict/american-english (96,273 words).
My major optimizations (and I didn't fiddle with minor ones) are limiting the "anagram space" with a standard "anagram" representation, and generating the stems by subtraction from bingos instead of generating all gazillion stems and checking them with all letters.
bingo_stems.rb (1.56 KB)