After I posted my solution, I took a shower, scribbled a few diagrams on the foggy glass walls and figured out how to massively speed things up. I hence post my new solution. It's still a unidirectional depth-first search, but it keeps track of visitation depth on nodes, preventing it from revisiting words that were previously reached more quickly.
For long chains, it's a MASSIVE speed up. For short ones, it's "only" a 2x improvement.
Compare results to the above:
[Slim:~/Sites/Ruby] gavinkis% ruby wordchains-gk2.rb crate primp 20
Chain between 'crate' and 'primp', no longer than 20 links:
...
--> 15.59s (after loading dictionary)
[Slim:~/Sites/Ruby] gavinkis% ruby wordchains-gk2.rb crate primp 12
Chain between 'crate' and 'primp', no longer than 12 links:
...
--> 2.76s (after loading dictionary)
[Slim:~/Sites/Ruby] gavinkis% ruby wordchains-gk2.rb crate primp 10
Chain between 'crate' and 'primp', no longer than 10 links:
...
--> 2.54s (after loading dictionary)
[Slim:~/Sites/Ruby] gavinkis% ruby wordchains-gk2.rb crate primp 8
Chain between 'crate' and 'primp', no longer than 8 links:
...
--> 1.96s (after loading dictionary)
[Slim:~/Sites/Ruby] gavinkis% ruby wordchains-gk2.rb crate primp 6
Chain between 'crate' and 'primp', no longer than 6 links:
(no such chain exists)
--> 0.78s (after loading dictionary)
I've finally been able to discover the much-sought-after link between kittens and rabies! (My previous code was unable to find this chain after leaving it for about half an hour.)
[Sliver:~/Desktop/12dicts] gkistner$ ruby wordchain-gk2.rb kitten rabies 20
Searching in 8121 words with 6 letters
Chain between 'kitten' and 'rabies', no longer than 20 links:
kitten
bitten
batten
batted
tatted
totted
tooted
booted
boobed
bobbed
robbed
rubbed
rubber
rubier
rubies
rabies
--> 29.08s (after loading dictionary)
Now I actually feel like my code has a chance of competing with the 'big boys'. Yay!
James, could you run some speed comparisons of various solutions in your summary analysis?
#!/usr/local/bin/ruby
require 'set'
class String
LETTERS = ('a'..'z').to_a
# Finds and returns an array that links the current word to _dest_word_,
# where each link in the chain is a word that differs from the previous
# only by a single character.
···
On Aug 28, 2005, at 9:45 AM, Gavin Kistner wrote:
For example, using a pair of words that is particularly hard for my algorithm to find the chain for:
[Slim:~/Sites/Ruby] gavinkis% ruby -d wordchains.rb crate primp 12
Finding chain between crate and primp using ./12dicts-4/2of12inf.txt, going no deeper than 12
...
425.621661 seconds elapsed
[Slim:~/Sites/Ruby] gavinkis% ruby -d wordchains.rb crate primp 10
Finding chain between crate and primp using ./12dicts-4/2of12inf.txt, going no deeper than 10
...
17.003073 seconds elapsed
[Slim:~/Sites/Ruby] gavinkis% ruby -d wordchains.rb crate primp 8
Finding chain between crate and primp using ./12dicts-4/2of12inf.txt, going no deeper than 8
...
3.069903 seconds elapsed
[Slim:~/Sites/Ruby] gavinkis% ruby -d wordchains.rb crate primp 6
Finding chain between crate and primp using ./12dicts-4/2of12inf.txt, going no deeper than 6
...
(no path found)
1.527844 seconds elapsed
#
# The _visitation_map_ parameter is a hash containing all legal words as
# keys, and that should be initialized with the values mapping to the
# deepest depth allowable.
def chain_to( dest_word, visitation_map, depth=1 )
return nil if depth > $max_length
# Find variations on myself which haven't been reached by a shorter path
# and update the visitation map at the same time
links = Set.new
0.upto( self.length-1 ){ |i|
old_char = self[ i ]
LETTERS.each{ |new_char|
if new_char != old_char
test_word = self.dup
test_word[ i ] = new_char
#Following returns nil if the word isn't in the dictionary
shortest_path = visitation_map[ test_word ]
if shortest_path && shortest_path > depth
#I've gotten to this word faster than anyone else
#Put my score in the high score board, and use this word again
visitation_map[ test_word ] = depth
links << test_word
end
end
}
}
path_from_me = nil
if links.include?( dest_word )
#Sweet, I have a direct route!
path_from_me = [ self ]
else
links.each{ |test_word|
path = test_word.chain_to( dest_word, visitation_map, depth + 1 )
if path
total_length = depth + path.length + 1
#Only use the found path if it's shorter than one found already
if total_length <= $max_length
warn "Found a chain of length #{total_length}" if $DEBUG
path_from_me = path
$max_length = total_length
end
end
}
if path_from_me
path_from_me.unshift( self )
end
end
path_from_me
end
end
start_word = ARGV[0] || 'crave'
end_word = ARGV[1] || 'primp'
$max_length = Integer( ARGV[2] || start_word.length * 3 )
dict = ARGV[3] || '/usr/share/dict/words'
#dict = ARGV[3] || '2of12inf.txt'
desired_length = start_word.length
unless end_word.length == desired_length
msg = "Error: '#{start_word}' and '#{end_word}' are not the same length"
msg << "(#{start_word.length} vs. #{end_word.length})"
raise msg
end
# Load words of the right length
avail_words = {}
File.open( dict, 'r' ){ |f|
w = f.read.split(/[\r\n]+/)
# No capital words, or words ending with % (12dicts)
w.reject!{ |word| word.length != desired_length or /[^a-z]/ =~ word }
w.each{ |word| avail_words[ word ] = $max_length }
}
avail_words[ start_word ] = 1
puts "Searching in #{avail_words.length} words with #{desired_length} letters"
unless avail_words.include?( end_word )
raise "Error: '#{end_word}' is not included in #{dict}"
end
print "Chain between '#{start_word}' and '#{end_word}', "
puts "no longer than #{$max_length} links:"
start_time = Time.new
if path = start_word.chain_to( end_word, avail_words )
puts path.join( "\n" )
puts end_word
else
puts "(no such chain exists)"
end
end_time = Time.new
puts "--> %.2fs (after loading dictionary)\n " % [ end_time-start_time ]