[QUIZ] Word Blender (#108)

E:\ruby\programs\rubyquiz\quiz108\other submissions>ruby texttwistGame.rb
./dict.rb:20:in `initialize': No such file or directory -
reducedwordlist3.txt (
Errno::ENOENT)
        from ./dict.rb:20:in `open'
        from ./dict.rb:20:in `get_dict'
        from texttwistGame.rb:37

fixed that, but then:

E:\ruby\programs\rubyquiz\quiz108\other submissions>ruby texttwistGame.rb
texttwistGame.rb:17:in `pick_word': undefined method `delete_if' for
nil:NilClas
s (NoMethodError)
        from texttwistGame.rb:38

I'm too tired to go bug hunting. Check out my submission crop.rb to see how
I did the windows file writing. Now that I've submitted my program I'm
going back to sleep. :slight_smile:

···

On 1/7/07, Fedor Labounko <fedor.labounko@gmail.com> wrote:

It just so happened that I was learning ruby last summer and wrote a
program to cheat for me at TextTwist. Needless to say the game got boring
really fast, but it was neat writing the program. I've modified it a bit to
instead play the game, but I'm a fairly new user and would appreciate any
feedback, both ruby related and general programming.

To run this you'll need a dictionary file of words, each on a new line,
called wordlist.txt and you need to require 'dict' and run Dict.reduce(3,6)
before you run the program. This will create a reduced dictionary file of
only words of lengths between 3 and 6. I didn't make this part of my
program's general execution as I figured doing this once was enough and I
could do it manually.

It's got a simple text interface that looks sort of like the TextTwist gui
layout on Yahoo. Sort of. The dictionary I use, which I did not attach
because its size (3 megs, I believe it was the full Scrabble dictionary)
makes for a much harder game than the Yahoo TextTwist as some of the words
are really obscure.

Also, I have the problem that when running this on Windows it doesn't
allow me to manipulate files with Ruby, giving me a Permission Denied error
on File.open. Any idea why this might be? This is if I try to run
Dict.reduce(3,6) for example.

Here's the solution I wrote while considering this quiz (it requires Unix):

#!/usr/bin/env ruby -w

require "io/wait"

# game date cache
CACHE_FILE = ".game_words"

if File.exist? CACHE_FILE # load from cache
   word_list = File.open(CACHE_FILE) { |file| Marshal.load(file) }
else # build word list
   # prepare data structure
   words_by_signature = Hash.new { |words, sig| words[sig] = Array.new }

   # read dictionary
   File.foreach(ARGV.shift || "/usr/share/dict/words") do |word|
     word.downcase!
     word.delete!("^a-z")

     next unless word.length.between? 3, 6

     (words_by_signature[word.split("").sort.join] << word).uniq!
   end

   # prepare recursive signature search
   def choices(sig, seen = Hash.new { |all, cur| all[cur] = true; false }, &blk)
     sig.length.times do |i|
       shorter = sig[0...i] + sig[(i+1)...sig.length]
       unless seen[shorter]
         blk[shorter]
         choices(shorter, seen, &blk) unless shorter.length == 3
       end
     end
   end

   # prepare game data structure
   word_list = Hash.new

   # build game choices
   words_by_signature.keys.grep(/\A.{6}\Z/) do |possible|
     word_list[possible] = words_by_signature[possible]

     choices(possible) do |shorter_signature|
       if words_by_signature.include? shorter_signature
         word_list[possible].push(*words_by_signature[shorter_signature])
       end
     end
   end

   # cache for faster loads
   File.open(CACHE_FILE, "w") { |file| Marshal.dump(word_list, file) }
end

### game interface (requires Unix) ###
TERMINAL_STATE = `stty -g`
system "stty raw -echo cbreak"
at_exit { system "stty #{TERMINAL_STATE}" }
clear = `clear`

# a raw mode savvy puts
def out(*args) print(*(args + ["\r\n"])) end

# for easy selection
words = word_list.keys

rounds = 0
loop do
   # select letters
   letters = current = words[rand(words.size)]
   while word_list.include? letters
     letters = letters.split("").sort_by { rand }.join
   end
   letters.gsub!(/.(?=.)/, '\0 ')

   # round data
   advance = false
   matches = Array.new
   current_match = String.new
   start = Time.now
   message = nil
   last_update = start - 1

   # round event loop
   until Time.now >= start + 2 * 60
     # game display
     if last_update <= Time.now - 1
       print clear

       out "Your letters: #{letters}"
       out " Time left: #{120 - (Time.now - start).round} seconds"
       out " Your words: #{matches.join(', ')}"
       out
       unless message.nil?
         out message
         out
       end
       print current_match
       $stdout.flush

       last_update = Time.now
     end

     # input handler
     if $stdin.ready?
       char = $stdin.getc
       case char
      when ?a..?z, ?A..?Z # read input
        current_match << char.chr.downcase
        message = nil
         last_update = start - 1
       when ?\b, 127 # backspace/delete
         current_match = current_match[0..-2]
        message = nil
         last_update = start - 1
       when ?\r, ?\n # test entered word
         if word_list[current].include? current_match
           matches << current_match
           matches = matches.sort_by { |word| [word.size, word] }
           if not advance and current_match.length == 6
             advance = true
             message = "You will advance to the next round!"
           else
             message = nil
           end
         else
           message = "Unknown word."
         end
         current_match = String.new
         last_update = start - 1
       end
     end
   end

   # round results
   print clear
   missed = word_list[current] - matches
   unless missed.empty?
     out "Other words using \"#{letters}:\""
     out missed.sort_by { |word| [word.size, word] }.join(", ")
     out
   end
   if advance
     rounds += 1

     out "You made #{matches.size} word#{'s' if matches.size != 1}, ",
         "including at least one six letter word. Nice work."
     out "Press any key to begin the next round."

     $stdin.getc
   else
     out "You made #{matches.size} word#{'s' if matches.size != 1}, ",
         "but failed to find a six letter word."

     break # end game
   end
end

# game results
out "You completed #{rounds} round#{'s' if rounds != 1}. Thanks for playing."

__END__

James Edward Gray II

There is a bug in my previous submission, here it is again:

#== Synopsis
#This is the solution to Ruby Quiz #108 described on
http://www.rubyquiz.com/quiz108.html.

···

#
#== Usage
# text_twist.rb dictionary_file
#
#== Author
# Chunyun Zhao(chunyun.zhao@gmail.com)
#
class Dictionary
  MIN_LEN, MAX_LEN = 3, 6
  attr_reader :max_letters
  def initialize(dict_file, min=MIN_LEN, max=MAX_LEN)
    @min_len = min
    @max_len = max
    @words = Hash.new {|hash,key|hash[key]=[]}
    File.foreach(dict_file) {|word|add_word(word.strip)}
    @max_letters = @words.keys.select {|key| key.size==@max_len}
  end
  def word_list(letters)
    words=[]
    permutate(letters).select {|letters|
      letters.size.between? @min_len, @max_len
    }.uniq.each {|key|
      words += @words[key]
    }
    words.sort_by {|word| word.size}
  end
  private
  def add_word(word)
    if (@min_len..@max_len)===word.size && word=~/^[a-z]+$/i
      word.downcase!
      @words[word.split(//).sort] << word
    end
  end
  def permutate(letters)
    _letters = letters.dup
    result = []
    while letter = _letters.shift
      permutate(_letters).each do |perm|
        result << [letter] + perm
      end
      result << [letter]
    end
    result
  end
end

Task = Struct.new(:letters, :words)

class GameUi
  def initialize(dict)
    @dictionary = dict
    @history_tasks = []
    @rounds = 1
    @score = 0
  end

  def run
    while run_task
      answer = ask("\nProceed to next round?")
      break if answer !~ /^y/i
      @rounds += 1
    end
    puts "\nYou've cleared #{cleared=@cleared?@rounds:@rounds-1} round#{'s'
if cleared > 1}, and your total score is #{@score}."
  end

  private
  def next_task
    letters = @dictionary.max_letters[rand(@dictionary.max_letters.size)]
    retry if @history_tasks.include?(letters)
    task = Task.new(letters, @dictionary.word_list(letters))
    @history_tasks << task
    task
  end

  def run_task
    @task = next_task
    @found = []
    @cleared = false
    puts "\nRound #{@rounds}. Letters: #{@task.letters*', '}. Hint: number
of matching words: #{@task.words.size}"
    while !(word=ask("Enter your word:")).empty?
      if @found.include? word
        puts "Word already found!"
      elsif @task.words.include? word
        @found << word
        @score += word.size
        puts "Good job! You scored #{word.size} points!"
        if word.size == @task.letters.size
          @cleared = true
          puts "\nBingo! Round #@rounds cleared. You found #{@found.size}
word#{'s' if @found.size > 1}. "
          break
        end
      else
        puts "Wrong word!"
      end
    end
    puts "Missed words: #{(@task.words-@found)*', '}."
    @cleared
  end

  def ask question
    print question, " (Hit enter to exit)=> "
    gets.strip.downcase
  end
end
if __FILE__ == $0
  if ARGV.size != 1
    puts "Usage: #{File.basename(__FILE__)} dictionary_file"
    exit
  end
  GameUi.new(Dictionary.new(ARGV.shift)).run
end

I was really impressed when I first saw this. It doesn't quite work if you
want to exclude reusing the same letter more than once
("hhh".scan(/^[hello]{3,6}$/) => ["hhh"]) but it comes so close to something
I've only ever thought about implementing as a recursive method.
Unfortunately I don't know much about this but now I wonder if it's possible
to find all partial permutations of a word with a regexp.

···

On 1/7/07, Daniel Finnie <danfinnie@optonline.net> wrote:

# Find words that use the same letters
selectedWords = dict.scan(/^[#{baseWord}]{3,6}$/)

Hi Ken,

Ken Bloom wrote:

#load dictionary
matcher=Set.new
allwords=Array.new

thiswordmatcher=Set.new

I'm interested in why you used a set for matcher and an array for allwords. Was this a kindof random decision (many of mine are!) or is there something that allwords needed to be able to do and so is an array or vice versa?

Thanks,
Dan

It just so happened that I was learning ruby last summer and wrote a program to cheat for me at TextTwist. Needless to say the game got boring really fast, but it was neat writing the program. I've modified it a bit to instead play the game, but I'm a fairly new user and would appreciate any feedback, both ruby related and general programming.

Looking pretty good to me.

Also, I have the problem that when running this on Windows it doesn't allow me to manipulate files with Ruby, giving me a Permission Denied error on File.open. Any idea why this might be?

Well, you pass open() three arguments:

   File.open("reducedwordlist#{i}.txt", "w", File::CREAT) do |fout|

I'm pretty sure that form has the third argument being the permissions of the new file. Take out the File::CREAT, the "w" mode string handles that automatically, and see if that fixes it up.

Hope that helps.

James Edward Gray II

···

On Jan 7, 2007, at 6:47 AM, Fedor Labounko wrote:

That's neat, and is a good general way of checking for inclusion (so you can
extend it past characters in a string which might not have an .include?
method). You probably don't want that .uniq in there though as that excludes
you from matching 'hell' out of 'hello', for example.

···

On 1/11/07, Martin DeMello <martindemello@gmail.com> wrote:

This just solves the find-all-subwords problem:

target = ARGV[0]
dict = ARGV[1] || 'sowpods'

reduced = target.split(//).sort.uniq.join
primes = [2, 3, 5, 7, 11, 13]
factors =
reduced.split(//).each_with_index {|e, i|
  factors[e[0]] = primes[i]
}

target_num = 1
target.each_byte {|i| target_num *= factors[i]}

IO.foreach(dict) {|word|
  word.chomp!
  next unless (word =~ /^[#{reduced}]+$/) &&
    (word.length < 7) && (word.length > 2)
  p = 1
  word.each_byte {|i| p *= factors[i]}
  puts word if target_num % p == 0
}

My mistake, that's the list of *obscure* accepted 6 letter words. I'll find
a better list this afternoon.

···

On 1/5/07, Jason Mayer <slamboy@gmail.com> wrote:

On 1/5/07, Matthew Moss <matthew.moss.coder@gmail.com> wrote:

> I'd be interested in trying this, but I've avoided dictionary-type
> quizzes in the past for lack of a good dictionary file. Does anyone
> have links to decent word/dictionary files? Or perhaps does Mac OS X
> come with one?

Official scrabble list of accepted 6 letter words:
TWL-Only Six-Letter Words

E:\ruby\programs\rubyquiz\quiz108\other submissions>ruby texttwistGame.rb
./dict.rb:20:in `initialize': No such file or directory -
reducedwordlist3.txt (
Errno::ENOENT)
        from ./dict.rb:20:in `open'
        from ./dict.rb:20:in `get_dict'
        from texttwistGame.rb:37

fixed that, but then:

E:\ruby\programs\rubyquiz\quiz108\other submissions>ruby texttwistGame.rb
texttwistGame.rb:17:in `pick_word': undefined method `delete_if' for
nil:NilClas
s (NoMethodError)
        from texttwistGame.rb:38

I'm too tired to go bug hunting. Check out my submission crop.rb to see
how
I did the windows file writing. Now that I've submitted my program I'm
going back to sleep. :slight_smile:

Thanks Jason. It seems that the problem is that as long as a file by that
name exists, Windows doesn't let me overwrite it. I'm using the FILE::CREAT
flag but I don't think that should be causing this. It seems to be saying I
don't have the permission to modify the file with ruby, though I haven't
done much research into why that is so and how to fix it.

Here is a version of the regex that works around that:
regex = /^([a-z])(?!\1)([a-z])(?!\1|\2)([a-z])(?!\1|\2|\3)([a-z])(?!\1|\2|\3|\4)([a-z])(?!\1|\2|\3|\4|\5)([a-z])$/

Some examples:
>> regex.match 'abcdef'
=> #<MatchData:0xb7b46a18>
>> regex.match 'abcdefg'
=> nil
>> regex.match 'abcddf'
=> nil
>> regex.match 'abbdtf'
=> nil
>> regex.match 'Abbdtf'
=> nil
>> regex.match 'awesom'
=> #<MatchData:0xb7b3a5b0>
>> regex.match 'hollow'
=> nil

Fedor Labounko wrote:

···

On 1/7/07, Daniel Finnie <danfinnie@optonline.net> wrote:

# Find words that use the same letters
selectedWords = dict.scan(/^[#{baseWord}]{3,6}$/)

I was really impressed when I first saw this. It doesn't quite work if you
want to exclude reusing the same letter more than once
("hhh".scan(/^[hello]{3,6}$/) => ["hhh"]) but it comes so close to something
I've only ever thought about implementing as a recursive method.
Unfortunately I don't know much about this but now I wonder if it's possible
to find all partial permutations of a word with a regexp.

I remembered a mistake that i had done. Well not actually a mistake. I just
wrote some 10 lines of code that did some extra work that i never actually
used later in the program. I deleted those lines now, which resulted in a
unidimensional array (it was bidimensional before). This version should be
more readable and should make more sense though there's no extra
functionallity. So the version on rubyquiz.com should be replaced with this
one, for the sake of future readers if it's not too much of a trouble.

Thank you. And thank you for rubyquiz.com. It's fantastic!

texttwist-1.1.rb (2.26 KB)

···

On Sunday 07 January 2007 15:04, Ionut Artarisi wrote:

This is my first ruby program that actually does all it's supposed to do.
It doesn't have a lot of style, but hopefully my next programs will look
better. The wordlist is from wordlist.sourceforge.net and the first part of
the program reads the word.lst file and puts all the words in an array.

Hope it does what it's supposed to do, as i couldn't see the texttwist on
yahoo.
I'm also attaching the Unnoficial Jargon File Wordlist word.lst file.

-Ionuţ Arţărişi

Here's a revision to Daniel's approach that seems to work well:

# Open and read the dictionary.
dict = IO.read("/usr/share/dict/words").scan(/^[a-z]{3,6}$/)

# Pick a random word with 6 letters.
baseWord = dict.grep(/^[a-z]{6}$/).rand_elem

# Find words that use the same letters
sortWord = baseWord.scan(/./).sort.to_s
selectedWords = dict.select {|w|
Regexp.new(w.scan(/./).sort.join('.*')).match(sortWord) }

I started by just extracting only the 3-6 letter words.

Then I sort the base word so the letters are in order. Let's say the
baseWord is "parlor". Then sortWord would be "aloprr".

Now for each word in the dictionary, sort it in letter order and
create a regex. Suppose the word is "roar". The sorted version is
"aorr" and the regex is "a.*o.*r.*r". If that regex matches the
sortWord, we found a valid subword.

This could be sped up by precompiling the regexes I would guess.

Bob

···

On 1/7/07, Fedor Labounko <fedor.labounko@gmail.com> wrote:

On 1/7/07, Daniel Finnie <danfinnie@optonline.net> wrote:

> # Find words that use the same letters
> selectedWords = dict.scan(/^[#{baseWord}]{3,6}$/)

I was really impressed when I first saw this. It doesn't quite work if you
want to exclude reusing the same letter more than once
("hhh".scan(/^[hello]{3,6}$/) => ["hhh"]) but it comes so close to something
I've only ever thought about implementing as a recursive method.
Unfortunately I don't know much about this but now I wonder if it's possible
to find all partial permutations of a word with a regexp.

allwords needs array indexing to pick a random word from the array.
The sets, however, need duplicate removal and/or need include? to operate
in O(1).

--Ken

···

On Wed, 10 Jan 2007 11:17:24 +0900, Daniel Finnie wrote:

Hi Ken,

Ken Bloom wrote:

#load dictionary
matcher=Set.new
allwords=Array.new

thiswordmatcher=Set.new

I'm interested in why you used a set for matcher and an array for
allwords. Was this a kindof random decision (many of mine are!) or is
there something that allwords needed to be able to do and so is an array
or vice versa?

--
Ken Bloom. PhD candidate. Linguistic Cognition Laboratory.
Department of Computer Science. Illinois Institute of Technology.
http://www.iit.edu/~kbloom1/

Aah, thanks so much. I must have misread that before, never noticed the bit
about permissions. I tried changing that line before but it still didn't
work since I was trying to modify the file I created with the weird
permission (whatever permission option File::CREAT would be interpreted as).
Erasing and starting over it all works as expected, creating and overwriting
the file on subsequent calls. Thanks!

···

On 1/10/07, James Edward Gray II <james@grayproductions.net> wrote:

Well, you pass open() three arguments:

   File.open("reducedwordlist#{i}.txt", "w", File::CREAT) do |fout|

I'm pretty sure that form has the third argument being the
permissions of the new file. Take out the File::CREAT, the "w" mode
string handles that automatically, and see if that fixes it up.

Hope that helps.

James Edward Gray II

No, I do need the uniq since I want one prime per unique letter in the
target word. I then calculate its signature by a loop over the entire
word, not the reduced word, so "hell" is indeed matched for "hello"
but not for, say, "whelps".

martin

···

On 1/11/07, Fedor Labounko <fedor.labounko@gmail.com> wrote:

On 1/11/07, Martin DeMello <martindemello@gmail.com> wrote:
>
> This just solves the find-all-subwords problem:
>
> target = ARGV[0]
> dict = ARGV[1] || 'sowpods'
>
> reduced = target.split(//).sort.uniq.join
> primes = [2, 3, 5, 7, 11, 13]
> factors =
> reduced.split(//).each_with_index {|e, i|
> factors[e[0]] = primes[i]
> }
>
> target_num = 1
> target.each_byte {|i| target_num *= factors[i]}
>
> IO.foreach(dict) {|word|
> word.chomp!
> next unless (word =~ /^[#{reduced}]+$/) &&
> (word.length < 7) && (word.length > 2)
> p = 1
> word.each_byte {|i| p *= factors[i]}
> puts word if target_num % p == 0
> }

That's neat, and is a good general way of checking for inclusion (so you can
extend it past characters in a string which might not have an .include?
method). You probably don't want that .uniq in there though as that excludes
you from matching 'hell' out of 'hello', for example.

http://www.scrabulous.com/twl_dictionary.php
Complete list of accepted scrabble words. You'll need to go through and
nuke any long words... I think learn Ruby in 21 days has a regex on day 8
that can assist people with this :slight_smile:

···

On 1/5/07, Jason Mayer <slamboy@gmail.com> wrote:

On 1/5/07, Jason Mayer <slamboy@gmail.com> wrote:
>
> On 1/5/07, Matthew Moss <matthew.moss.coder@gmail.com > wrote:
>
> > I'd be interested in trying this, but I've avoided dictionary-type
> > quizzes in the past for lack of a good dictionary file. Does anyone
> > have links to decent word/dictionary files? Or perhaps does Mac OS X
> > come with one?
>
> Official scrabble list of accepted 6 letter words:
> TWL-Only Six-Letter Words
>

My mistake, that's the list of *obscure* accepted 6 letter words. I'll
find a better list this afternoon.

Oops, that doesn't match {3,6} just {6}.

>> regex = /^([a-z])(?!\1)([a-z])(?!\1|\2)([a-z])(?:(?!\1|\2|\3)([a-z]))?(?:(?!\1|\2|\3|\4)([a-z]))?(?:(?!\1|\2|\3|\4|\5)([a-z]))?$/
=> [a-z]1[a-z]12[a-z]:123[a-z]:1234[a-z]:12345[a-z]
>> regex.match 'hhh'
=> nil
>> regex.match 'hal'
=> #<MatchData:0xb7cfb7d0>
>> regex.match 'sos'
=> nil
>> regex.match 'sauce'
=> #<MatchData:0xb7c12bac>
>> regex.match 'hatoff'
=> nil
>>

That's a better one.

Daniel Finnie wrote:

···

Here is a version of the regex that works around that:
regex = /^([a-z])(?!\1)([a-z])(?!\1|\2)([a-z])(?!\1|\2|\3)([a-z])(?!\1|\2|\3|\4)([a-z])(?!\1|\2|\3|\4|\5)([a-z])$/

Some examples:
>> regex.match 'abcdef'
=> #<MatchData:0xb7b46a18>
>> regex.match 'abcdefg'
=> nil
>> regex.match 'abcddf'
=> nil
>> regex.match 'abbdtf'
=> nil
>> regex.match 'Abbdtf'
=> nil
>> regex.match 'awesom'
=> #<MatchData:0xb7b3a5b0>
>> regex.match 'hollow'
=> nil

Fedor Labounko wrote:

On 1/7/07, Daniel Finnie <danfinnie@optonline.net> wrote:

# Find words that use the same letters
selectedWords = dict.scan(/^[#{baseWord}]{3,6}$/)

I was really impressed when I first saw this. It doesn't quite work if you
want to exclude reusing the same letter more than once
("hhh".scan(/^[hello]{3,6}$/) => ["hhh"]) but it comes so close to something
I've only ever thought about implementing as a recursive method.
Unfortunately I don't know much about this but now I wonder if it's possible
to find all partial permutations of a word with a regexp.

Very interesting, you create a matching in the opposite direction, matching
against the picked word vs against the dictionary of words, very nice. I
like this more than Daniel's approach since it doesn't involve creating
Regexps which grow as n^2 compared to the length of our word. This ends up
going through all the words (length 3 to 6 anyhow) in the dictionary to find
the matches. Finding all permutations would go through the dictionary
roughly the number of subwords times, taking log of the length of the
dictionary number of steps each time. For all but the largest of words with
lots of subwords would the latter be slower, but the Regexp approach is very
nice and simple, and I can't decide what I like more.

It does bother me a bit though that I don't know how long each regexp takes
to run to get its match as I know that some Regexps do take quite a while to
match or exclude some words, just not sure if this one would be one of them.

···

On 1/9/07, Bob Showalter <showaltb@gmail.com> wrote:

On 1/7/07, Fedor Labounko <fedor.labounko@gmail.com> wrote:
> On 1/7/07, Daniel Finnie <danfinnie@optonline.net> wrote:
>
> > # Find words that use the same letters
> > selectedWords = dict.scan(/^[#{baseWord}]{3,6}$/)
>
> I was really impressed when I first saw this. It doesn't quite work if
you
> want to exclude reusing the same letter more than once
> ("hhh".scan(/^[hello]{3,6}$/) => ["hhh"]) but it comes so close to
something
> I've only ever thought about implementing as a recursive method.
> Unfortunately I don't know much about this but now I wonder if it's
possible
> to find all partial permutations of a word with a regexp.
>

Here's a revision to Daniel's approach that seems to work well:

# Open and read the dictionary.
dict = IO.read("/usr/share/dict/words").scan(/^[a-z]{3,6}$/)

# Pick a random word with 6 letters.
baseWord = dict.grep(/^[a-z]{6}$/).rand_elem

# Find words that use the same letters
sortWord = baseWord.scan(/./).sort.to_s
selectedWords = dict.select {|w|
Regexp.new(w.scan(/./).sort.join('.*')).match(sortWord) }

I started by just extracting only the 3-6 letter words.

Then I sort the base word so the letters are in order. Let's say the
baseWord is "parlor". Then sortWord would be "aloprr".

Now for each word in the dictionary, sort it in letter order and
create a regex. Suppose the word is "roar". The sorted version is
"aorr" and the regex is "a.*o.*r.*r". If that regex matches the
sortWord, we found a valid subword.

This could be sped up by precompiling the regexes I would guess.

Bob

The regexes for me are fairly fast. For my regex that weeds out words with repeated characters (I posted it before and it is fairly lengthy, so I won't post it again unless by request), a 10,000 word dictionary (the one in my /usr/share/dict/words) takes less than .5 seconds to pick the baseword and all related words. Under .2 seconds if it doesn't weed out words with repeated characters.

One thing I can see to optimize in your method is all the sorting. I haven't tested the performance yet, but sets might be the answer.

require 'set'

class String
  def letters
    split(//).to_set
  end
end

baseWordSet = baseWord.letters
dict.select {|x| x.letters.subset?(baseWordSet)}

It's more readable, IMHO, at least.

Dan

Bob Showalter wrote:

···

On 1/7/07, Fedor Labounko <fedor.labounko@gmail.com> wrote:

On 1/7/07, Daniel Finnie <danfinnie@optonline.net> wrote:

> # Find words that use the same letters
> selectedWords = dict.scan(/^[#{baseWord}]{3,6}$/)

I was really impressed when I first saw this. It doesn't quite work if you
want to exclude reusing the same letter more than once
("hhh".scan(/^[hello]{3,6}$/) => ["hhh"]) but it comes so close to something
I've only ever thought about implementing as a recursive method.
Unfortunately I don't know much about this but now I wonder if it's possible
to find all partial permutations of a word with a regexp.

Here's a revision to Daniel's approach that seems to work well:

# Open and read the dictionary.
dict = IO.read("/usr/share/dict/words").scan(/^[a-z]{3,6}$/)

# Pick a random word with 6 letters.
baseWord = dict.grep(/^[a-z]{6}$/).rand_elem

# Find words that use the same letters
sortWord = baseWord.scan(/./).sort.to_s
selectedWords = dict.select {|w|
Regexp.new(w.scan(/./).sort.join('.*')).match(sortWord) }

I started by just extracting only the 3-6 letter words.

Then I sort the base word so the letters are in order. Let's say the
baseWord is "parlor". Then sortWord would be "aloprr".

Now for each word in the dictionary, sort it in letter order and
create a regex. Suppose the word is "roar". The sorted version is
"aorr" and the regex is "a.*o.*r.*r". If that regex matches the
sortWord, we found a valid subword.

This could be sped up by precompiling the regexes I would guess.

Bob

That makes sense :).

On a side note, I did make some methods for getting a rand value from any enumerable (they can be integrated as instance methods):

def rand_value_each(enum)
  index = rand(enum.size)
  curr = 0
  enum.each {|x| return .flatten.last if curr == index; curr += 1 }
end
  
def rand_value_to_a(enum)
  [rand_value_ary(enum.to_a)].flatten.last
end
  
def rand_value_ary(ary)
  ary[rand(ary.size)]
end

Dan
  
Ken Bloom wrote:

···

On Wed, 10 Jan 2007 11:17:24 +0900, Daniel Finnie wrote:

Hi Ken,

Ken Bloom wrote:

#load dictionary
matcher=Set.new
allwords=Array.new
thiswordmatcher=Set.new

I'm interested in why you used a set for matcher and an array for allwords. Was this a kindof random decision (many of mine are!) or is there something that allwords needed to be able to do and so is an array or vice versa?

allwords needs array indexing to pick a random word from the array.
The sets, however, need duplicate removal and/or need include? to operate
in O(1).

--Ken