[quiz] 1-800-the-quiz (#20)

The three rules of Ruby Quiz:

1. Please do not post any solutions or spoiler discussion for this quiz until
48 hours have passed from the time on this message.

2. Support Ruby Quiz by submitting ideas as often as you can:

http://www.rubyquiz.com/

3. Enjoy!

···

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Many companies like to list their phone numbers using the letters printed on
most telephones. This makes the number easier to remember for customers. A
famous example being 1-800-PICK-UPS.

This week's quiz is to write a program that will show a user possible matches
for a list of provided phone numbers.

Your script should behave as a standard Unix filter, reading from files
specified as command-line arguments or STDIN when no files are given. Each line
of these files will contain a single phone number.

For each phone number read, your filter should output all possible word
replacements from a dictionary. Your script should try to replace every digit
of the provided phone number with a letter from a dictionary word; however, if
no match can be made, a single digit can be left as is at that point. No two
consecutive digits can remain unchanged and the program should skip over a
number (producing no output) if a match cannot be made.

Your script should allow the user to set a dictionary with the -d command-line
option, but it's fine to use a reasonable default for your system. The
dictionary is expected to have one word per line.

All punctuation and whitespace should be ignored in both phone numbers and the
dictionary file. The program should not be case sensative, letting "a" == "A".
Output should be capital letters and digits separated at word boundaries with a
single dash (-), one possible word encoding per line. For example, if your
program is fed the number:

  873.7829

One possible line of output is

  USE-RUBY

According to my dictionary.

The number encoding on my phone is:

  2 = A B C
  3 = D E F
  4 = G H I
  5 = J K L
  6 = M N O
  7 = P Q R S
  8 = T U V
  9 = W X Y Z

Feel free to use that, or the encoding on your own phone.

I wrote a solution with a stdin/out interface and a webrick interface.
You can test the webrick interface at http://v-jix.homeip.net:2005/ .
My dictionary is 2of4brif.txt . I'm going to post my code after the
no-spoiler period.

···

--
Jannis Harder

Hello Group,

i once again found the time to do the ruby quiz. I liked the quiz
because it was short, and on the other hand harder than I thought. I
skipped the part about skipping letters in my first try, and when I had
to add it it made me think quite a bit. (I first skipped letters instead
of numbers, because I got confused in my datastructure.)

Now, heres my solution:

I build a tree index of the dictionary indexed by numbers and search
this tree.

Thanks for the quiz James!

class DictionaryNode < Array
  # Terminal info
  attr_reader :words

  def initialize
    super()
    @words = []
  end
end

# A tree-index of the dictionary. Indexed by phone key number.
# This is at the same time node and Dictionary.
class Dictionary
  def initialize(encoding)
    super()
    @encoding = {}
    @inverse_encoding = {}

    encoding.each do | k, v |
      @encoding[k] = v.split(/\s+/).map{|c| c[0]}
    end

    # Create map from characters to numbers
    @inverse_encoding = @encoding.inject({}) { | r, (k, v) |
      v.each do | l | r[l] = k end
      r
    }
    @root = DictionaryNode.new
  end

  # Helper method for rekursive adding of words to the dictionary
  private
  def add_recursive(node, word, suffix)
    if suffix.empty?
      node.words << word
      return node
    end
    add_recursive(node[@inverse_encoding[suffix[0]]] ||= DictionaryNode.new, word, suffix[1..-1])
  end

  # Add words to the dictionary
  public
  def add(word)
    suffix = word.unpack('C*')
    add_recursive(@root, word, suffix)
    self
  end
  
  # Load a wordlist from a file, which contains one word per line.
  # Ignores punctuation and whitespace.
  def load_wordlist(file)
    file.each do |w|
      w.gsub!(/[^A-Za-z]/, '')
      next if w.empty?
      w.upcase!
      self.add(w)
    end
    self
  end

  private
  # Search words and return (in the block) words and the unmatched rest of the number
  def sub_find_noskip(node, number, &block)
    # Return words found so far
    block[node.words.map{|w|w.dup}, number] unless node.words.empty?
    # No more digits, so stop searching here
    return node if number.empty?
    # Search for longer words
    sub_find_noskip(node[number[0]], number[1..-1], &block) if node[number[0]]
  end

  # Search words and return (in the block) words and the unmatched rest of the number.
  # Allows to skip parts of the words, returning the skipped positions as a binary array.
  def sub_find(node, number, skipped = [], &block)
    # Return words found so far
    block[node.words.map{|w|w.dup}, number, skipped] unless node.words.empty?
    # No more digits, so stop searching here
    return node if number.empty?
    # Search for longer words
    sub_find(node[number[0]], number[1..-1], skipped + [false], &block) if node[number[0]]
    # If previous digit was not skipped, allow to skip this one
    sub_find(node, number[1..-1], skipped + [true], &block) if !skipped[-1]
  end

  public
  # Skipping makes this a bit ugly
  def find(number, options)
    result = []
    if options.allow_skips
      sub_find(@root, number) do | words, rest_number, skipped |
        # Interleave skipped numbers
        needle = []
        skipped.zip(number).each_with_index do |(s,n), i|
          needle << [n, i] if s
        end
        words.each do | w |
          needle.each do | (n, i) | w.insert(i, n.to_s) end
        end
        
        if rest_number.empty?
          result.concat(words)
        else
          find(rest_number, options).each do | sentence |
            words.each do | w |
              result << w + '-' + sentence
            end
          end
        end
      end
    else
      sub_find_noskip(@root, number) do | words, rest_number |
        if rest_number.empty?
          result.concat(words)
        else
          find(rest_number, options).each do | sentence |
            words.each do | w |
              result << w + '-' + sentence
            end
          end
        end
      end
    end
    result
  end
end

encodings = {
  :james => {
    2 => 'A B C',
    3 => 'D E F',
    4 => 'G H I',
    5 => 'J K L',
    6 => 'M N O',
    7 => 'P Q R S',
    8 => 'T U V',
    9 => 'W X Y Z'},

  :logic => {
    0 => 'A B',
    1 => 'C D',
    2 => 'E F',
    3 => 'G H',
    4 => 'I J K',
    5 => 'L M N',
    6 => 'O P Q',
    7 => 'R S T',
    8 => 'U V W',
    9 => 'X Y Z'
  }
}

require 'optparse'

class PhonewordOptions < OptionParser
  attr_reader :dictionary, :encoding, :format, :allow_skips, :help, :encoding_help
  def initialize
    super()
    @dictionary = '/usr/share/dict/words'
    @encoding = :james
    @format = :plain
    @allow_skips = true
    @help = false
    @encoding_help = false
    self.on("-d", "--dictionary DICTIONARY", String) { | v | @dictionary = v }
    self.on("-e", "--encoding ENCODING", String,
            "How the alphabet is encoded to phonenumbers. james or logic are supported.") { | v | @encoding = v.downcase.to_sym }
    self.on("-p", "--plain", 'One result per found number, no other information') { @format = :plain }
    self.on("-v", "--verbose", 'Prefix the result with the number') { @format = :verbose }
    self.on("-s", "--skips", "--allow_skips", "--allow-skips", 'Allow to skip one adjacent number while matching. ',
            'Gives lots of ugly results, but james asked for it.') { @allow_skips = true }
    self.on("-c", "--no-skips", "Don't leave numbers in the detected words") { @allow_skips = false }
    self.on("-?", "--help") { @help = true }
    self.on("--supported-encodings", "--encoding-help", "List the supported encodings") { @encoding_help = true }
  end
end

options = PhonewordOptions.new
options.parse!(ARGV)

if options.help
  puts options
  exit
end

if options.encoding_help or !encodings[options.encoding]
  puts "Possible encodings:"
  puts encodings.to_a.sort_by{|(k,v)|k.to_s}.map{|(k,v)| "#{k}:\n"+v.map{|(n,e)|" #{n}: #{e}"}.sort.join("\n")}
  exit
end
dictionary = Dictionary.new(encodings[options.encoding]).load_wordlist(File.open(options.dictionary))

output = {
  :plain => lambda do | number, sentence | sentence end,
  :verbose => lambda do | number, sentence | "#{number.ljust(15)}: #{sentence}" end }

ARGF.each do | number |
  number.strip!
  dictionary.find(number.gsub(/[^0-9]/, '').unpack('C*').map{|n|n - ?0}, options).each do | sentence |
    puts output[options.format][number, sentence]
  end
end

Attached is my solution.

Enjoy

phonewords.rb (1.82 KB)

···

-----Original Message-----
From: Ruby Quiz [mailto:james@grayproductions.net]
Sent: Friday, February 18, 2005 6:58 AM
To: ruby-talk ML
Subject: [QUIZ] 1-800-THE-QUIZ (#20)

The three rules of Ruby Quiz:

1. Please do not post any solutions or spoiler discussion for this quiz until
48 hours have passed from the time on this message.

2. Support Ruby Quiz by submitting ideas as often as you can:

http://www.rubyquiz.com/

3. Enjoy!

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Many companies like to list their phone numbers using the letters printed on most telephones. This makes the number easier to
remember for customers. A famous example being 1-800-PICK-UPS.

This week's quiz is to write a program that will show a user possible matches for a list of provided phone numbers.

Your script should behave as a standard Unix filter, reading from files specified as command-line arguments or STDIN when no files
are given. Each line of these files will contain a single phone number.

For each phone number read, your filter should output all possible word replacements from a dictionary. Your script should try to
replace every digit of the provided phone number with a letter from a dictionary word; however, if no match can be made, a single
digit can be left as is at that point. No two consecutive digits can remain unchanged and the program should skip over a number
(producing no output) if a match cannot be made.

Your script should allow the user to set a dictionary with the -d command-line option, but it's fine to use a reasonable default for
your system. The dictionary is expected to have one word per line.

All punctuation and whitespace should be ignored in both phone numbers and the dictionary file. The program should not be case
sensative, letting "a" == "A".
Output should be capital letters and digits separated at word boundaries with a single dash (-), one possible word encoding per
line. For example, if your program is fed the number:

  873.7829

One possible line of output is

  USE-RUBY

According to my dictionary.

The number encoding on my phone is:

  2 = A B C
  3 = D E F
  4 = G H I
  5 = J K L
  6 = M N O
  7 = P Q R S
  8 = T U V
  9 = W X Y Z

Feel free to use that, or the encoding on your own phone.

Ooops, I didn't send the final version, and I did not include the link to my solution, so here is another try:

This version loads the dictionary a lot faster (3 times as fast as the old
version) because it does not create as many intermediate objects. Also I measured that upcasing and gsub'ing is faster on the long string than on the individual short strings.

Browse the solution online and in full color at:

http://ruby.brian-schroeder.de/quiz/phoneword/

or directly at

http://ruby.brian-schroeder.de/quiz/phoneword/browse/phoneword-rb.html

And to show it in all its glory (now loading the dictionary 3 times as fast:)

# Nodes in the Dictionary.
class DictionaryNode < Array
  # Terminal info
  attr_reader :words

  def initialize
    super()
    @words =
  end
end

# A tree-indexed version of the dictionary that allows efficent searching by number 2 alphabet mapping.
class Dictionary
  def initialize(encoding)
    super()
    @encoding = {}
    @inverse_encoding = {}

    encoding.each do | k, v |
      @encoding[k] = v.split(/\s+/).map{|c| c[0]}
    end

    # Create map from characters to numbers
    @inverse_encoding = @encoding.inject({}) { | r, (k, v) |
      v.each do | l | r[l] = k end
      r
    }
    @root = DictionaryNode.new
  end

  # Helper method for rekursive adding of words to the dictionary
  private
  def add_recursive(node, word, position)
    if word.length == position
      node.words << word
      return node
    end
    add_recursive(node[@inverse_encoding[word[position]]] ||= DictionaryNode.new, word, position + 1)
  end

  # Add words to the dictionary
  public
  def add(word)
    add_recursive(@root, word, 0)
    self
  end
  
  # Load a wordlist from a file, which contains one word per line.
  # Ignores punctuation and whitespace.
  def load_wordlist(file)
    file.read.gsub!(/[^A-Za-z\n]/, '').upcase!.each do |w|
      w.chomp!
      next if w.empty?
      self.add(w)
    end
    self
  end

  private
  # Search words and return (in the block) words and the unmatched rest of the number
  def sub_find_noskip(node, number, &block)
    # Return words found so far
    block[node.words.map{|w|w.dup}, number] unless node.words.empty?
    # No more digits, so stop searching here
    return node if number.empty?
    # Search for longer words
    sub_find_noskip(node[number[0]], number[1..-1], &block) if node[number[0]]
  end

  # Search words and return (in the block) words and the unmatched rest of the number.
  # Allows to skip parts of the words, returning the skipped positions as a binary array.
  def sub_find(node, number, skipped = , &block)
    # Return words found so far
    block[node.words.map{|w|w.dup}, number, skipped] unless node.words.empty?
    # No more digits, so stop searching here
    return node if number.empty?
    # Search for longer words
    sub_find(node[number[0]], number[1..-1], skipped + [false], &block) if node[number[0]]
    # If previous digit was not skipped, allow to skip this one
    sub_find(node, number[1..-1], skipped + [true], &block) if !skipped[-1]
  end

  public
  # Skipping makes this a bit ugly
  def find(number, options)
    result =
    if options.allow_skips
      sub_find(@root, number) do | words, rest_number, skipped |
        # Interleave skipped numbers
        needle =
        skipped.zip(number).each_with_index do |(s,n), i|
          needle << [n, i] if s
        end
        words.each do | w |
          needle.each do | (n, i) | w.insert(i, n.to_s) end
        end
        
        if rest_number.empty?
          result.concat(words)
        else
          find(rest_number, options).each do | sentence |
            words.each do | w |
              result << w + '-' + sentence
            end
          end
        end
      end
    else
      sub_find_noskip(@root, number) do | words, rest_number |
        if rest_number.empty?
          result.concat(words)
        else
          find(rest_number, options).each do | sentence |
            words.each do | w |
              result << w + '-' + sentence
            end
          end
        end
      end
    end
    result
  end
end

encodings = {
  :james => {
    2 => 'A B C',
    3 => 'D E F',
    4 => 'G H I',
    5 => 'J K L',
    6 => 'M N O',
    7 => 'P Q R S',
    8 => 'T U V',
    9 => 'W X Y Z'},

  :logic => {
    0 => 'A B',
    1 => 'C D',
    2 => 'E F',
    3 => 'G H',
    4 => 'I J K',
    5 => 'L M N',
    6 => 'O P Q',
    7 => 'R S T',
    8 => 'U V W',
    9 => 'X Y Z'
  }
}

require 'optparse'

class PhonewordOptions < OptionParser
  attr_reader :dictionary, :encoding, :format, :allow_skips, :help, :encoding_help
  def initialize
    super()
    @dictionary = '/usr/share/dict/words'
    @encoding = :james
    @format = :plain
    @allow_skips = true
    @help = false
    @encoding_help = false
    self.on("-d", "--dictionary DICTIONARY", String) { | v | @dictionary = v }
    self.on("-e", "--encoding ENCODING", String,
            "How the alphabet is encoded to phonenumbers. james or logic are supported.") { | v | @encoding = v.downcase.to_sym }
    self.on("-p", "--plain", 'One result per found number, no other information') { @format = :plain }
    self.on("-v", "--verbose", 'Prefix the result with the number') { @format = :verbose }
    self.on("-s", "--skips", "--allow_skips", "--allow-skips", 'Allow to skip one adjacent number while matching. ',
            'Gives lots of ugly results, but james asked for it.') { @allow_skips = true }
    self.on("-c", "--no-skips", "Don't leave numbers in the detected words") { @allow_skips = false }
    self.on("-?", "--help") { @help = true }
    self.on("--supported-encodings", "--encoding-help", "List the supported encodings") { @encoding_help = true }
  end
end

options = PhonewordOptions.new
options.parse!(ARGV)

if options.help
  puts options
  exit
end

if options.encoding_help or !encodings[options.encoding]
  puts "Possible encodings:"
  puts encodings.to_a.sort_by{|(k,v)|k.to_s}.map{|(k,v)| "#{k}:\n"+v.map{|(n,e)|" #{n}: #{e}"}.sort.join("\n")}
  exit
end

dictionary = Dictionary.new(encodings[options.encoding]).load_wordlist(File.open(options.dictionary))

output = {
  :plain => lambda do | number, sentence | sentence end,
  :verbose => lambda do | number, sentence | "#{number.ljust(15)}: #{sentence}" end }

ARGF.each do | number |
  number.strip!
  dictionary.find(number.gsub(/[^0-9]/, '').unpack('C*').map{|n|n - ?0}, options).each do | sentence |
    puts output[options.format][number, sentence]
  end
end

···

On Mon, 21 Feb 2005 05:40:00 +0900 Brian Schröder <ruby@brian-schroeder.de> wrote:

Hello Group,

i once again found the time to do the ruby quiz. I liked the quiz
because it was short, and on the other hand harder than I thought. I
skipped the part about skipping letters in my first try, and when I had
to add it it made me think quite a bit. (I first skipped letters instead
of numbers, because I got confused in my datastructure.)

Now, heres my solution:

I build a tree index of the dictionary indexed by numbers and search
this tree.

Thanks for the quiz James!

Couldn't reach it. Can you post le code? :smiley:

···

On Feb 19, 2005, at 5:14 PM, Jannis Harder wrote:

I wrote a solution with a stdin/out interface and a webrick interface.
You can test the webrick interface at http://v-jix.homeip.net:2005/ .
My dictionary is 2of4brif.txt . I'm going to post my code after the
no-spoiler period.

--
  Jordi

Just FYI, I get some pretty unusual output when I run this on the quiz phone number using /usr/share/dict/words as my dictionary. Here's a little of what I see:

$ echo '873.7829' | ruby phonewords.rb -d /usr/share/dict/words
8-Q
-7-TA
URF
-T
-Y
T
-3-PU
-Z
U
-ES
-AW
UR
-7-T
-Y
US
-Q
-A
U
-3-P
-2-Z
URD
-U
...

James Edward Gray II

···

On Feb 22, 2005, at 4:04 PM, Lee Marlow wrote:

Attached is my solution.

Must be something wrong with my old iBook, because here it takes aaaaages for anything to even appear.

Makes me wonder if I'm running it correctly:

echo 5467945 | ruby phoneword.rb -v -p -s -d /usr/share/dict/words

Or is the problem just that I/O intensive?

···

On Feb 20, 2005, at 3:50 PM, Brian Schröder wrote:

This version loads the dictionary a lot faster (3 times as fast as the old
version) because it does not create as many intermediate objects. Also I measured that upcasing and gsub'ing is faster on the long string than on the individual short strings.

--
  Jordi

Here's my code:
(wordizer.rb)
#!/usr/bin/env ruby
class Wordizer
  def initialize(dict,map=[nil,nil,"abc","def","ghi","jkl","mno","pqrs","tuv","wxyz"])
    @map=map.map do |z| #@map=map.map ... w00t
      if z
        "[#{z}]"
      else
        "[^\x00-\xFF]"
      end
    end
    case dict
    when String
      @dict=dict.split(/\s+/)
    when Array
      @dict=dict
    when File
      @dict=dict.readlines.map{|z|z.strip}
    end
  end
  def wordize(number,mulnum=false)
    number=number.to_s
    numa = number.split('').map{|z|@map[z.to_i]}
    positions=[[0,false]]
    words = [nil]*(number.size+1)
    until positions.empty?
      positions.uniq!
      pos,num = positions.shift
      words[pos]= nil if words[pos] and words[pos].empty?
      words[pos]||=@dict.grep(mkre(numa[pos..-1]))
           words[pos].map{|z|z.size if z}.uniq.each do |len|
        positions.push([pos+len,false]) if pos+len<=number.size
      end
      if ((not num) or mulnum)and pos<number.size
        words[pos]<<number[pos,1] if !positions.include?([pos+1,false])
           positions.push([pos+1,true])
        end
      end
       end
    out = recwalk(words,mulnum).compact.sort{ |a,b|
      ac = a.gsub(/[^-]/,'').size
      bc = b.gsub(/[^-]/,'').size
      if ac == bc
        a<=>b
      else
        ac<=>bc
      end
    }.map{|z|z.upcase!;if mulnum;z.gsub!(/([0-9])-(?=[0-9])/,'\1');end;z}
    out.delete(number) if mulnum
    out
  end
  private
  def mkre(number)
    cc=0
    re="#{number.shift}"
    number.each do |z|
      cc+=1
      re<<"(#{z}"
    end
    re<<(")?"*cc)
    /^#{re}$/i
  end
  def recwalk(words,mulnum)
    que=[[nil,0,false]]
    out=[]
    until que.empty?
      pre,pos,num,left = que.shift
      if pos == words.size-1
        out << pre
        next
      end
      words[pos].map do |z|
        newnum = (z =~ /[0-9]/)
        que << ["#{pre ? pre+'-' : ''}#{z}",pos+z.size,newnum] if mulnum or ((num and not newnum) or not num)
      end if words[pos]
      que.uniq!
    end
       out
  end
end
if __FILE__ == $0
  require 'optparse'
   dict="2of4brif.txt"
  map=[nil,nil,"abc","def","ghi","jkl","mno","pqrs","tuv","wxyz"]
  mulnum=false
  opts = OptionParser.new do |opts|
    opts.banner = "Usage: #$0 [options] [phone number file]"
    opts.on("-d","--dict TEXTFILE","Specify the dictionary") do |file|
         dict=File.expand_path(file)
     end
         opts.on("-m","--map MAPPING",
              "Specify a custom mapping for a number",
              " Format: number=characters",
              " Example: -m0 -m1 -m2=abc -m3=def ...") \
    do |mapping|
      if mapping !~ /^([0-9])(=(.*))$/
        $stderr.puts "#$0: invalid mapping"
        exit 1
      else
        map[$1.to_i]=$3
      end
    end
       opts.on("-n","--condig","Allow consecutive digits in the output") do
         mulnum=true
     end
       opts.on_tail("-h", "--help", "Show this message") do
      puts opts
      exit
    end
  end
   opts.parse!(ARGV)
      begin f = File.open(dict)
    ARGF.pos
  rescue
    $stderr.puts "#$0: #$!"
    exit 1
  end
   w = Wordizer.new(f,map)

    e.tr!("^0-9","")
    puts w.wordize(e,mulnum)
  end
  f.close
end
__END__

And the Server:
(server.rb)
#!/usr/bin/env ruby
require 'webrick'
require 'wordizer'
include WEBrick
PRE = '<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en-US" lang="en-US">
  <head>
    <title>Phone Number Wordizer</title>
  </head>
  <body>
'
POST =' <form action="/" method="get">
      <div>Phone number: <input type="text" name="pn"%VAL% /></div>
      <div><input type="checkbox" name="condig"%C% />Allow consecutive digits</div>
      <div><input type="submit" name="action" value="Go!" /></div>
    </form>
    <div><small>by <a href="mailto:jannis@harderweb.de">Jannis Harder</a></small></div>
  </body>
</html>'

s = HTTPServer.new( :Port => (ARGV[0]||2005).to_i )

$inwork = []
$cache = [nil]*(ARGV[1]||150).to_i
f=File.open(File.expand_path(ARGV[1]||"2of4brif.txt"))
$w=Wordizer.new(f)
f.close

def msg(msg)
    " <p><strong>#{msg}</strong></p>\n"
end
def connum(condig,number)
  (condig ? 'a' : 'b')+number
end

s.mount_proc("/") do |req, res|
  res.body = PRE.dup
  if req.query["pn"]
    number = req.query["pn"].tr("^0-9","")
    condig = req.query["condig"]
    cnum = connum(condig,number)
    if number.size == 0
    elsif number.size > 15
      res.body << msg("Phone number too long.")
    elsif e = $cache.find{|z|z and z[0]==cnum}
      if e[1].empty?
        res.body << msg("No match found")
      else
        res.body << msg("Results:")
        res.body << " <div>"+e[1].join("</div>\n <div>")+"</div><p></p>\n"
      end
      $cache[$cache.index(e),1]=[]
      $cache << e
    else
      Thread.new(number) do
        $inwork << cnum
        $cache << [cnum, $w.wordize(number,condig)]
        $cache.shift
        $inwork.delete(number)
      end unless $inwork.include? cnum
      res['Refresh']="1;url=/?pn=#{WEBrick::HTTPUtils.escape(req.query['pn'])}#{
      req.query['condig'] ? '&condig=on' : ''}&action=Go%21"
      res.body << msg("Please wait...")
    end
  end
  res.body << POST.gsub(/(%VAL%|%C%)/) {
    case $1
    when "%VAL%"
    if req.query["pn"]
      ' value="'+WEBrick::HTMLUtils.escape(req.query["pn"])+'"'
    else
     ''
    end
    when "%C%"
      if req.query["condig"]
      ' checked="checked"'
      else
       ''
      end
    end
  }
  res['Content-Type'] = "text/html"
end
s.mount_proc("/favicon.ico") do
end

trap("INT"){ exit! }
s.start
__END__

···

--
Jannis Harder
iorcc - International Obfuscated Ruby Code Contest
http://iorcc.dyndns.org/ irc://irc.freenode.net/iorcc

Here is the new version. I just needed to chomp the dictionary words. I wrote and tested it using cygwin and my own test
dictionary, IO.readlines seems to keep the \n for each line on linux but not on cygwin.

The output against /usr/share/dict/words doesn't seem too helpful since each letter is listed as its own word.

Sorry it's so slow running against a large dictionary.

phonewords.rb (1.82 KB)

···

-----Original Message-----
From: James Edward Gray II [mailto:james@grayproductions.net]
Sent: Wednesday, February 23, 2005 1:23 PM
To: ruby-talk ML
Subject: Re: [SOLUTION] [QUIZ] 1-800-THE-QUIZ (#20)

On Feb 22, 2005, at 4:04 PM, Lee Marlow wrote:

Attached is my solution.

Just FYI, I get some pretty unusual output when I run this on the quiz phone number using /usr/share/dict/words as my dictionary.
Here's a little of what I see:

$ echo '873.7829' | ruby phonewords.rb -d /usr/share/dict/words 8-Q -7-TA URF -T -Y T -3-PU -Z U -ES -AW UR -7-T -Y US -Q -A U -3-P
-2-Z URD -U ...

James Edward Gray II

Hello Jordi,

you found a bug. Your number makes my code enter into an infinite
loop. I'll fix it and repost.

Thanks,

Brian

···

On Tue, 22 Feb 2005 06:26:16 +0900, Jordi Bunster <jordi@bunster.org> wrote:

On Feb 20, 2005, at 3:50 PM, Brian Schröder wrote:

> This version loads the dictionary a lot faster (3 times as fast as the
> old
> version) because it does not create as many intermediate objects. Also
> I measured that upcasing and gsub'ing is faster on the long string
> than on the individual short strings.

Must be something wrong with my old iBook, because here it takes
aaaaages for anything to even appear.

Makes me wonder if I'm running it correctly:

echo 5467945 | ruby phoneword.rb -v -p -s -d /usr/share/dict/words

Or is the problem just that I/O intensive?

--
        Jordi

Here's my code:
(wordizer.rb)

I must have gremlins today. I'm breaking all kinds of programs.

[snip beginning of code]

def wordize(number,mulnum=false)
   number=number.to_s
   numa = number.split('').map{|z|@map[z.to_i]}
   positions=[[0,false]]
   words = [nil]*(number.size+1)
   until positions.empty?
     positions.uniq!
     pos,num = positions.shift
     words[pos]= nil if words[pos] and words[pos].empty?
     words[pos]||=@dict.grep(mkre(numa[pos..-1]))
         words[pos].map{|z|z.size if z}.uniq.each do |len|
       positions.push([pos+len,false]) if pos+len<=number.size
     end
     if ((not num) or mulnum)and pos<number.size
       words[pos]<<number[pos,1] if !positions.include?([pos+1,false])

At first, copying and pasting this code gave me a syntax error. Changing the above to:

        words[pos]<<number[pos,1]
        if !positions.include?([pos+1,false])

Seemed to fix that issue.

[snip more code]

    begin f = File.open(dict)
   ARGF.pos
rescue
   $stderr.puts "#$0: #$!"
   exit 1
end

Then when I would try:

$ echo '873.7829' | ruby wordizer.rb -d /usr/share/dict/words

The above chunk of code would issue an "Illegal Seek" error. Commenting out ARGF.pos seem to get me over that hump.

However at that point, issuing the same command seems to cause an infinite loop. I couldn't figure out what was going on there, so I gave up.

Just thought I would share, in case I uncovered anything...

James Edward Gray II

···

On Feb 22, 2005, at 12:56 AM, Jannis Harder wrote:

Thanks to Jordi who found a testcase that broke my code I reworked the solution. Now I use a different approach to skipping numbers. I create the possible skips for a given number, ignore the skipped numbers, search a real solution for the rest and reinject the skipped numbers into the solution.

That is a lot nicer, and also the code is now cleaner. Additionally I
improved loading of the wordlist once again, made -v a true verbose
switch and added abit of description.

I hope I'm not annoying people with this long posts.

As always: The nice and colorfull solutions is at:

http://ruby.brian-schroeder.de/quiz/phoneword/

or directly at

http://ruby.brian-schroeder.de/quiz/phoneword/browse/phoneword-rb.html

Best regards,

Brian

#!/usr/bin/ruby

# Nodes in the Dictionary.
class DictionaryNode < Array
  # Terminal info
  attr_reader :words

  def initialize
    super(10)
    @words =
  end
end

# A tree-indexed version of the dictionary that allows efficent searching by number 2 alphabet mapping.
class Dictionary
  def initialize(encoding)
    super()
    @encoding = {}
    @inverse_encoding = {}

    encoding.each do | k, v |
      @encoding[k] = v.split(/\s+/).map{|c| c[0]}
    end

    # Create map from characters to numbers
    @inverse_encoding = @encoding.inject({}) { | r, (k, v) |
      v.each do | l | r[l] = k end
      r
    }
    @root = DictionaryNode.new
  end

  # Helper method for rekursive adding of words to the dictionary
  private
  def add_recursive(node, word, position)
    if word.length == position
      node.words << word
      return node
    end
    add_recursive(node[@inverse_encoding[word[position]]] ||= DictionaryNode.new, word, position + 1)
  end

  # Add words to the dictionary
  public
  def add(word)
    add_recursive(@root, word, 0)
    self
  end
  
  # Load a wordlist from a file, which contains one word per line.
  # Ignores punctuation and whitespace.
  def load_wordlist(file, options)
    $stderr.print "Loading dictionary... " if options.verbose
    start = Time.new
    file.read.gsub(/[^A-Za-z\n]/, '').upcase!.split($/).uniq!.each do |w|
      w.chomp!
      next if w.empty? or w.length <= options.min_length
      self.add(w)
    end
    $stderr.puts "built dictionary in %f seconds" % (Time.new-start).to_f if options.verbose
    self
  end

  private
  # Search words and return (in the block) words and the unmatched rest of the number
  def sub_find(node, number, &block)
    # Return words found so far
    block[node.words.map{|w|w.dup}, number] unless node.words.empty?
    # No more digits, so stop searching here
    return node if number.empty?
    # Search for longer words
    sub_find(node[number[0]], number[1..-1], &block) if node[number[0]]
  end

  private
  # Calculate all allowed skip patterns for a number of a given length
  def skips(s, length)
    return [s] if length == 0
    result = skips(s + [false], length-1)
    result.concat(skips(s + [true], length-1)) unless s[-1]
    result
  end

  public
  # Skipping makes this a bit ugly
  def find_noskip(number)
    result =
    sub_find(@root, number) do | words, rest_number |
      if rest_number.empty?
        result.concat(words)
      else
        find_noskip(rest_number).each do | sentence |
          words.each do | w |
            result << w + '-' + sentence
          end
        end
      end
    end
    result
  end

  # Skipping makes this a bit ugly
  def find(number)
    result =
    skips(, number.length).each do | skipped |

      # Create the injector that can inject the skipped numbers back into the word
      injector =
      skipped.zip(number).each_with_index do |(s,n), i|
        injector << [n.to_s, i] if s
      end

      # We search for words built from the unskipped digits
      unskipped_digits = number.zip(skipped).select{|(d, s)| !s}.map{|(d,s)|d}
      sentences = find_noskip(unskipped_digits)
      # Inject the skipped digits back into the found sentences
      sentences.each do | s |
        injector.each do | (n, i) | s.insert(i, n) end
      end

      result.concat(sentences)
    end
    result
  end
end

encodings = {
  :james => {
    2 => 'A B C',
    3 => 'D E F',
    4 => 'G H I',
    5 => 'J K L',
    6 => 'M N O',
    7 => 'P Q R S',
    8 => 'T U V',
    9 => 'W X Y Z'},

  :logic => {
    0 => 'A B',
    1 => 'C D',
    2 => 'E F',
    3 => 'G H',
    4 => 'I J K',
    5 => 'L M N',
    6 => 'O P Q',
    7 => 'R S T',
    8 => 'U V W',
    9 => 'X Y Z'
  }
}

require 'optparse'

class PhonewordOptions < OptionParser
  attr_reader :dictionary, :encoding, :format, :allow_skips, :help, :encoding_help, :verbose, :min_length
  def initialize
    super()
    @dictionary = '/usr/share/dict/words'
    @encoding = :james
    @format = :plain
    @allow_skips = true
    @help = false
    @encoding_help = false
    @verbose = false
    @ignore_non_alpha = false
    @min_length = 1
    self.on("-d", "--dictionary DICTIONARY", String) { | v | @dictionary = v }
    self.on("-e", "--encoding ENCODING", String,
            "How the alphabet is encoded to phonenumbers. james or logic are supported.") { | v | @encoding = v.downcase.to_sym }
    self.on("-p", "--plain", 'One result per found number, no other information. (Default)') { @format = :plain }
    self.on("-f", "--full", 'Prefix the result with the number') { @format = :full }
    self.on("-v", "--verbose", 'Make more noise') { @verbose = true }
    self.on("-s", "--skips", "--allow_skips", "--allow-skips", 'Allow to skip one adjacent number while matching. (Default)',
            'Gives lots of ugly results, but james asked for it.') { @allow_skips = true }
    self.on("-c", "--no-skips", "Don't leave numbers in the detected words") { @allow_skips = false }
    self.on("-m" "--min-length", "Minimum length of accepted words.",
              "Use this to ignore one-letter words that make the output quite uninteresting.", Integer) { | v | @min_length = v }
    self.on("-?", "--help") { @help = true }
    self.on("--supported-encodings", "--encoding-help", "List the supported encodings") { @encoding_help = true }
  end
end

options = PhonewordOptions.new
begin
  options.parse!(ARGV)
rescue => e
  puts e
  puts options
  exit
end

if options.help
  puts options
  exit
end

if options.encoding_help or !encodings[options.encoding]
  puts "Possible encodings:"
  puts encodings.to_a.sort_by{|(k,v)|k.to_s}.map{|(k,v)| "#{k}:\n"+v.map{|(n,e)|" #{n}: #{e}"}.sort.join("\n")}
  exit
end

dictionary = Dictionary.new(encodings[options.encoding]).load_wordlist(File.open(options.dictionary), options)

output = {
  :plain => lambda do | number, sentence | sentence end,
  :full => lambda do | number, sentence | "#{number.ljust(15)}: #{sentence}" end }

method = {true => :find, false => :find_noskip }

ARGF.each do | number |
  number.strip!
  number = number.gsub(/[^0-9]/, '').unpack('C*').map{|n|n - ?0}
  $stderr.puts "Searching for #{number}" if options.verbose
  dictionary.send(method[options.allow_skips], number).each do | sentence |
    puts output[options.format][number, sentence]
  end
end

···

On Tue, 22 Feb 2005 16:59:02 +0900 Brian Schröder <ruby.brian@gmail.com> wrote:

Hello Jordi,

you found a bug. Your number makes my code enter into an infinite
loop. I'll fix it and repost.

Thanks,

Brian

On Tue, 22 Feb 2005 06:26:16 +0900, Jordi Bunster <jordi@bunster.org> wrote:
>
> On Feb 20, 2005, at 3:50 PM, Brian Schröder wrote:
>
> > This version loads the dictionary a lot faster (3 times as fast as the
> > old
> > version) because it does not create as many intermediate objects. Also
> > I measured that upcasing and gsub'ing is faster on the long string
> > than on the individual short strings.
>
> Must be something wrong with my old iBook, because here it takes
> aaaaages for anything to even appear.
>
> Makes me wonder if I'm running it correctly:
>
> echo 5467945 | ruby phoneword.rb -v -p -s -d /usr/share/dict/words
>
> Or is the problem just that I/O intensive?
>
> --
> Jordi
>
>