[SUMMARY] Numbers Can Be Words (#133)

We all know this problem isn't tough at all. Several others and I solved it
with a one-liner. We all enjoy a good one-liner, right?

Actually, I built this solution before the quiz was posted and when I shared it
with Morton, he, very politely, mentioned the G-word. The fact is that I wasn't
really trying to "golf" though. I was using a strategy of problem solving I
call Thinking With The Command-Line. Let me take you through my process to my
solution and beyond to show you what I mean.

First, it's important to remember that Ruby has a lot of command-line switches
that help with these quick tasks. It's not a sin to use these tools. They make
short work of jobs like this because the easy things should be easy. You're not
golfing when you use these, you're just telling Ruby this is a quick job and you
trust her to handle the details on this one.

Let's start with the basics. Obviously, we want to read over the dictionary and
print some words in it. Let's begin with just that much:

  $ ruby -pe 1 /usr/share/dict/words

The -p switch asks Ruby to wrap your program in a read and print loop over the
files given as arguments (or STDIN). That was really all I wanted, so I just
needed a program to be wrapped. That's where -e comes in. It lets you give
the code on the command-line and I provided the most trivial code I could think
of. It does nothing of course. It just gives Ruby something to wrap and lets
her do all of the work for me.

OK, so I'm now printing the dictionary, but I really want to print just some
words of the dictionary. I need to introduce some conditional that only prints
when I say it's OK to do so. For that, we move to -p's twin -n and actually
resort to writing a little code:

  $ ruby -ne 'print if true' /usr/share/dict/words

The -n switch gives us the same loop around our code, just minus the print()
statement. This lets me choose when I want to print() something.

The read loops that Ruby creates for us always stick the current line in $_. By
default, that's exactly what print() spits out.
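
If the switches feel like magic, it can help to spell the loops out by hand.
Here's a rough sketch of what -p and -n arrange for us, written as ordinary
methods. The names p_loop and n_loop are mine, made up for illustration, and
the real switches work through $_ rather than a block argument.

```ruby
require "stringio"

# Roughly what `ruby -pe CODE` sets up: read a line, run CODE, print the line.
# (Hypothetical helper; the real loop keeps the line in $_ instead.)
def p_loop(input, output)
  input.each_line do |line|
    yield line if block_given?   # the -e code would run here
    output.print line            # -p prints every line
  end
end

# Roughly what `ruby -ne CODE` sets up: the same loop with no automatic print.
def n_loop(input)
  input.each_line { |line| yield line }
end

words = StringIO.new("ace\nbead\nzebra\n")
out   = StringIO.new
n_loop(words) { |line| out.print line if line.start_with?("b") }
puts out.string   # prints "bead"
```

With -n, nothing comes out unless our code asks for it, which is exactly the
hook we need for the condition.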

Great. That's about half of this task. Now I just need the if condition and
I'm done. Before I figure that out though, let's examine one other command-line
switch. I want to set a base for the code to use. It's true that I could just
drop a number in the code and change it as needed, but it would be better if the
number was separate. Ruby has a shortcut switch for that too:

  $ ruby -se 'p $base' -- -base=14

The -s switch adds some rudimentary variable processing to switches passed to
the program. Note that I said switches passed to the program, not to Ruby. I
used the -- switch above to end Ruby's switch processing and switch into the
program context. You can then see that the switch just sets a global for us.
That's fine for our purposes.
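
To make the idea concrete, here's a simplified imitation of what -s does with
the leading switches. This is only a sketch: parse_switches is a hypothetical
helper, and it returns a Hash where Ruby would set globals like $base.

```ruby
# Hypothetical stand-in for -s: peel leading -name or -name=value arguments
# off the front of an argument list and collect them in a Hash.
SWITCH = /\A-(\w+)(?:=(.*))?\z/

def parse_switches(args)
  rest     = args.dup
  switches = {}
  while (m = SWITCH.match(rest.first.to_s))
    switches[m[1]] = m[2] || true   # a bare -name just becomes true
    rest.shift
  end
  [switches, rest]
end

switches, files = parse_switches(["-base=14", "/usr/share/dict/words"])
p switches["base"]   # "14"
p files              # ["/usr/share/dict/words"]
```

Notice that the value arrives as a String. That's why the code that follows
calls $base.to_i before doing any math with it.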

That means all we need is a Regexp that selects the words we want. I came up
with:

  /\A[\d\s#{("a".."z").to_a.join[0...($base.to_i - 10)]}]+\Z/i

That's really just one big character class describing the accepted characters.
The Ruby code inside it creates a String of the alphabet and pulls enough
letters off the front of it to match the current base. Note that we also allow
for digits and the whitespace that will be at the end of each line.
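
If you want to poke at the pattern outside of a one-liner, a sketch like this
works. I've wrapped it in a hypothetical helper, number_word_pattern, that
takes the base as an Integer argument instead of reading $base.

```ruby
# Hypothetical helper that builds the same character class for a given base.
# Assumes a base of 10 or more: a smaller base makes the slice end negative,
# which grabs most of the alphabet instead of none of it.
def number_word_pattern(base)
  letters = ("a".."z").to_a.join[0...(base - 10)]
  /\A[\d\s#{letters}]+\Z/i
end

hex = number_word_pattern(16)
p hex.match?("decade\n")   # true
p hex.match?("ruby\n")     # false
```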

If we put all of that together, we pretty much have my solution:

  $ ruby -sne \
    'print if $_ =~ /\A[\d\s#{("a".."z").to_a.join[0...($base.to_i - 10)]}]+\Z/i' \
    -- -base=12 /usr/share/dict/words

If you're not fond of the Regexp, we could remove it. That involves two
changes:

  1. Convert our base into an Array of acceptable characters. We only want
      such code to run one time, so we will place it in a BEGIN { ... } block.
  2. Bring the characters in as an Array so that we can ease the testing of
      letters. Ruby has switches for that too. We can use -a to split()
      the line of input and -F to provide the pattern to split() on. We will
      also add -l to remove the line ending for us.

Here's the code:

  $ ruby -slap \
         -F'\b|\B' \
         -e 'BEGIN { $hex = ("a".."z").to_a.first($base.to_i - 10) }' \
         -e 'next unless $F.all? { |l| $hex.include? l.downcase }' \
         -- -base=14 /usr/share/dict/words

Note that I snuck in another change in addition to those described. I switched
back to -p and just skipped words that aren't numbers.

The trick in this version is that -a causes each line of input to be split()
into the variable $F. I also added the -F switch with a pattern that will match
between each character to control how the split() works.
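
Here's roughly what those switches hand to our code, spelled out for a single
line of input. It's a sketch with the base passed in as an argument, and
split_line and number_word? are names I made up for this example.

```ruby
# Imitates -l (chomp the line) plus -a with -F'\b|\B' (split between characters).
def split_line(line)
  line.chomp.split(/\b|\B/)
end

# The same test as the one-liner: every character must be an allowed letter.
def number_word?(line, base)
  allowed = ("a".."z").to_a.first(base - 10)   # what the BEGIN block builds
  split_line(line).all? { |l| allowed.include?(l.downcase) }
end

p split_line("Bead\n")         # ["B", "e", "a", "d"]
p number_word?("Bead\n", 16)   # true
p number_word?("Bead\n", 14)   # false
```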

Of course, we could argue about whether this is still a one-liner since I'm
now passing Ruby two lines of code, but I try not to lose a lot of sleep over
such things.

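A final weakness of this solution is that it doesn't finish processing as soon
as it could. Assuming the dictionary is sorted, we can stop at the first word
that begins with a letter outside our base. We can easily add that if you can
tolerate one more line: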

  $ ruby -slap \
         -F'\b|\B' \
         -e 'BEGIN { $hex = ("a".."z").to_a.first($base.to_i - 10) }' \
         -e 'break unless $hex.include? $F.first.downcase' \
         -e 'next unless $F.all? { |l| $hex.include? l.downcase }' \
         -- -base=14 /usr/share/dict/words

Now might be the right time to consider putting all of this in a file,
especially if we wanted to do more with it. I'm done though, so I'll leave that
as an exercise for the interested reader.
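
If you want a head start on that exercise, here's one possible shape for such
a file. It's only a sketch: the method name number_words is hypothetical, and
the early exit assumes the input is sorted alphabetically without regard to
case.

```ruby
# Hypothetical file version of the one-liner. Takes any enumerable of lines
# and a base, and returns the words that read as numbers in that base.
def number_words(lines, base)
  allowed = ("a".."z").first(base - 10)
  found   = []
  lines.each do |line|
    word  = line.chomp
    chars = word.downcase.chars
    next if chars.empty?                         # skip blank lines
    break unless allowed.include?(chars.first)   # sorted input: nothing left to find
    found << word if chars.all? { |c| allowed.include?(c) }
  end
  found
end

sample = %w[a abba bead cab dab zebra]
puts number_words(sample, 14)   # prints a, abba, cab and dab, one per line
```

To run it against a real dictionary, you could hand it
File.foreach("/usr/share/dict/words") in place of the sample Array.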

My thanks to all who showed how easy this can really be.

Tomorrow we're back to simulations and drawing pretty pictures...