Silly regex question

Can someone help me make this code not suck?

require 'test/unit'

# EWWWWWWW
def process_string str
  result = Hash.new
  str.to_a.each do |line|
    line.scan(/important ([a-zA-Z0-9]+): /) do |key|
      result[key.first] = line
    end
  end
  result
end

class TestThis < Test::Unit::TestCase
  def example_string
    <<-END
    # comments
    important key1: some value's here
    important key2: some value's here
    important key3: some value's here
    important key4: some value's here

    other stuff we don't care about
    END
  end

  def test_process_string
    result = process_string example_string
    assert_equal 4, result.size
    assert result.has_key?("key1")
    assert result.has_key?("key2")
    assert result.has_key?("key3")
    assert result.has_key?("key4")
  end
end

Can someone help me make this code not suck?

I guess it depends on what you mean by that...

require 'test/unit'

# EWWWWWWW
def process_string str
  result = Hash.new
  str.to_a.each do |line|
    line.scan(/important ([a-zA-Z0-9]+): /) do |key|
      result[key.first] = line
    end
  end
  result
end

def process_string str
   Hash[*str.scan(/^\s*(important ([a-zA-Z0-9]+): .+)$/).flatten.reverse]
end
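
For anyone puzzling over the one-liner above, here is a rough sketch of what each step produces. The sample text and the variable names are made up for illustration; only the regex and the scan/flatten/reverse chain come from the post. The last line is an alternative spelling for newer Rubies, not something suggested in the thread.

```ruby
# Hypothetical sample input, shorter than the one in the test case.
text = "  important key1: a\n  important key2: b\n  ignored line\n"

# scan returns one [matched_text, key] pair per matching line,
# because the regex has two capture groups.
pairs = text.scan(/^\s*(important ([a-zA-Z0-9]+): .+)$/)

flat     = pairs.flatten   # line, key, line, key, ...
reversed = flat.reverse    # key, line, key, line, ... (last line first)
result   = Hash[*reversed] # alternating arguments become key => value

# Same hash without the flatten/reverse trick (Array#to_h, Ruby 2.1+).
alternative = pairs.map(&:reverse).to_h
```

The reverse is only there to get each key in front of its line; the order of the entries does not matter for the lookup.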

James Edward Gray II

···

On Jan 3, 2006, at 4:22 PM, Joe Van Dyk wrote:

Joe Van Dyk wrote:

Can someone help me make this code not suck?

require 'test/unit'

# EWWWWWWW
def process_string str
  result = Hash.new
  str.to_a.each do |line|
    line.scan(/important ([a-zA-Z0-9]+): /) do |key|
      result[key.first] = line
    end
  end
  result
end

def process_string str
  result = {}
  str.each_line do |line|
    important_stuff = line.split(/important\s+/,2)[1] or next
    key, value = important_stuff.split ':'
    result[key] = value
  end
  result
end
# More readable? I dunno... you be the judge. I might prefer it over:
# line, key, value = line.match /important\s+(\w+):\s*(.*)$/
# Though the latter is much more explicit.
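
A minimal sketch of the match-based alternative mentioned in the comment above, assuming the goal is key => value rather than key => whole line. The method name process_string_with_match is hypothetical, and captures is used instead of destructuring the match result directly.

```ruby
# Hypothetical name; the regex is the one from the comment above.
def process_string_with_match(str)
  result = {}
  str.each_line do |line|
    match = line.match(/important\s+(\w+):\s*(.*)$/)
    next unless match            # skip lines without an "important" entry
    key, value = match.captures  # the two capture groups, in order
    result[key] = value
  end
  result
end
```

Unlike the split version, this keeps any later colons in the value and drops the leading space after the first colon.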

···

class TestThis < Test::Unit::TestCase
  def example_string
    <<-END
    # comments
    important key1: some value's here
    important key2: some value's here
    important key3: some value's here
    important key4: some value's here

    other stuff we don't care about
    END
  end

  def test_process_string
    result = process_string example_string
    assert_equal 4, result.size
    assert result.has_key?("key1")
    assert result.has_key?("key2")
    assert result.has_key?("key3")
    assert result.has_key?("key4")
  end
end

IMO, my version's more readable. I'm going for readability here.

···

On 1/3/06, James Edward Gray II <james@grayproductions.net> wrote:

On Jan 3, 2006, at 4:22 PM, Joe Van Dyk wrote:

> Can someone help me make this code not suck?

I guess it depends on what you mean by that...

> require 'test/unit'
>
> # EWWWWWWW
> def process_string str
> result = Hash.new
> str.to_a.each do |line|
> line.scan(/important ([a-zA-Z0-9]+): /) do |key|
> result[key.first] = line
> end
> end
> result
> end

def process_string str
   Hash[*str.scan(/^\s*(important ([a-zA-Z0-9]+): .+)$/).flatten.reverse]
end

def process_string str
   result = Hash.new
   str.scan(/^\s*(important ([a-zA-Z0-9]+): .+?)\s*$/) do |line, key|
     result[key] = line
   end
   result
end

# ... or ...

def process_string str
   str.inject(Hash.new) do |result, line|
     result[$1] = line if line =~ /^\s*important ([^:]+):/
     result
   end
end
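
The inject version above iterates over the String line by line, which worked in the Ruby of the time. From Ruby 1.9 on, String is no longer Enumerable, so a sketch for current versions would go through each_line. The method name process_string_by_line is hypothetical; the regex and the block body are the ones from the post.

```ruby
# Hypothetical name; same logic, but iterating via String#each_line.
def process_string_by_line(str)
  str.each_line.inject({}) do |result, line|
    # $1 holds the key captured by the most recent successful match
    result[$1] = line if line =~ /^\s*important ([^:]+):/
    result # inject needs the accumulator returned from every iteration
  end
end
```

The values are the complete lines, leading whitespace and newline included, the same as in the original process_string.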

James Edward Gray II

···

On Jan 3, 2006, at 4:41 PM, Joe Van Dyk wrote:

IMO, my version's more readable. I'm going for readability here.

Another silly regex question.

I have a regex that's getting to be more than 100 chars long. How can
I split it up on multiple lines?

Joe

Not that I'm the one that has to read it, but . . .
I like this one for readability.

···

On Wed, Jan 04, 2006 at 07:51:55AM +0900, James Edward Gray II wrote:

def process_string str
  str.inject(Hash.new) do |result, line|
    result[$1] = line if line =~ /^\s*important ([^:]+):/
    result
  end
end

--
Chad Perrin [ CCD CopyWrite | http://ccd.apotheon.org ]

unix virus: If you're using a unixlike OS, please forward
this to 20 others and erase your system partition.

You can put the 'x' option on the end of the regular expression.
From the Pickaxe:
Extended Mode: Complex regular expressions can be difficult to read. The
x option allows you to insert spaces, newlines, and comments in the
pattern to make it more readable.

e.g.
%r{some regex
with
multiple
lines
}x
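
As a sketch of what that looks like in practice, here is the "important key: value" pattern from earlier in the thread spread over several lines with comments. The constant name IMPORTANT_LINE and the second capture group for the value are assumptions for illustration. Note that with x set, a literal space has to be escaped, since unescaped whitespace in the pattern is ignored.

```ruby
# Hypothetical constant; the pattern is adapted from earlier in the thread.
IMPORTANT_LINE = /
  ^\s*            # optional leading whitespace
  important\ +    # the literal word, then one or more escaped spaces
  ([a-zA-Z0-9]+)  # capture the key
  :\ +            # colon, then one or more spaces
  (.+?)           # capture the value, lazily
  \s*$            # ignore trailing whitespace
/x

match = IMPORTANT_LINE.match("   important key1: some value's here  ")
key, value = match.captures
```

Keep slashes out of the comments when the regex is delimited with slashes, or use %r{...}x as in the example above.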

···

On 1/3/06, Joe Van Dyk <joevandyk@gmail.com> wrote:

Another silly regex question.

I have a regex that's getting to be more than 100 chars long. How can
I split it up on multiple lines?

Here's a rails example for validating email addresses.

  validates_format_of :login, :with => /
    ^[-^!$#%&'*+\/=?`{|}~.\w]+
    @[a-zA-Z0-9]([-a-zA-Z0-9]*[a-zA-Z0-9])*
    (\.[a-zA-Z0-9]([-a-zA-Z0-9]*[a-zA-Z0-9])*)+$/x,
    :message => "must be a valid email address",
    :on => :create

ah, nice. Too bad vim doesn't highlight the comments. :(

···

On 1/3/06, Wilson Bilkovich <wilsonb@gmail.com> wrote:

On 1/3/06, Joe Van Dyk <joevandyk@gmail.com> wrote:
> Another silly regex question.
>
> I have a regex that's getting to be more than 100 chars long. How can
> I split it up on multiple lines?
>
You can put the 'x' option on the end of the regular expression.
From the Pickaxe:
Extended Mode: Complex regular expressions can be difficult to read. The
x option allows you to insert spaces, newlines, and comments in the
pattern to make it more readable.

e.g.
%r{some regex
with
multiple
lines
}x

I was just perfecting my email address validator, mine allows multiple
email addresses in the email_address field:

validates_format_of :email_address, :with =>
/^\s*(?:(?:[^,@\s]+)@(?:(?:[-a-z0-9]+\.)+[a-z]{2,}\s*(,\s*|\z)))+$/i,
:allow_nil => true

It doesn't do as careful of an inspection as yours, although I've seen
some validations that are far more detailed. It'd be nice to have some
people contribute their suggestions for the ultimate email address
validation regular expression.

I also have this handy method in my class:

  def email_addresses
    self.email_address.split(',').map{|a| a.lstrip.rstrip }
  end

-Jeff

···

On Wed, Jan 04, 2006 at 08:27:58AM +0900, Dan Kohn wrote:

Here's a rails example for validating email addresses.

  validates_format_of :login, :with => /
    ^[-^!$#%&'*+\/=?`{|}~.\w]+
    @[a-zA-Z0-9]([-a-zA-Z0-9]*[a-zA-Z0-9])*
    (\.[a-zA-Z0-9]([-a-zA-Z0-9]*[a-zA-Z0-9])*)+$/x,
    :message => "must be a valid email address",
    :on => :create

Have you tried setting this?:

  :syntax on

···

On Wed, Jan 04, 2006 at 08:40:43AM +0900, Joe Van Dyk wrote:

ah, nice. Too bad vim doesn't highlight the comments. :(

--
Chad Perrin [ CCD CopyWrite | http://ccd.apotheon.org ]

This sig for rent: a Signify v1.14 production from http://www.debian.org/

It already exists. Go buy the first edition of Mastering Regular Expressions. There's an 11-page regex that matches emails.

Brian

···

On Jan 4, 2006, at 7:25 AM, Jeffrey Moss wrote:

It doesn't do as careful of an inspection as yours, although I've seen
some validations that are far more detailed. It'd be nice to have some
people contribute their suggestions for the ultimate email address
validation regular expression.

I was just perfecting my email address validator, mine allows multiple
email addresses in the email_address field:

validates_format_of :email_address, :with =>
/^\s*(?:(?:[^,@\s]+)@(?:(?:[-a-z0-9]+\.)+[a-z]{2,}\s*(,\s*|\z)))+$/i,
:allow_nil => true

It doesn't do as careful of an inspection as yours, although I've seen
some validations that are far more detailed. It'd be nice to have some
people contribute their suggestions for the ultimate email address
validation regular expression.

I also have this handy method in my class:

  def email_addresses
    self.email_address.split(',').map{|a| a.lstrip.rstrip }
  end

Why do you lstrip and then rstrip? Won't a simple strip work for you?
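
A quick sketch of the point, using a standalone method instead of the model method from the post. The name split_addresses and the sample addresses are made up for illustration.

```ruby
# Hypothetical standalone version of the email_addresses helper.
def split_addresses(field)
  # String#strip removes leading and trailing whitespace in one call
  field.split(',').map { |address| address.strip }
end

sample   = " a@example.com ,b@example.org  "
stripped = split_addresses(sample)
chained  = sample.split(',').map { |a| a.lstrip.rstrip }
```

Both stripped and chained come out the same, so the single strip call is enough.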

···

On Wednesday 04 January 2006 06:25, Jeffrey Moss wrote:

-Jeff

On Wed, Jan 04, 2006 at 08:27:58AM +0900, Dan Kohn wrote:
> Here's a rails example for validating email addresses.
>
> validates_format_of :login, :with => /
> ^[-^!$#%&'*+\/=?`{|}~.\w]+
> @[a-zA-Z0-9]([-a-zA-Z0-9]*[a-zA-Z0-9])*
> (\.[a-zA-Z0-9]([-a-zA-Z0-9]*[a-zA-Z0-9])*)+$/x,
>
> :message => "must be a valid email address",
> :on => :create

> def email_addresses
> self.email_address.split(',').map{|a| a.lstrip.rstrip }
> end

Why do you lstrip and then rstrip? Won't a simple strip work for you?

YES! Nice to know! Hehe.

-Jeff