Another(!) Newbie question

Hi guys, I'm writing a simple IRC bot to get to grips with what I've learned about Ruby so far.
It works OK, but I'm looking for a simple way to format the output. At the moment the log files and stdout
look something like this:

:kyu!kyu@cpimps-34E00F11.elk.dialup.pol.co.uk PRIVMSG #matrix :yea that's it

I read the Pickaxe section on regexps, thinking that would point me in the right direction, but I'm still not sure.

Any help would be great. Thanks.

    -kyu

Here is a very simple (maybe naive) regular expression to use to parse
this data:

/^:(.*)!(.*) (.*) #(.*) :(.*)$/

An example of using this:

def parse_irc_data(data)
  if data =~ /^:(.*)!(.*) (.*) #(.*) :(.*)$/
    puts "Part 1: #$1"
    puts "Part 2: #$2"
    puts "Part 3: #$3"
    puts "Part 4: #$4"
    puts "Part 5: #$5"
  end
end

parse_irc_data(":kyu!kyu@cpimps-34E00F11.elk.dialup.pol.co.uk PRIVMSG #matrix :yea that's it")

The output:
Part 1: kyu
Part 2: kyu@cpimps-34E00F11.elk.dialup.pol.co.uk
Part 3: PRIVMSG
Part 4: matrix
Part 5: yea that's it
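Since the original question was about formatting, the captured parts can be rearranged into a more readable log line. What follows is only a rough sketch: the method name format_irc_line, the constant IRC_LINE and the "[#channel] <nick> text" layout are assumptions made for illustration, not anything the bot already has. Lines that do not match are returned unchanged.

```ruby
# Same simple pattern as above, stored in a constant (the name is hypothetical).
IRC_LINE = /^:(.*)!(.*) (.*) #(.*) :(.*)$/

# Turn a raw channel message into "[#channel] <nick> text".
# Anything that does not match, or is not a PRIVMSG, is returned as-is.
def format_irc_line(data)
  match = IRC_LINE.match(data)
  return data unless match

  nick, _user_host, command, channel, text = match.captures
  return data unless command == "PRIVMSG"

  format("[#%s] <%s> %s", channel, nick, text)
end

puts format_irc_line(":kyu!kyu@cpimps-34E00F11.elk.dialup.pol.co.uk PRIVMSG #matrix :yea that's it")
# prints: [#matrix] <kyu> yea that's it
```

The same method could be called on each line before it is written to the log file or to stdout.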

Regards,
Ryan Leavengood

As a general rule, remember that .* is "evil"[1]. There are times when
it is useful, but usually it is not what you want. Its semantics are
too liberal, which invites mismatches, and even when it matches correctly
it is one of the slowest of the possible alternatives. Usually the . can
be replaced with a negated character class that excludes whatever
follows. For example, I'd write the above regex as:

/^:([^!]*)!(\S*)\s(\S*)\s#(\S*)\s:(.*)$/
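To make the mismatch risk concrete, here is a small comparison of the two patterns. The sample line is made up for illustration (its message text happens to contain " #ruby :now"), and the names GREEDY, STRICT, SAMPLE and captures_for are hypothetical helpers, not part of anyone's bot.

```ruby
GREEDY = /^:(.*)!(.*) (.*) #(.*) :(.*)$/
STRICT = /^:([^!]*)!(\S*)\s(\S*)\s#(\S*)\s:(.*)$/

# Invented sample: the message text itself contains a " #" and a " :".
SAMPLE = ":kyu!kyu@host PRIVMSG #matrix :join #ruby :now"

# Return the captured groups as an array, or nil if the line does not match.
def captures_for(regex, line)
  match = regex.match(line)
  match ? match.captures : nil
end

p captures_for(GREEDY, SAMPLE)
# => ["kyu", "kyu@host PRIVMSG #matrix", ":join", "ruby", "now"]

p captures_for(STRICT, SAMPLE)
# => ["kyu", "kyu@host", "PRIVMSG", "matrix", "join #ruby :now"]
```

With the greedy pattern the second .* swallows the command and the channel, so the "channel" comes out as ruby and most of the message is lost. The \S-based pattern cannot run past a space, so each field stays where it belongs.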

Jacob Fugal

[1] http://perlmonks.org/index.pl?node_id=24640

On 4/20/05, Ryan Leavengood <mrcode@netrox.net> wrote:

Here is a very simple (maybe naive) regular expression to use to parse
this data:

/^:(.*)!(.*) (.*) #(.*) :(.*)$/

Thanks a lot! It always seems so simple after someone answers your question. I'll re-read that section and hopefully figure out a bit more about how it works.

    -kyu

Thanks for the information and link. I wrote the example for kyu like I
did to purposely keep it simple. But he should definitely use something
more like your regex in his final code. So kyu, if you are listening,
please do so :)

Also that link is a good read for anyone who likes to use regular
expressions (which is probably every Ruby programmer.) I have a copy of
"Mastering Regular Expressions" (mentioned in the link) on a CD
somewhere...I need to find it and do some review.

Ryan
