Regular expression: Is this a bug or feature?

Hi ruby experts!

Is this intended behaviour?

irb(main):001:0> s1='a=1'
=> "a=1"
irb(main):002:0> s2='b=1'
=> "b=1"
irb(main):003:0> s1 =~ /a|b=(.)/
=> 0 <------ expression matches
irb(main):004:0> $1
=> nil <------ but where is the captured value?
irb(main):005:0> s2 =~ /a|b=(.)/
=> 0 <------ expression matches
irb(main):006:0> $1
=> "1" <------ as expected
irb(main):007:0> s1 =~ /(a|b)=(.)/
=> 0 <------ expression matches
irb(main):012:0> $2
=> "1" <------ as expected

Tested on ruby 1.8.2 (2004-12-22) [i686-linux]

Thanks for your help in advance
Martin.

Hi ruby experts!

Is this intended behaviour?

irb(main):001:0> s1='a=1'
=> "a=1"
irb(main):002:0> s2='b=1'
=> "b=1"

Call this part 1:

irb(main):003:0> s1 =~ /a|b=(.)/
=> 0 <------ expression matches
irb(main):004:0> $1
=> nil <------ but where is the captured value?

Part 2:

irb(main):005:0> s2 =~ /a|b=(.)/
=> 0 <------ expression matches
irb(main):006:0> $1
=> "1" <------ as expected

Part 3:

irb(main):007:0> s1 =~ /(a|b)=(.)/
=> 0 <------ expression matches
irb(main):012:0> $2
=> "1" <------ as expected

I'm not sure why you think it might be a bug. The '|' operator has the
lowest precedence in a regexp, so /a|b=(.)/ means "either 'a' or
'b=(.)'". To alternate between just 'a' and 'b' you have to group the
"a|b" in parens. Note that in part 1, the bit that matches is the left
side of the '|', namely 'a' (no parens), so the capture group never
takes part in the match and $1 is nil. In part 2, the right side
('b=(.)') matches, so there's 1 capture. In part 3, it matches the
whole thing ('(a|b)=(.)'), so there are 2 captures.
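
To make the three cases concrete, here is a small sketch that uses Regexp#match and MatchData instead of the $1/$2 globals. The variable names (loose, grouped, m1, m2, m3) are only for illustration and are not from the original post.

```ruby
s1 = 'a=1'
s2 = 'b=1'

# Ungrouped: the two alternatives are /a/ and /b=(.)/.
loose = /a|b=(.)/

m1 = loose.match(s1)
p m1[0]        # "a"    (only the left alternative matched)
p m1.captures  # [nil]  (the group lives in the other alternative)

m2 = loose.match(s2)
p m2[0]        # "b=1"  (the right alternative matched)
p m2.captures  # ["1"]

# Grouped: the alternation is limited to 'a' or 'b', then '=' and one char.
grouped = /(a|b)=(.)/

m3 = grouped.match(s1)
p m3[0]        # "a=1"
p m3.captures  # ["a", "1"]
```

Looking at m1[0] is a quick way to see which alternative actually matched: the match succeeds at offset 0 in all three cases, but it does not always cover the same text.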

Does this make sense?

Note that if you only want 1 capture, you can also use the
non-capturing ("shy") grouping operator (?:...), which groups without
creating a capture:

  s1 =~ /(?:a|b)=(.)/
  [$1, $2] #=> ["1", nil]
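
As a rough sketch of how the non-capturing version behaves across inputs, the snippet below runs the same pattern over both strings from the original post plus one that should not match. The names pattern, values and miss, and the 'c=1' input, are made up for this example.

```ruby
pattern = /(?:a|b)=(.)/

# Group 1 is the character after '=', whichever letter matched.
values = ['a=1', 'b=1'].map { |s| pattern.match(s)[1] }
p values  # ["1", "1"]

# String#[] with a regexp and a group number returns nil when nothing matches.
miss = 'c=1'[pattern, 1]
p miss    # nil
```

Because (?:...) does not create a group, the value is always in group 1 and there is no group 2 to account for.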

On 1/23/07, Martin Kahlert <mkcon@gmx.de> wrote:

Martin Kahlert wrote:

irb(main):003:0> s1 =~ /a|b=(.)/
=> 0 <------ expression matches
irb(main):004:0> $1
=> nil <------ but where is the captured value?

I assume this is the one you need an explanation for? I think you have
simply misread the regexp. /a|b=(.)/ is a union of the two regexps /a/
and /b=(.)/. In this case only the first one matches, and it has no
capture groups, so $1 is nil. The regexp you are probably looking for
is /(?:a|b)=(.)/. Try that.
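
One way to check the "union of two regexps" reading is to build the pattern with Regexp.union and compare the results. This is only an illustration; the variable names union, left and right are not from the thread.

```ruby
# Regexp.union joins its arguments with '|', much like writing /a|b=(.)/ by hand.
union = Regexp.union(/a/, /b=(.)/)

left = union.match('a=1')
p left[0]   # "a"   (the /a/ alternative matched)
p left[1]   # nil   (the group in /b=(.)/ did not take part)

right = union.match('b=1')
p right[0]  # "b=1" (the /b=(.)/ alternative matched)
p right[1]  # "1"
```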

--
Posted via http://www.ruby-forum.com/.

I always assumed 'a|b anything' means '(a|b) anything'.

Thanks for this clarification!

Regards
Martin

On Tue, Jan 23, 2007 at 05:43:53PM +0900, George Ogata wrote:
