Bug in ruby regexp

Hello,

I spotted this problem in ruby's regexp today:

$ irb(main):001:0> num = "10"
=> "10"
irb(main):002:0> if num =~ /[9-13]/
irb(main):003:1> puts "hello"
irb(main):004:1> end
SyntaxError: compile error
(irb):2: invalid regular expression: /[9-13]/
        from (irb):4
        from :0
irb(main):005:0>

I have tested it in ruby 1.8 and 0.9.

Anyone else spotted this?

--
Nick Black
--------------------------------

On Feb 2, 2007, at 9:54 AM, Nick Black wrote:
> I spotted this problem in ruby's regexp today: ...
>
> (irb):2: invalid regular expression: /[9-13]/

A character class ([...]) with a range of 9-1 is not valid in a regular expression because 1 does not come after 9 in your character encoding.

I believe you were trying to verify that num is between 9 and 13. Your regex would not do this even if it were legal. A character class gives multiple choices for a single character, not for a group of characters.
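
To make both points concrete, here is a minimal Ruby sketch that is not part of the original exchange. The helper name valid_pattern? and the constant SINGLE_DIGIT are made up for illustration, and the code assumes a Ruby recent enough to have Regexp#match? (2.4 or later). It builds the patterns from strings so the failure shows up as a rescuable RegexpError at run time instead of a syntax error when the file is parsed.

```ruby
# Ranges inside [...] are ordered by character code, not by numeric value.
puts "9".ord   # => 57
puts "1".ord   # => 49, so a range running from 9 to 1 is backwards

# Hypothetical helper: true if the source string compiles as a regexp.
def valid_pattern?(source)
  Regexp.new(source)
  true
rescue RegexpError
  false
end

puts valid_pattern?("[9-13]")   # => false (range 9-1, then the character 3)
puts valid_pattern?("[1-39]")   # => true  (range 1-3, then the character 9)

# Even a valid class matches exactly one character at a time.
SINGLE_DIGIT = /\A[1-39]\z/
puts SINGLE_DIGIT.match?("3")    # => true
puts SINGLE_DIGIT.match?("13")   # => false, "13" is two characters
```

So [1-39] means "one of 1, 2, 3 or 9", which is why a numeric interval has to be spelled out digit by digit or checked as a number.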

Here are some ways to perform your check:

>> num = "10"
=> "10"
>> num =~ /\A(?:9|1[0123])\Z/
=> 0
>> num.to_i.between? 9, 13
=> true

Hope that helps.

James Edward Gray II

Are there any filterable subscriptions to this board?

On 2/2/07, Nick Black <nickblack1@gmail.com> wrote:
> I spotted this problem in ruby's regexp today: ...

--
Let go, and let GOD!
Robert McCorkle
www.doodleprints.com
(319) 651-3855

"Nick Black" <nickblack1@gmail.com> writes:

> I spotted this problem in ruby's regexp today: ...
>
> (irb):2: invalid regular expression: /[9-13]/

It's not a bug. The problem is that you are mixing up characters and numbers. Regular
expressions work on characters: they don't know the number "13", only the
characters 1 and 3. In your regexp, you have a character range of 9-1 followed by
the single character 3. However, 9 comes after 1, so the range doesn't make sense.

Assuming you want to match on only the numbers 9, 10, 11, 12 and 13, you have
two basic groupings - a single character '9' or two characters in which the
first is 1 and the second is in the range 0-3. A possible regexp could
therefore be

/^(?:9|1[0-3])$/

which says match "9" or "10" or "11" or "12" or "13", but don't put it into the
match variables (that is what the (?:...) grouping does). Note that your example
input would match even without the ^ and $, but I always like to get into the
habit of using them where possible, as they anchor the regexp. If you don't
anchor a regexp, you can get really bad performance due to loads of
backtracking. However, this is more applicable when matching long strings of
text. With only a couple of characters, it's not really an issue.
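
Besides performance, anchoring also changes which strings are accepted. The sketch below is an illustration added alongside this post, not Tim's code: the constant names, the verdicts helper and the sample inputs are all made up, and Regexp#match? assumes Ruby 2.4 or later. It compares the unanchored pattern, the ^...$ form, and the stricter \A...\z form.

```ruby
UNANCHORED  = /(?:9|1[0-3])/       # matches anywhere inside the string
LINE_ANCHOR = /^(?:9|1[0-3])$/     # ^ and $ anchor to a line, not the whole string
FULL_ANCHOR = /\A(?:9|1[0-3])\z/   # \A and \z anchor to the whole string

# Made-up sample inputs, chosen to show where the three patterns disagree.
SAMPLES = ["10", "14", "130", "99", "12\nextra"]

# Returns [unanchored?, line_anchored?, fully_anchored?] for one input.
def verdicts(text)
  [UNANCHORED, LINE_ANCHOR, FULL_ANCHOR].map { |re| re.match?(text) }
end

SAMPLES.each do |text|
  puts "#{text.inspect}: #{verdicts(text).inspect}"
end
```

"130" and "99" slip through the unanchored pattern because they merely contain a match, and "12\nextra" still passes the ^...$ form because in Ruby those anchors match at line boundaries. Only the \A...\z form rejects all three.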

I remember seeing a post to the perl group some years ago where someone was
saying that using a regexp was causing their computer to hang. However, it
turned out the problem was due to not anchoring the regexp. The computer wasn't
hung, it was just taking a long, long time to perform the matching. As soon as
the expression was anchored, the "hang" was eliminated.

HTH

Tim

--
tcross (at) rapttech dot com dot au

On Feb 2, 2007, at 11:03 AM, James Edward Gray II wrote:
> Here are some ways to perform your check:
>
> >> num = "10"
> => "10"
> >> num.to_i.between? 9, 13
> => true

or with a range:

>> num = "10"
=> "10"
>> (9..13) === num.to_i
=> true
>> num = "14"
=> "14"
>> (9..13) === num.to_i
=> false

You could also have Float values:

>> num = "11.4"
=> "11.4"
>> (9..13) === num.to_i
=> true
>> (9..13) === num.to_f
=> true

>> num = "13.1"
=> "13.1"
>> (9..13) === num.to_i
=> true
>> (9..13) === num.to_f
=> false

The to_i check still passes because "13.1".to_i truncates to 13.
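
The same leniency applies to trailing junk: String#to_i stops at the first character it cannot parse. As a hedged sketch (the helper name in_range? and its default range are assumptions for illustration, not something from this thread), Kernel#Integer can be used when the whole string must be a plain integer:

```ruby
# to_i silently ignores everything after the leading digits.
puts "12abc".to_i                 # => 12
puts((9..13) === "12abc".to_i)    # => true, probably not what was intended

# Hypothetical helper: accept only strings that are entirely a base-10
# integer inside the range. Integer() raises ArgumentError on "13.1" or "12abc".
def in_range?(text, range = 9..13)
  range.cover?(Integer(text, 10))
rescue ArgumentError, TypeError
  false
end

puts in_range?("10")      # => true
puts in_range?("14")      # => false
puts in_range?("13.1")    # => false
puts in_range?("12abc")   # => false
```

Whether rejecting "13.1" is the right behaviour depends on what num is supposed to hold; if fractional values are valid input, the to_f comparison above is the one to use.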

-Rob

Rob Biedenharn http://agileconsultingllc.com
Rob@AgileConsultingLLC.com

On 2/2/07, James Edward Gray II <james@grayproductions.net> wrote:
> A character class ([...]) with a range of 9-1 is not valid in a
> regular expression because 1 does not come after 9 in your character
> encoding.

For comparative reference:

$ perl -e 'print "foo" if 9 =~ /[9-13]/'
Invalid range "9-1" in regex; marked by <-- HERE in m/[9-1 <-- HERE 3]/ at -e line 1.

$ python -c "import re; re.match('[9-13]', '9')"
Traceback (most recent call last):
  File "<string>", line 1, in ?
  File "/usr/lib/python2.4/sre.py", line 129, in match
    return _compile(pattern, flags).match(string)
  File "/usr/lib/python2.4/sre.py", line 227, in _compile
    raise error, v # invalid expression
sre_constants.error: bad character range