Anchored Regexp gets stalled or hangs

Dear rubyists!

I dare to post this question in spite of the fact that lately there have
been many posts about false RE bug reports.

I've tried to make the following regexp

     a=/^-{150}/

and it turns out that such an expression hangs the Ruby interpreter.
I've checked that an re{m,n} expression allows, by design, as many as
32766 repetitions. An expression such as

     /-{32766}/

works fine, and

     /-{32767}/

produces error message:

     'too big quantifier in {,}: /-{32767})/'
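
A quick way to probe such limits without editing literals by hand is to build the pattern at runtime and rescue the compile error. This is only a sketch: the helper name `compiles?` is made up for illustration, and the exact limit and error text depend on the regexp engine in your build (32766 is the figure reported for the engine discussed here).

```ruby
# Hypothetical helper (not part of Ruby): returns true if the pattern
# source compiles, false if the regexp engine rejects it.
def compiles?(source)
  Regexp.new(source)
  true
rescue RegexpError
  false
end

# Probe the two repetition counts from the post.
# Which ones are accepted depends on the engine in use.
[32766, 32767].each do |n|
  puts "-{#{n}}: #{compiles?("-{#{n}}") ? 'ok' : 'rejected'}"
end
```

Building the pattern with `Regexp.new` keeps a rejected quantifier from aborting the whole script at parse time, which is what a regexp literal would do.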

But when the regexp is anchored at the front, you cannot specify more
than 127 repetitions:

     /^-{127}/

is ok, but

     /^-{128}/

hangs the interpreter. Is it a bug or not?
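
For anyone who wants to check their own build, the boundary can be probed with a few lines of Ruby. This is a minimal sketch: `anchored_repeat_ok?` is a hypothetical helper name, and on an affected interpreter the call itself may hang, so it is best run in a throwaway process.

```ruby
# Hypothetical helper: compile "^-{n}" at runtime and check that it
# matches exactly n dashes at the start of a longer run of dashes.
def anchored_repeat_ok?(n)
  re = Regexp.new("^-{#{n}}")
  m = re.match("-" * (n + 2))
  !m.nil? && m[0].size == n
end

# The counts mentioned in this post: last working, first failing, original.
[127, 128, 150].each do |n|
  puts "^-{#{n}}: #{anchored_repeat_ok?(n)}"
end
```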


--
Best regards
RNicz

rnicz wrote:
> /^-{128}/
>
> hangs interpreter. Is it a bug or not?

Fwiw, I can reproduce the hang in 1.8.2 but not in 1.9.0. Maybe it's something that Oniguruma fixes.

> I've tried to make following regexp
>
>      a=/^-{150}/

try this

uln% diff -u regex.c~ regex.c
--- regex.c~ 2004-10-27 04:46:51.000000000 +0200
+++ regex.c 2004-11-19 12:25:39.000000000 +0100
@@ -1011,8 +1011,8 @@
{
   int mcnt;
   int max = 0;
- char *p = start;
- char *pend = end;
+ unsigned char *p = start;
+ unsigned char *pend = end;
   char *must = 0;

   if (start == NULL) return 0;
uln%

uln% ./ruby -e 'a = "-" * 152; /^-{150}/ =~ a; p $&.size'
150
uln%
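
The thread does not say why switching to `unsigned char` helps, but the 127/128 boundary suggests a signed-byte problem: if a repeat count is stored in a byte and read back through a signed `char`, 128 comes out as -128. That reading is an assumption, not something the patch itself confirms. A small Ruby sketch of the wrap-around, using `pack`/`unpack` to reinterpret one byte (the helper names are illustrative):

```ruby
# Reinterpret a single byte value as a signed 8-bit integer.
# Pack directive "C" is unsigned 8-bit, "c" is signed 8-bit.
def as_signed_byte(n)
  [n].pack("C").unpack("c").first
end

# Same byte, read back as unsigned 8-bit.
def as_unsigned_byte(n)
  [n].pack("C").unpack("C").first
end

[127, 128, 150].each do |n|
  puts "#{n}: signed=#{as_signed_byte(n)} unsigned=#{as_unsigned_byte(n)}"
end
```

Read as signed, 127 stays 127 while 128 becomes -128 and 150 becomes -106, which would fit a loop or offset calculation going wrong exactly at 128.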

Guy Decoux

I can reproduce this problem in 1.8.1. I guess this is a bug in the GNU
engine.

This is showing off, but my own regexp engine can deal with /^-{128}/

bash-2.05b$ irb
irb(main):001:0> re = NewRegexp.new("^-{128}")
=> #<NewRegexp:0x403a50f0 @source="^-{128}", @scanner=#<Scanner:0x403a50b4
@root=#<Root:0x403a4e5c @number_of_captures=2,
@node=#<ScannerHierarchy::Capture:0x403a4de4
@succ=#<ScannerHierarchy::anchor:0x403a4d80 @anchor_type=:line_begin,
@succ=#<ScannerHierarchy::BeginMatch:0x403a4d44
@succ=#<ScannerHierarchy::RepeatGreedy:0x403a4d6c @max=128,
@succ=#<ScannerHierarchy::Capture:0x403a4df8
@succ=#<ScannerHierarchy::Last:0x403a4e34>, @register=1>, @min=128,
@index=nil, @pattern=#<ScannerHierarchy::Inside:0x403a4d1c @succ=EndPattern,
@set=#<RangeSet:0x403a4f24 @codepoints=[45]>>>>>, @register=0>,
@parser=+-Sequence
  +-Anchor line_begin
  +-Repeat greedy{128,128}
    +-Inside set="-">>>
irb(main):002:0> re.match(("-"*128)+"x")
=> #<NewMatchData:0x403fb068 @captures=,
@matched_string="--------------------------------------------------------------------------------------------------------------------------------",
@positions=[[0, 128]], @post_match="x",
@string="--------------------------------------------------------------------------------------------------------------------------------x",
@match_array=["--------------------------------------------------------------------------------------------------------------------------------"],
@pre_match="", @length=128, @offset=0>
irb(main):003:0> re.match(("-"*127)+"x")
=> nil
irb(main):004:0> puts re.tree
+-Sequence
  +-Anchor line_begin
  +-Repeat greedy{128,128}
    +-Inside set="-"
=> nil
irb(main):005:0>

(sorry for showing off)
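
To make the tree printed above a little more concrete, here is a rough sketch of how a sequence of an anchor node followed by a bounded repeat node could be evaluated. The `Anchor` and `Repeat` structs and `match_at` are invented for illustration and are not the API of the engine shown in the irb session; the sketch also skips backtracking and captures entirely.

```ruby
# Hypothetical node types, loosely mirroring the printed tree.
Anchor = Struct.new(:type)            # e.g. :line_begin
Repeat = Struct.new(:min, :max, :set) # greedy {min,max} over a character set

# Walk the node list from position pos.
# Returns the end offset of the match, or nil if any node fails.
def match_at(nodes, str, pos)
  nodes.each do |node|
    case node
    when Anchor
      # line_begin holds at offset 0 or right after a newline
      return nil unless pos == 0 || str[pos - 1] == "\n"
    when Repeat
      count = 0
      # Greedily consume up to max characters from the set
      while count < node.max && pos < str.size && node.set.include?(str[pos])
        pos += 1
        count += 1
      end
      return nil if count < node.min
    end
  end
  pos
end

tree = [Anchor.new(:line_begin), Repeat.new(128, 128, ["-"])]
p match_at(tree, ("-" * 128) + "x", 0)
p match_at(tree, ("-" * 127) + "x", 0)
```

With the two inputs from the irb session, the first call returns the end offset 128 and the second returns nil, in line with the match and non-match shown there.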


On Thursday 18 November 2004 22:20, Joel VanderWerf wrote:

> rnicz wrote:
>> /^-{128}/
>>
>> hangs interpreter. Is it a bug or not?
>
> Fwiw, I can reproduce the hang in 1.8.2 but not in 1.9.0. Maybe it's
> something that Oniguruma fixes.

--
Simon Strandgaard

Joel VanderWerf wrote:

> rnicz wrote:
>> /^-{128}/
>>
>> hangs interpreter. Is it a bug or not?
>
> Fwiw, I can reproduce the hang in 1.8.2 but not in 1.9.0. Maybe it's something that Oniguruma fixes.

I'm using 1.8.2 with Oniguruma, and it does not hang. I'm guessing it's something with the legacy regexp engine.

- Jamis


--
Jamis Buck
jgb3@email.byu.edu
http://www.jamisbuck.org/jamis

Joel VanderWerf wrote:
> Maybe it's something that Oniguruma fixes.

How could it be that I didn't know about Oniguruma? Thank you very much!


--
RNicz

That's great: the first answer after 4 minutes, and a patch within 12 hours.

Thank you rubyists!


--
RNicz