I’m doing something that requires RLE (run-length encoding), and the code
that I translated from Perl to do this includes the following regexp:
/^(.?)((.)\2{2,127})(.?)$/ois
# Later code using $1 (.?), $2 (…), $3 (.?)
Now, you’d think that this simply translates as:
/^(.?)((.)\2{2,127})(.?)$/m
# Later code using $1 (.?), $2 (…), $3 (.?)
However, it doesn’t work because of the way that Ruby numbers the
regexp capture groups. Groups are numbered by the position of their
opening parenthesis, so the backreferences built are $1, $2, $3, and
$4 – the (.) in the ((.)\2{2,127}) is treated as $3, and \2 no longer
points at that single character – whereas in Perl, it’s simply consumed
and ignored. Obviously, I can’t simply replace (.) with (?:.), because
I need the backreference within the regexp itself. Thus, the translated
version is:
/^(.?)((.)\3{2,127})(.?)$/m
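To make the numbering concrete, here is a minimal sketch of how the groups come out with the translated pattern. The sample string and the variable names are made up for illustration and are not from the original code; the point is only that the inner (.) occupies slot 3, so the trailing (.?) lands in slot 4.

```ruby
# Translated pattern from above; groups are numbered by opening parenthesis.
RLE_RUN = /^(.?)((.)\3{2,127})(.?)$/m

# Hypothetical sample input: a run of five "a" characters with one
# character on each side.
sample = "xaaaaay"
m = RLE_RUN.match(sample)

prefix   = m[1]  # first (.?)
run      = m[2]  # the whole run, ((.)\3{2,127})
run_char = m[3]  # the inner (.), i.e. the repeated character
suffix   = m[4]  # last (.?), which the later code refers to as $3

puts [prefix, run, run_char, suffix].inspect
# prints ["x", "aaaaa", "a", "y"]
```

So any later code that reads the trailing group would need to use $4 (or m[4]) rather than $3 with this version of the pattern.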
While what’s happening makes sense, I’m wondering if it’s correct –
how deeply should nested groups be counted as backreferences in their
own right?
-austin
– Austin Ziegler, austin@halostatue.ca on 2003.04.29 at 22:59:16