[RCR] Global Regexp Match Mechanism (//g)

I’m not sure I agree with you about the costs; however, I agree completely about the thread-unsafe behaviour. This is why I suggest in my RCR that the MatchData contain the position data. It’s not as flexible (read: “unsafe”) as the Perl version, but I think it will do the job. I will try to write more detail this weekend.

-a

austin ziegler
Sent from my Treo

Hi,

At Sat, 14 Dec 2002 06:18:49 +0900, Austin Ziegler wrote:

This is why I suggest in my RCR that the MatchData contain
the position data.

$~.end(0) ?


Nobu Nakada

“Austin Ziegler” austin@halostatue.ca writes:

I’m not sure I agree with you about the costs; however, I agree
completely about the thread-unsafe behaviour. This is why I suggest
in my RCR that the MatchData contain the position data. It’s not as
flexible (read: “unsafe”) as the Perl version, but I think it will
do the job. I will try to write more detail this weekend.

The MatchData already contains the character position of the end of
the match (MatchData#end). If Regexp#match accepts an index offset,
that position can be extracted from the MatchData and passed in.

Originally you had:

while md = /foo/g.match(foostr)
  puts md.to_s
  foostr = md.post_match
end

With Nobu’s patched version of Regexp#match, that loop becomes:

pos = 0
while md = /\Gfoo/.match(foostr, pos)
  puts md.to_s
  pos = md.end(0)   # MatchData#end needs a group index; 0 is the whole match
end

The inefficient “foostr = md.post_match” is gone.
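
For anyone who wants to try the idea, here is a minimal sketch, assuming a Ruby whose Regexp#match accepts a start position as its second argument (current Ruby versions do). The helper name all_matches is made up for illustration and is not part of Ruby or of Nobu's patch. It leaves out \G so that, like the original post_match loop, it finds matches anywhere after pos, and it steps past empty matches so the loop cannot spin forever.

```ruby
# all_matches is a hypothetical helper, not a core Ruby method.
# It walks the string by feeding each match's end offset back into
# Regexp#match, so the subject string is never sliced or reassigned.
def all_matches(pattern, str)
  found = []
  pos = 0
  while pos <= str.length && (md = pattern.match(str, pos))
    found << md
    # An empty match ends where it began; advance by one to make progress.
    pos = md.end(0) == md.begin(0) ? md.end(0) + 1 : md.end(0)
  end
  found
end

foostr = "foo bar fooo foo"
pairs = all_matches(/fo+/, foostr).map { |md| [md[0], md.begin(0)] }
pairs.each { |text, offset| puts "#{text} at #{offset}" }
```

Because each result is an ordinary MatchData, the position lives in the return value rather than in any shared state on the Regexp, which is the thread-safety point made earlier in the thread.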