The following regexp is supposed to chop off the last / of a string
and all characters following it, but it seems to be ignoring the
non-greedy indicator (?):
irb(main):001:0> "http://www.x.com/y/z.html".sub(%r|/.+?.html$|, '')
I expected the result to be “http://www.x.com/y”. I thought this
was a bug, but Perl produces the same result, so what am I missing?
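You’re missing the notion of a leftmost match. The regex engine reads
from left to right, so to speak, in looking for the ‘/’. It finds the
first one at the sixth character. From there it does what you asked:
namely, match as few characters as it can while still reaching
‘.html’ at the end of the line. The non-greedy ‘?’ only affects where
the match ends, not where it starts.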
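
To see the leftmost match in action, here is a small sketch using the pattern from the question. The variable names (url, lazy, m and so on) are just made up for illustration:

```ruby
url  = "http://www.x.com/y/z.html"
lazy = %r|/.+?.html$|          # the pattern from the question, unchanged

m = url.match(lazy)

match_start = m.begin(0)       # index where the match begins (0-based)
matched     = m[0]             # the text the whole pattern consumed
result      = url.sub(lazy, '')

puts match_start               # 5, i.e. the sixth character
puts matched                   # //www.x.com/y/z.html
puts result                    # http:
```

The match starts at the first ‘/’ and has to run all the way to the end of the string, because that is the only place ‘.html’ followed by end-of-line can be found from that starting point.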
To do what you were trying to do, try this:
That also finds the leftmost match – but in this case, the leftmost
match doesn’t start until the last ‘/’ (because none of the other
‘/’s, even though they’re further left, allow the rest of the match to
succeed).
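
The pattern from the reply isn't shown above, so the following is only a guess at one that behaves as described, not necessarily the one originally posted. It assumes the idea is to forbid further ‘/’ characters between the slash and ‘.html’; the names url, last_segment and trimmed are made up for illustration:

```ruby
url = "http://www.x.com/y/z.html"

# Assumed pattern: a '/', then one or more non-slash characters,
# then a literal ".html" at the end of the line.
last_segment = %r|/[^/]+\.html$|

trimmed     = url.sub(last_segment, '')
match_start = url.index(last_segment)   # where the leftmost match begins

puts trimmed                            # http://www.x.com/y
puts match_start == url.rindex("/")     # true
```

Every earlier ‘/’ fails because [^/]+ cannot cross the next slash, so the first place the whole pattern can succeed is the last ‘/’.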
On Tue, 13 Aug 2002, Tom Robinson wrote:
David Alan Black