Hi,
I have some more regex questions. I wrote a pattern to check for valid regexes and inspect their parts (we all have our reasons for the things we do :). It wasn't working, so I reduced it to simpler and simpler patterns, and I'm a bit surprised at how Ruby 1.9 handles them. I tested the same pattern in Perl and got the answers I'd expect.
Is this down to my having used Perl regexes for so long, or am I missing something about Ruby's implementation? It appears that a ^ at the start of the pattern doesn't bind as strongly as I'd expect.
I believe this match should fail, since the ^ should bind <delim> to the first character of the string. The result is also a bit crazy: if <delim> really is "d", shouldn't the overall match be "d\\d", following the route the engine has chosen?
$ ruby -e '
md = /^(?<mors>m)?(?<delim>.)(?<pat>.+?)\g<delim>/.match( %q!/\d\d\\d! )
puts md.inspect
'
#<MatchData "/\\d" mors:nil delim:"d" pat:"\\">
Here I add a trailing slash to the string, which (I believe) should give me back what's between the / / :
$ ruby -e '
md = /^(?<mors>m)?(?<delim>.)(?<pat>.+?)\g<delim>/.match( %q!/\d\d\\d/! )
puts md.inspect
'
#<MatchData "/\\d" mors:nil delim:"d" pat:"\\">
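One way to narrow this down is to look at where the match and each capture start, rather than only at the captured text. The following is only a diagnostic sketch using MatchData#begin on the same pattern and string; it assumes nothing about why the captures come out the way they do.

```ruby
# Diagnostic sketch: report the start offset of the whole match and of each
# named capture, for the same pattern and string as above.
re  = /^(?<mors>m)?(?<delim>.)(?<pat>.+?)\g<delim>/
str = %q!/\d\d\\d/!   # the 8 characters  / \ d \ d \ d /

md = re.match(str)

puts "match: #{md[0].inspect} starts at offset #{md.begin(0)}"
md.names.each do |name|
  # begin(name) is nil for a group that took no part in the match (mors here).
  puts "#{name}: #{md[name].inspect} starts at offset #{md.begin(name).inspect}"
end
```

If the overall match starts at offset 0, then the ^ is holding, and the surprise lies in what <delim> ends up containing rather than in where the match begins.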
Here's the first string in Perl 5.12:
$ perl -e '
if ( q(/\d\d\\d) =~ /^(?<mors>m)?(?<delim>.)(?<pat>.+?)\g{delim}/ ) {
while ( my ($key, $value) = each(%+) ) {
print "$key => $value\n";
}
}
'
<nothing here, what I'd expect>
And here it is with the "valid" string:
$ perl -e '
if ( q(/\d\d\\d/) =~ /^(?<mors>m)?(?<delim>.)(?<pat>.+?)\g{delim}/ ) {
while ( my ($key, $value) = each(%+) ) {
print "$key => $value\n";
}
}
'
pat => \d\d\d
delim => /
These are the answers I'd expect.
Even this seems unexpected to me. If I remove <mors>, then surely the ^ should bind <delim> to the beginning of the string?
$ ruby -e '
md = /^(?<delim>.)(?<pat>.+?)\g<delim>/.match( %q!/\d\d\\d/! )
puts md.inspect
'
#<MatchData "/\\d" delim:"d" pat:"\\">
These work as I'd expect once I add the end-of-line anchor $ :
$ ruby -e '
md = /^(?<delim>.)(?<pat>.+?)\g<delim>$/.match( %q!/\d\d\\d/! )
puts md.inspect
'
#<MatchData "/\\d\\d\\d/" delim:"/" pat:"\\d\\d\\d">
$ ruby -e '
md = /^(?<mors>m)?(?<delim>.)(?<pat>.+?)\g<delim>$/.match( %q!/\d\d\\d/! )
puts md.inspect
'
#<MatchData "/\\d\\d\\d/" mors:nil delim:"/" pat:"\\d\\d\\d">
And finally, if I remove the caret but leave the $, I get the answer I'd expect (or at least the one I'm looking for):
$ ruby -e '
md = /(?<mors>m)?(?<delim>.)(?<pat>.+?)\g<delim>$/.match( %q!/\d\d\\d/! )
puts md.inspect
'
#<MatchData "/\\d\\d\\d/" mors:nil delim:"/" pat:"\\d\\d\\d">
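For comparison, Ruby's regex syntax also has \k<name>. Its documentation describes \k<name> as the named backreference (match again the text the group captured) and \g<name> as a subexpression call (run the group's pattern again). Assuming \k<delim> is the closer counterpart to Perl's \g{delim}, here is a minimal sketch with the same two test strings:

```ruby
# Sketch: the same pattern with \k<delim> in place of \g<delim>.
# Assumption: \k<delim> (named backreference) is the Ruby counterpart
# of Perl's \g{delim}.
re = /^(?<mors>m)?(?<delim>.)(?<pat>.+?)\k<delim>/

no_closing   = %q!/\d\d\\d!    # no trailing delimiter
with_closing = %q!/\d\d\\d/!   # the "valid" string

md_open   = re.match(no_closing)
md_closed = re.match(with_closing)

puts md_open.inspect
puts md_closed.inspect
```

With a true backreference, the first string should not match at all (nil), and the second should give delim "/" with pat holding everything between the slashes, which is in line with the Perl results above.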
Regards,
Iain