Actually, using parentheses here does not affect what is matched, only what is saved. Even with the parens, each time the quantifier repeats it re-matches the character class, so the group ends up holding only the last whitespace character it matched. Either that, or I'm misinterpreting the results below:
" \t\nHello".match( /^\s*(\w+)/ ) #=> "Hello" , nil
" \t\nHello".match( /^(\s)*(\w+)/ ) #=> "\n" , "Hello"
" \t\nHello".match( /^(\s*)(\w+)/ ) #=> " \t\n" , "Hello"
"\t Hello".match( /^\s*(\w+)/ ) #=> "Hello" , nil
"\t Hello".match( /^(\s)*(\w+)/ ) #=> " " , "Hello"
"\t Hello".match( /^(\s*)(\w+)/ ) #=> "\t " , "Hello"
"\n \tHello".match( /^\s*(\w+)/ ) #=> "Hello" , nil
"\n \tHello".match( /^(\s)*(\w+)/ ) #=> "\t" , "Hello"
"\n \tHello".match( /^(\s*)(\w+)/ ) #=> "\n \t" , "Hello"
(Results are the first two saved subexpressions of the match.)
# Test strings, each with a different run of leading whitespace
strings = [
  " \t\nHello",
  "\t Hello",
  "\n \tHello"
]
# No group, group inside the quantifier, group around the quantifier
patterns = [
  /^\s*(\w+)/,
  /^(\s)*(\w+)/,
  /^(\s*)(\w+)/
]
strings.each do |str|
  patterns.each do |re|
    if match = str.match( re )
      # Show the string, the pattern, and the first two saved groups
      info = [ str.inspect, re.inspect, match[1].inspect, match[2].inspect ]
      puts "%s.match( %-14s ) #=> %-8s, %-5s" % info
    end
  end
end
puts "\n(Results are the first two saved subexpressions of the match.)"
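
A further check along the same lines, as a minimal sketch: if the parentheses really only change what is saved, then match[0] (the whole match) should be identical for every variant, and a non-capturing group (?:\s)* should behave like the bare \s*. The names below (str, patterns, whole_matches, captures) are made up for illustration, and \A is swapped in for ^ so the match is pinned to the start of the string (in Ruby, ^ also matches after any newline).

```ruby
# Illustrative sample string; any string with leading whitespace would do.
str = "\t Hello"

# Hypothetical labels for the variants being compared.
patterns = {
  plain:      /\A\s*(\w+)/,     # quantified class, no group around it
  per_char:   /\A(\s)*(\w+)/,   # group inside the quantifier
  whole_run:  /\A(\s*)(\w+)/,   # group around the quantifier
  no_capture: /\A(?:\s)*(\w+)/  # grouping without saving anything
}

# match[0] is the full text consumed by the pattern.
whole_matches = patterns.transform_values { |re| str.match(re)[0] }
# captures lists only the saved groups, in order.
captures = patterns.transform_values { |re| str.match(re).captures }

whole_matches.each { |name, text| puts "#{name}: whole match #{text.inspect}" }
captures.each { |name, caps| puts "#{name}: captures #{caps.inspect}" }
```

If I'm reading this right, all four patterns consume the same text, and (\s)* saves only the whitespace character from its final repetition, while (?:\s)* saves nothing for that position at all.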
On Aug 24, 2005, at 6:58 AM, Julian Leviston wrote:
I'm not sure if someone's already answered this, but...
putting parentheses around things groups them... and it's treated as though it's a single regexp...
so:
/\s*/ means match a space, zero or more times to the extent of the contiguous spaces...
but
/(\s)*/ means "match a space, zero or more times to the extent of THIS CONTIGUOUS MATCH". It first matches zero spaces, then the limit of the zero spaces is ... (funnily enough) zero spaces, so it doesn't go any further. You don't want to use parentheses.