Hi,
I have some questions about the correct meaning of *, + and ? in regular
expressions, and I would appreciate some clarification.
I have an example (derived from Programming Ruby, 2nd Edition) and I
don't understand why it gives these results. Here is the code:
def show_regexp(a, re)
  if a =~ re
    puts "#{$`}<<#{$&}>>#{$'}"
  else
    puts "no match"
  end
end
show_regexp('Example1', /\s*/)
show_regexp('Example2', /\s.*/)
show_regexp('Example3 ', /\s.?/) # Space at the end of string
show_regexp('Example4 ', /\s.+/) # Space at the end of string
show_regexp('Example5 ', /\s.*/) # Space at the end of string
output gives:
<<>>Example1
no match
Example3<< >>
no match
Example5<< >>
If I understand correctly:
* means: match zero or more occurrences of the preceding expression.
+ means: match one or more occurrences of the preceding expression.
? means: match zero or one occurrence of the preceding expression.
Why does Example2 give "no match"? I read the pattern as: find "0 or more
occurrences" of (a space followed by any character).
Why does Example4 give "no match"? I read the pattern as: find "1 or more
occurrences" of (a space followed by any character).
I am assuming that a . can match the null character.
Am I correct?
Best Regards
--
Posted via http://www.ruby-forum.com/.
A dot (.) can only match an actual character (by default, any character
except a newline); it never matches "nothing". Example 2 fails because
the pattern isn't looking for "0 or more occurrences of (a space followed
by any character)", but for "a whitespace character followed by 0 or more
characters", and 'Example2' contains no whitespace at all. The * only
applies to whatever immediately precedes it, not the whole expression...
unless the expression's enclosed in parentheses. A regex for "0 or more
occurrences of (a space followed by any character)" would be /(\s.)*/. In
that case, the * applies to the parenthesized group of whitespace and dot.
Example 4 fails because the only space isn't followed by anything at all,
and .+ needs at least one character.
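To make the scope of the quantifier concrete, here is a small sketch that runs the patterns from the question side by side. It is only an illustration: match_text is a hypothetical helper (not from the book or this thread) that returns the matched text, or nil when nothing matches.

```ruby
# Hypothetical helper for illustration: returns the matched text or nil.
def match_text(str, re)
  m = re.match(str)
  m ? m[0] : nil
end

no_space = 'Example2'
trailing = 'Example4 '   # one space at the end of the string

# \s must match exactly one whitespace character; the quantifier
# after the dot applies to the dot only.
p match_text(no_space, /\s.*/)   # => nil  (no whitespace anywhere)
p match_text(trailing, /\s.*/)   # => " "  (space, then zero characters)
p match_text(trailing, /\s.?/)   # => " "  (space, then zero characters)
p match_text(trailing, /\s.+/)   # => nil  (nothing left for .+ to match)

# With parentheses, the * covers the whole "whitespace + character" pair,
# so zero repetitions succeed with an empty match at the start.
p match_text(no_space, /(\s.)*/) # => ""
```

Note that the grouped pattern always matches something, even if only the empty string, which is rarely what you want in practice.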
HTH,
Chris
P.S. I strongly recommend Jeffrey Friedl's Mastering Regular Expressions.
(In reply to the message from Carlos Ortega <caof2005@yahoo.com>, Dec 30, 2007 11:25 PM.)
Thanks a lot, Chris. I think I've got it now, but I still have a doubt
about how to interpret this:
show_regexp('hi hi hihihi hi hi', /\s.*?\s/)
My confusion arises when two special characters appear together.
This last pattern would be:
-Match a space
-Followed by 0 or more characters
-Followed by ..... <= Here is my doubt
-Ending with a space.
Again, I would appreciate your help on this.
Regards
Carlos
(In reply to Chris Shea's message above.)
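Normally "*" is "greedy": it matches as many characters as it can while
still letting the rest of the pattern match. When it is followed by "?"
(as in ".*?") it becomes "lazy" and matches as few characters as
possible, so the match ends at the first place where the rest of the
pattern can succeed.
"Hello world, from ruby".match(/.*?\s+/)[0]
# => "Hello "
"Hello world, from ruby".match(/.*\s+/)[0]
# => "Hello world, from "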
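Applied to the string from the follow-up question, the same contrast can be sketched as below. This is just an illustration using the standard MatchData methods pre_match and post_match; the variable names are mine.

```ruby
str = 'hi hi hihihi hi hi'

lazy   = str.match(/\s.*?\s/)  # .*? stops at the first following whitespace
greedy = str.match(/\s.*\s/)   # .* runs on to the last whitespace

p lazy[0]    # => " hi "
p greedy[0]  # => " hi hihihi hi "

# Same layout that show_regexp prints: before<<match>>after
puts "#{lazy.pre_match}<<#{lazy[0]}>>#{lazy.post_match}"
# prints: hi<< hi >>hihihi hi hi
```

So in /\s.*?\s/ the "?" is not a separate "0 or 1" step; it modifies the "*" before it and makes that repetition lazy.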
Regards,
Jordan
(In reply to Carlos Ortega's message of Dec 31, 1:24 am.)