RichardOnRails wrote:
> I forgot to tell you that I finally understand your second example.
>> md = s.match( /^ (.*) [a-zA-Z] /x )
>> md[1] # => "2.1Topi"
> Without the question mark, in principle, the ".*" initially consumes
> all the characters, but then it sees the match fails, because there's
> no match for the "[a-zA-Z]". So the ".*" sort of "backs off" and
> satisfies itself with "2.1Topi", leaving the "c" to satisfy
> "[a-zA-Z]".
Very good, Richard!
It is a question of when one is 'content'; if you need a metaphor to
remember it, think of a Wall Street banker (.*, .+) vs a Franciscan monk
(.*?, .+?).
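To make the metaphor concrete, here is a small sketch. It assumes s is the "2.1Topic 2.1" string from your later example; the names greedy and lazy are only for illustration.

```ruby
s = "2.1Topic 2.1"   # assumed sample string

# Greedy: .* grabs everything, then backs off until a letter can follow.
greedy = s.match(/^ (.*) [a-zA-Z] /x)

# Lazy: .*? takes as little as possible and stops before the first letter.
lazy = s.match(/^ (.*?) [a-zA-Z] /x)

p greedy[1]  # => "2.1Topi"
p lazy[1]    # => "2.1"
```

The banker gives back only the final "c"; the monk stops as soon as the "T" can satisfy [a-zA-Z].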

> The one I settled on is:
> s="2.1Topic 2.1"
> md = s.match( /^ ([\.\d]*) [^\.\d] /x )
> #md[0]=2.1T
> #md[1]=2.1
I see that you have solved the problem in your previous post (that I
could not reply to), when you wrote (removing all other code):
s = "2.002.1Topic 2.2.1"
s =~ /^ (\d+[.]?)+ [^\.\d] /x
I must confess: I was stunned myself that it did not work. Foolish of
us: in fact it was working, but you failed to collect the bounty! You
needed an extra pair of parentheses that includes the '+':
s =~ /^ ((\d+[.]?)+) [^\.\d] /x
p $1, $2 # => "2.002.1", "1"
However, it is better not to capture the inner results as well: they
overwrite each other in $2 and then confuse us (that is why you saw
only the last digit captured above). So let's use the '?:' trick to
avoid writing to $2; instead, we will use $2 to capture the 'non
digits/dots' that come after:
s =~ /^ ((?:\d+[.]?)*) ([^\.\d]+) /x
p $1, $2 # => "2.002.1", "Topic "
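A minimal sketch of why the inner group ends up holding only "1": a capture group inside a repetition keeps just its last iteration. The patterns are the ones above; the variable names are hypothetical.

```ruby
s = "2.002.1Topic 2.2.1"

# Capturing inner group: each pass of the + overwrites group 2.
with_inner = s.match(/^ ((\d+[.]?)+) [^\.\d] /x)
p with_inner[1]  # => "2.002.1"
p with_inner[2]  # => "1"

# Non-capturing inner group: group 2 is free for the text that follows.
without_inner = s.match(/^ ((?:\d+[.]?)*) ([^\.\d]+) /x)
p without_inner[1]  # => "2.002.1"
p without_inner[2]  # => "Topic "
```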
Do you see it? I think you do. Now, to finish, let's examine how you
solved the problem in this post:
> s="2.1Topic 2.1"
> md = s.match( /^ ([.\d]*) [^\.\d] /x )
Ah, you resorted to 'pragmatism'. You said: "the bloody '(\d+[.]?)+'
does not work, so I will change it". This was ok, but do you see the
difference between:
((?:\d+[.]?)*) # I changed + -> * to compare
([\d[.]]*)
aside from the second one being easier to read? (You may want to stop
reading and think about this, as it is your test to graduate from
"intermediate level regexp".)
Ok: if they could speak, they would say respectively:
1) I want 0 or more sequences of (digits followed optionally by a dot)
2) I want 0 or more combinations of digits and dots as they come
Do you see?
Both would match "2.002.1", but the 2nd would also match "...1..37"!
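A quick way to check this, as a sketch: anchor both patterns with \A and \z so that each must cover the whole string. The anchors and the names seq and any_mix are my additions, and the second class is written as [\d.], which I am assuming is what ([\d[.]]*) is meant to express.

```ruby
seq     = /\A (?:\d+[.]?)* \z/x   # runs of digits, each optionally followed by a dot
any_mix = /\A [\d.]* \z/x         # digits and dots in any order

["2.002.1", "...1..37"].each do |str|
  puts "#{str.inspect}: seq=#{seq.match?(str)} any_mix=#{any_mix.match?(str)}"
end
```

Only any_mix accepts "...1..37", because seq insists that every dot be preceded by at least one digit.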
The last question you had was: how do I pick up the digits once I
collected the "2.002.1"? Study scan in Pickaxe and then do:
str = "2.002.1"
str.scan(/ (\d+) /x) # => [["2"], ["002"], ["1"]]
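If the numbers themselves are wanted, one possible follow-up (a sketch; the name levels is hypothetical): without a group, scan returns a flat array of strings, which can then be converted to integers.

```ruby
str = "2.002.1"

p str.scan(/(\d+)/)  # => [["2"], ["002"], ["1"]]  (one sub-array per match, because of the group)
p str.scan(/\d+/)    # => ["2", "002", "1"]        (no group: flat array)

levels = str.scan(/\d+/).map { |digits| digits.to_i }
p levels             # => [2, 2, 1]
```

Note that to_i drops the leading zeros of "002", so keep the strings if the original spelling matters.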
All right, let's call it a Regexp day,
Raul
--
Posted via http://www.ruby-forum.com/.
Hi Raul,
Thank you for your further support of my obstinacy. Your help has
guided me to the solution I wanted. Your original one is succinct,
perhaps even elegant in that it decomposes the problem into two sub-
problems which admit of essentially one-line solutions. While I truly
appreciate that approach, I wanted to find a "natural" solution,
which is the one included below. It has one caveat: it's aimed at
processing files of only a few megabytes. That said, I'd be pleased
to hear of any downsides you may foresee.
>> I forgot to tell you that I finally understand your second example.
[snip]
> Very good, Richard!
> It is a question of when one is 'content'; if you need a metaphor ...
Thanks. I've got that stuff wired into my brain now.
> I must confess: I was stunned myself that it did not work; foolish of
> us, in fact it was working, but you failed to collect the bounty! you
> needed parentheses to include the '+'!
That approach is old news, now that I've conceived of my "natural"
approach.
> However it is better to avoid collecting also the inner results as they
> overwrite each other in $2 and then confuse us
Understood! As you'll see, I avoided that pitfall below.
[snip]
> Do you see it? I think you do.
Quite so.
[snip]
> This was ok, but do you see the
> difference between:
> ((?:\d+[.]?)*) # I changed + -> * to compare
> ([\d[.]]*)
[snip]
> Do you see?
For sure!
> All right, let's call it a Regexp day,
I'll drink to that!
With Thanks and Best Wishes, I remain
Yours truly,
Richard
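# "Natural" Solution
input = <<DATA
05Topic 05
1.0Topic 1.0
2.002.1Topic 2.2.1
3.15.26.37Topic 3.15.26.37
DATA

MaxDepth = 5   # maximum number of numeric levels recognized

# One "digits, optional dot" group per level, then the topic text.
sRE = '^' + ' (\d*)(?:\.?)' * MaxDepth
sRE += ' ([^\.\d].*)'
re = Regexp.new(sRE, Regexp::EXTENDED)

# String#each was removed in Ruby 1.9; each_line iterates line by line.
input.each_line { |line|
  puts '='*10
  puts line
  puts '='*10
  # puts re.to_s # Debug
  md = line.match( re )
  next unless md   # skip lines that do not match
  (0..MaxDepth+1).each { |i|
    puts "md[#{i}] = " + md[i] if md[i]
  }
  puts
}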
On Nov 26, 8:36 pm, Raul Parolari <raulparol...@gmail.com> wrote: