i give up. there seems to be no way to get all the captures for a
group. the corresponding $ variable just has the last one. thanks to
everyone who responded. sorry, did not mean to start a war over
people's coding styles.
if i wanted to parse a list of letters separated by spaces and commas:
'a , b,c' =~ /^(?:(\w)\s*,\s*)*(\w)$/
i need to get ['a','b'] in group 1 and ['c'] in group 2. yes, i know i
can split, then massage the result some more and get the final result.
is there a way to get to groups' captures after a regex match? like in
microsoft's .net?
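
To make the problem concrete, here is a minimal sketch of what Ruby hands back for the pattern above. The variable names are only for illustration. A group inside a repetition is matched several times, but the MatchData keeps only its last iteration:

```ruby
str = 'a , b,c'
md = /^(?:(\w)\s*,\s*)*(\w)$/.match(str)

# Group 1 matched twice ('a', then 'b'), but only the final
# iteration survives in the MatchData.
last_of_group1 = md[1]
group2         = md[2]
all_captures   = md.captures

p all_captures   # prints ["b", "c"]
```

So the 'a' is gone by the time the match returns, which is the difference from .NET's per-group capture list that the question is about.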
Have to admit I'm not exactly a regex wiz, but I imagine it can be done somehow. I assume you mean having a repeated capturing group append to an array any number of times?
But I still think scan is a good tool for the job; it can take any regexp anyway. I don't think a single regexp is really intended for a variable number of captures (?).
> if i wanted to parse a list of letters separated by spaces and commas:
> 'a , b,c' =~ /^(?:(\w)\s*,\s*)*(\w)$/
> i need to get ['a','b'] in group 1 and ['c'] in group 2. yes, i know i
> can split, then massage the result some more and get the final result.
> is there a way to get to groups' captures after a regex match? like in
> microsoft's .net?
I don't really get what you mean. I don't understand the rules that got a and b into one group and c into another. When you say it's a general question, do you mean you just want access to the captures from some regexp match?
On Wed, 14 Dec 2005 21:59:27 -0000, ako... <akonsu@gmail.com> wrote:
thank you. the question is general.
if i wanted to parse a list of letters separated by spaces and commas:
'a , b,c' =~ /^(?:(\w)\s*,\s*)*(\w)$/
i need to get ['a','b'] in group 1 and ['c'] in group 2. yes, i know i
can split, then massage the result some more and get the final result.
is there a way to get to groups' captures after a regex match? like in
microsoft's .net?
--
Ross Bamford - rosco@roscopeco.remove.co.uk
"\e[1;31mL"
> if i wanted to parse a list of letters separated by spaces and commas:
> 'a , b,c' =~ /^(?:(\w)\s*,\s*)*(\w)$/
> i need to get ['a','b'] in group 1 and ['c'] in group 2. yes, i know i
> can split, then massage the result some more and get the final result.
> is there a way to get to groups' captures after a regex match? like in
> microsoft's .net?
t = 'a , b,c'.split( /\s*,\s*/ )
group1 = t[0..-2]
group2 = t[-1,1]
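
The split above does not check that the string is well formed. A possible variant, assuming we want nil back for malformed input, validates first and then splits. The method name `split_letter_groups` is made up for this sketch:

```ruby
# Hypothetical helper: returns [group1, group2], or nil when the
# string is not a comma-separated list of single word characters.
def split_letter_groups(str)
  # Validate the overall shape before splitting.
  return nil unless str =~ /\A\w(?:\s*,\s*\w)*\z/

  t = str.split(/\s*,\s*/)
  [t[0..-2], t[-1, 1]]
end

group1, group2 = split_letter_groups('a , b,c')
```

With a single letter such as 'c', group1 comes back as an empty array and group2 as ['c'].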
scan does not help because it can match a portion of the source string,
and what is in between the matches is skipped. so scan is just a
special case of the functionality that i was looking for. i need to
make sure the whole string has a defined structure and get parts of it
as groups.
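
For what it's worth, one way to get "whole string must have a defined structure" without scan's skipping is StringScanner from the standard library, which only matches at the current position. This is just a sketch and not something suggested earlier in the thread; the method name `letters_strict` is invented here:

```ruby
require 'strscan'

# Hypothetical helper: returns every letter in order, or nil as soon
# as any part of the string fails to fit "letter (, letter)*".
def letters_strict(str)
  ss = StringScanner.new(str)
  letters = []
  loop do
    letter = ss.scan(/\w/)        # must match right at the current position
    return nil unless letter
    letters << letter
    break if ss.eos?
    return nil unless ss.scan(/\s*,\s*/)   # separator, also anchored
  end
  letters
end

strict = letters_strict('a , b,c')
loose  = 'a ; b'.scan(/\w/)       # plain scan silently skips the ';'
```

Here letters_strict('a ; b') returns nil, while the plain scan on the same string still yields both letters.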
You should be able to tell who this message is meant for:
PLEASE stop sending out code that uses any of the perl ${x} variables ...
They are ugly and have no place in Ruby ... they are only provided to
make the transition of Perl people easier ...
Please teach people to use MatchData objects ...
my_regex = /(\w\s*?.\s*?\w)\s*?.\s*?(\w)/
matches = my_regex.match( "a , b,c" )
element 0 of the matches object will contain the complete matched string.
each element after that will map to one of the groups you defined ...
so:
matches[0] will be the whole string
"a , b,c"
matches[1] will be your first group
"a , b"
matches[2] will be your second group
"c"
... seriously, we're not helping people make cleaner code when we show
approval for the ugly/evil ${x} warts we've kept from Perl.
... show people the beauty and cleanliness of using an OOP solution ...
I hope you agree.
j.
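
A small sketch of the MatchData calls described above, for anyone who wants to try them. It uses a literal comma in the pattern (as in Ross's irb session) instead of the `.` wildcard; the variable names are only illustrative:

```ruby
pattern = /(\w\s*,\s*\w)\s*,\s*(\w)/
md = pattern.match("a , b,c")

whole  = md[0]          # the complete matched text
first  = md[1]          # first group
second = md[2]          # second group
groups = md.captures    # all groups as an array, without md[0]
span   = md.offset(2)   # [start, end] character offsets of group 2
```

Note that this still gives one value per group, so it does not by itself answer the repeated-group question.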
On 12/14/05, Ross Bamford <rosco@roscopeco.remove.co.uk> wrote:
On Wed, 14 Dec 2005 21:59:27 -0000, ako... <akonsu@gmail.com> wrote:
> thank you. the question is general.
>
> if i wanted to parse a list of letters separated by spaces and commas:
>
> 'a , b,c' =~ /^(?:(\w)\s*,\s*)*(\w)$/
>
> i need to get ['a','b'] in group 1 and ['c'] in group 2. yes, i know i
> can split, then massage the result some more and get the final result.
> is there a way to get to groups' captures after a regex match? like in
> microsoft's .net?
>
I don't really get what you mean. I don't understand the rules that got a
and b into one group and c into another. When you say it's a general
question, do you mean you just want access to the captures from some
regexp match?
> scan does not help because it can match a portion of the source string,
> and what is in between the matches is skipped. so scan is just a
> special case of the functionality that i was looking for. i need to
> make sure the whole string has a defined structure and get parts of it
> as groups.
Ah, OK thanks. From your earlier post:
> if i wanted to parse a list of letters separated by spaces and commas:
> 'a , b,c' =~ /^(?:(\w)\s*,\s*)*(\w)$/
> i need to get ['a','b'] in group 1 and ['c'] in group 2.
Since we first verified the whole string conforms to the required
pattern, we can then safely perform the scan on the captured group
to obtain the individual matches.
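
The code for this verify-then-scan step is not shown in this copy of the thread, so the following is only a guess at what such an approach could look like. The method name `capture_groups` and the exact pattern are assumptions:

```ruby
# Hypothetical helper: one anchored match validates the whole string
# and captures the repeated part as a single group; a scan restricted
# to that group then recovers the individual letters.
def capture_groups(str)
  md = /\A((?:\w\s*,\s*)*)(\w)\z/.match(str)
  return nil unless md

  # md[1] holds only letters, commas and whitespace, so scanning it
  # for \w cannot pick up anything that was not validated.
  [md[1].scan(/\w/), [md[2]]]
end

first_matches, last_match = capture_groups('a , b,c')
```

Because the scan runs only on text the anchored match already accepted, nothing unvalidated can slip through between the letters.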
Or we could write the scan using look-ahead assertions, as another
way to prevent the skipping of in-between parts:
str = 'a , b,c'
# first verify whole pattern matches, and get final match group
if str =~ /^(?:\w\s*,\s*)*(\w)$/
  last_match = $1
  first_matches = str.scan(/(?:(\w)\s*,\s*)(?=(?:\w\s*,\s*)*\w$)/).flatten
end
Regular expressions is the only area I still use Perl magic variables
because it's concise, readable, and works well in that context. It feels
like a regexp standard to me.
The other magic variables I've dispensed with.
Nick
On 12/14/05, Jeff Wood <jeff.darklight@gmail.com> wrote:
On 12/14/05, Ross Bamford <rosco@roscopeco.remove.co.uk> wrote:
> On Wed, 14 Dec 2005 21:59:27 -0000, ako... <akonsu@gmail.com> wrote:
>
> > thank you. the question is general.
> >
> > if i wanted to parse a list of letters separated by spaces and commas:
> >
> > 'a , b,c' =~ /^(?:(\w)\s*,\s*)*(\w)$/
> >
> > i need to get ['a','b'] in group 1 and ['c'] in group 2. yes, i know i
> > can split, then massage the result some more and get the final result.
> > is there a way to get to groups' captures after a regex match? like in
> > microsoft's .net?
> >
>
> I don't really get what you mean. I don't understand the rules that got a
> and b into one group and c into another. When you say it's a general
> question, do you mean you just want access to the captures from some
> regexp match?
>
> irb(main):009:0> "a , b,c" =~ /(\w\s*?,\s*?\w)\s*?,\s*?(\w)/
> => 0
> irb(main):010:0> $1
> => "a , b"
> irb(main):011:0> $2
> => "c"
> irb(main):012:0> $~[1]
> => "a , b"
> irb(main):013:0> $~[2]
> => "c"
> irb(main):014:0> md = /(\w\s*?,\s*?\w)\s*?,\s*?(\w)/.match("a, b,c")
> => #<MatchData:0xb7a47860>
> irb(main):015:0> md[1]
> => "a, b"
> irb(main):016:0> md.captures[1]
> => "c"
> irb(main):017:0> $~.inspect
> => "#<MatchData:0xb7a47860>"
>
> (and others...)
>
> Hope that helps,
> Ross
>
> --
> Ross Bamford - rosco@roscopeco.remove.co.uk
> "\e[1;31mL"
>
>
> PLEASE stop sending out code that uses any of the perl ${x} variables ...
> They are ugly and have no place in Ruby ... they are only provided to
> make the transition of Perl people easier ...
Thankfully, this is Ruby, and not Python with its rigid
Only One Way mentality.
Myself, though I've been aware of MatchData for going on five years now, I find I don't use it that often. The
$1..$n variables are perfectly legible to me. They have
a fine history too: not just Perl but awk, and Unix shell
programming . . .
thank you. yes, it seems to be the only way. just that it is a shame
that we have to match the same expression again! the information was
available already, it was just discarded during the first match in your
sample.
> You should be able to tell who this message is meant for:
Yes, I recognize that you are probably speaking at least in part to me, since I did that in this very thread. You can call me by name if you like. I'm a big boy and I can take it.
> PLEASE stop sending out code that uses any of the perl ${x} variables ...
Hang on there Mr. Code Police. Let's not lay down the law too heavily before we get into this...
> They are ugly and have no place in Ruby ... they are only provided to
> make the transition of Perl people easier ...
I seriously doubt those variables were invented in Perl. They are a common feature of many regular expression implementations, and I'm not sure they are even that ugly. $1 holds what was grabbed by the first set of parentheses. Fairly logical.
> Please teach people to use MatchData objects ...
I also showed a MatchData example.
I've used them a time or two, but honestly, they just don't feel right to me. I've stopped using the default variable, I'm using a two-space tab, etc. I'm Ruby assimilated, but I just like the Regexp-linked variables.
I see a lot of code running the Ruby Quiz and I feel quite confident saying that the Regexp variables are far more common than MatchData. I don't think that says anything bad about the latter, but it does tell me that you are in the minority.
We won't yell at you for using MatchData, if you'll provide the same consideration...