I have a HTML document containing many table cells of which I wish to
extract the contents.
r = Regexp.new("\<TD.*?\>(.*?)\<\TD\>", Regexp::MULTILINE)
m = r.match(table)
The above only matches the first cell, I'd like to continue the match
finding each subsequent cell. The not-very-nice way to do this is to
take the char offset of the match, create a new string from that point
and feed it back into match.
I have a HTML document containing many table cells of which I wish to
extract the contents.
r = Regexp.new("\<TD.*?\>(.*?)\<\TD\>", Regexp::MULTILINE)
m = r.match(table)
The above only matches the first cell, I'd like to continue the match
finding each subsequent cell. The not-very-nice way to do this is to
take the char offset of the match, create a new string from that point
and feed it back into match.
r = Regexp.new("\<TD.*?\>(.*?)\<\TD\>", Regexp::MULTILINE)
m = r.match(table)
The above only matches the first cell, I'd like to continue the match
finding each subsequent cell. The not-very-nice way to do this is to
take the char offset of the match, create a new string from that point
and feed it back into match.
Whats the better way?
Using String#scan, as previously suggested, is the easiest method.
Another way of doing it is to use a loop while m is non-nil and match
against m.post_match on each iteration. See the documentation of the
MatchData class for more information,
nikolai
···
--
Nikolai Weibull: now available free of charge at http://bitwi.se/\!
Born in Chicago, IL USA; currently residing in Gothenburg, Sweden.
main(){printf(&linux["\021%six\012\0"],(linux)["have"]+"fun"-97);}