Regular expression question

I’m sure this is trivial, but I don’t have my Mastering Regular Expressions handy (and I haven’t put sufficient effort into getting through it!).

I have a string that looks something like …

>>"Guido van Rossum", "Larry Wall", "Matz"<<

(i.e., it’s the set of quoted strings between the >> and << markers). What I’d like to do is split the string up into an array containing the three quoted items (with or without the quote marks).

At some point, I’d probably also like to use something like \" to represent a quote within a string.

I’m sorry to say that I’m actually having to do this in Java (it’s a work thing), so I’ll have to do some mangling of whatever the “correct” regex is, but that’s OK … well, it’s not, but I’ll have to live with it :-).

Cheers,

H.

Hi –

On Thu, 21 Aug 2003, Harry Ohlsen wrote:

I’m sure this is trivial, but I don’t have my Mastering Regular Expressions handy (and I haven’t put sufficient effort into getting through it!).

I have a string that looks something like …

>>"Guido van Rossum", "Larry Wall", "Matz"<<

(i.e., it’s the set of quoted strings between the >> and << markers). What I’d like to do is split the string up into an array containing the three quoted items (with or without the quote marks).

Something like this (tweaked as needed) should work:

irb(main):006:0> str
=> "\"Guido van Rossum\", \"Larry Wall\", \"Matz\""
irb(main):007:0> str.scan(/".+?"/)
=> ["\"Guido van Rossum\"", "\"Larry Wall\"", "\"Matz\""]
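Since the question allowed for results "with or without the quote marks", here is a minimal sketch of the same scan with a capture group added, which drops the quotes. The input string (including the >> and << markers) and the variable name `names` are assumptions made for illustration.

```ruby
# Hypothetical input, including the >> and << markers from the question.
str = '>>"Guido van Rossum", "Larry Wall", "Matz"<<'

# When the pattern has a capture group, String#scan returns only the
# captured text, so the surrounding quote marks are left out.
# scan yields one sub-array per match, hence the flatten.
names = str.scan(/"(.+?)"/).flatten

p names
```

One caveat: `.+?` needs at least one character between the quotes, so an empty "" item would throw the matching off. Using `[^"]*` in place of `.+?` avoids that, as long as no escaped quotes are involved.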

David


David Alan Black
home: dblack@superlink.net
work: blackdav@shu.edu
Web: http://pirate.shu.edu/~blackdav

Harry asked for support for "s inside strings as well. For that a bit
more work is needed:

test = '>>"Guido", "Larry", "Matz", "foo \"bar\""<<'
puts test.scan(/"(?:\\.|[^"\\])*"/)

gives

"Guido"
"Larry"
"Matz"
"foo \"bar\""

A post-processing step could easily unescape those \" sequences if wanted.
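One way that post-processing might look, as a rough sketch only: capture the body of each item without its outer quotes, then replace every backslash-escaped character with the character itself. The variable name `items` and the choice to unescape any escaped character (not only the quote) are assumptions, not part of the suggestion above.

```ruby
test = '>>"Guido", "Larry", "Matz", "foo \"bar\""<<'

# Same pattern as above, with an extra capture group around the body
# so that scan returns the items without their outer quote marks.
items = test.scan(/"((?:\\.|[^"\\])*)"/).flatten.map do |body|
  # Turn each backslash-escaped character into the bare character,
  # e.g. \" becomes " and \\ becomes \.
  body.gsub(/\\(.)/) { $1 }
end

puts items
```

The last item should then come out as foo "bar", with the backslashes gone.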

– fxn

On Thursday 21 August 2003 13:23, dblack@superlink.net wrote:

Something like this (tweaked as needed) should work:

irb(main):006:0> str
=> "\"Guido van Rossum\", \"Larry Wall\", \"Matz\""
irb(main):007:0> str.scan(/".+?"/)
=> ["\"Guido van Rossum\"", "\"Larry Wall\"", "\"Matz\""]

Xavier Noria wrote:


Harry asked for support for "s inside strings as well. For that a bit
more work is needed:

test = '>>"Guido", "Larry", "Matz", "foo \"bar\""<<'
puts test.scan(/"(?:\\.|[^"\\])*"/)

gives

"Guido"
"Larry"
"Matz"
"foo \"bar\""

Thanks to both of you. I like that a lot better than the long one from
MRE that Mike posted, but I guess I’ll have to try some complex cases to
see that it works 100%.
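A small sketch of what trying "some complex cases" might look like, using the regex from the reply above. The inputs, the expected results, and the names `QUOTED`, `cases` and `all_ok` are made up for illustration and are by no means an exhaustive test.

```ruby
# The pattern from the reply above: a quote, then any run of escaped
# characters or non-quote/non-backslash characters, then a quote.
QUOTED = /"(?:\\.|[^"\\])*"/

# Hypothetical inputs mapped to the matches we would expect from scan.
cases = {
  '>>""<<'             => ['""'],               # empty item
  '>>"a, b", "c"<<'    => ['"a, b"', '"c"'],    # comma inside an item
  '>>"say \"hi\""<<'   => ['"say \"hi\""'],     # escaped quotes
  '>>"a\\\\", "b"<<'   => ['"a\\\\"', '"b"'],   # escaped backslash before the closing quote
  '>>"unterminated<<'  => []                    # no closing quote, so no match
}

all_ok = cases.all? do |input, expected|
  actual = input.scan(QUOTED)
  puts "#{actual == expected ? 'ok  ' : 'FAIL'} #{input}"
  actual == expected
end

puts(all_ok ? "all cases passed" : "some cases failed")
```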

Cheers,

Harry O.