Regular expression help please

How do I extract the name and value from the following lines:

name=paul value=10 otherstuff=123

but the line may also be:
name='hello paul' value='10' otherstuff='123'

I know it has to do with \0, \1, etc., but I can't figure out how to make
the regexp work for both cases.

Thanks

Paul wrote:

How do I extract the name and value from the following lines:

name=paul value=10 otherstuff=123

but the line may also be:
name='hello paul' value='10' otherstuff='123'

irb(main):001:0> lines.scan(/^name=('?)(.*?)\1\s+value=('?)(\S*?)\3/)
=> [["", "paul", "", ""], ["'", "hello paul", "'", "10"]]
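Note that in the result above the unquoted value comes back empty: when \3 is the empty string, the lazy \S*? is satisfied by matching nothing. A minimal sketch of one way around that, assuming quoted values contain no embedded single quotes, is to alternate between a quoted and a bare form. The names PAIR_RE and extract_pair are made up for this example:

```ruby
# Matches name=... value=... where each value is either '...' or a bare word.
# Assumption: quoted values contain no embedded single quotes.
PAIR_RE = /^name=(?:'([^']*)'|(\S+))\s+value=(?:'([^']*)'|(\S+))/

# Returns [name, value] for a line, or nil when the line does not match.
def extract_pair(line)
  m = PAIR_RE.match(line)
  return nil unless m
  # Exactly one capture of each (quoted, bare) pair is non-nil.
  [m[1] || m[2], m[3] || m[4]]
end

p extract_pair("name=paul value=10 otherstuff=123")
p extract_pair("name='hello paul' value='10' otherstuff='123'")
```

The alternation costs two extra capture groups, but it avoids the backreference entirely, so neither branch can match an empty value by accident.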

Things get more interesting, however, when strings can also contain
escaped delimiters (as in 'Don\'t use PHP!').

Regexp::English lets us solve that case relatively easily:

irb(main):035:0> re = Regexp::English.new do
irb(main):036:1*   quoted_string = quoted_text("'")
irb(main):037:1>   unquoted_string = non_whitespace
irb(main):038:1>   name_val = (quoted_string | unquoted_string).capture(:name)
irb(main):039:1>   value_val = (quoted_string | unquoted_string).capture(:value)
irb(main):040:1>   literal("name=") + name_val + whitespace +
irb(main):041:1*   literal("value=") + value_val
irb(main):042:1> end
=> /name=((?x:'((?x:(?!\\).(?:\\{2})*?\\'|(?!').)*)'|\S+))\s+value=((?x:'((?x:(?!\\).(?:\\{2})*?\\'|(?!').)*)'|\S+))/
irb(main):051:0> lines = %{
irb(main):052:0" name='hello. I\\'m paul' value='don\\'t do that'
irb(main):053:0" name=foobar value=3
irb(main):054:0" name='drei' value='three'
irb(main):055:0" }
irb(main):070:0> lines.scan(re)
=> [["'hello. I\\'m paul'", "hello. I\\'m paul", "'don\\'t do that'", "don\\'t do that"],
["foobar", nil, "3", nil],
["'drei'", "drei", "'three'", "three"]]
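For readers without Regexp::English at hand, roughly the same idea can be written as a plain Regexp with named captures. This is only a sketch and not the library's output; the names QUOTED, BARE, FIELD_RE and unquote are made up for this example:

```ruby
# A single-quoted string that may contain backslash escapes such as \'.
QUOTED = /'(?:[^'\\]|\\.)*'/
# A bare value: a run of non-whitespace characters.
BARE = /\S+/

# Interpolating a Regexp object embeds it as a (?-mix:...) group.
FIELD_RE = /name=(?<name>#{QUOTED}|#{BARE})\s+value=(?<value>#{QUOTED}|#{BARE})/

# Strips surrounding quotes and turns \x into x; bare values pass through.
def unquote(raw)
  return raw unless raw.length >= 2 && raw.start_with?("'") && raw.end_with?("'")
  raw[1..-2].gsub(/\\(.)/, '\1')
end

# "\\'" in a double-quoted literal is a backslash followed by a quote.
line = "name='hello. I\\'m paul' value='don\\'t do that'"
m = FIELD_RE.match(line)
puts unquote(m[:name])
puts unquote(m[:value])
```

Trying QUOTED before BARE matters: a bare match would otherwise stop at the first space inside a quoted value.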

Regards,
Florian Gross

"Paul" paul.rogers@shaw.ca wrote in message
news:4ee21163.0405281638.45d95197@posting.google.com

How do I extract the name and value from the following lines:

name=paul value=10 otherstuff=123

but the line may also be:
name='hello paul' value='10' otherstuff='123'

I know it has to do with \0, \1, etc., but I can't figure out how to make
the regexp work for both cases.

Thanks

You could do:

lines = <<'EOF'
name=paul value=10 otherstuff=123
name='hello paul' value='10' otherstuff='123'
name='hello paul, it\'s nice here' value='10' otherstuff='123'
name='hello paul, don't do that' value='10' otherstuff='123'
EOF

# Each match is [key, quoted_value, bare_value]; one of the last two is nil.
lines.scan( %r{
  (name|value|otherstuff)
  =
  (?: '((?:[^'\\]|\\')*)' | (\S+) )
}x ) do |m|
  key = m[0]
  # Unescape backslash sequences such as \' in the captured value.
  val = (m[1]||m[2]).gsub(/\\(.)/, '\\1')
  puts "key=#{key}"
  puts "value='#{val}'"
end

$ ./sc.rb
key=name
value='paul'
key=value
value='10'
key=otherstuff
value='123'
key=name
value='hello paul'
key=value
value='10'
key=otherstuff
value='123'
key=name
value='hello paul, it's nice here'
key=value
value='10'
key=otherstuff
value='123'
key=name
value='hello paul, don'
key=value
value='10'
key=otherstuff
value='123'

Of course you can replicate the expression to cover all three x=y pairs.
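One possible reading of "replicate the expression" is to define the value sub-pattern once and repeat it for each key in a single anchored regexp. The sketch below assumes the three keys always appear in this order; VALUE, LINE_RE and parse_line are names made up for this example, and the escape branch accepts any backslash-escaped character rather than only \':

```ruby
# Value sub-pattern: '...' with backslash escapes, or a bare word.
VALUE = /(?:'((?:[^'\\]|\\.)*)'|(\S+))/

# The same sub-pattern repeated for each of the three keys.
LINE_RE = /^name=#{VALUE}\s+value=#{VALUE}\s+otherstuff=#{VALUE}/

# Returns [name, value, otherstuff] for a line, or nil if it does not match.
def parse_line(line)
  m = LINE_RE.match(line)
  return nil unless m
  # Captures come in (quoted, bare) pairs; take whichever matched and unescape.
  m.captures.each_slice(2).map do |quoted, bare|
    (quoted || bare).gsub(/\\(.)/, '\1')
  end
end

p parse_line("name=paul value=10 otherstuff=123")
p parse_line("name='hello paul, it\\'s nice here' value='10' otherstuff='123'")
```

Compared with the scan loop, this version rejects lines where a key is missing, which may or may not be what you want.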

Regards

robert

“Robert Klemme” bob.news@gmx.net wrote in message news:2hrtj4Fg77irU1@uni-berlin.de

[snip]

Thanks guys, this is great.

Where would I find Regexp::English? Is it a module in the RAA?

Thanks

Paul

Paul wrote:

Where would I find Regexp::English? Is it a module in the RAA?

I'm planning to release it Real Soon Now. I'm still thinking about which of
the more advanced features should be in the final release and which
shouldn't.

Until then I'll just keep using it to solve other people's problems when
the situation demands it. :)

Thanks
Paul

No problem, glad I could help!

Regards,
Florian Gross