How do I extract the name and value from the following lines:
name=paul value=10 otherstuff=123
but the line may also be:
name='hello paul' value='10' otherstuff='123'
I know it has to do with \0, \1, etc., but I can't figure out how to make
the regexp work for both cases.
Thanks
Paul wrote:
How do I extract the name and value from the following lines:
name=paul value=10 otherstuff=123
but the line may also be:
name='hello paul' value='10' otherstuff='123'
irb(main):001:0> lines.scan(/^name=('?)(.*?)\1\s+value=('?)(\S*?)\3/)
=> [["", "paul", "", ""], ["'", "hello paul", "'", "10"]]
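Note that the first row in the result above has an empty value: with no quote captured, the lazy value group is free to match zero characters. A minimal sketch of one way around that, using alternation instead of backreferences. The helper name `extract_name_value` and the constant `FIELD` are made up for illustration, and the sketch assumes quoted values never contain a single quote:

```ruby
# Hypothetical helper, not from the thread: returns [name, value] pairs.
# Assumption: a quoted value contains no embedded single quotes.
FIELD = /'([^']*)'|(\S+)/   # quoted text, or a run of non-whitespace

def extract_name_value(text)
  # Each FIELD contributes two groups (quoted, bare); exactly one is non-nil.
  text.scan(/^name=(?:#{FIELD})\s+value=(?:#{FIELD})/).map do |qn, bn, qv, bv|
    [qn || bn, qv || bv]
  end
end

lines = <<'EOF'
name=paul value=10 otherstuff=123
name='hello paul' value='10' otherstuff='123'
EOF

p extract_name_value(lines)
```

Because each alternative has its own capture group, the surrounding quotes never end up in the result and the bare form uses a greedy `\S+`.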
Things get more interesting when strings can also contain escaped
delimiters (as in 'Don\'t use PHP!').
Regexp::English lets us solve that case relatively easily:
irb(main):035:0> re = Regexp::English.new do
irb(main):036:1* quoted_string = quoted_text("'")
irb(main):037:1> unquoted_string = non_whitespace
irb(main):038:1> name_val = (quoted_string | unquoted_string).capture(:name)
irb(main):039:1> value_val = (quoted_string | unquoted_string).capture(:value)
irb(main):040:1> literal("name=") + name_val + whitespace +
irb(main):041:1* literal("value=") + value_val
irb(main):042:1> end
=> /name=((?x:'((?x:(?!\).(?:\{2})?\'|(?!').))'|\S+))\s+value=((?x:'((?x:(?!\).(?:\{2})?\'|(?!').))'|\S+))/
irb(main):051:0> lines = %{
irb(main):052:0" name='hello. I\'m paul' value='don\'t do that'
irb(main):053:0" name=foobar value=3
irb(main):054:0" name='drei' value='three'
irb(main):055:0" }
irb(main):070:0> lines.scan(re)
=> [["'hello. I\'m paul'", "hello. I\'m paul", "'don\'t do that'", "don\'t do that"],
["foobar", nil, "3", nil],
["'drei'", "drei", "'three'", "three"]]
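For readers without Regexp::English, the same idea can be sketched with plain Regexp objects composed by interpolation. This is a hand-written approximation, not the pattern Regexp::English generates. The names `QUOTED`, `UNQUOTED`, `VALUE`, `PAIR_RE`, `unescape` and `parse_pairs` are hypothetical, and it assumes that inside quotes a backslash escapes the following character:

```ruby
# Sketch only: hand-written, not the output of Regexp::English.
# Assumption: inside quotes, a backslash escapes the next character.
QUOTED   = /'((?:[^'\\]|\\.)*)'/   # group: the text between the quotes
UNQUOTED = /(\S+)/                 # group: a bare, whitespace-free value
VALUE    = /(?:#{QUOTED}|#{UNQUOTED})/
PAIR_RE  = /^name=#{VALUE}\s+value=#{VALUE}/

# Hypothetical helper: drop the backslash from each escape sequence.
def unescape(str)
  str.gsub(/\\(.)/) { $1 }
end

# Hypothetical helper: returns [name, value] pairs, unescaping quoted ones.
def parse_pairs(text)
  text.scan(PAIR_RE).map do |qn, bn, qv, bv|
    [qn ? unescape(qn) : bn, qv ? unescape(qv) : bv]
  end
end

lines = <<'EOF'
name='hello. I\'m paul' value='don\'t do that'
name=foobar value=3
name='drei' value='three'
EOF

parse_pairs(lines).each { |name, value| puts "#{name} => #{value}" }
```

Only quoted values are unescaped here; whether a backslash in a bare value should be treated as an escape is left open.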
Regards,
Florian Gross
"Paul" paul.rogers@shaw.ca wrote in message
news:4ee21163.0405281638.45d95197@posting.google.com…
How do I extract the name and value from the following lines:
name=paul value=10 otherstuff=123
but the line may also be:
name='hello paul' value='10' otherstuff='123'
I know it has to do with \0, \1, etc., but I can't figure out how to make
the regexp work for both cases.
Thanks
You could do:
lines = <<'EOF'
name=paul value=10 otherstuff=123
name='hello paul' value='10' otherstuff='123'
name='hello paul, it\'s nice here' value='10' otherstuff='123'
name='hello paul, don't do that' value='10' otherstuff='123'
EOF
lines.scan( %r{
  (name|value|otherstuff)
  =
  (?: '((?:[^'\\]|\\')*)' | (\S+) )
}x ) do |m|
  key = m[0]
  val = (m[1]||m[2]).gsub(/\\(.)/, '\1')
  puts "key=#{key}"
  puts "value='#{val}'"
end
$ ./sc.rb
key=name
value='paul'
key=value
value='10'
key=otherstuff
value='123'
key=name
value='hello paul'
key=value
value='10'
key=otherstuff
value='123'
key=name
value='hello paul, it's nice here'
key=value
value='10'
key=otherstuff
value='123'
key=name
value='hello paul, don'
key=value
value='10'
key=otherstuff
value='123'
Of course you can replicate the expression to cover all three x=y pairs.
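As an alternative to replicating the expression, the per-pair pattern can be applied with scan and the results collected into a Hash, so each line yields all of its x=y pairs at once. A minimal sketch under the same quoting assumptions as the pattern above; `KV_PAIR` and `parse_line` are hypothetical names, and the key is loosened to `\w+` so that any field name is accepted:

```ruby
# Hypothetical helper, not from the thread: one line of key=value pairs
# becomes a Hash. Assumes backslash escapes inside single-quoted values.
KV_PAIR = %r{
  (\w+)                              # key
  =
  (?: '((?:[^'\\]|\\.)*)' | (\S+) )  # quoted value, or bare value
}x

def parse_line(line)
  line.scan(KV_PAIR).each_with_object({}) do |(key, quoted, bare), h|
    # Strip the escaping backslashes from quoted values only.
    h[key] = quoted ? quoted.gsub(/\\(.)/) { $1 } : bare
  end
end

record = parse_line("name='hello paul' value='10' otherstuff='123'")
puts record["name"], record["value"], record["otherstuff"]
```

An unescaped quote inside a quoted value still truncates that value, as in the last sample line above.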
Regards
robert
"Robert Klemme" bob.news@gmx.net wrote in message news:2hrtj4Fg77irU1@uni-berlin.de…
[snip: full quote of the post above]
Thanks guys, this is great.
Where would I find Regexp::English? Is it a module in the RAA?
Thanks
Paul
Paul wrote:
Where would I find Regexp::English? Is it a module in the RAA?
I'm planning to release it Real Soon Now. I'm still deciding which of
the more advanced features should be in the final release and which
shouldn't.
Until then I'll just keep using it to solve other people's problems when
the situation demands it.
Thanks
Paul
No problem, glad I could help!
Regards,
Florian Gross