What's the bug to you? The fact that the second set of <p></p> wasn't
stripped or the fact that $2 is empty?
If the former, sub != gsub: sub replaces only the first match. If the
latter, "." doesn't match a newline by default, so (.*) stops at the
"\n\n" and you need multi-line mode (/m):
# Without /m
irb(main):026:0> html =~ /<p>(.*?)<\/p>(.*)/
=> 0
irb(main):027:0> $1
=> "one"
irb(main):028:0> $2
=> ""
# With /m
irb(main):023:0> html =~ /<p>(.*?)<\/p>(.*)/m
=> 0
irb(main):024:0> $1
=> "one"
irb(main):025:0> $2
=> "\n\n<p>two</p>"
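For anyone who wants to run both points outside irb, here is a minimal sketch using the same string as the original post. The variable names (first_only, all_tags, without_m, with_m) are made up for illustration, and it uses the non-destructive sub/gsub and String#match instead of sub! and the $~ globals.

```ruby
html = "<p>one</p>\n\n<p>two</p>"

# sub replaces only the first match; gsub replaces every match.
first_only = html.sub(/<p>(.*?)<\/p>/) { $1.strip }
all_tags   = html.gsub(/<p>(.*?)<\/p>/) { $1.strip }

# Without /m, "." stops at a newline, so (.*) captures the empty string.
without_m = html.match(/<p>(.*?)<\/p>(.*)/)
# With /m, "." also matches "\n", so (.*) runs to the end of the string.
with_m = html.match(/<p>(.*?)<\/p>(.*)/m)

p first_only    # => "one\n\n<p>two</p>"
p all_tags      # => "one\n\ntwo"
p without_m[2]  # => ""
p with_m[2]     # => "\n\n<p>two</p>"
```

So the original sub! call behaved as documented on both counts: one replacement, and a second group that could not cross the newline.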
Regards,
Dan
-----Original Message-----
From: James Edward Gray II [mailto:james@grayproductions.net]
Sent: Tuesday, September 13, 2005 12:31 PM
To: ruby-talk ML
Subject: Surprising Regexp Behavior
I keep running into some surprising points with Ruby's Regexp engine
today and this first one just looks plain wrong to me:
irb(main):001:0> html = "<p>one</p>\n\n<p>two</p>"
=> "<p>one</p>\n\n<p>two</p>"
irb(main):002:0> html.sub!(/<p>(.*?)<\/p>(.*)/) { $1.strip }
=> "one\n\n<p>two</p>"
irb(main):003:0> $2
=> ""
Can anyone explain to me how that isn't a bug?
Yep, that's what I was forgetting. Thanks for the lesson.
James Edward Gray II
On Sep 13, 2005, at 1:46 PM, Berger, Daniel wrote:
In the former, sub != gsub. In the latter, you need multi-line mode
because of the "\n\n": [irb transcript snipped]
thank dave thomas - the pickaxe (html version I) is always open in my browser
- but by far the most oft used page is the one on regex syntax. it just happened
to be open
-a
On Wed, 14 Sep 2005, James Edward Gray II wrote:
On Sep 13, 2005, at 1:46 PM, Berger, Daniel wrote:
In the former, sub != gsub. In the latter, you need multi-line mode
because of the "\n\n": [irb transcript snipped]
Yep, that's what I was forgetting. Thanks for the lesson.
--
email :: ara [dot] t [dot] howard [at] noaa [dot] gov
phone :: 303.497.6469
Your life dwells among the causes of death
Like a lamp standing in a strong breeze. --Nagarjuna
===============================================================================