Regex multiline weirdness

Derek_Wyatt · 21 August 2005 21:52

This one's got me stumped.

I'm trying to rip out incomplete HTML tags from the end of a
string... here's an example:

Here is a string with <a\nhref=http://somewhere.com/and/then

Note that there is a newline in there. So I try something simple, like:

str.sub(/<[^>]*?$/m, "")

but it doesn't do squat. Interestingly enough, I tried the
following in Perl, which I think is the direct translation:

$str =~ s/<[^>]*?$//s

and it works just fine.

Anyone know how to get this to work in ruby?

Thanks,
D

--
Derek Wyatt - C++ / Ruby / Unix Programmer
http://derekwyatt.org

David_A_Black3 · 21 August 2005 22:55

Hi --

str.sub(/<[^>]*?$/m, "")

You don't need /m here. In Ruby, /m only makes . (the dot) match \n
as well. Since you're specifying [^>], that already matches \n.
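
A minimal sketch of that point, using a made-up sample string and a hypothetical helper name (neither is from the original posts): /m changes only what the dot matches, while a negated character class matches a newline with or without it.

```ruby
# Hypothetical helper, for illustration only: returns what each pattern
# matches in the given text.
def tag_matches(text)
  {
    dot_plain: text[/<.*/],      # dot stops at "\n" without /m
    dot_multi: text[/<.*/m],     # with /m the dot also matches "\n"
    neg_plain: text[/<[^>]*/],   # [^>] matches "\n" even without /m
    neg_multi: text[/<[^>]*/m]   # so /m makes no difference here
  }
end

tag_matches("<a\nhref").each do |name, match|
  puts "#{name}: #{match.inspect}"
end
```

Only the first pattern stops at the newline; the other three all match through to the end of the sample.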

And as Nikolai mentioned, you can use \Z (or \z, the difference being
that \Z ignores a final \n) to reach the absolute end of string.

So... try this:

str.sub(/<[^>]*\Z/,"")
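
As a hedged sketch of why the original pattern fell short, here it is run against the sample string from the question. In Ruby, $ matches at the end of any line, not only at the end of the string, so the lazy *? stops at the first newline and only "<a" is removed. In Perl, $ without /m anchors at the end of the string, which would explain why the Perl version behaved differently. The method name strip_trailing_tag is made up for illustration.

```ruby
# Hypothetical helper name; the pattern is the one suggested above.
def strip_trailing_tag(str)
  str.sub(/<[^>]*\Z/, "")
end

str = "Here is a string with <a\nhref=http://somewhere.com/and/then"

# $ matches just before the first "\n", so only "<a" is removed.
p str.sub(/<[^>]*?$/, "")
# => "Here is a string with \nhref=http://somewhere.com/and/then"

# \Z anchors at the end of the string, so the whole fragment goes.
p strip_trailing_tag(str)
# => "Here is a string with "

# \Z still matches before one final "\n"; \z does not.
p("tag\n".match?(/tag\Z/))  # => true
p("tag\n".match?(/tag\z/))  # => false
```

A string whose last tag is complete, such as "a <b>c", comes back unchanged, since [^>]* cannot cross the closing >.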

David

On Mon, 22 Aug 2005, Derek Wyatt wrote:

--
David A. Black
dblack@wobblini.net

Derek_Wyatt · 22 August 2005 09:53

Thanks David and Nikolai, the \Z works as advertised. I just have
to go through the adverts now and make sure I've got them all so I
don't have to ask stupid questions.

Regs,
Derek

