Regexp questions

Mike_Steiner · 23 June 2007 22:22

I'm converting a big Python program to Ruby which uses lots of regexps, and
I'm getting some odd errors. One problem seems to be that \1 in the
replacement string doesn't always work. Are there any known "gotchas"
between Python's regexps and Ruby's?

And what's the most robust way to convert "underlined" words to HTML italics
using a regexp? Something like: "This _word_ is in italics." -> "This
<i>word</i> is in italics."

Thanks!

Mike Steiner

Daniel_Lucraft · 24 June 2007 08:45

Mike Steiner wrote:

> I'm getting some odd errors. One problem seems to be that \1 in the
> replacement string doesn't always work. Are there any known "gotchas"
> between Python's regexps and Ruby's?
>
> And what's the most robust way to convert "underlined" words to HTML
> italics using a regexp? Something like: "This _word_ is in italics." ->
> "This <i>word</i> is in italics."

Could you give an example of where it isn't working in the first case?

As for the second, I don't know about 'most robust' but I might try
something like this (with the assumption that words aren't broken over
lines):

str
=> "asdf asdf _asdfasd_ asdf _ash_ h"
str.gsub(/_([^\s]+)_/, "<i>\\1</i>")
=> "asdf asdf <i>asdfasd</i> asdf <i>ash</i> h"

Or if I wanted to be able to have italicised sentences (_word word_) I
might try this:

str
=> "asdf asdf _asdf asd_ asdf _ash \nash_ h"
str.gsub(/_(.+?)_/m, "<i>\\1</i>")
=> "asdf asdf <i>asdf asd</i> asdf <i>ash \nash</i> h"

But then I would worry about performance because of the lazy operator
and would want to test it on some real data.
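
One way to check that worry is to time both candidate patterns on representative input. A minimal sketch using Ruby's standard Benchmark module; the sample text, the iteration count, and the `italicize` helper are made-up placeholders, and `<i>` is assumed as the italics tag:

```ruby
require 'benchmark'

# Hypothetical sample data; substitute real text before drawing conclusions.
SAMPLE = ("plain words _one word_ more words _two\nlines_ end " * 1_000).freeze

LAZY    = /_(.+?)_/m   # lazy quantifier, dot also matches newlines
NEGATED = /_([^_]+)_/  # negated character class

# Wrap each underscore-delimited span in <i> tags.
def italicize(text, pattern)
  text.gsub(pattern, '<i>\1</i>')
end

# Both patterns should agree on this input before timing them.
SAME_RESULT = italicize(SAMPLE, LAZY) == italicize(SAMPLE, NEGATED)

Benchmark.bm(8) do |x|
  x.report('lazy')    { 50.times { italicize(SAMPLE, LAZY) } }
  x.report('negated') { 50.times { italicize(SAMPLE, NEGATED) } }
end
```

The timings depend heavily on the input and the Ruby version, so this only shows how to measure, not which pattern wins.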

best,
Dan


Wyatt_Draggoo · 24 June 2007 16:35

Daniel Lucraft wrote:

> str
> => "asdf asdf _asdfasd_ asdf _ash_ h"
> str.gsub(/_([^\s]+)_/, "<i>\\1</i>")
> => "asdf asdf <i>asdfasd</i> asdf <i>ash</i> h"
>
> Or if I wanted to be able to have italicised sentences (_word word_) I
> might try this:
>
> str
> => "asdf asdf _asdf asd_ asdf _ash \nash_ h"
> str.gsub(/_(.+?)_/m, "<i>\\1</i>")
> => "asdf asdf <i>asdf asd</i> asdf <i>ash \nash</i> h"

I like to be very strict with things like quotes (and underscores in this case), so I would probably use:

str
=> "asdf asdf _asdf asd_ asdf _ash \nash_ h"
str.gsub(/_([^_]+)_/, "<i>\\1</i>")
=> "asdf asdf <i>asdf asd</i> asdf <i>ash \nash</i> h"

That seems to work the way I would expect, though I'm just coming over from Perl...

Wyatt


Michael_Glaesemann · 24 June 2007 18:20

From a strictness point of view, what's the difference between /(.+?)_/ and /([^_]+)_/ in the above? AIUI, they're equivalent. I personally like the former because if you need to change the _ to some other character, you only have to make a single character change.
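
The two patterns agree on the sample strings in this thread, but they are not equivalent in general. A short sketch of two inputs where they diverge, using the lazy pattern without the /m flag as written in the question above, and assuming `<i>` as the replacement tag (the `italic_*` helper names are made up for illustration):

```ruby
# Lazy dot without /m: '.' does not match "\n" in Ruby.
def italic_lazy(text)
  text.gsub(/_(.+?)_/, '<i>\1</i>')
end

# Negated class: matches anything except '_', newlines included.
def italic_negated(text)
  text.gsub(/_([^_]+)_/, '<i>\1</i>')
end

multiline = "_two\nlines_"
italic_lazy(multiline)     # unchanged, the dot stops at the newline
italic_negated(multiline)  # "<i>two\nlines</i>"

doubled = "__bold__"
italic_lazy(doubled)       # "<i>_bold</i>_"  (the capture may start with '_')
italic_negated(doubled)    # "_<i>bold</i>_"  (the capture never contains '_')
```

Adding /m to the lazy pattern removes the first difference; the second one remains, because `.+?` must consume at least one character and that character may itself be an underscore.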

Michael Glaesemann
grzm seespotcode net


Mike_Steiner · 25 June 2007 02:01

Thanks for the ideas about the _italics_!

I found out what my "weird" problem was - I wasn't double-escaping the \1 in
the replacement string (I was using "\1" instead of "\\1".)
It's funny how Python didn't require this. Hmmm.
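
The cause is Ruby's string escaping rather than the regexp engine: inside double quotes, "\1" is an octal escape for the character with code 1, so gsub never sees a backreference. A small sketch of the forms that do work, assuming `<i>` as the italics tag. Python's plain string literals treat "\1" the same way, so the original Python code was presumably using raw strings such as r"\1".

```ruby
str = "This _word_ is in italics."
pattern = /_([^_]+)_/

broken = str.gsub(pattern, "<i>\1</i>")        # "\1" is the single character "\u0001"
double = str.gsub(pattern, "<i>\\1</i>")       # escaped backslash reaches gsub as \1
single = str.gsub(pattern, '<i>\1</i>')        # single quotes keep the backslash
block  = str.gsub(pattern) { "<i>#{$1}</i>" }  # block form needs no backreference syntax

puts double  # This <i>word</i> is in italics.
```

Single-quoted replacement strings are probably the closest match to Python raw strings when porting many substitutions.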

Mike Steiner

