I'm converting a big Python program that uses lots of regexps to Ruby, and
I'm getting some odd errors. One problem seems to be that \1 in the
replacement string doesn't always work. Are there any known "gotchas"
between Python's regexps and Ruby's?
And what's the most robust way to convert "underlined" words to HTML italics
using a regexp? Something like: "This _word_ is in italics." -> "This
<i>word</i> is in italics."
Thanks!
Mike Steiner
Could you give an example of where it isn't working in the first case?
As for the second, I don't know about 'most robust' but I might try
something like this (with the assumption that words aren't broken over
lines):
str
=> "asdf asdf _asdfasd_ asdf _ash_ h"
str.gsub(/_([^\s]+)_/, "<i>\\1</i>")
=> "asdf asdf <i>asdfasd</i> asdf <i>ash</i> h"
Or if I wanted to be able to have italicised sentences (_word word_) I
might try this:
str
=> "asdf asdf _asdf asd_ asdf _ash \nash_ h"
str.gsub(/_(.+?)_/m, "<i>\\1</i>")
=> "asdf asdf <i>asdf asd</i> asdf <i>ash \nash</i> h"
But then I would worry about performance because of the lazy operator
and would want to test it on some real data.
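A rough way to run that test is sketched below, comparing the lazy pattern against a negated character class variant. The sample string and the negated pattern are my own assumptions for illustration, not real data, so treat the timings as a starting point only.

```ruby
require "benchmark"

# Hypothetical sample data: many short underlined phrases, one per line.
sample  = "some text _italic words here_ more text\n" * 2_000

lazy    = /_(.+?)_/m   # lazy quantifier, "." also matches newlines
negated = /_([^_]+)_/  # assumed alternative: anything except an underscore
italics = "<i>\\1</i>"

# Print user/system/real times for each pattern.
Benchmark.bm(10) do |bm|
  bm.report("lazy")    { sample.gsub(lazy, italics) }
  bm.report("negated") { sample.gsub(negated, italics) }
end
```

Both patterns produce the same output on this particular sample, so the comparison is at least apples to apples here. Real input may behave differently.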
best,
Dan
--
Posted via http://www.ruby-forum.com/.
I like to be very strict with things like quotes (and underscores in this case), so I would probably use:
str
=> "asdf asdf _asdf asd_ asdf _ash \nash_ h"
str.gsub(/_([^_]+)_/, "<i>\\1</i>")
=> "asdf asdf <i>asdf asd</i> asdf <i>ash \nash</i> h"
That seems to work the way I would expect it to, but I'm just coming over from Perl.
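If the text might also contain identifiers like foo_bar_baz, a slightly stricter variant could be sketched like this. The method name is hypothetical, and the word-boundary rule is my assumption about what "robust" should mean here, not something settled in this thread.

```ruby
# Hypothetical helper; the name is made up for this example.
def underline_to_italics(text)
  # \b before the opening underscore and after the closing one means the
  # underscores must sit at the edge of a word, so snake_case identifiers
  # such as foo_bar_baz are left alone.
  text.gsub(/\b_([^_]+)_\b/) { "<i>#{$1}</i>" }
end

plain    = underline_to_italics("This _word_ is in italics.")
phrase   = underline_to_italics("An _italic phrase_ here.")
untouched = underline_to_italics("call foo_bar_baz now")
```

The block form is used so there is no backslash escaping to get wrong in the replacement.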
Wyatt
From a strictness point of view, what's the difference between /_(.+?)_/ and /_([^_]+)_/ in the above? AIUI, they're equivalent. I personally like the former because if you need to change the _ to some other character, you only have to make a single character change.
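For what it's worth, a quick check suggests the two are close but not identical. The inputs below are made up for illustration. They differ on text that spans lines when /m is left off, and on doubled underscores.

```ruby
lazy    = /_(.+?)_/    # without /m, "." does not match "\n"
lazy_m  = /_(.+?)_/m   # with /m, "." matches "\n" as well
negated = /_([^_]+)_/  # [^_] matches "\n" with or without /m
italics = "<i>\\1</i>"

# Case 1: the underlined text spans a line break.
multiline        = "x _a\nb_ y"
multi_lazy       = multiline.gsub(lazy, italics)     # no match, string unchanged
multi_lazy_m     = multiline.gsub(lazy_m, italics)   # "x <i>a\nb</i> y"
multi_negated    = multiline.gsub(negated, italics)  # "x <i>a\nb</i> y"

# Case 2: a doubled opening underscore.
doubled          = "a __b_ c"
doubled_lazy_m   = doubled.gsub(lazy_m, italics)     # "a <i>_b</i> c"
doubled_negated  = doubled.gsub(negated, italics)    # "a _<i>b</i> c"
```

In the second case the lazy version is allowed to swallow an underscore into the capture, while the negated class never is.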
Michael Glaesemann
grzm seespotcode net
On Jun 24, 2007, at 11:35 , Wyatt Draggoo wrote:
On Sun, Jun 24, 2007 at 05:45:30PM +0900, Daniel Lucraft wrote:
> str.gsub(/_(.+?)_/m, "<i>\\1</i>")
I like to be very strict with things like quotes (and underscores in this case), so I would probably use:
> str.gsub(/_([^_]+)_/, "<i>\\1</i>")
Thanks for the ideas about the _italics_!
I found out what my "weird" problem was: I wasn't double-escaping the \1 in
the replacement string (I was using "<i>\1</i>" instead of "<i>\\1</i>").
It's funny how Python didn't require this. Hmm.
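Here is a small sketch of what was going on, using the italics example from earlier in the thread. In a Ruby double-quoted string, "\1" is an octal escape for the character with code 1, so gsub never sees a backreference. Python raw strings (r"...") hide the same issue, which may be why the Python code worked, though that is a guess about the original program.

```ruby
text    = "This _word_ is in italics."
pattern = /_([^_]+)_/

# "\1" inside double quotes becomes the single character "\x01",
# so the captured word is replaced by a control character.
broken = text.gsub(pattern, "<i>\1</i>")

# Escaping the backslash, or using single quotes, hands gsub a literal
# backslash followed by "1", which it treats as the first capture group.
double_escaped = text.gsub(pattern, "<i>\\1</i>")
single_quoted  = text.gsub(pattern, '<i>\1</i>')

# The block form sidesteps backslash escapes entirely.
with_block = text.gsub(pattern) { "<i>#{$1}</i>" }
```

The last three all give "This <i>word</i> is in italics."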
Mike Steiner