Nasty regexp problem

Try:

irb(main):018:0> rex2 = /(?!\\)'/
=> /(?!\\)'/
irb(main):019:0> "'''".gsub(rex2) { "''" }.length
=> 6
irb(main):020:0> "'".gsub(rex2) { "''" }.length
=> 2
irb(main):021:0> "''".gsub(rex2) { "''" }.length
=> 4
irb(main):022:0> "'''".gsub(rex2) { "''" }
=> "''''''"
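
One caveat: a lookahead placed in front of the apostrophe inspects the apostrophe itself, not the character before it, so the pattern above would also double an apostrophe that follows a backslash. A minimal sketch of a variant that looks backwards instead, assuming a Ruby whose regexp engine supports negative lookbehind (1.9 or later); the helper name double_apostrophes is made up for illustration and is not part of Lafcadio:

```ruby
# Hypothetical helper, not part of Lafcadio: doubles every apostrophe
# that is not directly preceded by a backslash.
# Assumes Ruby 1.9+, where negative lookbehind (?<!...) is available.
def double_apostrophes(value)
  # The lookbehind is zero-width, so the preceding character is checked
  # but never consumed by the match.
  value.gsub(/(?<!\\)'/) { "''" }
end

puts double_apostrophes("'''")                  # three apostrophes in, six out
puts double_apostrophes("I can't drive 55")     # the apostrophe is doubled
puts double_apostrophes("Don\\'t escape here")  # escaped apostrophe left alone
puts double_apostrophes("line 1\n' line 2")     # start-of-line apostrophe doubled
```

Because nothing but the apostrophe is consumed, adjacent apostrophes are each matched on their own.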


-----Original Message-----
From: Francis Hwang [mailto:sera@fhwang.net]
Sent: Thursday, 13 May 2004 10:49 AM
To: ruby-talk ML
Subject: nasty regexp problem

Hi all,

I discovered a strange bug in Lafcadio which I’m trying to work out,
but the regexp is getting really nasty. Basically, to commit text
values to the database I need to double any apostrophe, except those
that are preceded by a backslash. This has to account for apostrophes
at the beginning of the string, and multiline strings where the
apostrophe is at the start of a new line. Examples would be:

"I can't drive 55" -> "I can''t drive 55"
"Don\'t escape here" -> "Don\'t escape here"
"'" -> "''"
"line 1\n' line 2" -> "line 1\n'' line 2"

So here’s the regexp that does it:

value = value.gsub( /(^|[^\\\n])'/ ) { $& + "'" }

which works fine, except I just discovered that it fails in one odd
case:

"'''" -> "'''''"

(That is, a single line of three apostrophes should become six
apostrophes, but instead it becomes five.)

Any idea why this is doing it? I suspect I’m being too clever by
including the beginning of line in a grouping, and maybe that affects
how the regexp is processing the string?
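
One way to check that suspicion is to list exactly what the pattern matches on the three-apostrophe string. This is only a sketch, and it assumes the character class was meant to exclude a backslash and a newline:

```ruby
# Pattern from the message; the character class is assumed to mean
# "anything except a backslash or a newline".
pattern = /(^|[^\\\n])'/

# Collect the full text of each match ($&), not just the captured group.
matches = []
"'''".scan(pattern) { matches << $& }
p matches

# Each match gets one apostrophe appended, exactly as in the gsub call.
result = "'''".gsub(pattern) { $& + "'" }
p result.length
```

The second match swallows the middle apostrophe as the "preceding character", so that apostrophe is never matched in its own right: two matches add two apostrophes, giving five instead of six.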

F.