Gsub oddity?

What am I missing here? I am trying to replace a single quote with the
string \'

    irb(main):032:0> "replace 'quotes'.".gsub(/'/,"\\'")
    => "replace quotes'.quotes.."

should be

    "replace \'quotes\'."

Thanks
Ralph

You’re missing an extra level of processing applied to the replacement
string for gsub in order to let you include parts of what you’re
replacing. Specifically, \& gets replaced by the string that matched
the regex, \` gets replaced by the part of the string before what
matched the regex, and \' gets replaced by the part of the string after
what matched the regex, which is what is happening in your case.

    Original:                               "\\'"
    After double-quote processing:          \'
    After gsub processing:                  quotes'.

You need an extra doubling:

    Original:                               "\\\\'"
    After double-quote processing:          \\'
    After gsub processing:                  \'
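The three backslash sequences described above can be tried out directly. This is only a small sketch: the variable names are made up for illustration, and the sample string is the one from the original question.

```ruby
s = "replace 'quotes'."

# Inside a gsub replacement string, a backslash starts a back-reference:
#   \&  the text that matched
#   \`  the text before the match
#   \'  the text after the match
# The comment on each line shows the characters gsub actually receives.
matched = s.gsub(/'/, "<\\&>")   # gsub receives <\&>
pre     = s.gsub(/q/, "[\\`]")   # gsub receives [\`]
post    = s.gsub(/'/, "\\'")     # gsub receives \'
escaped = s.gsub(/'/, "\\\\'")   # gsub receives \\'

puts matched   # replace <'>quotes<'>.
puts pre       # replace '[replace ']uotes'.
puts post      # replace quotes'.quotes..
puts escaped   # replace \'quotes\'.
```

Only the last form yields a literal backslash followed by the quote, because the doubled backslash is what gsub reads as "one real backslash".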

-Mark

On Thu, Jan 22, 2004 at 06:57:22AM +0900, Ralph Mason wrote:

What am I missing here. I am trying to replace a single quote with the
string \'

“Mark J. Reed” markjreed@mail.com wrote in message
news:20040121221951.GD16733@mulan.thereeds.org

What am I missing here. I am trying to replace a single quote with the
string \'

You’re missing an extra level of processing applied to the replacement
string for gsub in order to let you include parts of what you’re
replacing.

No, in this case he’s missing extra backslashes:

    irb(main):002:0> str='a\'o'
    => "a'o"
    irb(main):003:0> str.gsub(/'/, '\\\\\'' )
    => "a\\'o"
    irb(main):004:0> puts str.gsub(/'/, '\\\\\'' )
    a\'o
    => nil

It does look a bit odd, but the explanation is logical: There are two
levels of escaping involved. The first level is that needed for the ruby
parser to get the expected characters into the replacement string. The
second level is used in order to be able to use a backslash literally in
the substitution string.

    irb(main):005:0> puts '\\\\\''
    \\'
    => nil

You need two backslashes in the substitution string to prevent gsub from
interpreting the backslash as escape char. That makes four backslashes
when entering the string. The fifth backslash escapes the single quote in
order to be able to include it into the string.
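To make the backslash count concrete, here is a small sketch that inspects the literal itself before gsub ever sees it. The variable names are invented for this illustration only.

```ruby
# Five backslashes, then the quote, inside a single-quoted literal:
# the parser turns \\ into \ (twice) and \' into '.
single = '\\\\\''

# The same three characters written with double quotes, where the
# single quote needs no escaping, so four backslashes are enough.
double = "\\\\'"

p single.chars                 # the three characters gsub receives
p single == double             # true
puts "a'o".gsub(/'/, single)   # a\'o
```

So the parser reduces the literal to backslash, backslash, quote, and gsub then reduces the two backslashes to one.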

Regards

robert

You can make life a little easier by picking a quote other than ' to
get rid of one backslash e.g.

    [mike@ratdog mike]$ irb --simple-prompt
    >> str = "a'o"
    => "a'o"
    >> puts str.gsub(/'/, "\\\\'")
    a\'o
    => nil

It can be tiresome to remember how many times a string will be scanned
in a context where \ characters matter, so I prefer using the block form
e.g.

    >> puts str.gsub(/(')/) { |c| "\\#{c}" }
    a\'o
    => nil

…but I have never measured whether this is significantly more
expensive.
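As a quick check that the block form and the string form agree, here is a minimal sketch (the variable names are hypothetical). The point is that the value returned by the block is inserted as-is, without the extra scan for backslash sequences, so a single escaped backslash is enough there.

```ruby
str = "a'o"

# Block form: the block's return value is used literally.
with_block = str.gsub(/'/) { |m| "\\#{m}" }

# String form: gsub scans the replacement, so the backslash is doubled again.
with_string = str.gsub(/'/, "\\\\'")

puts with_block                  # a\'o
puts with_block == with_string   # true
```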

Hal, is this a FAQ? “How do I replace ' with \' in a string?” I’m sure
that this has come up a few times…

Hope this helps,

Mike

mike@stok.co.uk | The “`Stok’ disclaimers” apply.
http://www.stok.co.uk/~mike/ | GPG PGP Key 1024D/059913DA
mike@exegenix.com | Fingerprint 0570 71CD 6790 7C28 3D60
http://www.exegenix.com/ | 75D2 9EC4 C1C0 0599 13DA

Did you not read the rest of my post, when I pointed that out and
supplied the solution? I answered the original question in terms
of what he was missing in the colloquial sense of being unaware of.

Sheesh.

-Mark

On Thu, Jan 22, 2004 at 09:15:35AM +0100, Robert Klemme wrote:

No, in this case he’s missing extra backslashes:

“Mike Stok” mike@stok.co.uk wrote in message
news:MWOPb.54576$lGr.29805@twister01.bloor.is.net.cable.rogers.com

You can make life a little easier by picking a quote other than ' to
get rid of one backslash e.g.

Yes, although it’s just a small improvement.

    [mike@ratdog mike]$ irb --simple-prompt
    >> str = "a'o"
    => "a'o"
    >> puts str.gsub(/'/, "\\\\'")
    a\'o
    => nil

It can be tiresome to remember how many times a string will be scanned
in a context where \ characters matter, so I prefer using the block form
e.g.

    >> puts str.gsub(/(')/) { |c| "\\#{c}" }
    a\'o
    => nil

…but I have never measured whether this is significantly more
expensive.

It’s definitely more expensive, which is not really a surprise given the
additional overhead of

  • a block yield
  • string interpolation with #{}

But even with a constant string there’s still the block call overhead:

    14:21:43 [ADMIN]: /c/temp/ruby/rx-replace-bm.rb
                    user     system      total        real
    direct      0.766000   0.000000   0.766000 (  0.763000)
    direct &    0.766000   0.000000   0.766000 (  0.769000)
    block       1.765000   0.000000   1.765000 (  1.770000)
    block fix   1.438000   0.000000   1.438000 (  1.432000)
    14:21:50 [ADMIN]: cat /c/temp/ruby/rx-replace-bm.rb
    #!/usr/bin/ruby

    require 'benchmark'
    include Benchmark

    N=100000

    str = ' /* comment */ String s = "**/"; '

    bm(10) do |x|
      # replacement string with a literal backslash and quote
      x.report("direct") do
        for n in 1..N
          str.gsub( /"/, '\\\\"' )
        end
      end

      # replacement string using the \& back-reference
      x.report("direct &") do
        for n in 1..N
          str.gsub( /"/, '\\\\\\&' )
        end
      end

      # block form with string interpolation
      x.report("block") do
        for n in 1..N
          str.gsub( /"/ ) {|m| "\\#{m}"}
        end
      end

      # block form returning a constant string
      x.report("block fix") do
        for n in 1..N
          str.gsub( /"/ ) {|m| '\\"'}
        end
      end
    end
    14:21:58 [ADMIN]:
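Before comparing timings it is worth confirming that the four variants really produce the same string. The sketch below does only that; it is not a benchmark, and the sample string is a stand-in chosen for this illustration rather than the exact one from the script.

```ruby
# Stand-in input with two double quotes to escape (an assumption for
# this sketch, not the string used in the benchmark above).
str = ' /* comment */ String s = "x"; '

results = {
  "direct"    => str.gsub(/"/, '\\\\"'),           # gsub receives \\"
  "direct &"  => str.gsub(/"/, '\\\\\\&'),         # gsub receives \\\&
  "block"     => str.gsub(/"/) { |m| "\\#{m}" },   # block returns \"
  "block fix" => str.gsub(/"/) { |m| '\\"' },      # block returns \"
}

results.each { |label, out| puts "#{label.ljust(10)} #{out}" }
puts results.values.uniq.size == 1   # true: all four agree
```

Each variant turns every `"` into `\"`; they differ only in how many backslashes have to be typed and in whether a block is called per match.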

Hal, is this a FAQ? “How do I replace ' with \' in a string?” I’m sure
that this has come up a few times…

I put it on the Wiki:

Cheers

robert

Mike Stok wrote:

Hal, is this a FAQ? “How do I replace ' with \' in a string?” I’m sure
that this has come up a few times…

I’m the maintainer of the comp.lang.ruby FAQ, which (mainly) deals with
the newsgroup itself, not language details.

Which reminds me, I’ve been terribly lax about that. My automated
process broke and I’ve not fixed it in three or four months. Mea culpa.

But I do think this is a FAQ from the Ruby language point of view.

/me scribbles note for potential second edition of TRW. :)

Cheers,
Hal

“Mark J. Reed” markjreed@mail.com wrote in message
news:20040122155713.GC27950@mulan.thereeds.org

Did you not read the rest of my post, when I pointed that out and
supplied the solution? I answered the original question in terms
of what he was missing in the colloquial sense of being unaware of.

Apparently I misunderstood you here. I’m sorry.

Kind regards

robert