Gsub problem

I'm trying to use gsub to put escapes before single quotes.

irb(main):001:0> a = "Ain't Can't"
=> "Ain't Can't"
irb(main):002:0> a.gsub(/(\')/, '\\1')
=> "Ain't Can't"
irb(main):003:0> a.gsub(/(\')/, '\\\1')
=> "Ain\\1t Can\\1t"
irb(main):004:0> a.gsub(/(\')/, '\\\\1')
=> "Ain\\1t Can\\1t"
irb(main):005:0> a.gsub(/(\')/, '\\\\\1')
=> "Ain\\'t Can\\'t"

How does one do it?

Actually, I oversimplified my example. I want to escape single quotes and question marks.

irb(main):001:0> a = "Can't we?"
=> "Can't we?"
irb(main):002:0> a.gsub(/([\'\?])/, '\\1')
=> "Can't we?"
irb(main):003:0> a.gsub(/([\'\?])/, '\\\1')
=> "Can\\1t we\\1"
irb(main):004:0> a.gsub(/([\'\?])/, '\\\\1')
=> "Can\\1t we\\1"
irb(main):005:0> a.gsub(/([\'\?])/, '\\\\\1')
=> "Can\\'t we\\?"

On 2010-01-25, lalawawa wrote:

irb(main):005:0> a.gsub(/([\'\?])/, '\\\\\1')
=> "Can\\'t we\\?"

This is probably what you want. Keep in mind that irb prints the
inspect form of the result, so each single backslash in the string
is shown escaped, as two.

Copyright 2010, all wrongs reversed. Peter Seebach
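
For anyone who wants to check the point about irb's display, here is a small sketch. The variable name escaped is only illustrative and does not come from the thread. It compares the inspect form (what irb prints after =>) with the characters that are really in the string.

```ruby
a = "Can't we?"
escaped = a.gsub(/(['?])/, '\\\\\1')

p escaped            # inspect form, as irb shows it: "Can\\'t we\\?"
puts escaped         # the actual characters: Can\'t we\?
puts escaped.length  # 11: the 9 original characters plus 2 backslashes
```

The length is the easiest sanity check: only one backslash was added per match, even though the inspect form shows two.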

(Well, Mike's answer steals my thunder a bit, but I'll put this out there for anyone else that finds this thread.)

You just have to remember that there are two levels of confusion, uh, I mean *escaping*, going on here.

Within single-quoted strings, a backslash is special only before a ' or another \. So to get a literal \ into the string, you escape the escape character:

'\1' means a backslash and a digit (the character following the \ is neither ' nor \, so the \ has no special meaning)
'\\1' means an escaped backslash and a digit (the same two characters as above, but this time the first \ escapes the second \ so that it is taken literally)
'\\\1' means an escaped backslash, a backslash, and a digit, so two backslashes and a digit (you see where this is going, right?)
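
The three literals above can be checked directly. This is just a sketch with made-up variable names (one, two, three); it only looks at what Ruby stores for each single-quoted literal, before gsub is involved at all.

```ruby
one   = '\1'    # backslash, 1
two   = '\\1'   # backslash, 1 (the first \ escapes the second)
three = '\\\1'  # backslash, backslash, 1

puts one == two    # true
puts one.length    # 2
puts three.length  # 3
p three.chars      # ["\\", "\\", "1"] (inspect form again)
```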

Now, within the replacement string for a gsub, a backslash-digit means "the n-th parenthesized group, where the digit is n", and a doubled backslash means one literal backslash.

So you want to replace with "a backslash, then the first group". The string that gsub receives has to be \\\1: an escaped backslash followed by \1 (no quotes here to confuse things).

'\\' is stored as a single \
'\\\\' is stored as \\, which gsub reads as one literal \
'\1' is stored as \1 because 1 isn't a special character
'\\\\\1' is stored as \\\1, and gsub interprets that as "literal backslash, first group"

'\\\\\\1' is stored the same way: the fifth backslash escapes the sixth, and then comes the 1
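
To see both layers at once, here is a short sketch (five and six are illustrative names only). It first confirms that the five- and six-backslash literals store the same four characters, then hands them to gsub.

```ruby
a = "Can't we?"

five = '\\\\\1'   # stored as \\\1 (three backslashes, then 1)
six  = '\\\\\\1'  # stored as \\\1 as well

puts five == six  # true
puts five.length  # 4

# gsub reads \\ as a literal backslash and \1 as group 1.
puts a.gsub(/(['?])/, five)     # Can\'t we\?

# With only \\1 stored, gsub sees an escaped backslash and a plain 1.
puts a.gsub(/(['?])/, '\\\\1')  # Can\1t we\1
```

The last line reproduces the puzzling "Can\\1t we\\1" result from the original question.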

You might consider using the block form of gsub replacement.

> a = "Can't we?"
=> "Can't we?"
> a.gsub(/(['?])/, '\\\\\\1')
=> "Can\\'t we\\?"
> puts _
Can\'t we\?
=> nil
> a.gsub(/['?]/) {|m| '\\' + m}
=> "Can\\'t we\\?"
> puts _
Can\'t we\?
=> nil

In your case, the block form makes the replacement much clearer: the block's result is inserted as-is, so only the string literal's own escaping is left to think about.
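
A quick sketch of why the block form is simpler: gsub does not scan the block's return value for back-references, so only the string literal's escaping applies. The second call is there purely to demonstrate that; it is not something you would want in practice.

```ruby
a = "Can't we?"

# Block form: the return value is inserted verbatim.
puts a.gsub(/['?]/) { |m| '\\' + m }  # Can\'t we\?

# A block result that looks like a back-reference stays literal.
puts a.gsub(/(['?])/) { '\1' }        # Can\1t we\1

# Inside the block the match is also available as $&.
puts a.gsub(/['?]/) { "\\#{$&}" }     # Can\'t we\?
```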


Rob Biedenharn


On Jan 25, 2010, at 12:00 PM, lalawawa wrote:

Seebs wrote:

On 2010-01-25, lalawawa wrote:

irb(main):005:0> a.gsub(/([\'\?])/, '\\\\\1')
=> "Can\\'t we\\?"

This is probably what you want -- keep in mind that irb will show
you escaped backslashes if there's a single backslash in the string.


You're right.

irb(main):009:0> puts a.gsub(/([\'\?])/, '\\\\\1')
Can\'t we\?
=> nil


Most excellent and informative posts, Rob and Mike. I've saved them both to my "best of Ruby" folder.

You can get rid of a few backslashes if you like:

$ irb --simple-prompt
>> a = "Can't we?"
=> "Can't we?"
>> puts a.gsub(/['?]/) { |c| "\\#{c}" }
Can\'t we\?
=> nil
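
If the set of characters to escape might change, the block form can be wrapped up in a small method. This is only a sketch: escape_chars is a hypothetical helper name, not something from the thread or from Ruby itself. It relies on Regexp.union, which escapes regexp metacharacters such as ? when given plain strings.

```ruby
# Hypothetical helper: put a backslash in front of each listed character.
def escape_chars(str, chars)
  pattern = Regexp.union(*chars)      # plain strings are escaped for us
  str.gsub(pattern) { |c| "\\#{c}" }  # block form, so no back-reference escaping
end

puts escape_chars("Can't we?", ["'", "?"])  # Can\'t we\?
```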



On Jan 25, 2010, at 12:50 PM, lalawawa wrote:


Mike Stok

The "`Stok' disclaimers" apply.