Gsub problem

I'm trying to use gsub to put escapes before single quotes.

irb(main):001:0> a = "Ain't Can't"
=> "Ain't Can't"
irb(main):002:0> a.gsub(/(\')/, '\\1')
=> "Ain't Can't"
irb(main):003:0> a.gsub(/(\')/, '\\\1')
=> "Ain\\1t Can\\1t"
irb(main):004:0> a.gsub(/(\')/, '\\\\1')
=> "Ain\\1t Can\\1t"
irb(main):005:0> a.gsub(/(\')/, '\\\\\1')
=> "Ain\\'t Can\\'t"
irb(main):006:0>

How does one do it?

Actually, I oversimplified my example. I want to escape single quotes and question marks.

irb(main):001:0> a = "Can't we?"
=> "Can't we?"
irb(main):002:0> a.gsub(/([\'\?])/, '\\1')
=> "Can't we?"
irb(main):003:0> a.gsub(/([\'\?])/, '\\\1')
=> "Can\\1t we\\1"
irb(main):004:0> a.gsub(/([\'\?])/, '\\\\1')
=> "Can\\1t we\\1"
irb(main):005:0> a.gsub(/([\'\?])/, '\\\\\1')
=> "Can\\'t we\\?"
irb(main):006:0>

This is probably what you want. Keep in mind that irb echoes results
using String#inspect, so every single backslash in the string is displayed as an escaped backslash (\\).
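
A quick way to check this, sketched with the same expression from the question (the variable name `escaped` is only for illustration), is to compare the inspected form with what `puts` writes:

```ruby
# Same call as in the question: escape single quotes and question marks.
escaped = "Can't we?".gsub(/([\'\?])/, '\\\\\1')

p escaped            # inspected form, as irb echoes it: "Can\\'t we\\?"
puts escaped         # the actual characters: Can\'t we\?
puts escaped.length  # 11: the 9 original characters plus 2 backslashes
```

If `puts` shows one backslash before each character, the string already holds what was wanted.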

-s

On 2010-01-25, lalawawa <usenet@ccjj.info> wrote:

irb(main):005:0> a.gsub(/([\'\?])/, '\\\\\1')
=> "Can\\'t we\\?"

--
Copyright 2010, all wrongs reversed. Peter Seebach / usenet-nospam@seebs.net
| Seebs.Net <-- lawsuits, religion, and funny pictures
Fair game (Scientology) - Wikipedia <-- get educated!

(Well, Mike's answer steals my thunder a bit, but I'll put this out there for anyone else that finds this thread.)

You just have to remember that there are two levels of confusion, uh, I mean *escaping* going on here.

Within single-quoted strings, the only backslash escapes are \' (a literal single quote) and \\ (a literal backslash, needed so that you can write the escape character itself):

'\1' means a backslash and a digit (because the character following the \ is not ' or \ it doesn't have its special meaning)
'\\1' means an escaped backslash and a digit (yes, the same as above, but this time the first \ escapes the second \ so that it is interpreted literally)
'\\\1' means an escaped backslash, a backslash, and a digit (you see where this is going, right?)

Now, within the replacement string for a gsub, you can have a backslash-digit to mean "the n-th parenthesized group where the digit is n"
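
As a small illustration of the backslash-digit form on its own (the angle brackets are just an arbitrary marker chosen for this sketch), wrapping each matched character shows what \1 stands for:

```ruby
a = "Can't we?"

# '\1' is backslash + 1 in a single-quoted string; gsub replaces it with
# whatever the first parenthesized group matched.
wrapped = a.gsub(/(['?])/, '<\1>')
puts wrapped   # Can<'>t we<?>
```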

So you want to replace with "a backslash and a backslash-digit for the first group". You need the final interpretation of the replacement to end up as \ \1 (no quotes here to confuse things)

'\\' is a single literal \
'\\\\' is two characters, \\, which gsub in turn reads as one real \
'\1' is \1, because 1 isn't a special character in a single-quoted string
SO...
'\\\\\1' reaches gsub as \\\1, and gsub interprets that as "literal backslash, first group"

'\\\\\\1' reaches gsub as the same \\\1: the fifth backslash escapes the sixth, and the 1 follows
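
The two levels can be checked separately. This is only a sketch using the literals discussed above: first the characters Ruby stores for each single-quoted literal, then what gsub does with them.

```ruby
a = "Can't we?"

# Level 1: what the Ruby parser stores for each single-quoted literal.
puts '\1'.length        # 2 characters: \ 1
puts '\\1'.length       # 2 characters: \ 1     (same string as above)
puts '\\\1'.length      # 3 characters: \ \ 1
puts '\\\\\1'.length    # 4 characters: \ \ \ 1
puts '\\\\\\1'.length   # 4 characters: \ \ \ 1 (same string as above)

# Level 2: what gsub does with those characters.
puts a.gsub(/(['?])/, '\1')      # Can't we?    (group 1 only)
puts a.gsub(/(['?])/, '\\\1')    # Can\1t we\1  (literal backslash, then a plain 1)
puts a.gsub(/(['?])/, '\\\\\1')  # Can\'t we\?  (literal backslash, then group 1)
```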

You might consider using the block form of gsub replacement.

>> a = "Can't we?"
=> "Can't we?"
>> a.gsub(/(['?])/, '\\\\\\1')
=> "Can\\'t we\\?"
>> puts _
Can\'t we\?
=> nil
>> a.gsub(/['?]/) {|m| '\\' + m}
=> "Can\\'t we\\?"
>> puts _
Can\'t we\?
=> nil

In your case, the clarity that results from the simplification of the replacement should be obvious.
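
One more reason to prefer the block form, offered as a side note rather than something raised in the question: the block's return value is inserted as-is, while a replacement string is scanned by gsub for sequences such as \1 and also \' (the text after the match). A minimal sketch of the difference:

```ruby
a = "Can't we?"

# Block form: the return value is used literally, so only the
# string-literal level of escaping applies.
block_result = a.gsub(/['?]/) { |m| '\\' + m }
puts block_result    # Can\'t we\?

# String form pitfall: "\\'" holds the two characters \ and ', which gsub
# reads as "the text after the match", not as a backslash and a quote.
pitfall = "Ain't".gsub(/'/, "\\'")
puts pitfall         # Aintt
```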

-Rob

Rob Biedenharn http://agileconsultingllc.com
Rob@AgileConsultingLLC.com

Seebs wrote:

On 2010-01-25, lalawawa <usenet@ccjj.info> wrote:

irb(main):005:0> a.gsub(/([\'\?])/, '\\\\\1')
=> "Can\\'t we\\?"

This is probably what you want -- keep in mind that irb will show
you escaped backslashes if there's a single backslash in the string.

-s

You're right.

irb(main):009:0> puts a.gsub(/([\'\?])/, '\\\\\1')
Can\'t we\?
=> nil
irb(main):010:0>

Thanks

Most excellent and informative posts, Rob and Mike. I've saved them both to my "best of Ruby" folder.

You can get rid of a few backslashes if you like:

$ irb --simple-prompt
>> a = "Can't we?"
=> "Can't we?"
>> puts a.gsub(/['?]/) { |c| "\\#{c}" }
Can\'t we\?
=> nil

Mike

--

Mike Stok <mike@stok.ca>
http://www.stok.ca/~mike/

The "`Stok' disclaimers" apply.