Backslash sequences \1, \2 in regexes (backreferences)

Is this behavior documented anywhere:

1)
puts "fred:smith".gsub(/(\w+):(\w+)/, '\2, \1')

--output:--
smith, fred

2)
puts "abc".gsub(/a(b)(c)/, "a\2\1")

--output:--
a

The double quotes surrounding the replacement string cause the backslash
sequences to stop working. With single quotes the backslash sequences
work. I can't find anything in pickaxe2 about that. My understanding
was that double quotes allowed for more substitutions than single
quotes. This appears to be a case where double quotes allow fewer
substitutions than single quotes.

--
Posted via http://www.ruby-forum.com/.

The double quotes turn \1 and \2 into single characters before gsub ever sees them.

ratdog:~ mike$ ruby -e 'puts "abc".gsub(/a(b)(c)/, "a\2\1")' | od -c
0000000 a 002 001 \n
0000004

ratdog:~ mike$ irb
irb(main):001:0> 'a\1\2'.length
=> 5
irb(main):002:0> "a\1\2".length
=> 3
irb(main):003:0> "a\2\1"
=> "a\002\001"

The \2 and \1 are each converted into a single character inside the double quotes.

Table 22.2 in The Basic Types says \nnn goes to octal nnn, and here you can see that 8 (not a valid octal digit) doesn't get treated the same way as 1 and 2:

irb(main):004:0> "a\2\1\8"
=> "a\002\0018"
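
A small sketch of the same point, comparing the bytes of the two literals (the variable names are only for illustration, and this assumes a Ruby recent enough that String#bytes returns an array):

```ruby
# Single quotes keep the backslashes as literal characters.
single = 'a\2\1'
# Double quotes read \2 and \1 as octal escapes for character codes 2 and 1.
double = "a\2\1"

single_bytes = single.bytes   # 97 is "a", 92 is "\", 50 is "2", 49 is "1"
double_bytes = double.bytes   # 97 is "a", then the raw codes 2 and 1

p single_bytes
p double_bytes
```

Only the single-quoted version still contains a backslash for gsub to interpret as a backreference.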

Hope this helps,

Mike

On 1-Nov-07, at 9:47 PM, 7stud -- wrote:

Is this behavior documented anywhere:

--

Mike Stok <mike@stok.ca>
http://www.stok.ca/~mike/

The "`Stok' disclaimers" apply.

Yes. In many Ruby books, in at least one Ruby FAQ, and many, many
times on the ruby mailing list/forum/newsgroup.

On Nov 1, 7:47 pm, bbxx789_0...@yahoo.com wrote:

Is this behavior documented anywhere:

Mike Stok wrote:

Table 22.2 in The Basic Types says \nnn goes to Octal nnn,

Ah. So, \1 and \2 are interpreted as octal character codes. I was
using the following puts statement to debug:

puts "abc".gsub(/a(b)(c)/, "a\2\1") + "<---"

--output:--
a<---

I should have been using:

p "abc".gsub(/a(b)(c)/, "a\2\1")

--output:--
"a\002\001"

Since the ASCII codes 1 and 2 represent non-printable characters, I got
no output for them using puts.
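
A rough way to check for invisible characters without depending on how a terminal renders them is to look at the length and the bytes of the result (a sketch; the exact text that p or inspect prints for control characters can differ between Ruby versions, so it is not asserted here):

```ruby
result = "abc".gsub(/a(b)(c)/, "a\2\1")

# puts writes the control characters raw, so they may be invisible.
# The length and byte values show that they are present.
result_length = result.length
result_bytes  = result.bytes

puts result_length
p result_bytes
```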

My question stemmed from this passage about gsub() in pickaxe2 on p.
613:

"If a string is used as the replacement, special variables from the
match (such as $& and $1) cannot be substituted into it, as the
substitution into the string occurs before the pattern match starts.
However, the sequences \1, \2 and so on may be used to interpolate
successive groups in the match."

That makes it sound like \1 and \2 can be freely used in the replacement
string. There is no mention of the fact that single quotes are required
to keep them from being interpreted as chars written in octal. That
description is very misleading.
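
For comparison, here is a sketch of the two forms the passage is contrasting. The block form is not taken from the quoted text; it is standard Ruby, where the block runs after each match, so $1 and $2 are already set when the string inside it is built:

```ruby
# String replacement: \1 and \2 are handled by gsub itself,
# so the backslashes must survive into the string (single quotes here).
swapped_string = "fred:smith".gsub(/(\w+):(\w+)/, '\2, \1')

# Block replacement: the block is evaluated per match, so $1 and $2 can be
# interpolated in a double-quoted string.
swapped_block = "fred:smith".gsub(/(\w+):(\w+)/) { "#{$2}, #{$1}" }

puts swapped_string
puts swapped_block
```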

No, it's not. That single quotes are required has nothing to do with gsub. It's something you should know from your understanding of how the Ruby interpreter handles double-quoted strings. As Mike Stok said, the string literal is converted to "a\002\001" long before gsub is called.
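
One way to see that gsub is not involved is to build the replacement first and inspect it before any call to gsub (a sketch with illustrative variable names):

```ruby
# The literal is processed when the string is created; gsub is never called.
double_quoted = "a\2\1"
single_quoted = 'a\2\1'

# A backreference needs a literal backslash in the replacement string.
double_has_backslash = double_quoted.include?("\\")
single_has_backslash = single_quoted.include?("\\")

p double_has_backslash
p single_has_backslash
```

The double-quoted string contains no backslash at all, so there is nothing left for gsub to treat as \1 or \2.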

Regards, Morton

Morton Goldberg wrote:

On Nov 2, 2007, at 12:59 AM, 7stud -- wrote:

That makes it sound like \1 and \2 can be freely used in the replacement
string. There is no mention of the fact that single quotes are required
to keep them from being interpreted as chars written in octal. That
description is very misleading.

No, it's not. That single quotes are required has nothing to do with
gsub. It's something you should know from your understanding of how
the Ruby interpreter handles double-quoted strings. As Mike Stok said,
the string literal is converted to "a\002\001" long before gsub is
called.

Regards, Morton

You should simply double the backslashes, which works with both single and double quotes:

irb(main):001:0> puts "fred:smith".gsub(/(\w+):(\w+)/, '\\2, \\1')
smith, fred
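
A short sketch of why the doubled backslash is the safe habit: both quoting styles then produce the same replacement string (variable names are illustrative only):

```ruby
# In both quoting styles, \\ yields one literal backslash.
single_doubled = '\\2, \\1'
double_doubled = "\\2, \\1"

# Both are identical to the plain single-quoted form '\2, \1'.
all_equal = single_doubled == double_doubled && single_doubled == '\2, \1'

swapped = "fred:smith".gsub(/(\w+):(\w+)/, double_doubled)

p all_equal
puts swapped
```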

Wolfgang Nádasi-Donner
