String#split: unexpected result

irb(main):001:0> "aabbcc".split(/bb/)
=> ["aa", "cc"]
irb(main):002:0> "aabbcc".split(/(bb)/)
=> ["aa", "bb", "cc"]
irb(main):003:0> "aabbcc".split(/bb(c)?/)
=> ["aa", "c", "c"]

The last two results are unexpected to me. Can anybody explain them?

#split also returns the text matched by the capturing groups of the split pattern, if the pattern has any:

irb(main):008:0> "aabbcc".split(/bb(c)?/)
=> ["aa", "c", "c"]
irb(main):009:0> "aabbcc".split(/bb(?:c)?/)
=> ["aa", "c"]
irb(main):010:0> "aabbcc".split(/(bb(?:c)?)/)
=> ["aa", "bbc", "c"]

Regards

  robert

On 31.10.2006 10:49, Kirill Shutemov wrote:

irb(main):001:0> "aabbcc".split(/bb/)
=> ["aa", "cc"]
irb(main):002:0> "aabbcc".split(/(bb)/)
=> ["aa", "bb", "cc"]
irb(main):003:0> "aabbcc".split(/bb(c)?/)
=> ["aa", "c", "c"]

The last two results are unexpected to me. Can anybody explain them?

  You have a capturing group in each of the last two REs. In this case, split also
returns what the capturing groups matched. For instance:

str = "abcd"
=> "abcd"
p str.split(/(b)/)
["a", "b", "cd"]

  The string is split into "a" and "cd", and in the middle you get the
result of the capturing group, "b". If you do not want that, use a
non-capturing group: (?:c) instead of (c).

  Cheers !

  Vince

--
Vincent Fourmond, PhD student
http://vincent.fourmond.neuf.fr/

If there are any capturing groups in the delimiter pattern, their matches are output as well.
This is not documented in RDoc, only in the PickAxe book.

It's useful when you want to keep the delimiters.
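
A small sketch of that use case, assuming a simple arithmetic string as input (the names expr and tokens are only for illustration):

```ruby
expr = "1+2-3"

# Without a capturing group the operators are thrown away.
p expr.split(/[+-]/)      # ["1", "2", "3"]

# With a capturing group the operators stay in the result, in order.
tokens = expr.split(/([+-])/)
p tokens                  # ["1", "+", "2", "-", "3"]
```

Wrapping the whole delimiter in parentheses is enough here; any further grouping inside it can use (?:...) so that it does not add extra entries.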
