String#split: unexpected result

Kirill_Shutemov · 31 October 2006 09:49

irb(main):001:0> "aabbcc".split(/bb/)
=> ["aa", "cc"]
irb(main):002:0> "aabbcc".split(/(bb)/)
=> ["aa", "bb", "cc"]
irb(main):003:0> "aabbcc".split(/bb(c)?/)
=> ["aa", "c", "c"]

The last two results are unexpected to me. Can anybody explain them?

Robert_K1 · 31 October 2006 10:05

#split also returns the text matched by any capturing groups in the split pattern. Compare a capturing group, a non-capturing group, and a group around the whole delimiter:

irb(main):008:0> "aabbcc".split(/bb(c)?/)
=> ["aa", "c", "c"]
irb(main):009:0> "aabbcc".split(/bb(?:c)?/)
=> ["aa", "c"]
irb(main):010:0> "aabbcc".split(/(bb(?:c)?)/)
=> ["aa", "bbc", "c"]

Regards

robert


Vincent_Fourmond · 31 October 2006 10:08

Kirill Shutemov wrote:

irb(main):001:0> "aabbcc".split(/bb/)
=> ["aa", "cc"]
irb(main):002:0> "aabbcc".split(/(bb)/)
=> ["aa", "bb", "cc"]
irb(main):003:0> "aabbcc".split(/bb(c)?/)
=> ["aa", "c", "c"]

The last two results are unexpected to me. Can anybody explain them?

You have a capturing group in each of the last two REs. In this case, split
also returns the text matched by the capturing groups. For instance:

str = "abcd"
=> "abcd"
p str.split(/(b)/)
["a", "b", "cd"]

The string is split into "a" and "cd", and in the middle you get the
result of the capturing group, "b". If you do not want that extra element,
use a non-capturing group, (?:c) instead of (c).
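
The same reasoning can be applied to the original /bb(c)? / case by looking at what a single match consumes. This is only a rough sketch: the variable names are made up for illustration, and it uses String#match to inspect the first match rather than showing how split is implemented.

```ruby
s = "aabbcc"
m = s.match(/bb(c)?/)

delimiter = m[0]          # the whole match, which split treats as the separator
captured  = m[1]          # the text of the capturing group (c)
before    = m.pre_match   # everything to the left of the separator
after     = m.post_match  # everything to the right of the separator

# The separator swallows "bbc", so only one "c" is left over;
# the other "c" in the result comes from the capturing group.
p delimiter
p [before, captured, after]

result = s.split(/bb(c)?/)
p result
```

So the middle "c" is the captured text and the final "c" is the remainder of the string.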

Cheers!

Vince

--
Vincent Fourmond, PhD student
http://vincent.fourmond.neuf.fr/

Jan_Svitok · 31 October 2006 10:09

If the delimiter regexp contains any capturing groups, the text they match is output as well.
This is not documented in RDoc, only in the PickAxe book.

It's useful when you want to keep the delimiters.
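
A small sketch of that use case, with an input string and variable names chosen purely for illustration: wrapping the whole delimiter in a capturing group keeps the operators in the result.

```ruby
expr = "1+2-3"

# Without a capturing group the operators are discarded.
without_ops = expr.split(/[+-]/)

# With the delimiter wrapped in a capturing group, each operator
# appears in the result between the pieces it separated.
with_ops = expr.split(/([+-])/)

p without_ops
p with_ops
```

Joining with_ops back together gives the original string, which is handy for simple tokenizing.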

