What is the best way of attacking field split on ';' when the string looks like:
s = 'a;b;c\;;d;'
s.split(/???;/)
=> ["a", "b", "c\;", "d"]
Or is it best to use s.each_byte and do it by hand?
Normally this would call for a fixed-width negative lookbehind,
/(?<!\\);/
but as far as I know it's not included in the Ruby regexp engine.
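A note for anyone reading this on a newer Ruby: the regexp engine that ships with Ruby 1.9 and later does accept negative lookbehind, so the pattern above can be passed to split directly. A minimal sketch, assuming such a Ruby and assuming a backslash can never itself be escaped (the variable name fields is only for illustration):

```ruby
# Sample string from the original question; in single quotes, \; stays
# as a literal backslash followed by a semicolon.
s = 'a;b;c\;;d;'

# Split on every ';' that is not immediately preceded by a backslash.
# String#split drops trailing empty fields by default.
fields = s.split(/(?<!\\);/)

p fields
```

The escaping backslash is left in the field. This does not handle an escaped backslash directly in front of a separator.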
But for further clarification:
How should 'a;b\\;;c' be split?
If backslashes can be escaped (and you'd want that, because otherwise you can't have a field "b\"), it's more difficult.
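If backslashes can be escaped, one way to do it "by hand" is a small character-by-character loop. The following is only a sketch under my own assumptions: split_escaped is a hypothetical helper (not a core method), a backslash escapes whatever single character follows it, the escaping backslash is removed from the output, and a dangling backslash at the end of the input is silently dropped.

```ruby
# Hypothetical helper: split str on sep, honouring backslash escapes.
def split_escaped(str, sep = ';')
  fields  = [String.new]   # start with one empty, mutable field
  escaped = false

  str.each_char do |ch|
    if escaped
      fields.last << ch    # escaped character is taken literally
      escaped = false
    elsif ch == '\\'
      escaped = true       # next character is escaped
    elsif ch == sep
      fields << String.new # unescaped separator starts a new field
    else
      fields.last << ch
    end
  end

  fields
end

# Input characters: a ; b \ \ ; ; c
p split_escaped("a;b\\\\;;c")
# Input characters: a ; b ; c \ ; ; d ;
p split_escaped('a;b;c\;;d;')
```

Under these assumptions the first input gives the fields a, b\ (b followed by one backslash), an empty field, and c. Unlike String#split, the loop keeps empty fields, including a trailing one.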
On Thursday 30 September 2004 23:29, Simon Strandgaard wrote:
On Thursday 30 September 2004 23:15, Mark Probert wrote:
> Hi, Rubyists.
>
> What is the best way of attacking field split on ';' when the string
> looks like:
>
> s = 'a;b;c\;;d;'
> s.split(/???;/)
> => ["a", "b", "c\;", "d"]
>
> Or is it best to use s.each_byte and do it by hand?
"Mark Probert" <probertm@nospam-acm.org> wrote in message
news:Xns95749654816D0probertmnospamtelusn@198.161.157.145...
Hi ..
>
> How about something ala
>
> irb(main):015:0> "aa;bbb\\;;abc;;d\\\\;e;".scan(/(?:\\[^.]|[^;])*;/)
> => ["aa;", "bbb\\;;", "abc;", ";", "d\\\\;", "e;"]
>
Thanks! That is close enough:
irb(main):019:0> s.scan(/(?:\\[^.]|[^;])*/).each do |it|
irb(main):020:1* next if it.empty?
irb(main):021:1> puts " --> #{it}"
irb(main):022:1> end
--> a is a word
--> b is too
--> c\; for fun
--> d -- forget it
=> ["a is a word", "", "b is too", "", "c\\; for fun", "", "d -- forget it", "", ""]
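The empty strings in that result come from the * quantifier, which also matches zero characters at each separator. A possible variation, offered only as a sketch: use + so that empty matches are never produced, and write the escape as a backslash followed by any character (\\.) rather than \\[^.], which refuses a backslash followed by a literal dot. The sample string below is my guess at the s used in the session above; the name fields is illustrative.

```ruby
# Assumed input, reconstructed from the output shown in the irb session.
s = 'a is a word;b is too;c\; for fun;d -- forget it;'

# One or more of: a backslash plus any character (the /m flag lets "."
# match a newline too), or any character that is neither ';' nor '\'.
fields = s.scan(/(?:\\.|[^;\\])+/m)

fields.each { |field| puts " --> #{field}" }
```

Like the next if it.empty? guard, this drops genuinely empty fields, so it only fits when empty fields carry no meaning.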