Regexp splitting problem

Brett_S_Hallett · 29 November 2003 03:59

Hi,
I am trying to split the following line of text:

<button> "btn Exit" "Exit Button" (note: the quotes may be
" or ', read from a file)

in such a way that I can say

txt = line.split(/regrex/)

and get back

txt[0] = <button>
txt[1] = btn Exit
txt[2] = Exit Button

my current regexp

ans = tst.split(/["|']/)

does this , except that the last set is missing ! ,

txt[0] = <button>
txt[1] = btn Exit
txt[2] =

so how do I get the expression to continue processing the line ??
Thanks

Maik_Schmidt · 29 November 2003 06:32

Brett S Hallett wrote:

ans = tst.split(/["|']/)
Your regex can be simplified, because within a character class the pipe
character means “match a pipe character” and not “or”. Additionally, you
do not have to escape the quotes, so the resulting regex would be /["']/.
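
To make the point about the pipe concrete, here is a small sketch. It assumes the sample line starts with a `<button>` token (as the assertion later in the thread suggests); the variable names are only for illustration.

```ruby
line = %q{<button> "btn Exit" "Exit Button"}

# Inside [...] the pipe is an ordinary character, so this class also splits on "|".
with_pipe    = "a|b".split(/["|']/)
without_pipe = "a|b".split(/["']/)

p with_pipe     # the pipe acted as a delimiter
p without_pipe  # the pipe is kept as data

# On the sample line the simplified pattern returns four pieces,
# including the single space between the two quoted strings.
pieces = line.split(/["']/)
p pieces
```

The single-space piece is the "empty" third entry mentioned in the question; the text after it is not lost.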

does this , except that the last set is missing ! ,

That’s not totally correct. The last set isn’t missing, but the 3rd set
is empty. For easier debugging try:

puts text.split(/["']/).join("\n")

so how do I get the expression to continue processing the line ??
As mentioned before: that isn't the problem. You are searching for a
regex that splits a line into tokens. Some of the tokens are enclosed in
quotes and some are not. Both kinds of token can contain whitespace. I am
not sure whether your problem can easily be solved with a single regex. If
you can, you should change your input format.

Is the first token always enclosed in [<>] characters? Are the following
tokens always enclosed in quotes? Then it would be easier to split the
line, but you still would need more than one split call. Maybe then it
would fit in a single call of scan?
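
A rough sketch of the two-step idea, assuming the first token is a bare word such as `<button>` and every following token is quoted. It uses one split and one scan rather than two splits, and the variable names are made up for the example.

```ruby
line = %q{<button> "btn Exit" "Exit Button"}

# Step 1: peel off the first whitespace-delimited token.
first, rest = line.split(/\s+/, 2)

# Step 2: collect the contents of each quoted string from the remainder.
# The backreference \1 requires the closing quote to match the opening one.
quoted = rest.scan(/(["'])(.*?)\1/).map { |_quote, body| body }

tokens = [first, *quoted]
p tokens
```

If the input format cannot be guaranteed, this breaks down as soon as an unquoted token appears after the first one.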

Cheers,

Robert · 1 December 2003 08:17

"Brett S Hallett" dragoncity@impulse.net.au wrote in message
news:3FC81A17.7050909@impulse.net.au…

Hi,
I am trying to split the following line of text:

<button> "btn Exit" "Exit Button" (note: the quotes may be
" or ', read from a file)

in such a way that I can say

txt = line.split(/regrex/)

and get back

txt[0] = <button>
txt[1] = btn Exit
txt[2] = Exit Button

my current regexp

ans = tst.split(/["|']/)

does this , except that the last set is missing ! ,

txt[0] = <button>
txt[1] = btn Exit
txt[2] =

so how do I get the expression to continue processing the line ??

txt = line.scan(/"[^"]*" | '[^']*' | \S+/x)

robert

Alan_Chen1 · 3 December 2003 01:22

Brett S Hallett dragoncity@impulse.net.au wrote in message news:3FC81A17.7050909@impulse.net.au…

Hi,
I am trying to split the following line of text:

<button> "btn Exit" "Exit Button" (note: the quotes may be
" or ', read from a file)

in such a way that I can say

txt = line.split(/regrex/)

and get back

txt[0] = <button>
txt[1] = btn Exit
txt[2] = Exit Button

This works for your example, but may be somewhat fragile when you go
to expand its use over a wider range of inputs…

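require 'test/unit'

class TC_one < Test::Unit::TestCase
  def test_01
    # Sample line from the original question
    str = %Q/<button> "btn Exit" "Exit Button"/
    # Split on a quote, swallowing surrounding spaces and an adjacent double quote
    ans = str.split(/ *["'] *"?/)

    assert_equal(["<button>", "btn Exit", "Exit Button"], ans)
  end
end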
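
As one illustration of the fragility, here is a small sketch that runs the same pattern over a single-quoted variant of the line. The single-quoted input is an assumption based on the note that quotes may be either kind.

```ruby
pattern = / *["'] *"?/

double_quoted = %q{<button> "btn Exit" "Exit Button"}
single_quoted = %q{<button> 'btn Exit' 'Exit Button'}

from_double = double_quoted.split(pattern)
from_single = single_quoted.split(pattern)

p from_double
# The trailing "? only absorbs a double quote, so with single quotes
# an empty string appears between the two quoted tokens.
p from_single
```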

Cheers,

alan

David_A_Black3 · 1 December 2003 13:19

Hi –

On Mon, 1 Dec 2003, Robert Klemme wrote:

"Brett S Hallett" dragoncity@impulse.net.au wrote in message
news:3FC81A17.7050909@impulse.net.au…

Hi,
I am trying to split the following line of text:

<button> "btn Exit" "Exit Button" (note: the quotes may be
" or ', read from a file)

in such a way that I can say

txt = line.split(/regrex/)

and get back

txt[0] = <button>
txt[1] = btn Exit
txt[2] = Exit Button

my current regexp

ans = tst.split(/["|']/)

does this , except that the last set is missing ! ,

txt[0] = <button>
txt[1] = btn Exit
txt[2] =

so how do I get the expression to continue processing the line ??

txt = line.scan(/"[^"]*" | '[^']*' | \S+/x)

That preserves the quotation marks, which I don’t think Brett wanted.

The best I can do is the somewhat inelegant:

line.scan(/(?:["']([^"']+)["'])|(\S+)/).flatten.compact
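
A possible variation on the scan approach, offered as a sketch rather than a tested solution: a backreference forces the closing quote to match the opening one, so a quoted token may contain the other quote character. The helper name `tokenize` and the second sample line are made up for this example.

```ruby
# Hypothetical helper: returns the tokens of a line with surrounding quotes removed.
def tokenize(line)
  # Either a quoted run (group 2, closed by the same quote as group 1)
  # or a bare run of non-whitespace characters (group 3).
  line.scan(/(["'])(.*?)\1|(\S+)/).map { |_quote, quoted, bare| quoted || bare }
end

p tokenize(%q{<button> "btn Exit" 'Exit Button'})
p tokenize(%q{<button> "Brett's button"})
```

An unterminated quote falls through to the bare-word branch, so the quote character stays in the token in that case.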

David

–
David A. Black
dblack@wobblini.net
