Operate on a range of lines in a file

I’m hoping someone can jog my memory here. Say I’ve got a file which
contains the keywords “START” and “END”, with several lines in between.
Suppose I want to operate on everything between START and END (inclusive
or not, doesn’t really matter, I don’t think).

Naively, I would do one of a couple things: either use a multiline
regexp, or read the file until START is encountered, and set some sort
of “readthis” flag which signals the program to operate on the lines
until END is encountered, at which point the flag is unset.

However… I’m pretty sure that, somewhere, sometime, I saw some much
more elegant way of doing this in ruby. I said “neat!”, and promptly
forgot about it. Something involving ranges, maybe? Anybody have an idea
what I might have been thinking of, or am I misremembering?

Any help would be much appreciated. This is a great newsgroup and
mailing list; I only wish I had time to read it all.

Why not something like this?

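parsing = false
File.readlines('path/to/file').each { |line|
  # skip everything until START is seen (the START line itself is handled)
  next unless parsing || parsing = (line =~ /^START$/)
  # stop at END (the END line is not handled)
  break if line =~ /^END$/

  # handle line here
}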

It doesn’t use a range, but it’s relatively concise.
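
For comparison, here is a minimal sketch of the same flag idea wrapped in a method. The name lines_between and its default patterns are made up for illustration, and this variant leaves out both marker lines (the code above handles the START line):

```ruby
# Hypothetical helper: returns the lines strictly between the first
# START line and the END line that follows it.
def lines_between(lines, start_re = /^START$/, end_re = /^END$/)
  inside = false
  result = []
  lines.each do |line|
    if inside
      break if line =~ end_re   # stop at the first END after START
      result << line
    elsif line =~ start_re
      inside = true             # start collecting from the next line on
    end
  end
  result
end

sample = "1\nSTART\nfoo\nbar\nEND\n2\n"
p lines_between(sample.each_line)
```

Anything that responds to each and yields lines (an array, an open File, String#each_line) should work as the first argument.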


Paul Duncan pabs@pablotron.org pabs in #gah (OPN IRC)
http://www.pablotron.org/ OpenPGP Key ID: 0x82C29562

“Jim Bob” invalid@invalid.com wrote in message

However… I’m pretty sure that, somewhere, sometime, I saw some much
more elegant way of doing this in ruby. I said “neat!”, and promptly
forgot about it. Something involving ranges, maybe? Anybody have an idea
what I might have been thinking of, or am I misremembering?

You mean like this example from ruby-talk [73674] ?

while( line = gets )
  if /BEGIN:VCARD/ =~ line ... /END:VCARD/ =~ line
    puts line
  end
end

while line = gets
  if line =~ /^START$/ ... /^END$/
    puts line
  end
end

(‘puts line’ will print all lines between and including the START and END
lines)

Regards,

Brian.

On Tue, Jun 24, 2003 at 11:02:58AM +0900, Jim Bob wrote:

Thanks, everyone. That
  line =~ /START/ ... line =~ /END/
thing is exactly what I was trying to remember.

Thanks, Paul, too, for an implementation of one of the “naive” ways I
had mentioned which is, indeed, much nicer and more concise than the way
I had been doing it.

“Brian Candler” B.Candler@pobox.com wrote in message
news:20030624075527.GC57124@uk.tiscali.com

while line = gets
  if line =~ /^START$/ ... /^END$/
    puts line
  end
end

(‘puts line’ will print all lines between and including the START and END
lines)

There might be a hidden gotcha in your code if it worked for you: gets
assigns $_, and a bare regexp literal in a condition is matched against
$_, so this might be what made it work (if at all). However, it didn’t
work for me:

irb(main):054:0* lines="1\nSTART\nfoo\nEND\n2"
"1\nSTART\nfoo\nEND\n2"
irb(main):055:0>
irb(main):056:0* lines.each do |line|
irb(main):057:1* if line =~ /^START$/ ... /^END$/
irb(main):058:2> puts line
irb(main):059:2> end
irb(main):060:1> end
START
"1\nSTART\nfoo\nEND\n2"
irb(main):061:0>

IMHO this is the correct solution:

irb(main):062:0* lines.each do |line|
irb(main):063:1* if /^START$/ =~ line ... /^END$/ =~ line
irb(main):064:2> puts line
irb(main):065:2> end
irb(main):066:1> end
START
foo
END
"1\nSTART\nfoo\nEND\n2"
irb(main):067:0>
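
As a standalone sketch of the same idea, the range condition (a "flip-flop") can be put in a small method. The name start_to_end is invented for this example, and String#each_line is used instead of String#each, which newer Ruby versions no longer provide:

```ruby
# Hypothetical helper: collects every line from a START line through
# the next END line, markers included.
def start_to_end(text)
  selected = []
  text.each_line do |line|
    # Both ends of the range test `line` explicitly, so nothing
    # depends on what $_ happens to contain.
    if line =~ /^START$/ ... line =~ /^END$/
      selected << line
    end
  end
  selected
end

p start_to_end("1\nSTART\nfoo\nEND\n2")
```

The condition turns true on the line that matches START and stays true up to and including the line that matches END. With three dots the END test is not tried on the START line itself; with two dots it is.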

Regards

robert

I am having a problem with regular expression.

This is my text

oranges apples bananas

I want to break this text into words so that I can have an array with the
following elements.
array[0] = oranges
array[1] = apples
array[2] = bananas

But it is not working…

I am trying to match the pattern with this regular expression.

/.*[^\s]/

Can anybody help me with this?
Thanks,
Rob
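
A likely reason the pattern does not do what is wanted: .* is greedy and also matches spaces, so the whole line comes back as one match. A small sketch, assuming the text is held in a variable (named text here purely for illustration):

```ruby
text = "oranges apples bananas"

# .* also matches spaces and is greedy, so a single match covers
# the whole string up to the last non-whitespace character.
whole = text[/.*[^\s]/]

# Matching runs of non-whitespace instead yields one match per word.
words = text.scan(/[^\s]+/)

p whole
p words
```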

In article 20030624075527.GC57124@uk.tiscali.com,
Brian Candler B.Candler@pobox.com wrote:

while line = gets
  if line =~ /^START$/ ... /^END$/
    puts line
  end
end

Did you miss an extra line =~ there?

if line =~ /^START$/ ... line =~ /^END$/
  puts line
end

works better…

Mike

mike@stok.co.uk | The “`Stok’ disclaimers” apply.
http://www.stok.co.uk/~mike/ | GPG PGP Key 1024D/059913DA
mike@exegenix.com | Fingerprint 0570 71CD 6790 7C28 3D60
http://www.exegenix.com/ | 75D2 9EC4 C1C0 0599 13DA

I am having a problem with regular expression.

Have you tried String#split ?

irb
irb(main):001:0> s = "oranges apples bananas"
=> "oranges apples bananas"
irb(main):002:0> s.split(/ /)
=> ["oranges", "apples", "bananas"]
irb(main):003:0>

On Tue, 24 Jun 2003 20:45:43 +0900, Rob wrote:


Simon Strandgaard

Robert Klemme wrote:

“Brian Candler” B.Candler@pobox.com wrote in message
news:20030624075527.GC57124@uk.tiscali.com

I’m hoping someone can jog my memory here. Say I’ve got a file which
contains the keywords “START” and “END”, with several lines in between.
Suppose I want to operate on everything between START and END (inclusive
or not, doesn’t really matter, I don’t think).

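Read the file a block at a time, much like paragraph mode: set $/ (the
input record separator) to "END", so that each read returns everything up
to and including the next END.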

Sample source file:

test.txt

START
This is some text
This is some more text
foo
END
START
bar
baz
blahblahblah
END
START
Last paragraph
Hello World
END

Sample source

paratest.rb

file = "test.txt"
$/ = "END"   # input record separator: each record now ends at "END"
n = 0

IO.foreach(file){ |para|
  print "\nPARA #: #{n}\n"
  puts "========="
  puts para
  n += 1
}

Sample output

ruby paratest.rb

PARA #: 0
=========
START
This is some text
This is some more text
foo
END

PARA #: 1
=========

START
bar
baz
blahblahblah
END

PARA #: 2
=========

START
Last paragraph
Hello World
END

PARA #: 3
=========

You may need to deal with the last element properly.
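
One possible way to deal with that last element, sketched on a string rather than a file so it runs on its own. The helper name blocks is made up for this example; the separator is passed as an argument instead of changing the global $/:

```ruby
# Hypothetical helper: splits text into records ending at `sep`,
# trims surrounding whitespace, and drops the empty record left
# over after the final separator.
def blocks(text, sep = "END")
  text.each_line(sep).map(&:strip).reject(&:empty?)
end

sample = "START\nfoo\nEND\nSTART\nbar\nbaz\nEND\n"
blocks(sample).each_with_index do |para, n|
  puts "PARA #: #{n}"
  puts para
end
```

For a file, IO.foreach accepts the separator as a second argument in the same way, e.g. IO.foreach(file, "END"). Note that the separator is matched anywhere in the text, so a line that merely contains "END" as part of a longer word would also end a record.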

Regards,

Dan

“Simon Strandgaard” 0bz63fz3m1qt3001@sneakemail.com wrote in
message news:pan.2003.06.24.11.06.57.495092@sneakemail.com

I am having a problem with regular expression.

Have you tried String#split ?

irb
irb(main):001:0> s = "oranges apples bananas"
=> "oranges apples bananas"
irb(main):002:0> s.split(/ /)
=> ["oranges", "apples", "bananas"]
irb(main):003:0>

That’s the negative approach (btw: I’d prefer to use /\s+/ instead of / /
because it’s safer). A positive approach could be

irb(main):001:0> s = "oranges apples bananas"
"oranges apples bananas"
irb(main):002:0> s.scan /\w+/
["oranges", "apples", "bananas"]
irb(main):003:0>

As always, there are many roads to the target… :-)
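
To illustrate why / / is the less safe choice, here is a small sketch on an input with irregular whitespace. The sample string and variable names are invented for the example:

```ruby
s = "oranges  apples\tbananas"   # two spaces and a tab, on purpose

by_space      = s.split(/ /)     # splits on every single space
by_whitespace = s.split(/\s+/)   # splits on runs of any whitespace
by_default    = s.split          # no argument: also splits on whitespace runs
by_scan       = s.scan(/\w+/)    # collects runs of word characters

p by_space        # contains an empty string, and the tab is not split
p by_whitespace
p by_default
p by_scan

# \w does not match an apostrophe, so scan(/\w+/) breaks such words apart.
p "don't stop".scan(/\w+/)
```

On this input the last three give the same three words, while split(/ /) does not. Which of them fits best depends on what counts as a word in the data.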

robert