Extracting multiple lines from a file

See the second argument to IO#readlines. Or you can set $/ manually,
same as Perl. :)

Regards,

Dan
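For readers following along, here is a minimal sketch of what the separator argument does. The sample data and variable names are made up for illustration, and StringIO stands in for a real file. It shows that a custom separator changes where each "line" ends, so some trimming is still needed to isolate the text between the markers.

```ruby
require "stringio"

# Hypothetical sample data standing in for the file from the question.
data = "header\nBEGIN_FOO\nSome foo stuff here.\nMore foo stuff.\nEND_FOO\ntrailer\n"

# With a separator argument, each "line" ends at END_FOO instead of at "\n".
records = StringIO.new(data).readlines("END_FOO\n")

# The first record runs from the start of the data through END_FOO, so keep
# only what follows the BEGIN_FOO line and drop the END_FOO terminator.
foo_block = records.first.split("BEGIN_FOO\n", 2).last.chomp("END_FOO\n")

puts foo_block
```

Setting $/ has the same effect globally, which is why passing the separator explicitly is usually the less surprising choice.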

···

-----Original Message-----
From: Ron Coutts [mailto:rcoutts@envistatech.com]
Sent: Monday, December 15, 2003 2:14 PM
To: ruby-talk ML
Subject: Extracting multiple lines from a file

This should be an easy question for Ruby-heads but I can’t
seem to easily find the answer. I need to extract multiple
lines from a file where the lines are delimited by some other
text. For example, with file contents as shown below I’d
like to extract the content between the BEGIN_FOO and END_FOO
lines. Can anyone tell me how to do that?

BEGIN_FOO
Some foo stuff here.
More foo stuff.
END_FOO

I’ve done this before in Perl (can’t remember the syntax) so
I’m thinking it should be possible in Ruby.

Ron

Thanks for the quick reply Dan, but I may not have been clear as to what
I’m trying to do. IO#readlines allows one to read lines and specify the
line delimiter (which is usually ‘\n’). I’m trying to extract the
content between the lines BEGIN_FOO and END_FOO within a file where the
lines are delimited with ‘\n’.

As I look at the problem further, it seems like Ranges may be part of
the solution. In the Pickaxe book by Dave Thomas and Andrew Hunt they
give this example where a file (named “ordinal”) containing the words
‘one’ through ‘ten’, one word per line, is parsed (page 84):

file = File.open("ordinal")
while file.gets
  print if /third/ .. /fifth/
end

Unfortunately this code snippet is buggy as it prints out all ten lines
of the file, not just lines three through five as mentioned in the book.
So I’m still stuck on this problem.

I’ve seen Dave Thomas post on the list. If you read this Dave, do you
have any pointers for me or the fix for the code snippet above? Any
help would be appreciated.

Thanks,
Ron

···

-----Original Message-----
From: Berger, Daniel [mailto:djberge@qwest.com]
Sent: December 15, 2003 2:53 PM
To: ruby-talk ML
Subject: Re: Extracting multiple lines from a file


stuff = %r|^BEGIN_FOO$(.+)^END_FOO$|m

file = File.new("file", "r").read

file =~ stuff

p clean_stuff = $1
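One thing worth checking with this pattern: (.+) is greedy, so if the file holds more than one FOO block the capture runs from the first BEGIN_FOO to the last END_FOO. The sketch below uses made-up content to compare it with a non-greedy (.+?); the variable names are illustrative only.

```ruby
# Hypothetical content with two FOO blocks separated by unrelated text.
content = "BEGIN_FOO\na\nEND_FOO\nother\nBEGIN_FOO\nb\nEND_FOO\n"

# Greedy: the capture extends to the LAST END_FOO line.
greedy = content[%r|^BEGIN_FOO$(.+)^END_FOO$|m, 1]

# Non-greedy: the capture stops at the FIRST END_FOO line.
lazy = content[%r|^BEGIN_FOO$(.+?)^END_FOO$|m, 1]

p greedy
p lazy
```

With a single block, as in the original question, both forms capture the same text.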



General Electric - CIAT
Advanced Engineering Center


Rodrigo Bermejo
Information Technologies.
Special Applications
Dial-comm : *879-0644
Phone :(+52) 442-196-0644

Yeah - Ruby 1.8 changed this. You could try

file = File.open("ordinal")
while line = file.gets
  print if line =~ /3/ .. line =~ /5/
end

If the file is a reasonable size, it might be easier to do

content = File.read("ordinal")
content.scan(/^START(.*?)^END/m) do |match,|
  p match
end

This reads in the whole file, then calls the block passing in the
chunks of it between START and END.
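A self-contained sketch of the same idea, using an in-memory string instead of a file so it can be run as-is. The sample content and the chunks array are assumptions for illustration.

```ruby
# Hypothetical content; START and END are the marker lines from the snippet above.
content = "START\none\ntwo\nEND\nignored\nSTART\nthree\nEND\n"

chunks = []
# scan yields an array of capture groups per match; |match,| takes the first.
content.scan(/^START(.*?)^END/m) do |match,|
  chunks << match.strip
end

p chunks
```

The non-greedy (.*?) is what keeps each match from running on into the next START/END pair.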

Cheers

Dave

···

On Dec 15, 2003, at 16:16, Ron Coutts wrote:

file = File.open("ordinal")
while file.gets
  print if /third/ .. /fifth/
end

Unfortunately this code snippet is buggy as it prints out all ten lines
of the file, not just lines three through five as mentioned in the
book.

That’s one of the things that changed between Ruby 1.6 (which is what’s
documented in the Pickaxe book) and 1.8. There’s a list of the changes here:

ftp://ftp.ruby-lang.org/pub/ruby/1.8/changes.1.8.0

You can use the example with one small modification:
replace the bare regexes with ~ + regex:

file = File.open("ordinal")
while file.gets
    print if ~/third/ .. ~/fifth/
end

-Mark
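To make the range-as-condition behaviour concrete, here is a small sketch that avoids $_ altogether by matching an explicit line variable on both sides of the range. The word list is made up to stand in for the "ordinal" file, and StringIO replaces the real file.

```ruby
require "stringio"

# Hypothetical stand-in for the "ordinal" file, one word per line.
ordinals = %w[first second third fourth fifth sixth].join("\n") + "\n"

selected = []
StringIO.new(ordinals).each_line do |line|
  # The range condition switches on when the left test matches and
  # switches off after the right test matches.
  selected << line.chomp if (line =~ /third/)..(line =~ /fifth/)
end

p selected
```

Spelling out the match on both sides means the result does not depend on how a given Ruby version treats bare regex literals in a condition.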

···

On Tue, Dec 16, 2003 at 07:16:23AM +0900, Ron Coutts wrote:

file = File.open("ordinal")
while file.gets
  print if /third/ .. /fifth/
end

Unfortunately this code snippet is buggy as it prints out all ten lines
of the file, not just lines three through five as mentioned in the book.
So I’m still stuck on this problem.

If the file is a reasonable size, it might be easier to do

content = File.read("ordinal")
content.scan(/^START(.*?)^END/m) do |match,|
  p match
end

This reads in the whole file, then calls the block passing in the
chunks of it between START and END.

Thanks Dave and Rodrigo for your regex solutions. The file is small
enough so the solution above is good. I got the part that reads the
block content down to one line because I have multiple content blocks
to extract from the same file. Here it is.

content = File.read("ordinal")
block_content = content.scan(/BEGIN\n(.*?)END\n/m).join
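When several differently named blocks live in one file, the one-liner can be wrapped in a small helper. This is only a sketch: extract_blocks is a hypothetical name, and it assumes the BEGIN_NAME / END_NAME marker convention from the original question.

```ruby
# Hypothetical helper: returns the body of every BEGIN_<name>/END_<name> block.
def extract_blocks(content, name)
  marker = Regexp.escape(name) # guard against regex metacharacters in the name
  content.scan(/^BEGIN_#{marker}\n(.*?)^END_#{marker}$/m).flatten
end

# Made-up sample content with two FOO blocks and one BAR block.
content = "BEGIN_FOO\nfoo 1\nEND_FOO\nBEGIN_BAR\nbar 1\nEND_BAR\nBEGIN_FOO\nfoo 2\nEND_FOO\n"

foo_blocks = extract_blocks(content, "FOO")
bar_blocks = extract_blocks(content, "BAR")

p foo_blocks
p bar_blocks
```

Returning an array rather than a joined string keeps the blocks separate; call .join on the result to get the concatenated form used above.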

If anyone’s doing a Ruby Cookbook, please throw in this solution, or one
like it! :)

Thanks all for your quick replies. Much appreciated!

Ron

P.S. I was using Ruby 1.8.0

Who maintains that list? It would appear that the behavior of all the
“arity” methods has changed as well, as discussed in a previous thread
here. I don’t see that on the list anywhere.

Derek Lewis

···

On Tue, 16 Dec 2003, Mark J. Reed wrote:

That’s one of the things that changed between Ruby 1.6 (which is what’s
documented in the Pickaxe book) and 1.8. There’s a list of the changes here:

ftp://ftp.ruby-lang.org/pub/ruby/1.8/changes.1.8.0

===================================================================
Java Web-Application Developer

  Email    : email@lewisd.com
  Cellular : 604.312.2846
  Website  : http://www.lewisd.com

“If you’ve got a 5000-line JSP page that has “all in one” support
for three input forms and four follow-up screens, all controlled
by “if” statements in scriptlets, well … please don’t show it
to me :-). Its almost dinner time, and I don’t want to lose my
appetite :-).”
- Craig R. McClanahan

Hi,

···

In message “Re: Extracting multiple lines from a file” on 03/12/17, Derek Lewis lewisd@f00f.net writes:

ftp://ftp.ruby-lang.org/pub/ruby/1.8/changes.1.8.0

Who maintains that list?

By you-know-who that is incredibly stupid and so easy to forget
things. Worse, he hasn’t maintained the file at all for 1.8.1.
His name starts with “m” and ends with “z”.

I always want replacement, but it’s a tough job.

						matz.

Do you have the latest ‘ri’ (from sf/net/projects/rdoc)? It’s got the
PickAxe content, updated for 1.8.

Fairly soon now this will also be available in the core Ruby
distribution.

Cheers

Dave

···

On Dec 16, 2003, at 10:27, Derek Lewis wrote:

On Tue, 16 Dec 2003, Mark J. Reed wrote:

That’s one of the things that changed between Ruby 1.6 (which is
what’s
documented in the Pickaxe book) and 1.8. There’s a list of the
changes here:

ftp://ftp.ruby-lang.org/pub/ruby/1.8/changes.1.8.0

Who maintains that list? It would appear that the behavior of all the
“arity” methods has changed as well, as discussed in a previous thread
here. I don’t see that on the list anywhere.

Unless I’ve done something wrong, it looks like Proc.arity still has the
old behavior. I downloaded ri-1.8b.tgz.

(Notice I’m also getting some strange error from ri.rb)

lewisd@derlewi:~$ ri -v
/usr/local/lib/site_ruby/ri/ri.rb:404: warning: don't put space before
argument parentheses
ri 1.8a
lewisd@derlewi:~$ ri Proc.arity

Proc.new {|a|}.arity #=> -1

lewisd@derlewi:~$ irb
irb(main):001:0> require 'rbconfig'
=> true
irb(main):002:0> include Config
=> Object
irb(main):003:0> CONFIG['ruby_version']
=> "1.8"
irb(main):004:0> Proc.new {|a|}.arity
=> 1
irb(main):005:0> ^D
lewisd@derlewi:~$ ruby --version
ruby 1.8.1 (2003-11-11) [i386-linux]
lewisd@derlewi:~$

Derek Lewis
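A quick way to check what the interpreter itself reports, rather than relying on the ri text, is to ask a few procs directly. The sketch below assumes a reasonably current Ruby; the variable names are illustrative, and only the single-argument result is confirmed for 1.8.1 by the irb session above.

```ruby
# One required block parameter.
one_arg = Proc.new { |a| }.arity

# Two required block parameters.
two_args = Proc.new { |a, b| }.arity

# A splat accepts any number of arguments, which is reported as a negative arity.
splat = Proc.new { |*a| }.arity

p [one_arg, two_args, splat]
```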

···

On Wed, 17 Dec 2003, Dave Thomas wrote:

Do you have the latest ‘ri’ (from sf/net/projects/rdoc). It’s got the
PickAxe content, updated for 1.8.


Dave Thomas dave@pragprog.com wrote in message news:4F6A14B4-2FEA-11D8-BFF4-000A95676A62@pragprog.com

[snip]

Do you have the latest ‘ri’ (from sf/net/projects/rdoc). It’s got the
PickAxe content, updated for 1.8.

Fairly soon now this will also be available in the core Ruby
distribution.

Dave,

I’m sorry to back up to a rather old thread, but I’m not able to find
the latest ri, the one with the PickAxe content, updated for 1.8

If I check RDoc project on SourceForge, I see that File Releases are
rather old, while browsing CVS I have found nothing related to ri.

Where should I look for?

Thanks a lot.
AA

I’m sorry to back up to a rather old thread, but I’m not able to find
the latest ri, the one with the PickAxe content, updated for 1.8

Where should I look for?

It’s in the latest Ruby CVS (from ruby-lang).

As of today, all the built-in methods apart from those in Process are
integrated into the Ruby source tree. If you download the latest CVS,
install it, and then in the main source directory issue the command

 rdoc --ri --all *.c

You’ll get them installed and available via ‘ri’.

It’s still a work in progress, but it’s coming along... :)

Cheers

Dave

···

On Dec 30, 2003, at 11:41, Alfio Astanti wrote:

Dave Thomas wrote:

As of today, all the built-in methods apart from those in Process are
integrated into the Ruby source tree. If you download the latest CVS,
install it, and then in the main source directory issue the command

rdoc --ri --all *.c

When I try that (on Windows 2000), rdoc seems to happily run through all the source code, but then when it gets to the ri generation stage, it dies with:

Generating RI...
d:/ruby-1.8.1/lib/ruby/1.9/yaml.rb:17:in `require': No such file to load -- yaml/parser (LoadError)
        from d:/ruby-1.8.1/lib/ruby/1.9/yaml.rb:17
        from d:/ruby-1.8.1/lib/ruby/1.9/rdoc/ri/ri_descriptions.rb:1:in `require'
        from d:/ruby-1.8.1/lib/ruby/1.9/rdoc/ri/ri_descriptions.rb:1
        from d:/ruby-1.8.1/lib/ruby/1.9/rdoc/ri/ri_reader.rb:1:in `require'
        from d:/ruby-1.8.1/lib/ruby/1.9/rdoc/ri/ri_reader.rb:1
        from d:/ruby-1.8.1/lib/ruby/1.9/rdoc/generators/ri_generator.rb:46:in `require'
        from d:/ruby-1.8.1/lib/ruby/1.9/rdoc/generators/ri_generator.rb:46
        from d:/ruby-1.8.1/lib/ruby/1.9/rdoc/rdoc.rb:194:in `require'
        from d:/ruby-1.8.1/lib/ruby/1.9/rdoc/rdoc.rb:194:in `document'
        from d:/ruby-1.8.1/bin/rdoc.bat:70

I did a search from the 1.9 directory down for parser.rb and found a number, but not one that would be seen as yaml/parser.rb; can you suggest what I need to do to fix the rdoc-ing? Or is it just that you haven’t had a chance to get it working on Windows yet? In that case, I’ll just wait, especially since I’m sure /\ndy will give us a new year’s present at some stage :-).

Cheers,

Harry O.

Just curious, will Ruby one day be distributed with the ‘ri’ data
included, as per the above command?

And is there a way to make ‘ri’ list the classes that it knows about?

Gavin

···

On Wednesday, December 31, 2003, 4:58:29 AM, Dave wrote:


Dave Thomas dave@pragprog.com wrote in message news:C4D6A53C-3AF1-11D8-A299-000A95676A62@pragprog.com

···

On Dec 30, 2003, at 11:41, Alfio Astanti wrote:

I’m sorry to back up to a rather old thread, but I’m not able to find
the latest ri, the one with the PickAxe content, updated for 1.8

Where should I look for?

It’s in the latest Ruby CVS (from ruby-lang).

[snip]

Thanks a lot, Dave, I’ll take a look.

Happy new year to everybody!
AA

Dave Thomas wrote:

As of today, all the built-in methods apart from those in Process are
integrated into the Ruby source tree. If you download the latest CVS,
install it, and then in the main source directory issue the command
rdoc --ri --all *.c

When I try that (on Windows 2000), rdoc seems to happily run through
all the source code, but then when it gets to the ri generation stage,
it dies with:

Generating RI...
d:/ruby-1.8.1/lib/ruby/1.9/yaml.rb:17:in `require': No such file to load -- yaml/parser (LoadError)
        from d:/ruby-1.8.1/lib/ruby/1.9/yaml.rb:17
        from d:/ruby-1.8.1/lib/ruby/1.9/rdoc/ri/ri_descriptions.rb:1:in `require'

That looks like a problem with yaml/syck. why??

Cheers

Dave

···

On Dec 30, 2003, at 17:56, Harry Ohlsen wrote:

Just curious, will Ruby one day be distributed with the ‘ri’ data
included, as per the above command?

I haven’t discussed this with Matz, but my personal hope is that the
installation process will also install the documentation.

And is there a way to make ‘ri’ list the classes that it knows about?

type ‘ri’

Cheers

Dave

···

On Dec 30, 2003, at 21:46, Gavin Sinclair wrote:

Dave Thomas wrote:

That looks like a problem with yaml/syck. why??

Would that be included in the nightly snapshot?

It could also be that I’ve updated Ruby on this machine quite a few times and there’s some kind of path problem. I’ll play with it or wait for an official /\ndy release.

Cheers,

H.