====================================
def look_for_begin
  while line = gets
    if line =~ /^begin/
      puts line
      # return
    end
  end
end

ARGF.each { look_for_begin }
====================================
I have 188 files in all: files with uuencoded and yencoded
data, plus some text-only files. Together they come to
about 16 MB.
The tool needs 3.6 seconds to look for the /^begin/
in all files.
When using exceptions, or break, or return (see the
comment above) to stop reading the file after a /^begin/
was found, I got no speedup!
I tried Perl, OCaml and C, and all are a lot faster.
OK, if Ruby is slower, so it is.... and I have to live
with that.
But what I can NOT accept is that the code takes the same
time with or without the statements that stop
further reading of the files!
====================================
def look_for_begin
  while line = gets
    if line =~ /^begin/
      puts line
      # return
    end
  end
end

ARGF.each { look_for_begin }
====================================
The problem is that you've got two loops here. ARGF.each calls
look_for_begin once for each line of each file passed in. Then within
look_for_begin, it has another loop that runs until there are no more
lines to process. So what happens is this: without the return
statement, the look_for_begin function is called once, and its while
loop runs through all of the lines until there are no more to
process. The function is not called again, because the ARGF.each loop
terminates immediately, because all lines have been read.
If you put in the return, the while loop runs until it finds the first
"begin". Then the function returns. Then the ARGF.each loop calls
look_for_begin again, and it picks up where it left off, processing the
line after the one where "begin" was found.
So, either way, your function processes every line of every file. The
only difference you cause by adding and removing the return statement
is whether it processes all of the lines in one call to look_for_begin,
or over multiple calls.
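The effect of the two nested loops can be reproduced without real files. The following is only a sketch under stated assumptions: a StringIO stands in for ARGF, `each_line` plays the role of `ARGF.each`, `input.gets` plays the role of the bare `gets`, and the helper name `scan` and the sample text are made up for illustration. It counts how often the inner loop is entered and how many lines are consumed in total.

```ruby
require 'stringio'

# Hypothetical stand-in for the original script: `input.each_line` plays the
# role of ARGF.each, and `input.gets` plays the role of the bare `gets`.
# Returns [times the inner loop was entered, total lines consumed].
def scan(input, stop_at_begin)
  calls = 0
  lines_consumed = 0
  input.each_line do |_line|        # outer loop: consumes one line per pass
    lines_consumed += 1
    calls += 1
    while (line = input.gets)       # inner loop: reads from the same stream
      lines_consumed += 1
      break if stop_at_begin && line =~ /^begin/
    end
  end
  [calls, lines_consumed]
end

SAMPLE = "a\nbegin 1\nb\nbegin 2\nc\nd\n"

p scan(StringIO.new(SAMPLE), false)  # => [1, 6]
p scan(StringIO.new(SAMPLE), true)   # => [3, 6]
```

Both runs consume all six lines; stopping at "begin" only changes how many times the inner loop is entered. Note also that, as in the original script, the line consumed by the outer loop is never tested against the pattern.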
I think what you wanted to do is use ARGV.each instead of ARGF.each, to
iterate over the list of file names, and pass each file name into the
look_for_begin function. Within the function, you'd process only the
lines in that file. In other words, like this:
def look_for_begin(fn)
  IO.foreach(fn) do |line|
    if line =~ /^begin/
      puts line
      return
    end
  end
end

ARGV.each {|fn| look_for_begin(fn) }
I think Oliver wanted to iterate over all lines in the files whose
names were given as command-line arguments. Something like:
ARGF.each do |line|
  if line =~ /^begin/
    puts line
    break
  end
end
Both of these look as if they would look for *all*
occurrences of "begin", not just the first one.
I am also thinking of looking only at the first 1000 lines or so...
Ciao,
Oliver
P.S.: But I have now also found files that contain more than one
uuencoded section...
... so maybe reading the files completely could also make sense...
(I hadn't found such files before, so I thought it would make
sense to read only until the first occurrence of /^begin/)
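If files can contain more than one uuencoded section, one option is to collect every matching line per file instead of stopping at the first. This is only a sketch: the helper name `all_begin_lines` is made up for illustration, and it assumes the file names are passed on the command line.

```ruby
# Hypothetical helper: return every line in the file at `path` that starts
# with "begin", in order of appearance. Reads the whole file.
def all_begin_lines(path)
  File.foreach(path).grep(/^begin/)
end

# Print each match prefixed with the name of the file it came from.
ARGV.each do |path|
  all_begin_lines(path).each { |line| puts "#{path}: #{line}" }
end
```

This gives up the early exit, so it always costs a full read of each file.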
> Hi --
>
>
>> Oliver Bandel wrote:
>>
>>> Hello,
>>>
>>>
>>> The Code:
>>>
>>> ====================================
>>> def look_for_begin
>>> while line = gets
>>> if line =~ /^begin/
>>> puts line
>>> # return
>>> end
>>> end
>>> end
>>>
>>> ARGF.each { look_for_begin }
>>> ====================================
>>
>>
>> puts ARGV.map{|f|IO.readlines(f).find{|s|s=~/^begin/}}
>
>
> Or maybe:
>
> puts ARGF.find {|s| /^begin/.match(s) }
No, this only finds one instance. Mine finds the first
in each file.
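The difference between the two one-liners can be spelled out with two small helpers. This is a sketch, not either poster's code: the helper names are made up, `File.foreach` is swapped in for `IO.readlines` so that reading stops at the first match, and the second helper only imitates a single find over ARGF by walking the files in order.

```ruby
# First line starting with "begin" in each file; nil for a file with no match.
def first_begin_per_file(paths)
  paths.map { |path| File.foreach(path).find { |line| line =~ /^begin/ } }
end

# First line starting with "begin" across all files read as one stream,
# which imitates what a single find over ARGF does.
def first_begin_overall(paths)
  paths.each do |path|
    File.foreach(path) { |line| return line if line =~ /^begin/ }
  end
  nil
end

puts first_begin_per_file(ARGV)
puts first_begin_overall(ARGV)
```

Given two files that each contain a "begin" line, the first helper returns two lines and the second returns only one.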
[...]
> Both of these look as if they would look for *all*
> occurrences of "begin", not just the first one.
You know too little of Ruby to tell what the code will do
just by looking at it. Try both if you want to know what
they will do.
> I am also thinking of looking only at the first 1000 lines or so...
ARGV.each {|f|
  count = 0
  IO.foreach(f) {|line|
    if line =~ /^begin/
      print line
      break
    end
    count += 1
    break if 1000 == count
  }
}
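The same line limit can also be written without a manual counter. The following is a sketch of an alternative, not the code posted above: the helper name `first_begin_within` and its `limit` parameter are made up, and it relies on a lazy enumerator so that reading stops at the first match or after `limit` lines, whichever comes first.

```ruby
# Hypothetical helper: first line starting with "begin" among the first
# `limit` lines of the file at `path`, or nil if there is none.
def first_begin_within(path, limit = 1000)
  File.foreach(path).lazy.take(limit).find { |line| line =~ /^begin/ }
end

ARGV.each do |path|
  line = first_begin_within(path)
  print line if line
end
```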
dblack@wobblini.net wrote:
> On Sat, 26 Aug 2006, William James wrote:
> Both of these look as if they would look for *all*
> occurrences of "begin", not just the first one.
Well, if you know how Enumerable#find works, then they look like they
find the first one. (Though, as William pointed out, my code
answers the wrong question, because it only finds one for all the
files instead of one for each.)
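For readers unsure about Enumerable#find, a small illustration on an in-memory array may help. The sample lines and the helper name `find_with_count` are made up; the counter only exists to show that find stops calling the block once a match is found, in contrast to select, which visits every element.

```ruby
LINES = ["text\n", "begin 644 a.txt\n", "more\n", "begin 644 b.txt\n"].freeze

# Returns [first matching line, number of lines the block was called on].
def find_with_count(lines)
  checked = 0
  match = lines.find { |s| checked += 1; /^begin/.match(s) }
  [match, checked]
end

p find_with_count(LINES)
# => ["begin 644 a.txt\n", 2]

p LINES.select { |s| s =~ /^begin/ }
# => ["begin 644 a.txt\n", "begin 644 b.txt\n"]
```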