What am I missing here (apologies if this is just my 7am brain fog
speaking):
The contents of my file (test) are:
1
2
3
4
5
6
f = File.open('test')
f.lineno = 4
puts f.readline # <= This returns 1, instead of 4 as expected
I'm trying to do fast lookups in a file based on line number, but this isn't
working like I expected it to. 1) What am I missing here? 2) If this is not
what lineno=() is designed for, then what is it supposed to be used for? 3)
Is there another easy way to do fast lookups by line number? Thanks.
> f = File.open('test')
> f.lineno = 4
> puts f.readline # <= This returns 1, instead of 4 as expected [snip]
If I remember correctly, the docs say that lineno= just changes the current
value of lineno; it does not move the read position. If you check the result
of f.lineno after your f.readline above, you'll get 5. Not sure how useful
that behavior is, but that's how it's defined....
I suggest that you read the file into memory and split it by lines
(File#readlines IIRC, not near the manual now).
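To make the point above concrete, here is a small sketch of the behavior. The helper name lineno_demo and the use of a temp file are my own assumptions for illustration; the six-line content mirrors the test file from the question.

```ruby
require "tempfile"

# Hypothetical helper for this demo: writes the six-line sample file from the
# question to a temp file, sets lineno, reads one line, and reports the result.
def lineno_demo
  Tempfile.create("lineno_demo") do |f|
    f.write("1\n2\n3\n4\n5\n6\n")
    f.rewind                 # back to byte 0; rewind also resets lineno to 0

    f.lineno = 4             # changes only the counter, not the read position
    first = f.readline       # still reads the first line of the file
    { line: first, lineno_after: f.lineno }
  end
end

result = lineno_demo
puts result[:line]           # the first line of the file, not the fourth
puts result[:lineno_after]   # the counter was 4, and one readline makes it 5
```

So the counter and the file position are independent: lineno only reports how many lines have been read since it was last set.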
It is impossible to do a fast lookup by line number without building an index
first (i.e. loading everything into an array, or mapping each line number to
its byte position), because otherwise the program has to count newlines from
the start of the file to find the line. So the simplest approach would be
DATA = File.read("file").split(/\n/)
and then use
DATA[linenumber-1]
to access the data. Try to restructure your program in such a way that
it does not need to read the file repeatedly. You could even go overboard and
create a multiton class that reads the file into memory and checks on
each access whether the file has changed. Something like this (untested):
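class DataFile
  @@files = {}
  # One shared instance per file name (the "multiton" part).
  def self.open(file)
    @@files[file] ||= new(file)
  end
  def initialize(filename)
    @filename = filename
    reread
  end
  # Returns the line at the given 0-based index, rereading the file
  # first if its modification time has changed.
  def read(line)
    mt = File.mtime(@filename)
    reread if @mt.nil? || @mt < mt
    @mt = mt
    @data[line]
  end
  private
  def reread
    @data = File.read(@filename).split(/\n/)
  end
  # Force callers to go through DataFile.open.
  private_class_method :new
end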
But that was more for the fun of it.
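The other kind of index mentioned above, line number to byte position, avoids keeping the file contents in memory. A minimal sketch, assuming one full pass over the file up front is acceptable; the class name LineIndex and its methods are hypothetical, not something from the standard library:

```ruby
# Hypothetical class: remembers the byte offset at which every line starts,
# so that a line can later be fetched with a single seek.
class LineIndex
  def initialize(path)
    @path = path
    @offsets = []
    File.open(path, "rb") do |f|
      until f.eof?
        @offsets << f.pos    # byte position where the next line begins
        f.gets               # skip over the line itself
      end
    end
  end

  # Number of lines found while building the index.
  def size
    @offsets.size
  end

  # Returns line number n (1-based), or nil if the file has no such line.
  def line(n)
    return nil if n < 1
    offset = @offsets[n - 1]
    return nil if offset.nil?
    File.open(@path, "rb") do |f|
      f.seek(offset)         # jump straight to the start of the line
      f.gets
    end
  end
end
```

Usage would be something like idx = LineIndex.new("file") followed by idx.line(4). Note that the offsets array holds one integer per line, so for a very large file it is itself sizeable; indexing only every Nth line and reading forward from there is one possible compromise.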
Brian
On 18/11/05, Belorion <belorion@gmail.com> wrote:
[snip]
> If I remember correctly, the docs say that lineno= just changes the current
> value of lineno -- if you check the result of f.lineno after your
> f.readline above, you'll get 5. Not sure how useful that behavior is,
> but that's how it's defined....
I noticed that ... but, what, exactly is that useful for? (as you already
questioned) It seems like I am missing something here, because otherwise
lineno=() seems useless and misleading.
> Suggest that you read the file into memory and split it by lines
> (File#readlines IIRC, not near the manual now).
The only problem with that is I need to query, say, only 1000 lines in a
file with 195_199_572 lines in it.
> > If I remember correctly, the docs say that lineno= just changes the current
> > value of lineno -- if you check the result of f.lineno after your
> > f.readline above, you'll get 5. Not sure how useful that behavior is,
> > but that's how it's defined....
> I noticed that ... but, what, exactly is that useful for? (as you already
> questioned) It seems like I am missing something here, because otherwise
> lineno=() seems useless and misleading.
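It could be used if you have already read part of a file by other means
(e.g. with read) and want to update lineno manually. E.g.
File.open("f") do |f|
  header = f.read(1024)            # read does not update lineno
  f.lineno = header.count("\n")    # so set it to the number of newlines consumed
  do_something_with_f_that_needs_linenumbers(f)
end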
> > Suggest that you read the file into memory and split it by lines
> > (File#readlines IIRC, not near the manual now).
>
> The only problem with that is I need to query, say, only 1000 lines in a
> file with 195_199_572 lines in it.
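Are the lines you need all known in advance? Then you could do it like this:
lines = [1, 2, 4, 8, 16, 32, 1024]
lines = lines.sort.reverse           # smallest wanted line number ends up last
File.open("file") do |f|
  while line = f.gets
    if f.lineno == lines.last        # compare line numbers, not line contents
      puts line
      lines.pop
      break if lines.empty?          # nothing left to find, stop reading
    end
  end
end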