I've been at this for a few hours now and am not getting much further.
I want to read a bunch of URLs, each listed on a separate line in a txt
file.
I then want to take each of these URLs and run the same step (a
Nokogiri parse) on every one of them.
What I'm trying now is:
f = File.open("file.txt", "r")
f.each_line do |lijn|
searchableurl = Nokogiri::HTML (lijn)
It doesn't raise an error, but it doesn't work either.
Is this the wrong way of getting and then using each url?
Does it have to do with linebreaks?
f = File.open("file.txt", "r")
f.each_line do |lijn|
Better to use the block forms:
File.foreach("file.txt") do |line|
end
This way the file is properly closed.
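As a rough sketch of the difference, assuming a small temporary file stands in for file.txt (the sample URLs and the variable names are made up for illustration):

```ruby
require 'tempfile'

# Hypothetical sample file standing in for file.txt.
sample = Tempfile.new('urls')
sample.write("http://example.com/a\nhttp://example.com/b\n")
sample.close

# File.foreach yields one line at a time and closes the file when done.
urls = []
File.foreach(sample.path) do |line|
  urls << line.chomp # drop the trailing newline
end

# File.open with a block also closes the file once the block returns.
handle = nil
line_count = File.open(sample.path, 'r') do |f|
  handle = f
  f.each_line.count
end

puts urls.inspect
puts "lines: #{line_count}, closed: #{handle.closed?}"
```

Both forms read the file line by line, so the whole file is never loaded into memory at once.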
searchableurl = Nokogiri::HTML (lijn)
That method receives the HTML string, not the URL. You need to read
its contents first:
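require 'nokogiri'
require 'open-uri'
# strip drops the trailing newline; on Ruby 3.0+ use URI.open instead of open
Nokogiri::HTML(URI.open(line.strip))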
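Putting the two pieces together might look roughly like the sketch below. The helper names each_url and fetch_document are invented for this example, and it assumes a Ruby version where URI.open is available (2.5 or later) and that the nokogiri gem is installed when fetch_document is called:

```ruby
require 'open-uri'

# Hypothetical helper: yields each non-empty line of the file with
# surrounding whitespace (including the newline) removed.
def each_url(path)
  return enum_for(:each_url, path) unless block_given?

  File.foreach(path) do |line|
    url = line.strip
    yield url unless url.empty?
  end
end

# Hypothetical helper: downloads one page and parses it with Nokogiri.
# Returns nil when the request fails with an HTTP error or a DNS failure.
def fetch_document(url)
  require 'nokogiri' # loaded here so each_url works without the gem
  Nokogiri::HTML(URI.open(url).read)
rescue OpenURI::HTTPError, SocketError => e
  warn "skipping #{url}: #{e.message}"
  nil
end

# Usage (not run here, since it needs network access):
#   each_url("file.txt") do |url|
#     doc = fetch_document(url)
#     puts doc.title if doc
#   end
```

Stripping each line matters because every line read from the file still ends in a newline, which is not part of the URL.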
Jesus.
On Tue, Oct 9, 2012 at 4:51 PM, Sybren Kooistra <lists@ruby-forum.com> wrote:
One more question: since I'm going to use this method on a txt file with
hundreds of thousands of URLs, is this the best method, or does it hold
too much in memory?