Hi,
I'm new to Ruby and at present I'm trying to write some sample code that
extracts HTML and stores it in a database. The following is the code
for that:
require 'rubygems'
require 'open-uri'
require 'hpricot'
require 'dbi'

doc = Hpricot(open("http://www.google.com/"))
(doc/"a").each do |link|
  if link.attributes['class'] == 'gb1'
    # Note: this is the link's text, not its URL.
    href = link.inner_text.strip
    con = DBI.connect("DBI:Mysql:Sample:localhost", "arunkumar", "123456")
    stat = con.prepare("Insert into hello values ('', ?, 'GK', 71, 'female')")
    stat.execute(href)
    stat.finish
    con.commit
    puts "Records have been inserted"
  else
    puts "Sorry! No matches found."
  end
end
When I execute it, everything works except for a warning:
/usr/lib/ruby/gems/1.8/gems/hpricot-0.6.164/lib/hpricot/builder.rb:26:
warning: `&' interpreted as argument prefix
I don't know what went wrong to cause this warning. Please
help.
You may want to switch to Mechanize instead. Kill two birds with one stone.
On Mar 15, 2009, at 21:14 , Arun Kumar wrote:
Hi,
I'm new to Ruby and at present I'm trying to write some sample code that
extracts HTML and stores it in a database. The following is the code
for that:
Apparently Google's pages contain very complex HTML, to relieve strain on their servers, and Hpricot does not "sanitize" its input. That k variable might contain an &, which Ruby then warns about. instance_variable_set() creates an instance variable, like this:
@foo = v
where 'foo' was in k. But if k contains '&foo', you get this:
@&foo = v
You can't write that in raw Ruby, so instance_variable_set() is warning you that you should not write it in "meta-programming" Ruby either.
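For anyone who wants to poke at this, here is a minimal sketch of how instance_variable_set() treats names. The Tag class and the variable names are made up for illustration and are not taken from Hpricot. As far as I know, current Ruby versions reject a name such as "@&foo" with a NameError rather than a warning:

```ruby
# Hypothetical stand-in for an element class; not Hpricot's real code.
class Tag
end

tag = Tag.new

# A valid name: equivalent to writing @foo = "bar" inside the object.
tag.instance_variable_set("@foo", "bar")
foo_value = tag.instance_variable_get("@foo")

# An invalid name: Ruby refuses to create the variable.
begin
  tag.instance_variable_set("@&foo", "bar")
  rejected = false
rescue NameError
  rejected = true
end

puts foo_value
puts "invalid name rejected: #{rejected}"
```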
But none of this is your fault: It's a bug in Hpricot, which Google's advanced HTML uncovered.
The conclusion: Switch to Nokogiri. It has an Hpricot compatibility mode, but its internal engine is libxml2, which is one of the industry's leading XML (and therefore HTML) implementations.
If you modify the line the warning points at (builder.rb:26) to
ele.instance_eval(&blk)
the warning is gone.
Regards,
Park Heesob
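To illustrate the difference, here is a small sketch. The two helper methods are hypothetical and only mimic the shape of the call; they are not copied from Hpricot's builder.rb. Whether the warning is actually printed depends on the Ruby version and on whether warnings are enabled (for example with ruby -w):

```ruby
# Hypothetical helpers; names are illustrative only.

# Without parentheses, the parser has to guess whether `&` is the binary
# operator or a block-argument prefix, and may warn about the ambiguity.
def build_ambiguous(ele, &blk)
  ele.instance_eval &blk
end

# With parentheses there is nothing to guess, so no warning is produced.
def build_explicit(ele, &blk)
  ele.instance_eval(&blk)
end

a = build_ambiguous("abc") { upcase }
b = build_explicit("abc") { upcase }

puts a
puts b
```

Both forms behave the same at run time; the parentheses only remove the ambiguity the parser complains about.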
On Mon, 16 Mar 2009 03:16:57 -0500, Heesob Park wrote:
If you modify it to
ele.instance_eval(&blk)
The warning is gone.
Regards,
Park Heesob
--
Chanoch (Ken) Bloom. PhD candidate. Linguistic Cognition Laboratory.
Department of Computer Science. Illinois Institute of Technology. http://www.iit.edu/~kbloom1/
I've said it before and I'll say it again: It isn't fixed until it is
released.
OK. Somehow I thought that assigning a version number meant it was a
release.