In my test document I have:
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
I know this declaration is wrong, because Lynx reports the page as UTF-8, so I
want to change the content attribute like this:
meta = doc.at_xpath("/html/head/meta")
meta['content'] = "text/html; charset=UTF-8" if !meta.nil? &&
  meta['http-equiv'].downcase == 'content-type'
However, printing meta always gives:
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
In other words, nothing changes.
Is this behaviour caused by the meta tag not being self-closed, i.e.
not ending with " />"?
If so, could I unlink all the meta tags and create a correct one?
Is there no quicker solution?
···
--
"Everyone knew it was impossible to do. Then one day
someone came along who did not know that, and he did it."
(Winston Churchill)
This doesn't work either:
metas = doc.xpath("/html/head/meta")
metas.each { |x| x.unlink }
meta = Nokogiri::XML::Node.new("meta", doc)
meta['http-equiv'] = "content-type"
meta['content'] = "text/html; charset=UTF-8"
Printing the head tag gives:
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<title>...</title>
<link href="..." type="text/css">
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
</head>
So I get the meta tag twice: the old one and the new (correct) one.
My first Nokogiri line is:
doc = Nokogiri::HTML(html) { |config| config.noblanks.noent }
The original HTML starts with:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
where the meta tag is properly closed with " />".
···
Une Bévue <unbewusst.sein@fai.invalid> wrote:
If so, could I unlink all the meta tags and create a correct one?
Have you tried using `Document#meta_encoding=`?
See http://nokogiri.org/search?q=meta_encoding%3D
Also, prefer nokogiri-talk to ruby-talk for Nokogiri questions. Thank you!
···
2011/5/31 Une Bévue <unbewusst.sein@fai.invalid>
Have you tried using `Document#meta_encoding=` ?
No, I'll look into it.
In the meantime I worked around it using a gsub.
See http://nokogiri.org/search?q=meta_encoding%3D
Also, prefer nokogiri-talk to ruby-talk for Nokogiri questions. Thank you!
OK, I've subscribed to that list.
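The gsub workaround itself was not posted. One way it could look, as a rough sketch only: rewrite the declared charset in the raw HTML string before handing it to the parser. The constant `META_CHARSET` and the helper `fix_declared_charset` are hypothetical names, and the pattern assumes the http-equiv attribute comes before the content attribute, as in the tag quoted in the question.

```ruby
# Matches everything in a Content-Type meta tag up to the charset value
# (captured as group 1), followed by the charset name itself.
# Assumption: http-equiv appears before content inside the tag.
META_CHARSET = /(<meta\b[^>]*http-equiv\s*=\s*["']?content-type["']?[^>]*charset\s*=\s*)[\w-]+/i

# Hypothetical helper: replace the declared charset, leave everything else as is.
def fix_declared_charset(html, charset = 'UTF-8')
  html.gsub(META_CHARSET) { "#{Regexp.last_match(1)}#{charset}" }
end

original = '<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />'
puts fix_declared_charset(original)
```

This only changes the declaration; it does not transcode the document's bytes, which matches the situation described here, where the content is already UTF-8 and only the label is wrong.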
···
Mike Dalessio <mike.dalessio@gmail.com> wrote: