Bonita
Hi
I'm using hpricot to parse the following file.
<item
rdf:about="http://del.icio.us/url/50666d1a3fe2b942b20819ec2919d2b7#morwyn">
<title>[from morwyn] * HTML for the Conceptually Challenged</title>
<link>http://del.icio.us/url/50666d1a3fe2b942b20819ec2919d2b7#morwyn</link>
<description>HTML for the Conceptually Challenged. Very basic tutorial,
plainly worded for people who hate to read instructions.</description>
<dc:creator>morwyn</dc:creator>
<dc:date>2006-10-10T07:28:28Z</dc:date>
<dc:subject>html imported webpagedesign</dc:subject>
<taxo:topics>
<rdf:Bag>
<rdf:li resource="http://del.icio.us/tag/imported" />
<rdf:li resource="http://del.icio.us/tag/html" />
<rdf:li resource="http://del.icio.us/tag/webpagedesign" />
</rdf:Bag>
</taxo:topics>
</item>
I'm trying to get the content of <dc:subject> like this:
doc = Hpricot.parse(File.read("965.xhtml"))
(doc/"item").each do |t|
  puts (t/"dc:subject").innerTEXT
end
but I got
<dc:subject>html internet tutorial web</dc:subject>
while I only need "html internet tutorial web".
Does anyone know the right function to call?
Thanks
--
Posted via http://www.ruby-forum.com/.
Sorry for deleting your text :(
Maybe you can try:
puts (t/"dc:subject").text
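The difference between an element's markup and its text content can be seen without installing Hpricot. The following is only a rough sketch that swaps in REXML from Ruby's standard distribution, so it is not the Hpricot API discussed in this thread. The wrapping <rdf:RDF> root and its xmlns declarations, as well as the names SAMPLE_RSS, subject_markup and subject_texts, are assumptions made up for this illustration.

```ruby
require "rexml/document"

# Hypothetical sample: part of the <item> from the question, wrapped in a
# root element that declares the namespace prefixes (declarations assumed).
SAMPLE_RSS = <<~XML
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
           xmlns:dc="http://purl.org/dc/elements/1.1/">
    <item rdf:about="http://del.icio.us/url/50666d1a3fe2b942b20819ec2919d2b7#morwyn">
      <title>[from morwyn] * HTML for the Conceptually Challenged</title>
      <dc:creator>morwyn</dc:creator>
      <dc:subject>html imported webpagedesign</dc:subject>
    </item>
  </rdf:RDF>
XML

# Serialises the first <dc:subject> element, tags included.
# This corresponds to the unwanted output in the question.
def subject_markup(xml)
  doc = REXML::Document.new(xml)
  item = REXML::XPath.first(doc, "//item")
  item.elements["dc:subject"].to_s
end

# Collects only the text inside each item's <dc:subject>.
def subject_texts(xml)
  doc = REXML::Document.new(xml)
  subjects = []
  REXML::XPath.each(doc, "//item") do |item|
    subject = item.elements["dc:subject"]
    subjects << subject.text if subject # skip items without a subject
  end
  subjects
end

puts subject_markup(SAMPLE_RSS) # element with its tags
puts subject_texts(SAMPLE_RSS)  # text content only
```

In REXML, Element#to_s includes the tags while Element#text returns just the character data, which is the same split as the one between the output in the question and the .text call suggested above.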
puts (t/'dc:subject').text
Sorry for the double post, but I shouldn't have copied and pasted the result
directly from irb :(
On Apr 13, 12:40 pm, kikij...@gmail.com wrote:
On Apr 13, 9:48 am, Bonita <abbo...@yahoo.com.tw> wrote:
>> puts (t/'dc:subject').text