RSS Parser Help

Gim_Ick · 19 September 2009 00:21

I am trying to parse an RSS file, and I am using the rss module to do it.

Suppose this is the data file:

  <item>
    <title>Singapore Airlines Asia Travel - A345 All Business Class to Asia</title>
    <pubDate>Fri, 18 Sep 2009 22:56:33 +0000</pubDate>
    <guid isPermaLink="false">http://delicious.com/url/cc78bfa8bb00f50825d7cac52339375d#galvezcreative</guid>
    <link>http://a345.singaporeair.com/</link>
    <dc:creator><![CDATA[galvezcreative]]></dc:creator>
    <comments>http://delicious.com/url/cc78bfa8bb00f50825d7cac52339375d</comments>
    <wfw:commentRss>http://feeds.delicious.com/v2/rss/url/cc78bfa8bb00f50825d7cac52339375d</wfw:commentRss>
    <source url="http://feeds.delicious.com/v2/rss/galvezcreative">galvezcreative's bookmarks</source>
    <category domain="http://delicious.com/galvezcreative/">Industry-Airlines</category>
    <category domain="http://delicious.com/galvezcreative/">marketing</category>
  </item>

How do I get the values of the category elements? (In the above example
they are Industry-Airlines and marketing.)

When I try rss.items[0].category, I get the entire element (in the
above case, <category domain="http://delicious.com/galvezcreative/">Industry-Airlines</category>).

--
Posted via http://www.ruby-forum.com/.

Kouhei_Sutou1 · 21 September 2009 23:29

Hi,

In <8a1c683053d981d86fb3dcea416ca87d@ruby-forum.com>
"RSS Parser Help.." on Sat, 19 Sep 2009 09:21:36 +0900,

How do I get the values of the category elements? (In the above example
they are Industry-Airlines and marketing.)

  rss.items[0].categories.each do |category|
    p category.content
  end
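
The loop above can be tried end to end with a minimal sketch like the following. It assumes the rss library is available (on recent Ruby versions it is a separate gem), and it uses a trimmed, hypothetical feed built from the item quoted in this thread. The channel title, link and description, the constant SAMPLE_FEED, and the helper category_names are made up for illustration only.

```ruby
require "rss"

# Hypothetical sample feed: the <item> is trimmed from the one quoted in
# this thread; the channel metadata is invented for illustration.
SAMPLE_FEED = <<~XML
  <?xml version="1.0" encoding="UTF-8"?>
  <rss version="2.0">
    <channel>
      <title>galvezcreative's bookmarks</title>
      <link>http://delicious.com/galvezcreative</link>
      <description>Sample channel for illustration</description>
      <item>
        <title>Singapore Airlines Asia Travel - A345 All Business Class to Asia</title>
        <link>http://a345.singaporeair.com/</link>
        <category domain="http://delicious.com/galvezcreative/">Industry-Airlines</category>
        <category domain="http://delicious.com/galvezcreative/">marketing</category>
      </item>
    </channel>
  </rss>
XML

# Hypothetical helper: returns the text of every <category> of one item.
def category_names(xml, item_index = 0)
  # The second argument turns validation off so that a trimmed feed
  # like the sample above is accepted.
  feed = RSS::Parser.parse(xml, false)
  feed.items[item_index].categories.map(&:content)
end

p category_names(SAMPLE_FEED)
```

For the item in this thread, the result should be the two category names, Industry-Airlines and marketing, as plain strings rather than XML elements.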

When I try rss.items[0].category, I get the entire element (in the
above case, <category domain="http://delicious.com/galvezcreative/">Industry-Airlines</category>).

rss.items[0].category returns a Category object, not a "<category
...>...</category>" string. (Hint: the Category object has a #to_s
method that returns the "<category ...>...</category>" string, which
is what you see when you print it.)
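
A small sketch of the difference between the object and its string form, under the same assumptions as before: the rss library is installed, and the feed below is a hypothetical one-item sample (the channel metadata and the constant name FEED_XML are invented here).

```ruby
require "rss"

# Hypothetical one-item feed; only the <category> line comes from the thread.
FEED_XML = <<~XML
  <?xml version="1.0" encoding="UTF-8"?>
  <rss version="2.0">
    <channel>
      <title>Sample channel</title>
      <link>http://delicious.com/galvezcreative</link>
      <description>Sample channel for illustration</description>
      <item>
        <title>Sample item</title>
        <category domain="http://delicious.com/galvezcreative/">Industry-Airlines</category>
      </item>
    </channel>
  </rss>
XML

feed = RSS::Parser.parse(FEED_XML, false)  # false: skip validation
category = feed.items[0].category          # first <category> of the item

puts category.class    # a class from the rss library, not String
puts category.content  # the text inside the element
puts category.domain   # the value of the domain attribute
puts category.to_s     # the element serialized back to XML
```

Printing the object with puts goes through #to_s, which is why the whole element shows up; #content and #domain read the individual parts.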

Thanks,

--
kou

Richard.Williams.20 · 6 October 2009 14:55

Alternate biterscripting script.

# Script category.txt
var str rss ; cat "file.rss" > $rss
while ( { sen -r -c "^^" $rss } > 0 )
do
    var str category ; stex -r -c "^<category&\>&</category\>^" $rss > $category
    stex -r -c "^<category&\>^]" $category > null ; stex -r -c "[^</category\>^" $category > null
    echo $category
done

For documentation on stex (string extractor) command, see
http://www.biterscripting.com/helppages/stex.html

Richard


Gim_Ick · 22 September 2009 17:02

Thanks! I was using regular expressions to do this task!
