RSS Parser Help

I am trying to parse an RSS file, using the rss module.

Suppose this is the data file:

  <item>
      <title>Singapore Airlines Asia Travel - A345 All Business Class to Asia</title>
      <pubDate>Fri, 18 Sep 2009 22:56:33 +0000</pubDate>
      <guid isPermaLink="false">http://delicious.com/url/cc78bfa8bb00f50825d7cac52339375d#galvezcreative</guid>
      <link>http://a345.singaporeair.com/</link>
      <dc:creator><![CDATA[galvezcreative]]></dc:creator>
      <comments>http://delicious.com/url/cc78bfa8bb00f50825d7cac52339375d</comments>
      <wfw:commentRss>http://feeds.delicious.com/v2/rss/url/cc78bfa8bb00f50825d7cac52339375d</wfw:commentRss>
      <source url="http://feeds.delicious.com/v2/rss/galvezcreative">galvezcreative's bookmarks</source>
      <category domain="http://delicious.com/galvezcreative/">Industry-Airlines</category>
      <category domain="http://delicious.com/galvezcreative/">marketing</category>
  </item>

How do I get the value of each category? (In the above example, the
values are Industry-Airlines and marketing.)

When I try rss.items[0].category, I get the entire element (in the
above case, <category
domain="http://delicious.com/galvezcreative/">Industry-Airlines</category>).

···

--
Posted via http://www.ruby-forum.com/.

Hi,

In <8a1c683053d981d86fb3dcea416ca87d@ruby-forum.com>
  "RSS Parser Help.." on Sat, 19 Sep 2009 09:21:36 +0900,

How do I get the value of each category? (In the above example, the
values are Industry-Airlines and marketing.)

  rss.items[0].categories.each do |category|
    p category.content
  end

When I try rss.items[0].category, I get the entire element (in the
above case, <category
domain="http://delicious.com/galvezcreative/">Industry-Airlines</category>).

rss.items[0].category returns a Category object, not a "<category
...>...</category>" string. (Hint: the Category object has a #to_s
method that returns the "<category ...>...</category>" string, which
is why printing it shows the entire element.)
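For anyone who wants to try this end to end, here is a minimal sketch of the approach above. It assumes the rss library is available (on recent Ruby versions it ships as a separate gem) and wraps a trimmed copy of the item from the question in a made-up channel; FEED_XML and the category_names helper are hypothetical names used only for illustration.

```ruby
require 'rss'

# Hypothetical sample feed: the <item> from the question, trimmed,
# inside a made-up <channel> so that it forms a complete RSS 2.0 document.
FEED_XML = <<~XML
  <?xml version="1.0" encoding="UTF-8"?>
  <rss version="2.0">
    <channel>
      <title>galvezcreative's bookmarks</title>
      <link>http://delicious.com/galvezcreative/</link>
      <description>Sample feed for illustration</description>
      <item>
        <title>Singapore Airlines Asia Travel - A345 All Business Class to Asia</title>
        <link>http://a345.singaporeair.com/</link>
        <category domain="http://delicious.com/galvezcreative/">Industry-Airlines</category>
        <category domain="http://delicious.com/galvezcreative/">marketing</category>
      </item>
    </channel>
  </rss>
XML

# Hypothetical helper: collects the text of every <category> in an item.
def category_names(item)
  item.categories.map(&:content)
end

# The second argument (false) turns off validation.
rss = RSS::Parser.parse(FEED_XML, false)
item = rss.items[0]

category_names(item).each { |name| puts name }

# item.category is only the first Category object.
# #content is its text, #domain its attribute, and #to_s the whole element.
puts item.category.content
puts item.category.domain
puts item.category.to_s
```

The helper uses categories rather than category because an item may carry several <category> elements, and category only gives you the first one.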

Thanks,

···

Gim Ick <gimmickivek@gmail.com> wrote:
--
kou

Here is an alternative, written as a biterscripting script.

# Script category.txt
var str rss ; cat "file.rss" > $rss
while ( { sen -r -c "^^" $rss } > 0 )
do
    var str category ; stex -r -c "^<category&\>&</category\>^" $rss > $category
    stex -r -c "^<category&\>^]" $category > null ; stex -r -c "[^</category\>^" $category > null
    echo $category
done

For documentation on stex (string extractor) command, see
http://www.biterscripting.com/helppages/stex.html

Richard

···

Thanks! I was using regular expressions to do this task!

···
