Parsing an XML document

angela_ebirim · 18 August 2016 12:38

Hi,

Hoping someone can help me.

What is the best way to parse an XML document? I've been using Nokogiri and
I'm still having difficulty extracting the data I need.

For example:

<Disruption id="134309">
   <status>Active</status>
   <severity>Moderate</severity>
   <levelOfInterest>Medium</levelOfInterest>
   <category>Infrastructure Issue</category>
   <subCategory>Traffic Signal</subCategory>
   <startTime>2016-08-17T20:30:00Z</startTime>
   <location>[A3212] ABCD Embankment (W3) (KKKKKKKK &
CCCCCCCCC)</location>
   <corridor>Western Cross Route</corridor>
   <comments>A3212] ABCD Embankment (W3) (All Directions) at the junction
of Albert Bridge - Traffic lights are all out</comments>
   <currentUpdate>Approach with care</currentUpdate>
   <remarkTime>2016-08-17T20:32:19Z</remarkTime>
   <lastModTime>2016-08-17T20:34:11Z</lastModTime>
   <CauseArea>
    <DisplayPoint>
     <Point>
      <coordinatesEN>527352.119,177652.696</coordinatesEN>
      <coordinatesLL>-.167329,51.483426</coordinatesLL>
     </Point>
</Disruption id>

If I wanted to extract the coordinatesEN data from the sample, what is the
best way to do this?

Many thanks

Karim_Tarek · 18 August 2016 13:00

Have you tried https://github.com/jnunemaker/crack (really simple JSON and
XML parsing, ripped from Merb and Rails)? It will parse your XML file into a
Ruby hash, and you can then take it from there.

angela_ebirim · 18 August 2016 13:06

sounds perfect!

thanks karim!

Robert_K1 · 18 August 2016 13:38

Hoping someone can help me.

Yes, we can!

What is the best way to parse an XML document? I've been using Nokogiri and
I'm still having difficulty extracting this data

I suggest digging into XPath, e.g. here: XML and XPath

If I wanted to extract the coordinatesEN data from the sample, what is the
best way to do this?

It is actually fairly easy:

require 'nokogiri'

# Well-formed version of the sample: "&" is escaped as "&amp;" and every
# element is closed, so the document also parses in strict mode.
doc = Nokogiri.XML(<<'DOC')
<Disruption id="134309">
   <status>Active</status>
   <severity>Moderate</severity>
   <levelOfInterest>Medium</levelOfInterest>
   <category>Infrastructure Issue</category>
   <subCategory>Traffic Signal</subCategory>
   <startTime>2016-08-17T20:30:00Z</startTime>
   <location>[A3212] ABCD Embankment (W3) (KKKKKKKK &amp; CCCCCCCCC)</location>
   <corridor>Western Cross Route</corridor>
   <comments>A3212] ABCD Embankment (W3) (All Directions) at the
junction of Albert Bridge - Traffic lights are all out</comments>
   <currentUpdate>Approach with care</currentUpdate>
   <remarkTime>2016-08-17T20:32:19Z</remarkTime>
   <lastModTime>2016-08-17T20:34:11Z</lastModTime>
   <CauseArea>
    <DisplayPoint>
     <Point>
      <coordinatesEN>527352.119,177652.696</coordinatesEN>
      <coordinatesLL>-.167329,51.483426</coordinatesLL>
     </Point>
    </DisplayPoint>
   </CauseArea>
</Disruption>
DOC

# "//coordinatesEN" matches the element at any depth; "text()" selects its
# text node, and at_xpath returns only the first match.
puts doc.at_xpath('//coordinatesEN/text()')

Kind regards

robert

--
[guy, jim, charlie].each {|him| remember.him do |as, often| as.you_can
- without end}
http://blog.rubybestpractices.com/

Leam_Hall · 18 August 2016 13:44

I had to read up on XPATH for Nokogiri to make more sense. I'm still learning, but it did help.

Leam

Robert_K1 · 18 August 2016 13:52

The good thing is that XPath is a standard, so you can use it in _many_ XML
tools, e.g. xmlstarlet (command line) and various XML editors; for XSLT
it is a must. I also find that working with XPath improves
one's understanding of XML conceptually.
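To illustrate the portability point, here is a rough sketch that runs the same XPath expression through REXML, which ships with Ruby, instead of Nokogiri. The trimmed SAMPLE document and the helper name coordinates_en are assumptions made for this example. Unlike Nokogiri's default mode, REXML expects well-formed input.

```ruby
require 'rexml/document'

# Trimmed, well-formed version of the sample (assumed for illustration).
SAMPLE = <<~XML
  <Disruption id="134309">
    <CauseArea>
      <DisplayPoint>
        <Point>
          <coordinatesEN>527352.119,177652.696</coordinatesEN>
        </Point>
      </DisplayPoint>
    </CauseArea>
  </Disruption>
XML

# Returns the text of the first <coordinatesEN> element, or nil if none exists.
def coordinates_en(xml)
  doc = REXML::Document.new(xml)
  node = REXML::XPath.first(doc, '//coordinatesEN')
  node && node.text
end

puts coordinates_en(SAMPLE)
```

Only the library calls differ; the expression '//coordinatesEN' is the one used with Nokogiri.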

Kind regards

robert
