REXML to extract only values from XML?

Say I have an XML record like

<Subscriber>
   <name>
     <firstName>CHRIS</firstName>
     <lastName>MCMAHON</lastName>
   </name>
   <ssn>111223333</ssn>
</Subscriber>

I'd like to extract each value of each tag (without regard to
hierarchy) and add it to an array:

["CHRIS","MCMAHON","111223333"]

The REXML docs don't seem to address this. I've tried various
methods, but I can't seem to find the way to address only the contents
of each tag. Any suggestions would be welcome.

require 'rexml/document'

xml = "<Subscriber>
    <name>
      <firstName>CHRIS</firstName>
      <lastName>MCMAHON</lastName>
    </name>
    <ssn>111223333</ssn>
    </Subscriber> "

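doc = REXML::Document.new( xml )

# Select every element that has a text node child, strip each element's
# text, then drop the values that were only whitespace.
p doc.elements.to_a( "//*[text()]" ).map { |e| e.text.strip }.reject { |s| s.empty? }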

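Note that elements that appear to have only child elements (here Subscriber and name) also contain text nodes made up of the newlines and indentation between tags, which you probably don't want. That is why the code strips each value and discards the empty ones.

(There may be a better way to ignore that sort of white space.)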
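
One possible alternative, offered only as a sketch: ask XPath for the text nodes themselves with "//text()" rather than for the elements that contain them, then strip and filter. The variable names doc and values are made up for illustration, and whitespace-only nodes still have to be rejected by hand.

```ruby
require 'rexml/document'

xml = "<Subscriber>
    <name>
      <firstName>CHRIS</firstName>
      <lastName>MCMAHON</lastName>
    </name>
    <ssn>111223333</ssn>
    </Subscriber>"

doc = REXML::Document.new( xml )

# "//text()" selects every text node in the document, whatever its depth.
# Whitespace-only nodes (the indentation between tags) become empty strings
# after strip and are rejected.
values = REXML::XPath.match( doc, "//text()" ).
  map { |t| t.value.strip }.
  reject { |s| s.empty? }

p values
```

This version also picks up every text node of an element with mixed content, whereas e.text only returns the first one.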

James



xml = "<Subscriber>
    <name>
      <firstName> CHRIS </firstName>
      <lastName>MCMAHON</lastName>
    </name>
    <ssn>111223333</ssn>
    </Subscriber> "

p xml.split( / <.*?> (?: \s* <.*?> )* /xm )[1 .. -2]

  ---> [" CHRIS ", "MCMAHON", "111223333"]


Works nicely, thanks!
-Chris