Parsing poorly written XML

I've found an application that dumps the registry settings for all
installed applications into an .xml file (search Google for
"myuninstall").

I can't figure out the Ruby code I need from REXML to do the following:
- find a particular "product_name" in the XML file
- if found, save off the "uninstall_string" so I can execute it on the
  command line and uninstall the app.

I'm a newbie to Ruby, and I'm even newer to parsing XML with Ruby, so
I'm struggling a bit. It seems there should be a way to navigate within
that "item" after I match on the product name, but I can't figure out
how. Any guidance would be very VERY appreciated!

Here's a snippet of the problem XML:
<?xml version="1.0" encoding="ISO-8859-1" ?>
<installed_app version="1.0">
<item>
<entry_name>Devicescape</entry_name>
<product_name>DevicescapeDesktop</product_name>
<version>2.0.5</version>
<company>Devicescape</company>
<description>DevicescapeDesktop</description>
<obsolete>No</obsolete>
<uninstall>Yes</uninstall>
<installation_folder>C:\Program Files\Devicescape\Devicescape</installation_folder>
<web_site>http://www.devicescape.com/</web_site>
<installation_date>1/22/2008 4:42:14 PM</installation_date>
<uninstall_string>MsiExec.exe /I{C3BFE4FC-C9F6-494F-B3AB-ABD75556DDC5}</uninstall_string>
<quiet_uninstall>No</quiet_uninstall>
<registry_key>{C3BFE4FC-C9F6-494F-B3AB-ABD75556DDC5}</registry_key>
<installer>Windows Installer</installer>
<root_key>HKEY_LOCAL_MACHINE</root_key>
</item>
<item>

···

--
Posted via http://www.ruby-forum.com/.

require 'rexml/document'
doc = REXML::Document.new( DATA.read )

# Find 'item' elements at any level
# that have a child 'name' element whose text value is 'foo'
doc.each_element( "//item[name='foo']" ){ |item|
  # Find every child element of the item element
  # whose name is 'value', get the first one, and get its text
  puts item.get_elements( "value" ).first.text
}
#=> 42
#=> 17

__END__
<root>
  <item>
    <name>foo</name>
    <value>42</value>
  </item>
  <item>
    <name>bar</name>
    <value>17</value>
  </item>
  <item>
    <name>foo</name>
    <value>17</value>
  </item>
</root>

···

On Jan 23, 3:35 pm, Patrick Callahan <hoc...@youskate.com> wrote:

I've found an application which dumps all the registry settings for
installed applications into an .xml file for me (you can find it as
"myuninstall" via google).

I can't figure out the ruby code I need from REXML to do the following:
-find a particular "product_name" in the xml file
-if found, save off the "uninstall_string" so I can execute it on the
commandline and uninstall the app.

Or, using XPath more liberally (with the same DATA section as quoted
above):

require 'rexml/document'
doc = REXML::Document.new( DATA.read )
p REXML::XPath.match( doc, "//item[name='foo']/value/text()" )
#=> ["42", "17"]

···

Gavin Kistner wrote:

require 'rexml/document'
doc = REXML::Document.new( DATA.read )

# Find 'item' elements at any level
# that have a child 'name' element whose text value is 'foo'
doc.each_element( "//item[name='foo']" ){ |item|
  # Find every child element of the item element
  # whose name is 'value', get the first one, and get its text
  puts item.get_elements( "value" ).first.text
}
#=> 42
#=> 17

Wow! That worked on the first try! I knew there had to be a way, but
I'm trying to whip this script out and I didn't want to spend a week
studying XML parsing options in Ruby to do it. You've saved me a bunch
of effort. Now I just need to figure out how to gracefully bail out
and continue on with the rest of the script if "foo" isn't there at all.
Uhm... I seem to be having trouble with that piece too. I'm sure it's
easier than the first problem, but maybe it's just 'cause my brain is
cooked right now...

Thanks a bunch!

···

The first piece of code I wrote has a potential to be cranky.

  doc.each_element( "//item[name='foo']" ){ |item|
    puts item.get_elements( "value" ).first.text
  }

If no <item> with a <name>foo</name> child can be found, the block will
never be called. However, if such an <item> is found but it doesn't
have a <value> child, then item.get_elements( "value" ) will return an
empty array. Calling .first on an empty array returns nil, and calling
the 'text' method on nil raises a NoMethodError.

You could modify the above code to guard for this:
  doc.each_element( "//item[name='foo']" ){ |item|
    first_value = item.get_elements( "value" ).first
    if first_value
      puts first_value.text
    end
  }
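
To see both behaviours side by side, here is a small self-contained
sketch. The sample XML is made up for the demonstration (the second
'foo' item deliberately has no <value> child), and the variable names
are mine.

```ruby
require 'rexml/document'

# Hypothetical sample data: the second 'foo' item has no <value> child.
xml = <<XML
<root>
  <item><name>foo</name><value>42</value></item>
  <item><name>foo</name></item>
</root>
XML
doc = REXML::Document.new(xml)

# Unguarded: .first returns nil for the second item, so .text raises.
unguarded_error = begin
  doc.each_element("//item[name='foo']") do |item|
    item.get_elements("value").first.text
  end
  nil
rescue NoMethodError => e
  e
end

# Guarded: collect only the values that are actually present.
values = []
doc.each_element("//item[name='foo']") do |item|
  first_value = item.get_elements("value").first
  values << first_value.text if first_value
end

puts "unguarded raised: #{unguarded_error.class}"
puts "guarded values:   #{values.inspect}"
```

The rescue is only there so the script keeps running and can show the
error class; in a real script the guard is the part you want.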

However, the second (single XPath query) solution I wrote is both
shorter and more fault tolerant. If you assume that only one <item>
will match the <name>foo</name> criterion, you can change from
REXML::XPath.match (which returns an array, possibly empty) to
REXML::XPath.first (which returns either a single node or nil).

Something like:
  value = REXML::XPath.first( doc, "//item[name='foo']/value/text()" )
  if value
    # do my stuff here
  end
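
To make the difference between the two calls concrete, here is a rough
sketch using the element names from the question. The product names,
the uninstall command and the uninstall_path helper are all invented
for the example; only the REXML calls are real.

```ruby
require 'rexml/document'

# Hypothetical two-item document using the element names from the question.
xml = <<XML
<installed_app>
  <item>
    <product_name>Alpha</product_name>
    <uninstall_string>alpha-uninstall.exe /S</uninstall_string>
  </item>
  <item>
    <product_name>Beta</product_name>
  </item>
</installed_app>
XML
doc = REXML::Document.new(xml)

# Helper of my own (not part of REXML) that builds the XPath query.
def uninstall_path(name)
  "//item[product_name='#{name}']/uninstall_string/text()"
end

found     = REXML::XPath.first(doc, uninstall_path("Alpha")) # a REXML::Text node
no_string = REXML::XPath.first(doc, uninstall_path("Beta"))  # nil: no <uninstall_string>
missing   = REXML::XPath.first(doc, uninstall_path("Gamma")) # nil: no such product
matches   = REXML::XPath.match(doc, uninstall_path("Gamma")) # empty array, not nil

puts found.value
p no_string, missing, matches
```

Note that XPath.first hands back a text node rather than a String, so
the sketch calls .value on it to get the plain text.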

Or, to use your exact case:

  require 'rexml/document'

  # Returns the uninstall command run (if found),
  # or nil if the necessary info couldn't be found.
  def uninstall_product( xml_string, prod_name )
    # XPath.first returns a REXML::Text node (or nil), not a String
    text_node = REXML::XPath.first(
      REXML::Document.new( xml_string ),
      "//item[product_name='#{prod_name}']/uninstall_string/text()"
    )
    # Text#value is the unescaped text of the node
    uninstall_command = text_node && text_node.value.strip
    if uninstall_command
      # Do the uninstall; maybe as simple as:
      `#{uninstall_command}`
    else
      warn "Either product '#{prod_name}' doesn't exist, " +
           "or it doesn't have an uninstall string."
    end
    uninstall_command
  end
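
One caveat with building the XPath by string interpolation: a product
name containing an apostrophe would break the quoted literal in the
query. As an alternative, here is a hedged sketch that does the
comparison in Ruby instead and only looks up the command without
running it. The find_uninstall_string name and the trimmed-down sample
XML are my own, and the whitespace collapsing assumes a value might be
wrapped across lines in the file.

```ruby
require 'rexml/document'

# Returns the uninstall string for prod_name, or nil if the product
# or its <uninstall_string> is missing. Nothing is executed here.
def find_uninstall_string(xml_string, prod_name)
  doc = REXML::Document.new(xml_string)
  doc.get_elements("//item").each do |item|
    name = item.elements["product_name"]
    next unless name && name.text.to_s.strip == prod_name
    command = item.elements["uninstall_string"]
    text = command && command.text
    # Collapse any line breaks or runs of spaces inside the value.
    return text.split.join(" ") if text
  end
  nil
end

# Hypothetical sample: the uninstall string is wrapped across two lines.
xml = <<'XML'
<installed_app version="1.0">
<item>
<product_name>DevicescapeDesktop</product_name>
<uninstall_string>MsiExec.exe
/I{C3BFE4FC-C9F6-494F-B3AB-ABD75556DDC5}</uninstall_string>
</item>
</installed_app>
XML

p find_uninstall_string(xml, "DevicescapeDesktop")
p find_uninstall_string(xml, "NoSuchProduct")
```

Because a missing product and a missing uninstall string both come
back as nil, the calling script can test the result once and carry on
with the rest of its work either way.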
