Hi Anne,
welcome back!
Thank you, the first formulation works.
I tried the second one on the complete XML file and it does not
work.
Do you have any idea why? Is there a typo I am not seeing?
Here is a test file a little closer to the XML file I am working with:
require "rexml/document"
include REXML
string = <<EOF
<dataformats>
<dataformat>
<name>NARSAD recognition</name>
<fileidentifiers>
<fileidentifier>NARSAD</fileidentifier>
</fileidentifiers>
</dataformat>
<dataformat>
<name>SPFT</name>
<fileidentifiers>
<fileidentifier>SPFT</fileidentifier>
<fileidentifier>SPPT</fileidentifier>
</fileidentifiers>
</dataformat>
</dataformats>
EOF
doc = Document.new string
xpathquery="//dataformat[contains(., 'SPPT')]"
p 'yours1'
p XPath.first(doc,xpathquery).to_s
xpathquery="//dataformat[contains(fileidentifiers/fileidentifier,'SPPT')]"
p 'yours2'
p XPath.first(doc,xpathquery).to_s
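On the second query (yours2): in XPath 1.0, contains() does accept a node-set, but it converts it to the string value of its first node in document order, so only the first fileidentifier of each dataformat is tested. Here is a minimal sketch of the difference, using a trimmed copy of the second dataformat. SAMPLE_XML, match_count, NODE_SET_FORM and PER_NODE_FORM are names made up for this illustration, not part of REXML:

```ruby
require "rexml/document"

# Trimmed copy of the second <dataformat> from the test file.
SAMPLE_XML = <<~XML
  <dataformats>
    <dataformat>
      <name>SPFT</name>
      <fileidentifiers>
        <fileidentifier>SPFT</fileidentifier>
        <fileidentifier>SPPT</fileidentifier>
      </fileidentifiers>
    </dataformat>
  </dataformats>
XML

# Hypothetical helper: number of nodes an XPath expression selects.
def match_count(xml, path)
  REXML::XPath.match(REXML::Document.new(xml), path).size
end

# Node-set handed straight to contains(): only its first node is tested.
NODE_SET_FORM = "//dataformat[contains(fileidentifiers/fileidentifier, 'SPPT')]"
# contains() applied to each fileidentifier inside its own predicate.
PER_NODE_FORM = "//dataformat[fileidentifiers/fileidentifier[contains(., 'SPPT')]]"

puts "node-set form: #{match_count(SAMPLE_XML, NODE_SET_FORM)} match(es)"
puts "per-node form: #{match_count(SAMPLE_XML, PER_NODE_FORM)} match(es)"
```

Keep in mind that contains() is a substring test, so the per-node form would also match an identifier such as 'XSPPT1'. An exact comparison with text()='SPPT' avoids that.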
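I believe "contains" is the wrong function here: it does a substring
comparison on strings. A node-set is allowed as input, but XPath 1.0
converts it to the string value of its first node only, so the second
fileidentifier is never looked at. I believe the correct XPath
expression is this: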
"//dataformat[descendant::fileidentifier[text()='SPPT']]"
Here are some expressions that you may want to try:
# find the correct fileidentifier
XPath.each doc, "//fileidentifier[text()='SPPT']" do |elm|
puts elm
end
puts '-------------'
# go upwards from there to find the dataformat node
XPath.each doc, "//fileidentifier[text()='SPPT']/ancestor::dataformat" do |elm|
puts elm
end
puts '-------------'
# select all dataformats that contain a fileidentifier with text "SPPT"
# this seems to best reflect what you want
XPath.each doc,
"//dataformat[descendant::fileidentifier[text()='SPPT']]" do |elm|
puts elm
end
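If the lookup is needed in more than one place, the last expression can be wrapped in a small method. This is only a sketch: FORMATS_XML and dataformat_names_for are hypothetical names, and it assumes the identifier contains no single quote, because it is interpolated into the XPath string unescaped.

```ruby
require "rexml/document"

# Same structure as the test file in this thread.
FORMATS_XML = <<~XML
  <dataformats>
    <dataformat>
      <name>NARSAD recognition</name>
      <fileidentifiers>
        <fileidentifier>NARSAD</fileidentifier>
      </fileidentifiers>
    </dataformat>
    <dataformat>
      <name>SPFT</name>
      <fileidentifiers>
        <fileidentifier>SPFT</fileidentifier>
        <fileidentifier>SPPT</fileidentifier>
      </fileidentifiers>
    </dataformat>
  </dataformats>
XML

# Hypothetical helper: names of all dataformats that list `identifier`
# as the exact text of one of their fileidentifier elements.
# Assumption: `identifier` contains no single quote (it is not escaped).
def dataformat_names_for(doc, identifier)
  path = "//dataformat[descendant::fileidentifier[text()='#{identifier}']]"
  REXML::XPath.match(doc, path).map { |fmt| fmt.elements["name"].text }
end

doc = REXML::Document.new(FORMATS_XML)
p dataformat_names_for(doc, "SPPT")
p dataformat_names_for(doc, "NARSAD")
p dataformat_names_for(doc, "missing")
```

Returning the name text rather than the element keeps the caller free of REXML details. Return the elements themselves if more than the name is needed.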
Btw, I have these bookmarked and they serve me well with regard to
XPath issues (I always have to look them up):
http://www.w3schools.com/xpath/default.asp
http://www.zvon.org/xxl/XPathTutorial/General/examples.html
(I use the first one most of the time.)
Kind regards
robert
2008/8/11 anne001 <anne@wjh.harvard.edu>:
--
use.inject do |as, often| as.you_can - without end