How can I search for a value in an XML file? For example, I want to search
by *pubdate* and return the matching *biblioentry*.
Please give me some source code for further study.

<?xml version="1.0" encoding="ISO-8859-15"?>
<!DOCTYPE bibliography PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
"http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd">
<bibliography id="personal_identity">
<biblioentry id="FHIW13C-1234">
<author>
<firstname>Godfrey</firstname>
<surname>Vesey</surname>
</author>
<title>Personal Identity: A Philosophical Analysis</title>
<publisher>
<publishername>Cornell University Press</publishername>
</publisher>
<pubdate>1977</pubdate>
</biblioentry>
<biblioentry id="FHIW13C-125">
<author>
<firstname>Geoffrey</firstname>
<surname>Madell</surname>
</author>
<title>The Identity of the Self</title>
<publisher>
<publishername>Edinburgh University Press</publishername>
</publisher>
<pubdate>1981</pubdate>
</biblioentry>
<biblioentry id="FHIW13C-1260">
<author>
<firstname>Sydney</firstname>
<surname>Shoemaker</surname>
</author>
<author>
<firstname>Richard</firstname>
<surname>Swinburne</surname>
</author>
<title>Personal Identity</title>
<publisher>
<publishername>Basil Blackwell</publishername>
</publisher>
<pubdate>1984</pubdate>
</biblioentry>
<biblioentry id="FHIW13C-1288-3">
<author>
<firstname>Jonathan</firstname>
<surname>Glover</surname>
</author>
<title>The Philosophy and Psychology of Personal Identity</title>
<publisher>
<publishername>Penguin</publishername>
</publisher>
<pubdate>1988</pubdate>
</biblioentry>
<biblioentry id="FHIW13C-1289-1">
<author>
<firstname>Harold</firstname>
<othername>W.</othername>
<surname>Noonan</surname>
</author>
<title>Personal Identity</title>
<publisher>
<publishername>Routledge</publishername>
</publisher>
<pubdate>1989</pubdate>
</biblioentry>
<biblioentry id="FHIW13C-1291-2">
<author>
<firstname>Ren</firstname>
<surname>Marres</surname>
</author>
<title>Persoonlijke identiteit na het verval van de ziel</title>
<publisher>
<publishername>Coutinho</publishername>
</publisher>
<pubdate>1991</pubdate>
</biblioentry>
<biblioentry id="FHIW13C-1293-1">
<author>
<firstname>James</firstname>
<surname>Baillie</surname>
</author>
<title>Problems in Personal Identity</title>
<publisher>
<publishername>Paragon House</publishername>
</publisher>
<pubdate>1993</pubdate>
</biblioentry>
<biblioentry id="FHIW13C-1298-4">
<author>
<firstname>Brian</firstname>
<surname>Garrett</surname>
</author>
<title>Personal Identity and Self-Consciousness</title>
<publisher>
<publishername>Routledge</publishername>
</publisher>
<pubdate>1998</pubdate>
</biblioentry>
<biblioentry id="FHIW13CX-1202-1">
<author>
<firstname>John</firstname>
<surname>Perry</surname>
</author>
<title>Identity, Personal Identity, and the Self</title>
<publisher>
<publishername>Hackett</publishername>
</publisher>
<pubdate>2002</pubdate>
</biblioentry>
</bibliography>
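class String
  # Return [opening_tag, content] pairs for every <s ...>...</s> element.
  # The x flag lets the pattern contain spaces; m lets . match newlines.
  def xtag(s)
    scan( %r! ( < #{s} [^>]* > ) ( .*? ) </ #{s} > !mx )
  end
end

# Read all of the input at once, then print each entry published after 1984.
# The years are compared as strings, which works for four-digit years.
gets(nil).xtag("biblioentry").each { |tag,data|
  if data.xtag("pubdate")[0][1] > "1984"
    print tag, data, "\n"
  end
}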
class String
  # Return [attributes_hash, content] pairs for every <s ...>...</s> element.
  def xtag(s)
    scan( %r! < #{s} (?: \s+ ( [^>]* ) )? >
              ( .*? ) </ #{s} > !mx ).
    map{ |attr, data| h = { }
      if attr
        # Collect name="value" pairs from the opening tag.
        attr.scan( %r! ( \S+ ) = " ( [^"]* ) " !x ){ |k,v|
          h[k] = v }
      end
      [ h, data ]
    }
  end
end

gets(nil).xtag("biblioentry").each { |attr,data|
  if data.xtag("pubdate")[0][1] > "1984"
    print attr["id"], data, "\n"
  end
}
class String
  # Return [attributes_hash, content] pairs for every <s .../> or
  # <s ...>...</s> element; content is nil for the self-closing form.
  def xtag(s)
    scan( %r!
      < #{s} (?: \s+ ( [^>]* ) )? / >
      |
      < #{s} (?: \s+ ( [^>]* ) )? >
      ( .*? ) </ #{s} >
      !mx ).
    map{ |unpaired, attr, data| h = { }
      attr = ( unpaired || attr )
      if attr
        attr.scan( %r! ( \S+ ) = " ( [^"]* ) " !x ){ |k,v|
          h[k] = v }
      end
      [ h, data ]
    }
  end

  # Print tag names and text as an indented outline, starting at depth.
  def xshow( depth=0 )
    text = ""
    # Even-indexed pieces are text; odd-indexed pieces are tag contents.
    split( /<([^>]*)>/ ).each_with_index{ |s,i|
      if 0 == i % 2
        text = s.strip
      else
        indent = " " * ( depth * 2 )
        case
        when s[0,1] == "/"      # closing tag: print the pending text
          depth -= 1
          puts text.lines.map{|x| indent + x.strip } if text != ""
        when s[-1,1] == "/"     # self-closing tag
          puts indent + s
        else                    # opening tag
          puts indent + s
          depth += 1
        end
      end
    }
  end
end

gets(nil).xtag("biblioentry").each { |attr,data|
  if data.xtag("pubdate")[0][1] > "1997"
    puts attr["id"]
    data.xshow( 1 )
  end
}
Output:
FHIW13C-1298-4
  author
    firstname
      Brian
    surname
      Garrett
  title
    Personal Identity and Self-Consciousness
  publisher
    publishername
      Routledge
  pubdate
    1998
FHIW13CX-1202-1
  author
    firstname
      John
    surname
      Perry
  title
    Identity, Personal Identity, and the Self
  publisher
    publishername
      Hackett
  pubdate
    2002
class String
def xtag(s)
end
def xshow( depth=0 )
end
gets(nil).xtag("biblioentry").each { |attr,data|
if data.xtag("pubdate")[0][1] > "1997"
puts attr["id"]
data.xshow( 1 )
end
}
Still doesn't support namespaces, entities and CDATA...
(Or nested tags like <div><div></div></div>.)
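
The nested-tag limitation can be seen with a small experiment. This is only a sketch: `scan_tag` is a hypothetical helper that uses the same lazy pattern as `xtag` above, without the attribute handling, and `NESTED` is made-up sample input.

```ruby
# Hypothetical helper: the same non-greedy pattern as xtag, simplified.
def scan_tag(xml, name)
  xml.scan(%r!<#{name}[^>]*>(.*?)</#{name}>!m).flatten
end

# Made-up input with one div nested inside another.
NESTED = "<div>outer <div>inner</div> tail</div>"

# The lazy .*? stops at the FIRST closing tag, so the outer element is
# cut short and the trailing " tail" text is lost.
p scan_tag(NESTED, "div")   # prints ["outer <div>inner"]
```

A regular expression of this kind has no way to count opening and closing tags, which is why a real parser is usually the safer choice when elements can nest.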
Every time I've tried to use REXML for something I've found it to be
incredibly slow and painful on large files. Usually I start with
REXML, get annoyed, and then install QuiXML
(http://quixml.rubyforge.org/). Though it doesn't have bells and
whistles, it's a heck of a lot faster. Anyhow, I'm certainly looking
forward to your libxml2 bindings!
-Pawel
On 2/21/06, Ross Bamford <rossrt@roscopeco.co.uk> wrote:
(Also, REXML does support XPath, so you should be able to modify the
above to work with that. Just to be sure, I tried it 100 times over:
### XPath ###
               user     system      total        real
rexml      9.840000   0.080000   9.920000 ( 10.046963)
libxml2    0.090000   0.000000   0.090000 (  0.139592)
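
For reference, an XPath-based lookup with REXML might look roughly like the following. This is a minimal sketch, not the benchmarked code: `SAMPLE` is a shortened copy of the bibliography and `entries_after` is a hypothetical helper name. It assumes the `rexml` library is available (part of the standard library in older Ruby versions, a bundled gem in newer ones).

```ruby
require "rexml/document"

# Shortened sample data, taken from the bibliography in the question.
SAMPLE = <<~XML
  <bibliography id="personal_identity">
    <biblioentry id="FHIW13C-1260">
      <title>Personal Identity</title>
      <pubdate>1984</pubdate>
    </biblioentry>
    <biblioentry id="FHIW13C-1298-4">
      <title>Personal Identity and Self-Consciousness</title>
      <pubdate>1998</pubdate>
    </biblioentry>
  </bibliography>
XML

# Hypothetical helper: return every biblioentry element whose pubdate
# is later than the given year.
def entries_after(xml, year)
  doc = REXML::Document.new(xml)
  # XPath selects the entries; the date filter is done in plain Ruby.
  REXML::XPath.match(doc, "//biblioentry").select do |entry|
    date = entry.elements["pubdate"]
    date && date.text.to_i > year
  end
end

entries_after(SAMPLE, 1984).each do |entry|
  puts "#{entry.attributes['id']}: #{entry.elements['title'].text}"
end
```

Filtering in Ruby rather than inside the XPath expression keeps the sketch independent of how any particular XPath engine compares node text with numbers.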
class String
  def xtag(s)
    scan( %r! ( < #{s} [^>]* > ) ( .*? ) </ #{s} > !mx )
  end
end
gets(nil).xtag("biblioentry").each { |tag,data|
  if data.xtag("pubdate")[0][1] > "1984"
    print tag, data, "\n"
  end
}
I hope you are joking...
Actually, in real-world usage, Mark Pilgrim's Python Feed Parser[0] falls back to regular expressions to get the data required if the XML is not well-formed.
Admittedly this is a real problem for RSS hackers, less so with other XML messages, but the approach does have merit if (a) you can't guarantee well-formedness and (b) you absolutely have to have the data.
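
The parse-first, regex-as-fallback idea could be sketched in Ruby as below. This is an illustration of the general approach only, not of how the Python Feed Parser is implemented; `pubdates` and `BROKEN` are hypothetical names, and REXML is assumed to be available.

```ruby
require "rexml/document"

# Hypothetical helper: collect pubdate values with a real parser and,
# if the input is not well-formed, fall back to a plain regex scan.
def pubdates(xml)
  doc = REXML::Document.new(xml)
  REXML::XPath.match(doc, "//pubdate").map { |e| e.text.to_s.strip }
rescue REXML::ParseException
  # Last resort: the markup is broken, but the data may still be there.
  xml.scan(%r!<pubdate[^>]*>(.*?)</pubdate>!m).flatten.map(&:strip)
end

# Made-up broken input: the <b> element is never closed.
BROKEN = "<biblioentry><title>Broken <b>markup</title>" \
         "<pubdate>1998</pubdate></biblioentry>"

p pubdates(BROKEN)
```

The parser stays the primary path, so well-formed documents still get correct handling of nesting and attributes; the regex only runs when parsing has already failed.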