How can I search for a value in XML?

How can I search for a value in an XML file? For example, I want to search by
pubdate and return the matching biblioentry.
Please give me some source code for further study.

<?xml version="1.0" encoding="ISO-8859-15"?>
<!DOCTYPE bibliography PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
          "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd">
<bibliography id="personal_identity">
    <biblioentry id="FHIW13C-1234">
      <author>
        <firstname>Godfrey</firstname>
        <surname>Vesey</surname>
      </author>
      <title>Personal Identity: A Philosophical Analysis</title>
      <publisher>
        <publishername>Cornell University Press</publishername>
      </publisher>
      <pubdate>1977</pubdate>
   </biblioentry>
   <biblioentry id="FHIW13C-125">
      <author>
        <firstname>Geoffrey</firstname>
        <surname>Madell</surname>
      </author>
      <title>The Identity of the Self</title>
      <publisher>
        <publishername>Edinburgh University Press</publishername>
      </publisher>
      <pubdate>1981</pubdate>
   </biblioentry>
   <biblioentry id="FHIW13C-1260">
      <author>
        <firstname>Sydney</firstname>
        <surname>Shoemaker</surname>
      </author>
      <author>
         <firstname>Richard</firstname>
         <surname>Swinburne</surname>
      </author>
      <title>Personal Identity</title>
      <publisher>
        <publishername>Basil Blackwell</publishername>
      </publisher>
      <pubdate>1984</pubdate>
    </biblioentry>
    <biblioentry id="FHIW13C-1288-3">
      <author>
        <firstname>Jonathan</firstname>
        <surname>Glover</surname>
      </author>
      <title>The Philosophy and Psychology of Personal Identity</title>
      <publisher>
        <publishername>Penguin</publishername>
      </publisher>
      <pubdate>1988</pubdate>
    </biblioentry>
    <biblioentry id="FHIW13C-1289-1">
      <author>
        <firstname>Harold</firstname>
        <othername>W.</othername>
        <surname>Noonan</surname>
      </author>
      <title>Personal Identity</title>
      <publisher>
        <publishername>Routledge</publishername>
      </publisher>
      <pubdate>1989</pubdate>
    </biblioentry>
    <biblioentry id="FHIW13C-1291-2">
      <author>
        <firstname>Ren</firstname>
        <surname>Marres</surname>
      </author>
      <title>Persoonlijke identiteit na het verval van de ziel</title>
      <publisher>
        <publishername>Coutinho</publishername>
      </publisher>
      <pubdate>1991</pubdate>
    </biblioentry>
    <biblioentry id="FHIW13C-1293-1">
      <author>
        <firstname>James</firstname>
        <surname>Baillie</surname>
      </author>
      <title>Problems in Personal Identity</title>
      <publisher>
        <publishername>Paragon House</publishername>
      </publisher>
      <pubdate>1993</pubdate>
    </biblioentry>
    <biblioentry id="FHIW13C-1298-4">
      <author>
        <firstname>Brian</firstname>
        <surname>Garrett</surname>
      </author>
      <title>Personal Identity and Self-Consciousness</title>
      <publisher>
        <publishername>Routledge</publishername>
      </publisher>
      <pubdate>1998</pubdate>
    </biblioentry>
    <biblioentry id="FHIW13CX-1202-1">
      <author>
        <firstname>John</firstname>
        <surname>Perry</surname>
      </author>
      <title>Identity, Personal Identity, and the Self</title>
      <publisher>
        <publishername>Hackett</publishername>
      </publisher>
      <pubdate>2002</pubdate>
    </biblioentry>
</bibliography>

Thank you
--
Artit Satanakulpanich

http://www.rubybox.net (Thai Language)

Artit Satanakulpanich wrote:

How can I search for a value in an XML file? For example, I want to search by
pubdate and return the matching biblioentry.

http://www.germane-software.com/software/rexml/

    robert

You'll probably want to use REXML and XPath:

require 'rexml/document'
require 'rexml/xpath'

include REXML

# Read the XML file named on the command line
bibliography = Document.new( File.new( ARGV[0] ) )

# biblioentry elements sit under the bibliography root element
XPath.each( bibliography, "/bibliography/biblioentry[pubdate > 1993]" ) do |biblioentry|
  # do something with biblioentry here
end

Not entirely sure if that works, as the PC I'm on doesn't have Ruby
installed :(

Scott
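
For anyone who wants something to paste and run with REXML, here is a minimal sketch (not Scott's code). It assumes the XML is held in a string; entries_after and SAMPLE_XML are names made up for this example, and the sample is a cut-down version of the file from the question. It selects every biblioentry with XPath and then compares pubdate as an integer in Ruby, which sidesteps any doubt about numeric comparison inside the XPath predicate.

```ruby
require 'rexml/document'

# Hypothetical helper (not from the thread): returns the <biblioentry>
# elements whose <pubdate> is later than the given year.
def entries_after(xml, year)
  doc = REXML::Document.new(xml)
  REXML::XPath.match(doc, '//biblioentry').select do |entry|
    pubdate = entry.elements['pubdate']
    # Skip entries without a pubdate; compare the rest numerically.
    pubdate && pubdate.text.to_i > year
  end
end

# Cut-down sample based on the file in the question.
SAMPLE_XML = <<~XML
  <bibliography id="personal_identity">
    <biblioentry id="FHIW13C-1293-1">
      <title>Problems in Personal Identity</title>
      <pubdate>1993</pubdate>
    </biblioentry>
    <biblioentry id="FHIW13C-1298-4">
      <title>Personal Identity and Self-Consciousness</title>
      <pubdate>1998</pubdate>
    </biblioentry>
    <biblioentry id="FHIW13CX-1202-1">
      <title>Identity, Personal Identity, and the Self</title>
      <pubdate>2002</pubdate>
    </biblioentry>
  </bibliography>
XML

entries_after(SAMPLE_XML, 1993).each do |entry|
  puts "#{entry.attributes['id']}: #{entry.elements['title'].text}"
end
# Prints:
# FHIW13C-1298-4: Personal Identity and Self-Consciousness
# FHIW13CX-1202-1: Identity, Personal Identity, and the Self
```

REXML::Document.new also accepts an IO such as File.new(path), so the helper could be given an open file instead of a string.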

Artit Satanakulpanich wrote:

How can I search for a value in an XML file? For example, I want to search by
pubdate and return the matching biblioentry.
Please give me some source code for further study.

class String
  def xtag(s)
    scan( %r! ( < #{s} [^>]* > ) ( .*? ) </ #{s} > !mx )
  end
end

gets(nil).xtag("biblioentry").each { |tag,data|
  if data.xtag("pubdate")[0][1] > "1984"
    print tag, data, "\n"
  end
}

Artit Satanakulpanich wrote:

How can I search for a value in an XML file? For example, I want to search by
pubdate and return the matching biblioentry.
Please give me some source code for further study.

class String
  def xtag(s)
    scan( %r! < #{s} (?: \s+ ( [^>]* ) )? >
               ( .*? ) </ #{s} > !mx ).
      map{ |attr, data| h = { }
        if attr
          attr.scan( %r! ( \S+ ) = " ( [^"]* ) " !x ){ |k,v|
            h[k] = v }
        end
        [ h, data ]
      }
  end
end

gets(nil).xtag("biblioentry").each { |attr,data|
  if data.xtag("pubdate")[0][1] > "1984"
    print attr["id"], data, "\n"
  end
}

Artit Satanakulpanich wrote:

How can I search for a value in an XML file? For example, I want to search by
pubdate and return the matching biblioentry.
Please give me some source code for further study.

class String
  def xtag(s)
    scan( %r!
              < #{s} (?: \s+ ( [^>]* ) )? / >
              |
              < #{s} (?: \s+ ( [^>]* ) )? >
              ( .*? ) </ #{s} >
          !mx ).
      map{ |unpaired, attr, data| h = { }
        attr = ( unpaired || attr )
        if attr
          attr.scan( %r! ( \S+ ) = " ( [^"]* ) " !x ){ |k,v|
            h[k] = v }
        end
        [ h, data ]
      }
  end
  def xshow( depth=0 )
    text = ""
    split( /<([^>]*)>/ ).each_with_index{ |s,i|
      if 0 == i % 2
        text = s.strip
      else
        indent = " " * ( depth * 2 )
        case
          when s[0,1] == "/"
            depth -= 1
            puts text.lines.map{|x| indent + x.strip } if text != ""
          when s[-1,1] == "/"
            puts indent + s
          else
            puts indent + s
            depth += 1
        end
      end
    }
  end
end

gets(nil).xtag("biblioentry").each { |attr,data|
  if data.xtag("pubdate")[0][1] > "1997"
    puts attr["id"]
    data.xshow( 1 )
  end
}

Output:

FHIW13C-1298-4
  author
    firstname
      Brian
    surname
      Garrett
  title
    Personal Identity and Self-Consciousness
  publisher
    publishername
      Routledge
  pubdate
    1998
FHIW13CX-1202-1
  author
    firstname
      John
    surname
      Perry
  title
    Identity, Personal Identity, and the Self
  publisher
    publishername
      Hackett
  pubdate
    2002

I would like to use REXML or another library, please.
Thanks

On 2/20/06, Artit Satanakulpanich <rubybox@gmail.com> wrote:

How can I search for a value in an XML file? For example, I want to search by
pubdate and return the matching biblioentry.
Please give me some source code for further study.

--
Artit Satanakulpanich

http://www.rubybox.net (Thai Language)

"William James" <w_a_x_man@yahoo.com> writes:

class String
  def xtag(s)
    scan( %r! ( < #{s} [^>]* > ) ( .*? ) </ #{s} > !mx )
  end
end

gets(nil).xtag("biblioentry").each { |tag,data|
  if data.xtag("pubdate")[0][1] > "1984"
    print tag, data, "\n"
  end
}

I hope you are joking...

--
Christian Neukirchen <chneukirchen@gmail.com> http://chneukirchen.org

William James wrote:

class String
  def xtag(s)
    <snip>
  end

gets(nil).xtag("biblioentry").each { |attr,data|
  <snip>
}

Please stop. To the OP, use rexml.

--
Posted via http://www.ruby-forum.com/.

As others say, for now REXML is probably the way to go, but *very* soon
now you'll be able to use Libxml2 also if things keep going to plan over
here.

  require 'xml/libxml'

  d = XML::Parser.file('test.xml').parse
  p d.find('//biblioentry[pubdate = 1977]').to_a

If you want to try it before we get to release go to CVS:
http://rubyforge.org/scm/?group_id=494

(Also, REXML does support XPath, so you should be able to modify the
above to work with that. Just to be sure, I tried it 100 times over:

### XPath ###
              user     system      total        real
rexml     9.840000   0.080000   9.920000 ( 10.046963)
libxml2   0.090000   0.000000   0.090000 (  0.139592)

;)

On Tue, 2006-02-21 at 16:17 +0900, Artit Satanakulpanich wrote:

I want to use rexml or any library ,please
thankz

--
Ross Bamford - rosco@roscopeco.REMOVE.co.uk
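
The script behind those timings is not shown in the thread. A rough sketch of how the REXML side of such a measurement could be reproduced follows; it is an assumption, not Ross's harness. The names time_rexml_xpath and BENCH_XML are invented here, the sample has only two entries, and the query uses a string comparison on pubdate. The libxml side is left out because its API was still unreleased at the time.

```ruby
require 'rexml/document'

# Tiny stand-in for the bibliography file from the question.
BENCH_XML = <<~XML
  <bibliography>
    <biblioentry id="FHIW13C-1234"><pubdate>1977</pubdate></biblioentry>
    <biblioentry id="FHIW13C-125"><pubdate>1981</pubdate></biblioentry>
  </bibliography>
XML

# Hypothetical helper: parse once, run the same XPath query `runs` times,
# and return the last result together with the elapsed wall-clock seconds.
def time_rexml_xpath(xml, xpath, runs)
  doc = REXML::Document.new(xml)
  matches = nil
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  runs.times { matches = REXML::XPath.match(doc, xpath) }
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  [matches, elapsed]
end

matches, elapsed = time_rexml_xpath(BENCH_XML, "//biblioentry[pubdate='1977']", 100)
puts "#{matches.size} match(es), 100 runs took #{elapsed} seconds"
```

The numbers depend heavily on the size of the document and on the machine, so they will not be comparable with the table above.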

"William James" <w_a_x_man@yahoo.com> writes:

Artit Satanakulpanich wrote:

How can I search for a value in an XML file? For example, I want to search by
pubdate and return the matching biblioentry.
Please give me some source code for further study.

class String
  def xtag(s)
  end
  def xshow( depth=0 )
end

gets(nil).xtag("biblioentry").each { |attr,data|
  if data.xtag("pubdate")[0][1] > "1997"
    puts attr["id"]
    data.xshow( 1 )
  end
}

Still doesn't support namespaces, entities and CDATA... ;)
(Or nested tags like <div><div></div></div>.)

--
Christian Neukirchen <chneukirchen@gmail.com> http://chneukirchen.org
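
To make the nested-tag objection concrete, here is a small sketch. It uses the same non-greedy pattern as the simplest xtag from earlier in the thread, rewritten as a plain function; the name naive_tag_bodies and the two sample strings are made up for this example.

```ruby
# Same idea as the simplest xtag in the thread: grab everything between
# an opening tag and the FIRST closing tag that follows it.
def naive_tag_bodies(text, tag)
  text.scan(%r{<#{tag}[^>]*>(.*?)</#{tag}>}m).flatten
end

flat   = '<div>a</div><div>b</div>'
nested = '<div>outer <div>inner</div> tail</div>'

p naive_tag_bodies(flat, 'div')    # => ["a", "b"]
p naive_tag_bodies(nested, 'div')  # => ["outer <div>inner"]
```

With sibling tags the pattern does fine, but with nesting the outer match stops at the inner closing tag, so " tail" is lost and the inner div is never reported on its own.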

Christian Neukirchen wrote:

"William James" <w_a_x_man@yahoo.com> writes:

> class String
> def xtag(s)
> scan( %r! ( < #{s} [^>]* > ) ( .*? ) </ #{s} > !mx )
> end
> end
>
> gets(nil).xtag("biblioentry").each { |tag,data|
> if data.xtag("pubdate")[0][1] > "1984"
> print tag, data, "\n"
> end
> }

I hope you are joking...

I hope you're joking.

Every time I've tried to use REXML for something I've found it to be
incredibly slow and painful on large files. Usually I start with
REXML, get annoyed, and then install QuiXML
(http://quixml.rubyforge.org/). Though it doesn't have bells and
whistles, it's a heck of a lot faster. Anyhow, I'm certainly looking
forward to your libxml2 bindings!

-Pawel

On 2/21/06, Ross Bamford <rossrt@roscopeco.co.uk> wrote:

(Also, REXML does support XPath, so you should be able to modify the
above to work with that. Just to be sure, I tried it 100 times over:

### XPath ###
              user     system      total        real
rexml     9.840000   0.080000   9.920000 ( 10.046963)
libxml2   0.090000   0.000000   0.090000 (  0.139592)

Christian Neukirchen wrote:

"William James" <w_a_x_man@yahoo.com> writes:

class String
  def xtag(s)
    scan( %r! ( < #{s} [^>]* > ) ( .*? ) </ #{s} > !mx )
  end
end

gets(nil).xtag("biblioentry").each { |tag,data|
  if data.xtag("pubdate")[0][1] > "1984"
    print tag, data, "\n"
  end
}

I hope you are joking...

Actually, in real-world usage, Mark Pilgrim's Python Feed Parser[0] falls back to regular expressions to get the data required if the XML is not well-formed.

Admittedly this is a real problem for RSS hackers, less so with other XML messages, but the approach does have merit if (a) you can't guarantee well-formedness and (b) you absolutely have to have the data.

-dave

[0] http://feedparser.org/
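
A minimal sketch of that fallback idea in Ruby, assuming REXML as the real parser. This is not how Feed Parser itself is written; the helper name pubdates and the sample strings are invented for illustration. The regex branch only runs when REXML rejects the input as not well-formed.

```ruby
require 'rexml/document'

# Hypothetical helper: try a real XML parser first and fall back to a
# regex scan only when the input is not well-formed.
def pubdates(xml)
  doc = REXML::Document.new(xml)
  REXML::XPath.match(doc, '//pubdate').map(&:text)
rescue REXML::ParseException
  # Last resort: salvage whatever <pubdate> elements can still be found.
  xml.scan(%r{<pubdate>(.*?)</pubdate>}m).flatten
end

well_formed = '<bibliography><biblioentry><pubdate>1977</pubdate></biblioentry></bibliography>'
# The closing </biblioentry> is missing, so a conforming parser rejects this.
broken      = '<bibliography><biblioentry><pubdate>1977</pubdate></bibliography>'

p pubdates(well_formed)  # => ["1977"]
p pubdates(broken)       # => ["1977"]
```

The regex branch has all the limitations pointed out earlier in the thread (nesting, entities, CDATA), so it is best kept as a last resort.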