XML Parser (SOAP Request)

First of all, sorry for my English.

Hello, I've been working on a new MSNP (version 13) client for Ruby. The
only problem I have is that this version of the protocol uses a lot of
SOAP requests, and I've been looking for a smart way to parse the responses.

So, what I receive from the server is:

<?xml version='1.0' encoding='utf-8'?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
   <soap:Header xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
       <ServiceHeader
xmlns="http://www.msn.com/webservices/AddressBook">
           <Version
xmlns="http://www.msn.com/webservices/AddressBook">11.02.1331.0000</Version>
       </ServiceHeader>
   </soap:Header>
   <soap:Body xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
       <FindMembershipResponse
xmlns="http://www.msn.com/webservices/AddressBook">
           <FindMembershipResult
xmlns="http://www.msn.com/webservices/AddressBook">
               <Services
xmlns="http://www.msn.com/webservices/AddressBook">
                   <Service
xmlns="http://www.msn.com/webservices/AddressBook">
                       <Memberships
xmlns="http://www.msn.com/webservices/AddressBook">
                           <Membership
xmlns="http://www.msn.com/webservices/AddressBook">
                               <MemberRole
xmlns="http://www.msn.com/webservices/AddressBook">Allow</MemberRole>
                               <Members
xmlns="http://www.msn.com/webservices/AddressBook">
                                   <Member
xmlns="http://www.msn.com/webservices/AddressBook"
xsi:type="PassportMember"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" >
                                       <MembershipId
xmlns="http://www.msn.com/webservices/AddressBook">2</MembershipId>
                                       <Type
xmlns="http://www.msn.com/webservices/AddressBook">Passport</Type>
                                       <State
xmlns="http://www.msn.com/webservices/AddressBook">Accepted</State>
                                       <Deleted
xmlns="http://www.msn.com/webservices/AddressBook">false</Deleted>
                                       <LastChanged
xmlns="http://www.msn.com/webservices/AddressBook">2005-08-05T17:34:12.7870000-07:00</LastChanged>
                                       <Changes
xmlns="http://www.msn.com/webservices/AddressBook"/>
                                       <PassportName
xmlns="http://www.msn.com/webservices/AddressBook">alice@passport.com</PassportName>
                                       <IsPassportNameHidden
xmlns="http://www.msn.com/webservices/AddressBook">false</IsPassportNameHidden>
                                       <PassportId
xmlns="http://www.msn.com/webservices/AddressBook">0</PassportId>
                                       <CID
xmlns="http://www.msn.com/webservices/AddressBook">0</CID>
                                       <PassportChanges
xmlns="http://www.msn.com/webservices/AddressBook"/>
                                   </Member>
                               </Members>
                           </Membership>

What I need to retrieve is everything inside the
<Membership></Membership> tags. For example, following the XML above:

* in MemberRole, I need the content of the tag -> Allow
* in Member, I need the value of the attribute xsi:type ->
PassportMember
* in everything else, I need the content of each of the following tags
   -> 2
   -> Passport
   -> Accepted
   -> false
   ...
   until </Membership>

* Note: what I receive from the server is not split by line breaks.

Any help with parsing this?

Thanks in advance.

--
Posted via http://www.ruby-forum.com/.

This is what I have:

require 'rexml/document'

include REXML

# Remove every tag from a fragment, leaving only the text.
def strip_html(txt)
  txt.gsub(/<\/?[^>]*>/, '')
end

xml_path = 'soap:Envelope/soap:Body/'
xml_path << 'FindMembershipResponse/'
xml_path << 'FindMembershipResult/Services/'
xml_path << 'Service/Memberships/Membership'
doc = Document.new(File.new('c:/soap.xml'))
doc.elements.each(xml_path) do |element|
  # Cut the serialized element into fragments that each start at a "<",
  # because the response is not split into lines.
  fragments = element.to_s.split(/(?=<)/)
  fragments.each do |line|
    case line
    when /\A<MemberRole\b/i
      puts strip_html(line)
    when /\A<MembershipId\b/i
      puts strip_html(line)
    when /\A<Type\b/i
      puts strip_html(line)
    when /\A<State\b/i
      puts strip_html(line)
    when /\A<PassportName\b/i
      puts strip_html(line)
    end
  end
end

But it's very slow.

Eder Quiñones wrote:

First of all, sorry for my English.

Hello, I've been working on a new MSNP (version 13) client for Ruby. The
only problem I have is that this version of the protocol uses a lot of
SOAP requests, and I've been looking for a smart way to parse the responses.

I'd give hpricot a shot:

require 'rubygems'
require 'hpricot'

data = Hpricot.XML(File.read("response.xml"))

data.search("//Membership").each do |membership|
  puts "Membership Info"
  puts membership.at("/MemberRole").inner_html
  membership.search("/Members/Member").each do |member|
    puts "==Member Info=="
    puts member['xsi:type']
    puts member.at("/MembershipId").inner_html
    puts member.at("/Type").inner_html
    puts member.at("/State").inner_html
    puts member.at("/Deleted").inner_html
    puts member.at("/LastChanged").inner_html
    puts member.at("/Changes").inner_html
    puts member.at("/PassportName").inner_html
    puts member.at("/IsPassportNameHidden").inner_html
    puts member.at("/PassportId").inner_html
    puts member.at("/CID").inner_html
    puts member.at("/PassportChanges").inner_html
  end
end

Is there a WSDL? If so, you can use SOAP4R and it will do the parsing
for you. Try the getting started guide here:
http://markthomas.org/2007/09/12/getting-started-with-soap4r/
and see if that helps.

Problem solved:

require 'rexml/document'

include REXML

doc = Document.new(File.new('c:/soap.xml'))
xml_path = 'soap:Envelope/soap:Body/'
xml_path << 'FindMembershipResponse/'
xml_path << 'FindMembershipResult/Services/'
xml_path << 'Service/Memberships/Membership'

# Each match is one <Membership>:
#   <Membership>
#     <MemberRole></MemberRole>
#     <Members>
#       <Member>
#         ...
#       </Member>
#     </Members>
#   </Membership>
doc.each_element(xml_path) do |element|
  # <MemberRole>
  element.each_element('MemberRole') do |mr|
    puts mr.get_text
  end

  # Every <Member> inside <Members>
  element.each_element('Members/Member') do |mm|
    # Every child element of <Member>. each_element skips the
    # whitespace text nodes, so no rescue is needed.
    mm.each_element do |child|
      puts child.get_text
    end
  end
end
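
For what it's worth, the traversal above prints the element text but not the xsi:type attribute asked for in the first post. Below is a minimal sketch of how both could be collected in one pass with REXML only. The SAMPLE_XML string is a shortened, made-up stand-in for the real response (closing tags added so that it parses), and collect_members is a hypothetical helper name, not part of REXML.

```ruby
require 'rexml/document'

# Hypothetical, shortened stand-in for the server response (closing tags
# added so it is well-formed). Note that it contains no line breaks.
SAMPLE_XML =
  '<Membership xmlns="http://www.msn.com/webservices/AddressBook">' \
  '<MemberRole>Allow</MemberRole>' \
  '<Members>' \
  '<Member xsi:type="PassportMember" ' \
  'xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">' \
  '<MembershipId>2</MembershipId>' \
  '<Type>Passport</Type>' \
  '<State>Accepted</State>' \
  '<Changes/>' \
  '<PassportName>alice@passport.com</PassportName>' \
  '</Member>' \
  '</Members>' \
  '</Membership>'

# Hypothetical helper: returns one hash per <Member> holding the role of
# the enclosing <Membership>, the xsi:type attribute, and the text of
# every child element (nil for empty elements such as <Changes/>).
def collect_members(membership)
  role = nil
  members = []
  membership.each_element do |child|
    case child.name
    when 'MemberRole'
      role = child.text
    when 'Members'
      child.each_element do |member|
        next unless member.name == 'Member'
        info = { 'xsi:type' => member.attributes['xsi:type'] }
        member.each_element { |field| info[field.name] = field.text }
        members << info
      end
    end
  end
  # Attach the role last so it does not depend on element order.
  members.each { |info| info['MemberRole'] = role }
  members
end

doc = REXML::Document.new(SAMPLE_XML)
members = collect_members(doc.root)
members.each { |info| p info }
```

Comparing child.name instead of using a path keeps the helper independent of how the namespaces are declared. With the real response, the <Membership> elements found by the xml_path lookup above could be passed to the helper in place of doc.root.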

Thank you for your responses. The reason I'm using REXML is that it
comes with the Ruby package. With the other suggestions, my work would
depend on gems if I ever release it.
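
Staying with REXML alone, one option is to parse the response straight from the string received from the server and to name the namespace explicitly in the XPath query. This is only a sketch under assumptions: the 'ab' prefix, the NAMESPACES constant, the passport_members helper and the shortened response string are invented here for illustration. Only the namespace URI comes from the response shown earlier.

```ruby
require 'rexml/document'

# Hypothetical prefix-to-URI map. The prefix 'ab' is a local choice made
# for this sketch; only the URI is taken from the response.
NAMESPACES = {
  'ab' => 'http://www.msn.com/webservices/AddressBook'
}

# Hypothetical helper: returns [xsi:type, PassportName] pairs for every
# <Member> found in the response string.
def passport_members(xml_string)
  doc = REXML::Document.new(xml_string)
  result = []
  path = '//ab:Membership/ab:Members/ab:Member'
  REXML::XPath.each(doc, path, NAMESPACES) do |member|
    name = REXML::XPath.first(member, 'ab:PassportName', NAMESPACES)
    result << [member.attributes['xsi:type'], name && name.text]
  end
  result
end

# Shortened, made-up response: a single string without line breaks.
response =
  '<Membership xmlns="http://www.msn.com/webservices/AddressBook">' \
  '<MemberRole>Allow</MemberRole>' \
  '<Members>' \
  '<Member xsi:type="PassportMember" ' \
  'xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">' \
  '<PassportName>alice@passport.com</PassportName>' \
  '</Member>' \
  '</Members>' \
  '</Membership>'

pairs = passport_members(response)
pairs.each { |type, name| puts "#{type}: #{name}" }
```

Because the document is built from a string, it makes no difference that the server does not split the response into lines.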
