XmlConfigFile usage

Hello,

Given the following XML config file, is there a better way to parse it than
the method below?

Goal:

servers = {
  "HQ" => {
    "host" => "orbite.eurocontrol.be",
    "dn"   => "uid=roberto",
    "pass" => "xxxxxxx",
    "base" => "o=eurocontrol",
    "filt" => "(eurocontrolsiteid=Bretigny)",
    "attributes" => [
      "uid",
      "sn",
      "givenname",
      "departmentname",
      "eurocontrolroomid",
      "telephonenumber",
      "mail",
      "userpassword",
    ],
  },
  "EEC" => {
    "host" => "ldap.eurocontrol.fr",
    "dn"   => "",
    "pass" => "xxxxx",
    "base" => "o=eurocontrol",
    "filt" => "(&(objectclass=inetOrgPerson)(objectclass=mailRecipient))",
    "attributes" => [
      "uid",
      "sn",
      "givenname",
      "ou",
      "roomnumber",
      "telephonenumber",
      "mail",
      "userpassword",
    ],
  },
}

···

-=-=-
$debug = true

def parse_config(config_file)
  begin
    config = XmlConfigFile.new(config_file)
  rescue => err
    $stderr.puts "Error loading #{config_file}: #{err}"
    return nil   # nothing to parse if the file could not be loaded
  end

  # Collect the server names matched by //servers/name
  server_list = Array.new
  config.get_parameter_array("//servers/name").each {|h|
    h.each_value {|v| server_list << v }
  }

  # Now for every server, get all attributes
  servers = Hash.new
  server_list.each {|s|
    #
    # New server
    #
    puts "#{s}:" if $debug
    servers[s] = Hash.new

    # Get everything (attributes will be treated later)
    #
    config.get_parameter_array("//server[@name='#{s}']/*").each {|h|
      h.each {|k,v|
        attr = (k.split(/\./))[1]
        servers[s][attr] = v
        puts "  #{attr} -> #{v}" if $debug
      }
    }
    # Now get all attributes/* values
    #
    servers[s]["attributes"] = Array.new
    config.get_parameter_array("//server[@name='#{s}']/attributes/*").each {|h|
      h.each_value {|v|
        servers[s]["attributes"] << v
        puts "    attributes -> #{v}" if $debug
      }
    }
  }
  # Return the hash described under "Goal", not just the list of names
  return servers
end
-=-=-

config.xml
-=-=-

<?xml version="1.0" encoding="iso-8859-1" standalone="yes" ?>


HQ
EEC


orbite.eurocontrol.be
uid=roberto
xxxxx
o=eurocontrol
(eurocontrolsiteid=Bretigny)

uid
sn
givenname
departmentname
eurocontrolroomid
telephonenumber
mail
userpassword



ldap.eurocontrol.fr
cn=Directory Manager
xxxxxx
o=eurocontrol
(objectclass=inetOrgPerson)

uid
sn
givenname
ou
roomnumber
telephonenumber
mail
userpassword




telephonenumber
telephonenumber
>attr.sub(%r{^.*?(\d{4})$}, '\1')


eurocontrolroomid
roombumber





telephonenumber
telephonenumber
>"+33-1-5555" + attr


roombumber
eurocontrolroomid




-=-=-
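For comparison, here is a minimal sketch that builds the goal hash directly with REXML instead of going through get_parameter_array. The markup of config.xml is not visible above, so the layout below is an assumption inferred from the XPath queries in parse_config; in particular the root element name "config" and the child element name "attr" are hypothetical, and only two attributes per server are shown.

```ruby
require 'rexml/document'

# Hypothetical layout, inferred from the XPath queries used in parse_config.
# The element names "config" and "attr" are assumptions.
SAMPLE = <<~XML
  <config>
    <servers>
      <name>HQ</name>
      <name>EEC</name>
    </servers>
    <server name="HQ">
      <host>orbite.eurocontrol.be</host>
      <base>o=eurocontrol</base>
      <attributes>
        <attr>uid</attr>
        <attr>sn</attr>
      </attributes>
    </server>
    <server name="EEC">
      <host>ldap.eurocontrol.fr</host>
      <base>o=eurocontrol</base>
      <attributes>
        <attr>uid</attr>
        <attr>ou</attr>
      </attributes>
    </server>
  </config>
XML

# Build { server name => { child name => text, "attributes" => [texts] } }.
def servers_from_xml(xml)
  doc = REXML::Document.new(xml)
  servers = {}
  REXML::XPath.each(doc, '//server') do |server|
    entry = {}
    server.elements.each do |child|
      if child.name == 'attributes'
        # Collect the text of every child element into an array
        entry['attributes'] = child.elements.to_a.map { |e| e.text }
      else
        entry[child.name] = child.text
      end
    end
    servers[server.attributes['name']] = entry
  end
  servers
end

servers = servers_from_xml(SAMPLE)
puts servers['HQ']['host']
```

The separate list of names under //servers/name is not needed in this variant, because each server element already carries its name as an attribute.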

Ollivier ROBERT -=- Eurocontrol EEC/ITM -=- roberto@eurocontrol.fr
Usenet Canal Historique FreeBSD: The Power to Serve!

> Given the following XML config file, is there a better way to parse it than
> the method below?

No disrespect for XmlConfigFile and Maik’s work intended, I thought I’d
throw out my XmlSerialization lib for consideration. Assuming your Ruby
classes match the structure of the Xml, there’s not much Ruby code required
to load your file. You can find it in the RAA if you’re interested.

Chris

> Given the following XML config file, is there a better way to parse it than
> the method below?

Currently there is no better way, I think. But my “todo list” still has an
item about the interfaces of the methods get_parameters and
get_parameter_array. As I told Daniel Carrera before, I am trying to
implement functions that will fit your needs and Daniel’s.

The general problem is that XmlConfigFile was meant to handle the kind of XML
configuration files that I often use. I provided functionality for
reloading and so on, and the next version, to be released in the next few
days, makes it possible to change a configuration, to store it, and
to add observers to a configuration file that will be notified if it
changes.

I underestimated the need for true XML serialization and I think it’s
time to finalize Chris’ great work and to integrate it. Providing the
stuff many people obviously seem to miss will be a piece of cake then.
It is also the right time to do such things, because the rest of the
interface is stable.

Cheers,

> I underestimated the need for true XML serialization and I think it’s
> time to finalize Chris’ great work and to integrate it.

Well, I wouldn’t call it great (maybe groovy) … :-) … but we seem to have
some convergence here. If there’s anything we can do to merge our efforts,
I’m for it (though I unfortunately don’t have a lot of time to give, just
pockets here and there).

Do you think there’s a need for a separate XmlConfigFile lib? I don’t have
an opinion myself, I haven’t mulled it over at all. Is there anything
missing from my serialization lib (well, there definitely is in general, but
as relates to a conf file)? Does your approach have advantages in some
cases?

YAML gets more traffic here than XML does – maybe a generic
Conf/Serialization lib that could use either XML or YAML might be in order.
I dunno, just typing out loud.

Chris

Chris Morris wrote:

> Well, I wouldn’t call it great (maybe groovy) … :-) … but we seem to have
> some convergence here. If there’s anything we can do to merge our efforts,
> I’m for it (though I unfortunately don’t have a lot of time to give, just
> pockets here and there).

That would absolutely make sense, I think, and I will think about
concrete actions over the weekend.

> Do you think there’s a need for a separate XmlConfigFile lib? I don’t have
> an opinion myself, I haven’t mulled it over at all. Is there anything
> missing from my serialization lib (well, there definitely is in general, but
> as relates to a conf file)? Does your approach have advantages in some
> cases?

The point is IMHO that serialization and deserialization are very low-level
stuff. There are a lot more requirements for a configuration file
than simply mapping file formats to memory structures and vice versa.
For example, you want your configuration file to be reloaded
periodically, so you do not have to restart your application just
because your configuration changed. In this context it is a “must
have” feature that you can register observers with your configuration
that will be notified if something has changed.

Access to a configuration file should be as convenient as possible. I do
not want to code classes by myself just to access my application’s
configuration. Additionally, XPath seems to be a very convenient way to
walk through an XML file in general and especially through typical
configuration files. Take Ollivier’s problem for example: the
access still isn’t convenient enough. He simply wants to get his bunch
of parameters as a Hash, and the next version of XmlConfigFile will
provide just that:

config = XmlConfigFile.new('config.xml', 300) # Reload every 5 min.
servers = config.to_hash('/server')

That’s it!

If clxmlserial makes this easy, I will definitely integrate it, but the
user shouldn’t see it in my opinion.
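The reload-and-notify requirement described above can be sketched independently of any particular file format. This is only an illustration of the idea, not XmlConfigFile's implementation: the class name ReloadingConfig and its methods are hypothetical, and it reloads when asked (by comparing modification times) rather than on a timer.

```ruby
require 'tempfile'

# Hypothetical class, not part of XmlConfigFile.
class ReloadingConfig
  attr_reader :data

  # path:   file to watch
  # parser: block that turns the file's text into a configuration object
  def initialize(path, &parser)
    @path = path
    @parser = parser
    @observers = []
    @mtime = nil
    reload_if_changed
  end

  # Register a block that is called with the new data after each reload.
  def add_observer(&callback)
    @observers << callback
  end

  # Re-read the file if its modification time differs from the last one seen.
  # Returns true if a reload happened, false otherwise.
  def reload_if_changed
    mtime = File.mtime(@path)
    return false if mtime == @mtime
    @mtime = mtime
    @data = @parser.call(File.read(@path))
    @observers.each { |callback| callback.call(@data) }
    true
  end
end

# Usage: the "parser" here just strips whitespace; a real one would parse XML.
file = Tempfile.new('config')
file.write("first")
file.flush
config = ReloadingConfig.new(file.path) { |text| text.strip }
config.add_observer { |data| puts "configuration changed: #{data}" }

File.write(file.path, "second")
File.utime(Time.now, Time.now + 60, file.path) # force a visibly newer mtime
config.reload_if_changed
```

A periodic reload, as in the 300-second example above, could call reload_if_changed from a background thread; that part is left out to keep the sketch short.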

> YAML gets more traffic here than XML does

I did not have a close look at YAML and only skimmed the documentation
on yaml.org. Doing so, I did not really understand which benefits I
would get by using it. XML is a very mature technology that is supported
on every platform and by every language I know and especially its
"satellite technologies" (XPath, XSL, XLink, etc.) are very helpful. I
have been working with XML technologies for more than three years now in
C++, Perl, Java and Ruby and never missed a thing. But if so many people
are so enthusiastic about YAML, I definitely have to read the specs.

> – maybe a generic Conf/Serialization lib that could use either XML or YAML might be in order.
> I dunno, just typing out loud.

At least we should try to define a standard interface that will hide the
low-level stuff I have mentioned above.
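One way to picture such a standard interface is a single entry point that picks a parser by file extension and always hands back plain hashes. This is a rough sketch under stated assumptions: the module name Conf, its methods, and the XML-to-hash mapping are all hypothetical, and the XML side is deliberately naive.

```ruby
require 'yaml'
require 'rexml/document'

# Hypothetical module; neither XmlConfigFile nor clxmlserial defines it.
module Conf
  # Leaf elements become strings; other elements become hashes keyed by
  # child element name. Repeated child names overwrite each other, which
  # a real library would have to handle.
  def self.xml_to_hash(element)
    children = element.elements.to_a
    return element.text.to_s if children.empty?
    children.each_with_object({}) { |child, h| h[child.name] = xml_to_hash(child) }
  end

  # One parser per supported extension; callers never see the format.
  PARSERS = {
    '.xml'  => ->(text) { xml_to_hash(REXML::Document.new(text).root) },
    '.yaml' => ->(text) { YAML.safe_load(text) }
  }.freeze

  def self.parse(text, extension)
    parser = PARSERS.fetch(extension) do
      raise ArgumentError, "unsupported format: #{extension}"
    end
    parser.call(text)
  end
end

from_xml  = Conf.parse('<server><host>ldap.example.org</host></server>', '.xml')
from_yaml = Conf.parse("host: ldap.example.org\n", '.yaml')
puts from_xml == from_yaml
```

The design choice here is that the root element name is dropped on the XML side, so both inputs reduce to the same hash.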

Best wishes!

And/or YAXML. I'd love to see something which does this, as you can then go
with the object semantics of YAML, whilst having the ability to wrap it in
the syntax of XML.
http://yaml.org/xml.html

YAXML may be a subset of XML, but is probably sufficient for both
serialisation and config files? It's the only object serialisation
'standard' for XML that I've seen so far, apart from SOAP which is
unbelievably complex and ugly.

Regards,

Brian.

···

On Fri, Feb 28, 2003 at 01:53:47AM +0900, Chris Morris wrote:

YAML gets more traffic here than XML does -- maybe a generic
Conf/Serialization lib that could use either XML or YAML might be in order.
I dunno, just typing out loud.

Access to a configuration file should be as convenient as possible. I do
not want to code classes by myself just to access my application’s
configuration.

This is a good point. One feature for my lib suggested by someone (I believe
it was someone who worked on pickle - the Python xml serializer) was to have
Ruby auto-gen the class based on the read in xml structure, a very cool
idea. This solution would streamline things a bit more. I’m assuming there’d
be no hang-ups with implementation.

> YAML gets more traffic here than XML does – maybe a generic
> Conf/Serialization lib that could use either XML or YAML might be in order.
> I dunno, just typing out loud.

Not sure why there is more traffic for YAML over XML (though I have some guesses), but for those displeased with XML, for whatever
reasons, there are many alternatives:

http://www.pault.com/pault/pxml/xmlalternatives.html

And/or YAXML. I’d love to see something which does this, as you can then go
with the object semantics of YAML, whilst having the ability to wrap it in
the syntax of XML.
http://yaml.org/xml.html

YAXML may be a subset of XML, but is probably sufficient for both
serialisation and config files? It’s the only object serialisation
’standard’ for XML that I’ve seen so far, apart from SOAP which is
unbelievably complex and ugly.

YAXML isn’t so easy on the eyes, either.

In addition to SOAP serialization, there’s also XML-RPC
http://www.xmlrpc.com/spec

(I’m sure there are more, with varying degrees of “standard”-ness, but XML-RPC is what comes to mind.)

In the long run, the choice depends on whether you foresee the need for features available in one format that are not available in
others.

James


This is a good point. One feature for my lib suggested by someone (I believe
it was someone who worked on pickle - the Python xml serializer) was to have
Ruby auto-gen the class based on the read in xml structure, a very cool
idea. This solution would streamline things a bit more. I’m assuming there’d
be no hang-ups with implementation.

That’s definitely the right way and that would be the data binding I
have always looked for. All the Java data binding tools currently
available (Sun, Castor, Relaxer, etc.) do nearly exactly what was
described above, but they all do need an additional compile step. I.e.,
you put a DTD or an XML schema into a compiler, that generates Java
classes for XML de-/serializing. That’s definitely an extreme overkill,
because you would have to write a DTD or Schema for your configuration
file and you definitely do not want to do that, do you?

The dynamic nature of Ruby makes it possible to do all this without an
extra compiler and even without a DTD or a Schema: Just generate code
on the fly from a document instance … coooool …
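A small sketch of what generating a class on the fly from a document instance could look like, using Struct and REXML. This is an illustration only, not how clxmlserial or XmlConfigFile work; the helper name element_to_object and the sample document are hypothetical.

```ruby
require 'rexml/document'

# Turn an element into a Struct instance whose members are named after its
# child elements. Leaf elements become plain strings; a child name that
# occurs more than once becomes an array.
def element_to_object(element)
  children = element.elements.to_a
  return element.text if children.empty?

  names = children.map { |c| c.name.to_sym }.uniq
  klass = Struct.new(*names) # class created at run time, no schema needed
  values = names.map do |name|
    matches = children.select { |c| c.name.to_sym == name }
                      .map { |c| element_to_object(c) }
    matches.size == 1 ? matches.first : matches
  end
  klass.new(*values)
end

xml = '<server><host>ldap.example.org</host><port>389</port></server>'
server = element_to_object(REXML::Document.new(xml).root)
puts server.host
```

The sketch assumes element names are valid Ruby method names; names such as "room-number" would need mapping first.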

Cheers,

> > YAML gets more traffic here than XML does -- maybe a generic
> > Conf/Serialization lib that could use either XML or YAML might be in order.
> > I dunno, just typing out loud.

Not sure why there is more traffic for YAML over XML (though I have some guesses), but for those displeased with XML, for whatever
reasons, there are many alternatives:

http://www.pault.com/pault/pxml/xmlalternatives.html

Not really "being displeased", but XML is a language for marking up text,
and doesn't by default have semantics for doing data-handling jobs (like
storing config files or serialising objects). You can't say "I'll just
serialise this object to XML using xxxlibrary" and expect it to interoperate
with another platform; you'll be able to _parse_ it anywhere, but you'll end
up writing code to interpret the structure.

Also, some of those XML 'alternatives' are just different syntax for the
same thing, i.e. using indentation instead of <tag>...</tag>

YAXML isn't so easy on the eyes, either.

It doesn't look too bad to me. At least, a hash expands to
    <key1>value1</key1>
    <key2>value2</key2>
which is how you'd instinctively design an XML schema to do that. But I
haven't found an implementation to try.

In addition to SOAP serialization, there's also XML-RPC
http://www.xmlrpc.com/spec

Yep, played with that one too. I like its simplicity and I like having RPC
semantics. It's not very rich though (e.g. it can't serialise 'nil' and it
can't handle aliasing)

In the long run, the choice depends on whether you foresee the need for features available in one format that are not available in
others.

And taking a punt on whether it will be around in a year or two, which may
be important in some cases (e.g. if you standardise on a particular method
for interfacing with your customers). On that basis, I may end up having to
bite the bullet and learn SOAP.

Regards,

Brian.

···

On Fri, Feb 28, 2003 at 04:59:39AM +0900, jbritt@ruby-doc.org wrote:

> On Fri, Feb 28, 2003 at 01:53:47AM +0900, Chris Morris wrote:

> In the long run, the choice depends on whether you foresee the need for features available in one format that are not
> available in others.

And taking a punt on whether it will be around in a year or two,

Yeah; I guess I'd add that to the list of desirable features. :-)


“Brian Candler” B.Candler@pobox.com wrote in message
news:20030227205352.A91492@linnet.org

http://www.pault.com/pault/pxml/xmlalternatives.html

Btw: I couldn’t access this site (no DNS). Has anybody experienced the
same?

> Not really “being displeased”, but XML is a language for marking up text,
> and doesn’t by default have semantics for doing data-handling jobs (like
> storing config files or serialising objects).

I’d say this is at least not 100% true. XML applications are typically
classified as either “document centric” or “data centric”. The first is the
markup category, while the second focuses on representing tree structures.
An XML document is data centric if every XML element contains either PCDATA
or other elements, but never mixed content.
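The definition above can be turned into a small check. This is a sketch under the assumption that whitespace-only text between child elements does not count as PCDATA; the helper names mixed_content? and data_centric? are made up for illustration.

```ruby
require 'rexml/document'

# True if the element has both non-blank text and child elements.
def mixed_content?(element)
  has_text = element.texts.any? { |t| !t.to_s.strip.empty? }
  has_text && !element.elements.empty?
end

# True if no element in the subtree has mixed content.
def data_centric?(element)
  return false if mixed_content?(element)
  element.elements.to_a.all? { |child| data_centric?(child) }
end

config = REXML::Document.new('<server><host>ldap.example.org</host></server>')
prose  = REXML::Document.new('<p>XML is <em>also</em> used for markup.</p>')

puts data_centric?(config.root)
puts data_centric?(prose.root)
```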

> You can’t say “I’ll just
> serialise this object to XML using xxxlibrary” and expect it to interoperate
> with another platform; you’ll be able to parse it anywhere, but you’ll end
> up writing code to interpret the structure.

… unless you use some tools to do that for you. Typically configuration
information is interpreted, too - so this might not be too much of a
disadvantage for XML in this case.

> Also, some of those XML ‘alternatives’ are just different syntax for the
> same thing, i.e. using indentation instead of <tag>...</tag>

> > YAXML isn’t so easy on the eyes, either.

> It doesn’t look too bad to me. At least, a hash expands to
>     <key1>value1</key1>
>     <key2>value2</key2>
> which is how you’d instinctively design an XML schema to do that. But I
> haven’t found an implementation to try.

Personally I would model an XML schema for a hash differently. This would
look like


key1
val1


or

val1 ...

But I would definitely not name the elements after the key. It’s a bit
more verbose but IMHO this reflects the data modeling better and it lends
itself better for automated processing.
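The two ways of modelling a hash discussed here can be put side by side with REXML. The markup of the examples above did not survive, so the element names "hash", "entry", "key" and "value" below are assumptions chosen for illustration, as are the two helper names.

```ruby
require 'rexml/document'

# Style 1: element names are taken from the hash keys.
def hash_to_keyed_xml(hash, root_name = 'hash')
  root = REXML::Element.new(root_name)
  hash.each { |k, v| root.add_element(k.to_s).text = v.to_s }
  root
end

# Style 2: generic elements; keys and values are both element content.
# The names "entry", "key" and "value" are assumptions.
def hash_to_entry_xml(hash, root_name = 'hash')
  root = REXML::Element.new(root_name)
  hash.each do |k, v|
    entry = root.add_element('entry')
    entry.add_element('key').text = k.to_s
    entry.add_element('value').text = v.to_s
  end
  root
end

data = { 'key1' => 'val1', 'key2' => 'val2' }
keyed   = hash_to_keyed_xml(data)
entries = hash_to_entry_xml(data)
puts keyed
puts entries
```

The second style works for any key, while the first only works when every key happens to be a valid XML element name.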

Kind regards

robert

"Brian Candler" <B.Candler@pobox.com> wrote in message
news:20030227205352.A91492@linnet.org...
> > http://www.pault.com/pault/pxml/xmlalternatives.html

Btw: I couldn't access this site (no DNS). Has anybody experienced the
same?

Right now the DNS is OK (it maps to an IP) but the webserver is down:

$ telnet www.pault.com 80
Trying 66.33.50.162...
telnet: connect to address 66.33.50.162: Connection refused
telnet: Unable to connect to remote host

> Not really "being displeased", but XML is a language for marking up text,
> and doesn't by default have semantics for doing data-handling jobs (like
> storing config files or serialising objects).

I'd say this is at least not 100% true. XML application is typically
distinguished as "document centric" and "data centric".

What I was trying to say is, the XML *spec* does not define any particular
"data centric" ways of working. I can have a very simple database (say one
table with rows of customers) and I can make up my own way of turning this
into XML; but someone else will have chosen a different way.

Some examples might be:

  <customers>
    <customer id="1">
      <name>Joe Bloggs</name>
      <address>1 Disk Drive</address>
    </customer>
  </customers>

  <customers>
    <customer>
      <id>
        <integer>1</integer>
      </id>
      <name>
        <string>Joe Bloggs</string>
      </name>
      <address>
        <string>1 Disk Drive</string>
      </address>
    </customer>
  </customers>

And here is the version which XML RPC would give:

irb(main):002:0> require 'xmlrpc/marshal'
true
irb(main):003:0> XMLRPC::Marshal.dump_call('addcust',[{"Name"=>"Joe Bloggs","Address"=>"1 Disk Drive"}])

<?xml version="1.0" ?>
<methodCall>
<methodName>addcust</methodName>
<params>
  <param>
   <value>
    <array>
     <data>
      <value>
       <struct>
        <member>
         <name>Name</name>
         <value>
          <string>Joe Bloggs</string>
         </value>
        </member>
        <member>
         <name>Address</name>
         <value>
          <string>1 Disk Drive</string>
         </value>
        </member>
       </struct>
      </value>
     </data>
    </array>
   </value>
  </param>
</params>
</methodCall>

So this isn't a very good situation to be in: if I write (say) Ruby code to
implement the first case, and then I need to import the XML into (say) Perl,
I will have to rewrite everything from scratch in Perl, save the actual
low-level parsing of the XML.

Personally I would model an XML schema for a hash different.

That's the point - it's (unfortunately) a personal choice - even for
something as simple as a hash.

With YAML, it's defined for you: and therefore, it will interoperate with
other YAML implementations, in the sense that a hash created on machine A
will be turned into the same hash on machine B. It also defines ways of
encoding common scalar data types, graph structures where the same object
appears at several points, user-defined types, and so forth.
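A short sketch of both points with Ruby's standard YAML library: a hash survives a dump/load round trip unchanged, and an object referenced from two places is written once and comes back as one shared object. The sample data is made up, and `aliases: true` is needed because safe_load rejects aliases by default.

```ruby
require 'yaml'

# The same array object is referenced from both server entries.
shared = ['uid', 'mail']
servers = {
  'HQ'  => { 'host' => 'orbite.eurocontrol.be', 'attributes' => shared },
  'EEC' => { 'host' => 'ldap.eurocontrol.fr',   'attributes' => shared }
}

text = YAML.dump(servers)
copy = YAML.safe_load(text, aliases: true)

puts text
puts copy == servers
# The shared array is restored as a single object, not as two copies.
puts copy['HQ']['attributes'].equal?(copy['EEC']['attributes'])
```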

Regards,

Brian.

···

On Fri, Feb 28, 2003 at 10:23:28PM +0900, Robert Klemme wrote:

Now I get
"Brian Candler" B.Candler@pobox.com wrote in message
news:20030228142429.A92246@linnet.org

“Brian Candler” B.Candler@pobox.com wrote in message
news:20030227205352.A91492@linnet.org

http://www.pault.com/pault/pxml/xmlalternatives.html

Btw: I couldn’t access this site (no DNS). Has anybody experienced the
same?

Right now the DNS is OK (it maps to an IP) but the webserver is down:

$ telnet www.pault.com 80
Trying 66.33.50.162…
telnet: connect to address 66.33.50.162: Connection refused
telnet: Unable to connect to remote host

Yes, same for me. I overlooked the statement DNS failed OR server down.
ping does work though.

> What I was trying to say is, the XML spec does not define any particular
> "data centric" ways of working. I can have a very simple database (say one
> table with rows of customers) and I can make up my own way of turning this
> into XML; but someone else will have chosen a different way.

:-)

Personally I would model an XML schema for a hash different.

That’s the point - it’s (unfortunately) a personal choice - even for
something as simple as a hash.

With YAML, it’s defined for you: and therefore, it will interoperate with
other YAML implementations, in the sense that a hash created on machine A
will be turned into the same hash on machine B. It also defines ways of
encoding common scalar data types, graph structures where the same object
appears at several points, user-defined types, and so forth.

Ah, I see. Thanks for clarifying!

robert

Brian, your comments are spot on.

Note that the array in the below XML-RPC call would look like so in YAML:

  - Name: Joe Bloggs
    Address: 1 Disk Drive

The RPC call itself is expressed by YAML.rb’s RPC protocol like so:

  --- !okay/rpc
  addcust:
    - Name: Joe Bloggs
      Address: 1 Disk Drive
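To see that the plain list form really is an array holding one hash, it can be loaded with Ruby's standard YAML library. This sketch covers only the untagged array; the !okay/rpc document is left out because handling that tag depends on YAML.rb's RPC support.

```ruby
require 'yaml'

text = <<~DOC
  - Name: Joe Bloggs
    Address: 1 Disk Drive
DOC

# A sequence with one mapping: an Array containing a single Hash.
customers = YAML.safe_load(text)
puts customers.first['Name']
```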

···

On Friday 28 February 2003 07:24 am, Brian Candler wrote:

And here is the version which XML RPC would give:

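irb(main):002:0> require 'xmlrpc/marshal'
true
irb(main):003:0> XMLRPC::Marshal.dump_call('addcust',[{"Name"=>"Joe Bloggs","Address"=>"1 Disk Drive"}])

<?xml version="1.0" ?>
<methodCall>
<methodName>addcust</methodName>
<params><param><value><array><data><value>
  <struct>
    <member><name>Name</name><value><string>Joe Bloggs</string></value></member>
    <member><name>Address</name><value><string>1 Disk Drive</string></value></member>
  </struct>
</value></data></array></value></param></params>
</methodCall>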

So this isn’t a very good situation to be in: if I write (say) Ruby code to
implement the first case, and then I need to import the XML into (say) Perl,
I will have to rewrite everything from scratch in Perl, save the actual
low-level parsing of the XML.

With YAML, it’s defined for you: and therefore, it will interoperate with
other YAML implementations, in the sense that a hash created on machine A
will be turned into the same hash on machine B. It also defines ways of
encoding common scalar data types, graph structures where the same object
appears at several points, user-defined types, and so forth.

True. Which suggests one should consider starting out with an XML format known to interoperate.
It’s no different than if I invent my own CSV++ syntax to format my data, then try to pass it to a Perl app that expects YAML.

The comparisons between XML and YAML miss the mark because one is a general syntax specification for creating markup languages, and
the other is a specific markup language format. It makes more sense, perhaps, to compare YAML with (X)HTML, or SOAP/XML-RPC
serialization, or RD, or RDoc, or some other format that has predefined semantics.

James


Hi,

From: jbritt@ruby-doc.org
Sent: Saturday, March 01, 2003 2:02 AM

So this isn’t a very good situation to be in: if I write (say) Ruby code to
implement the first case, and then I need to import the XML into (say) Perl,
I will have to rewrite everything from scratch in Perl, save the actual
low-level parsing of the XML.

You can use SOAP and XML-RPC for cross-language interoperability,
though XML-RPC has several constraints on which objects interoperate,
as said in this thread (multi-ref object graphs, multi-byte characters such as
Japanese, user-defined types, nil and the empty string).

With YAML, it’s defined for you: and therefore, it will interoperate with
other YAML implementations, in the sense that a hash created on machine A
will be turned into the same hash on machine B. It also defines ways of
encoding common scalar data types, graph structures where the same object
appears at several points, user-defined types, and so forth.

SOAP as well. From a data serialization point of view,
YAML is XML + XML Namespace(sort-of) + SAX(stream model) +
DOM(tree model) + XML Schema Datatypes Part2(built-in types) +
SOAP Encoding(Collection and Mapping), all-in-one spec. [ruby-talk:54657]

> The comparisons between XML and YAML miss the mark because one is a general syntax specification for creating markup languages, and
> the other is a specific markup language format. It makes more sense, perhaps, to compare YAML with (X)HTML, or SOAP/XML-RPC
> serialization, or RD, or RDoc, or some other format that has predefined semantics.

Sure. (I think ‘RDoc’ is for an embedding format and I call the markup
’SimpleMarkup’ from its class name, though.)

There’s a brief summary of marshal/data serialization in Ruby.
http://rrr.jin.gr.jp/rwiki?cmd=view;name=Marshal

I should check the details of the YAML implementation and add it here…

Regards,
// NaHi

Hi,

Here’s SOAP! Is there somebody listening?!

From: “why the lucky stiff” ruby-talk@whytheluckystiff.net
Sent: Saturday, March 01, 2003 1:40 AM

Note that the array in the below XML-RPC call would look like so in YAML:

  - Name: Joe Bloggs
    Address: 1 Disk Drive

The RPC call itself is expressed by YAML.rb’s RPC protocol like so:

  --- !okay/rpc
  addcust:
    - Name: Joe Bloggs
      Address: 1 Disk Drive

And here is the version which XML RPC would give:

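irb(main):002:0> require 'xmlrpc/marshal'
true
irb(main):003:0> XMLRPC::Marshal.dump_call('addcust',[{"Name"=>"Joe Bloggs","Address"=>"1 Disk Drive"}])

<?xml version="1.0" ?>
<methodCall>
<methodName>addcust</methodName>
<params><param><value><array><data><value>
  <struct>
    <member><name>Name</name><value><string>Joe Bloggs</string></value></member>
    <member><name>Address</name><value><string>1 Disk Drive</string></value></member>
  </struct>
</value></data></array></value></param></params>
</methodCall>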

$ ruby -rsoap/marshal -e 'puts SOAPMarshal.dump({"Name"=>"Joe Bloggs", "Address"=>"1 Disk Drive"})'

<?xml version="1.0" encoding="utf-8" ?>

<env:Envelope xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:env="http://schemas.xmlsoap.org/soap/envelope/"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<env:Body>


Name
Joe Bloggs


Address
1 Disk Drive


</env:Body>
</env:Envelope>

Ugly? Indented by hand.

<?xml version="1.0" encoding="utf-8" ?>

<env:Envelope …>
<env:Body>
<Hash …>

Name
Joe Bloggs


Address
1 Disk Drive


</env:Body>
</env:Envelope>

In a Hash, a key might be an object, so <key1>value1</key1> and
<key2>value2</key2> are not enough for general usage.
(I agree these are enough for a config file.)

Regards,
// NaHi

There’s a brief summary of marshal/data serialization in Ruby.
http://rrr.jin.gr.jp/rwiki?cmd=view;name=Marshal

I should check detail of YAML implementation and add it here…

Since this is something of a permathread, I had thought about creating a page on the Ruby Garden wiki, but seeing this page now I
don’t want to duplicate information. (And, truth be told, I don’t have the time right now to do a proper job summarizing the various
ways one can serialize objects or read/write config data in Ruby, listing pros and cons for each option).

James


Hi,

From: NAKAMURA, Hiroshi [mailto:nahi@mwd.biglobe.ne.jp]
Sent: Saturday, March 01, 2003 10:45 AM

There’s a brief summary of marshal/data serialization in Ruby.
http://rrr.jin.gr.jp/rwiki?cmd=view;name=Marshal

I should check detail of YAML implementation and add it here…

I did it over the weekend. See
http://rrr.jin.gr.jp/rwiki?cmd=view;name=Marshal
AMarshal is a must to check.

Caveat: I tested the tools from an object marshalling point of view.
Someone else may want to consider which features are needed for
storing and restoring config files.

Regards,
// NaHi