Hello,
I need to read XML documents from an open network socket.
REXML’s Document.parse_stream works fine, but is a tad slow
for this application.
I installed libxml, but while there is a wrapper for the SAX
parser, I haven’t found a way to set the callbacks. This limits
its usefulness severely. It isn’t in the TODO file either.
Have I missed something here?
None of the DOM parsers seem to have an option to stop when you
have a valid (complete) document, which would be just what I
need.
Any suggestions? (Apart from “don’t use XML”.)
Cheers,
Han Holl
> I need to read XML documents from an open network socket. REXML's
> Document.parse_stream works fine, but is a tad slow for this
> application.
> I installed libxml, but while there is a wrapper for the SAX
> parser, I haven't found a way to set the callbacks. This limits
> its usefulness severely. It isn't in the TODO file either.
> Have I missed something here?
> None of the DOM parsers seem to have an option to stop when you
> have a valid (complete) document, which would be just what I
> need.
> Any suggestions? (Apart from "don't use XML".)
I haven't completed the SAX handlers; there's just some infrastructure
for it. I prefer the text reader interface over SAX, but I haven't
done either since I haven't had a need to support SAX. DOM + XPath
has satisfied all of my needs to date. Patches welcome, though.
Nag me enough about it, and I'll get to it, but it's not that high on
my list of things to work on.
http://people.FreeBSD.org/~seanc/TODO
-sc
--
Sean Chittenden
Sean Chittenden sean@chittenden.org wrote in message news:20030417011450.GN79923@perrin.int.nxad.com…
> I haven’t completed the SAX handlers; there’s just some infrastructure
> for it. I prefer the text reader interface over SAX, but I haven’t
> done either since I haven’t had a need to support SAX. DOM + XPath
> has satisfied all of my needs to date. Patches welcome, though.
> Nag me enough about it, and I’ll get to it, but it’s not that high on
> my list of things to work on.
No, I won’t nag you about it. If you don’t need it, you don’t need it.
It’s just surprising how little choice there is if you need to grab
XML documents from an input stream.
As far as I can see, libxml2 doesn’t support this at all, so a Ruby
wrapper, even with SAX, would do me no good.
I tried xmlparser, which claims to have a stream constructor, but it
is so bug-ridden that I had to give up.
I could yet try xmlscan, but it’s pure Ruby, and I doubt the performance
win over REXML would be earth-shattering. And xmlscan isn’t really well
documented.
So I’ll stick with REXML for the time being.
Cheers,
Han Holl
> > I haven't completed the SAX handlers; there's just some
> > infrastructure for it. I prefer the text reader interface over
> > SAX, but I haven't done either since I haven't had a need to
> > support SAX. DOM + XPath has satisfied all of my needs to date.
> > Patches welcome, though. Nag me enough about it, and I'll get
> > to it, but it's not that high on my list of things to work on.
> No, I won't nag you about it. If you don't need it, you don't need
> it. It's just surprising how little choice there is if you need to
> grab XML documents from an input stream.
Agreed.
> As far as I can see, libxml2 doesn't support this at all, so a Ruby
> wrapper, even with SAX, would do me no good.
As a matter of fact, it would, and libxml2's arguably the fastest SAX
parser out there. Its DOM is constructed via a set of SAX callbacks.
http://xmlbench.sourceforge.net/results/benchmark/index.html
Unless you're parsing documents that are hundreds of MB in size,
libxml2's pretty efficient. -sc
--
Sean Chittenden
Sean Chittenden sean@chittenden.org wrote in message news:20030417225405.GX79923@perrin.int.nxad.com…
> > As far as I can see, libxml2 doesn’t support this at all, so a Ruby
> > wrapper, even with SAX, would do me no good.
> As a matter of fact, it would, and libxml2’s arguably the fastest SAX
> parser out there. Its DOM is constructed via a set of SAX callbacks.
> http://xmlbench.sourceforge.net/results/benchmark/index.html
> Unless you’re parsing documents that are hundreds of MB in size,
> libxml2’s pretty efficient. -sc
I don’t doubt for a moment that you know libxml2 better than I, but the only
constructors I could find in the docs are xmlSAXUserParseFile and
xmlSAXUserParseMemory. The first takes a filename, and the second a pointer to
char.
So I assumed that libxml2 doesn’t do stream parsing.
I hope I’m wrong.
Cheers,
Han Holl
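For reference, the "stop once you have a complete document" behaviour asked about at the start of the thread can be approximated in pure Ruby by counting element depth with REXML's pull parser and stopping when the root element closes. This is only a minimal sketch under assumptions: read_one_document is a hypothetical helper (not part of REXML), a StringIO stands in for the network socket, and on a real socket REXML may buffer ahead of the document end or block while waiting for more input. It does nothing for the speed concern either.

```ruby
require "rexml/parsers/pullparser"
require "stringio"

# Hypothetical helper: pulls parse events from +io+ until the root element
# closes, then returns the element names seen along the way.
def read_one_document(io)
  parser = REXML::Parsers::PullParser.new(io)
  depth = 0
  names = []
  while parser.has_next?
    event = parser.pull
    if event.start_element?
      depth += 1
      names << event[0]      # event[0] is the element name
    elsif event.end_element?
      depth -= 1
      break if depth.zero?   # root element closed: document is complete
    end
  end
  names
end

# A StringIO stands in for the open socket here (an assumption of this sketch).
stream = StringIO.new("<msg><to>han</to><body>hi</body></msg>")
names = read_one_document(stream)
puts names.inspect
```

The same depth-counting idea should carry over to a Document.parse_stream listener, at the cost of needing a non-local exit to stop the parse early.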