I'm trying to read a gzipped XML file into REXML, but I haven't quite succeeded. Perhaps someone can help me. So far I have tried this:
#!/usr/bin/ruby -w
require 'zlib'
require 'rexml/document'
Zlib::GzipReader.open('file.dia') {|gz|
print gz.read
}
# this prints everything nicely and it works
f = Zlib::GzipReader.open("file.dia")
s = f.read
p s
# now the ungzipped contents are in s, they look like this however:
# "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<dia:diagram
# xmlns:dia=\"http://www.lysator.liu.se/~alla/dia/\">\n
# <dia:diagramdata>\n <dia:attribute name=\"background\">\n
# all in one line and encapsulated in ""
# so of course the next command fails
xmldoc = REXML::Document.new s
p xmldoc
# gives: <UNDEFINED> ... </>
Any ideas on this? This can't be too difficult, I think...
Guido
I'm trying to read a gzipped XML file into REXML, but I haven't quite
succeeded. Perhaps someone can help me. So far I have tried this:
#!/usr/bin/ruby -w
require 'zlib'
require 'rexml/document'
Zlib::GzipReader.open('file.dia') {|gz|
print gz.read
}
# this prints everything nicely and it works
f = Zlib::GzipReader.open("file.dia")
s = f.read
p s
You're not closing f here.
# now the ungzipped contents are in s, they look like this however:
# "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<dia:diagram
# xmlns:dia=\"http://www.lysator.liu.se/~alla/dia/\">\n
# <dia:diagramdata>\n <dia:attribute name=\"background\">\n
# all in one line and encapsulated in ""
# so of course the next command fails
I'm not so sure that it fails just because of this.
xmldoc = REXML::Document.new s
p xmldoc
# gives: <UNDEFINED> ... </>
Any ideas on this? This can't be too difficult, I think...
Guido
irb(main):003:0> xmldoc = Zlib::GzipReader.open('file.dia') {|gz| REXML::Document.new gz}
RuntimeError: Zlib::GzipReader is not a valid input stream. It must be
either a String, IO, StringIO or Source.
I'm not so sure that it fails just because of this.
It shouldn't, because ruby will determine the end of the file by
itself and close the handle on exiting.
I didn't mean to say that it fails because of the open file. My point
with the first remark was that it's a good habit to open files for only as
long as they are actually used. The block form is the idiom of choice
here: it's not much longer than a simple File.open() or File.new() and it
ensures the file is always properly closed.
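
To illustrate the block form described above, here is a minimal sketch. The sample XML and the temporary file name sample.dia are assumptions standing in for the real file.dia; the point is only that the reader is already closed once the block returns, and that the block's value is handed back to the caller.

```ruby
require 'zlib'
require 'tmpdir'

# Hypothetical sample content standing in for the real file.dia.
sample_xml = "<?xml version=\"1.0\"?>\n<diagram/>\n"

contents = nil
reader   = nil

Dir.mktmpdir do |dir|
  path = File.join(dir, 'sample.dia')

  # Write a small gzipped file so the example is self-contained.
  Zlib::GzipWriter.open(path) { |gz| gz.write(sample_xml) }

  # Block form: the reader is closed when the block exits,
  # and the value of the block (the inflated text) is returned.
  contents = Zlib::GzipReader.open(path) do |gz|
    reader = gz   # kept only so we can check closed? afterwards
    gz.read
  end
end

puts "reader closed after block: #{reader.closed?}"
puts "content round-tripped:     #{contents == sample_xml}"
```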
irb(main):003:0> xmldoc = Zlib::GzipReader.open('file.dia') {|gz|
REXML::Document.new gz}
RuntimeError: Zlib::GzipReader is not a valid input stream. It must
be either a String, IO, StringIO or Source.
This doesn't work either, I'm afraid...
I guess this is because GzipReader doesn't inherit IO:
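
One way to check that guess, sketched under the assumption that the gzipped data is small enough to hold in memory: ask the class for its ancestors, then inflate to a String and hand that to REXML, since a String is one of the inputs named in the error message. The in-memory sample document is hypothetical.

```ruby
require 'zlib'
require 'stringio'
require 'rexml/document'

# Check the class hierarchy rather than guessing.
is_io = Zlib::GzipReader.ancestors.include?(IO)
puts "GzipReader is an IO: #{is_io}"
puts "superclass:          #{Zlib::GzipReader.superclass}"

# Hypothetical in-memory stand-in for the contents of file.dia.
compressed = Zlib.gzip("<?xml version=\"1.0\"?>\n<diagram><layer/></diagram>\n")

# Inflate to a String first, closing the reader when the block ends,
# then parse the String with REXML.
xml_string = Zlib::GzipReader.wrap(StringIO.new(compressed)) { |gz| gz.read }
doc = REXML::Document.new(xml_string)

puts "root element:        #{doc.root.name}"
```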
If there is no exception during GZIP reading I guess there might be a bug
somewhere. As a test I'd write the gunzipped content to another file and
do a diff on the plain xml and this output to see whether GzipReader
actually yields the same content.
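
The comparison suggested above could be sketched like this. The file names plain.xml, sample.dia and gunzipped.xml and the sample XML are made up for the example; with the real data you would compare the gunzipped output against your own plain copy of the XML.

```ruby
require 'zlib'
require 'tmpdir'

# Hypothetical plain XML standing in for the uncompressed diagram.
plain_xml = "<?xml version=\"1.0\"?>\n<diagram>\n  <layer/>\n</diagram>\n"

identical = nil

Dir.mktmpdir do |dir|
  plain_path = File.join(dir, 'plain.xml')
  gz_path    = File.join(dir, 'sample.dia')
  out_path   = File.join(dir, 'gunzipped.xml')

  # The reference copy and a gzipped copy of the same content.
  File.binwrite(plain_path, plain_xml)
  Zlib::GzipWriter.open(gz_path) { |gz| gz.write(plain_xml) }

  # Write what GzipReader yields to a third file ...
  File.binwrite(out_path, Zlib::GzipReader.open(gz_path) { |gz| gz.read })

  # ... and compare it byte for byte with the reference copy.
  identical = File.binread(plain_path) == File.binread(out_path)
end

puts identical ? "no differences" : "files differ"
```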
Btw, did you actually try to make REXML read the first variant? Maybe you
have a problem in your XML file.
I'm not so sure that it fails just because of this.
It shouldn't, because ruby will determine the end of the file by
itself and close the handle on exiting.
I didn't mean to say that it fails because of the open file. My point
with the first remark was that it's a good habit to open files for only as
long as they are actually used. The block form is the idiom of choice
here: it's not much longer than a simple File.open() or File.new() and it
ensures the file is always properly closed.
You are right of course, I will try to do this in the future.
If there is no exception during GZIP reading I guess there might be a bug
somewhere. As a test I'd write the gunzipped content to another file and
do a diff on the plain xml and this output to see whether GzipReader
actually yields the same content.
They do
Btw, did you actually try to make REXML read the first variant? Maybe you
have a problem in your XML file.
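
A minimal sketch of that check, assuming a small document reconstructed from the fragment quoted earlier (the closing tags and the self-closing attribute element are my additions, not the real file). Note that p shows String#inspect, which adds the surrounding quotes and the \n and \" escapes; the string itself holds real newlines. As far as I can tell, <UNDEFINED> ... </> is likewise just how REXML abbreviates a document node in inspect output, so it need not mean the parse failed. Looking at the root element is a more telling test.

```ruby
require 'rexml/document'

# Hypothetical document modelled on the quoted fragment.
s = <<~XML
  <?xml version="1.0" encoding="UTF-8"?>
  <dia:diagram xmlns:dia="http://www.lysator.liu.se/~alla/dia/">
    <dia:diagramdata>
      <dia:attribute name="background"/>
    </dia:diagramdata>
  </dia:diagram>
XML

p s      # inspect form: one line, in quotes, with \n and \" escapes
puts s   # the actual content, with real line breaks

xmldoc = REXML::Document.new(s)

# Inspect the parsed tree instead of relying on `p xmldoc`.
root_name   = xmldoc.root.name
child_names = xmldoc.root.elements.to_a.map(&:name)

puts "root element:   #{root_name}"
puts "child elements: #{child_names.join(', ')}"
```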