Is there any way to get any useful data out of REXML::ParseException when you're working on a String? It never sets @position or @line or anything. I need to figure out exactly where the error-causing tag starts, and save it.
Any ideas?
-- rakaur
Out of curiosity, I tried libxml. It has nice error messages:
foo.xml:3:
parser error :
Opening and ending tag mismatch: title line 3 and txitle
<title>Foo</txitle>
^
but they go to stdout. You can capture them by registering an error
handler. Sample code:
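require 'xml/libxml'  # libxml-ruby bindings

parser = XML::Parser.new
parser.filename = "foo.xml"
msgs = []
# Collect libxml's error messages in an array instead of letting them print.
XML::Parser.register_error_handler lambda { |msg| msgs << msg }
begin
  parser.parse
rescue StandardError => e
  # The handler has already gathered the details by the time parse raises.
  puts "Error: #{msgs.join}"
end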
-- Mark.
On Sep 29, 10:56 pm, Eric Will <rak...@malkier.net> wrote:
Is there any way to get any useful data out of REXML::ParseException
when you're working on a String? It never sets @position or @line or
anything. I need to figure out exactly where the error-causing tag
starts, and save it.
Any ideas?
Out of curiosity, I tried libxml. It has nice error messages:
foo.xml:3:
parser error :
Opening and ending tag mismatch: title line 3 and txitle
<title>Foo</txitle>
^
but they go to stdout. You can capture them by registering an error
handler. Sample code:
parser = XML::Parser.new
parser.filename = "foo.xml"
msgs = []
XML::Parser.register_error_handler lambda { |msg| msgs << msg }
begin
parser.parse
rescue Exception => e
puts "Error: #{msgs}"
end
Interesting. I was thinking about doing libxml anyway. I do not like REXML.
Thanks.
-- rakaur
On Tue, Sep 30, 2008 at 9:59 AM, Mark Thomas <mark@thomaszone.com> wrote:
Actually, this isn't working for me. I'm using the SAX parser, and it
just calls Listener#on_parser_error with a string. Not helping me.
Why would you want to do that? You already have the XML as a string.
The only reason to put up with the awful interface and extra
complexity of SAX would be if your file doesn't fit into memory. And I
don't think the SAX interface to libxml is as complete/robust yet as
the DOM interface.
Go with the DOM interface. With libxml it's plenty fast.
-- Mark.
On Sep 30, 12:46 pm, Eric Will <rak...@malkier.net> wrote:
Actually, this isn't working for me. I'm using the SAX parser, and it
just calls Listener#on_parser_error with a string. Not helping me.
My situation requires SAX, unfortunately.
I need to parse and react to each tag as it comes in. If there's a
broken one, all tags up to the broken one must be processed, and the
broken one must be stored. I cannot do this in DOM, because if there's
an error, DOM will not process anything.
Also, I don't think those error messages can help me locate the
position in the string of the bad XML. They're pretty, for sure, but
not very useful to anyone but a human.
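For reference, the "process everything up to the broken tag" requirement can be sketched with a stream parser. This is only a minimal sketch and it uses REXML's stream API as a stand-in, not the libxml SAX interface discussed above; the TagCollector class and the process_until_error helper are hypothetical names made up for the illustration.

```ruby
require 'rexml/document'
require 'rexml/streamlistener'

# Hypothetical listener: records each start tag as the parser reports it.
class TagCollector
  include REXML::StreamListener
  attr_reader :tags

  def initialize
    @tags = []
  end

  # Called by the stream parser for every opening tag, in document order.
  def tag_start(name, _attrs)
    @tags << name
  end
end

# Hypothetical helper: returns the tags seen before parsing stopped,
# plus the exception (nil if the document was well-formed).
def process_until_error(xml)
  listener = TagCollector.new
  error = nil
  begin
    REXML::Document.parse_stream(xml, listener)
  rescue REXML::ParseException => e
    error = e
  end
  [listener.tags, error]
end

tags, error = process_until_error("<root><title>Foo</txitle></root>")
puts "processed: #{tags.inspect}"
puts "stopped on: #{error.message.lines.first}" if error
```

The callbacks fire as each event is parsed, so the tags before the mismatch have already been handled when the exception is raised. It does not by itself give the offset of the bad tag in the string.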
If you can receive an entire document at a time, libxml has a
'recover' mode that will correct what it can and process the entire
document -- even if it is not well-formed. It works surprisingly well.
Another option is writing your own recursive descent parser. See
http://snippets.dzone.com/posts/show/2190 for a starting point.
-- Mark.
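As a rough illustration of the hand-rolled route, here is a minimal sketch that finds the string offset where the first mismatched closing tag starts. It is a simple tag-stack scanner, not a full recursive descent parser and not the code from the linked snippet; the TAG pattern and the first_mismatch helper are hypothetical, and the sketch assumes input without comments, CDATA sections, or '>' characters inside attribute values.

```ruby
# Matches an opening, closing, or self-closing tag and captures its name.
TAG = %r{<(/)?([A-Za-z_][\w.\-]*)[^<>]*?(/)?>}

# Hypothetical helper: returns [opened_tags, offset], where offset is the
# character index of the first closing tag that does not match the
# innermost open tag, or nil if every closing tag matches.
def first_mismatch(xml)
  stack = []      # names of currently open tags
  processed = []  # every opening tag seen so far, in order
  pos = 0
  while (m = TAG.match(xml, pos))
    closing, name, self_closing = m[1], m[2], m[3]
    if closing
      return [processed, m.begin(0)] unless stack.last == name
      stack.pop
    else
      processed << name
      stack.push(name) unless self_closing
    end
    pos = m.end(0)
  end
  [processed, nil]
end

xml = "<doc>\n<title>Foo</txitle>\n</doc>"
opened, offset = first_mismatch(xml)
puts "opened: #{opened.inspect}"
puts "bad tag at offset #{offset}: #{xml[offset..]}" if offset
```

The offset points at the closing tag that failed to match; whether that tag or the earlier opening tag is the real culprit is a judgment the caller still has to make.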
On Sep 30, 2:09 pm, Eric Will <rak...@malkier.net> wrote:
My situation requires SAX, unfortunately.
I need to parse and react to each tag as it comes in. If there's a
broken one, all tags up to the broken one must be processed, and the
broken one must be stored. I cannot do this in DOM, because if there's
an error, DOM will not process anything.