Identifying where bad XML is

I am somewhat new to Ruby and would appreciate any help on this matter. I have a custom test assertion, "assert_valid_xml_response", which parses the output and checks for any bad XML.

My problem is that, with many changes constantly taking place and many of these assertions running, it is sometimes hard to pinpoint exactly where the bad XML is hiding.

Has anyone stumbled across a solution to easily zero in on the location of bad XML?

Ben Burdick wrote:

I am somewhat new to Ruby and would appreciate any help on this matter. I have a custom test assertion, "assert_valid_xml_response", which parses the output and checks for any bad XML.

My problem is that, with many changes constantly taking place and many of these assertions running, it is sometimes hard to pinpoint exactly where the bad XML is hiding.

Has anyone stumbled across a solution to easily zero in on the location of bad XML?

Um, depends on what is making it bad.

If I have access to the complete XML document, then I try loading it into IE or Firefox, which are pretty good at locating problems and issuing useful info.

Or I run it through tidy or xmllint or something.

Often the problem is a stray '<' or '&' someplace, so grep + a decent regexp could track it down, too.
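As a rough illustration of the grep-style approach, here is a minimal Ruby sketch that reports the line and column of each '&' that does not begin an entity or character reference. The constant STRAY_AMP and the method find_stray_ampersands are made-up names for this example, and the regexp is a heuristic: it will also flag a raw '&' inside a CDATA section or comment, where it is actually legal. Catching a stray '<' reliably is harder and is not attempted here.

```ruby
# Matches an '&' that is NOT followed by a named entity (&amp;),
# a decimal reference (&#38;) or a hex reference (&#x26;).
STRAY_AMP = /&(?!(?:[A-Za-z_][\w.-]*|#\d+|#x\h+);)/

# Returns an array of [line_number, column, line_text] for each stray '&'.
# Line numbers and columns are 1-based.
def find_stray_ampersands(xml)
  hits = []
  xml.each_line.with_index(1) do |line, lineno|
    line.scan(STRAY_AMP) do
      column = Regexp.last_match.begin(0) + 1
      hits << [lineno, column, line.chomp]
    end
  end
  hits
end
```

Called on a response body, this returns an empty array when nothing suspicious is found, so it could be used to build a more helpful failure message inside an assertion.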

If the XML is a set of repeated structures, you could try using the REXML stream or pull parser to read in chunks at a time, assemble smaller sub-docs, and check those for well-formedness. If one fails, it's then easier to write out the failing XML and inspect it.
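A simpler variant of that idea, sketched here with REXML's pull parser: instead of assembling sub-documents, it just counts how many complete records were read before the parser gave up, which tells you which record to go and look at. The method name records_before_failure and the record_tag parameter are assumptions made for this example, not part of REXML.

```ruby
require 'rexml/parsers/pullparser'

# Pulls events one at a time and counts closed <record_tag> elements.
# Returns [count, nil] if the whole document parsed, or
# [count, error_message] if the parser failed part-way through.
def records_before_failure(xml, record_tag)
  complete = 0
  parser = REXML::Parsers::PullParser.new(xml)
  while parser.has_next?
    event = parser.pull
    # For an end_element event, event[0] is the element name.
    complete += 1 if event.end_element? && event[0] == record_tag
  end
  [complete, nil]
rescue REXML::ParseException => e
  # Only the first line of the message; the rest is parser context.
  [complete, e.message.lines.first.to_s.strip]
end
```

If this reports, say, 2 complete records and an error, the problem is most likely in the third record or just after it.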

James

--

http://www.ruby-doc.org - The Ruby Documentation Site
http://www.rubyxml.com - News, Articles, and Listings for Ruby & XML
http://www.rubystuff.com - The Ruby Store for Ruby Stuff
http://www.jamesbritt.com - Playing with Better Toys

What exactly do you mean by "bad"? Do you mean "not well-formed"? In that case any decent XML parser (validating or non-validating) should give you information about where it chokes. If by "bad" you mean that it doesn't conform to an XML Schema or DTD, then a validating parser should point you to where the error lies.
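For the well-formedness case, a minimal sketch with REXML (which does not validate against a DTD or schema) could look like the following. The method name xml_error_report is invented for this example; REXML::ParseException and its line method come from REXML, though line can be nil if the parser has no position information.

```ruby
require 'rexml/document'

# Returns nil if the string parses as well-formed XML,
# otherwise a one-line report of where the parser gave up.
def xml_error_report(xml)
  REXML::Document.new(xml)
  nil
rescue REXML::ParseException => e
  # e.line may be nil, so fall back to a placeholder.
  "line #{e.line || '?'}: #{e.message.lines.first.to_s.strip}"
end
```

An assertion such as assert_valid_xml_response could include this report in its failure message, so that the failing test shows where the parser stopped instead of only that it failed.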

Kind regards

    robert
