Rexml - extreme speed issue

Its_Me · 21 April 2004 21:34

I am having extreme speed issues with some simple REXML code and want to
know if I am doing something fundamentally wrong. The source XML document
has a few thousand elements; attributes are not used much, and text nodes
are present. I need to do some surgery on the tree, walking sequentially
through the initially very flat tree and moving things into a 3-level
structure based on known target levels for the element types. It takes just
a single pass, and does include some delete_element and add_element calls.
The main method is:

@doc = REXML::Document.new @infile

# Constants, so the lists are visible inside the method below
# (plain local variables would not be in scope there).
TARGET_LEVEL_0_NODES = ["A", "B", "C"]
TARGET_LEVEL_1_NODES = ["a", "b", "c"]
TARGET_LEVEL_2_NODES = ["x", "y", "z"]

def relocate_siblings
  root = @doc.root
  level_0 = nil
  level_1 = nil
  level_2 = nil
  root.each_element do |curr|
    p curr
    if TARGET_LEVEL_0_NODES.include? curr.name
      # Top-level elements stay where they are.
      level_0 = curr
    elsif TARGET_LEVEL_1_NODES.include? curr.name
      # Move under the most recent level-0 element.
      curr.parent.delete_element(curr)
      level_0.add_element(curr)
      level_1 = curr
    else
      # Everything else moves under the most recent level-1 element.
      curr.parent.delete_element(curr)
      level_1.add_element(curr)
      level_2 = curr
    end
  end
end

This runs excruciatingly slowly, literally taking seconds between printing
nodes. Editing the tree by hand in XMLSpy would be only marginally slower.
The disk is not especially busy; CPU is at right about 100%.

Help!! What am I doing wrong?

I do not want to have to deal with XSLT!

Its_Me · 21 April 2004 22:14

“Its Me” itsme213@hotmail.com wrote in message

I am having extreme speed issues with some simple REXML code and want to
know if I am doing something fundamentally wrong.

Browsing the REXML code, it looks like deleting an element can be very
expensive (unless you know its index in its parent), on the order of
O(number_of_siblings). Perhaps this is the cause of the slowdown for me.
Looking at the Element API, I don’t think I have any real algorithmic
options. hmmm
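
To illustrate the cost described above: if each delete scans the sibling list, deleting every child of a flat parent is roughly quadratic overall. The following is only a rough sketch for measuring that. The helper names (flat_doc, delete_all_by_reference) and the sizes are made up for illustration, and it assumes a Ruby where `rexml/document` can be required; timings in a current REXML may look quite different from what was seen in 2004.

```ruby
require "rexml/document"
require "benchmark"

# Hypothetical helper: build a flat document with n child elements.
def flat_doc(n)
  doc = REXML::Document.new("<root/>")
  n.times { |i| doc.root.add_element("item", "id" => i.to_s) }
  doc
end

# Hypothetical helper: delete every child by reference, front to back.
# Each delete_element call has to locate the element among its siblings.
def delete_all_by_reference(doc)
  root = doc.root
  root.elements.to_a.each { |el| root.delete_element(el) }
  doc
end

# Time the deletion for a few sibling counts; if the per-delete cost is
# O(number_of_siblings), the total should grow faster than linearly.
[500, 1000, 2000].each do |n|
  doc = flat_doc(n)
  seconds = Benchmark.realtime { delete_all_by_reference(doc) }
  puts format("%5d siblings: %.3fs", n, seconds)
end
```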

Its_Me · 21 April 2004 22:29

“Its Me” itsme213@hotmail.com wrote in message
news:FgChc.8618$NR5.4154@fe1.texas.rr.com…

“Its Me” itsme213@hotmail.com wrote in message

I am having extreme speed issues with some simple REXML code and want to
know if I am doing something fundamentally wrong.

Browsing the REXML code, it looks like deleting an element can be very
expensive (unless you know its index in its parent), on the order of
O(number_of_siblings). Perhaps this is the cause of the slowdown for me.
Looking at the Element API, I don’t think I have any real algorithmic
options. hmmm

One option I had not clearly considered was … turning on my brain!

I did not have to delete nodes; using a more functional-programming style, I
now create a new tree. Runs like a champ!

Also learned: avoid deleting nodes in REXML if possible.
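
The post does not include the rewritten code, so the following is only a sketch of what "create a new tree" could look like, not the poster's actual solution. It reuses the element names from the first message; the method name nest_siblings, the LEVEL_0 and LEVEL_1 constants, the error handling, and the choice to reset the level-1 parent at each new level-0 element are all assumptions made for illustration.

```ruby
require "rexml/document"

# Element names from the original post; anything not listed is treated
# as a level-2 element.
LEVEL_0 = %w[A B C].freeze
LEVEL_1 = %w[a b c].freeze

# Build a new, nested document from a flat one. The source tree is only
# read, never modified, so no delete_element calls are needed.
def nest_siblings(source_doc)
  result = REXML::Document.new
  new_root = result.add_element(source_doc.root.name)
  level_0 = nil
  level_1 = nil

  source_doc.root.each_element do |curr|
    copy = curr.deep_clone # copies attributes and text children too
    if LEVEL_0.include?(curr.name)
      level_0 = new_root.add_element(copy)
      level_1 = nil # assumption: a new level-0 element starts a fresh group
    elsif LEVEL_1.include?(curr.name)
      raise ArgumentError, "<#{curr.name}> has no level-0 parent" unless level_0
      level_1 = level_0.add_element(copy)
    else
      raise ArgumentError, "<#{curr.name}> has no level-1 parent" unless level_1
      level_1.add_element(copy)
    end
  end
  result
end

# Usage: nest a small flat document and print the result.
flat = REXML::Document.new("<root><A/><a/><x>1</x><B/><b/><y/></root>")
puts nest_siblings(flat)
```

Because every add_element here appends to a freshly built parent, nothing has to be searched for and removed from the old sibling list.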

SER1 · 24 April 2004 18:19

“Its Me” itsme213@hotmail.com wrote in message news:axChc.13247$Dn1.9452@fe2.texas.rr.com…

I did not have to delete nodes; using a more functional-programming style, I
now create a new tree. Runs like a champ!

Also learned: avoid deleting nodes in REXML if possible.

Yeah, that’s a problem I need to see if I can address. Parsing and
printing trees are reasonably fast (well, about as fast as you can get
given the pure Ruby implementation), but modifying the tree can be
painful.

I don’t have a solution at the moment, but I am aware of the issue.

— SER
