I am having extreme speed issues with some simple REXML code and want to
know if I am doing something fundamentally wrong. The source XML document
has a few thousand elements; attributes are not used much, and text nodes
are present. I need to do some surgery on the tree, walking sequentially
through the initially very flat tree and moving things into a 3-level
structure based on known target levels for the element types. It takes just
a single pass, and does include some delete_element and add_element calls.
The main method is:
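    def relocate_siblings
      root = @doc.root
      level_0 = nil   # most recent level-0 element seen
      level_1 = nil   # most recent level-1 element seen
      level_2 = nil   # most recent level-2 element seen
      root.each_element do |curr|
        p curr
        if target_level_0_nodes.include? curr.name
          # level-0 elements stay where they are, directly under the root
          level_0 = curr
        elsif target_level_1_nodes.include? curr.name
          # move level-1 elements under the current level-0 element
          curr.parent.delete_element(curr)
          level_0.add_element(curr)
          level_1 = curr
        else
          # everything else goes under the current level-1 element
          curr.parent.delete_element(curr)
          level_1.add_element(curr)
          level_2 = curr
        end
      end
    end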
This runs excruciatingly slowly, literally taking seconds between printing
nodes. Editing the tree by hand in XMLSpy would be only marginally slower.
The disk is not particularly busy, and CPU is at right about 100%.
Browsing the REXML code, it looks like deleting an element can be very
expensive (unless you know its index in its parent), on the order of
O(number_of_siblings). Perhaps this is the cause of the slowdown for me.
Looking at the Element API, I don’t think I have any real algorithmic
options. Hmmm.
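
One way to check the suspicion above is to time the deletions on flat trees of increasing width. This is only a rough sketch: the helper names flat_doc and delete_all and the sibling counts are made up for illustration, and the timings will vary by machine and REXML version, so no expected numbers are given.

```ruby
require 'rexml/document'
require 'benchmark'

# Build a flat document with n children under a single root.
# (Hypothetical helper, not part of the original post.)
def flat_doc(n)
  doc = REXML::Document.new('<root/>')
  n.times { |i| doc.root.add_element("item#{i}") }
  doc
end

# Delete every child, passing the element object itself rather than
# its index, as the method in the original post does.
# Returns the number of child elements left (should be 0).
def delete_all(doc)
  root = doc.root
  root.elements.to_a.each { |el| root.delete_element(el) }
  root.elements.size
end

[500, 1000, 2000].each do |n|
  doc = flat_doc(n)
  secs = Benchmark.realtime { delete_all(doc) }
  puts format('%5d siblings: %.3f s', n, secs)
end
```

If the time grows much faster than the sibling count, deletion (or the traversal around it) is a likely culprit.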
One option I had not clearly considered was … turning on my brain!
I did not have to delete nodes; using more functional-programming style I
now create a new tree. Runs like a champ!
Also learned: avoid deleting nodes in rexml if possible.
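
A minimal sketch of what the "build a new tree" approach could look like. The actual code was not posted, so this is an assumption throughout: the element names in LEVEL_0_NAMES and LEVEL_1_NAMES are hypothetical stand-ins for the real target lists, and copying each element with deep_clone is just one way to avoid touching the source tree.

```ruby
require 'rexml/document'

# Hypothetical element names; the original post does not list them.
LEVEL_0_NAMES = %w[chapter].freeze
LEVEL_1_NAMES = %w[section].freeze

# Build a new, nested document from a flat one. Nothing is deleted
# from the source tree; each element is copied into its new position.
# Assumes the input starts with a level-0 element followed by a
# level-1 element, as the original method also does.
def build_nested(src_doc)
  out = REXML::Document.new
  # Shallow copy of the root: same name and attributes, no children.
  new_root = out.add_element(src_doc.root.clone)
  level_0 = nil
  level_1 = nil
  src_doc.root.each_element do |curr|
    copy = curr.deep_clone   # copies attributes, text and any children
    if LEVEL_0_NAMES.include?(curr.name)
      level_0 = new_root.add_element(copy)
    elsif LEVEL_1_NAMES.include?(curr.name)
      level_1 = level_0.add_element(copy)
    else
      level_1.add_element(copy)
    end
  end
  out
end

flat = REXML::Document.new(
  '<root><chapter/><section/><para>a</para><para>b</para></root>'
)
puts build_nested(flat)
```

The source document is left as it was, so the only tree operations are appends to the new tree.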
Yeah, that’s a problem I need to see if I can address. Parsing and
printing trees are reasonably fast (well, about as fast as you can get
given the pure Ruby implementation), but modifying the tree can be
painful.
I don’t have a solution at the moment, but I am aware of the issue.