Values overwritten into XML file when using REXML with string sub! function

Hi All,
        I am facing an issue while parsing an XML file using REXML, and I
don't know whether this is the expected behaviour or not.
Here is my scenario:
        I create a REXML document for an XML file and retrieve values with
the XPath.match method. I store the results in variables and then do
further processing. When I call sub! on any of those variables, it
changes the XML content too. See my example below.

My code snippet

require 'rexml/document'
include REXML

# Open the XML file as a REXML document
xml_doc = Document.new(File.open("sample.xml"))   # => <UNDEFINED> ... </>

maths_score = XPath.match(xml_doc, "college/students/student/marks/mathematics/@value")
# => [value='42']
maths_score = maths_score.pop.to_s   # => "42"
puts "Initial match score is #{maths_score}"
puts "Going to assign new value for maths score"
maths_score = "99"
maths_score = XPath.match(xml_doc, "college/students/student/marks/mathematics/@value")
# => [value='42']
puts "maths_score after update : #{maths_score}"
# No

college_name = XPath.match(xml_doc, "college/@name")   # => [name='ABC College']
college_name = college_name.pop.to_s   # => "ABC College"
puts "Initial college name : #{college_name}"
puts "college name variable class name : #{college_name.class}"

# Perform the substitution to remove "College".
# I expected this change to affect only the string variable college_name,
# but it doesn't.
college_name.sub!(/(.*)College/, '\1')   # => "ABC "
puts "After substitution the college name is #{college_name}"

# Now I try to get the original value from the XML document again.
org_college_name = XPath.match(xml_doc, "college/@name")   # => [name='ABC ']
# If the substitution succeeds, the value is overwritten in the XML content.
puts "Original college name : #{org_college_name}"

# The same kind of update did not happen for the maths score.
Output for my snippet:

Initial match score is 42
Going to assign new value for maths score
maths_score after update : [value='42']
Initial college name : ABC College
college name variable class name : String
After substitution the college name is ABC 
Original college name is [name='ABC ']
Original maths score is [value='42']

XML file:
<?xml version="1.0" encoding="UTF-8" ?>
<college id="12" name="ABC College" address="12th ABC nagar,Chennai,India">
    <students>
        <student>
            <name value="karthickraja"/>
            <regno value="101425"/>
            <marks>
                <language value="90"/>
                <second_language value="100"/>
                <mathematics value="42"/>
                <commerce value="89"/>
                <total value="321"/>
            </marks>
        </student>
    </students>
</college>

My question is:
    Is this a memory-related problem? Can anyone explain what is
happening here?

P.S.
My Ruby version is 2.3.0p0, running on Debian jessie.

Thanks in advance!

$ irb -r rexml/document
irb(main):001:0> dom = REXML::Document.new <<STR
irb(main):002:0" <foo>bar</foo>
irb(main):003:0" STR
=> <UNDEFINED> ... </>
irb(main):004:0> dom.root
=> <foo> ... </>
irb(main):005:0> dom.root.text
=> "bar"
irb(main):006:0> puts dom
<foo>bar</foo>
=> nil
irb(main):007:0> dom.root.text << "HAHA"
=> "barHAHA"
irb(main):008:0> puts dom
<foo>bar</foo>
=> nil
irb(main):009:0> dom.root
=> <foo> ... </>
irb(main):010:0> puts dom.root
<foo>bar</foo>
=> nil
irb(main):011:0> dom.root.text
=> "barHAHA"
irb(main):012:0> puts dom
<foo>bar</foo>
=> nil

You can see, the node's XML looks different than the member value. I
think this comes from a misuse of the API: you get a value object and
change it without REXML having any chance to notice. You could argue
that REXML then should freeze the value and that is a valid point IMO.
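
The difference between the maths score (plain assignment) and the college
name (sub!) can be reproduced without REXML at all. What follows is only a
sketch in plain Ruby: the Holder class is a hypothetical stand-in for any
object that hands out its internal String, which is what the behaviour above
suggests REXML does. It also shows what freezing the value would change.

```ruby
# Hypothetical stand-in for an object that returns its internal String.
class Holder
  def initialize(value)
    @value = value
  end

  # Returns the very same String object that is stored internally.
  def value
    @value
  end
end

holder = Holder.new(String.new("ABC College"))

# Plain assignment only rebinds the local variable; the holder is untouched.
name = holder.value
name = "something else"
after_assign = holder.value.dup

# sub! mutates the String object itself, so the holder sees the change.
name = holder.value
name.sub!(/(.*)College/, '\1')
after_sub = holder.value.dup

# With a frozen value the in-place change is rejected instead of leaking.
frozen = Holder.new("ABC College".freeze)
frozen_error = nil
begin
  frozen.value.sub!(/(.*)College/, '\1')
rescue => e   # FrozenError on Ruby >= 2.5, RuntimeError on older versions
  frozen_error = e.class
end

puts "after assignment: #{after_assign.inspect}"
puts "after sub!:       #{after_sub.inspect}"
puts "frozen value:     #{frozen_error}"
```

So the assignment of "99" never touched the document because it only pointed
the variable at a new String, while sub! changed the one String that both the
variable and the holder refer to.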

If you modify it properly, i.e. by assigning the value, you get the expected result:

irb(main):014:0> dom.root.text = "changed"
=> "changed"
irb(main):015:0> puts dom
<foo>changed</foo>

So I would never modify strings obtained from REXML in place. Instead, work
on copies or assign the new value through the API.
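
Applied to the attribute from the original question, that advice could look
roughly like this. It is a sketch, not the only way to do it: the inline XML
string is a shortened, assumed version of sample.xml, and the variable names
are made up for illustration.

```ruby
require 'rexml/document'

# Shortened stand-in for sample.xml (assumption: only the root attributes matter here).
xml = '<college id="12" name="ABC College"/>'
doc = REXML::Document.new(xml)

attr = REXML::XPath.first(doc, "college/@name")

# Option 1: the non-destructive sub returns a new String.
short_name = attr.value.sub(/(.*)College/, '\1')

# Option 2: take an explicit copy before calling a destructive method.
copy = attr.to_s.dup
copy.sub!(/(.*)College/, '\1')

# The document still holds the original value.
unchanged = doc.root.attributes["name"]

# If the document is supposed to change, assign through the API.
doc.root.attributes["name"] = short_name
updated = doc.root.attributes["name"]

puts "unchanged: #{unchanged.inspect}"
puts "updated:   #{updated.inspect}"
```

Either option keeps the document intact until the new value is assigned on
purpose.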

These days I would rather use Nokogiri for manipulating tag based
languages. I think it's the superior technology - and probably faster,
too.

Kind regards

robert

On Mon, May 30, 2016 at 8:24 PM, karthick bk <karthickraja.bksystems@gmail.com> wrote:

--
[guy, jim, charlie].each {|him| remember.him do |as, often| as.you_can
- without end}
http://blog.rubybestpractices.com/