REXML element reading <br /> error

When reading the Site element from my XML file using REXML, it seems
to chop off the rest of the text after the first <br/>.

The value in the XML file is below
<Site>123 street<br/>amstown<br/>amserland</Site>

element = REXML::XPath.first(doc, '//Site')

puts element.text # shows "123 street"

How can I get the full data? Once I have it, I can remove the <br/>.
I can't find any information on this.

JB

···

--
Posted via http://www.ruby-forum.com/.

I'd suggest using a bit more XPath, both text() and an each {} block to
iterate through the text nodes (which are distinct):

$ irb -r rexml/document --prompt xmp
a = REXML::Document.new("<Site>123 street<br/>amstown<br/>amserland</Site>")
# => <UNDEFINED> ... </>
REXML::XPath.first(a, '//Site').text
# => "123 street"
REXML::XPath.first(a, '//Site/text()').to_s
# => "123 street"
REXML::XPath.each(a, '//Site/text()') {|el| puts el}
123 street
amstown
amserland
# => ["123 street", "amstown", "amserland"]

HTH,
Keith

···

On 8/31/07, John Butler <johnnybutler7@gmail.com> wrote:

When reading the Site element from my XML file using REXML, it seems
to chop off the rest of the text after the first <br/>.

Not quite. It gives you the *first* text node.

The value in the XML file is below
<Site>123 street<br/>amstown<br/>amserland</Site>

element = REXML::XPath.first(doc, '//Site')

puts element.text # shows "123 street"

How can I get the full data? Once I have it, I can remove the <br/>.
I can't find any information on this.

You can't find any specific info because there isn't anything specific.
You have an XML element that contains a text node, an empty element named
br, another text node, another empty element named br and another text
node. In the XML world, <br/> is a node like any other.

The REXML::Element#texts method is what you are looking for:

$ irb
irb(main):001:0> require "rexml/document"
=> true

irb(main):002:0> doc=REXML::Document.new("<Site>123 street<br/>amstown<br/>amserland</Site>")
=> <UNDEFINED> ... </>

irb(main):003:0> doc.root.texts
=> ["123 street", "amstown", "amserland"]

irb(main):004:0> doc.root.texts.join " "
=> "123 street amstown amserland"
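
If the goal is to treat each <br/> as a line break rather than a space, the same approach can be sketched as a script. This is only one possible variation: joining with "\n" and calling value on each text node (to get plain unescaped strings) are my assumptions about what the original poster wants.

```ruby
require "rexml/document"

doc = REXML::Document.new("<Site>123 street<br/>amstown<br/>amserland</Site>")

# Element#texts returns the text-node children of the element, in order.
# Text#value gives each node's content as a plain String.
lines = doc.root.texts.map(&:value)

# Assumption: every <br/> should become a newline in the final string.
address = lines.join("\n")

puts address
```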

Enjoy!

···

On Sat, 01 Sep 2007 04:57:57 +0900, John Butler wrote:

Hi,

At Sat, 1 Sep 2007 05:18:48 +0900,
Keith Fahlgren wrote in [ruby-talk:266990]:

I'd suggest using a bit more XPath, both text() and an each {} block to
iterate through the text nodes (which are distinct):

$ irb -r rexml/document --prompt xmp
a = REXML::Document.new("<Site>123 street<br/>amstown<br/>amserland</Site>")
# => <UNDEFINED> ... </>
REXML::XPath.first(a, '//Site').text
# => "123 street"

It seems that just REXML::XPath.first(a, '//Site').to_s
returns the whole content, markup included.
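
A short sketch of what that looks like in practice: to_s serializes the element back to XML, so the tags are still present and have to be stripped afterwards. The regex cleanup below is an assumption on my part and is only reasonable for simple content like this address; it is not a general way to remove markup.

```ruby
require "rexml/document"

a = REXML::Document.new("<Site>123 street<br/>amstown<br/>amserland</Site>")
site = REXML::XPath.first(a, '//Site')

# Element#to_s returns the serialized element, tags included.
markup = site.to_s

# Assumption: turn each <br/> into a newline, then drop any remaining tags.
# Fine for this flat example, fragile for anything more complex.
plain = markup.gsub(%r{<br\s*/>}, "\n").gsub(/<[^>]+>/, "")

puts markup
puts plain
```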

···

--
Nobu Nakada

Keith Fahlgren wrote:

REXML::XPath.each(a, '//Site/text()') {|el| puts el}

The assert_xpath plugin wraps that up in this convenient method:

    class REXML::Element
      def inner_text
        return self.each_element( './/text()' ){}.join( '' )
      end
    end
...
    def test_absolve_breaks
      a = REXML::Document.new("<Site>123 street<br/>amstown<br/>amserland</Site>")
      assert_equal "123 streetamstownamserland", a.inner_text
    end

Come to think of it, that's not terribly programmer-friendly! Let's upgrade
it a little...

      assert_equal "123 streetamstownamserland", a.inner_text
      assert_equal "123 street\namstown\namserland", a.inner_text("\n")
...
  def inner_text(interstitial = '')
    return self.each_element( './/text()' ){}.join(interstitial)
  end
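
For anyone who would rather not reopen REXML::Element or rely on the return value of each_element, roughly the same idea can be sketched with REXML::XPath.match, which returns the matching nodes as an array. The helper name inner_text_of is hypothetical and is not part of REXML or assert_xpath.

```ruby
require "rexml/document"

# Hypothetical helper: collect every descendant text node of `node`
# and join the pieces with `separator` (empty by default).
def inner_text_of(node, separator = "")
  REXML::XPath.match(node, ".//text()").map(&:value).join(separator)
end

doc = REXML::Document.new("<Site>123 street<br/>amstown<br/>amserland</Site>")

joined  = inner_text_of(doc.root)
by_line = inner_text_of(doc.root, "\n")

puts joined
puts by_line
```

Keeping it as a plain method avoids patching a library class, at the cost of a slightly less fluent call site.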

···

--
  Phlip
  Test Driven Ajax (on Rails) [Book]
  assert_xpath, assert_javascript, & assert_ajax