How to extract something in between a pattern

Hi experts,

I'm not very familiar with Ruby's library. I wonder if there is a method
that can extract something in between a pattern? For example,

I have a string: a=aabbcc<p>ccddee</p>

I want to get anything between <p> and </p>, which is ccddee

Thanks in advance.

--
Posted via http://www.ruby-forum.com/.

C:\Users\Alex>irb
irb(main):001:0> require 'hpricot'
=> true
irb(main):002:0> a="aabbcc<p>ccddee</p>"
=> "aabbcc<p>ccddee</p>"
irb(main):003:0>
irb(main):004:0* doc=Hpricot(a)
=> #<Hpricot::Doc "aabbcc" {elem <p> "ccddee" </p>}>
irb(main):005:0> p doc.at('p').inner_text
"ccddee"
=> nil
irb(main):006:0>

Li


If you want to use a regexp, a quick and dirty way would be:

(a.split %r{</?p>})[1]
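
To spell out what the split one-liner does, here is a small sketch. It assumes the input looks like the example string (plain <p> tags with no attributes); the variable names parts, inner, inners and missing are only for illustration.

```ruby
a = "aabbcc<p>ccddee</p>"

# Split on either the opening or the closing tag.
parts = a.split(%r{</?p>})
inner = parts[1]

# With several paragraphs, the odd-indexed pieces are the tag contents.
b = "aabbcc<p>ccddee</p>ccc<p>eee</p>tail"
inners = b.split(%r{</?p>}).each_with_index.select { |_, i| i.odd? }.map(&:first)

# If the string has no <p> at all, index 1 does not exist and you get nil.
missing = "no tags here".split(%r{</?p>})[1]

p parts
p inner
p inners
p missing
```

The split approach does not check that the tags are balanced, so it is best kept for quick one-off jobs.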

Ya, in this case (HTML/XML), Hpricot is your best bet.

Otherwise, standard regex stuff would apply, imo.
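
For the non-HTML case, a rough sketch of the "standard regex stuff" might look like the following. The helper name between is made up for this example and is not part of Ruby's library; it assumes the delimiters are literal strings and that you want the text up to the first closing delimiter.

```ruby
# Hypothetical helper: returns the text between the first `open` and the
# next `close`, or nil when either delimiter is missing.
def between(str, open, close)
  # Regexp.escape keeps characters such as "[" from being read as regex syntax.
  # The /m flag lets "." match newlines, and ".*?" stops at the first `close`.
  pattern = /#{Regexp.escape(open)}(.*?)#{Regexp.escape(close)}/m
  str[pattern, 1]
end

p between("aabbcc<p>ccddee</p>", "<p>", "</p>")
p between("key=[value] rest", "[", "]")
p between("no delimiters", "[", "]")
```

String#[] with a regexp and a capture index returns just that capture group, which avoids going through a MatchData object.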

--
todb@planb-security.net | ICQ: 335082155 | Note: Due to Google's
privacy policy <http://tinyurl.com/5xbtl> and the United States'
policy on electronic surveillance <http://tinyurl.com/muuyl>,
please do not IM/e-mail me anything you wish to remain secret.

suroot57@gmail.com wrote:

If you want to use a regexp, a quick and dirty way would be:

(a.split %r{</?p>})[1]
  

or:
irb(main):001:0> a = 'aabbcc<p>ccddee</p>'
=> "aabbcc<p>ccddee</p>"
irb(main):002:0> a[%r{<p>(.*)</p>}, 1]
=> "ccddee"
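
One thing to watch with (.*) is that it is greedy, so it behaves differently once the string holds more than one <p> pair. A small sketch of the difference, using a two-paragraph string for illustration (the variable names are only for this example):

```ruby
a = 'aabbcc<p>ccddee</p>ccc<p>eee</p>'

# Greedy: ".*" runs to the LAST </p> in the string.
greedy = a[%r{<p>(.*)</p>}, 1]

# Lazy: ".*?" stops at the FIRST </p> after the opening tag.
lazy = a[%r{<p>(.*?)</p>}, 1]

# "." does not match a newline unless the regexp has the /m flag.
multi = "<p>line1\nline2</p>"
without_m = multi[%r{<p>(.*?)</p>}, 1]
with_m = multi[%r{<p>(.*?)</p>}m, 1]

p greedy
p lazy
p without_m
p with_m
```

For the single-paragraph string from the question both forms give the same result.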


--
Ittay Dror <ittayd@tikalk.com>
Tikal <http://www.tikalk.com>
Tikal Project <http://tikal.sourceforge.net>

--
Ittay Dror <ittay.dror@gmail.com>

Or:

irb(main):004:0> a = 'aabbcc<p>ccddee</p>ccc<p>eee</p>'
=> "aabbcc<p>ccddee</p>ccc<p>eee</p>"
irb(main):005:0> a.scan(%r{<p>([^<]*)</p>})
=> [["ccddee"], ["eee"]]

I prefer to specify explicitly which character(s) not to match.
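
A short sketch of how the scan result can be used, and of one trade-off of the negated character class. The second string with a nested <b> tag is an assumption added for illustration, not something from the original question.

```ruby
a = 'aabbcc<p>ccddee</p>ccc<p>eee</p>'

# scan returns one array per match when the regexp has a capture group;
# flatten turns that into a plain list of strings.
texts = a.scan(%r{<p>([^<]*)</p>}).flatten

# Trade-off: [^<]* cannot cross any "<", so a paragraph that contains
# another tag is not matched at all.
nested = '<p>a<b>x</b></p>'
strict = nested.scan(%r{<p>([^<]*)</p>}).flatten

# A lazy ".*?" does match it, and returns the inner markup as-is.
loose = nested.scan(%r{<p>(.*?)</p>}).flatten

p texts
p strict
p loose
```

Which behaviour is preferable depends on whether nested tags should count as part of the text you are extracting.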
