Regexp Help

Hi,

I am looking for a way to do the following:

I have some HTML in strings, mostly links (e.g. <a
href="http://mysite.com">). I need to delete these links. I've been
trying to do this, but it's not working out:

question.gsub! ("<a href=\"\/([.]+)\/\" >$/i", "")
or also: question.gsub! ("<a href=\"\/([a-zA-z0-9]+)\/\" >$/i", "")

(and various permutations of these two).

How would one go about this?

Thanks,
Jillian

--
Posted via http://www.ruby-forum.com/.

q = 'Hello <a href="http://mysite.com">world<a href="http://mysite.com">.'

result = q.gsub(/<a.*?>/, "")
puts result

--output:--
Hello world.
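
As for why the original attempts match nothing, here is a minimal sketch. When gsub is given a String rather than a Regexp, it searches for that exact text, so the brackets, the `$` and the trailing `/i` are all taken literally. The sample string and the variable names `literal` and `stripped` are made up for illustration, and the second pattern is just one possible way to also drop the closing tag.

```ruby
# Sample input, made up for illustration.
question = 'Hello <a href="http://mysite.com">world</a>.'

# A String pattern is matched literally, so nothing in it acts as a regexp.
literal = question.gsub("<a href=\"\/([.]+)\/\" >$/i", "")

# A Regexp literal matches the opening tag and, via \/?, the closing one.
# \b keeps tags such as <abbr> from matching; [^>]* stops at the first ">".
stripped = question.gsub(/<\/?a\b[^>]*>/i, "")

puts literal   # same text as question, nothing was removed
puts stripped
```

The link text ("world") is kept either way; only the tags themselves are removed.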

Why not =>

irb(main):001:0> require 'nokogiri'
  => true
irb(main):002:0> doc = Nokogiri::HTML(<<-eohtml)
irb(main):003:1" <html>
irb(main):004:1" <body>
irb(main):005:1" <a href="http://mysite.com">Bla</a>
irb(main):006:1" </body>
irb(main):007:1" </html>
irb(main):008:1" eohtml

doc.xpath('//a')[0].attributes["href"].to_s
  => "http://mysite.com"

http://tenderlovemaking.com/2008/10/30/nokogiri-is-released/
Cheers

On 28.07.2009, at 08:12, Jillian Kozyra wrote:

Hi,

I am looking for a way to do the following:

I have some html in strings, mostly links (i.e. <a
href="http://mysite.com">). I need to delete these lines. I've been
trying to do this, but it's not working out:

question.gsub! ("<a href=\"\/([.]+)\/\" >$/i", "")
or also: question.gsub! ("<a href=\"\/([a-zA-z0-9]+)\/\" >$/i", "")

(and various permutations of these two).

How would one go about this?

Thanks,
Jillian

I would write it like this:

result = q.gsub(/<[^>]*>/, "")
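
One difference between this pattern and /<a.*?>/, sketched with a made-up sample string: `.` does not match a newline unless the `m` flag is given, so /<a.*?>/ skips a tag that has been wrapped across two lines, while `[^>]` matches any character except `>`, newlines included. Note that /<[^>]*>/ removes every tag, not only links. The names `non_greedy` and `all_tags` are only for illustration.

```ruby
# Made-up sample: the second <a ...> tag is wrapped across two lines.
q = "Hello <a href=\"http://mysite.com\">world<a\nhref=\"http://mysite.com\">, <b>bye</b>."

# "." stops at the newline, so the wrapped tag is left in place.
non_greedy = q.gsub(/<a.*?>/, "")

# [^>] also matches "\n", so the wrapped tag is removed, along with <b> and </b>.
all_tags = q.gsub(/<[^>]*>/, "")

puts non_greedy
puts all_tags
```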

Thanks

Actually, doc.content would do what the OP requested.

Ray

On Tue, Jul 28, 2009 at 10:20 AM, Kai König <kai@kairichardkoenig.com> wrote:

Why not =>

doc.xpath('//a')[0].attributes["href"].to_s
=> "http://mysite.com"