Making one reg ex out of two

Jim_Burgess1 · 9 October 2009 12:46

I have a string:

course_info =
"<course><date>10.10.09<date><title>Maths</title></course>"

I want to use a reg ex to extract the title from the string, in this
case "Maths"

So far I came up with:

course_info.gsub!(/^(.*)<title>/, "")
course_info.gsub!(/<\/title>(.*)$/, "").chomp!
p course_info
# -> "Maths"

Is it possible to amalgamate these two regular expressions into one, or to
improve this in any way?
Am grateful for any help.

--
Posted via http://www.ruby-forum.com/.

Gavin_Kistner3 · 9 October 2009 13:10

Slim2:~ phrogz$ irb
irb(main):001:0> course_info =
"<course><date>10.10.09<date><title>Maths</title></course>"
=> "<course><date>10.10.09<date><title>Maths</title></course>"

irb(main):002:0> course_info[ /<title>([^<]+)/, 1 ]
=> "Maths"

irb(main):003:0> %r{<title>(.+?)</title>}.match(course_info).to_a
=> ["<title>Maths</title>", "Maths"]
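
For anyone puzzling over the two forms above, here is a minimal sketch of the same idea as a plain script rather than an irb session. The helper name `extract_title` is made up for illustration and is not part of the thread.

```ruby
# Hypothetical helper (not from the thread): returns the text of the first
# <title> element, or nil when the string contains no title.
def extract_title(xml)
  # String#[] with a regexp and a capture index returns that capture group,
  # or nil if the regexp does not match.
  xml[%r{<title>(.+?)</title>}, 1]
end

course_info = "<course><date>10.10.09<date><title>Maths</title></course>"

puts extract_title(course_info)       # prints Maths
p extract_title("<course></course>")  # prints nil
```

The `[^<]+` variant stops at the first `<` and so does not need the closing tag in the pattern, while the lazy `.+?` variant relies on `</title>` to know where to stop. Note that `.` does not match a newline by default, so neither form is meant for titles that span lines.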


Mark_Thomas · 12 October 2009 19:10

This looks like an XML string. Did course_info come from an XML
document? An XML parser like Hpricot or Nokogiri would be an
improvement.
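
As a rough sketch of the parser approach, the example below uses REXML (which ships with Ruby) in place of Hpricot or Nokogiri, purely so that it runs without installing a gem. One assumption to flag: the string as posted has `<date>` where `</date>` was presumably intended, and a strict parser is likely to reject that, so the sample here uses the corrected closing tag.

```ruby
require "rexml/document"

# Assumption: the sample string from the thread, with the second <date>
# corrected to </date> so that it is well-formed XML.
course_info = "<course><date>10.10.09</date><title>Maths</title></course>"

doc = REXML::Document.new(course_info)

# The XPath "//title" selects title elements anywhere in the document;
# XPath.first returns the first such element.
title = REXML::XPath.first(doc, "//title").text
date  = REXML::XPath.first(doc, "//date").text

puts title  # prints Maths
puts date   # prints 10.10.09
```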


Alexey_Bovanenko · 13 October 2009 06:20

Hi! I suggest the following:

text="<course><date>10.10.09<date><title>Maths</title></course>"

# The slash in </title> must be escaped inside a /.../ regexp literal.
text.scan(/<title>([^<]+)<\/title>/) do
  puts $1
end
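
If the string can hold more than one course, `scan` can also collect all the titles in one go. A small sketch, assuming a made-up two-course string that does not appear in the thread:

```ruby
# Assumption: an invented string with two courses, to show scan
# returning several matches.
text = "<course><title>Maths</title></course>" \
       "<course><title>Physics</title></course>"

# With one capture group, scan returns an array of one-element arrays;
# flatten turns that into a plain list of titles.
titles = text.scan(%r{<title>([^<]+)</title>}).flatten

p titles  # prints ["Maths", "Physics"]
```

Writing the pattern with `%r{...}` avoids having to escape the slash in `</title>`.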


Jim_Burgess1 · 9 October 2009 13:27

Thanks very much for that.
It'll take me a while to figure out what you've done, but that seems to
work perfectly!


Jim_Burgess1 · 13 October 2009 05:44

> This looks like an XML string. Did course_info come from an XML
> document? An XML parser like Hpricot or Nokogiri would be an
> improvement.

That's a good idea. Thanks, I'll look into that.


Jim_Burgess1 · 13 October 2009 07:39

> This looks like an XML string. Did course_info come from an XML
> document? An XML parser like Hpricot or Nokogiri would be an
> improvement.

Quick update:
Just installed and followed a quick tutorial on Hpricot.
While effectively accomplishing the same task, this parser makes the
code much easier to read. I can now write:

doc = Hpricot.parse(File.read("courses.xml"))
(doc/:course).each do |course|
  if (course/:date).inner_html.match "#{date}"
    groups_that_had_lessons_this_month << (course/:title).inner_html
  end
end

as opposed to this (or worse):

File.open("courses.txt", 'r') do |datei|
  datei.each_line do |line|
    if line.match "#{date}"
      groups_that_had_lessons_this_month << line[ /<title>([^<]+)/, 1 ]
    end
  end
end

Thanks for the recommendation, and thanks everyone else for the answers
too.


Bertram_Scharpf · 9 October 2009 15:04

Hi,

On Friday, 09 Oct 2009, 22:27:01 +0900, Jim Burgess wrote:

> Gavin Kistner wrote:
> > Slim2:~ phrogz$ irb
> > irb(main):001:0> course_info =
> > "<course><date>10.10.09<date><title>Maths</title></course>"
> > => "<course><date>10.10.09<date><title>Maths</title></course>"
> >
> > irb(main):002:0> course_info[ /<title>([^<]+)/, 1 ]
> > => "Maths"
> >
> > irb(main):003:0> %r{<title>(.+?)</title>}.match(course_info).to_a
> > => ["<title>Maths</title>", "Maths"]
>
> Thanks very much for that.
> It'll take me a while to figure out what you've done, but that seems to
> work perfectly!

Shorter:

course_info =~ %r{<title>(.+?)</title>}
p $1
# -> "Maths"

Bertram

--
Bertram Scharpf
Stuttgart, Deutschland/Germany
http://www.bertram-scharpf.de
