Hello. I think your approach of using gsub is not the best one possible
here. It's better to simply find the matching part using match and
substitute it for the whole string, like this:
place1 = string1.match(/(\d+)th\b/)[1]
The \b ensures that the character after 'th' is not a word
character (\b is a word boundary), and the [1] at the end extracts the
first capture group. It also makes it possible to skip the .* at both
ends, which is a bit ugly.
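To illustrate what the \b buys you, here is a small sketch. The sample strings are made up for illustration; only the pattern comes from the line above.

```ruby
# Pattern from the post: digits, then "th", then a word boundary.
pattern = /(\d+)th\b/

# "th" is followed by "." (a non-word character), so \b matches here.
place1 = "He is the 20th.".match(pattern)[1]

# "th" is followed by "-" (also a non-word character), so this matches too.
century = "a 12th-century wall".match(pattern)[1]

# "th" is followed by "i" (a word character), so there is no boundary.
# match returns nil in that case.
no_match = "4thing".match(pattern)

puts place1
puts century
puts no_match.inspect
```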
Apart from that, a useful piece of knowledge about regexps:
/.*?(\d+)th.*/ will match what you want, because the first .*? will be
reluctant to eat up more characters, so it will pass to \d+ as many
digits as it can.
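A quick way to see the difference is to run the same pattern with a greedy and a reluctant prefix. This is only a sketch; the sample string is an assumption.

```ruby
s = "He is the 20th."

# Greedy .* grabs as much as it can and gives back only what it must,
# so \d+ is left with just the last digit.
greedy = s.match(/.*(\d+)th.*/)[1]

# Reluctant .*? grabs as little as it can, so \d+ gets all the digits.
reluctant = s.match(/.*?(\d+)th.*/)[1]

puts greedy     # prints 0
puts reluctant  # prints 20
```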
> Hello. I think your approach of using gsub is not the best one possible
> here.
Agree.
> It's better to simply find the matching part using match and
> substitute it for the whole string, like this:
> place1 = string1.match(/(\d+)th\b/)[1]
For extraction there is a simpler solution:
irb(main):002:0> "He is the 20th."[/(\d+)th\b/, 1]
=> "20"
irb(main):003:0> "25th"[/(\d+)th\b/, 1]
=> "25"
> The \b ensures that the character after 'th' is not a word
> character (\b is a word boundary), and the [1] at the end extracts the
> first capture group. It also makes it possible to skip the .* at both
> ends, which is a bit ugly.
Right.
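One practical difference between the two extraction forms shows up when the pattern does not match at all. A minimal sketch, with made-up sample strings:

```ruby
pattern = /(\d+)th\b/

# String#[] with a regexp and a group index returns the capture...
found = "He is the 20th."[pattern, 1]

# ...and simply returns nil when there is no match.
missing = "no ordinal here"[pattern, 1]

# match returns nil on failure, so indexing the result with [1]
# raises NoMethodError instead.
raised = begin
  "no ordinal here".match(pattern)[1]
  false
rescue NoMethodError
  true
end

puts found
puts missing.inspect
puts raised
```

So the String#[] form can be used without an extra nil check on the MatchData.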
> Apart from that, a useful piece of knowledge about regexps:
> /.*?(\d+)th.*/ will match what you want, because the first .*? will be
> reluctant to eat up more characters, so it will pass to \d+ as many
> digits as it can.
But reluctant is slow (see my benchmark from a few days ago).
Cheers
robert
2008/9/26 Thomas B. <tpreal@gmail.com>:
--
use.inject do |as, often| as.you_can - without end
>> It's better to simply find the matching part using match and
>> substitute it for the whole string, like this:
>> place1 = string1.match(/(\d+)th\b/)[1]
> For extraction there is a simpler solution:
> irb(main):002:0> "He is the 20th."[/(\d+)th\b/, 1]
> => "20"
> irb(main):003:0> "25th"[/(\d+)th\b/, 1]
> => "25"
Yes, I forgot about this one. +1
>> Apart from that, a useful piece of knowledge about regexps:
>> /.*?(\d+)th.*/ will match what you want, because the first .*? will be
>> reluctant to eat up more characters, so it will pass to \d+ as many
>> digits as it can.
> But reluctant is slow (see my benchmark from a few days ago).
OK. I guess reluctant is slow especially when the string that it has to
cover is long. And I agree that it's not a very good idea to use
reluctant regexps in time-critical applications, and the first solution
is much better here. I mentioned them just to let the original poster
gain some knowledge. I use reluctant patterns when not in a hurry, because
they make things much easier sometimes.
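For anyone who wants to measure this themselves, here is a rough sketch using the standard Benchmark module. It is not the benchmark mentioned above; the input string and the iteration count are arbitrary assumptions, and the timings will vary by machine and Ruby version.

```ruby
require 'benchmark'

# Hypothetical input: a long prefix that the leading .*? has to step over
# before \d+ can match.
text = ("x" * 10_000) + " 20th."

n = 200 # arbitrary iteration count

# Both patterns should extract the same value.
direct    = text[/(\d+)th\b/, 1]
reluctant = text[/.*?(\d+)th.*/, 1]

t_direct    = Benchmark.realtime { n.times { text[/(\d+)th\b/, 1] } }
t_reluctant = Benchmark.realtime { n.times { text[/.*?(\d+)th.*/, 1] } }

puts format("direct:    %.4fs", t_direct)
puts format("reluctant: %.4fs", t_reluctant)
```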