Regexp help

Hello everyone,

I have a string of the form

2h 3m

or

3m 2h

or

2h 3minutes

or

2hour 3min

and so on

Is there a smart regexp one liner that could produce

[2, 3]

If anyone types just for example

2

then that should produce [2]

for any of the above input? I know that there will be an m or an h.

/Marcus

Hello

I have a string of the form

2h 3m

or

3m 2h
[....]
Is there a smart regexp one liner that could produce

[2, 3]

  If you want to get [2,3] in both cases, that will be really difficult.
As far as I know, you can only do that in C#, which has named capturing
groups. In all the other languages I know, the capturing groups are
numbered when they are found... That rules it out.

  By the way, would it be difficult to implement named capturing groups
in regular expressions ? Would that interest someone ?

  Cheers !

  Vince
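
A note for later readers: Ruby gained named capturing groups in 1.9, written `(?<name>...)`, so the order-independent version Vince has in mind can be sketched roughly as below. This is only an illustration and not code from the thread. The constant `HM_RE`, the method `hours_and_minutes` and the group names `h` and `m` are made up for the example, and a bare number such as "2" is not handled (it yields an empty array).

```ruby
# Hypothetical helper, assuming Ruby 1.9 or later (named groups).
# Each lookahead searches the whole string, so the order of the two
# parts does not matter. The empty alternative after "|" lets a
# lookahead succeed when its part is missing, leaving the group nil.
HM_RE = /\A(?=.*?(?<h>\d+)\s*h|)(?=.*?(?<m>\d+)\s*m|)/

def hours_and_minutes(str)
  md = HM_RE.match(str)
  # Hours first, then minutes; drop whichever part was not present.
  [md[:h], md[:m]].compact.map { |n| n.to_i }
end

p hours_and_minutes("2h 3m")       # => [2, 3]
p hours_and_minutes("3m 2h")       # => [2, 3]
p hours_and_minutes("2hour 30min") # => [2, 30]
```

The names only make the result easier to read; it is the two lookaheads that remove the dependence on ordering.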

Marcus Bristav wrote:

Is there a smart regexp one liner that could produce

[2, 3]

r = Regexp.new(/(\d+)h.*(\d+)m/)
s1 = "2h 3m"
s2 = "2h 3minutes"
s3 = "2hour 3min"
m = r.match(s1)
p [m[1].to_i, m[2].to_i] # => [2, 3]
m = r.match(s2)
p [m[1].to_i, m[2].to_i] # => [2, 3]
m = r.match(s3)
p [m[1].to_i, m[2].to_i] # => [2, 3]

Regards,
Jordan
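
One thing to watch with the pattern above: `.*` is greedy, so when the minutes have more than one digit it swallows all but the last digit. The small check below shows this, with `\D*` swapped in as one possible alternative. That substitution is an assumption made for this example, not something from the original post.

```ruby
greedy = /(\d+)h.*(\d+)m/
strict = /(\d+)h\D*(\d+)m/  # \D* can only skip over non-digits

p greedy.match("2h 30m").captures      # => ["2", "0"]
p strict.match("2h 30m").captures      # => ["2", "30"]
p strict.match("2hour 3min").captures  # => ["2", "3"]
```

Like the original, neither pattern handles the "3m 2h" ordering.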

Hi,

From: "Marcus Bristav" <marcus.bristav@gmail.com>
Reply-To: ruby-talk@ruby-lang.org
To: ruby-talk@ruby-lang.org (ruby-talk ML)
Subject: Regexp help
Date: Fri, 29 Sep 2006 18:04:16 +0900

Hello everyone,

I have a string of the form

2h 3m

or

3m 2h

or

2h 3minutes

or

2hour 3min

and so on

Is there a smart regexp one liner that could produce

[2, 3]

If anyone types just for example

2

then that should produce [2]

for any of the above input? I know that there will be an m or an h.

/Marcus

str = "2h 3m" # or something
str.scan(/(\d+)(\w*)/).sort_by{|x|x[1]}.collect{|x|x[0].to_i}

Regards,

Park Heesob
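
To make the one-liner above easier to follow, here it is split into its three steps. The trick is that any unit starting with "h" sorts before any unit starting with "m", so sorting by the unit text puts hours first. The variable names `pairs` and `sorted` are only for illustration.

```ruby
str = "3m 2hour"

# 1. Collect [number, unit] pairs in the order they appear.
pairs = str.scan(/(\d+)(\w*)/)        # => [["3", "m"], ["2", "hour"]]

# 2. Sort by unit text: "hour" < "m", so hours come first.
sorted = pairs.sort_by { |x| x[1] }   # => [["2", "hour"], ["3", "m"]]

# 3. Keep only the numbers, as integers.
p sorted.collect { |x| x[0].to_i }    # => [2, 3]

# A bare number has an empty unit and still comes through.
p "2".scan(/(\d+)(\w*)/).sort_by { |x| x[1] }.collect { |x| x[0].to_i }  # => [2]
```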

I have a string of the form

[...]

Is there a smart regexp one liner that could produce

Hello Marcus,

here's my take on it:

times = %w{ 2hour3min 2h3minutes 3m2h 2h3m }
=> ["2hour3min", "2h3minutes", "3m2h", "2h3m"]

times.map{ |t| [t[/\d+h[a-z]*/].to_i, t[/\d+m[a-z]*/].to_i] }
=> [[2, 3], [2, 3], [2, 3], [2, 3]]

Probably a little slower than the other solutions but perhaps easier to grasp.

Regards
Matthias

Not so difficult, but it's not, as far as I can see, a
one liner. I am working something up at the moment
using an array of regexps.

--- Vincent Fourmond <vincent.fourmond@9online.fr>
wrote:


  Hello

> I have a string of the form
>
> 2h 3m
>
> or
>
> 3m 2h
> [....]
> Is there a smart regexp one liner that could produce
>
> [2, 3]

  If you want to get [2,3] in both cases, that will be really difficult.
As far as I know, you can only do that in C#, which has named capturing
groups. In all the other languages I know, the capturing groups are
numbered when they are found... That rules it out.

  By the way, would it be difficult to implement named capturing groups
in regular expressions ? Would that interest someone ?

  Cheers !

  Vince


ah neat, Jordan, and more elegant than parsing an
array of regexps:)


--- MonkeeSage <MonkeeSage@gmail.com> wrote:

Marcus Bristav wrote:
> Is there a smart regexp one liner that could produce
>
> [2, 3]

r = Regexp.new(/(\d+)h.*(\d+)m/)
s1 = "2h 3m"
s2 = "2h 3minutes"
s3 = "2hour 3min"
m = r.match(s1)
p [m[1].to_i, m[2].to_i] # => [2, 3]
m = r.match(s2)
p [m[1].to_i, m[2].to_i] # => [2, 3]
m = r.match(s3)
p [m[1].to_i, m[2].to_i] # => [2, 3]

Regards,
Jordan


Nice one, thanks a lot!

/Marcus


On 9/29/06, Park Heesob <phasis68@hotmail.com> wrote:

str = "2h 3m" # or something
str.scan(/(\d+)(\w*)/).sort_by{|x|x[1]}.collect{|x|x[0].to_i}

Regards,

Park Heesob

Hello again !

I have a string of the form

2h 3m

or

3m 2h
[....]
Is there a smart regexp one liner that could produce

[2, 3]

  If you want to get [2,3] in both cases, that will be really difficult.
As far as I know, you can only do that in C#, which has named capturing
groups. In all the other languages I know, the capturing groups are
numbered when they are found... That rules it out.

  Well, just to contradict myself, although this is no one-liner:

def scan(str)
  re = Regexp.new(/(\d+)h.*(\d+)m|(\d+)m.*(\d+)h/)
  if m = re.match(str)
    return [m[1], m[2]] if m[1]
    return [m[4], m[3]]
  end
end

p scan("2h 3m")
p scan("3m 2h")

  Cheers !

  Vince

Park Heesob wrote:

str = "2h 3m" # or something
str.scan(/(\d+)(\w*)/).sort_by{|x|x[1]}.collect{|x|x[0].to_i}

Very nice idea, Park! I wouldn't have thought of that. Slightly shorter:

   str.scan(/(\d+)(\w)/).sort_by{|n,u|u}.map{|n,u|n.to_i}

Regards,
Pit
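
A small caveat on the shorter version: `(\w)` requires exactly one character after the digits, so the bare-number case from the original question behaves differently than with `(\w*)`. The check below shows this, followed by one possible tweak using an optional letter. The `optional` pattern is an assumption for illustration, not something proposed in the thread.

```ruby
one_char = /(\d+)(\w)/
p "3m 2h".scan(one_char)  # => [["3", "m"], ["2", "h"]]
p "2".scan(one_char)      # => []
p "45".scan(one_char)     # => [["4", "5"]]  (the last digit is taken as the "unit")

# Assumed tweak: the unit is a single optional letter.
optional = /(\d+)([a-z]?)/
p "45".scan(optional).sort_by { |n, u| u }.map { |n, u| n.to_i }     # => [45]
p "3m 2h".scan(optional).sort_by { |n, u| u }.map { |n, u| n.to_i }  # => [2, 3]
```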

  Hello

> I have a string of the form
>
> 2h 3m
>
> or
>
> 3m 2h
> [....]
> Is there a smart regexp one liner that could produce
>
> [2, 3]

  If you want to get [2,3] in both cases, that will be really difficult.
As far as I know, you can only do that in C#, which has named capturing
groups. In all the other languages I know, the capturing groups are
numbered when they are found... That rules it out.

a

=> ["2h 3m", "3m 2h", "2h 3minutes", "2hour 3min", "2"]

re

=> /(?=.*\b(\d+)(?=h|\b))(?=.*\b(\d+)m|)/

a.map {|x| x.match(re).captures}

=> [["2", "3"], ["2", "3"], ["2", "3"], ["2", "3"], ["2", nil]]


On Fri, 29 Sep 2006, Vincent Fourmond wrote:

--
Relm
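
For anyone puzzling over the regexp above, here is the same pattern written in extended (`/x`) form with comments. It should behave identically; only the layout and the comments are added. Because both parts sit inside lookaheads anchored at the same position, the order of hours and minutes in the input does not matter.

```ruby
re = /
  (?=              # lookahead 1: hours, anywhere in the string
    .* \b (\d+)    #   group 1: a whole number ...
    (?= h | \b )   #   ... followed by "h", or by a word boundary for a bare "2"
  )
  (?=              # lookahead 2: minutes, anywhere in the string
    .* \b (\d+) m  #   group 2: a whole number followed by "m"
    |              #   or nothing at all, which leaves group 2 as nil
  )
/x

a = ["2h 3m", "3m 2h", "2h 3minutes", "2hour 3min", "2"]
p a.map { |x| x.match(re).captures }
# => [["2", "3"], ["2", "3"], ["2", "3"], ["2", "3"], ["2", nil]]
```

Note that an input with minutes only, such as "3m", does not match at all, so `match` returns nil there and calling `captures` on it would fail.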

But, of course, that *won't* capture "3m 2h", like you described...


On 29/09/06, Steve Callaway <sjc2000_uk@yahoo.com> wrote:

ah neat, Jordan, and more elegant than parsing an
array of regexps:)

--- MonkeeSage <MonkeeSage@gmail.com> wrote:

> Marcus Bristav wrote:
> > Is there a smart regexp one liner that could produce
> >
> > [2, 3]
>
> r = Regexp.new(/(\d+)h.*(\d+)m/)
> s1 = "2h 3m"
> s2 = "2h 3minutes"
> s3 = "2hour 3min"
> m = r.match(s1)
> p [m[1].to_i, m[2].to_i] # => [2, 3]
> m = r.match(s2)
> p [m[1].to_i, m[2].to_i] # => [2, 3]
> m = r.match(s3)
> p [m[1].to_i, m[2].to_i] # => [2, 3]
>
> Regards,
> Jordan
>


Vincent Fourmond wrote:

  Hello again !

I have a string of the form

2h 3m

or

3m 2h
[....]
Is there a smart regexp one liner that could produce

[2, 3]

  If you want to get [2,3] in both cases, that will be really difficult.
As far as I know, you can only do that in C#, which has named capturing
groups. In all the other languages I know, the capturing groups are
numbered when they are found... That rules it out.

  Well, just to contradict myself, although this is no one-liner:

def scan(str)
  re = Regexp.new(/(\d+)h.*(\d+)m|(\d+)m.*(\d+)h/)
  if m = re.match(str)
    return [m[1], m[2]] if m[1]
    return [m[4], m[3]]
  end
end

p scan("2h 3m")
p scan("3m 2h")

  Cheers !

  Vince

And the one-liner :

$ irb
>> "3m 2h".scan(/(\d+)h.*(\d+)m|(\d+)m.*(\d+)h/).flatten.values_at(0,1,3,2).compact
=> ["2", "3"]
>> "2h 3m".scan(/(\d+)h.*(\d+)m|(\d+)m.*(\d+)h/).flatten.values_at(0,1,3,2).compact
=> ["2", "3"]

It's possible to add .map { |i| i.to_i } at the end of this one-liner if the result array must contain integers instead of strings.


--
Bruno Michel
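
The `values_at(0,1,3,2)` step in the one-liner above is easier to see with the intermediate values written out. Groups 1 and 2 are filled when the hours come first, groups 3 and 4 when the minutes come first, so swapping the last two and dropping the nils always leaves hours before minutes. The variable names `flat` and `swapped` are only for illustration.

```ruby
re = /(\d+)h.*(\d+)m|(\d+)m.*(\d+)h/

flat    = "3m 2h".scan(re).flatten     # => [nil, nil, "3", "2"]
swapped = flat.values_at(0, 1, 3, 2)   # => [nil, nil, "2", "3"]
p swapped.compact.map { |i| i.to_i }   # => [2, 3]

# With the hours first, groups 1 and 2 are used instead.
p "2h 3m".scan(re).flatten             # => ["2", "3", nil, nil]
```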

Python regexps have named capturing groups. They are extremely helpful if you need to construct complicated patterns, because the index of each capturing group can easily change when you add and remove things in the regexp.

Tom


On Sep 30, 2006, at 2:55 AM, Relm wrote:

As far as I know, you can only do that in C#, which has named capturing
groups. In all the other languages I know, the capturing groups are
numbered when they are found... That rules it out.

Tom Armitage wrote:

But, of course, that *won't* capture "3m 2h", like you described...

True...

So:

r = Regexp.new(/(\d+)h?m?.*(\d+)m?h?/)

'Course, then you'll have [3, 2] for the edge case rather than [2,
3]...but to get the full functionality that the OP described (including
the case where just "2" is given), you'd need fancier logic than just
regexp anyhow.

Regards,
Jordan
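
A quick check of the relaxed pattern above against the three cases discussed, for reference:

```ruby
r = /(\d+)h?m?.*(\d+)m?h?/

p r.match("2h 3m").captures  # => ["2", "3"]
p r.match("3m 2h").captures  # => ["3", "2"]  (minutes first, as noted)
p r.match("2")               # => nil         (needs two numbers to match)
```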

And the one-liner :

$ irb
>> "3m 2h".scan(/(\d+)h.*(\d+)m|(\d+)m.*(\d+)h/).flatten.values_at(0,1,3,2).compact

  That's a nice one !

  Vince