Regular expression question

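Your problem is that '*' is greedy, so the leading '.*' matches as many characters as it can, including the '1' of '10', and leaves only the trailing '0' for the first group. Try the non-greedy form: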

  /.*?(\d+).*?(\d+).*?(\d+)/

Usually I'd tend to use something like:

  /[^\d]*(\d+)[^\d]*(\d+)[^\d]*(\d+)/

instead, to make it explicit that I want non-digits, followed by digits, and so on.
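To see the difference concretely, here is a small sketch that runs all three patterns against the line from the question. The helper name `three_numbers` is made up for illustration; it is not part of Ruby or of the original posts.

```ruby
# Hypothetical helper: the captured groups as strings, or nil when the
# pattern does not match at all.
def three_numbers(pattern, line)
  m = pattern.match(line)
  m && m.captures
end

line = " rows = 10 cols = 1 occupied cells = 0"

# Greedy: the first .* backtracks only far enough to leave one digit
# for each group, so the first group gets just the "0" of "10".
p three_numbers(/.*(\d+).*(\d+).*(\d+)/, line)             # => ["0", "1", "0"]

# Non-greedy: .*? stops at the first digit it can.
p three_numbers(/.*?(\d+).*?(\d+).*?(\d+)/, line)          # => ["10", "1", "0"]

# Negated class: the filler can never swallow a digit.
p three_numbers(/[^\d]*(\d+)[^\d]*(\d+)[^\d]*(\d+)/, line) # => ["10", "1", "0"]
```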

Hope that helps,
Ross

On Wed, 21 Dec 2005 18:03:13 -0000, DeZo <nobody@nowhere.com> wrote:

Why does the following code:

line = " rows = 10 cols = 1 occupied cells = 0"
line =~ /.*(\d+).*(\d+).*(\d+)/
print(" scanned rows = ",$1," cols = ",$2," occ = ",$3,"\n")

print this when it runs:

  scanned rows = 0 cols = 1 occ = 0

(notice rows is zero!)

What have I done wrong?

--
Ross Bamford - rosco@roscopeco.remove.co.uk

Some other solutions with individual pros and cons:

line = " rows = 10 cols = 1 occupied cells = 0"

=> " rows = 10 cols = 1 occupied cells = 0"

line.scan(/\d+/)

=> ["10", "1", "0"]

line.scan(/\d+/).map {|s| s.to_i}

=> [10, 1, 0]

line.scan(/\w+\s*=\s*(\d+)/)

=> [["10"], ["1"], ["0"]]

line.scan(/\w+\s*=\s*(\d+)/).map {|m| m[0].to_i}

=> [10, 1, 0]

And explicitly matching the pattern:

/rows\s*=\s*(\d+)\s*cols\s*=\s*(\d+)\s*occupied cells\s*=\s*(\d+)/ =~ line and [$1, $2, $3]

=> ["10", "1", "0"]

/rows\s*=\s*(\d+)\s*cols\s*=\s*(\d+)\s*occupied cells\s*=\s*(\d+)/ =~ line and [$1.to_i, $2.to_i, $3.to_i]

=> [10, 1, 0]
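One of the trade-offs between these solutions can be sketched as follows: the explicit pattern rejects a line that does not have the expected shape, while a bare `scan(/\d+/)` returns whatever numbers happen to be present. The helper name `parse_counts`, the constant `COUNTS_PATTERN` and the malformed sample line are assumptions for illustration only.

```ruby
# The explicit pattern from above, kept in a (hypothetical) constant.
COUNTS_PATTERN = /rows\s*=\s*(\d+)\s*cols\s*=\s*(\d+)\s*occupied cells\s*=\s*(\d+)/

# Hypothetical helper: [rows, cols, occupied] as integers, or nil when
# the line does not follow the expected layout.
def parse_counts(line)
  m = COUNTS_PATTERN.match(line)
  m && m.captures.map { |s| s.to_i }
end

good = " rows = 10 cols = 1 occupied cells = 0"
bad  = " rows = 10 version 2 cols = 1"   # made-up malformed input

p parse_counts(good)  # => [10, 1, 0]
p parse_counts(bad)   # => nil
p bad.scan(/\d+/)     # => ["10", "2", "1"]
```

Which behaviour is preferable probably depends on how much the input format can be trusted.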

Kind regards

    robert

Ahh, much better. Another KISS reminder gets its own page (again) in my notebook.

Thanks :)

On Thu, 22 Dec 2005 09:42:00 -0000, Robert Klemme <bob.news@gmx.net> wrote:

--
Ross Bamford - rosco@roscopeco.remove.co.uk

I've been using this idiom recently.

line = " rows = 10 cols = 1 occupied cells = 0"

=> " rows = 10 cols = 1 occupied cells = 0"

if line[/.*?(\d+).*?(\d+).*?(\d+)/]
  rows, cols, cells = $1.to_i, $2.to_i, $3.to_i
end

=> [10, 1, 0]

rows

=> 10

cols

=> 1

cells

=> 0
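A possible variation on this idiom, sketched here under the assumption that the fields are labelled as in the sample line: `String#[]` also accepts a capture index as a second argument, so a single field can be pulled out by its label without going through `$1`. The helper name `field` is hypothetical.

```ruby
# Hypothetical helper: the digits that follow "<label> =" as a String,
# or nil when the label is not present in the line.
def field(line, label)
  line[/#{Regexp.escape(label)}\s*=\s*(\d+)/, 1]
end

line = " rows = 10 cols = 1 occupied cells = 0"

p field(line, "rows")            # => "10"
p field(line, "occupied cells")  # => "0"
p field(line, "depth")           # => nil
```

Note that `nil.to_i` is 0, so it may be worth checking for nil before converting, otherwise a missing field is indistinguishable from a genuine zero.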