Extracting numbers from a string

I have filenames from various digital cameras: DSC_1234.jpg,
CRW1234.jpg, etc. What I really want is the numeric portion of that
filename. How would I extract just that portion?

I expect it to involve the regex /\d+/, but I'm unclear how to extract a
portion of a string matching a regex.

Thank you

--
Posted via http://www.ruby-forum.com/.

This may be the simplest (and arguably the most ruby-esque):
str = "DSC_1234.jpg"
num = str.scan(/\d+/)[0]

Other ways to do it:
num = str.match(/\d+/)[0]

OR
num = (/\d+/).match(str)[0]

OR
# with a block, scan returns str itself, so capture the first match inside the block
num = nil
str.scan(/\d+/) {|match| num ||= match}

OR
num = str =~ /(\d+)/ ? $1 : nil

That is,
num = if str =~ /(\d+)/
   $1
else
   nil
end

OR
if str =~ /\d+/
   num = $~[0]
end
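
The variants above differ mainly in what happens when the string contains no digits. A minimal sketch, assuming hypothetical helper names (first_number_scan and friends are not from this thread), that wraps three of them so each returns nil in that case:

```ruby
# Hypothetical helpers, named here only for illustration.

# scan returns an empty array when nothing matches, so [0] is nil.
def first_number_scan(str)
  str.scan(/\d+/)[0]
end

# match returns nil when nothing matches, so guard before indexing.
def first_number_match(str)
  m = str.match(/\d+/)
  m && m[0]
end

# =~ sets $1 only on success; return nil otherwise.
def first_number_capture(str)
  str =~ /(\d+)/ ? $1 : nil
end

p first_number_scan("DSC_1234.jpg")      # "1234"
p first_number_match("CRW1234.jpg")      # "1234"
p first_number_capture("no_digits.jpg")  # nil
```

Calling str.match(/\d+/)[0] directly, without the guard, raises NoMethodError when there is no match.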

Some proponents of Ruby have said that Perl's "There is more than one way to do it" is a curse, but the same is true of Ruby. However, it seems to me that most people learn reasonable idioms and common sense prevails.

Dan

a = "DSC_1234.jpg"
b = a.gsub(/[^[:digit:]]/, '')
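
Note that gsub removes every non-digit, so digits from separate parts of the name end up joined together. A small sketch of the difference, using made-up helper names and one filename that is not from the original question:

```ruby
# Hypothetical helpers for illustration.

# gsub deletes all non-digits, so separate digit runs are concatenated.
def all_digits(name)
  name.gsub(/[^[:digit:]]/, '')
end

# String#[] with a regex returns only the first run of digits.
def first_digits(name)
  name[/\d+/]
end

p all_digits("DSC_1234.jpg")    # "1234"
p all_digits("100_5142.jpg")    # "1005142"
p first_digits("100_5142.jpg")  # "100"
```

Which result is wanted depends on the naming scheme, so this is only meant to show where the two approaches diverge.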

If you just want to extract one number from a string, you could write
something like this:

Given a = "DSC_1234.jpg",

a[/\d+/] will give you the first run of consecutive digits, so
"1234".

If you want to be more precise, you could use parentheses to extract
the exact portion you want, like:

a[/DSC_(\d+)\.jpg/,1] (equivalent to a.match(/DSC_(\d+)\.jpg/)[1] when the pattern matches)

or even, anchored to the whole string: a[/\ADSC_(\d+)\.jpg\Z/,1]
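
The capture-group form can be loosened to cover more than one camera prefix. A rough sketch, assuming a hypothetical pattern (CAMERA_FILE) and helper (camera_number) that are not from this thread, and assuming the names are letters or underscores followed by digits and a .jpg extension:

```ruby
# Hypothetical pattern: optional letters/underscores, then digits, then .jpg.
# \A and \z anchor the match to the whole string; /i ignores case.
CAMERA_FILE = /\A[a-z_]*(\d+)\.jpg\z/i

# Returns the captured digits, or nil when the name does not fit the pattern.
def camera_number(name)
  name[CAMERA_FILE, 1]
end

p camera_number("DSC_1234.jpg")      # "1234"
p camera_number("CRW1234.JPG")       # "1234"
p camera_number("DSC_1234.jpg.bak")  # nil
```

Unlike a.match(...)[1], the a[regex, 1] form returns nil rather than raising when nothing matches.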

Some solutions have been posted already, but here's mine:

  irb(main):001:0> s="DSC_1234.jpg"
  => "DSC_1234.jpg"
  irb(main):002:0> s.sub(/\D+(\d+).*/,'\1')
  => "1234"

Basically, the regexp looks for:

  - one or more non-digits
  - one or more digits => because this is in parentheses you can refer to
    it with \1 later on
  - something more

The digits (safely stored in \1) are all you want to keep. This assumes you
are only interested in the first sequence of digits and that it is preceded by
at least one non-digit.
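
A quick sketch of how this substitution behaves on a few names, including two where the leading \D+ changes the result. The helper name via_sub and the constant are made up for illustration:

```ruby
# The pattern from the post above: non-digits, captured digits, anything else.
SUB_PATTERN = /\D+(\d+).*/

# Hypothetical helper: replace the matched part with the captured digits.
def via_sub(name)
  name.sub(SUB_PATTERN, '\1')
end

p via_sub("DSC_1234.jpg")  # "1234"
p via_sub("1234.jpg")      # "1234.jpg" (no non-digit before digits, nothing replaced)
p via_sub("100_5142.jpg")  # "1005142" (match starts at "_", so "100" is kept)
```

When names may start with digits, s[/\d+/] avoids both surprises because it does not depend on what precedes the number.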

Cheers

  Bas

--
Bas van Gils <bas@van-gils.org>, http://www.van-gils.org
[[[ Thank you for not distributing my E-mail address ]]]

Quod est inferius est sicut quod est superius, et quod est superius est sicut
quod est inferius, ad perpetranda miracula rei unius.

Last November (2006), there was a series of postings to the Columbus Ruby Brigade list beginning with:
http://groups.google.com/group/columbusrb/browse_frm/thread/9c2e682f9926bad0

This was the pattern that I used when responding to Bill's code because many of *my* pictures had names like "100_5142.jpg", "100_5143.jpg", etc.

NUMBERED_FILE_PATTERN = %r{^(.*\D)?(\d+)(.+)$}

It became a constant since I used it in three places.
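
How the constant was used in the original code is not shown here. As a rough sketch of what the three groups capture, assuming a hypothetical helper named split_numbered:

```ruby
NUMBERED_FILE_PATTERN = %r{^(.*\D)?(\d+)(.+)$}

# Hypothetical helper: split a name into [prefix, number, suffix], or nil
# when the name contains no digits followed by at least one more character.
def split_numbered(name)
  m = NUMBERED_FILE_PATTERN.match(name)
  m && m.captures
end

p split_numbered("100_5142.jpg")  # ["100_", "5142", ".jpg"]
p split_numbered("CRW1234.jpg")   # ["CRW", "1234", ".jpg"]
p split_numbered("1234.jpg")      # [nil, "1234", ".jpg"]
```

Because the first group is greedy, the pattern picks the last run of digits that still leaves a suffix, which is why "100_5142.jpg" yields "5142" rather than "100".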

Rob Biedenharn http://agileconsultingllc.com
Rob@AgileConsultingLLC.com

Or even simpler

irb(main):001:0> "DSC_1234.jpg"[/\d+/]
=> "1234"
irb(main):002:0> Integer("DSC_1234.jpg"[/\d+/])
=> 1234
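
One thing to watch with Integer(): a string with a leading zero, such as "0123", is parsed as octal unless a base is given, and Integer(nil) raises when there are no digits at all. A minimal sketch, assuming a hypothetical helper named frame_number:

```ruby
# Hypothetical helper: first run of digits as an Integer, or nil if none.
def frame_number(name)
  digits = name[/\d+/]
  digits && Integer(digits, 10)  # base 10 so "0123" is not read as octal
end

p frame_number("DSC_1234.jpg")  # 1234
p frame_number("DSC_0123.jpg")  # 123
p frame_number("readme.txt")    # nil
p Integer("0123")               # 83, parsed as octal without the base
```

String#to_i would also give 123 for "0123", at the cost of silently returning 0 for strings that are not numbers.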

Kind regards

  robert

A big thanks to everybody and all the creative solutions!
