Simple regex question

Hello.
I need to parse through thousands of TIFF files and do some re-naming.
These files have underscores in them followed by a sequential number. I
need to grab just the "root" of the filename, without the underscore or
the numbers.
  Dir.chdir("L:/infocontiffs/ehs-g7917741")
  files = Dir.glob("*.tiff")
  file = files[0]
  puts file
  file = file.gsub(/^(.*)_[0-9]+\.tiff/, "#{$1}")
  puts file
What I get with this is:
        ehs-g7917741_01.tiff
Why doesn't it give me my root filename?
Thanks,
Peter

···

--
Posted via http://www.ruby-forum.com/.


Is this what you want?

while fname = DATA.gets
  m = fname.match(/(.*?)_\d+\.tiff/)
  if m
    puts "Match: '#{m[1]}'"
  else
    puts "No match: #{fname}"
  end
end

__END__
ehs-g7917741_01.tiff
asadsasd_12345.tiff
ljhkjhkh_1_2_3.tiff
xxxx__1.tiff
xxxx_.tiff
xxxx.tiff
xxxx
_.tiff
_01.tiff


Peter Bailey wrote:

  file = file.gsub(/^(.*)_[0-9]+\.tiff/, "#{$1}")

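The argument "#{$1}" is interpolated once, before gsub even executes, so
it holds whatever $1 was before the call (nil here, which becomes an
empty string) rather than the capture from this match. You probably want
the block form, which is evaluated after the match: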

  file = file.sub(/^(.*)_\d+\.tiff/) { $1 }
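To see the difference in isolation, here is a minimal sketch. The method and constant names are mine, not from the thread. It compares the interpolated string with the block form and with a single-quoted '\1' backreference, which sub and gsub also expand after matching:

```ruby
# Hypothetical names, for illustration only.
ROOT_PATTERN = /^(.*)_\d+\.tiff/

# "#{$1}" is built before sub runs. No match has happened yet inside this
# method, so $1 is nil and the replacement string is empty.
def root_with_interpolation(name)
  name.sub(ROOT_PATTERN, "#{$1}")
end

# The block runs after the match, so $1 refers to this match's capture.
def root_with_block(name)
  name.sub(ROOT_PATTERN) { $1 }
end

# In a single-quoted string, \1 is passed through to sub, which replaces it
# with the first capture group.
def root_with_backreference(name)
  name.sub(ROOT_PATTERN, '\1')
end

name = "ehs-g7917741_01.tiff"
p root_with_interpolation(name)   # ""
p root_with_block(name)           # "ehs-g7917741"
p root_with_backreference(name)   # "ehs-g7917741"
```

The empty string from the first variant would explain why only the original filename from the first puts was visible in the output.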


Hi --


Here's another good use of the string[//] technique:

file = "ehs-g7917741_01.tiff"

=> "ehs-g7917741_01.tiff"

file[/[^_]+/] # match non-underscore characters

=> "ehs-g7917741"
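One caveat worth sketching: /[^_]+/ assumes the root itself contains no underscore. The helper names and the second sample filename below are assumptions of mine, used only to show how the plain form and the two-argument capture form of String#[] behave differently in that case:

```ruby
# Hypothetical helpers, for illustration only.

# Plain form: returns the first run of non-underscore characters.
def leading_part(name)
  name[/[^_]+/]
end

# Capture form: the second argument selects group 1; returns nil when the
# name does not end in _<digits>.tiff.
def greedy_root(name)
  name[/\A(.+)_\d+\.tiff\z/, 1]
end

p leading_part("ehs-g7917741_01.tiff")  # "ehs-g7917741"
p greedy_root("ehs-g7917741_01.tiff")   # "ehs-g7917741"

# A made-up name whose root contains an underscore:
p leading_part("scan_batch_01.tiff")    # "scan"
p greedy_root("scan_batch_01.tiff")     # "scan_batch"

# A name that does not follow the pattern at all:
p leading_part("notes.txt")             # "notes.txt"
p greedy_root("notes.txt")              # nil
```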

David

···

On Fri, 26 Jun 2009, Peter Bailey wrote:

--
David A. Black / Ruby Power and Light, LLC
Ruby/Rails consulting & training: http://www.rubypal.com
Now available: The Well-Grounded Rubyist (http://manning.com/black2)
"Ruby 1.9: What You Need To Know" Envycasts with David A. Black
http://www.envycasts.com

Tim Hunter wrote:

Is this what you want?

while fname = DATA.gets
  m = fname.match(/(.*?)_\d+\.tiff/)
  if m
    puts "Match: '#{m[1]}'"
  else
    puts "No match: #{fname}"
  end
end

__END__
ehs-g7917741_01.tiff
asadsasd_12345.tiff
ljhkjhkh_1_2_3.tiff
xxxx__1.tiff
xxxx_.tiff
xxxx.tiff
xxxx
_.tiff
_01.tiff

Well, you gave me a good idea, using match. Here's what I did, and, it
worked. Thank you very much, Tim.

  Dir.chdir("L:/infocontiffs/ehs-g7917741")
  files = Dir.glob("*.tiff")
  file = files[0]
  puts file
  match = file.match(/^(.*)_[0-9]+\.tiff/)  # MatchData, or nil if no match
  puts match[1] if match
gives me:
        ehs-g7917741_01.tiff
        ehs-g7917741

        Program exited with code 0


Beautiful. Thanks.


Combining all the good suggestions, this is probably what I'd do:

files = Dir.glob("L:/infocontiffs/ehs-g7917741/*.tiff")
files.each do |f|
  base = File.basename f
  root = base[/^([^_]+)_\d+\.tiff$/, 1]

  if root
    # rename or whatever
  else
    $stderr.puts "Dunno what to do with #{f}"
  end
end

I left the underscore and digits in the match, and anchored it at both
ends, so that the complete name has to fit the required pattern. That
way we detect other files that might accidentally have been placed in
that directory.
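As a rough check of what that anchored pattern accepts, here is a sketch that runs it over the sample names posted earlier in the thread. The constant and method names are hypothetical, and no files are touched:

```ruby
# Hypothetical names; the pattern itself is the one used in the loop above.
STRICT_TIFF = /^([^_]+)_\d+\.tiff$/

# Splits names into those that fit the pattern and those that do not.
def partition_names(names)
  names.partition { |n| n[STRICT_TIFF, 1] }
end

samples = %w[
  ehs-g7917741_01.tiff asadsasd_12345.tiff ljhkjhkh_1_2_3.tiff
  xxxx__1.tiff xxxx_.tiff xxxx.tiff xxxx _.tiff _01.tiff
]

accepted, rejected = partition_names(samples)
p accepted  # ["ehs-g7917741_01.tiff", "asadsasd_12345.tiff"]
p rejected.size  # 7
```

Only names made of one underscore-free root, one underscore, digits, and the .tiff extension pass; everything else ends up in the branch that writes to $stderr.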

Kind regards

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/