Go through directories recursively

Hello,

I'm quite new to Ruby.
I'd like to walk through a directory tree and edit all jsp files in there.

I do it this way:

def editDirectory(path)
     Dir.entries(path).each { |filename|
       next if filename =~ /^\.+$/
       currFile = path + "/" + filename
       if(File.stat(currFile).directory?)
         editDirectory(currFile)
       elsif(filename =~ /(.+)\.jsp$/)
         # ... do something with the file
       end
     }
end

I'd like to know if there is a more professional or more "ruby-like" way to do this...

Thanx for every hint,
Jens


require 'find'

def edit_directory(dir)
  Find::find(dir) do |f|
    if /\.jsp$/i =~ f and File.file? f
      # do something with f
    end
  end
end

Kind regards

    robert

Hi --

On Thu, 12 May 2005, Jens Riedel wrote:

I'd like to know if there is a more professional or more "ruby-like" way to do this...

You can use the 'find' module. Here's a parameterized version, which
takes as its arguments a path and a file extension, and yields back
the filenames it finds one by one:

   require 'find'

   def find_by_extension(path, ext)
     Find.find(path) do |f|
       next unless FileTest.file?(f)
       next unless /#{Regexp.escape(ext)}$/.match(f)
       yield f
     end
   end

   # Example of usage:

   find_by_extension("/home/dblack", ".rb") do |f|
     # do stuff with f
   end

David

--
David A. Black
dblack@wobblini.net

Hi Jens,

One possible solution is to use the Find library:

require 'find'

Find.find(path) { |file|
  next unless File.basename(file).include?(".jsp")
  ## do what you need on jsp files
}

where path is the directory in which the search starts.

I don't know if it is more Ruby-like, but it works quite well for me.

See you, Riccardo :)


Thanx for all of your hints, I chose the Find.find method.

Regards,
Jens

Robert Klemme wrote:


PS: You can also do

Dir["**/*.jsp"].each do |f|
  if File.file? f
    # do something with f
  end
end
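
The glob above is resolved relative to the current working directory. Here is a minimal sketch, assuming you want to start from an arbitrary directory instead; `jsp_files_under` is a hypothetical helper name, not something from the standard library. Note that `*` does not match names starting with a dot by default.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper: returns all regular files ending in .jsp below +dir+.
def jsp_files_under(dir)
  Dir.glob(File.join(dir, "**", "*.jsp")).select { |f| File.file?(f) }
end

# Small demonstration on a throwaway tree.
Dir.mktmpdir do |root|
  FileUtils.mkdir_p(File.join(root, "a", "b"))
  FileUtils.touch(File.join(root, "top.jsp"))
  FileUtils.touch(File.join(root, "a", "b", "deep.jsp"))
  FileUtils.touch(File.join(root, "a", "notes.txt"))
  jsp_files_under(root).each { |f| puts f }
end
```

The `File.file?` filter is kept because a directory could also be named `something.jsp`.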

David A. Black wrote:


You can use the 'find' module. Here's a parameterized version, which
takes as its arguments a path and a file extension, and yields back
the filenames it finds one by one:

   require 'find'

   def find_by_extension(path, ext)
     Find.find(path) do |f|
       next unless FileTest.file?(f)
       next unless /#{Regexp.escape(ext)}$/.match(f)
       yield f
     end
   end

   # Example of usage:

   find_by_extension("/home/dblack", ".rb") do |f|
     # do stuff with f
   end

David

Even more generic:

module Find
  def self.find_cond(dir, cond)
    find(dir){|f| yield f if cond === f}
  end
end

Find::find_cond ".", /\.jsp$/ do |f|
  puts f
end

# and including the file type test
proper_test = Object.new
def proper_test.===(f)
  /\.jsp$/ =~ f and File.file?(f)
end

Find::find_cond ".", proper_test do |f|
  puts f
end

Hm, this might be worthwhile to add to find.rb...

Kind regards

    robert
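
Since the condition only has to respond to `===`, a lambda works too in Ruby 1.9 and later, because `Proc#===` calls the proc. The following is only a sketch of that idea: `find_matching` and `JSP_FILE` are hypothetical names, and the method is kept standalone rather than reopening the Find module.

```ruby
require 'find'
require 'tmpdir'
require 'fileutils'

# Hypothetical standalone variant of find_cond that does not reopen Find.
# +cond+ is anything that responds to ===: a Regexp, a Proc, a custom object.
def find_matching(dir, cond)
  Find.find(dir) { |f| yield f if cond === f }
end

# Proc#=== calls the proc, so a lambda can carry the combined test.
JSP_FILE = ->(f) { f.end_with?(".jsp") && File.file?(f) }

Dir.mktmpdir do |root|
  FileUtils.touch(File.join(root, "page.jsp"))
  FileUtils.mkdir(File.join(root, "folder.jsp"))  # a directory, skipped by JSP_FILE
  find_matching(root, JSP_FILE) { |f| puts f }
end
```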


one thing to consider is that neither of these approaches follows links:

   harp:~/tmp > rm -rf *
   harp:~/tmp > mkdir a
   harp:~/tmp > touch a/b
   harp:~/tmp > ln -s /tmp/ c
   harp:~/tmp > ruby -r find -e 'Find::find("."){|f| p f}'
   "."
   "./c"
   "./a"
   "./a/b"

using Dir[] doesn't either:

   harp:~/tmp > ruby -r find -e 'p Dir["**/*"]'
   ["a", "a/b", "c"]

use find2 from the RAA to follow links:

   harp:~/tmp > ruby -r find2 -e 'Find2::find(:follow=>true){|f| p f}' | head
   "."
   "./a"
   "./a/b"
   "./c"
   "./c/lost+found"
   "./c/.font-unix"
   "./c/.font-unix/fs7100"
   "./c/.306.80a5"
   "./c/.ICE-unix"
   "./c/.ICE-unix/dcop22125-1115045753"

i've had both following and not-following links cause frustrating bugs - so
it's good to be aware of which/when you need the behaviour.

cheers.

-a
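
A small sketch of the behaviour described above, assuming a POSIX system where `File.symlink` is available. `list_with_links` is a hypothetical helper; it shows that Find.find reports a symbolic link as an entry but does not descend into the directory it points to.

```ruby
require 'find'
require 'tmpdir'
require 'fileutils'

# Lists every path below +dir+ and flags symbolic links, which Find.find
# reports but does not descend into.
def list_with_links(dir)
  entries = []
  Find.find(dir) do |f|
    entries << (File.symlink?(f) ? "#{f} -> #{File.readlink(f)}" : f)
  end
  entries
end

Dir.mktmpdir do |root|
  target = File.join(root, "target")
  FileUtils.mkdir_p(target)
  FileUtils.touch(File.join(target, "inside.txt"))

  tree = File.join(root, "tree")
  FileUtils.mkdir_p(tree)
  File.symlink(target, File.join(tree, "link"))

  # inside.txt is not listed, because the link is not followed.
  puts list_with_links(tree)
end
```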


--

email :: ara [dot] t [dot] howard [at] noaa [dot] gov
phone :: 303.497.6469
renunciation is not getting rid of the things of this world, but accepting
that they pass away. --aitken roshi

===============================================================================

Hi,

At Thu, 12 May 2005 23:25:28 +0900,
Ara.T.Howard@noaa.gov wrote in [ruby-talk:142376]:

one thing to consider is that neither of these approaches follows links:

Following links can cause infinite recursion.

--
Nobu Nakada

For this kind of thing I find the enumerator module very useful:

require 'find'
require 'enumerator'

Find.to_enum(:find, ".").grep(/\.jsp$/)
# => array of paths that end with .jsp

# or (to exclude directories):
Find.to_enum(:find, ".").select do |f|
  /\.jsp$/ =~ f && File.file?(f)
end

Cheers,
KB
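
For readers on a current Ruby: a sketch assuming a version where Find.find called without a block returns an Enumerator, so the explicit to_enum step is not needed (on older versions the to_enum form remains the fallback). `jsp_paths` is a hypothetical helper name.

```ruby
require 'find'
require 'tmpdir'
require 'fileutils'

# Without a block, Find.find returns an Enumerator here, so Enumerable
# methods such as select chain directly onto it.
def jsp_paths(dir)
  Find.find(dir).select { |f| f =~ /\.jsp\z/ && File.file?(f) }
end

Dir.mktmpdir do |root|
  FileUtils.mkdir_p(File.join(root, "sub"))
  FileUtils.touch(File.join(root, "sub", "index.jsp"))
  FileUtils.touch(File.join(root, "readme.txt"))
  puts jsp_paths(root)
end
```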



Question: what if the operation you want to perform is to delete
directories or files? From my understanding of Find.find, it yields
directories breadth-first rather than depth-first. So if you want to
delete a directory yielded to your block, do you have to call
Find.prune on it to avoid messing up the traversal?

Likewise for Dir["**/*.jsp"]. Does this form first create an internal
array of file names before yielding them, or does it yield each file as
it inspects each directory? If the latter, does the traversal get
messed up if you delete a file out from under it?

I didn't know about either of the two above options, so the first time
I needed to traverse directories recursively (to delete files, no less)
I rolled my own solution, shown below. Until I know that either of the
above options is safe for this, I will continue to use it.

# Yields the name of each directory within the specified directory
# recursively. Order is depth-first, to accommodate deleting files in the
# yielded directory from within the block. The passed-in directory is
# yielded last.
def each_dir(base, &block)
    Dir.entries(base).each do |n|
        next if n == '.' || n == '..' # Don't traverse . or ..
        n = File.join(base, n) unless base == '.'
        # The recursive call yields n itself as its last step.
        each_dir(n, &block) if File.directory?(n)
    end
    yield base
end
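
For comparison, here is a sketch of getting the same children-before-parent order out of Find.find. Find.find yields a directory before the entries inside it, so collecting the paths first and walking the list in reverse visits every directory after its contents. `remove_empty_dirs` is a hypothetical helper, used only to illustrate the ordering.

```ruby
require 'find'
require 'tmpdir'
require 'fileutils'

# Hypothetical helper: removes every empty directory below +base+ (but not
# +base+ itself). Paths are collected first, then visited in reverse, so a
# directory is only examined after everything inside it.
def remove_empty_dirs(base)
  dirs = []
  Find.find(base) { |f| dirs << f if File.directory?(f) && !File.symlink?(f) }

  removed = []
  dirs.reverse_each do |d|
    next if d == base
    next unless (Dir.entries(d) - %w[. ..]).empty?
    Dir.rmdir(d)
    removed << d
  end
  removed
end

Dir.mktmpdir do |root|
  FileUtils.mkdir_p(File.join(root, "a", "b"))      # empty chain, gets removed
  FileUtils.mkdir_p(File.join(root, "c"))
  FileUtils.touch(File.join(root, "c", "file.txt")) # keeps c alive
  puts remove_empty_dirs(root)
end
```

Because the traversal has finished before anything is deleted, the deletions cannot disturb it.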

find2 watches for this:

   harp:~/tmp > pwd
   /home/ahoward/tmp

   harp:~/tmp > ls

   harp:~/tmp > ln -s /tmp/ ./link_to_tmp

   harp:~/tmp > file !$
   ./link_to_tmp: symbolic link to /tmp/

   harp:~/tmp > ln -s `pwd`/link_to_tmp /tmp/link_to_link_to_tmp

   harp:~/tmp > file !$
   /tmp/link_to_link_to_tmp: symbolic link to /home/ahoward/tmp/link_to_tmp

   harp:~/tmp > ruby -r alib -e 'ALib::Util::find(:follow=>true){|f| p f}' | tail
   "./link_to_tmp/pico.964304"
   "./link_to_tmp/Scha03aTraits.pdf"
   "./link_to_tmp/jpsock.142_04.25373"
   "./link_to_tmp/.pico.964304.swp"
   "./link_to_tmp/foo.txt"
   "./link_to_tmp/Scha03aTraits.ps"
   "./link_to_tmp/Duca04wtoplastraitnotfinal.pdf"
   "./link_to_tmp/TR_CSE_02-012.pdf"
   "./link_to_tmp/link_to_link_to_tmp"
   "./link_to_tmp/druby23486.0"

   harp:~/tmp > ruby -r alib -e 'ALib::Util::find(:follow=>true){|f| p f}' | grep link_to_link
   "./link_to_tmp/link_to_link_to_tmp"

from the find2 code i've incorporated into my personal lib (Alib):


   #
   # If `entry_path' is a directory, find recursively.
   #
   if stat_result.directory? \
     && (!@xdev || @xdev_device == stat_result.dev) \
     && (!@follow || !visited?(stat_result))
     @dirname_stats.push(stat_result)
     find_directory(entry_path, block)
     @dirname_stats.pop
   end

kind regards.

-a

Kristof Bastiaensen wrote:


For this kind of thing I find the enumerator module very useful:

require 'find'
require 'enumerator'

Find.to_enum(:find, ".").grep(/\.jsp$/)
# => array of paths that end with .jsp

# or (to exclude directories):
Find.to_enum(:find, ".").select do |f|
  /\.jsp$/ =~ f && File.file?(f)
end

Cheers,
KB

Ah, even better! /me makes mental note "should start using enumerator
more".

Thanks!

    robert

Karl von Laudermann wrote:

Question: what if the operation you want to perform is to delete
directories or files? From my understanding of Find.find, it yields
directories breadth-first rather than depth-first. So if you want to
delete a directory yielded to your block, do you have to call
Find.prune on it to avoid messing up the traversal?

There is a prune method that should do it, although I have never used it. See
http://www.ruby-doc.org/stdlib/libdoc/find/rdoc/classes/Find.html#M000002

Likewise for Dir["**/*.jsp"]. Does this form first create an internal
array of file names before yielding them, or does it yield each file
as it inspects each directory? If the latter, does the traversal get
messed up if you delete a file out from under it?

Dir[] always builds the complete array first, so this is safe.

Kind regards

    robert
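
A sketch of how the prune call could be used for the deletion case, under the assumption that you want to remove whole directories by name. `remove_dirs_named` is a hypothetical helper. Find.prune tells Find.find not to descend into the entry currently being yielded, which is what you want once that directory no longer exists.

```ruby
require 'find'
require 'tmpdir'
require 'fileutils'

# Hypothetical helper: deletes every directory called +name+ below +base+
# and returns the paths that were removed.
def remove_dirs_named(base, name)
  removed = []
  Find.find(base) do |f|
    next unless File.directory?(f) && File.basename(f) == name
    FileUtils.rm_rf(f)
    removed << f
    Find.prune  # f is gone; tell Find not to look inside it
  end
  removed
end

Dir.mktmpdir do |root|
  FileUtils.mkdir_p(File.join(root, "a", "cache"))
  FileUtils.touch(File.join(root, "a", "cache", "x.tmp"))
  FileUtils.touch(File.join(root, "a", "keep.txt"))
  FileUtils.mkdir_p(File.join(root, "cache"))
  puts remove_dirs_named(root, "cache")
end
```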

Hi,

At Fri, 13 May 2005 00:45:27 +0900,
Ara.T.Howard@noaa.gov wrote in [ruby-talk:142400]:

from the find2 code i've incorporated into my personal lib (Alib):

   #
   # If `entry_path' is a directory, find recursively.
   #
   if stat_result.directory? \
     && (!@xdev || @xdev_device == stat_result.dev) \
     && (!@follow || !visited?(stat_result))

This "visited?" method is the key. We know it is possible, but we
judged it too expensive to implement as a built-in.

--
Nobu Nakada
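
To illustrate what such a check might involve, here is a rough sketch, not the find2 implementation. It assumes a POSIX file system where a directory can be identified by its (device, inode) pair, and a Ruby version that has `Dir.children`. `find_following_links` is a hypothetical name.

```ruby
require 'set'
require 'tmpdir'
require 'fileutils'

# Hypothetical link-following walk. Each directory is identified by its
# (device, inode) pair; a directory already seen is yielded but not entered
# again, which breaks symlink cycles.
def find_following_links(path, visited = Set.new, &block)
  block.call(path)
  stat = begin
    File.stat(path)  # stat (unlike lstat) resolves symbolic links
  rescue SystemCallError
    nil              # e.g. a dangling link
  end
  return unless stat && stat.directory?
  return unless visited.add?([stat.dev, stat.ino])  # nil if seen before

  Dir.children(path).sort.each do |name|
    find_following_links(File.join(path, name), visited, &block)
  end
end

Dir.mktmpdir do |root|
  FileUtils.mkdir_p(File.join(root, "a"))
  FileUtils.touch(File.join(root, "a", "file.txt"))
  File.symlink(root, File.join(root, "a", "loop"))  # points back at an ancestor
  find_following_links(root) { |f| puts f }         # terminates despite the cycle
end
```

The extra stat call and the growing set for every directory are the kind of cost mentioned above.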

On Fri, 13 May 2005, Robert Klemme wrote:

Dir always creates the array first so this is safe.

Find2 does/can too.

-a


it doesn't look too bad in these dirs - which are quite big:

   gilligan:/ftp/cfd0-0/avg_dn > time ruby -r find -e 'a = [];Find::find(Dir::pwd){|f| a << f};p a.size'

   35288

   real 0m0.731s
   user 0m0.490s
   sys 0m0.240s

   gilligan:/ftp/cfd0-0/avg_dn > time ruby -r find2 -e 'a = [];Find2::find(Dir::pwd,:follow=>true){|f| a << f};p a.size'

   35286

   real 0m0.990s
   user 0m0.760s
   sys 0m0.200s

also note that following links is an option.

my problem with the builtin Find module and Dir is that our site has tons of
disks, so we have to organize them logically, e.g.

   ...
   /mnt/raid/0/1/2
   /mnt/raid/0/1/3
   ...
   /mnt/raid/0/2/0
   ...

etc. this is pretty ugly to deal with, so we also set up functional and
hierarchical structures via symlinks so we can remember what all the stuff is:

   /dmsp/rico/ftp -> /dmsp/ftp/rico
   /dmsp/rico/www -> /dmsp/www/rico

   /dmsp/ftp -> /mnt/raid/0/1/2/
   /dmsp/www -> /mnt/raid/0/1/3/

and ruby then becomes useless for doing find type operations in this
environment since any view of the fs stops dead at a link. in any case, i
certainly can (have) dealt with it by using Find2 - is there any reason this
module, or something similar, shouldn't be included in the core?

kind regards.

-a
