Find.find sorting: files before directories

I'm looking for a simple way to use Find.find to produce a list of
files/directories for NSIS. Ideally the list will sort files before
directories so as to minimize the number of times SetOutPath is used
(an NSIS function).

The script so far is as follows:
-----------snip-----------------------
dirs = ["jruby-1.0.3"]
excludes = []
for dir in dirs
  folder = ''
  Find.find(dir) do |path|
    if FileTest.directory?(path)
      if excludes.include?(File.basename(path))
        Find.prune # Don't look any further into this directory.
      else
        next
      end
    else
      if folder != File.dirname(path)
        folder = File.dirname(path)
        puts 'SetOutPath "' + folder + '"'
      end
      puts 'File "' + path + '"'
    end
  end
end

-----------------------end snip-------------------

Simple directory traversal, no problem there. The issue is that
Find.find doesn't allow any sort of ordering to be specified; namely
that directories are mixed in with files. The script produces the
following output:

---------------output---------------------------------
SetOutPath "jruby-1.0.3/docs"
File "jruby-1.0.3/docs/README.rails"
File "jruby-1.0.3/docs/README.coverage"
File "jruby-1.0.3/docs/Readline-HOWTO.txt"
SetOutPath "jruby-1.0.3/docs/rbyaml"
File "jruby-1.0.3/docs/rbyaml/README"
File "jruby-1.0.3/docs/rbyaml/LICENSE"
SetOutPath "jruby-1.0.3/docs"
File "jruby-1.0.3/docs/LICENSE.bouncycastle"
File "jruby-1.0.3/docs/LICENSE.ant"
File "jruby-1.0.3/docs/LICENCE.bsf"
SetOutPath "jruby-1.0.3/docs/jvyaml"
File "jruby-1.0.3/docs/jvyaml/README"
File "jruby-1.0.3/docs/jvyaml/LICENSE"
File "jruby-1.0.3/docs/jvyaml/CREDITS"
SetOutPath "jruby-1.0.3/docs"
File "jruby-1.0.3/docs/Glossary.txt"
File "jruby-1.0.3/docs/getting_involved.html"
-----------------------end output-------------------------

The desired output would sort sub-folders before files when traversing
a given directory, thus eliminating the duplicate entries for
SetOutPath "jruby-1.0.3/docs" as seen above.

------------desired output--------------------------------
SetOutPath "jruby-1.0.3/docs"
File "jruby-1.0.3/docs/README.rails"
File "jruby-1.0.3/docs/README.coverage"
File "jruby-1.0.3/docs/Readline-HOWTO.txt"
File "jruby-1.0.3/docs/LICENSE.bouncycastle"
File "jruby-1.0.3/docs/LICENSE.ant"
File "jruby-1.0.3/docs/LICENCE.bsf"
File "jruby-1.0.3/docs/Glossary.txt"
File "jruby-1.0.3/docs/getting_involved.html"
SetOutPath "jruby-1.0.3/docs/rbyaml"
File "jruby-1.0.3/docs/rbyaml/README"
File "jruby-1.0.3/docs/rbyaml/LICENSE"
SetOutPath "jruby-1.0.3/docs/jvyaml"
File "jruby-1.0.3/docs/jvyaml/README"
File "jruby-1.0.3/docs/jvyaml/LICENSE"
File "jruby-1.0.3/docs/jvyaml/CREDITS"
------------------------------end output--------------------------

Any ideas?

I forgot to include the require:
------------snip---------------------
require 'find' # oops!
dirs = ["jruby-1.0.3"]
excludes = []
for dir in dirs
  folder = ''
  Find.find(dir) do |path|
    if FileTest.directory?(path)
      if excludes.include?(File.basename(path))
        Find.prune # Don't look any further into this directory.
      else
        next
      end
    else
      if folder != File.dirname(path)
        folder = File.dirname(path)
        puts 'SetOutPath "' + folder + '"'
      end
      puts 'File "' + path + '"'
    end
  end
end
-----------------------end snip-------------------

IIRC Find.find does sort sub folders before files. To me your code seems pretty complex and I believe the major problem is that you check the same directory over and over again for exclusion because you use File.basename. Can't you just do this?

Find.find base do |path|
   if File.directory? path
     if excludes.include? path
       Find.prune
     else
       puts "SetOutPath #{path}"
     end
   else
     puts "File #{path}"
   end
end

Kind regards

  robert
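
One difference between the two snippets is worth spelling out: the original script matches excludes against File.basename(path), while the version above matches against the whole path that Find.find yields. A minimal sketch of the difference, assuming a hypothetical helper visible_files (not from the thread):

```ruby
require 'find'

# Hypothetical helper: lists the files Find.find visits, pruning directories
# either by bare name (:basename) or by the full path Find yields (:path).
def visible_files(base, excludes, match_on)
  files = []
  Find.find(base) do |path|
    if File.directory?(path)
      key = match_on == :basename ? File.basename(path) : path
      Find.prune if excludes.include?(key) # skip this directory and its children
    else
      files << path
    end
  end
  files.sort
end
```

With base 'jruby-1.0.3', matching on :basename with excludes = ['docs'] prunes every directory named docs at any depth, whereas matching on :path only prunes when excludes holds the exact string Find yields, such as 'jruby-1.0.3/docs'.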

···

On 20.03.2008 16:46, Adam Boyle wrote:

Adam Boyle wrote:

> The desired output would sort sub-folders before files when traversing

If output is the only requirement for sort order, why not just have two
arrays, one for files and one for directories? You can then .sort.uniq
the directory listing and the files separately and output them at the
end of the traversal.

Mac
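
A rough sketch of the two-array idea, in case it helps. The helper name nsis_lines_two_arrays and its arguments are made up for illustration; it collects directories and files separately during the traversal, then pairs them up afterwards so that each SetOutPath appears once, directly before its own files.

```ruby
require 'find'

# Hypothetical helper (not from the thread): gathers directories and files
# into two arrays during traversal, then builds the NSIS lines afterwards.
def nsis_lines_two_arrays(base, excludes = [])
  dirs  = []
  files = []
  Find.find(base) do |path|
    if File.directory?(path)
      # Excludes are matched on the directory's basename, as in the original script.
      Find.prune if excludes.include?(File.basename(path))
    else
      files << path
      dirs  << File.dirname(path)
    end
  end

  lines = []
  sorted_files = files.sort
  dirs.sort.uniq.each do |dir|
    lines << %(SetOutPath "#{dir}")
    # Only the files that live directly in this directory.
    sorted_files.each do |file|
      lines << %(File "#{file}") if File.dirname(file) == dir
    end
  end
  lines
end
```

Something like puts nsis_lines_two_arrays("jruby-1.0.3") would then print the list. The inner scan over all files for every directory is quadratic, which is probably acceptable for a tree of this size.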

I mixed up sub-folders and files. I meant to say that I want to sort
files before sub-folders (so that the folder that a file resides in
comes right before the file in the list).

I tried the posted code, but I couldn't get it to run. Some String
comparison problem.
---------output---------------
SetOutPath jruby-1.0.3
SetOutPath Djruby-1.0.3/..
C:/ruby/jruby-1.0.3/lib/ruby/1.8/find.rb:45:in `find': comparison of
String with String failed (ArgumentError)
        from :1:in `catch'
        from C:/ruby/jruby-1.0.3/lib/ruby/1.8/find.rb:38:in `find'
        from :1
--------------end output-------

> I mixed up sub-folders and files. I meant to say that I want to sort
> files before sub-folders (so that the folder that a file resides in
> comes right before the file in the list).

Why do you need that? I mean, if a folder is excluded then you want to exclude all files and subfolders, don't you?

> I tried the posted code, but I couldn't get it to run. Some String
> comparison problem.
> ---------output---------------
> SetOutPath jruby-1.0.3
> SetOutPath Djruby-1.0.3/..
> C:/ruby/jruby-1.0.3/lib/ruby/1.8/find.rb:45:in `find': comparison of
> String with String failed (ArgumentError)
>         from :1:in `catch'
>         from C:/ruby/jruby-1.0.3/lib/ruby/1.8/find.rb:38:in `find'
>         from :1
> --------------end output-------

I have no idea what's wrong there. Did you try with the Ruby interpreter (instead of JRuby)?

  robert

···

On 24.03.2008 19:27, Adam Boyle wrote:

> Why do you need that? I mean, if a folder is excluded then you want to
> exclude all files and subfolders, don't you?

Yes, an excluded folder would also exclude its children files and
folders.

I'm thinking that I haven't exactly made it clear what my goal is...

Using Find.find, I want to traverse through a directory structure and
make an NSIS-style list of files and their paths for use in an NSIS
installer script. A list of this sort would be best organized if a
directory's file children are listed before the directory children.

Example:
SetOutPath "jruby-1.0.3/docs"
File "jruby-1.0.3/docs/README.rails"
File "jruby-1.0.3/docs/README.coverage"
File "jruby-1.0.3/docs/Readline-HOWTO.txt"
File "jruby-1.0.3/docs/LICENSE.bouncycastle"
File "jruby-1.0.3/docs/LICENSE.ant"
File "jruby-1.0.3/docs/LICENCE.bsf"
File "jruby-1.0.3/docs/Glossary.txt"
File "jruby-1.0.3/docs/getting_involved.html"
SetOutPath "jruby-1.0.3/docs/rbyaml"
File "jruby-1.0.3/docs/rbyaml/README"
File "jruby-1.0.3/docs/rbyaml/LICENSE"
SetOutPath "jruby-1.0.3/docs/jvyaml"
File "jruby-1.0.3/docs/jvyaml/README"
File "jruby-1.0.3/docs/jvyaml/LICENSE"
File "jruby-1.0.3/docs/jvyaml/CREDITS"
...

The "SetOutPath" lines are the directories, the "File" lines are the
files.

The code you gave produces results like this (once I used Ruby
instead of JRuby :)...):
SetOutPath jruby-1.0.3
SetOutPath jruby-1.0.3/samples
File jruby-1.0.3/samples/xslt.rb
File jruby-1.0.3/samples/thread.rb
File jruby-1.0.3/samples/swing2.rb
File jruby-1.0.3/samples/scripting.rb
File jruby-1.0.3/samples/javascript.rb
File jruby-1.0.3/samples/java2.rb
File jruby-1.0.3/samples/error.rb
File jruby-1.0.3/samples/dom-applet.html
File jruby-1.0.3/samples/applet.html
File jruby-1.0.3/README
SetOutPath jruby-1.0.3/lib
SetOutPath jruby-1.0.3/lib/ruby
SetOutPath jruby-1.0.3/lib/ruby/site_ruby
SetOutPath jruby-1.0.3/lib/ruby/site_ruby/1.8
File jruby-1.0.3/lib/ruby/site_ruby/1.8/ubygems.rb
File jruby-1.0.3/lib/ruby/site_ruby/1.8/securerandom.rb
File jruby-1.0.3/lib/ruby/site_ruby/1.8/rubygems.rb
SetOutPath jruby-1.0.3/lib/ruby/site_ruby/1.8/rubygems
File jruby-1.0.3/lib/ruby/site_ruby/1.8/rubygems/version.rb
File jruby-1.0.3/lib/ruby/site_ruby/1.8/rubygems/validator.rb
File jruby-1.0.3/lib/ruby/site_ruby/1.8/rubygems/user_interaction.rb
File jruby-1.0.3/lib/ruby/site_ruby/1.8/rubygems/timer.rb
File jruby-1.0.3/lib/ruby/site_ruby/1.8/rubygems/specification.rb
File jruby-1.0.3/lib/ruby/site_ruby/1.8/rubygems/source_info_cache_entry.rb
...

The issue is that the SetOutPath call for any particular file must
come before it in the list, with no other SetOutPath in between (i.e.,
SetOutPath jruby-1.0.3 for the line File jruby-1.0.3/README). That
way the output path is set correctly when the file is extracted from
the installer executable.

The selected lines from the previous example would ideally be listed
this way:
SetOutPath jruby-1.0.3
File jruby-1.0.3/README
SetOutPath jruby-1.0.3/samples
File jruby-1.0.3/samples/xslt.rb
File jruby-1.0.3/samples/thread.rb
File jruby-1.0.3/samples/swing2.rb
File jruby-1.0.3/samples/scripting.rb
File jruby-1.0.3/samples/javascript.rb
File jruby-1.0.3/samples/java2.rb
File jruby-1.0.3/samples/error.rb
File jruby-1.0.3/samples/dom-applet.html
File jruby-1.0.3/samples/applet.html
SetOutPath jruby-1.0.3/lib/ruby/site_ruby/1.8
File jruby-1.0.3/lib/ruby/site_ruby/1.8/ubygems.rb
File jruby-1.0.3/lib/ruby/site_ruby/1.8/securerandom.rb
File jruby-1.0.3/lib/ruby/site_ruby/1.8/rubygems.rb
SetOutPath jruby-1.0.3/lib/ruby/site_ruby/1.8/rubygems
File jruby-1.0.3/lib/ruby/site_ruby/1.8/rubygems/version.rb
File jruby-1.0.3/lib/ruby/site_ruby/1.8/rubygems/validator.rb
File jruby-1.0.3/lib/ruby/site_ruby/1.8/rubygems/user_interaction.rb
File jruby-1.0.3/lib/ruby/site_ruby/1.8/rubygems/timer.rb
File jruby-1.0.3/lib/ruby/site_ruby/1.8/rubygems/specification.rb
File jruby-1.0.3/lib/ruby/site_ruby/1.8/rubygems/source_info_cache_entry.rb
...

It's beginning to seem to me that Find.find just doesn't have an easy
way of sorting the elements being traversed. Any additional help is
greatly appreciated.

Thanks for clarifying. You could do something like this:

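require 'find'

base = '.'
excludes = []   # full paths of directories to skip
# Map each directory to the files directly inside it.
dirs = Hash.new {|h,p| h[p] = []}

Find.find base do |path|
  if File.directory? path
    Find.prune if excludes.include? path
  else
    dirs[File.dirname(path)] << path
  end
end

# Emit each directory once, immediately followed by its own files.
dirs.sort.each do |dir,files|
  puts "SetOutPath #{dir}"
  files.each {|f| puts "File #{f}"}
end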

If you want to do printing while traversing then the code becomes more
complicated (either using Find or manual traversal via Dir).

Kind regards

robert
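
For printing during the traversal, one possibility is a manual walk via Dir that handles a directory's files before recursing into its sub-folders. This is only a sketch under assumptions: emit_nsis is a hypothetical helper, excludes is matched on directory basenames as in the original script, and symlinked directories that form cycles are not handled.

```ruby
# Hypothetical helper (not from the thread): walks the tree by hand with
# Dir.entries so that a directory's files are printed before its sub-folders.
def emit_nsis(dir, excludes = [], out = $stdout)
  entries = (Dir.entries(dir) - %w[. ..]).sort.map { |name| File.join(dir, name) }
  subdirs, files = entries.partition { |path| File.directory?(path) }

  # Directories that hold no files of their own get no SetOutPath line.
  unless files.empty?
    out.puts %(SetOutPath "#{dir}")
    files.each { |file| out.puts %(File "#{file}") }
  end

  subdirs.each do |sub|
    next if excludes.include?(File.basename(sub)) # skip excluded folders entirely
    emit_nsis(sub, excludes, out)
  end
end
```

Calling emit_nsis("jruby-1.0.3") would write the list to standard output as it goes. Because each directory's files are emitted before any recursion, no SetOutPath needs to be repeated for the same folder.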

···

2008/3/24, Adam Boyle <briefcase.speakers@gmail.com>:

--
use.inject do |as, often| as.you_can - without end

This is exactly what I was looking for. Thank you for your time and
effort!
