I wrote this little method to return the size of a given directory, but
I think it's very ugly. Could anyone help me clean it up a bit?
def Dir.size(dirname)
Dir.chdir(dirname)
entries = Dir.entries(".").reject { |x| %w(. ..).include? x }
entries.collect! { |filename| File.expand_path(filename) }
size = 0
entries.each do |filename|
begin
if File.file?(filename)
size += File.size(filename) rescue 0
else
size += Dir.size(filename)
end
rescue
next
end
end
size
end
I managed to get rid of the file names discovery by using Dir's globbing
facilities. The size calculation is then a matter of a single inject call:
def Dir.size(name)
Dir.chdir(name)
files = Dir["**/*"]
files.inject(0) do |total, name|
if File.file?(name)
total + File.size(name)
else
total
end
end
end
just thought i'd point out that every solution posted thus far fails in a
variety of ways when links are considered - in the best cases linked files are
counted twice or linked dirs are not checked, in the worst case infinite loops
occur. the methods using 'Dir[glob]' v.s. 'Find::find' suffer from the link
issue but also will perfom badly on large file systems. unfortunately ruby's
built-in 'Find::find' cannot deal with links - for that you have to rely on
motoyuki kasahara's Find2 module, which you can get off of the raa. i have it
inlined in my personal library (alib - also on the raa) with few small bug
fixes and interface additions, to use you would do something like:
require 'alib'
def dir_size dir
size = 0
totalled = {}
ALib::Util::find2(dir, 'follow' => true) do |path, stat|
begin
next if totalled[stat.ino]
next unless stat.file?
size += stat.size
ensure
totalled[stat.ino] = true
end
end
size
end
p dir_size('.')
this handles huge directories, duplicate files (links) in a directory, linked
directories, and potential infinite loops. i think this is about as simply as
one can write this without introducing subtle, or not so subtle, bugs.
cheers.
-a
···
On Mon, 5 Sep 2005, Vincent Foley wrote:
Hello guys,
I wrote this little method to return the size of a given directory, but
I think it's very ugly. Could anyone help me clean it up a bit?
def Dir.size(dirname)
Dir.chdir(dirname)
entries = Dir.entries(".").reject { |x| %w(. ..).include? x }
entries.collect! { |filename| File.expand_path(filename) }
size = 0
entries.each do |filename|
begin
if File.file?(filename)
size += File.size(filename) rescue 0
else
size += Dir.size(filename)
end
rescue
next
end
end
size
end
Thanks,
Vincent.
--
email :: ara [dot] t [dot] howard [at] noaa [dot] gov
phone :: 303.497.6469
Your life dwells amoung the causes of death
Like a lamp standing in a strong breeze. --Nagarjuna
File.size(dirname) only tells you how many bytes the directory's own
inode is using. It doesn't include the bytes for the directory's
files.
How about this:
def Dir.size(dname)
Dir.new(dname).inject(File.size(dname)) {|total,name|
begin
exname = File.expand_path(name,dname)
if File.file?(exname)
total + File.size(exname)
elsif File.directory?(exname) and name != '.' and name != '..'
total + Dir.size(exname)
else
total
end
rescue
total
end
}
end
-Ed
···
On Mon, Sep 05, 2005 at 10:52:18AM +0900, James Edward Gray II wrote:
On Sep 4, 2005, at 7:46 PM, Vincent Foley wrote:
> I wrote this little method to return the size of a given directory,
> but I think it's very ugly. Could anyone help me clean it up a bit?
File.size(dirname) seems to be working on my system. Am I missing
something obvious?
The mailing list critters ate my indentation! Maybe this one will get
through:
def Dir.size(name)
Dir.chdir(name)
files = Dir["**/*"]
files.inject(0) do |total, name|
if File.file?(name)
total + File.size(name)
else
total
end
end
end
Hristo Deshev
···
On 9/5/05, Hristo Deshev <hristo.deshev@gmail.com> wrote:
On 9/5/05, Vincent Foley <vfoley@gmail.com> wrote:
>
> Hello guys,
>
> I wrote this little method to return the size of a given directory, but
> I think it's very ugly. Could anyone help me clean it up a bit?
Hi guys,
I managed to get rid of the file names discovery by using Dir's globbing
facilities. The size calculation is then a matter of a single inject call:
def Dir.size(name)
Dir.chdir(name)
files = Dir["**/*"]
files.inject(0) do |total, name|
if File.file?(name)
total + File.size(name)
else
total
end
end
end
One more safety addition
def Dir.size(name)
Dir[ File.join(name, "**/*") ].select{ | f | File.file?(f)
}.inject(0){ | sum, f |
sum + File.size(f)
}
end
regards,
Brian
···
On 05/09/05, William James <w_a_x_man@yahoo.com> wrote:
Vincent Foley wrote:
> Hello guys,
>
> I wrote this little method to return the size of a given directory, but
> I think it's very ugly. Could anyone help me clean it up a bit?
def Dir.size(name)
Dir[ name + "/**/*" ].select{|x| File.file?(x)}.inject(0){|sum,f|
sum + File.size(f) }
end
> Hello guys,
>
> I wrote this little method to return the size of a given directory, but
> I think it's very ugly. Could anyone help me clean it up a bit?
>
[snip]
just thought i'd point out that every solution posted thus far fails in a
variety of ways when links are considered - in the best cases linked files are
counted twice or linked dirs are not checked, in the worst case infinite loops
occur. the methods using 'Dir[glob]' v.s. 'Find::find' suffer from the link
issue but also will perfom badly on large file systems. unfortunately ruby's
built-in 'Find::find' cannot deal with links - for that you have to rely on
motoyuki kasahara's Find2 module, which you can get off of the raa. i have it
inlined in my personal library (alib - also on the raa) with few small bug
fixes and interface additions, to use you would do something like:
require 'alib'
def dir_size dir
size = 0
totalled = {}
ALib::Util::find2(dir, 'follow' => true) do |path, stat|
begin
next if totalled[stat.ino]
next unless stat.file?
size += stat.size
ensure
totalled[stat.ino] = true
end
end
size
end
p dir_size('.')
this handles huge directories, duplicate files (links) in a directory, linked
directories, and potential infinite loops. i think this is about as simply as
one can write this without introducing subtle, or not so subtle, bugs.
This solution fails under windoze by always returning 0, apparently
because
File.stat(path).ino always returns 0.
If you're stuck with windoze, use the previously posted
def Dir.size(name)
Dir[File.join(name, "**/*")].select{|f|
File.file?(f)}.inject(0){|sum,f|
sum + File.size(f)
}
end
just thought i'd point out that every solution posted thus far fails in a
variety of ways when links are considered - in the best cases linked files are
counted twice or linked dirs are not checked, in the worst case infinite loops
occur.
windows has links - and systems that have du may not have one that supports
max-depth. and du reports on file system blocks - not directory sums. this is
typically close, but can diverge greatly depending on file system setup and the
number of directoriess - since du will report usage for directories too..
harp:~ > mkdir foobar
harp:~ > du foobar
4 foobar
fyi.
-a
···
On Tue, 6 Sep 2005, David Brady wrote:
Ara.T.Howard wrote:
just thought i'd point out that every solution posted thus far fails in a
variety of ways when links are considered
If you have links, you also have du, which IMO is the Right Tool For This Job:
total = `du --max-depth=0|cut -f 1`.to_i
--
email :: ara [dot] t [dot] howard [at] noaa [dot] gov
phone :: 303.497.6469
Your life dwells amoung the causes of death
Like a lamp standing in a strong breeze. --Nagarjuna
def dir_size dir
size = 0
totalled = {}
ALib::Util::find2(dir, 'follow' => true) do |path, stat|
begin
next if totalled[stat.ino]
next unless stat.file?
size += stat.size
ensure
totalled[stat.ino] = true
end
end
size
end
so it seems like your ruby may be broken - how'd you install it?
cheers.
-a
···
On Tue, 6 Sep 2005, William James wrote:
This solution fails under windoze by always returning 0, apparently because
File.stat(path).ino always returns 0.
If you're stuck with windoze, use the previously posted
def Dir.size(name)
Dir[File.join(name, "**/*")].select{|f|
File.file?(f)}.inject(0){|sum,f|
sum + File.size(f)
}
end
--
email :: ara [dot] t [dot] howard [at] noaa [dot] gov
phone :: 303.497.6469
Your life dwells amoung the causes of death
Like a lamp standing in a strong breeze. --Nagarjuna
we've got a script here (dirsum) that does essentially all the things outlined
- especially checking compressed files - for monitoring data volumes. the
problem is much trickier that one would assume.
cheers.
-a
···
On Tue, 6 Sep 2005, Daniel Berger wrote:
Ara.T.Howard wrote:
<snip>
just thought i'd point out that every solution posted thus far fails in a
variety of ways when links are considered - in the best cases linked files
are counted twice or linked dirs are not checked, in the worst case
infinite loops occur.
email :: ara [dot] t [dot] howard [at] noaa [dot] gov
phone :: 303.497.6469
Your life dwells amoung the causes of death
Like a lamp standing in a strong breeze. --Nagarjuna
If you have links, you also have du, which IMO is the Right Tool For This Job:
total = `du --max-depth=0|cut -f 1`.to_i
sorry for stating the obvious, but: this isn't very portable.
Yup!
<soapbox>
Portability isn't a good idea here.
(wait for scandalized gasps to quiet down)
The need here was to clean up the code. The solutions so far have bulkily danced around the fact that Find::find doesn't seem to work satisfactorily.
If the Standard Library is defective, the proper, *portable* solution should be to patch the .c files that are defective. Until then, if we're going to work around the Standard Library, we should be as quick and deadly as possible.
I recognize that it may be seen as cheating or possibly even inappropriate to suggest "don't use Ruby for this" on a Ruby mailing list, but I feel strongly that Ruby's very ability to take advantage of external tools is a great strength of the language. A superior wheel already exists; reinventing it poorly does not seem to me to leverage Ruby's power very well.
There seems to me to be a "right" way to do this, and it is to have the Standard Library work as desired. Until then, how much effort should be spent standardizing on a portable kludge?
</soapbox>
And of course, if the OP is on a different platform, the if statement that started my earlier post will become important. The else clause probably reads "then the issues with File::find link don't affect you. Use the Standard Library."
Just my $0.02. Note sig.
-dB
···
--
David Brady
ruby-talk@shinybit.com
C++ Guru. Ruby nuby. Apply salt as needed.
> This solution fails under windoze by always returning 0, apparently because
> File.stat(path).ino always returns 0.
>
> If you're stuck with windoze, use the previously posted
>
> def Dir.size(name)
> Dir[File.join(name, "**/*")].select{|f|
> File.file?(f)}.inject(0){|sum,f|
> sum + File.size(f)
> }
> end
works for me :
Ara@JEN ~
$ cat a.rb
require 'alib'
def dir_size dir
size = 0
totalled = {}
ALib::Util::find2(dir, 'follow' => true) do |path, stat|
begin
next if totalled[stat.ino]
next unless stat.file?
size += stat.size
ensure
totalled[stat.ino] = true
end
end
size
end
> Ara.T.Howard wrote:
>
> <snip>
>
>> just thought i'd point out that every solution posted thus far fails in a
>> variety of ways when links are considered - in the best cases linked files
>> are counted twice or linked dirs are not checked, in the worst case
>> infinite loops occur.
>
> For a good summary of the problems with calculating the size of a directory
> (on Windows) see
> http://blogs.msdn.com/oldnewthing/archive/2004/12/28/336219.aspx\.
indeed.
we've got a script here (dirsum) that does essentially all the things outlined
- especially checking compressed files - for monitoring data volumes. the
problem is much trickier that one would assume.
By all means, please share. I'd be happy to add a Dir.size method to
win32-dir.