Deleting files older than a stipulated date

Pardon my multiple posts in this forum; I'm rushing a project but am
still quite new to Ruby.

Anyway, I would like to select files to delete based on their
datestamped folders.
For example, I would like to delete files which are older than 5 business
days (i.e. Saturdays and Sundays are not counted). Currently my code
looks like this:
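
    require 'fileutils'

    # $del_path and $del_selection are parallel arrays (directory, pattern).
    def delFiles
      $del_path.zip($del_selection).each do |path, selection|
        # Use locals here: assigning to the globals would overwrite the arrays.
        del = File.join path, selection
        puts "Files/Folders Deleted: #{del}"
        FileUtils.rm_r Dir.glob(del)
      end #each
    end #delFiles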

I'm actually using arrays to hold my del_path and del_selection
because I will need to delete multiple directories or files all at once.
Is there any way I can build on my code, but with the criterion that
only files older than 5 business days get deleted? Bear in mind that the
names do not only contain the date (e.g. 20080331) but can also come with
other characters (e.g. risk20080331).
Any help is really appreciated. :wink:

···

--
Posted via http://www.ruby-forum.com/.

If you mean that the date is part of the name, I have a script that
does something similar. Let me paste you the relevant parts (this is not
a complete program, it's just the part that makes the checks against the
dates).

Here I receive in params the list of folders to process,
and the age of files for zipping and deleting. So for example,
to say "delete files older than 15 days, zip files older than 7 days",
I pass delete=15, zip=7. In this part I calculate the dates to
check:

    folders = params[:directories].values
    have_to_zip = params[:zip].given?
    zip = params[:zip].value
    if have_to_zip
      zip_date = DateTime.now - zip
    end
    delete = params[:delete].value
    delete_date = DateTime.now - delete

Now I build a regexp to match the file names and extract the date from them:

    regexp = Regexp.compile(/^(\d\d\d\d-\d\d-\d\d).*\.log(\.gz)?$/)
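
To make the captures concrete, here is a small sketch of what that regexp returns. The file names are made up for the example: match[1] holds the date text, and match[2] is only set when the file is already gzipped.

```ruby
require 'date'

regexp = /^(\d\d\d\d-\d\d-\d\d).*\.log(\.gz)?$/

# Hypothetical file names, only used to show the captures.
names = ["2008-03-31-server.log", "2008-03-31-server.log.gz", "notes.txt"]

results = names.map do |name|
  match = regexp.match(name)
  next [name, nil, nil] unless match
  # match[1] is the date text, match[2] is ".gz" or nil
  [name, Date.parse(match[1]), !match[2].nil?]
end

results.each do |name, date, gzipped|
  puts "#{name}: date=#{date.inspect} gzipped=#{gzipped.inspect}"
end
```

Names that do not match (like notes.txt here) are simply skipped by the real script.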

Now I traverse the folders, trying to match the filenames against the regexp.
When I find a match, I check the date in the name against the delete date
and the zip date and act accordingly, storing info for a report:

    fileData = Struct.new(:name, :size)
    deleted_files = []
    zipped_files = []

    folders.each do |folder|
      Find.find(folder + "/") do |file|
        match = regexp.match(File.basename(file))
        if match
          file_date = DateTime.parse(match[1])
          size = File.stat(file).size
          if delete_date > file_date
            deleted_files << fileData.new(file,size)
            File.delete(file)
          elsif have_to_zip && zip_date > file_date && !match[2]
            zipped_files << fileData.new(file,size)
            # Pass the arguments separately so odd file names are not
            # interpreted by the shell.
            system("gzip", "-f", file)
          end
        end
      end
    end

If you want to check the actual modification date of the file,
I think File.stat can help with that (see its mtime method). I don't
know whether Find.find has an option to search for files based on date.
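
A minimal sketch of the modification-date approach, assuming you only care about when the file was last written rather than the date in its name. File.mtime and File.utime are standard methods; the helper name older_than? and the day counts are just for illustration.

```ruby
require 'tempfile'

# Hypothetical helper: true if the file was last modified more than
# `days` days before `now`.
def older_than?(path, days, now = Time.now)
  File.mtime(path) < now - days * 24 * 60 * 60
end

# Demo on a temporary file whose mtime is pushed 10 days into the past.
Tempfile.create("demo") do |f|
  ten_days_ago = Time.now - 10 * 24 * 60 * 60
  File.utime(ten_days_ago, ten_days_ago, f.path)
  puts older_than?(f.path, 7)
  puts older_than?(f.path, 15)
end
```

This counts calendar days only; weekends are not treated specially.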

Hope this helps.

Jesus.

···

On Thu, Apr 17, 2008 at 4:53 AM, Clement Ow <clement.ow@asia.bnpparibas.com> wrote:

Are you using a hash to contain the file paths that you want to delete?
It pretty much works the same as the array that I have created, eh?
By the way, is it possible to attach your script so that I can use it for
reference? Thanks!

···


Thanks Jesus! I modified the code a bit and it is now a handy
deletion script:

    require 'date'
    require 'find'

    delete = 5
    # I add 2 to avoid counting the weekend, as I only want to delete files
    # older than 5 business days, i.e. not including Sat and Sun.
    delete = delete + 2
    folders = $del_path
    delete_date = DateTime.now - delete

    regexp = Regexp.compile(/(\d{4}\d{2}\d{2})/)

    fileData = Struct.new(:name, :size)
    deleted_files = []

    folders.each do |folder|
      Find.find(folder + "/") do |file|
        match = regexp.match(File.basename(file))
        if match
          puts file_date = DateTime.parse(match[1])
          size = File.stat(file).size
          if delete_date > file_date
            deleted_files << fileData.new(file,size)
            puts "delete files: #{file} size: #{size} bytes"
            #File.delete(file)
          end
        end
      end
    end

However, this gives wrong results when I change to delete = 2, because
a fixed +2 only works when the period always spans exactly one weekend.
Any ways to improve the code are most welcome :wink:
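
One possible way around the fixed +2, sketched under a few assumptions: the cutoff is found by walking back one calendar day at a time and counting only Monday to Friday, and the date in the name is always eight digits in yyyymmdd order. The helper names business_days_before and date_from_name are made up for this example, and public holidays are not handled.

```ruby
require 'date'

# Hypothetical helper: walk back one calendar day at a time and count only
# Monday to Friday. Public holidays are NOT handled.
def business_days_before(from, n)
  date = from
  while n > 0
    date -= 1
    n -= 1 unless date.saturday? || date.sunday?
  end
  date
end

# Hypothetical helper: pull a yyyymmdd date out of a name such as
# "risk20080331". Returns nil when the name holds no valid date.
def date_from_name(name)
  match = /(\d{8})/.match(name)
  return nil unless match
  Date.strptime(match[1], "%Y%m%d")
rescue ArgumentError
  nil
end

today  = Date.new(2008, 4, 21)            # a Monday, fixed for the example
cutoff = business_days_before(today, 5)   # lands on Monday 2008-04-14

["risk20080331", "20080418", "report99999999"].each do |name|
  date = date_from_name(name)
  puts "#{name}: #{date && date < cutoff ? 'delete' : 'keep'}"
end
```

In the script above, delete_date could then be replaced by such a cutoff, and names whose digits do not form a real date are kept rather than raising an error.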

···


I'm using the "main" gem to process input parameters. The folders
variable is an array, I think.
Here is the full script (it uses an erb template to compose the email
report, which I don't think is interesting):

require 'date'
require 'find'
require 'simplemail'
require 'main'
require 'erb'
# the next two are just for the number_to_human_size method
require 'action_controller'
require 'action_view'
include ActionView::Helpers::NumberHelper

TEMPLATE_FILE = File.join(File.dirname(__FILE__), "deletelogs_template.erb")

main {
  # Single-quoted heredoc so the backslashes in the regexp are kept as-is.
  description <<-'DESC'
    Deletes or gzips jhub log files older than the specified dates,
    searching the specified directories recursively.
    The files should match this regexp: /^(\d\d\d\d-\d\d-\d\d).*\.log(\.gz)?$/
  DESC

  option("zip", "z") {
    argument :required
    description "Zip files older than the specified number of days"
    cast :int
  }
  option("delete", "d") {
    argument :required
    defaults 7
    description "Delete files older than the specified number of days. " \
                "If --zip option is specified, only delete the files " \
                "that are in between both dates"
    cast :int
  }
  argument("directories") {
    arity -2
  }

  def disk_usage
    `df -h`
  end

  def run
    folders = params[:directories].values
    have_to_zip = params[:zip].given?
    zip = params[:zip].value
    if have_to_zip
      zip_date = DateTime.now - zip
    end
    delete = params[:delete].value
    delete_date = DateTime.now - delete

    usage_before = disk_usage
    regexp = Regexp.compile(/^(\d\d\d\d-\d\d-\d\d).*\.log(\.gz)?$/)

    fileData = Struct.new(:name, :size)
    deleted_files = []
    zipped_files = []

    folders.each do |folder|
      Find.find(folder + "/") do |file|
        match = regexp.match(File.basename(file))
        if match
          file_date = DateTime.parse(match[1])
          size = File.stat(file).size
          if delete_date > file_date
            deleted_files << fileData.new(file,size)
            File.delete(file)
          elsif have_to_zip && zip_date > file_date && !match[2]
            zipped_files << fileData.new(file,size)
            # Pass the arguments separately so odd file names are not
            # interpreted by the shell.
            system("gzip", "-f", file)
          end
        end
      end
    end
    usage_after = disk_usage
    report = ERB.new(File.read(TEMPLATE_FILE), nil, "%<>")
    report = report.result(binding)
    SimpleMail.deliver_simple('email address', 'email address',
'Delete old log files', report)
  end
}

SimpleMail is just a class that inherits ActionMailer and has the smtp
info and a simple method to compose the email.
Any comment on how to make this more efficient or better is appreciated...

Jesus.

···

On Thu, Apr 17, 2008 at 11:19 AM, Clement Ow <clement.ow@asia.bnpparibas.com> wrote:

Currently this is what my code looks like:

    delete = delete + 2
    folders = $del_path
    delete_date = DateTime.now - delete

    regexp = Regexp.compile(/(\d{4}\d{2}\d{2})/)

    fileData = Struct.new(:name, :size)
    deleted_files = []

    folders.each do |folder|
      Find.find(folder + "/") do |file|
        match = regexp.match(File.basename(file))
        if match
          puts file_date = DateTime.parse(match[1])
          size = File.stat(file).size
          if delete_date > file_date
            deleted_files << fileData.new(file,size)
            puts "delete files: #{file} size: #{size} bytes"
            #File.delete(file)
          end
        end
      end
    end

Currently, when I specify the folder path as C:/Test, it searches for
files in C:/Test/New as well. Is there any way to stop it from
traversing subfolders when matching the regexp (i.e. only match files in
C:/Test if the path specified is C:/Test)? Thanks in advance!

···


> folders.each do |folder|
> Find.find(folder + "/") do |file|

[...]

> end
> end

> Currently, when I specify the folder path to be C:/Test, it begins
> searching for files that are in C:/Test/New as well. So is there any way
> that the command doesn't traverse the folders to match the regexp? (i.e.
> only match C:/Test if the path specified is C:/Test) Thanks in advance!

You mean you just want to list the first level, without recursively
processing the subfolders? Then you don't need the Find module; you can use:

    Dir.glob(folder + "/*") do |file|
      # This will list subfolders too, so you can skip them with:
      next if File.directory? file
      [...]
    end
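
As a self-contained illustration of the Dir.glob approach, here is a sketch that builds a throwaway folder with one file at the top level and one inside a subfolder named New, then lists only the top-level file. The helper name top_level_files and the file names are assumptions for the example.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper: plain files directly inside `folder`, no recursion.
def top_level_files(folder)
  Dir.glob(File.join(folder, "*")).select { |path| File.file?(path) }
end

Dir.mktmpdir do |dir|
  FileUtils.touch(File.join(dir, "risk20080331"))
  FileUtils.mkdir(File.join(dir, "New"))
  FileUtils.touch(File.join(dir, "New", "risk20080401"))

  # Only the file at the first level is listed; New/risk20080401 is not.
  puts top_level_files(dir).map { |path| File.basename(path) }.inspect
end
```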

If you want finer-grained control over which subfolders to recurse into,
you can use Find.prune. This is an example to achieve the same as above:

irb(main):046:0> Find.find("/home/jesus/") do |file|
irb(main):047:1* if ((File.directory? file) && !(file == "/home/jesus/"))
irb(main):048:2> Find.prune
irb(main):049:2> end
irb(main):050:1> puts file
irb(main):051:1> end
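
The same idea as a standalone script rather than an irb session, as a rough sketch: every subfolder below the starting folder is pruned, so only first-level files are collected. The helper name files_without_recursion and the temporary folder layout are made up for the example.

```ruby
require 'find'
require 'tmpdir'
require 'fileutils'

# Hypothetical helper: like Find.find, but does not descend into any
# subfolder of `root`. Returns the first-level files only.
def files_without_recursion(root)
  found = []
  Find.find(root) do |path|
    next if path == root
    if File.directory?(path)
      Find.prune   # skip this subfolder and everything below it
    else
      found << path
    end
  end
  found
end

Dir.mktmpdir do |dir|
  FileUtils.touch(File.join(dir, "risk20080331"))
  FileUtils.mkdir(File.join(dir, "New"))
  FileUtils.touch(File.join(dir, "New", "risk20080401"))

  puts files_without_recursion(dir).map { |path| File.basename(path) }.inspect
end
```

Changing the condition around Find.prune (for example, pruning only folders with a certain name) gives the finer-grained control mentioned above.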

Hope this helps,

Jesus.

···

On Mon, Apr 21, 2008 at 9:10 AM, Clement Ow <clement.ow@asia.bnpparibas.com> wrote:

Thanks for taking time to help me with my script. I really appreciate
it!

Cheers,
Clement :wink:
