[ANN] dirwatch-0.0.6

URLS
   http://raa.ruby-lang.org/project/dirwatch/
   http://www.codeforpeople.com/lib/ruby/dirwatch/

NAME
   dirwatch v0.0.6

SYNOPSIS
   dirwatch [options]+ [directory = ./] [mode = watch] [dbdir = 0]

DESCRIPTION
   dirwatch is a tool used to design file system based event driven systems.

   dirwatch manages an sqlite database that mirrors the state of a directory
   and then triggers user definable event handlers for certain filesystem
   activities such as file creation, modification, deletion, etc. dirwatch
   normally runs as a daemon process, synchronizing the database inventory with
   that of the directory and then firing the appropriate triggers. dirwatch is
   designed such that more than one 'watch' may be placed on a given directory
   and it is nfs clean.

···

   -----------------------------------------------------------------------------
   the following actions may have triggers configured for them
   -----------------------------------------------------------------------------

   created -> a file was detected that was not already in the database
   modified -> a file in the database was detected as being modified
   updated -> a file was created or modified (union of these actions)
   deleted -> a file in the database is no longer in the directory
   existing -> a file in the database still exists in the directory and has not
               been modified
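
   as a rough illustration of how these five actions relate, the sketch below
   classifies entries by comparing two snapshots. it is not dirwatch's own
   code: the method name 'classify', the path => mtime hashes, and the use of
   an mtime comparison to detect modification are all assumptions made for
   the example.

```ruby
# classify entries by comparing a database snapshot with a directory
# snapshot. both are hashes of path => mtime (hypothetical representation).
def classify(db, dir)
  actions = { created: [], modified: [], deleted: [], existing: [] }

  dir.each do |path, mtime|
    if !db.key?(path)
      actions[:created] << path      # in directory, not yet in database
    elsif db[path] != mtime
      actions[:modified] << path     # in both, but mtime differs (assumed test)
    else
      actions[:existing] << path     # in both and unchanged
    end
  end

  # in database but gone from the directory
  db.each_key { |path| actions[:deleted] << path unless dir.key?(path) }

  # 'updated' is the union of 'created' and 'modified'
  actions[:updated] = actions[:created] + actions[:modified]
  actions
end

db  = { 'a.txt' => 100, 'b.txt' => 200, 'c.txt' => 300 }
dir = { 'a.txt' => 100, 'b.txt' => 250, 'd.txt' => 400 }
classify(db, dir).each { |action, paths| puts "#{action}: #{paths.join(', ')}" }
```

   note that a single file can appear under two actions at once, since
   every created or modified entry is also an updated one.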

   -----------------------------------------------------------------------------
   the command line 'mode' must be one of the following
   -----------------------------------------------------------------------------

   create (c) -> initialize the database and supporting files
   watch (w) -> maintain database inventory and trigger actions
   list (l) -> dump database to stdout in silky smooth yaml format
   template (t) -> generate a template config file

   for all modes except 'template' the first command line argument must be the
   name of the directory to which the operation applies

   -----------------------------------------------------------------------------
   mode: create (c)
   -----------------------------------------------------------------------------

   initializes a storage directory, known from here on as 'dbdir', with all
   required database files, logs, command directories, sample configuration,
   sample programs, etc.

   by default the dbdir will be stored in a numbered subdirectory such as

     directory/.dirwatch/n

   where 'directory' is the directory named on the command line and 'n' is the
   watch number.

   multiple dirwatches may be placed upon a directory - these 'watches' will be
   automagically numbered starting from 0 as they are created. for instance
   the command

     dirwatch ./ create

   followed by another

     dirwatch ./ create

   would initialize both the dbdirs './.dirwatch/0' AND './.dirwatch/1' to
   allow two 'watches' (0 and 1) to later be placed upon the directory. see
   watch section below.
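
   the numbering could be pictured roughly as follows. this is only a sketch:
   the text above says watches are numbered from 0 as they are created, but
   the helper name 'next_watch_number' and the 'highest existing number plus
   one' rule are assumptions, not dirwatch's documented behaviour.

```ruby
require 'tmpdir'
require 'fileutils'

# pick the number for the next watch on 'directory' by looking at the
# numbered subdirectories of directory/.dirwatch (assumed rule: max + 1)
def next_watch_number(directory)
  root = File.join(directory, '.dirwatch')
  return 0 unless File.directory?(root)

  taken = Dir.children(root).grep(/\A[0-9]+\z/).map(&:to_i)
  taken.empty? ? 0 : taken.max + 1
end

# simulate two successive 'create' runs against a scratch directory
Dir.mktmpdir do |dir|
  2.times do
    n = next_watch_number(dir)
    FileUtils.mkdir_p(File.join(dir, '.dirwatch', n.to_s))
    puts "created watch #{n}"
  end
end
```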

   dbdir may be specified at creation (or watch) time as either the last
   command line argument, or by using the '--dbdir' option, as the full path to
   the storage directory. as a special case dbdir may be specified as a number
   only (matching /[0-9]+/) in which case the dbdir is assumed to be a numbered
   subdirectory of directory/.dirwatch/.

   for example

     dirwatch ./ create 42

   or

     dirwatch --dbdir=42 ./ create

   would use the directory ./.dirwatch/42/ as dbdir, and

     dirwatch ./ create /full/path/to/dbdir

   would use /full/path/to/dbdir as a dbdir
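
   the rule above can be sketched in a few lines. the method name
   'resolve_dbdir' is hypothetical, and anchoring the pattern to the whole
   string is an assumption drawn from the phrase 'a number only'.

```ruby
# map a dbdir argument to a storage directory: a bare number means a
# numbered subdirectory of directory/.dirwatch/, anything else is taken
# as the path to the dbdir itself
def resolve_dbdir(directory, dbdir = '0')
  dbdir = dbdir.to_s
  if dbdir.match?(/\A[0-9]+\z/)
    File.join(directory, '.dirwatch', dbdir)
  else
    dbdir
  end
end

puts resolve_dbdir('./', '42')
puts resolve_dbdir('./', '/full/path/to/dbdir')
puts resolve_dbdir('dir')   # no dbdir given, falls back to watch 0
```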

   when a dirwatch directory is created a hierarchy is created for storing
   commands (programs) to be triggered for the various actions. the hierarchy
   is :

     dbdir/
           commands/
                    created/
                    updated/
                    modified/
                    deleted/
                    existing/

   the idea being that the actual trigger commands (programs) will be stored
   in either the commands/ subdirectory or in an action specific subdirectory
   (commands/created/, commands/deleted/, etc.). it is not required to store
   programs here, but these locations are automatically checked based on
   trigger type.
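
   one way to picture the lookup is as a search path built per action. the
   order used here (action subdirectory, then commands/, then $PATH) follows
   the search path given for the 'command' field in the watch section. the
   helper name 'trigger_path' is hypothetical and this is a sketch, not the
   tool's implementation.

```ruby
# build the search path used when running a trigger for 'action'
def trigger_path(dbdir, action, env_path = ENV['PATH'].to_s)
  [
    File.join(dbdir, 'commands', action),  # action specific commands first
    File.join(dbdir, 'commands'),          # then commands shared by all actions
    env_path                               # finally the normal $PATH
  ].join(File::PATH_SEPARATOR)
end

puts trigger_path('dir/.dirwatch/0', 'modified', '/usr/bin:/bin')
```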

   a default config file will be auto-generated and placed in the 'dbdir' with
   the name 'dirwatch.conf'. this config will automatically be used, iff
   found, when watching. use the '--config' option to override this.

   -----------------------------------------------------------------------------
   mode: watch (w)
   -----------------------------------------------------------------------------

   dirwatch is designed to run as a daemon, updating the database inventory
   at the interval specified by the '--interval' option (5 minutes by default)
   and firing appropriate trigger commands. two watchers may not watch the
   same dbdir simultaneously, and attempting to start a second watcher will
   fail when the second watcher is unable to obtain the pid lockfile. it is a
   non-fatal error to attempt to start another watcher when one is running and
   this failure can be made silent by using the '--quiet' option. the reason
   for this is to allow a crontab entry to be used to make the daemon
   'immortal'. for example, the following crontab entry

     */15 * * * * dirwatch directory --daemon --dbdir=0 \
                                     --files_only --flat \
                                     --interval=10minutes --quiet

   or (same but shorter)

     */15 * * * * dirwatch directory -D -d0 -f -F -i10m -q

   will __attempt__ to start a daemon watching 'directory' every fifteen
   minutes. if the daemon is not already running one will be started, otherwise
   dirwatch will simply fail silently (no cron email sent due to stderr).
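
   the 'second watcher backs off' behaviour can be illustrated with a plain
   exclusive-create pidfile. this is a simplified stand-in: dirwatch's real
   locking is described as nfs safe, which a bare O_EXCL open is not
   guaranteed to be, and the method name 'claim_pidfile' is hypothetical.

```ruby
require 'tmpdir'

# try to claim the pidfile. returns true if this process now owns it and
# false if another watcher got there first (a non-fatal condition).
def claim_pidfile(pidfile)
  File.open(pidfile, File::WRONLY | File::CREAT | File::EXCL) { |f| f.puts(Process.pid) }
  true
rescue Errno::EEXIST
  false
end

Dir.mktmpdir do |dbdir|
  pidfile = File.join(dbdir, 'dirwatch.pid')
  first   = claim_pidfile(pidfile)   # no watcher yet
  second  = claim_pidfile(pidfile)   # pidfile already held
  # with '--quiet' this message would be suppressed, so cron sends no mail
  warn 'another watcher is already running' unless second
  puts "first=#{first} second=#{second}"
end
```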

   this feature allows a normal user to set up daemon processes that not only
   will run after machine reboot, but which will continue to run after other
   terminal program behaviour.

   the meanings of the options in the above crontab entry are as follows

     --daemon -> become a child of init and run forever
     --dbdir -> the storage directory, here the default is specified
     --files_only -> inventory files only (default is files and directories)
     --flat -> do not recurse into subdirectories (default recurses)
     --interval -> generate inventory, at minimum, every 10 minutes
     --quiet -> be quiet when failing due to another daemon already watching

   as the watcher runs and maintains the inventory it is noted when
   files/directories (entries) have been created, modified, updated, deleted,
   or are existing. these entries are then handled by user definable triggers
   as specified in the config file. the config file is of the format

     ...
     actions :
       created :
         commands :
           ...
       updated :
         commands :
           ...
       ...
     ...

   where the commands to be run for each trigger type are enumerated. each
   command entry is of the following format:

         ...
         -
           command : command to run
           type : calling convention
           pattern : filter files further by this pattern
           timing : synchronous or asynchronous execution
         ...
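
   to make the layout concrete, here is a small made-up config parsed with
   ruby's yaml library. the command names (notify.rb, cleanup.rb) and the
   patterns are invented for the example, and treating 'pattern' as a
   regular expression is an assumption. the auto-generated sample config
   remains the authoritative reference.

```ruby
require 'yaml'

# a hypothetical config with one command for each of two actions
config = YAML.load(<<~'CONF')
  actions :
    created :
      commands :
        -
          command : notify.rb
          type    : simple
          pattern : \.txt$
          timing  : synchronous
    deleted :
      commands :
        -
          command : cleanup.rb '@file' '@mtime'
          type    : expanded
          pattern : \.log$
          timing  : asynchronous
CONF

# walk the structure: action => commands => one hash per command entry
config['actions'].each do |action, spec|
  spec['commands'].each do |cmd|
    puts "#{action}: #{cmd['command']} (#{cmd['type']}, #{cmd['timing']})"
  end
end
```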

   the meaning of each field is as follows :

     command: this is the program to run. the search path for the program is
              determined dynamically by the action run. for instance, when a
              file is discovered to be 'modified' the search path for the
              command will be

                dbdir/commands/modified/ + dbdir/commands/ + $PATH

              this dynamic path setting simply allows for short pathnames if
              commands are stored in the dbdir/commands/* subdirectories.

     type: there are four types of commands. the type merely indicates the
              calling convention of the program. when commands are run there
              are two pieces of information which must be passed to the
              program, the file in question and the mtime of that file. the
              mtime is less important but programs may use it to know if the file
              has been changed since they were spawned. mtime will probably be
              ignored for most commands. the four types of commands fall into
              two categories: those commands called once for each file and those
              types of commands called once with __all__ files

              each file:

                simple: the command will be called with three arguments: the file
                         in question, the mtime date, and the mtime time. eg:

                           command foobar.txt 2002-11-04 01:01:01.1234

                expanded: the command will have the strings '@file' and
                         '@mtime' replaced with appropriate values. eg:

                           command '@file' '@mtime'

                         expands to (and is called as)

                           command 'foobar.txt' '2002-11-04 01:01:01.1234'

              all at once:

                filter: the stdin of the program will be given a list where each
                         line contains three items, the file, the mtime date, and
                         the mtime time.

                yaml: the stdin of the program will be given a list where each
                         entry contains two items, the file and the mtime. the
                         format of the list is valid yaml and the schema is an
                         array of hashes with the keys 'path' and 'mtime'.

     pattern: all the files for a given action are filtered by this pattern,
              and only those files matching pattern will have triggers fired.

     timing: if timing is asynchronous the command will be run and not waited
              for before starting the next command. asynchronous commands may
              yield better performance but may also result in many commands
              being run at once. asynchronous commands should not load the
              system heavily unless one is looking to freeze a machine.
              synchronous commands are spawned and waited for before the next
              command is started. a side effect of synchronous commands is
              that the time spent waiting may sum to an amount of time greater
              than the interval ('--interval' option) specified - if the amount
              of time running commands exceeds the interval the next inventory
              simply begins immediately with no pause. because of this one
              should think of the interval used as a minimum bound only,
              especially when synchronous commands are used.
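
   the 'interval as a minimum bound' behaviour described under timing amounts
   to sleeping only for whatever is left of the interval after a pass. the
   sketch below assumes a simple loop. the names 'pause_for' and 'watch' are
   hypothetical and the real scheduler may differ.

```ruby
# time left to sleep after one inventory pass, never negative
def pause_for(interval, elapsed)
  [interval - elapsed, 0].max
end

# run 'nloops' passes, each taking at least 'interval' seconds
def watch(interval, nloops)
  nloops.times do
    started = Time.now
    yield                                          # inventory + synchronous triggers
    sleep pause_for(interval, Time.now - started)  # 0 when the pass overran
  end
end

watch(0.2, 2) { sleep 0.05 }   # two short passes, each padded out to 0.2s
puts pause_for(600, 45)        # pass took 45s of a 10 minute interval
puts pause_for(600, 750)       # pass overran, so the next one starts at once
```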

   note that sample commands of each type are auto-generated in the
   dbdir/commands directory. reading these should answer any questions regarding
   the calling conventions of any of the four types. for other questions consult
   the sample config, which is also auto-generated.
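
   for readers without the generated samples at hand, the helpers below
   sketch the 'expanded', 'filter' and 'yaml' conventions as described
   above. they are illustrations only: the method names are hypothetical,
   the filter parser assumes paths without whitespace, and the yaml reader
   assumes mtime arrives as a string or a timestamp.

```ruby
require 'yaml'
require 'stringio'

# 'expanded' type: substitute the placeholders in the configured command
def expand(command, file, mtime)
  command.gsub(/@file|@mtime/) { |m| m == '@file' ? file : mtime }
end

# 'filter' type: each stdin line holds the file, the mtime date, the mtime time
def read_filter(io)
  io.each_line.map do |line|
    file, date, time = line.split(' ', 3).map(&:strip)
    { 'path' => file, 'mtime' => "#{date} #{time}" }
  end
end

# 'yaml' type: stdin holds an array of hashes with keys 'path' and 'mtime'
def read_yaml(io)
  YAML.safe_load(io.read, permitted_classes: [Time])
end

puts expand("command '@file' '@mtime'", 'foobar.txt', '2002-11-04 01:01:01.1234')

input = StringIO.new("foobar.txt 2002-11-04 01:01:01.1234\n")
read_filter(input).each { |entry| puts "#{entry['path']} @ #{entry['mtime']}" }
```

   a real filter or yaml trigger would pass $stdin instead of a StringIO.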

   -----------------------------------------------------------------------------
   mode: list (l)
   -----------------------------------------------------------------------------

   dump the contents of the database in yaml format for easy viewing/parsing

   -----------------------------------------------------------------------------
   mode: template (t)
   -----------------------------------------------------------------------------

   generate a template config. the first directory argument is ignored so one
   may type

     dirwatch directory template [template file]

   or

     dirwatch template [template file]

ENVIRONMENT

   for dirwatch:

     export SQLDEBUG=1 -> cause sql debugging info to be logged
     export LOCKFILE_DEBUG=1 -> cause lockfile debugging info to be logged

   for triggers run under dirwatch:

     DIRWATCH_DIR -> directory being watched
     DIRWATCH_ACTION -> trigger type
     DIRWATCH_TYPE -> command type
     DIRWATCH_N_PATHS -> total number of paths for this trigger
     DIRWATCH_PATH_IDX -> for simple|expanded path number
                          for filter|yaml set to DIRWATCH_N_PATHS
     DIRWATCH_PATH -> for simple|expanded path
                          for filter|yaml nil
     DIRWATCH_MTIME -> for simple|expanded mtime of path
                          for filter|yaml nil
     DIRWATCH_PID -> pid of dirwatch watcher
     DIRWATCH_ID -> trigger unique identifier
     PATH -> .dirwatch/(0...n)/commands/action + ENV['PATH']
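
   a trigger can use these variables to report what it was called for. the
   sketch below is a hypothetical trigger body. it takes the environment as
   an argument so it can be tried outside of dirwatch, and it makes no
   assumption about whether DIRWATCH_PATH_IDX counts from 0 or 1.

```ruby
# describe the invocation using the DIRWATCH_* variables
def describe_trigger(env = ENV)
  action = env['DIRWATCH_ACTION']
  type   = env['DIRWATCH_TYPE']
  total  = env['DIRWATCH_N_PATHS'].to_i

  if %w[simple expanded].include?(type)
    # called once per file: path and index are set
    "#{action}: #{env['DIRWATCH_PATH']} (index #{env['DIRWATCH_PATH_IDX']}, #{total} total)"
  else
    # filter|yaml: called once, all paths arrive on stdin
    "#{action}: #{total} paths on stdin"
  end
end

# made-up values standing in for what dirwatch would export
sample = {
  'DIRWATCH_ACTION'   => 'created',
  'DIRWATCH_TYPE'     => 'simple',
  'DIRWATCH_N_PATHS'  => '3',
  'DIRWATCH_PATH_IDX' => '1',
  'DIRWATCH_PATH'     => 'foobar.txt'
}
puts describe_trigger(sample)
```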

FILES
   directory/.dirwatch/n/ -> dirwatch data files
   directory/.dirwatch/n/dirwatch.conf -> default configuration file
   directory/.dirwatch/n/commands/* -> default location for triggers
   directory/.dirwatch/n/db -> sqlite database file
   directory/.dirwatch/n/db.schema -> sqlite database schema
   directory/.dirwatch/n/lock -> sentinel lock file used for nfs safe access
   directory/.dirwatch/n/dirwatch.pid -> default pidfile
   directory/.dirwatch/n/dirwatch.log -> default log file
   directory/.dirwatch/n/* -> misc files used by locking subsystem

DIAGNOSTICS
   success -> $? == 0
   failure -> $? != 0

AUTHOR
   ara.t.howard@noaa.gov

BUGS
   1 < bugno && bugno < 42

OPTIONS
   --lockfile=[lockfile], -L
         coordinate inventory on lockfile - (default directory/.lock)
   --dbdir=dbdir, -d
         specify dbdir used - (default directory/.dirwatch/0)
   --interval=interval, -i
         specify polling interval - (default 5 minutes)
   --nloops=nloops, -N
         specify the number of watch loops - (default infinite)
   --daemon, -D
         run as a daemon
   --quiet, -q
         fail quietly if pidfile cannot be generated
   --pattern=pattern, -p
         watch only files matching pattern (__not__ shell glob)
   --files_only, -f
         ignore everything but files - (default directories and files)
   --flat, -F
         do not recurse into subdirectories - (default recurse)
   --pidfile=pidfile, -P
         specify pidfile used - (default @dbdir/dirwatch.pid)
   --verbosity=verbosity, -v
         0|fatal < 1|error < 2|warn < 3|info < 4|debug - (default info)
   --log=path, -l
         set log file - (default stderr or, iff existing, @dbdir/dirwatch.log)
   --log_age=log_age
         daily | weekly | monthly - what age will cause log rolling (default
         nil)
   --log_size=log_size
         size in bytes - what size will cause log rolling (default 1mb)
   --config=path, -c
         valid path - specify config file (default @dbdir/dirwatch.conf)
   --template=[path]
         valid path - generate a template config file in path (default stdout)
   --help, -h
         this message
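
   the '--verbosity' scale runs in the opposite direction to the severity
   constants in ruby's standard Logger, which can be confusing when writing
   triggers that log alongside dirwatch. the mapping below is a sketch: the
   helper name 'logger_level' is hypothetical, and whether dirwatch itself
   uses Logger is an assumption the text above does not confirm.

```ruby
require 'logger'

# translate a dirwatch verbosity (0..4 or a level name) to a Logger severity
def logger_level(verbosity)
  names = %w[fatal error warn info debug]   # dirwatch order: 0 .. 4
  v     = verbosity.to_s.downcase
  idx   = v.match?(/\A[0-4]\z/) ? v.to_i : names.index(v)
  raise ArgumentError, "bad verbosity: #{verbosity}" if idx.nil?

  Logger.const_get(names[idx].upcase)       # e.g. Logger::INFO
end

log = Logger.new($stderr)
log.level = logger_level('info')            # the documented default
log.info('inventory complete')
log.debug('suppressed at the default verbosity')
```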

EXAMPLES

   0) initialize a directory for watching (dbdir = directory/.dirwatch/0/)

     ~ > dirwatch dir create

   1) initialize another watch (the '1' is optional)

     ~ > dirwatch dir create 1

   2) create a config (to edit afterwards)

     ~ > dirwatch template config
     ~ > vi config

   3) watch a directory using all defaults, logging to stderr

     ~ > dirwatch dir watch

   4) start daemon to watch a directory using all defaults, daemons log to
      dbdir/dirwatch.log by default

     ~ > dirwatch dir watch -D

   5) same as above but use dbdir .dirwatch/2/

     ~ > dirwatch dir watch 2 -D

   6) dump contents of database (dbdir = .dirwatch/0/) in yaml format

     ~ > dirwatch dir list

   7) same as above but use dbdir .dirwatch/2/

     ~ > dirwatch dir list 2

   8) crontab entry to keep alive a watcher for a directory using default dbdir,
      watching files only and not recursing into subdirectories

      */15 * * * * /full/path/to/dirwatch /full/path/to/directory w -D -f -F -q

   9) another watch on that same directory using different dbdir (7). this one
      watches all entries and recurses into subdirectories

      */15 * * * * /full/path/to/dirwatch /full/path/to/directory w 7 -D -q

enjoy.

-a
--

EMAIL :: Ara [dot] T [dot] Howard [at] noaa [dot] gov
PHONE :: 303.497.6469
When you do something, you should burn yourself completely, like a good
bonfire, leaving no trace of yourself. --Shunryu Suzuki
