Strategies for autoloading ruby files

Hi all,

Following up on my post about complicated problems, here's a much simpler problem that I need to tackle.

There are many parts of my framework that are extensible, and I want people to be able to create extensions without modifying the core framework, akin to how you don't have to specifically load each model and controller in Rails.

Is there a relatively standard way of doing that with Ruby? This is what I'm doing in one of my projects, Facter[1]:

         # Now see if we can find any other facts
         $:.each do |dir|
             fdir = File.join(dir, "facter")
             if File.directory?(fdir)
                 Dir.glob("#{fdir}/*.rb").each do |file|
                     # Load here, rather than require, because otherwise
                     # the facts won't get reloaded if someone calls
                     # "loadfacts". Really only important in testing,
                     # but, well, it's important in testing.
                     begin
                         load file
                     rescue => detail
                         warn "Could not load %s: %s" %
                             [file, detail]
                     end
                 end
             end
         end

I've tried further copying Rails by checking that loaded files create the objects it seems they should create (e.g., here they should create a fact named after the file), but users constantly complained about those warnings.

Is this the best way? Any other recommended mechanisms?

I'd actually like the autoloading to work either on-demand (e.g., look for 'facter/myfact' if I ask for the 'myfact' fact), or load all of them right now.

Hmm. In fact, it'd be ideal if you could write a module that allows you to register autoload paths (e.g., I'd use 'facter', 'puppet/type', 'puppet/type/package', 'puppet/type/service', and lots more), and then just have hooks into that module so that you could autoload all instances, or just load a specific name.

1 - http://reductivelabs.com/projects/facter

--
The Number 1 Sign You Have Nothing to Do at Work...
    The 4th Division of Paperclips has overrun the Pushpin Infantry
    and General White-Out has called for a new skirmish.
---------------------------------------------------------------------
Luke Kanies | http://reductivelabs.com | http://madstop.com

this may be helpful (or not! ;-))

   http://codeforpeople.com/lib/ruby/dynaload/

i use it for plugins in my satellite processing system like so

   class Plugin
     # 'other' is the new subclass; export it, not the parent
     def self.inherited other
       Dynaload::export other, 'plugin' => true
     end
   end

then plugins are written something like

   require 'plugin'

   class MyPlugIn < Plugin

     #
     # stuff
     #
   end

then, assume that is saved in myplugin.rb, then i can do

   require 'dynaload'

   loaded = Dynaload::dynaload 'myplugin.rb'

   plugins = loaded.select{|klass, attributes| attributes['plugin'] == true}

make sense?

the reason it's at 0.0.0 is that it just worked for me - been using in
production for a long time.

cheers.

-a
--
we can never obtain peace in the outer world until we make peace with
ourselves.
- h.h. the 14th dalai lama

You can use `Module.const_missing` to autoload a file the first time the class in that file is mentioned in the code. So you could write an `Object.const_missing(name)` method that translates the constant into a file name of an expected format, then load that file on-the-fly.

Amusingly, this is exactly the code example that's given in the Pickaxe version 2 under Module.const_missing, along with a note that the code is in very poor style and is a "perverse kind of autoload functionality".
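A minimal sketch of the `const_missing` idea, for illustration only. The temporary directory, the `PLUGIN_DIR` constant, the `MyFact` class and the "CamelCase constant maps to snake_case file" convention are all assumptions made up for this example; none of them come from Facter or the Pickaxe.

```ruby
require "tmpdir"

# Hypothetical layout: a constant named MyFact lives in <PLUGIN_DIR>/my_fact.rb.
PLUGIN_DIR = Dir.mktmpdir
File.write(File.join(PLUGIN_DIR, "my_fact.rb"), <<~RUBY)
  class MyFact
    def value
      42
    end
  end
RUBY

class Object
  # Ruby calls this hook when a constant lookup fails.
  def self.const_missing(name)
    # Translate the constant name into a file name: MyFact -> my_fact
    base = name.to_s.gsub(/([a-z\d])([A-Z])/, '\1_\2').downcase
    path = File.join(PLUGIN_DIR, "#{base}.rb")
    # No such file: fall back to the default behaviour (NameError).
    return super unless File.file?(path)

    load path
    # Guard against a file that does not define the constant it is named after.
    return super unless const_defined?(name, false)

    const_get(name, false)
  end
end

# The first mention of MyFact triggers the load.
fact = MyFact.new
```

Because the hook is defined on `Object`, it affects every failed constant lookup in the program, which is part of why this style is usually discouraged.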

:slight_smile:

-- Brian

I don't know about recommended, but I've made two or three programs
using plugins, and here's my method, in the main script :

# (class Plugin, shown below, must already be defined at this point)
Dir.glob($conf[:path]) do |f|
  $mtime = File.mtime(f)
  begin
    require f
  rescue LoadError => err
    puts err
  end
end
Plugin.plugins.each do |p|
  ...
end

class Plugin
  @@plugins = []
  @@mtimes = {}
  # Called once for every subclass; records it with the current file's mtime
  def Plugin.inherited(c)
    @@plugins.push(c)
    @@mtimes[c.to_s] = $mtime
  end
  def Plugin.plugins
    @@plugins
  end
  def Plugin.mtimes
    @@mtimes
  end
end

And the plugins just have to :

class LittlePlugin < Plugin
  ...
end

Note : I needed here to know the modification time of the plugin to
eventually reinitialize some data. I think that using a global variable
is more than clunky. Any thoughts on a better way ?
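One possible way around the global, sketched under assumptions: the loader compares the plugin list before and after each `require` and stamps whatever was added with that file's modification time. The `load_file` method name, the temporary directory and the choice to key `mtimes` by class rather than by class name are all made up for this example.

```ruby
require "tmpdir"

class Plugin
  @plugins = []
  @mtimes  = {}

  class << self
    attr_reader :plugins, :mtimes

    # Register every subclass on the root Plugin class.
    def inherited(klass)
      super
      Plugin.plugins << klass
    end

    # Hypothetical helper: load one file and stamp the classes it added
    # with the file's mtime, so no global variable is needed.
    def load_file(path)
      known = Plugin.plugins.dup
      require path
      (Plugin.plugins - known).each { |klass| Plugin.mtimes[klass] = File.mtime(path) }
    rescue LoadError => err
      warn err.message
    end
  end
end

# Demo with a throwaway plugin directory (an assumption for this sketch).
dir  = Dir.mktmpdir
file = File.join(dir, "little_plugin.rb")
File.write(file, "class LittlePlugin < Plugin; end\n")

Dir.glob(File.join(dir, "*.rb")).each { |f| Plugin.load_file(f) }
```

The before/after comparison assumes plugin files are loaded one at a time from a single thread.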

Fred

On 1 August at 23:53, Luke Kanies wrote:

Is this the best way? Any other recommended mechanisms?

--
The DBAs will be whining because when you kill -9 Oracle, you get bits
and pieces left all over the place, and when re-started Oracle gets all
squeamish and frightened about what happened to it's predecessor. (Its
very literal pre-decessor...) (Dan Holdsworth in the SDM)

On Aug 1, 2006, at 5:04 PM, Brian Palmer wrote:

You can use `Module.const_missing` to autoload a file the first time the class in that file is mentioned in the code. So you could write an `Object.const_missing(name)` method that translates the constant into a file name of an expected format, then load that file on-the-fly.

Ruby has a function for this though:

$ ri -T Kernel#autoload
-------------------------------------------------------- Kernel#autoload
      autoload(module, filename) => nil
------------------------------------------------------------------------
      Registers filename to be loaded (using Kernel::require) the first
      time that module (which may be a String or a symbol) is accessed.

         autoload(:MyModule, "/usr/local/lib/modules/my_module.rb")

James Edward Gray II
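A small illustration of `autoload` in action. The temporary file and the `greet` method are invented for this sketch; only `autoload` and `autoload?` are standard Ruby.

```ruby
require "tmpdir"

# Hypothetical file standing in for a real library file.
dir  = File.realpath(Dir.mktmpdir)
path = File.join(dir, "my_module.rb")
File.write(path, <<~RUBY)
  module MyModule
    def self.greet
      "hello"
    end
  end
RUBY

# Register the file; nothing is loaded yet.
autoload(:MyModule, path)
pending_before = Object.autoload?(:MyModule)  # non-nil while the load is pending

# The first reference to the constant triggers the require.
greeting = MyModule.greet
pending_after = Object.autoload?(:MyModule)   # nil once the constant is loaded
```

Since the file goes through `require`, it is loaded only once, so this does not cover the reload-on-demand case that the `load` call in the original post handles.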

Brian Palmer wrote:

You can use `Module.const_missing` to autoload a file the first time the class in that file is mentioned in the code. So you could write an `Object.const_missing(name)` method that translates the constant into a file name of an expected format, then load that file on-the-fly.

Ironically, I use almost no constants in my code. Parent classes get constants, but subclasses are almost always created dynamically[1], like so:

   Puppet::Type.newtype(:user) do
     ...
   end

Then I always refer to that class by the name 'user' and never by the constant. This is partially because it's easier for me, but also because I've written a custom language (no, not a DSL[2]) for Puppet, and most of those names are exposed in that language. E.g., you create a 'user' type like above, and users can create user elements in Puppet:

   user { luke:
     comment => "Luke Kanies",
     uid => 100,
     shell => bash,
     ensure => exists
   }

Puppet supports lots of package types (e.g., dpkg, apt, rpm, darwinports, gems) and service types (init, smf, etc.); I specify the name when I create the class, and users can select those types by name, either in the language or directly in Ruby if they're directly accessing the library.

In other words, my problem is not how to hook into the autoload, it's the guts behind it.

Amusingly, this is exactly the code example that's given in the Pickaxe version 2 under Module.const_missing, along with a note that the code is in very poor style and is a "perverse kind of autoload functionality".

1 - I do assign constants to them, but only because class inspection during error printing did not work consistently unless there's a constant assigned to the class. I just tried reproducing this but could not; I don't remember the details well enough, I guess.

2 - See the FAQ, http://reductivelabs.com/projects/puppet/faq.html, the fourth question.

--
An expert is a person who has made all the mistakes that can be made
in a very narrow field. - Niels Bohr
---------------------------------------------------------------------
Luke Kanies | http://reductivelabs.com | http://madstop.com

this may be helpful (or not! ;-))

  http://codeforpeople.com/lib/ruby/dynaload/

Clearly, I should always check your codebase before asking any questions. :slight_smile:

i use it for plugins in my satellite processing system like so

  class Plugin
    # 'other' is the new subclass; export it, not the parent
    def self.inherited other
      Dynaload::export other, 'plugin' => true
    end
  end

then plugins are written something like

  require 'plugin'

  class MyPlugIn < Plugin
    #
    # stuff
    #
  end

then, assume that is saved in myplugin.rb, then i can do

  require 'dynaload'

  loaded = Dynaload::dynaload 'myplugin.rb'

  plugins = loaded.select{|klass, attributes| attributes['plugin'] == true}

make sense?

Yeah. Half of my question, though, was figuring out which files to load, although that's easy enough to resolve, I guess. I'm assuming it does get moderately expensive as I do more of these, but I normally load by name (e.g., look for 'puppet/type/package/dpkg.rb' anywhere in the search path), and only sometimes load all files in all search directories.

I started out using the self.inherited method for all of this kind of work, but it just got too cumbersome eventually, at least partially because the object passed to 'inherited' has not yet been initialized. That is, with the following code:

   class MyClass < ParentClass
     @name = :myclass
   end

When 'inherited' is called, '@name' is not set, so you have to stick it in an array and then do shenanigans later to make it act like a hash. This is clearly a trivial example, because 'name' can (usually) be autodetermined from the class name, but you also can't test whether methods are defined, for instance. With the 'new<thing>' methods, I can do any initialization on the class that I want (e.g., setting @name using accessor methods), class_eval the passed block, and then do any analysis I need. See [1].
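The timing difference can be seen in a short sketch. The `newthing` factory and the `seen_by_inherited` and `types` readers are hypothetical names for this example; this is not Puppet's code, only the same general shape as a 'new<thing>' method.

```ruby
class ParentClass
  class << self
    attr_reader :seen_by_inherited, :types

    # Runs as soon as "class X < ParentClass" is opened, before the class body.
    def inherited(sub)
      super
      (@seen_by_inherited ||= {})[sub] = sub.instance_variable_get(:@name)
    end

    # Hypothetical factory: initialize first, evaluate the block, then inspect.
    def newthing(name, &block)
      klass = Class.new(self)
      klass.instance_variable_set(:@name, name)
      klass.class_eval(&block) if block
      # At this point @name is set and the block's methods are defined.
      (@types ||= {})[name] = klass
      klass
    end
  end
end

class MyClass < ParentClass
  @name = :myclass
end

thing = ParentClass.newthing(:user) do
  def shell
    "bash"
  end
end
```

Here `ParentClass.seen_by_inherited[MyClass]` is nil even though `MyClass` ends up with `@name` set, while the class returned by `newthing` is fully set up before it is registered.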

When I switched from using inherited to creating 'new<thing>' methods, it unleashed a flood of productivity. I was astounded -- I barely slept for two weeks because so much was suddenly so easy when it had been so difficult for so long.

1 - http://reductivelabs.com/downloads/puppet/apidocs/classes/Puppet/Type.html#M000597

Notice all of the other 'new*' methods around this one. I use it everywhere and I love it.

ara.t.howard@noaa.gov wrote:

--
Neonle will continue to be rude, and will nretend that you had a small
stroke which makes you unable to say or see the letter "n". Stunid
nractical joke, if you ask me. Bunch of noon-heads, huh?
                 -- Fred Barling, Humorscope
---------------------------------------------------------------------
Luke Kanies | http://reductivelabs.com | http://madstop.com

Ruby has a function for this though:

$ ri -T Kernel#autoload
-------------------------------------------------------- Kernel#autoload
     autoload(module, filename) => nil
------------------------------------------------------------------------
     Registers filename to be loaded (using Kernel::require) the first
     time that module (which may be a String or a symbol) is accessed.

        autoload(:MyModule, "/usr/local/lib/modules/my_module.rb")

James Edward Gray II

Cool, I didn't realize that was already baked in to Kernel.

On Aug 1, 2006, at 4:11 PM, James Edward Gray II wrote:

On Aug 1, 2006, at 4:22 PM, Luke Kanies wrote:

Puppet supports lots of package types (e.g., dpkg, apt, rpm, darwinports, gems) and service types (init, smf, etc.); I specify the name when I create the class, and users can select those types by name, either in the language or directly in Ruby if they're directly accessing the library.

In other words, my problem is not how to hook into the autoload, it's the guts behind it.

I misunderstood your question, sorry about that.

-- Brian

Yeah. Half of my question, though, was figuring out which files to load,
although that's easy enough to resolve, I guess. I'm assuming it does get
moderately expensive as I do more of these, but I normally load by name
(e.g., look for 'puppet/type/package/dpkg.rb' anywhere in the search path),
and only sometimes load all files in all search directories.

indeed. that one's pretty project specific. in my case i have a well known
location(s) for plugins...

I started out using the self.inherited method for all of this kind of work, but it just got too cumbersome eventually, at least partially because the object passed to 'inherited' has not yet been initialized. That is, with the following code:

class MyClass < ParentClass
   @name = :myclass
end

When 'inherited' is called, '@name' is not set, so you have to stick it in
an array and then do shenanigans later to make it act like a hash. This is
clearly a trivial example, because 'name' can (usually) be autodetermined
from the class name, but you also can't test whether methods are defined,
for instance. With the 'new<thing>' methods, I can do any initialization on
the class that I want (e.g., setting @name using accessor methods),
class_eval the passed block, and then do any analysis I need. See [1].

hmmm. that's interesting. another approach would be

   module Plugable
     def self.included other
       ...
     end
   end

   class MyPlugin
     include Plugable
   end

i've not run into that issue though.

When I switched from using inherited to creating 'new<thing>' methods, it
unleashed a flood of productivity. I was astounded -- I barely slept for
two weeks because so much was suddenly so easy when it had been so difficult
for so long.

1 - http://reductivelabs.com/downloads/puppet/apidocs/classes/Puppet/Type.html#M000597

Notice all of the other 'new*' methods around this one. I use it everywhere and I love it.

yes - i've got a fair number of class factories lying around myself.
Class.new is powerful since you can easily parameterize your classes.

on a related note i used 'traits' a lot with this sort of thing since it avoids
needing to initialize all those class vars directly down the inheritance
chain.

i'll read over some more of your code but gotta run to get the kid...

ciao.

-a

On Wed, 2 Aug 2006, Luke Kanies wrote:
--
we can never obtain peace in the outer world until we make peace with
ourselves.
- h.h. the 14th dalai lama

  module Plugable
    def self.included other
      ...
    end
  end

  class MyPlugin
    include Plugable
  end

At this point, I'm looking at creating an Autoload class, and classes wanting to autoload will create an instance:

   Puppet::Autoload.new(self, "puppet/type")

They'll then be able to store that instance, or retrieve it from the class:

   Puppet::Autoload[self].load(:package)

That'll handle the mechanics of the loading. I'll have an additional :loadall method that will search through $:, loading everything that matches.
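A rough sketch of what such a class could look like. This is not Puppet's implementation: the `Autoloader` and `DemoType` names, the `newtype` method and the temporary directory added to `$LOAD_PATH` are all assumptions for illustration. It covers both modes described above, loading one name on demand and loading everything found in the search path.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical sketch, not Puppet's actual Autoload class.
class Autoloader
  @instances = {}

  class << self
    # Autoloader[owner] returns the loader registered for owner.
    def [](owner)
      @instances[owner]
    end

    def register(owner, loader)
      @instances[owner] = loader
    end
  end

  attr_reader :loaded

  def initialize(owner, subdir)
    @subdir = subdir
    @loaded = []
    self.class.register(owner, self)
  end

  # Load the first <dir>/<subdir>/<name>.rb found in $LOAD_PATH.
  def load(name)
    $LOAD_PATH.each do |dir|
      file = File.join(dir.to_s, @subdir, "#{name}.rb")
      return load_file(file) if File.file?(file)
    end
    false
  end

  # Load every matching file in every search directory.
  def loadall
    $LOAD_PATH.each do |dir|
      Dir.glob(File.join(dir.to_s, @subdir, "*.rb")).sort.each { |file| load_file(file) }
    end
  end

  private

  # Kernel.load (not require) so that files can be reloaded.
  def load_file(file)
    Kernel.load(file)
    @loaded << file
    true
  rescue StandardError, ScriptError => detail
    warn "Could not load #{file}: #{detail}"
    false
  end
end

# Demo: a class with a 'new<thing>' style method and two plugin files.
class DemoType
  @types = {}
  class << self
    attr_reader :types

    def newtype(name)
      @types[name] = Class.new(self)
    end
  end
end

root = Dir.mktmpdir
FileUtils.mkdir_p(File.join(root, "demo/type"))
File.write(File.join(root, "demo/type/package.rb"), "DemoType.newtype(:package)\n")
File.write(File.join(root, "demo/type/service.rb"), "DemoType.newtype(:service)\n")
$LOAD_PATH.unshift(root)

Autoloader.new(DemoType, "demo/type")
found   = Autoloader[DemoType].load(:package)      # on-demand, by name
missing = Autoloader[DemoType].load(:nonexistent)  # no such file anywhere
```

The loader only finds and loads files; what a loaded file does with the class stays in the class's own `newtype` method, which matches the split described in this message.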

I want to keep the work around plugin loading -- that is, what to do with the class and how to initialize it -- in the class supporting plugins (i.e., the class with the 'new<thing>' method), because most initialization is pretty custom. I suppose I could start with a default method in a module, though; I hadn't thought much about that.

A significant part of this for me is that I have lots of classes that are related to each other -- the Puppet::Type class is related to its subclasses, the Puppet::Type::Package class is related to the Puppet::Type::package::PkgType classes (ugh, constants are hard), the 'service' class is related to each 'svctype' class, etc. Calling 'new<thing>' directly on the related class makes that relationship very explicit and pleasantly clear.

yes - i've got a fair number of class factories lying around myself.
Class.new is powerful since you can easily parameterize your classes.

I *love* Class.new and Module.new. Ruby was great before I discovered them, but with them, I'm definitely hooked. Well, Class.new and class_eval, anyway.

on a related note i used 'traits' a lot with this sort of thing since it avoids
needing to initialize all those class vars directly down the inheritance
chain.

Most of the variables involved are actually class instance variables (which, I understand, others don't use that much, but which I use pervasively and could not live without).

I need to just add traits into my codebase; I think it would simplify a lot of what I'm doing.

/me adds one more item to the todo

i'll read over some more of your code but gotta run to get the kid...

Comments, positive or negative, are always appreciated.

ara.t.howard@noaa.gov wrote:

--
Today I dialed a wrong number...The other person said, "Hello?" and
I said, "Hello, could I speak to Joey?"...
They said, "Uh...I don't think so...he's only 2 months old."
I said, "I'll wait." -- Steven Wright
---------------------------------------------------------------------
Luke Kanies | http://reductivelabs.com | http://madstop.com