Bug in require?

[ahoward@localhost ~]$ strace ruby -e' 1.times{ require "digest/md5.so" }' 2>&1 |grep open|grep md5
   open("/usr/local/lib/ruby/1.8/i686-linux/digest/md5.so", O_RDONLY|O_LARGEFILE) = 3
   open("/usr/local/lib/ruby/1.8/i686-linux/digest/md5.so", O_RDONLY) = 3

   [ahoward@localhost ~]$ strace ruby -e' 2.times{ require "digest/md5.so" }' 2>&1 |grep open|grep md5
   open("/usr/local/lib/ruby/1.8/i686-linux/digest/md5.so", O_RDONLY|O_LARGEFILE) = 3
   open("/usr/local/lib/ruby/1.8/i686-linux/digest/md5.so", O_RDONLY) = 3

   [ahoward@localhost ~]$ strace ruby -e' 3.times{ require "digest/md5.so" }' 2>&1 |grep open|grep md5
   open("/usr/local/lib/ruby/1.8/i686-linux/digest/md5.so", O_RDONLY|O_LARGEFILE) = 3
   open("/usr/local/lib/ruby/1.8/i686-linux/digest/md5.so", O_RDONLY) = 3

   [ahoward@localhost ~]$ strace ruby -e' 4.times{ require "digest/md5.so" }' 2>&1 |grep open|grep md5
   open("/usr/local/lib/ruby/1.8/i686-linux/digest/md5.so", O_RDONLY|O_LARGEFILE) = 3
   open("/usr/local/lib/ruby/1.8/i686-linux/digest/md5.so", O_RDONLY) = 3

   [ahoward@localhost ~]$ cat a.rb
   4.times{ require 'digest/md5' }

   [ahoward@localhost ~]$ strace ruby a.rb 2>&1 |grep open|grep md5
   open("/usr/local/lib/ruby/1.8/i686-linux/digest/md5.so", O_RDONLY|O_LARGEFILE) = 3
   open("/usr/local/lib/ruby/1.8/i686-linux/digest/md5.so", O_RDONLY) = 3
   open("/usr/local/lib/ruby/1.8/i686-linux/digest/md5.so", O_RDONLY|O_LARGEFILE) = 3

why the difference?

-a

···

--

email :: ara [dot] t [dot] howard [at] noaa [dot] gov
phone :: 303.497.6469
Your life dwells amoung the causes of death
Like a lamp standing in a strong breeze. --Nagarjuna

===============================================================================

[ahoward@localhost ~]$ cat a.rb

4.times{ require 'digest/md5' }

[ahoward@localhost ~]$ strace ruby a.rb 2>&1 |grep open|grep md5
open("/usr/local/lib/ruby/1.8/i686-linux/digest/md5.so",
O_RDONLY|O_LARGEFILE) = 3
open("/usr/local/lib/ruby/1.8/i686-linux/digest/md5.so",
O_RDONLY|O_LARGEFILE) = 3
open("/usr/local/lib/ruby/1.8/i686-linux/digest/md5.so", O_RDONLY) = 3
open("/usr/local/lib/ruby/1.8/i686-linux/digest/md5.so",
O_RDONLY|O_LARGEFILE) = 3
open("/usr/local/lib/ruby/1.8/i686-linux/digest/md5.so",
O_RDONLY|O_LARGEFILE) = 3
open("/usr/local/lib/ruby/1.8/i686-linux/digest/md5.so",
O_RDONLY|O_LARGEFILE) = 3

why the difference?

If you look at the stracefile, you'll notice that the last three times it
just open and closes the file, without reading. I'm very interested in this,
because I've noticed the same behaviour with my Rails application, and there
the search path is _huge_.

Han Holl

-a

···

--

===============================================================================
> email :: ara [dot] t [dot] howard [at] noaa [dot] gov
> phone :: 303.497.6469
> Your life dwells amoung the causes of death
> Like a lamp standing in a strong breeze. --Nagarjuna

===============================================================================

  [ahoward@localhost ~]$ strace ruby -e' 4.times{ require "digest/md5.so"

···

On Fri, Sep 16, 2005 at 12:18:33PM +0900, Ara.T.Howard wrote:
                                                                       ===
[...]

  [ahoward@localhost ~]$ cat a.rb
  4.times{ require 'digest/md5' }

                               ===

why the difference?

batsman@tux-chan:/tmp$ strace ruby -e' 4.times{ require "digest/md5.so" }' 2>&1 |grep open|grep md5
open("/home/batsman/usr//lib/ruby/1.8/i686-linux/digest/md5.so", O_RDONLY|O_LARGEFILE) = 3
open("/home/batsman/usr//lib/ruby/1.8/i686-linux/digest/md5.so", O_RDONLY) = 3
batsman@tux-chan:/tmp$ strace ruby -e' 4.times{ require "digest/md5" }' 2>&1 |grep open|grep md5
open("/home/batsman/usr//lib/ruby/1.8/i686-linux/digest/md5.so", O_RDONLY|O_LARGEFILE) = 3
open("/home/batsman/usr//lib/ruby/1.8/i686-linux/digest/md5.so", O_RDONLY|O_LARGEFILE) = 3
open("/home/batsman/usr//lib/ruby/1.8/i686-linux/digest/md5.so", O_RDONLY) = 3
open("/home/batsman/usr//lib/ruby/1.8/i686-linux/digest/md5.so", O_RDONLY|O_LARGEFILE) = 3
open("/home/batsman/usr//lib/ruby/1.8/i686-linux/digest/md5.so", O_RDONLY|O_LARGEFILE) = 3
open("/home/batsman/usr//lib/ruby/1.8/i686-linux/digest/md5.so", O_RDONLY|O_LARGEFILE) = 3

If no extension is given, ruby verifies if any new 'feature' can be loaded:

batsman@tux-chan:/tmp$ strace ruby -rfileutils -e'require "digest/md5"; FileUtils.mkdir("digest"); File.open("digest/md5.rb","w"){|f| f.puts %{puts "==== opened & loaded md5.rb"} }; $:.unshift "."; require "digest/md5" ' 2>&1 |grep open|grep md5
execve("/home/batsman/usr/bin/ruby", ["ruby", "-rfileutils", "-erequire \"digest/md5\"; FileUtils.mkdir(\"digest\"); File.open(\"digest/md5.rb\",\"w\"){|f| f.puts %{puts \"==== opened & loaded md5.rb\"} }; $:.unshift \".\"; require \"digest/md5\" "], [/* 31 vars */]) = 0
open("/home/batsman/usr//lib/ruby/1.8/i686-linux/digest/md5.so", O_RDONLY|O_LARGEFILE) = 3
open("/home/batsman/usr//lib/ruby/1.8/i686-linux/digest/md5.so", O_RDONLY|O_LARGEFILE) = 3
open("/home/batsman/usr//lib/ruby/1.8/i686-linux/digest/md5.so", O_RDONLY) = 3
open("digest/md5.rb", O_WRONLY|O_CREAT|O_TRUNC|O_LARGEFILE, 0666) = 3
write(3, "puts \"==== opened & loaded md5.r"..., 35) = 35
open("./digest/md5.rb", O_RDONLY|O_LARGEFILE) = 3
open("./digest/md5.rb", O_RDONLY|O_LARGEFILE) = 3
open("./digest/md5.rb", O_RDONLY|O_LARGEFILE) = 3
open("./digest/md5.rb", O_RDONLY|O_LARGEFILE) = 3
open("./digest/md5.rb", O_RDONLY|O_LARGEFILE) = 3
read(3, "puts \"==== opened & loaded md5.r"..., 4096) = 35
write(1, "==== opened & loaded md5.rb\n", 28==== opened & loaded md5.rb

--
Mauricio Fernandez

static int
file_load_ok(const char *file)
{
    FILE *f;

    if (!file) return 0;
    f = fopen(file, "r");
    if (f == NULL) return 0;
    fclose(f);
    return 1;
}

(file.c)

···

On Fri, Sep 16, 2005 at 07:14:21PM +0900, Han Holl wrote:

If you look at the stracefile, you'll notice that the last three times it
just open and closes the file, without reading. I'm very interested in this,
because I've noticed the same behaviour with my Rails application, and there
the search path is _huge_.

--
Mauricio Fernandez

yeah exactly - i'm trying to speed up acgi running strace's and was seeing md5
get opened like 20 times. this sure could slow something like rails down
alright.

guess this is the point where it's be smart to look at the source for
rb_require...

cheers.

-a

···

On Fri, 16 Sep 2005, Han Holl wrote:

[ahoward@localhost ~]$ cat a.rb

4.times{ require 'digest/md5' }

[ahoward@localhost ~]$ strace ruby a.rb 2>&1 |grep open|grep md5
open("/usr/local/lib/ruby/1.8/i686-linux/digest/md5.so",
O_RDONLY|O_LARGEFILE) = 3
open("/usr/local/lib/ruby/1.8/i686-linux/digest/md5.so",
O_RDONLY|O_LARGEFILE) = 3
open("/usr/local/lib/ruby/1.8/i686-linux/digest/md5.so", O_RDONLY) = 3
open("/usr/local/lib/ruby/1.8/i686-linux/digest/md5.so",
O_RDONLY|O_LARGEFILE) = 3
open("/usr/local/lib/ruby/1.8/i686-linux/digest/md5.so",
O_RDONLY|O_LARGEFILE) = 3
open("/usr/local/lib/ruby/1.8/i686-linux/digest/md5.so",
O_RDONLY|O_LARGEFILE) = 3

why the difference?

If you look at the stracefile, you'll notice that the last three times it
just open and closes the file, without reading. I'm very interested in this,
because I've noticed the same behaviour with my Rails application, and there
the search path is _huge_.

--

email :: ara [dot] t [dot] howard [at] noaa [dot] gov
phone :: 303.497.6469
Your life dwells amoung the causes of death
Like a lamp standing in a strong breeze. --Nagarjuna

===============================================================================

Yes, that's answering where it happens.
Remains the question why.

Han

···

On 9/16/05, Mauricio Fernández <mfp@acm.org> wrote:

On Fri, Sep 16, 2005 at 07:14:21PM +0900, Han Holl wrote:
> If you look at the stracefile, you'll notice that the last three times
it
> just open and closes the file, without reading. I'm very interested in
this,
> because I've noticed the same behaviour with my Rails application, and
there
> the search path is _huge_.

static int
file_load_ok(const char *file)
{
FILE *f;

if (!file) return 0;
f = fopen(file, "r");
if (f == NULL) return 0;
fclose(f);
return 1;
}

(file.c)

--
Mauricio Fernandez

Hi,

···

In message "Re: bug in require?" on Fri, 16 Sep 2005 20:56:38 +0900, Han Holl <han.holl@gmail.com> writes:

Yes, that's answering where it happens.
Remains the question why.

It was the easiest way to check if a file is loadable (i.e. a file
exists and has proper permissions). Maybe it's possible to replace it
with access(3).

              matz.

commandline and from a script.
Only now I see that the difference is that from the commandline he requires
"digest/md5.so", and from the file
'digest/md5', without the extension.
Which means that you get a considerable perfomance penalty by omitting the
extension. In a gem-installed rails environment the seachpath can easily be
25 directories. Multiple requires withou extension can be costly there.

Cheers,

Han

···

On 9/16/05, Yukihiro Matsumoto <matz@ruby-lang.org> wrote:

Hi,

In message "Re: bug in require?" > on Fri, 16 Sep 2005 20:56:38 +0900, Han Holl <han.holl@gmail.com> writes:

>Yes, that's answering where it happens.
>Remains the question why.

It was the easiest way to check if a file is loadable (i.e. a file
exists and has proper permissions). Maybe it's possible to replace it
with access(3).

matz.

Ara's question was what the difference was between execution from the

right. try this fix for your rails app:

   harp:~ > strace ruby -e'4.times{ require "digest/md5" }' 2>&1 |egrep 'open.*md5'
   open("/usr/lib/ruby/1.6/i386-linux-gnu/digest/md5.so", O_RDONLY) = 3

   harp:~ > strace ruby -r ./require.rb -e'4.times{ require "digest/md5" }' 2>&1 |egrep 'open.*md5'
   open("/usr/lib/ruby/1.6/i386-linux-gnu/digest/md5.so", O_RDONLY) = 3

   harp:~ > cat ./require.rb
   require 'rbconfig.rb'
   module Kernel
     alias_method '__require__', 'require'
     def require lib
       if((ext = lib[ %r/\.[^\.]+$/ ]))
         __require__ lib
       else
         dlext, dlext2 = ::Config::CONFIG['DLEXT'], ::Config::CONFIG['DLEXT2']
         __require__ "#{ lib }.#{ dlext }" rescue
         __require__ "#{ lib }.#{ dlext2 }" rescue
         __require__ "#{ lib }.rb"
       end
     end
   end

cheers.

-a

···

On Fri, 16 Sep 2005, Han Holl wrote:

On 9/16/05, Yukihiro Matsumoto <matz@ruby-lang.org> wrote:

Hi,

In message "Re: bug in require?" >> on Fri, 16 Sep 2005 20:56:38 +0900, Han Holl <han.holl@gmail.com> writes:

>Yes, that's answering where it happens.
>Remains the question why.

It was the easiest way to check if a file is loadable (i.e. a file
exists and has proper permissions). Maybe it's possible to replace it
with access(3).

matz.

Ara's question was what the difference was between execution from the

commandline and from a script. Only now I see that the difference is that
from the commandline he requires "digest/md5.so", and from the file
'digest/md5', without the extension. Which means that you get a
considerable perfomance penalty by omitting the extension. In a
gem-installed rails environment the seachpath can easily be 25 directories.
Multiple requires withou extension can be costly there.

--

email :: ara [dot] t [dot] howard [at] noaa [dot] gov
phone :: 303.497.6469
Your life dwells amoung the causes of death
Like a lamp standing in a strong breeze. --Nagarjuna

===============================================================================

Hi,

At Fri, 16 Sep 2005 23:24:16 +0900,
Han Holl wrote in [ruby-talk:156435]:

Ara's question was what the difference was between execution from the
commandline and from a script.
Only now I see that the difference is that from the commandline he requires
"digest/md5.so", and from the file
'digest/md5', without the extension.
Which means that you get a considerable perfomance penalty by omitting the
extension. In a gem-installed rails environment the seachpath can easily be
25 directories. Multiple requires withou extension can be costly there.

If md5.rb exists in loadpath, it should be loaded even if
md5.so has been loaded.

···

--
Nobu Nakada

Yes, that's one way of dealing with it.
Another approach that I find attractive is autoload. I think in the case of
the Digest namespace, one could write:
module Digest
autoload :MD5, 'digest/md5'
end
Autoloading also has the advantage of looser coupling, lazy loading and
auto-determining load order in case of (almost cyclic) dependencies.
It should perform better than redundant requires as well, but I doubt this
is noticeable.

Cheers,

Han Holl

···

On 9/16/05, Ara.T.Howard <Ara.T.Howard@noaa.gov> wrote:

harp:~ > cat ./require.rb
require 'rbconfig.rb'
module Kernel
alias_method '__require__', 'require'
def require lib
if((ext = lib[ %r/\.[^\.]+$/ ]))
__require__ lib
else
dlext, dlext2 = ::Config::CONFIG['DLEXT'], ::Config::CONFIG['DLEXT2']
__require__ "#{ lib }.#{ dlext }" rescue
__require__ "#{ lib }.#{ dlext2 }" rescue
__require__ "#{ lib }.rb"
end
end
end

I don't doubt for a minute that you are right, but in practice this will of
course hardly ever occur, and with the gem-induced explosion of the search
path it is probably worthwhile to do something about it: rewriting require,
or introducing a specialized require with a different name, or autoloading.

Intuitively, if I say: require 'some/feature' , I would expect (almost)
nothing to happen if this feature had been required before. About 50 failing
lookups in directories is not almost nothing.
Maybe we could have some global flag meaning: I don't need this overly
correct behaviour, please do nothing if either .rb of .so (or .dll) has been
loaded ?

Cheers,

Han Holl

···

On 9/16/05, nobu.nokada@softhome.net <nobu.nokada@softhome.net> wrote:

If md5.rb exists in loadpath, it should be loaded even if
md5.so has been loaded.

In article <7348fe71050916123626c96aa4@mail.gmail.com>,
han.holl@gmail.com says...

I don't doubt for a minute that you are right, but in practice this will of
course hardly ever occur, and with the gem-induced explosion of the search
path it is probably worthwhile to do something about it: rewriting require,
or introducing a specialized require with a different name, or autoloading.

I was noticing this too the other day; Windows file opens are much
slower than Linux, and even small rails apps can spend a second or two
on Windows just waiting for the file system while it goes searching for
gems at startup. Makes unit testing a good bit less fun.

···

--
Jay Levitt |
Wellesley, MA | I feel calm. I feel ready. I can only
Faster: jay at jay dot fm | conclude that's because I don't have a
http://www.jay.fm | full grasp of the situation. - Mark Adler