CPAN Style installer

Hi all,

I had a number of conversations at RubyConf (you know who you are :-)) on
the subject of a CPAN style installer for ruby. Being someone who is
happier with code than talk, I decided to code something up.

Here it is:
http://osdn.dl.sourceforge.net/sourceforge/narf-lib/raa-install-0.0.2.tgz
The project page is shared with narf at:

It’s more a demo or a proof of concept (though the sort with lots of
Unit tests) and as such has not been tested on anything other than my
platform (which is RedHat) and has some dependencies that would be
undesirable in a final version (wget and gnu tar spring to mind). It may
well not work on your platform (yet).

It has a few useful attributes though:
* It uses the RAA as a database (via the raa-xml.xml file)
* It will install packages that use some of the common ruby setup
methods (setup.rb, install.rb, extconf.rb and configure)
* It supports dependency tracking and versioning without needing a
local database. (to work, this will require additional info in the RAA,
and version info in the ruby libraries)
* Little change is needed from developers

As such, it successfully installs quite a large percentage of packages in
the RAA (particularly those where the download isn’t a 404). Some usage
examples:
bash$ raa-install --install strscan
bash$ raa-install --install Test::Unit
bash$ raa-install --showallpackages

One important thing: fully implementing this solution would require
additional data in the RAA as described below - the specifics of which I’m
more than happy to change. I think that building on top of the RAA is
definitely preferable to setting up something new though.

If there is interest in me doing so I would be happy to get this into a
more generally usable state (particularly with regard to platform
support) and would be extremely happy with any contributions and even
better, offers to take the project off of my hands :-).

Competing projects are also welcome to make this project redundant as
long as it happens before I grow old.

Enjoy,

-Tom Clarke

Some additional notes:

Support for parsing the xml from the RAA is crude. I first experimented
with REXML for parsing the data, but I lost patience before it completed a
single parse - having libxml as a dependency didn’t seem like a good idea
so I implemented a crude RegEx based parser that would be sensitive to
changes in format.

RAA-Install supports dependencies, but to be useful would require
additional metadata to be available in the RAA xml file. If there are
lines such as…

… in raa-xml.xml it will install that package first.
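The "install that package first" behaviour can be pictured as a depth-first walk over the dependency data. The sketch below is only an illustration of that idea, not raa-install's actual code: the `PACKAGES` table and the `install_order` helper are hypothetical, and the real dependency fields in raa-xml.xml are not reproduced here.

```ruby
# Hypothetical package table: name => names it depends on.
# In raa-install this information would come from raa-xml.xml.
PACKAGES = {
  'app'     => ['net-lib', 'strscan'],
  'net-lib' => ['strscan'],
  'strscan' => []
}

# Return package names ordered so that every dependency comes before
# the package that needs it (depth-first, post-order).
def install_order(name, packages, done = [], visiting = [])
  return done if done.include?(name)
  raise ArgumentError, "unknown package: #{name}" unless packages.key?(name)
  raise ArgumentError, "dependency cycle at #{name}" if visiting.include?(name)

  visiting.push(name)
  packages[name].each { |dep| install_order(dep, packages, done, visiting) }
  visiting.pop
  done << name
end

puts install_order('app', PACKAGES).join(', ')
```

Each name appears once in the result, so a package shared by several dependents is only installed once.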

Versioning is done by getting the following possible new fields in the RAA
strscan
StrScan

And doing the following:

require 'strscan'
StrScan::VERSION # <== "1.2.3"
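A minimal sketch of how such a check could work without a local database, assuming each library exposes a `VERSION` constant as proposed above. The helper names (`installed_version`, `version_at_least?`, `needs_install?`) are made up for illustration and are not part of raa-install.

```ruby
# Try to load a library and read its VERSION constant.
# Returns nil if the library or the constant is missing.
def installed_version(feature, const_name)
  require feature
  mod = Object.const_get(const_name)
  mod.const_defined?(:VERSION) ? mod.const_get(:VERSION).to_s : nil
rescue LoadError, NameError
  nil
end

# Compare dotted version strings numerically, so "1.10" beats "1.9".
def version_at_least?(installed, required)
  a = installed.split('.').map(&:to_i)
  b = required.split('.').map(&:to_i)
  len = [a.size, b.size].max
  a += [0] * (len - a.size) # pad the shorter version with zeros
  b += [0] * (len - b.size)
  (a <=> b) >= 0
end

# A package needs installing if it is absent or older than required.
def needs_install?(feature, const_name, required)
  installed = installed_version(feature, const_name)
  installed.nil? || !version_at_least?(installed, required)
end
```

Because the answer comes from the installed library itself, it stays correct even for libraries that were installed by hand.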

On all of these details I am flexible (I have to be, I don’t run the
RAA!), and if there are any conventions - in particular for specifying a
version - I would be glad to hear about them.

This is a swell, little tool, Tom. I played with it a bit tonight and it
worked great. This is exactly the sort of tool that could become pervasive.

Now all we need are mirrors. I spoke to Brian Ingerson today on the phone.
(He’s working with CPAN admins to incorporate Ruby, Python, etc.) We’ve
been analyzing the CPAN structure to determine where the RAA can fit. This
weekend we’ll be working on the scripts to build the structure, the indices
from RAA.

I imagine Ingy will jump on the list tomorrow or Saturday and run through
his findings concerning CPAN and Ruby.

_why

···

Tom Clarke (tom@u2i.com) wrote:

I had a number of conversations at RubyConf (you know who you are :-)) on
the subject of a CPAN style installer for ruby. Being someone who is
happier with code than talk, I decided to code something up.

Here it is:
Download raa-install-0.0.2.tgz (NARF -- better web libraries for Ruby)

In article Pine.LNX.4.44.0211071926160.7998-100000@localhost.localdomain,

Hi all,

I had a number of conversations at RubyConf (you know who you are :-)) on
the subject of a CPAN style installer for ruby. Being someone who is
happier with code than talk, I decided to code something up.

Here it is:
Download raa-install-0.0.2.tgz (NARF -- better web libraries for Ruby)
The project page is shared with narf at:
NARF -- better web libraries for Ruby download | SourceForge.net

Cool. I hope to get time to play with this tomorrow.

It’s more a demo or a proof of concept (though the sort with lots of
Unit tests) and as such has not been tested on anything other than my
platform (which is RedHat) and has some dependencies that would be
undesirable in a final version (wget and gnu tar spring to mind). It may
well not work on your platform (yet).

I believe there is a pure ruby implementation of tar in rpkg (?)

It has a few useful attributes though:

  • It uses the RAA as a database (via the raa-xml.xml file)
  • It will install packages that use some of the common ruby setup
    methods (setup.rb, install.rb, extconf.rb and configure)
  • It supports dependency tracking and versioning without needing a
    local database. (to work, this will require additional info in the RAA,
    and version info in the ruby libraries)
  • Little change is needed from developers

As such, it successfully installs quite a large percentage of packages in
the RAA (particularly those where the download isn’t a 404). Some usage
examples:
bash$ raa-install --install strscan
bash$ raa-install --install Test::Unit
bash$ raa-install --showallpackages

One important thing, to fully implement this solution would require
additional data in the RAA as described below - the specifics of which I’m
more than happy to change. I think that building on top of the RAA is
definitely preferable to setting up something new though.

If there is interest in me doing so I would be happy to get this into a
more generally usable state (particularly with regard to platform
support) and would be extremely happy with any contributions and even
better, offers to take the project off of my hands :-).

Competing projects are also welcome to make this project redundant as
long as it happens before I grow old.

Enjoy,

-Tom Clarke

Some additional notes:

Support for parsing the xml from the RAA is crude. I first experimented
with REXML for parsing the data, but I lost patience before it completed a
single parse - having libxml as a dependency didn’t seem like a good idea
so I implemented a crude RegEx based parser that would be sensitive to
changes in format.

I thought REXML was implemented in pure Ruby? Why the dependence on
libxml?

Thanks for the good work… now if we can only get mirroring… I believe
someone will be posting something about that soon ;-)

Phil

···

Tom Clarke tom@u2i.com wrote:

For those of us unfortunate enough not to be able to attend rubyconf can you
explain why to use this over rpkg? There are now several different (and up
and coming) ruby package management tools and I am fearful that having many
would be harmful to Ruby newbies and the Ruby community as a whole.

Thanks in advance for any information provided.


Signed,
Holden Glova

···

On Fri, 08 Nov 2002 18:00, Tom Clarke wrote:

Hi all,

I had a number of conversations at RubyConf (you know who you are :-)) on
the subject of a CPAN style installer for ruby. Being someone who is
happier with code than talk, I decided to code something up.

Here it is:
Download raa-install-0.0.2.tgz (NARF -- better web libraries for Ruby)
The project page is shared with narf at:
NARF -- better web libraries for Ruby download | SourceForge.net

I believe there is a pure ruby implementation of tar in rpkg (?)

Excellent, I will look into this.

Support for parsing the xml from the RAA is crude. I first experimented
with REXML for parsing the data, but I lost patience before it completed a
single parse - having libxml as a dependency didn’t seem like a good idea
so I implemented a crude RegEx based parser that would be sensitive to
changes in format.

I thought REXML was implemented in pure Ruby? Why the dependence on
libxml?

I didn’t ever see REXML complete the parsing of the file - it was too
slow (I never waited long enough), certainly far too slow to be part of
the snappy command line tool of my imagination. I could of course have
been doing something wrong, or stupid. I am prone to such things,
sans-pair.

-Tom

···

On Fri, 8 Nov 2002, Phil Tomson wrote:

I would be absolutely fine with the community deciding to use rpkg. The
only reason I bothered to work on this is that rpkg doesn’t seem to have
attracted wide community support.

Without being in the slightest an expert on rpkg there are a few
design decisions that I like:
* Raa-install as indicated in its name doesn’t try to do
everything, it builds on what we already have working (the RAA)
* Raa-install doesn’t need a local database of which libraries
have been installed. This means that even if you don’t use the installer
to install libraries, everything will still work
* Raa-install requires little new work from most developers, as
it supports the (mostly) standard ruby-tools for doing installation, in
fact the majority of packages work as is.

I hope this helps. If I am misunderstanding anything about rpkg, please
let me know.

Thanks,

-Tom

···

On Fri, 8 Nov 2002, Holden Glova wrote:

For those of us unfortunate enough not to be able to attend rubyconf can
you explain why to use this over rpkg? There are now several different
(and up and coming) ruby package management tools and I am fearful that
having many would be harmful to Ruby newbies and the Ruby community as a
whole.

Thanks in advance for any information provided.

Yes, courtesy Thomas Hurst with minor tweakings by me.

Massimiliano

···

On Fri, Nov 08, 2002 at 05:51:22PM +0900, Phil Tomson wrote:

I believe there is a pure ruby implementation of tar in rpkg (?)

It’s a require in your library.

Thanks,

-Tom

or even rpkg :-)

-Tom

···

On Fri, 8 Nov 2002, Tom Clarke wrote:

On Fri, 8 Nov 2002, Holden Glova wrote:
Without being in the slightest an expert on rpkg there are a few

I hope this helps. If I am misunderstanding anything about rpkg, please
let me know.

As requested:

  • Raa-install as indicated in its name doesn’t try to do
    everything, it builds on what we already have working (the RAA)

The RAA is good for developers, rpkg wants to also be good for users
and clients.

Users won’t have root access, won’t like sorting out package
dependencies and conflicts, and won’t like compiling. rpkg manages
user side installations, handles dependencies and conflicts, and
allows using precompiled packages.

Clients will just want the software, they won’t care if it’s in Ruby
or whatever, so you just install Ruby and rpkg and issue

$ rapt install http://your.site/your.app.metadata

and it’ll install your app with all it needs.

  • Raa-install doesn’t need a local database of which libraries
    have been installed. This means that even if you don’t use the installer
    to install libraries, everything will still work

The current test release manages no database of installed files yet.
That’s one of the first things to fix. I’ll leave a --force option
for the adventurous. :-)

  • Raa-install requires little new work from most developers, as
    it supports the (mostly) standard ruby-tools for doing installation, in
    fact the majority of packages work as is.

rpkg uses setup.rb, install.rb, extconf.rb where available. Only new
work is writing a ~10 lines script to have them install to a specific
tree instead of the default (setup.rb --with-libdir=… etc.).

Massimiliano

···

On Fri, Nov 08, 2002 at 08:04:54PM +0900, Tom Clarke wrote:

I thought REXML was implemented in pure Ruby? Why the dependence on
libxml?

REXML is pure Ruby.

I didn’t ever see REXML complete the parsing of the file - it was too
slow (I never waited long enough), certainly far too slow to be part of
the snappy command line tool of my imagination. I could of course have
been doing something wrong, or stupid. I am prone to such things,
sans-pair.

I wrote a knock-off script using REXML to slurp in the RAA XML and then download
every file listed. Yes, it was slow, but parsing the entire file did complete
in a few minutes. This on an AMD K6 400 MHz with (I think) 256K RAM.

Now, right up front, reading in the entire file before working with the data is
a poor design choice, and had I been willing to spend more time on this I would
have used the REXML stream parser and gotten better performance (and would
likely have used threads to download multiple files at once).
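For readers unfamiliar with it, REXML's stream parser hands events to a listener instead of building the whole document tree in memory. The sketch below shows the general shape; the sample XML and its element names (`packages`, `package`, `name`, `url`) are invented for illustration and are not the real raa-xml.xml layout.

```ruby
require 'rexml/document'
require 'rexml/streamlistener'

# Collects the text of every <name> element as the parser streams past it.
class NameListener
  include REXML::StreamListener
  attr_reader :names

  def initialize
    @names = []
    @buffer = nil # non-nil only while inside a <name> element
  end

  def tag_start(name, _attrs)
    @buffer = +'' if name == 'name'
  end

  def text(data)
    @buffer << data if @buffer
  end

  def tag_end(name)
    return unless name == 'name' && @buffer
    @names << @buffer.strip
    @buffer = nil
  end
end

# Invented sample data; the real RAA file uses its own schema.
SAMPLE = <<~XML
  <packages>
    <package><name>strscan</name><url>http://example.invalid/strscan.tar.gz</url></package>
    <package><name>rimport</name><url>http://example.invalid/rimport.rb</url></package>
  </packages>
XML

listener = NameListener.new
REXML::Document.parse_stream(SAMPLE, listener)
puts listener.names.inspect
```

The listener only keeps what it needs, so memory use does not grow with the size of the file being parsed.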

BTW, many of the RAA entries do not have an actual download URL, but rather
point to another HTML page where some user intervention is needed. This was
almost always the case for anything hosted on Sourceforge.

And, unsurprisingly, many URLs just didn’t work. But I did grab 551 out of
roughly 700 files, which came to 25 MB when tarred and zipped. The uber-sumo
package.

James

···

-Tom

Great, wrong reply to. Still an answer from anyone will do :-).

Sorry,

-Tom

···

On Sat, 9 Nov 2002, Tom Clarke wrote:

It’s a require in your library.

Thanks,

-Tom

Did you mean in rpkg? Everything is in the tarball at
http://www.allruby.com/rpkg/rpkg-test.tar.gz, stringio.rb included.

Massimiliano

···

On Sat, Nov 09, 2002 at 10:05:04AM +0900, Tom Clarke wrote:

It’s a require in your library.

I wrote a knock-off script using REXML to slurp in the RAA XML and then download
every file listed. Yes, it was slow, but parsing the entire file did complete
in a few minutes. This on an AMD K6 400 MHz with (I think) 256K RAM.

Fair enough, but I would like as close to instant as I can get - a few
seconds at most.

Now, right up front, reading in the entire file before working with the data is
a poor design choice, and had I been willing to spend more time on this I would
have used the REXML stream parser and gotten better performance (and would
likely have used threads to download multiple files at once).

I was looking for solutions implementable in less than an hour or two.
I’ll devote more time to this should it edge toward popularity - it would
obviously be much nicer to use a real XML parser. For now, it is fast and works well.

BTW, many of the RAA entries do not have an actual download URL, but rather
point to another HTML page where some user intervention is needed. This was
almost always the case for anything hosted on Sourceforge.

I filter out all files without a tar.gz or tgz extension so this problem
is somewhat minimized.
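A filter of that kind might look like the sketch below. The helper name `installable_url?` and the example URLs are hypothetical; only the tar.gz/tgz rule comes from the message above.

```ruby
# Matches paths ending in .tar.gz or .tgz (case-insensitive).
ARCHIVE_PATTERN = /\.(?:tar\.gz|tgz)\z/i

# True if the URL's path (ignoring any query string) looks like a
# source tarball.
def installable_url?(url)
  path = url.split('?').first.to_s
  ARCHIVE_PATTERN.match?(path)
end

urls = [
  'http://example.invalid/strscan-0.6.5.tar.gz',
  'http://example.invalid/project/download.html',
  'http://example.invalid/narf.tgz'
]
puts urls.select { |u| installable_url?(u) }
```

Links that point at an HTML landing page are skipped, at the cost of also skipping packages shipped as a bare .rb file.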

And, unsurprisingly, many URLs just didn’t work. But I did grab 551 out of
roughly 700 files, which came to 25 MB when tarred and zipped. The uber-sumo
package.

As I was messing around with the xml file, I ran a script to find those
links which didn’t work - I have a list of those if anyone is interested.

-Tom

···

On Sat, 9 Nov 2002, JamesBritt wrote:

Thanks for the info. I’ve been looking around the rpkg source tree
while extracting the Tar source and it’s a very nice package.

I can see there is a lot of value in the work you have done, and feature
for feature blows my comparatively modest effort away. Though perhaps not
on the issues that are most important to me.

As I collect my thoughts on the issue, the things that are important to me
are:
* I want to see a package installation system usable to install
most potential libraries. This is more important to me than it being the
best packaging system. RAA integration is a key part of this.
* A packaging system stands the best chance if it can work with
the kinds of packages currently available - this means using source style
packages
* This doesn’t preclude the use of more advanced packaging in
addition, this particularly applies to the problem of Windows binaries,
which will need binary packaging anyway.
* The packaging system should not make use of a database anywhere
on the system, though it should be possible to identify what packages are
installed - this can be done by using version info in the packages.

Interestingly, I suspect that features to accommodate those goals could be
added to rpkg without detracting from any of rpkg’s value. The features
being:
* Handle source packages
* Use RAA as database
* Use versioning information in source files, not a db

I’d be happy to discuss doing the work to rpkg instead of raa-install if
these were features you would be happy to have added to rpkg.

So am I crazy or what?

-Tom


From: tom [mailto:tom@u2i.com]
Sent: Friday, November 08, 2002 11:31 AM
To: ruby-talk ML
Subject: Re: CPAN Style installer

I wrote a knock-off script using REXML to slurp in the RAA XML and
then download
every file listed. Yes, it was slow, but parsing the entire file
did complete
in a few minutes. This on an AMD K6 400 MHz with (I think) 256K RAM.

Fair enough, but I would like as close to instant as I can get - a few
seconds at most.

BTW, the entire RAA XML file is around 14K lines. Maybe I missed it in your
original post, but why do you want to read the entire thing, and how often would
you want to do it?

You can pull back specific RAA info using the RAA XML-RPC interface. (And I
have some almost-ready-for-release code that converts the returned data
structure into RSS/RDF.)

BTW, many of the RAA entries do not have an actual download URL, but rather
point to another HTML page where some user intervention is needed. This was
almost always the case for anything hosted on Sourceforge.

I filter out all files without a tar.gz or tgz extension so this problem
is somewhat minimized.

Some packages are simple *.rb files, though. Rimport, for example.

And, unsurprisingly, many URLs just didn’t work. But I did grab 551 out of
roughly 700 files, which came to 25 MB when tarred and zipped. The uber-sumo
package.

As I was messing around with the xml file, I ran a script to find those
links which didn’t work - I have a list of those if anyone is interested.

I neglected to track such stuff, though I may rewrite the code. I needed some
quick info for somebody who is considering hosting a mirror of all RAA
packages. They wanted to know how many apps there were, and just how much disk
space might be needed.

James

···

On Sat, 9 Nov 2002, JamesBritt wrote:

-Tom

  • I want to see a package installation system usable to install
    most potential libraries. This is more important to me than it being the
    best packaging system. RAA integration is a key part of this.

If you were able to install most potential libraries through rpkg,
would you consider RAA integration still necessary?

  • A packaging system stands the best chance if it can work with
    the kinds of packages currently available - this means using source style
    packages

Most current rpkg packages are source packages.

  • This doesn’t preclude the use of more advanced packaging in
    addition, this particularly applies to the problem of Windows binaries,
    which will need binary packaging anyway.

Yes.

  • The packaging system should not make use of a database anywhere
    on the system, though it should be possible to identify what packages are
    installed - this can be done by using version info in the packages.

Later you suggest ‘Use RAA as database’. There might be some
confusion about this, as we’re talking about at least three databases:

  • a database of metadata for available packages. Right now this is
    local because I pay for my Internet connection by the minute and I don’t
    like going online just to browse packages; I’ll add capability for a
    remote database, either LDAP or a RAA backend or both (though
    information is needed which the RAA doesn’t carry and will
    have to be found somewhere else);

  • a database of metadata for installed packages, used to detect
    conflicts with packages to be installed, and to check whether their
    dependency requirements are already satisfied;

  • a database of installed files, used to protect against accidental
    clashing of package files.

The latter two are in the old rpkg and still have to be integrated in
the new one.

You also suggest ‘Use versioning information in source files, not a
db’. Doable, provided all authors agree on a way to version their
sources.

I’m not sure having the distribution’s consistency depend on the
upstream developers is wise, though (just think about the dead links
in the RAA). IMHO it’s easier to apply a clean and consistent policy
in the packaging stage while still leaving developers free to organize
their work in the way they’re most comfortable with.

Interestingly, I suspect that features to accommodate those goals could be
added to rpkg without detracting from any of rpkg’s value. The features
being:

  • Handle source packages
  • Use RAA as database
  • Use versioning information in source files, not a db

I’d be happy to discuss doing the work to rpkg instead of raa-install if
these were features you would be happy to have added to rpkg.

I’d be happy to discuss it, too. Hopefully I have clarified some
design decisions with which you can agree.

So am I crazy or what?

What.

:-)

Massimiliano

···

On Sat, Nov 09, 2002 at 06:09:10AM +0900, tom wrote:

BTW, the entire RAA XML file is around 14K lines. Maybe I missed it in your
original post, but why do you want to read the entire thing, and how often would
you want to do it?

I just sent a patch to Tom which will use Apache’s ETag header to detect
any changes and cache the XML file once it’s been pulled. This
significantly reduces the amount of data raa-install pulls down
the line.
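The patch itself is not shown in this thread. As a rough sketch of the general technique, a conditional GET sends the stored ETag in an `If-None-Match` header and reuses the cached copy when the server answers 304 Not Modified. The helper names and the in-memory `cache` hash are assumptions for illustration; a real tool would persist both the body and the ETag on disk.

```ruby
require 'net/http'
require 'uri'

# Headers for a conditional GET: send the stored ETag, if there is one.
def conditional_headers(cached_etag)
  cached_etag ? { 'If-None-Match' => cached_etag } : {}
end

# Fetch url unless the server reports that the cached copy is current.
# cache is a Hash with optional :etag and :body keys (a hypothetical
# in-memory stand-in for files on disk).
def fetch_with_etag(url, cache)
  uri = URI(url)
  request = Net::HTTP::Get.new(uri)
  conditional_headers(cache[:etag]).each { |key, value| request[key] = value }

  response = Net::HTTP.start(uri.host, uri.port,
                             use_ssl: uri.scheme == 'https') do |http|
    http.request(request)
  end

  case response
  when Net::HTTPNotModified
    cache[:body] # 304: nothing changed, reuse the cached copy
  when Net::HTTPSuccess
    cache[:etag] = response['ETag']
    cache[:body] = response.body
  else
    raise "unexpected response: #{response.code}"
  end
end
```

Calling `fetch_with_etag` twice with the same `cache` hash would download the file on the first call and, if the server supports ETags and the file is unchanged, transfer only headers on the second.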

Some packages are simple *.rb files, though. Rimport, for example.

I wonder. Is it fair to simply install these directly in
$prefix/lib/ruby/site_ruby/$version? (aka Config::CONFIG['sitelibdir'])
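For reference, a minimal sketch of what installing a single-file library into the site library directory could look like. It uses `RbConfig::CONFIG`, the current name for the `Config::CONFIG` table mentioned above; the `install_single_file` helper is hypothetical and not part of raa-install.

```ruby
require 'rbconfig'
require 'fileutils'

# Copy a single .rb file into dest_dir (by default Ruby's site library
# directory) and return the installed path.
def install_single_file(source, dest_dir = RbConfig::CONFIG['sitelibdir'])
  unless File.extname(source) == '.rb'
    raise ArgumentError, "not a .rb file: #{source}"
  end

  FileUtils.mkdir_p(dest_dir)
  target = File.join(dest_dir, File.basename(source))
  FileUtils.install(source, target, mode: 0o644)
  target
end

puts RbConfig::CONFIG['sitelibdir']
```

Writing to the default directory usually needs administrator rights, so passing a different `dest_dir` is useful for trying this out.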

_why

···

JamesBritt (james@jamesbritt.com) wrote:

BTW, the entire RAA XML file is around 14K lines. Maybe I missed it in your
original post, but why do you want to read the entire thing, and how often would
you want to do it?

You can pull back specific RAA info using the RAA XML-RPC interface. (And I
have some almost-ready-for-release code that converts the returned data
structure into RSS/RDF.)

Well the main feature that uses the whole list is the --showall option. I
could avoid pulling back the list, unless that option was used. I would
love to have more details about the XML-RPC interface. Is that available
anywhere?

We probably don’t need to download the whole file even if we want
the whole thing every time, though; we can cache the file unless it is
updated.

BTW, many of the RAA entries do not have an actual download URL, but rather
point to another HTML page where some user intervention is needed. This was
almost always the case for anything hosted on Sourceforge.

I filter out all files without a tar.gz or tgz extension so this problem
is somewhat minimized.

Some packages are simple *.rb files, though. Rimport, for example.

I can add support for them pretty easily, and will.

-Tom

···

On Sat, 9 Nov 2002, JamesBritt wrote:

Some packages are simple *.rb files, though. Rimport, for example.

I wonder. Is it fair to simply install these directly in
$prefix/lib/ruby/site_ruby/$version? (aka Config::CONFIG['sitelibdir'])

I’m a liar.

It’s my own freakin’ code, and I can’t even recall how it’s packaged. How sad :-(

Anyway, the download link is to a *.rb file, but that’s just a script that a)
E-mails me when a download request is made, and b) redirects the browser to a
tar file with the latest version. Installing Rimport involves calling
install.rb, placing assorted files under site_dir.

But it’s a good question. Should every 3rd-party app go into a qualifying
subdirectory? I’d say Yes. (Now I have to go see if Rimport does that …)

James

···

_why