Dependency "trees" - suggestions?

I’m struggling with building dependency “trees” for rpkg. What
happens (or should happen eventually) is that, given a record like:

Package: foo
Depends: bar (= 1.3), booz (> 2.0)

…installing foo will get the foo package, the matching bar package,
the matching booz package, the packages bar depends on, the packages
booz depends on… you get the picture (if you’ve ever used Debian,
you also get the movie).

This is not really a tree because many packages can depend on one, so
a node can have multiple parents. But it is directed and must be
acyclic.

And there’s the rub. I can’t just trust the database that it won’t
generate cycles, so if it happens I must catch that.

In other words, whenever a node is added, it must be checked that it
hasn’t produced a cycle.

The million dollar question: how?

Right now, each time a node is added, an array containing all the
possible paths is updated. If adding the node causes a cycle in one
of the paths, an exception is raised.

It works, but it’s fragile, and I’m pretty sure it wouldn’t scale up
very good (to put it mildly).

Suggestions?

Massimiliano

You could raise an exception only when you encounter a cycle while
traversing the graph; if you do this, you need only a Hash containing
all the nodes (or their ids), so you can tell if you’ve visited a
particular node before.

Paul

···

On Tue, Sep 17, 2002 at 12:52:49AM +0900, Massimiliano Mirra wrote:

Right now, each time a node is added, an array containing all the
possible paths is updated. If adding the node causes a cycle in one
of the paths, an exception is raised.

This is not really a tree because many packages can depend on one, so
a node can have multiple parents. But it is directed and must be
acyclic.

So this is a DAG “Directed Acyclic Graph” for which there many algorithms
available.

In other words, whenever a node is added, it must be checked that it
hasn’t produced a cycle.

The million dollar question: how?

I am sure there is one for this too. You may want to search for DAG
construction algorithms in Google.
HTH,

– shanko

I’m struggling with building dependency “trees” for rpkg. What
happens (or should happen eventually) is that, given a record like:

Package: foo
Depends: bar (= 1.3), booz (> 2.0)

…installing foo will get the foo package, the matching bar package,
the matching booz package, the packages bar depends on, the packages
booz depends on… you get the picture (if you’ve ever used Debian,
you also get the movie).

Tangent to your post …

I’m curious, how will you handle existing local versions? For example, I
request the latest foo, it requires 1.3 but I have 1.4 installed. If they
specify >1.3, then of course, you’re okay.

Then what about foo requires bar 1.3 which requires ick 1.5 and blat 2.0.
Ick 1.5 requires strom 2.4, but blat 2.0 requires strom 1.5?

foo => bar 1.3 => ick 1.5 => strom 2.4

=======> blat 2.0 => strom 1.5

Just declare this an illegal package and not download it?

Chris
http://clabs.org

FYI: Ruby 1.7.x has tsort.rb as a standard library. The following
cites the beginning of tsort.rb’s embedded document.

···

At Tue, 17 Sep 2002 02:51:41 +0900, Shashank Date wrote:

This is not really a tree because many packages can depend on one, so
a node can have multiple parents. But it is directed and must be
acyclic.

So this is a DAG “Directed Acyclic Graph” for which there many algorithms
available.


tsort.rb

tsort.rb provides a module for topological sorting and strongly
connected components.

Example

require 'tsort'

class Hash
  include TSort
  alias tsort_each_node each_key
  def tsort_each_child(node, &block)
    fetch(node).each(&block)
  end
end

{1=>[2, 3], 2=>[3], 3=>[], 4=>[]}.tsort
#=> [3, 2, 1, 4]

{1=>[2], 2=>[3, 4], 3=>[2], 4=>[]}.strongly_connected_components
#=> [[4], [2, 3], [1]]

TSort module

TSort implements topological sorting using Tarjan's algorithm for
strongly connected components.

– Gotoken

Yes. This means that for each node I add, the entire graph has to be
traversed again looking checking for cycles.

I’m wondering whether a less expensive method exists (and less
convoluted than the one I’m using now) maybe involving some tricks
with adiancency or incidence matrices.

Massimiliano

···

On Tue, Sep 17, 2002 at 01:37:33AM +0900, Paul Brannan wrote:

On Tue, Sep 17, 2002 at 12:52:49AM +0900, Massimiliano Mirra wrote:

Right now, each time a node is added, an array containing all the
possible paths is updated. If adding the node causes a cycle in one
of the paths, an exception is raised.

You could raise an exception only when you encounter a cycle while
traversing the graph; if you do this, you need only a Hash containing
all the nodes (or their ids), so you can tell if you’ve visited a
particular node before.

I’m struggling with building dependency “trees” for rpkg. What
happens (or should happen eventually) is that, given a record like:

Package: foo
Depends: bar (= 1.3), booz (> 2.0)

…installing foo will get the foo package, the matching bar package,
the matching booz package, the packages bar depends on, the packages
booz depends on… you get the picture (if you’ve ever used Debian,
you also get the movie).

Tangent to your post …

I’m curious, how will you handle existing local versions? For example, I
request the latest foo, it requires 1.3 but I have 1.4 installed. If they
specify >1.3, then of course, you’re okay.

If there is a bar 1.4 around that is not backward compatible with bar
1.3, foo’s maintainer should make it so that foo’s entry requires
strictly bar 1.3. Then, if foo is requested when bar 1.4 is
installed, rpkg will either downgrade bar after prompting the user or
refuse to install foo at all until bar hasn’t been manually
downgraded.

Then what about foo requires bar 1.3 which requires ick 1.5 and blat 2.0.
Ick 1.5 requires strom 2.4, but blat 2.0 requires strom 1.5?

(I assume you’re talking about strict requires, i.e. =', not >='.)

foo => bar 1.3 => ick 1.5 => strom 2.4

=======> blat 2.0 => strom 1.5

Just declare this an illegal package and not download it?

Yes. This would be a bug in the distribution at the database level.
Either version of strom would not be in the database.

Anyway, none of this is fixed in stone. Nine tenths of this package
management stuff is about defining rules and policies, and I’m going
to hear and welcome all the input that comes.

Massimiliano

···

On Tue, Sep 17, 2002 at 08:48:02AM +0900, Chris Morris wrote:

You migth also have a look at http://rgl.sourceforge.net/ which provides a method
RGL::Graph#acyclic? for directed graphs.

Horst

···

In other words, whenever a node is added, it must be checked that it
hasn’t produced a cycle.

The million dollar question: how?

I am sure there is one for this too. You may want to search for DAG
construction algorithms in Google.
HTH,

– shanko

Sound of head banging against the wall

Why do I always wait so long before asking the List?

tsort.rb looks just like the perfect solution for the lazy me. I
particularly like this:

"If there is a cycle, (({TSort::Cyclic})) is raised."

:slight_smile:

I’m trying it in 1.6 and I get two test failures. This is the first:

Failure occurred in test_cycle(TSortTest)
[/home/bard/lib/ruby/runit/assert.rb:56]: Expected exception to
be of type TSort::Cyclic but was <LocalJumpError: no block given

def test_cycle
  h = {1=>[2], 2=>[3, 4], 3=>[2], 4=>[]}
  assert_equal([[4], [2, 3], [1]],
    h.strongly_connected_components.map {|nodes| nodes.sort})
  assert_exception(TSort::Cyclic) { h.tsort }  # <-----
end

I looked at TSort::tsort but couldn’t see where a block was required.

Massimiliano

···

On Tue, Sep 17, 2002 at 03:33:24AM +0900, GOTO Kentaro wrote:

So this is a DAG “Directed Acyclic Graph” for which there many algorithms
available.
FYI: Ruby 1.7.x has tsort.rb as a standard library.

foo => bar 1.3 => ick 1.5 => strom 2.4

=======> blat 2.0 => strom 1.5

Just declare this an illegal package and not download it?

Yes. This would be a bug in the distribution at the database level.
Either version of strom would not be in the database.

Meaning you won’t support multiple versions of the same package in the
master rpkg database?

Yes. This means that for each node I add, the entire graph has to be
traversed again looking checking for cycles.

OIC, you aren’t interested in protecting against a corrupted database;
you just want to prevent the user from specifying a cyclic dependency.

I’m wondering whether a less expensive method exists (and less
convoluted than the one I’m using now) maybe involving some tricks
with adiancency or incidence matrices.

Probably. I don’t know enough about graphs to say.

Paul

···

On Tue, Sep 17, 2002 at 10:30:30AM +0900, Massimiliano Mirra wrote:

That wasn’t reproduced here with test/unit:

% cat ~/foo.rb
require “test/unit”
require “test/unit/ui/console/testrunner”
class Hash
include TSort
alias tsort_each_node each_key
def tsort_each_child(node, &block)
fetch(node).each(&block)
end
end
class TestMe < Test::Unit::TestCase
def test_cycle
h = {1=>[2], 2=>[3, 4], 3=>[2], 4=>}
assert_equal([[4], [2, 3], [1]],
h.strongly_connected_components.map {|nodes| nodes.sort})
assert_raises(TSort::Cyclic) { h.tsort } # <-----
end
end

% ruby -r ~/ruby/ruby/lib/tsort.rb ~/foo.rb
Loaded suite /home/gotoken/foo
Started…

Finished in 0.005655 seconds.
1 runs, 2 assertions, 0 failures, 0 errors
% ruby -v
ruby 1.6.7 (2002-05-23) [i386-freebsd4]
%

– Gotoken

···

At Tue, 17 Sep 2002 10:30:30 +0900, Massimiliano Mirra wrote:

Failure occurred in test_cycle(TSortTest)
[/home/bard/lib/ruby/runit/assert.rb:56]: Expected exception to
be of type TSort::Cyclic but was <LocalJumpError: no block given

def test_cycle
  h = {1=>[2], 2=>[3, 4], 3=>[2], 4=>[]}
  assert_equal([[4], [2, 3], [1]],
    h.strongly_connected_components.map {|nodes| nodes.sort})
  assert_exception(TSort::Cyclic) { h.tsort }  # <-----
end

Meaning that when two versions of a package are different enough as to
be considered different things altogether, and both are wanted in the
same database, they will be given different names.

If strom 2.4 is strictly required by ick, and strom 1.5 is strictly
required by blat, i.e. neither is happy with strom >= 1.5, 2.4 and 1.5
don’t provide a common subset and are good candidates for different
names, strom1 and strom2.

Examples in the Debian database are numerous, the most apparent being
libc5 (Version: 5.4.146-12) versus libc6 (Version: 2.2.5-6).

Massimiliano

···

On Tue, Sep 17, 2002 at 09:18:13PM +0900, Chris Morris wrote:

foo => bar 1.3 => ick 1.5 => strom 2.4

=======> blat 2.0 => strom 1.5

Just declare this an illegal package and not download it?

Yes. This would be a bug in the distribution at the database level.
Either version of strom would not be in the database.

Meaning you won’t support multiple versions of the same package in the
master rpkg database?

Actually, the user just specifies the starting package. rpkg looks at
the package names in its Depends:' field, queries the database for them, looks in turn at the packages in their Depends:’ fields… and
so on, until it gets to the `leaf’ packages (those who don’t depend on
any).

There is so much information to keep track of, that it’s possible for
a package’s maintainer to specify a dependency that in some cases
causes a cycle. That happens seldom in Debian (I’ve only seen it
twice or thrice in more than a year, and Debian has over 7000
packages), but it’s not impossible, plus we’ll all be novices in
distribution management. :slight_smile:

Massimiliano

···

On Tue, Sep 17, 2002 at 10:36:36PM +0900, Paul Brannan wrote:

On Tue, Sep 17, 2002 at 10:30:30AM +0900, Massimiliano Mirra wrote:

Yes. This means that for each node I add, the entire graph has to be
traversed again looking checking for cycles.

OIC, you aren’t interested in protecting against a corrupted database;
you just want to prevent the user from specifying a cyclic
dependency.

Meaning you won’t support multiple versions of the same package in the
master rpkg database?

Meaning that when two versions of a package are different enough as to
be considered different things altogether, and both are wanted in the
same database, they will be given different names.

Gotcha. Thx.

Chris