Operator - on list should be O(n) on sorted lists

Brian_Candler · 2 February 2003 11:13

1) In the - method, test if the list are allready sorted (which is a O(n+m)
operation), them adapt the algo (ie O(n+m) or O(n*m)).

But as soon as you have used '+' to generate the union of two sets, it will
almost always be unordered again anyway.

IMO it would be better to have a different class for that behaviour (i.e.
your option 3). If there's nothing like it under Library/Datastructure in
RAA, it could be yours

Or you could use one of the 'tree' implementations, which maintains an
ordered list (not quite the same as a Set, since it allows duplicate keys)

I'm not sure how organised this is though - it would be nice to have a
common 'Tree' base class with subclasses for different implementations.

2) In the - method do a sort first (which is a O(n*ln n+m*ln m) operation)
and them do the - with the O(n+m) algo. Because the - operator is a set
operator, the order doesn't really make sense...

To me, an Array is by definition an ordered list, so I wouldn't want it
randomly reordered. I'd also have said that A-B could be a useful array
(rather than set) operation, to remove a few elements from an array:

[5,0,1,3,7]-[0]
=> [5, 1, 3, 7]

Except that in Ruby it also eliminates duplicates from A, so it can't really
be used in this way:

['h','e','l','l','o']-['e']
=> ["h", "l", "o"]

So, A- is not a null operation

I'm sure there must be a reason for this. It seems that an Array is also
intended to be a poor man's Set, but as you've found, it doesn't do that job
very well.

Regarding your Set implementation as ordered Arrays: there's the potential
for even faster operation. For example, the union (+) operator can take
advantage of the fact that its two operands are already sorted. A 'merge'
operation, at the same time removing duplicates, can be done in O(n+m).

Regards,

Brian.

Cedric_Foll4 · 2 February 2003 13:11

But as soon as you have used ‘+’ to generate the union of two sets, it
will
almost always be unordered again anyway.
No.
I’ve defined the + operator in my class Set in order to respect the order
relation.
My solution is dirty (because it’s a O((n+m)ln(n+m)) operation) but it could
be a O(n+m) algo.

IMO it would be better to have a different class for that behaviour (i.e.
your option 3). If there’s nothing like it under Library/Datastructure in
RAA, it could be yours
OK, i will look at the RAA in order to see if such a class have been
allready proposed.

Or you could use one of the ‘tree’ implementations, which maintains an
ordered list (not quite the same as a Set, since it allows duplicate keys)
I’m not sure how organised this is though - it would be nice to have a
common ‘Tree’ base class with subclasses for different implementations.

Yes, a binary tree data structure could be good two.
In fact i think that binary tree (and B-Tree, general tree) should be in the
standart library (and Set too). Somebody has another point of view ?

To me, an Array is by definition an ordered list, so I wouldn’t want it
randomly reordered. I’d also have said that A-B could be a useful array
(rather than set) operation, to remove a few elements from an array:

[5,0,1,3,7]-[0]
=> [5, 1, 3, 7]

I’m agree with you.

Regarding your Set implementation as ordered Arrays: there’s the potential
for even faster operation. For example, the union (+) operator can take
advantage of the fact that its two operands are already sorted. A ‘merge’
operation, at the same time removing duplicates, can be done in O(n+m).

Yes, cf my remark in the end of my mail.

Regards

Yukihiro_Matsumoto2 · 2 February 2003 14:21

Hi,

···

In message “Re: operator - on list should be O(n) on sorted lists” on 03/02/02, Brian Candler B.Candler@pobox.com writes:

[‘h’,‘e’,‘l’,‘l’,‘o’]-[‘e’]
=> [“h”, “l”, “o”]

So, A- is not a null operation

I’m sure there must be a reason for this. It seems that an Array is also
intended to be a poor man’s Set, but as you’ve found, it doesn’t do that job
very well.

The reason is exactly as you guessed. But your example made me think
it is overkill for ‘-’ operation (since “-” is not a Set operation).
Can I change this?

The operator “|” and “&” will not change this behavior of removing
duplicates, since they still are operations for “poor man’s Set”.

						matz.

Yukihiro_Matsumoto2 · 2 February 2003 14:22

Hi,

···

In message “Re: operator - on list should be O(n) on sorted lists” on 03/02/02, “Ce$B"ke(Bdric Foll” foll_cedric@hotmail.com writes:

OK, i will look at the RAA in order to see if such a class have been
allready proposed.

set.rb is available in the distribution, it uses Hash inside.

						matz.

Yukihiro_Matsumoto2 · 2 February 2003 14:28

Hi,

···

In message “Re: operator - on list should be O(n) on sorted lists” on 03/02/02, Yukihiro Matsumoto matz@ruby-lang.org writes:

The reason is exactly as you guessed. But your example made me think
it is overkill for ‘-’ operation (since “-” is not a Set operation).

I mean here is “- can be a non Set operation”, i.e. it can be defined
as

def - (other)
self.delete_if{|x| other.include?(x)}
end

						matz.

Topic		Replies	Views
Operator - on list should be O(n) on sorted lists ruby-talk	0	95	2 February 2003
Set operations on arrays of non-trivial objects ruby-talk	2	104	8 March 2007
Sorting arrays ruby-talk	6	75	29 June 2006
A newbie question for using sort ruby-talk	3	149	18 September 2002
Working through Ch.10 for learning to program 2.0 (Chris Pine) ruby-talk	32	218	3 August 2013

Operator - on list should be O(n) on sorted lists

Related topics