Algorithm help

Hi robert,

[..snip..]
This looks cute. What makes you sure it's O(log n)?

in fact I think its O(m * log n) with n equals the size of data and m
the
number of ranges.

The algorithm cuts the remaining data in halfs ((p1 + p2) / 2) instead
of
a linear search. But it has to do it for each range again.

Btw, here's another funny idea:

-----------------
data = [3, 4, 5, 6, 7, 8, 9, 15, 38, 39, 40, 41, 6789, 6790,
9998, 9999]
ranges =

data.inject(0) {|s,x| s | (1 << x)}.to_s(2).reverse!.scan(/1+/) do |m|
  s = $`.length
  ranges << (s .. (s + m.length - 1))
end

puts ranges
-----------------

Kind regards

    robert

Rofl, indead a funny idea. For sure the slowest one mentioned here but I

had a realy nice time figuring it out :slight_smile: (and its short)

But if that's the only concern I would like to make another proposal:

d = [1, 3, 4, 5, 6, 7, 8, 9, 15, 38, 39, 40, 41, 6789, 6790, 9998, 9999]
puts d.inject([d[0]..d[0]]){|r,x|r<<(((r.last.end+1>=x)?r.pop.end:
x)..x)}

lets see if you can beat this :))

Simon

Yeah, I can't prove it without handwaving, but I think O(m log n) is the
best you can do.

martin

···

"Kroeger Simon (ext)" <simon.kroeger.ext@siemens.com> wrote:

Hi robert,

> [..snip..]
> This looks cute. What makes you sure it's O(log n)?

in fact I think its O(m * log n) with n equals the size of data and m
the number of ranges.

Hi robert,

[..snip..]
This looks cute. What makes you sure it's O(log n)?

in fact I think its O(m * log n) with n equals the size of data and m
the
number of ranges.

The algorithm cuts the remaining data in halfs ((p1 + p2) / 2) instead
of
a linear search. But it has to do it for each range again.

Hm, my alogorithm analysis is quite rusty but I believe you would have to get rid of m and replace it as an expression of n; reason being that m is part of the result and not part of the input. So you probably have to calculate something like an average number of range for the set of input range. Difficult without further knowledge of the data's nature. My gut guess is that for random numbers the number of ranges is rather close to n so that would make O(n*log n) which doesn't exactly look better than O(n)...

Anyway with the low numbers of n we're dealing with here other effects are likely more dramatic. So doing a benchmark of all solutions provided might yield surprising results.

Btw, here's another funny idea:

-----------------
data = [3, 4, 5, 6, 7, 8, 9, 15, 38, 39, 40, 41, 6789, 6790,
9998, 9999]
ranges =

data.inject(0) {|s,x| s | (1 << x)}.to_s(2).reverse!.scan(/1+/) do
  >m> s = $`.length
  ranges << (s .. (s + m.length - 1))
end

puts ranges
-----------------

Kind regards

    robert

Rofl, indead a funny idea. For sure the slowest one mentioned here

Did you benchmark them?

but I
had a realy nice time figuring it out :slight_smile: (and its short)

But if that's the only concern I would like to make another proposal:

d = [1, 3, 4, 5, 6, 7, 8, 9, 15, 38, 39, 40, 41, 6789, 6790, 9998,
9999] puts
d.inject([d[0]..d[0]]){|r,x|r<<(((r.last.end+1>=x)?r.pop.end: x)..x)}

lets see if you can beat this :))

Nah, unlikely. ATM I don't feel like investing that much time. Btw, two points of your solution vs. the bit set approach (above):

- This solution breaks for the empty array.

[0] .. [0]

ArgumentError: bad value for range
        from (irb):5

- It needs a sorted array.

So checks to cope with these make your bit a bit longer. :-}

Kind regards

    robert

···

Kroeger Simon (ext) <simon.kroeger.ext@siemens.com> wrote:

Yep - the two likely cases are m = O(1) and m = O(n). In the latter
case, the linear search is definitely better (you can see it intuitively
if you consider whether it's quicker to find the end of a given range by
crawling forward or by bouncing around - if m = O(n) the size of a given
range is roughly O(1)).

martin

···

Robert Klemme <bob.news@gmx.net> wrote:

Hm, my alogorithm analysis is quite rusty but I believe you would have to
get rid of m and replace it as an expression of n; reason being that m is
part of the result and not part of the input. So you probably have to
calculate something like an average number of range for the set of input
range. Difficult without further knowledge of the data's nature. My gut
guess is that for random numbers the number of ranges is rather close to n
so that would make O(n*log n) which doesn't exactly look better than O(n)...

i think worst case proves it:

   [1,2,3]

three ranges, ergo m, log(n) to find the ends is well proven since it's just
binary search.

regards.

-a

···

On Fri, 5 Aug 2005, Martin DeMello wrote:

"Kroeger Simon (ext)" <simon.kroeger.ext@siemens.com> wrote:

Hi robert,

[..snip..]
This looks cute. What makes you sure it's O(log n)?

in fact I think its O(m * log n) with n equals the size of data and m
the number of ranges.

Yeah, I can't prove it without handwaving, but I think O(m log n) is the
best you can do.

--

email :: ara [dot] t [dot] howard [at] noaa [dot] gov
phone :: 303.497.6469
My religion is very simple. My religion is kindness.
--Tenzin Gyatso

===============================================================================

[1,3,5] you mean :slight_smile:

martin

···

Ara.T.Howard <Ara.T.Howard@noaa.gov> wrote:

i think worst case proves it:

   [1,2,3]

three ranges, ergo m, log(n) to find the ends is well proven since it's just
binary search.

details details. sheesh.

-a

···

On Sat, 6 Aug 2005, Martin DeMello wrote:

Ara.T.Howard <Ara.T.Howard@noaa.gov> wrote:

i think worst case proves it:

   [1,2,3]

three ranges, ergo m, log(n) to find the ends is well proven since it's just
binary search.

[1,3,5] you mean :slight_smile:

--

email :: ara [dot] t [dot] howard [at] noaa [dot] gov
phone :: 303.497.6469
My religion is very simple. My religion is kindness.
--Tenzin Gyatso

===============================================================================