[QUIZ] The Smallest Circle (#157)

Lionel Bouton wrote:

Hi,

here's my solution. I initially thought that the best circle would be either defined by a diameter (2 points) or one combinations of 3 points through which it passes. The idea was that given three points defining a circle a fourth point is either in the circle or defines a circle with 2 from the 3 originals that includes the remaining point.

This would be neat as I wasn't so happy about a gradient walking solution. In this case it can be proven that you can deterministically get the best solution by gradient walking (I don't think there are local maximums) but it's not the general case. Unfortunately it can be proven that such a circle can encircle all points, but not that it's the smallest (and my experiments proved that it isn't).

Update : it seems I was wrong and probably had a bug in this implementation as I just read this algorithm described as the best known solution until 1983 :slight_smile: On the paper referenced by Douglas :

http://www.cs.mcgill.ca/~cs507/projects/1998/jacob/

On a purely random set, my gradient walking is 27 times faster on 10000 points than Meggido's algorithm (but certainly slower in some cases, Douglas' solution gets the result in a fairly constant time for a given set size).

Hum, after testing, modifying the initial step for the gradient walking from

max width of the set / point_number
to
max width of the set / 4

makes it competitive in all cases (12s on 10000 points instead of 270 for Meggido's) and not balanced data sets (which random ones are) only.

Lionel

Philipp Hofmann wrote:

Here is my solution. Although it uses a brute force approach if the
first attempts (driven by obvious constraints) of determining the
circle fail, it scales on larger sets, because the given equal
distibution of random points form an approximate square on these
sets.

Wow, that was awfully fast for the random distribution, it takes more time to display the result than computing it even on 10000 points !

On badly shaped point clouds it suffers greatly, I couldn't get it to compute the circle for 1000 points if I feed it with :

class Point < Struct.new(:x, :y)
  def self.random
    ro = rand / 2
    theta = Math::PI * 2 * rand
    Point.new(ro * Math::cos(theta) + 0.5, ro * Math::sin(theta) + 0.5)
  end
[...]
end

The ruby process happily took more than 800MB and I had to kill it while I could still get some feedback from my poor laptop :slight_smile:

Note :
I found it suspicious that I could get better performance than the accepted best mathematical solution in the general case and I reflected a while on that.
I think the reason why the gradient walking is faster is because it is inherently inaccurate. Megiddo's algorithm defines the exact position of the center using a geometrical process. Unfortunately computers must manipulate floating point data when describing points, segments, vectors, circles, ... So any computer result is inherently inaccurate.
The gradient walking method profits from this inaccuracy: it stops when the step becomes too small to move the current best position because of rounding errors. This happens in o(n) steps where n is the number of bits used to represent the floating point mantissa which is a huge win (I don't remember the actual number of bits, but it must be < 64) compared to the o(n) of Megiddo's where n is the number of *points*.

I benched the hull computing method and I believe that we can get the best of both worlds : on any distribution of points I threw at it (random in a square, in a disk or on a circle) computing the convex hull took 0.4s for 10000 points on my laptop. To be as fast as possible, we could run your algorithm up to the point where it switches on the brute force method. Then, instead of looking for the best circle based on three points of the hull, we'd use the gradient walking method. Using only the hull points saves time in the most common cases and the filtering is quick enough to be negligible in the worst case when no point can be filtered out.

My tests only filtering the points using your ConvexHull class before applying gradient walking returns in 0.5s for the initial distribution (on a square) or on a disk and fall back to 12-13s on the worst case: distribution on a circle.

Lionel.

Hi Matthew,

I think it might be better if you wait until you are better and do it. Just
delay the next quiz. Or have the person who submitted the quiz (if it
wasn't you) write the summary. I don't think a free-for-all on the summary
is a good idea.

Get well soon!

Eric

···

On Wed, Feb 20, 2008 at 2:49 PM, Matthew Moss <matthew.moss@gmail.com> wrote:

Okay, I know that this probably isn't the best first impression I
could make, but is there someone who would mind doing a summary for
this week's quiz?

I developed a fever of about 101F last night and have it today, will
probably carry it into tomorrow. I can still do a summary, though it
will likely show up Friday and be shorter than perhaps is usual,
because I can barely stay focused on something for more than 5 minutes
at the moment.

If someone is up for it, just reply here and have at it. I will
appreciate it enormously.

Matthew, get well soon, I hope you didn't catch the flu that i'm just now recovering from!
I suggest just delaying the summary. It's not the end of the world.
JJ

···

On Feb 20, 2008, at 2:49 PM, Matthew Moss wrote:

Okay, I know that this probably isn't the best first impression I
could make, but is there someone who would mind doing a summary for
this week's quiz?

I developed a fever of about 101F last night and have it today, will
probably carry it into tomorrow. I can still do a summary, though it
will likely show up Friday and be shorter than perhaps is usual,
because I can barely stay focused on something for more than 5 minutes
at the moment.

If someone is up for it, just reply here and have at it. I will
appreciate it enormously.

Apologies for the typos in the summary's code bits. Google Groups has
decided it wants to protect a few email addresses from spam, despite
there being no email addresses. (It's seeing the @ character of the
instance member variables.)

Where you see "..." immediately preceding "@", realize those dots
should not be there. In one of the three cases (the to_s method),
those dots should be replaced with a left curly brace "{".

For the future, I'll send the email directly to the list rather than
going through Google.

Interesting, the numbering is continuing from Ruby Quiz ?
Not starting at 1???

SUGGESTION: I don't have time to do this quiz, but it looks like fun.

I also would love to do this one, but am on vacation :slight_smile:

I hope there's someone out there that supplies a multidimensional
solution (something that gives the user a choice of the dimensional
space; 3-D, 4-D, etc).

I for one would encourage anybody to do it with some visual
representation! (suggested libs: RMagick, Gosu, Rubygame, Ruby/SDL,
etc...)
An audio representation could be interesting as well (perhaps a sort
of sonar?)

Those would be pretty interesting.

Todd

···

On Feb 16, 2008 3:21 PM, John Joyce <dangerwillrobinsondanger@gmail.com> wrote:

SUGGESTION: I don't have time to do this quiz, but it looks like fun. I for one would encourage anybody to do it with some visual representation! (suggested libs: RMagick, Gosu, Rubygame, Ruby/SDL, etc...)

Here's an OpenGL/GLUT-based visualizer for the quiz:

http://tastyspleen.net/~billk/ruby/quiz/157-smallest-circle/157_smallest_circle_visualization.rb

The visualizer is designed to run with any quiz solution that provides the encircle() and generate_samples(n) methods at
outer scope.

A screenshot for those curious, but not curious enough to actually
try installing ruby OpenGL bindings ... <grin>


NOTE: I've only tried this program on ruby 1.8.4 so far... I have
no idea if it might run on 1.9 ...

Also... I believe I'm using yoshi's opengl-0.32f bindings. I haven't
tried it with the very latest ruby/OpenGL bindings on rubyforge.

Regards,

Bill

···

From: "John Joyce" <dangerwillrobinsondanger@gmail.com>

And again I reply to my own post. The test code requires a distance()
method:

def distance(p1, p2)
    Math.sqrt((p1.x - p2.x) ** 2 + (p1.y - p2.y) ** 2)
end

In case somebody cares.

Without looking for literature (on purpose), I thought that an exact solution could perhaps be computed from the convex hull of the set of points. I've written a self-made algorithm to compute the convex hull, but run out of time to continue with the problem.

Anyone?

-- fxn

Had a lot of fun with this one :slight_smile: My solution works by taking the average of
all points as the center, and the farthest outlying point as the radius. An
optimal solution is then found by moving the circle center towards the
outlier as the radius is reduced, while all points still are within the
radius. The same principles could be applied for higher dimensions as well.

Nice solution, this was also one of my first ideas. The problem with
this approach is if many points are close to each other and only few
points are far away. In this case the center is near that mass of points
and the line-search may fail to find an optimal solution. For example:

puts encircle( ([[0.0,0.0]] * 100 + [[0.0,1.0], [1.0,0.0]]).map{|p|
       Point.new(*p)})

(100 points at (0,0), one at (1,0) and one at (0,1)) gives a solution of
{center:(0.00980392156862745, 0.00980392156862745), radius:0.990244611507173}
but optimal would be (1/2, 1/2) and radius 1/sqrt(2).

My variation using an axis-aligned bounding box to pick the
initial center point appears to handle the above case.

However I would not be surprised if it still might find some
sub-optimal solutions.... :-/

Regards,

Bill

···

From: "Frank Fischer" <frank.fischer@s2001.tu-chemnitz.de>

On 2008-02-17, Justin Ethier <justin.ethier@gmail.com> wrote:

Philipp Hofmann wrote:

Here is my solution. Although it uses a brute force approach if the
first attempts (driven by obvious constraints) of determining the
circle fail, it scales on larger sets, because the given equal
distibution of random points form an approximate square on these
sets.

Wow, that was awfully fast for the random distribution, it takes more
time to display the result than computing it even on 10000 points !

On badly shaped point clouds it suffers greatly, I couldn't get it to
compute the circle for 1000 points if I feed it with :

What you call a 'badly shaped point clouds' is in fact an approximate
circle. It is true that this is the worst case for convex hull based
approaches because in this case the convex hull has lots of points.

It would be nice to have an algorithm that produces random
distibutions that form arbitrary polygons. But I guess that's a task
for another quiz.

As I mentioned in one of my comments in the code, right before it goes
into brute force, I prefer using Clarkson's probabilistic aproach here.

So here is my improved version, still based on convex hulls but also
able to deal with circular shaped distributions.

I confident it doesn't do any harm to your poor laptop this time. :wink:

...

g phil

smallest_enclosing_circle2.rb (8.69 KB)

···

On Tue, Feb 19, 2008 at 09:32:16AM +0900, Lionel Bouton wrote:

Hi Matthew,

Another thing... To ease the burden on you for writing the summaries, you
might have the quiz submitter write the summary by default. Since that
person wrote the quiz, hopefully they should be a good person to write the
summary. You could be a backup if they are unable for some reason. To make
this work well, you'd want quiz ideas to come mainly from others instead of
yourself.

Thanks for taking on the new role of quizmaster!

Eric

···

On Wed, Feb 20, 2008 at 3:05 PM, Eric Mahurin <eric.mahurin@gmail.com> wrote:

On Wed, Feb 20, 2008 at 2:49 PM, Matthew Moss <matthew.moss@gmail.com> > wrote:

> Okay, I know that this probably isn't the best first impression I
> could make, but is there someone who would mind doing a summary for
> this week's quiz?

John Joyce wrote:

Matthew, get well soon, I hope you didn't catch the flu that i'm just now recovering from!
I suggest just delaying the summary. It's not the end of the world.
JJ

Same here, get well soon and don't stress yourself over the summary. I think we can wait: the summary is more for people who didn't follow the quiz closely or who will find it searching the web.

Lionel

I'm trying to utilize gnuplot for visualization. With any luck I can
post the visual part without spoiling the quiz. :wink:

Anyone interested?

···

On Feb 17, 8:59 am, Bill Kelly <bi...@cts.com> wrote:

From: "John Joyce" <dangerwillrobinsondan...@gmail.com>

> SUGGESTION: I don't have time to do this quiz, but it looks like fun.
> I for one would encourage anybody to do it with some visual
> representation! (suggested libs: RMagick, Gosu, Rubygame, Ruby/SDL,
> etc...)

Here's an OpenGL/GLUT-based visualizer for the quiz:

http://tastyspleen.net/~billk/ruby/quiz/157-smallest-circle/157_small\.\.\.

--
Alex

And again I reply to my own post.

This is getting a bad habit. Sorry. This slightly modified version
uses the center of the two points with maximum distance as starting
point. For random sets, the results fit quite well most of the time --
on certain sets it's farther off than the mass center version though.

Regards,
Thomas.

def encircle(points) # takes array of Point objects
    points = points.uniq
    return if points.nil? or points.empty?

    return Circle.new(points[0], 0) if points.size == 1

    m, n = points.combination(2).sort_by {|a, b| -distance(a, b)}[0]
    a = Point.new(m.x / 2 + n.x / 2, m.y / 2 + n.y / 2)
    return Circle.new(a, distance(a, m)) unless points.size > 2

    points = points.sort_by {|p| -distance(a, p)}
    f = points[0]
    df = distance(f, a)
    b = med(f, a)
    e = 1e-10
    1000.times do
        db = distance(f, b)
        if points.all? {|p| distance(b, p) <= db + e}
            da = distance(f, a)
            if (da - db).abs <= e
                return Circle.new(b, db)
            else
                a, b = b, med(f, b)
            end
        else
            b = med(a, b)
        end
    end
    raise RuntimeError
end

def med(a, b)
    Point.new((b.x - a.x) / 2.0 + a.x, (b.y - a.y) / 2.0 + a.y)
end

def distance(p1, p2)
    Math.sqrt((p1.x - p2.x) ** 2 + (p1.y - p2.y) ** 2)
end

Philipp Hofmann wrote:

What you call a 'badly shaped point clouds' is in fact an approximate
circle.

In fact initially I tested with a disk (which is an intermediate level before the circle for the number of points on the convex hull), the worst shape is indeed a circle (when you replace ro by a constant in my random method for people still following) : the number of polygone vertices is then the total number of points.

I confident it doesn't do any harm to your poor laptop this time. :wink:
  
The little thing survived ! Clarkson's is good for the worst case (every point on the same circle) which fits common sense (any circle should encircle all points, so the first iteration should win minus rounding errors in floating point computations), it's nearly 3x as fast as the gradient walking in this case. The speed doesn't change too much given various random shape (between 2 to 4s on my laptop). The gradient walking still wins when a small portion of the points are on the convex hull (random on a disk for example: 0.5s vs 4s).

I believe gradient walking should be faster when less than one third of the points are on the convex hull, slower in the other case, but it's a rough estimation based on observed behavior.

One thing that I don't like though is that the total time is difficult to predict. Probabilistic algorithms often have good performance but I'm always uneasy when I'm not sure how much time they can spend before returning and even when they will exhibit bad behaviour. It's a matter of constraints in which the code must fit and I mostly have to return results in a timely manner in most of my projects :slight_smile:

Is there any public paper describing the performance behavior of the algorithm? All scientific papers returned by Google point to paid-subscription materials.

Lionel

PS: is it me or is this becoming more an algorithm-Quiz than a Ruby-Quiz? It's my first Quiz and I was expecting to see some clever use of Ruby, but in this kind of Quiz I don't see much place for elegant code.
Not that I complain: I love to study the performance of algorithms too :slight_smile:

Thanks for the support... I decided, after much lack of cowbell, that
I will delay the summary one week.

···

On Feb 20, 7:19 pm, Lionel Bouton <lionel-subscript...@bouton.name> wrote:

John Joyce wrote:

> Matthew, get well soon, I hope you didn't catch the flu that i'm just
> now recovering from!
> I suggest just delaying the summary. It's not the end of the world.
> JJ

Same here, get well soon and don't stress yourself over the summary. I
think we can wait: the summary is more for people who didn't follow the
quiz closely or who will find it searching the web.

Alex Shulgin wrote:

http://tastyspleen.net/~billk/ruby/quiz/157-smallest-circle/157_small\.\.\.
    
I'm trying to utilize gnuplot for visualization. With any luck I can
post the visual part without spoiling the quiz. :wink:
  
Glad to see that I'm not the sole old-school minded :slight_smile:

Lionel

One thing that I don't like though is that the total time is difficult
to predict. Probabilistic algorithms often have good performance but I'm
always uneasy when I'm not sure how much time they can spend before
returning and even when they will exhibit bad behaviour. It's a matter
of constraints in which the code must fit and I mostly have to return
results in a timely manner in most of my projects :slight_smile:

Is there any public paper describing the performance behavior of the
algorithm? All scientific papers returned by Google point to
paid-subscription materials.

sorry, I haven't come across a public paper of clarkson's algorithm,
either. But it's proven that the algorithm has O(log(n)), n of course
being the number of given points.

I'm not going to do a full proof here, but it's an outline.

You need at max 3 points to describe the smallest enclosing circle. If
you haven't found it yet you increase the chance of every point that is
outside a circle determined by 13 random points each turn, by
doubleing their relative frequencies.

After k iterations at least one of the (at max) three points you are
looking for has a relative frequency of 2**(k/3)

In average the sum of relative frequencies is increased by a factor of
1+3/13 each iteration.

For an increasing k this leads to a paradox: Because at some point the
relative frequency of at least one point 2**(k/3) would exceed the
overall sum of relative frequencies (n*(1+3/13))**k, if the algorithm
hadn't come to an end, yet.

The only reference I found on the net where it is at least partially
explained and where I got the outline for the proof from is this (but
be warned, it's in german):

  http://www-i1.informatik.rwth-aachen.de/~algorithmus/algo42.php

PS: is it me or is this becoming more an algorithm-Quiz than a
Ruby-Quiz? It's my first Quiz and I was expecting to see some clever use
of Ruby, but in this kind of Quiz I don't see much place for elegant
code.
Not that I complain: I love to study the performance of algorithms too :slight_smile:

Even if this one seems to be more about the algorithm, we still do it
in Ruby, right? :wink: Anyway we'll see what our brand new quizmaster will focus
on. :wink:

g phil

···

On Wed, Feb 20, 2008 at 07:20:06AM +0900, Lionel Bouton wrote:

PS: is it me or is this becoming more an algorithm-Quiz than a
Ruby-Quiz? It's my first Quiz and I was expecting to see some clever use
of Ruby, but in this kind of Quiz I don't see much place for elegant code.
Not that I complain: I love to study the performance of algorithms too :slight_smile:

I aim to have a mix... Things that deal with algorithms to see how
people will approach them with Ruby, and things that focus more on
aspects particular aspects of Ruby. And then just some fun things.

Because of the nature of the problem, this quiz probably leaned very
heavily towards the algorithm side... I admit I wanted to see what
folks not familiar with the problem might do to estimate a solution.