I found a heuristic solution that has complexity O(n*log(n)) per

step (for 1000 elements and 16 areas, I got a good solution in 500

steps).

The code has some constraints that are easy to remove:

- All areas must have exactly the same amount of elements

- Distances calculated assuming flat space (just rewrite Point's

function distance_to)

It's easy to customize heuristic params and remove those

constraints

Ps.: Not sure about the complexity, you can check yourself, because

I solved the problem during a sleepless, but it's gone now and I'm

going to bed. If you have some doubts ask me and I'll answer

tomorrow...

Solution:

class Point

attr_accessor :x, :y, :distances, :at

def initialize(x, y)

@x, @y = x.to_f, y.to_f

end

def distance_to(p)

Math.sqrt sum_of_squares_to(p)

end

def sum_of_squares_to(p)

(self.x-p.x)**2 + (self.y-p.y)**2

end

def calc_distances_to(areas)

@distances = areas.inject({}){ |d, a| d[a] = distance_to a.mean;

d }

end

def self.new_by_rand

Point.new(rand(100), rand(100))

end

def to_s

"(#{x},#{y})"

end

alias :inspect :to_s

end

class Area < Array

def mean

@mean || mean!

end

def entropy

self.inject(0) do |ent, p|

ent += mean.sum_of_squares_to p

end

end

def mean!

ps = self.inject( Point.new(0, 0) ) do |ps, p|

ps.x += p.x

ps.y += p.y

ps

end

ps.x = ps.x/self.size

ps.y = ps.y/self.size

self.sort_by!{ |p| p.distance_to ps }

@mean = ps

end

def fix!

self.each do |p|

p.at = self

end

end

def self.refine(areas, points, n)

area_size = areas.first.size

refresh_distances(areas, points)

doubt_points = areas.inject() do |dp, area|

dp.concat area[n..-1]

end

doubt_points.each do |point|

point.at.delete point

point.at = nil

end

doubt_points.each do |point|

ordered_areas = point.distances.sort_by{ |k, v| v }

nearest_area, distance = ordered_areas.pop

until(nearest_area.size < area_size)

nearest_area, distance = ordered_areas.pop

end

nearest_area << point

point.at = nearest_area

end

end

def self.refresh_distances(areas, points)

areas.each &:mean!

points.each do |p|

p.calc_distances_to areas

end

end

def self.all_areas_same(areas_1, areas_2)

areas_1.each_with_index do |a, i|

return false if areas_2[i] != a

end

return true

end

def self.refine_until_good_solution(areas, points, good_ratio =

0.01, max = 10_000, n = (areas.first.size*0.9).to_i, doubt_region =

max*0.05)

areas, points = areas.clone, points.clone

current_entropy = last_entropy = areas.inject(0){ |ent, a| ent +=

a.entropy }

enhance = 10*good_ratio

count = 0

best_solution = areas.clone

best_entropy = current_entropy

while( count < doubt_region || ( count < max && enhance >

good_ratio ) )

last_entropy = current_entropy

Area.refine areas, points, n

current_entropy = areas.inject(0){ |ent, a| ent += a.entropy }

enhance = (current_entropy - best_entropy)/best_entropy

if best_entropy > current_entropy

best_entropy = current_entropy

best_solution = areas.clone

end

count += 1

puts "Step: #{count}, Enhance: #{enhance}, Entropy:

#{current_entropy}"

end

best_solution.each &:fix!

[ best_solution, best_entropy ]

end

end

# Example:

points =

areas = 16.times.map do |n|

area = Area.new

(1000/16).times do |i|

p = Point.new_by_rand

p.at = area

area << p

points << p

end

area

end

Area.refine_until_good_solution(areas, points)

## ···

On 16 set, 20:08, Markus Schirp <m...@seonic.net> wrote:

Hi,

I'm not calling myself an expert on those problems, but here are my toughs:

In a first step you can classify each person to a "neighbor group", this

group could include persons in a specified region.

So the rest of your solution can work with "group" distances instead of

inter person distances. When the "neighbor groups" include enough

persons you'll save computational resources.

You could vary the "neighbor group" region size on demand, in your

example Southern California would/should get twice as much neighbor

groups as NY.

Than your algorithm "just" has to choose "target groups" with no

"neighbor group" duplicates to get an "initial solution".

Now you can start optimizing the "total-distance", just randomly (or

more intelligently) interchange members between the target groups and

calculate the total distance. Once you are happy you are done.

Possible Strategy for selecting an group member to interchange:

Look at group A, find two members A1 and A2 where (group) distance

between A1, and A2 is minimal. Try to push A1 to another group without

negatively affecting the "total-distance", if okay: proceed with group

B, if not try to push A2, and so on.

Additionally you can apply any scheme from:

Combinatorial optimization - Wikipedia

Once I have to implement an Vehicle Routing Problem Solver in *evil* VBA

(Bachelor Thesis for a friend), I did not do the math part, but it was

an interesting challenge.

Please let me know how you solved your problem, I'll have to solve one

similar in the future

Regards,

Markus

On 09/16/2010 11:09 PM, J.R. Gutierrez wrote:

> I am trying to create a program that maximizes the distances of

> addresses in Y amount of groups, essentially trying to decrease the

> chance that you are in the same group as your neighbors.

> Here is an example n it's simplest form. Say there are four people, two

> living in Los Angeles, one in New York, and one in Chicago that needed

> to be divided into two groups. The logical grouping would be {LA, CHI}

> and {LA, NY).

> Now a more complex example. Imagine 1000 people, 400 of which reside in

> the Southern California area, 200 in the NY area, 100 in the Northern

> California area, and the rest peppered in the area within the country.

> We need to separate them into 16 groups. Ideally, depending on

> distances, there would be 400/16 SoCal residents in a group.

> I've been trying hard to find out how to characterize this problem in a

> way that can be calculated with a computer using Ruby. The only way I

> can get an optimal answer by brute-forcing the distance between X people

> (X! calculations) and then doing something with that information to

> divide them into evenly distributed groups. I've also tried to think of

> this as reverse gravity problem, where the closer two people live to

> each other, the more they repel each other when deciding groups.

> So my question is, is there any kind of similar problem where I can

> borrow some kind of algorithm to compute an optimal answer? Or can

> anyone at least point me in the right direction?