NP-Hard Geolocatoin Problem

I am trying to create a program that maximizes the distances of
addresses in Y amount of groups, essentially trying to decrease the
chance that you are in the same group as your neighbors.

Here is an example n it's simplest form. Say there are four people, two
living in Los Angeles, one in New York, and one in Chicago that needed
to be divided into two groups. The logical grouping would be {LA, CHI}
and {LA, NY).

Now a more complex example. Imagine 1000 people, 400 of which reside in
the Southern California area, 200 in the NY area, 100 in the Northern
California area, and the rest peppered in the area within the country.
We need to separate them into 16 groups. Ideally, depending on
distances, there would be 400/16 SoCal residents in a group.

I've been trying hard to find out how to characterize this problem in a
way that can be calculated with a computer using Ruby. The only way I
can get an optimal answer by brute-forcing the distance between X people
(X! calculations) and then doing something with that information to
divide them into evenly distributed groups. I've also tried to think of
this as reverse gravity problem, where the closer two people live to
each other, the more they repel each other when deciding groups.

So my question is, is there any kind of similar problem where I can
borrow some kind of algorithm to compute an optimal answer? Or can
anyone at least point me in the right direction?

···

--
Posted via http://www.ruby-forum.com/.

Hi,

I'm not calling myself an expert on those problems, but here are my toughs:

In a first step you can classify each person to a "neighbor group", this group could include persons in a specified region.

So the rest of your solution can work with "group" distances instead of inter person distances. When the "neighbor groups" include enough persons you'll save computational resources.

You could vary the "neighbor group" region size on demand, in your example Southern California would/should get twice as much neighbor groups as NY.

Than your algorithm "just" has to choose "target groups" with no "neighbor group" duplicates to get an "initial solution".

Now you can start optimizing the "total-distance", just randomly (or more intelligently) interchange members between the target groups and calculate the total distance. Once you are happy you are done.

Possible Strategy for selecting an group member to interchange:

Look at group A, find two members A1 and A2 where (group) distance between A1, and A2 is minimal. Try to push A1 to another group without negatively affecting the "total-distance", if okay: proceed with group B, if not try to push A2, and so on.

Additionally you can apply any scheme from:

Once I have to implement an Vehicle Routing Problem Solver in *evil* VBA (Bachelor Thesis for a friend), I did not do the math part, but it was an interesting challenge.

Please let me know how you solved your problem, I'll have to solve one similar in the future :wink:

Regards,

Markus

···

On 09/16/2010 11:09 PM, J.R. Gutierrez wrote:

I am trying to create a program that maximizes the distances of
addresses in Y amount of groups, essentially trying to decrease the
chance that you are in the same group as your neighbors.

Here is an example n it's simplest form. Say there are four people, two
living in Los Angeles, one in New York, and one in Chicago that needed
to be divided into two groups. The logical grouping would be {LA, CHI}
and {LA, NY).

Now a more complex example. Imagine 1000 people, 400 of which reside in
the Southern California area, 200 in the NY area, 100 in the Northern
California area, and the rest peppered in the area within the country.
We need to separate them into 16 groups. Ideally, depending on
distances, there would be 400/16 SoCal residents in a group.

I've been trying hard to find out how to characterize this problem in a
way that can be calculated with a computer using Ruby. The only way I
can get an optimal answer by brute-forcing the distance between X people
(X! calculations) and then doing something with that information to
divide them into evenly distributed groups. I've also tried to think of
this as reverse gravity problem, where the closer two people live to
each other, the more they repel each other when deciding groups.

So my question is, is there any kind of similar problem where I can
borrow some kind of algorithm to compute an optimal answer? Or can
anyone at least point me in the right direction?

I found a heuristic solution that has complexity O(n*log(n)) per
step (for 1000 elements and 16 areas, I got a good solution in 500
steps).

   The code has some constraints that are easy to remove:
   - All areas must have exactly the same amount of elements
   - Distances calculated assuming flat space (just rewrite Point's
function distance_to)

   It's easy to customize heuristic params and remove those
constraints

   Ps.: Not sure about the complexity, you can check yourself, because
I solved the problem during a sleepless, but it's gone now and I'm
going to bed. If you have some doubts ask me and I'll answer
tomorrow...

Solution:

class Point
  attr_accessor :x, :y, :distances, :at
  def initialize(x, y)
    @x, @y = x.to_f, y.to_f
  end

  def distance_to(p)
    Math.sqrt sum_of_squares_to(p)
  end

  def sum_of_squares_to(p)
    (self.x-p.x)**2 + (self.y-p.y)**2
  end

  def calc_distances_to(areas)
    @distances = areas.inject({}){ |d, a| d[a] = distance_to a.mean;
d }
  end

  def self.new_by_rand
    Point.new(rand(100), rand(100))
  end

  def to_s
    "(#{x},#{y})"
  end
  alias :inspect :to_s
end

class Area < Array
  def mean
    @mean || mean!
  end

  def entropy
    self.inject(0) do |ent, p|
      ent += mean.sum_of_squares_to p
    end
  end

  def mean!
    ps = self.inject( Point.new(0, 0) ) do |ps, p|
      ps.x += p.x
      ps.y += p.y
      ps
    end
    ps.x = ps.x/self.size
    ps.y = ps.y/self.size
    self.sort_by!{ |p| p.distance_to ps }
    @mean = ps
  end

  def fix!
    self.each do |p|
      p.at = self
    end
  end

  def self.refine(areas, points, n)
    area_size = areas.first.size
    refresh_distances(areas, points)
    doubt_points = areas.inject() do |dp, area|
      dp.concat area[n..-1]
    end

    doubt_points.each do |point|
      point.at.delete point
      point.at = nil
    end

    doubt_points.each do |point|
      ordered_areas = point.distances.sort_by{ |k, v| v }
      nearest_area, distance = ordered_areas.pop
      until(nearest_area.size < area_size)
        nearest_area, distance = ordered_areas.pop
      end
      nearest_area << point
      point.at = nearest_area
    end
  end

  def self.refresh_distances(areas, points)
    areas.each &:mean!
    points.each do |p|
      p.calc_distances_to areas
    end
  end

  def self.all_areas_same(areas_1, areas_2)
    areas_1.each_with_index do |a, i|
      return false if areas_2[i] != a
    end
    return true
  end

  def self.refine_until_good_solution(areas, points, good_ratio =
0.01, max = 10_000, n = (areas.first.size*0.9).to_i, doubt_region =
max*0.05)
    areas, points = areas.clone, points.clone
    current_entropy = last_entropy = areas.inject(0){ |ent, a| ent +=
a.entropy }
    enhance = 10*good_ratio
    count = 0
    best_solution = areas.clone
    best_entropy = current_entropy
    while( count < doubt_region || ( count < max && enhance >
good_ratio ) )
      last_entropy = current_entropy
      Area.refine areas, points, n
      current_entropy = areas.inject(0){ |ent, a| ent += a.entropy }
      enhance = (current_entropy - best_entropy)/best_entropy
      if best_entropy > current_entropy
        best_entropy = current_entropy
        best_solution = areas.clone
      end
      count += 1
      puts "Step: #{count}, Enhance: #{enhance}, Entropy:
#{current_entropy}"
    end
    best_solution.each &:fix!
    [ best_solution, best_entropy ]
  end
end

# Example:
points =
areas = 16.times.map do |n|
  area = Area.new
  (1000/16).times do |i|
    p = Point.new_by_rand
    p.at = area
    area << p
    points << p
  end
  area
end
Area.refine_until_good_solution(areas, points)

···

On 16 set, 20:08, Markus Schirp <m...@seonic.net> wrote:

Hi,

I'm not calling myself an expert on those problems, but here are my toughs:

In a first step you can classify each person to a "neighbor group", this
group could include persons in a specified region.

So the rest of your solution can work with "group" distances instead of
inter person distances. When the "neighbor groups" include enough
persons you'll save computational resources.

You could vary the "neighbor group" region size on demand, in your
example Southern California would/should get twice as much neighbor
groups as NY.

Than your algorithm "just" has to choose "target groups" with no
"neighbor group" duplicates to get an "initial solution".

Now you can start optimizing the "total-distance", just randomly (or
more intelligently) interchange members between the target groups and
calculate the total distance. Once you are happy you are done.

Possible Strategy for selecting an group member to interchange:

Look at group A, find two members A1 and A2 where (group) distance
between A1, and A2 is minimal. Try to push A1 to another group without
negatively affecting the "total-distance", if okay: proceed with group
B, if not try to push A2, and so on.

Additionally you can apply any scheme from:

Combinatorial optimization - Wikipedia

Once I have to implement an Vehicle Routing Problem Solver in *evil* VBA
(Bachelor Thesis for a friend), I did not do the math part, but it was
an interesting challenge.

Please let me know how you solved your problem, I'll have to solve one
similar in the future :wink:

Regards,

Markus

On 09/16/2010 11:09 PM, J.R. Gutierrez wrote:

> I am trying to create a program that maximizes the distances of
> addresses in Y amount of groups, essentially trying to decrease the
> chance that you are in the same group as your neighbors.

> Here is an example n it's simplest form. Say there are four people, two
> living in Los Angeles, one in New York, and one in Chicago that needed
> to be divided into two groups. The logical grouping would be {LA, CHI}
> and {LA, NY).

> Now a more complex example. Imagine 1000 people, 400 of which reside in
> the Southern California area, 200 in the NY area, 100 in the Northern
> California area, and the rest peppered in the area within the country.
> We need to separate them into 16 groups. Ideally, depending on
> distances, there would be 400/16 SoCal residents in a group.

> I've been trying hard to find out how to characterize this problem in a
> way that can be calculated with a computer using Ruby. The only way I
> can get an optimal answer by brute-forcing the distance between X people
> (X! calculations) and then doing something with that information to
> divide them into evenly distributed groups. I've also tried to think of
> this as reverse gravity problem, where the closer two people live to
> each other, the more they repel each other when deciding groups.

> So my question is, is there any kind of similar problem where I can
> borrow some kind of algorithm to compute an optimal answer? Or can
> anyone at least point me in the right direction?

I found a heuristic solution that has complexity O(n*log(n)) per
step (for 1000 elements and 16 areas, I got a good solution in 500
steps).

Erratum: complexity is O(n^2*log(n))
*but continue considering the lack of sleep...