[QUIZ][SOLUTION] #83 Short but Unique

Here is my quick solution to quiz #83.
First I find all possible abbreviations, then try to find unique ones
for each string. If this isn't possible, then it tries for at least a
different abbreviation for each string. Finally, it falls back to the
heighest weighted abbreviation.

Weighting is currently a bit simplistic, and could probably be improved.
Weight is increased for
  A) including the first character
  B) cutting out 'word boundary' characters (' ', '_', '-', '.')

···

--
Brian Mattern

#Code below:

require 'pp'

module Compressor
  # gets all compressed versions of str
  def self.compress(str, len, ellipses = '...')
    len = len.to_i
    return case
           when str.size <= len
             [str]
           else
             ret = []
             weight = {}
              cutout = str.size - len + ellipses.size - 1
             (0..(len-ellipses.size)).to_a.each do |i|
               a = str.dup
               a[i..(i + cutout)] = ellipses
               w = 0
               w += 1 if i > 0
               [' ', '_', '-', '.'].each do |c|
                w += 1 if str[i..(i+cutout)].include?(c)
               end
               ret << a
               weight[a] = w

             end

             ret.sort{|s1, s2| weight[s2] <=> weight[s1] }
           end
  end

  def self.compress_array(arr, len, ellipses = '...')
    candidates = {}
    arr.each { |s| candidates[s] = self.compress(s, len, ellipses) }

    results = {}
    candidates.each { |k, v|
      # first try to find a completely unique abbreviation
      results[k] = v.find { |s|
        candidates.all? { |k2, v2| k2 == k or !v2.include?(s)}
      }

      # if none was found, just pick one that's different from the other chosen ones
      results[k] = v.find { |s| !results.values.include?(s) } if results[k].nil?

      # if we still don't have one, pick the first one (heighest weighted)
      results[k] = v.first if results[k].nil?
    }
    arr.collect{ |s| results[s] }
  end
end

class Array
  def compress(len = 10)
    Compressor.compress_array(self, len)
  end
end

pp ARGV[1..-1].compress(ARGV[0])

One of the other solutions preferred cutting out the middle of words
(although it also found ambiguous abbreviations when non-ambiguous ones
exist).

To do something similar with mine, replace this line:

  w += 1 if i > 0

with:

  #weight by distance of cutout from middle of word (middle being highest)
  ideal = (str.size - cutout) / 2
  w += i if i <= ideal
  w += (ideal * 2) - i if i > ideal

···

--
brian

On Wed, Jun 21, 2006 at 04:42:28AM +0900, brian.mattern@gmail.com wrote:

Here is my quick solution to quiz #83.
First I find all possible abbreviations, then try to find unique ones
for each string. If this isn't possible, then it tries for at least a
different abbreviation for each string. Finally, it falls back to the
heighest weighted abbreviation.

Weighting is currently a bit simplistic, and could probably be improved.
Weight is increased for
  A) including the first character
  B) cutting out 'word boundary' characters (' ', '_', '-', '.')

--
Brian Mattern

#Code below:

require 'pp'

module Compressor
  # gets all compressed versions of str
  def self.compress(str, len, ellipses = '...')
    len = len.to_i
    return case
           when str.size <= len
             [str]
           else
             ret =
             weight = {}
              cutout = str.size - len + ellipses.size - 1
             (0..(len-ellipses.size)).to_a.each do |i|
               a = str.dup
               a[i..(i + cutout)] = ellipses
               w = 0
               w += 1 if i > 0
               [' ', '_', '-', '.'].each do |c|
                w += 1 if str[i..(i+cutout)].include?(c)
               end
               ret << a
               weight[a] = w

             end

             ret.sort{|s1, s2| weight[s2] <=> weight[s1] }
           end
  end

  def self.compress_array(arr, len, ellipses = '...')
    candidates = {}
    arr.each { |s| candidates[s] = self.compress(s, len, ellipses) }

    results = {}
    candidates.each { |k, v|
      # first try to find a completely unique abbreviation
      results[k] = v.find { |s|
        candidates.all? { |k2, v2| k2 == k or !v2.include?(s)}
      }

      # if none was found, just pick one that's different from the other chosen ones
      results[k] = v.find { |s| !results.values.include?(s) } if results[k].nil?

      # if we still don't have one, pick the first one (heighest weighted)
      results[k] = v.first if results[k].nil?
    }
    arr.collect{ |s| results[s] }
  end
end

class Array
  def compress(len = 10)
    Compressor.compress_array(self, len)
  end
end

pp ARGV[1..-1].compress(ARGV[0])