Detecting number ranges

I had to write a script this evening to take an unsorted input file of the
form:

database 1 is on server 3
database 8 is on server 7
...

and output it in the form:

server 3 handles database 1 through 7
server 7 handles database 8 through 11

My brain was in procedural mode, and I wrote the following ugliness:

···

---

def output_range(start_db, end_db, server_n)
   puts "server #{server_n} handles database #{start_db} through #{end_db}"
end

open("db_list.txt").readlines.each do |line|
  m = line.match(/^database (.*) is on server (.*)/)
  db_n = m[1].to_i
  server_n = m[2].to_i

  db_servers[db_n] = server_n
end

range_start = 1

db_servers.each_index do |i|
   next if i == 1
     if db_servers[i] != db_servers[i - 1]
       output_range(range_start, i - 1, db_servers[i - 1])
       range_start = i
   end
end

last_db = db_servers.size - 1
output_range(range_start, last_db, db_servers[last_db])
---

which in fact is not only horrid to look at, but has a bug (it doesn't like
it if the input skips over a database number, i.e. db 233 and 235 are
present but 234 is missing).

I feel like there's a much nicer way to express this in Ruby, but can't
think of what it might be...

--
Jay Levitt |
Boston, MA | My character doesn't like it when they
Faster: jay at jay dot fm | cry or shout or hit.
http://www.jay.fm | - Kristoffer

I'm not sure why data like this:

     database 1 is on server 3
     database 8 is on server 3

should produce:

     server 3 handles databases 1 to 8

since server 3 only hands db's 1 and 8.

Or, what to do if there is only one db for the server, say:

    database 1 is on server 3

Should the output be:

    server 3 handles database 1 to 1

Can there be duplicate lines like this:

database 1 is on server 3
database 8 is on server 7
database 1 is on server 3

Also, is the output supposed to be in ascending order by server?

In any case, here is the input I used:

database 8 is on server 7
database 10 is on server 7
database 10 is on server 7
database 5 is on server 9
database 1 is on server 3
database 2 is on server 3
database 133 is on server 3
database 4 is on server 144

and here is the output:

server 3 handles databases 1 to 133
server 7 handles databases 8 to 10
server 9 handles database 5
server 144 handles database 4

require "set"

def output_data(arr)
  arr.each do |elmt|
    min = elmt[1].min
    max = elmt[1].max

    if min == max
      puts "server #{elmt[0]} handles database #{min}"
    else
      puts "server #{elmt[0]} handles databases #{min} to #{max}"
    end
  end
end

servers = Hash.new{|hash, key| hash[key] = Set.new}

File.open("data.txt") do |file|
  file.each_line do |line|
    nums = line.scan(/\d+/)
    servers[nums[1].to_i].add(nums[0].to_i)
  end
end

data = servers.sort
output_data(data)

···

--
Posted via http://www.ruby-forum.com/.

Does this feel nicer?

$ cat db_ranges.rb && ruby db_ranges.rb
# db_ranges.rb
# 26 September 2007

···

On 9/26/07, Jay Levitt <jay+news@jay.fm> wrote:

I had to write a script this evening to take an unsorted input file of the
form:

database 1 is on server 3
database 8 is on server 7
...

and output it in the form:

server 3 handles database 1 through 7
server 7 handles database 8 through 11

I feel like there's a much nicer way to express this in Ruby, but can't
think of what it might be...

#

require 'enumerator'

file =<<END
database 8 is on server 7
database 10 is on server 7
database 9 is on server 7
database 5 is on server 9
database 1 is on server 3
database 2 is on server 3
database 133 is on server 3
database 4 is on server 144
END

dbs = Hash.new {|h,k| h[k] = }

file.each do |line|
m = line.match(/^database (.*) is on server (.*)/)
db_n = m[1].to_i
server_n = m[2].to_i
dbs[server_n] << db_n
end

dbs.each do |server, db|
  result = [[db[0]]]
  db.sort.each_cons(2) {|x, y| if y == x + 1 then result.last << y;
else result << [y]; end}
  result.each {|x| puts "Server #{server} handles databases #{x.first}
to #{x.last}"}
end

Server 144 handles databases 4 to 4
Server 7 handles databases 8 to 10
Server 3 handles databases 1 to 2
Server 3 handles databases 133 to 133
Server 9 handles databases 5 to 5

Kind regards,

Jesus.

Does this do it?

fp_arr = ["db 11 is on sv 3","db 9 is on sv 7","db 13 is on sv 3","db
12 is on sv 3","db 8 is on sv 7","db 18 is on sv 2"]

nums,info = ,{}
fp_arr.each {|x| nums << x.scan(/\d+/)}
sv = nums.map{|x| x[1]}.uniq
sv.each {|t| info[t] = nums.select{|x| x[1] == t}.sort.map {|a| a[0]}}
info.keys.sort.each {|j| print "server #{j} handles database
#{info[j][0]} through #{info[j][-1]}\n"}

#server 2 handles database 18 through 18
#server 3 handles database 11 through 13
#server 7 handles database 8 through 9

Harry

···

On 9/26/07, Jay Levitt <jay+news@jay.fm> wrote:

I had to write a script this evening to take an unsorted input file of the
form:

database 1 is on server 3
database 8 is on server 7
...

and output it in the form:

server 3 handles database 1 through 7
server 7 handles database 8 through 11

--
A Look into Japanese Ruby List in English

I'm not in the mood for hash.

ary = DATA.readlines
ary.map{|s| s[/\d+$/] }.uniq.each{|server|
  db = ary.grep(/ #{server}$/).map{|s|
    s[/\d+/].to_i}.sort
  puts "Server #{server} databases #{db[0]} to #{db[-1]}"
}

__END__
database 8 is on server 7
database 10 is on server 7
database 9 is on server 7
database 5 is on server 9
database 1 is on server 3
database 2 is on server 3
database 133 is on server 3
database 4 is on server 144

···

On Sep 26, 12:16 am, Jay Levitt <jay+n...@jay.fm> wrote:

I had to write a script this evening to take an unsorted input file of the
form:

database 1 is on server 3
database 8 is on server 7
...

and output it in the form:

server 3 handles database 1 through 7
server 7 handles database 8 through 11

My brain was in procedural mode, and I wrote the following ugliness:

---

def output_range(start_db, end_db, server_n)
   puts "server #{server_n} handles database #{start_db} through #{end_db}"
end

open("db_list.txt").readlines.each do |line|
  m = line.match(/^database (.*) is on server (.*)/)
  db_n = m[1].to_i
  server_n = m[2].to_i

  db_servers[db_n] = server_n
end

range_start = 1

db_servers.each_index do |i|
   next if i == 1
     if db_servers[i] != db_servers[i - 1]
       output_range(range_start, i - 1, db_servers[i - 1])
       range_start = i
   end
end

last_db = db_servers.size - 1
output_range(range_start, last_db, db_servers[last_db])
---

which in fact is not only horrid to look at, but has a bug (it doesn't like
it if the input skips over a database number, i.e. db 233 and 235 are
present but 234 is missing).

Or use Set#divide:

require 'set'

ss = Set[*DATA.readlines.map{ |l| l.scan( /\d+/).map{ |s| s.to_i}.reverse}]

ss = ss.divide{ |i, j| i[0] == j[0] && (i[1] - j[1]).abs <= 1}

ss = ss.inject( {}){ |h, s| sa = s.to_a.sort; (h[sa.first[0]] ||= ) << (1
== sa.size ? sa.first[1] : sa.first[1]..sa.last[1]); h}

ss.keys.sort.each{ |k| puts "Server #{k} handles db #{ss[k].sort_by{ |d|
d.is_a?( Range) ? d.begin : d}.join( ', ')}"}

__END__
database 8 is on server 7
database 10 is on server 7
database 9 is on server 7
database 5 is on server 9
database 1 is on server 3
database 2 is on server 3
database 133 is on server 3
database 4 is on server 144

=>
Server 3 handles db 1..2, 133
Server 7 handles db 8..10
Server 9 handles db 5
Server 144 handles db 4

Regards,
R.

···

2007/9/26, Jay Levitt <jay+news@jay.fm>:

I had to write a script this evening to take an unsorted input file of the
form:

database 1 is on server 3
database 8 is on server 7
...

and output it in the form:

server 3 handles database 1 through 7
server 7 handles database 8 through 11

My brain was in procedural mode, and I wrote the following ugliness:

---

def output_range(start_db, end_db, server_n)
   puts "server #{server_n} handles database #{start_db} through
#{end_db}"
end

open("db_list.txt").readlines.each do |line|
  m = line.match(/^database (.*) is on server (.*)/)
  db_n = m[1].to_i
  server_n = m[2].to_i

  db_servers[db_n] = server_n
end

range_start = 1

db_servers.each_index do |i|
   next if i == 1
     if db_servers[i] != db_servers[i - 1]
       output_range(range_start, i - 1, db_servers[i - 1])
       range_start = i
   end
end

last_db = db_servers.size - 1
output_range(range_start, last_db, db_servers[last_db])
---

which in fact is not only horrid to look at, but has a bug (it doesn't
like
it if the input skips over a database number, i.e. db 233 and 235 are
present but 234 is missing).

I feel like there's a much nicer way to express this in Ruby, but can't
think of what it might be...

--
Jay Levitt |
Boston, MA | My character doesn't like it when they
Faster: jay at jay dot fm | cry or shout or hit.
http://www.jay.fm | - Kristoffer

I'm not sure why data like this:

     database 1 is on server 3
     database 8 is on server 3

should produce:

     server 3 handles databases 1 to 8

since server 3 only hands db's 1 and 8.

Sorry, I was being a lazy typist. (In my defense, I only said the input
was "in the form!") Assume that, for the example I gave, the output is
correct and the input has more data than I showed.

Or, what to do if there is only one db for the server, say:

    database 1 is on server 3

Should the output be:

    server 3 handles database 1 to 1

Yep, exactly.

Can there be duplicate lines like this:

database 1 is on server 3
database 8 is on server 7
database 1 is on server 3

Nope, never. The actual input is akin to a list of symbolic NFS links,
something like:

/prod_dir/db1.file -> /mounts/server3/db1.file

So db1.file can only ever point to one place, and can only ever show up in
the input once.

For that same reason, the input's initially unsorted - it's *mostly*
sorted, but alphanumerically (db1, db11, db12, ... db199, db2, db21, etc.)

Also, is the output supposed to be in ascending order by server?

Nope, not necessary. Ascending by DB number probably makes it easier to
manually edit or check, though.

In any case, here is the input I used:

database 8 is on server 7
database 10 is on server 7
database 10 is on server 7
database 5 is on server 9
database 1 is on server 3
database 2 is on server 3
database 133 is on server 3
database 4 is on server 144

and here is the output:

server 3 handles databases 1 to 133
server 7 handles databases 8 to 10
server 9 handles database 5
server 144 handles database 4

Hmm, looks like it assumes the whole range if there are missing elements.
(I guess I didn't specify that behavior, did I?) If a database isn't
listed in the input, it shouldn't be included in ranges in the output.

So the actual output from the above input should be (in any order):

server 3 handles database 1 to 2
server 3 handles databases 133 to 133
server 7 handles databases 8 to 8
server 7 handles databases 10 to 10
server 9 handles databases 5 to 5
server 144 handles databases 4 to 4

I -think- the "missing database" problem is endemic to the Set approach,
right?

Jay

···

On Wed, 26 Sep 2007 16:49:09 +0900, 7stud -- wrote:

require "set"

def output_data(arr)
  arr.each do |elmt|
    min = elmt[1].min
    max = elmt[1].max

    if min == max
      puts "server #{elmt[0]} handles database #{min}"
    else
      puts "server #{elmt[0]} handles databases #{min} to #{max}"
    end
  end
end

servers = Hash.new{|hash, key| hash[key] = Set.new}

File.open("data.txt") do |file|
  file.each_line do |line|
    nums = line.scan(/\d+/)
    servers[nums[1].to_i].add(nums[0].to_i)
  end
end

data = servers.sort
output_data(data)

--
Jay Levitt |
Boston, MA | My character doesn't like it when they
Faster: jay at jay dot fm | cry or shout or hit.
http://www.jay.fm | - Kristoffer

Nice! I think this is the magic I was looking for.

···

On Wed, 26 Sep 2007 17:18:58 +0900, Jesús Gabriel y Galán wrote:

  db.sort.each_cons(2) {|x, y| if y == x + 1 then result.last << y;
else result << [y]; end}

--
Jay Levitt |
Boston, MA | My character doesn't like it when they
Faster: jay at jay dot fm | cry or shout or hit.
http://www.jay.fm | - Kristoffer

Hmm, that fails if I add "db 20 is on sv 2", which should retain "server 2
handles db 18 through 18" but also add "srver 2 handles db 20 through 20".
I can't quite follow what it's doing but I'm going to have to study that
more; map is a powerful tool in general.

···

On Wed, 26 Sep 2007 17:57:16 +0900, Harry Kakueki wrote:

fp_arr = ["db 11 is on sv 3","db 9 is on sv 7","db 13 is on sv 3","db
12 is on sv 3","db 8 is on sv 7","db 18 is on sv 2"]

nums,info = ,{}
fp_arr.each {|x| nums << x.scan(/\d+/)}
sv = nums.map{|x| x[1]}.uniq
sv.each {|t| info[t] = nums.select{|x| x[1] == t}.sort.map {|a| a[0]}}
info.keys.sort.each {|j| print "server #{j} handles database
#{info[j][0]} through #{info[j][-1]}\n"}

#server 2 handles database 18 through 18
#server 3 handles database 11 through 13
#server 7 handles database 8 through 9

--
Jay Levitt |
Boston, MA | My character doesn't like it when they
Faster: jay at jay dot fm | cry or shout or hit.
http://www.jay.fm | - Kristoffer

Definitely wins the "shortest" prize! But, um, no output:

#=> Server databases to

Looks like .each is called only once, with |server| == nil.

Weirder, when I just run:

ary = open("fake_data").readlines
ary.map{|s| s[/\d+$/] }.inspect
return

I get:

#=> wjames.rb:4: unexpected return (LocalJumpError)

I would have thought that ary.map{|s| s[/d+$] } should return an array
itself. Although I'm not clear what that line does - doesn't it look for a
1+ digit number n, and return the nth character of the line?

···

On Wed, 26 Sep 2007 14:57:04 -0700, William James wrote:

I'm not in the mood for hash.

ary = DATA.readlines
ary.map{|s| s[/\d+$/] }.uniq.each{|server|
  db = ary.grep(/ #{server}$/).map{|s|
    s[/\d+/].to_i}.sort
  puts "Server #{server} databases #{db[0]} to #{db[-1]}"
}

--
Jay Levitt |
Boston, MA | My character doesn't like it when they
Faster: jay at jay dot fm | cry or shout or hit.
http://www.jay.fm | - Kristoffer

> I'm not in the mood for hash.

> ary = DATA.readlines
> ary.map{|s| s[/\d+$/] }.uniq.each{|server|
> db = ary.grep(/ #{server}$/).map{|s|
> s[/\d+/].to_i}.sort
> puts "Server #{server} databases #{db[0]} to #{db[-1]}"
> }

Definitely wins the "shortest" prize! But, um, no output:

#=> Server databases to

I just copied, pasted, and ran the code. Output:
Server 7 databases 8 to 10
Server 9 databases 5 to 5
Server 3 databases 1 to 133
Server 144 databases 4 to 4

Change
ary = DATA.readlines
to
ary = DATA.readlines ; p ary
and see what you get.
Your lack of output could be caused by an added space at the end
of each data line.

Looks like .each is called only once, with |server| == nil.

Weirder, when I just run:

ary = open("fake_data").readlines
ary.map{|s| s[/\d+$/] }.inspect
return

I get:

#=> wjames.rb:4: unexpected return (LocalJumpError)

I would have thought that ary.map{|s| s[/d+$] } should return an array
itself. Although I'm not clear what that line does - doesn't it look for a
1+ digit number n, and return the nth character of the line?

irb(main):001:0> "is it foo bar again?"[ /.oo \w+/ ]
=> "foo bar"

s[ /\d+$/ ] returns a sequence of digits at the end of the string (or
just before
"\n" in the string).

Extra spaces won't bother this version:

ary = DATA.readlines.map{|s| s.scan(/\d+/).map{|s| s.to_i}}
p ary
ary.map{|a| a[1] }.uniq.each{|server|
  db = ary.select{|a| a[1]==server}.map{|a| a[0]}.sort
  puts "Server #{server} databases #{db[0]} to #{db[-1]}"
}

__END__
database 8 is on server 7
database 10 is on server 7
database 9 is on server 7
database 5 is on server 9
database 1 is on server 3
database 2 is on server 3
database 133 is on server 3
database 4 is on server 144

···

On Sep 27, 12:00 pm, Jay Levitt <jay+n...@jay.fm> wrote:

On Wed, 26 Sep 2007 14:57:04 -0700, William James wrote:

In that case, it outputs "server 2 handles database 18 to 20".
I also thought that is what you wanted. But after seeing your further
explanation, I see that it is not.
Here is the same code with comments I probably should have included
the first time.

fp_arr = ["db 11 is on sv 3","db 9 is on sv 7","db 13 is on sv 3","db
12 is on sv 3","db 8 is on sv 7","db 18 is on sv 2","db 20 is on sv
2"]

nums,info = ,{}
fp_arr.each {|x| nums << x.scan(/\d+/)}
#p nums #[["11", "3"], ["9", "7"], ["13", "3"], ["12", "3"]....]

sv = nums.map{|x| x[1]}.uniq #["3", "7", "2"] server list
sv.each {|t| info[t] = nums.select{|x| x[1] == t}.sort.map {|a| a[0]}}
#p info #{"7"=>["8", "9"], "2"=>["18", "20"], "3"=>["11", "12", "13"]}

info.keys.sort.each do |j|
print "server #{j} handles database #{info[j][0]} to #{info[j][-1]}\n"
end

#server 2 handles database 18 to 20
#server 3 handles database 11 to 13
#server 7 handles database 8 to 9

Harry

···

On 9/28/07, Jay Levitt <jay+news@jay.fm> wrote:

On Wed, 26 Sep 2007 17:57:16 +0900, Harry Kakueki wrote:

> fp_arr = ["db 11 is on sv 3","db 9 is on sv 7","db 13 is on sv 3","db
> 12 is on sv 3","db 8 is on sv 7","db 18 is on sv 2"]
>
> nums,info = ,{}
> fp_arr.each {|x| nums << x.scan(/\d+/)}
> sv = nums.map{|x| x[1]}.uniq
> sv.each {|t| info[t] = nums.select{|x| x[1] == t}.sort.map {|a| a[0]}}
> info.keys.sort.each {|j| print "server #{j} handles database
> #{info[j][0]} through #{info[j][-1]}\n"}
>
> #server 2 handles database 18 through 18
> #server 3 handles database 11 through 13
> #server 7 handles database 8 through 9

Hmm, that fails if I add "db 20 is on sv 2", which should retain "server 2
handles db 18 through 18" but also add "srver 2 handles db 20 through 20".
I can't quite follow what it's doing but I'm going to have to study that
more; map is a powerful tool in general.

--
A Look into Japanese Ruby List in English

# On Wed, 26 Sep 2007 14:57:04 -0700, William James wrote:
# > ary = DATA.readlines
# > ary.map{|s| s[/\d+$/] }.uniq.each{|server|
# > db = ary.grep(/ #{server}$/).map{|s|
# > s[/\d+/].to_i}.sort
# > puts "Server #{server} databases #{db[0]} to #{db[-1]}"
# > }
# Definitely wins the "shortest" prize!

i was impressed by James and Raf's solutions, so i combined the two, together using 1.9's group_by, so

C:\ruby1.9\bin>ruby test.rb
Server Databases
3 1 to 2, 133
7 8 to 10, 20, 30 to 33
9 5
144 3 to 4

C:\ruby1.9\bin>cat test.rb
require 'set'

module Enumerable
  def ranger
    to_set.divide {|i,j| (i - j).abs == 1}.
    map{|s| s.to_a.sort}.sort_by{|s| s.first}
  end
end

ary = DATA.readlines.map{|s| s.scan(/\d+/).map{|s| s.to_i}}

puts "Server\tDatabases"
ary.group_by{|a| a.last}.sort.each do |server,pairs|
  db=pairs.map{|a| a.first}.ranger.
     map{|a| "#{a.first}#{" to #{a.last}" if a.size>1}"}.join(", ")
  puts "#{server}\t#{db}"
end

__END__
database 8 is on server 7
database 10 is on server 7
database 9 is on server 7
database 5 is on server 9
database 1 is on server 3
database 2 is on server 3
database 133 is on server 3
database 4 is on server 144
database 3 is on server 144
database 20 is on server 7
database 30 is on server 7
database 31 is on server 7
database 32 is on server 7
database 33 is on server 7
C:\ruby1.9\bin>

kind regards -botp

···

From: Jay Levitt [mailto:jay+news@jay.fm]

Ah, close - it's a Windows carriage return which Cygwin doesn't like.
That's what happens when I use some Cygwin and some Windows apps
simultaneously. I was getting

["database 8 is on server 7\r\n",...\

And while "7\n"[/\d+$/] == "7", "7\r\n"[/\d+$/] == nil, at least on Cygwin.

Thanks.

···

On Thu, 27 Sep 2007 10:27:28 -0700, William James wrote:

Change
ary = DATA.readlines
to
ary = DATA.readlines ; p ary
and see what you get.
Your lack of output could be caused by an added space at the end
of each data line.

--
Jay Levitt |
Boston, MA | My character doesn't like it when they
Faster: jay at jay dot fm | cry or shout or hit.
http://www.jay.fm | - Kristoffer

Thanks! For my own educational purposes (and any future newbies reading
this), I renamed a bunch of variables, extracted some temp variables and
came up with the following. It still has one bug (doesn't handle
non-contiguous ranges like [1, 2, 133]), but I think I could fix that on my
own...

db_server_pairs =
server_dbs = {}

# .scan will return an array of matches
DATA.readlines.each do |line|
  db_server_pairs << line.scan(/\d+/).map{|num| num.to_i}
end

servers = db_server_pairs.map{|pair| pair[1]}.uniq
#=> [..., 7, ...]

servers.each do |server|
  my_pairs = db_server_pairs.select{|pair| pair[1] == server}
  #=> [[8, 7], [10, 7], ...]

  server_dbs[server] = my_pairs.sort.map {|pair| pair[0]}
  #=> [8, 9, 10]
end

server_dbs.keys.sort.each do |server|
  first = server_dbs[server][0]
  last = server_dbs[server][-1]
  print "server #{server} handles database #{first} to #{last}\n"
end

__END__
database 8 is on server 7
database 10 is on server 7
database 9 is on server 7
database 5 is on server 9
database 1 is on server 3
database 2 is on server 3
database 133 is on server 3
database 4 is on server 144

···

On Fri, 28 Sep 2007 07:13:39 +0900, Harry Kakueki wrote:

Here is the same code with comments I probably should have included
the first time.

--
Jay Levitt |
Boston, MA | My character doesn't like it when they
Faster: jay at jay dot fm | cry or shout or hit.
http://www.jay.fm | - Kristoffer

That's beautiful! I *love* ranger. And there are a lot of other techiques
I learned from here. Again, I've oversimplified it below for future
reference.

I've never seen "set.divide" before, and it's the key to the whole thing.
Question: Isn't it possible to give set.divide a block that'll give
inconsistent or crazy results? I can't quite think of how, but it's
assuming that the relationships are commutative..

It's also really interesting that I was initially using the group_by from
version 3726 of Rails:

   def group_by
     inject() do |groups, element|
       value = yield(element)
       if (last_group = groups.last) && last_group.first == value
   last_group.last << element
       else
   groups << [value, [element]]
       end
       groups
     end
   end

which resulted in two separate "server 7" groups - yet something later in
the chain healed that rift.

Finally, the function I broke out as "text_range" felt like an Escher
drawing at first...

def text_range(ary)
  "#{ary.first}#{" to #{ary.last}" if ary.size>1}"
end

But I finally realized that the braces must take precedence over the
quotes, so the middle quotes are "inner" quotes, as if it read:

def text_range(ary)
  s = "#{ary.first}"
  s << " to #{ary.last}" if ary.size > 1
  s
end

# botp.rb
require 'set'

# Using server 7 as our example

module Enumerable

  def ranger
    s = to_set
    #=> #<Set: {33, 30, 8, 31, 20, 9, 32, 10}>

    consecutives = s.divide {|i,j| (i - j).abs == 1}
    #=> #<Set: {#<Set: {20}>, #<Set: {8, 9, 10}>,
                #<Set: {33, 30, 31, 32}>}>

    inner_sorted = consecutives.map{|group| group.to_a.sort}
    # each group is sorted internally now
    #=> [[30, 31, 32, 33], [20], [8, 9, 10]]

    inner_sorted.sort_by{|s| s.first}
    # now groups are in sorted order as well
    #=> [[8, 9, 10], [20], [30, 31, 32, 33]]

  end

  def group_by
    inject({}) do |groups, element|
      (groups[yield(element)] ||= ) << element
      groups
    end
  end if RUBY_VERSION < '1.9'

end

def text_range(ary)
  # Note the quoted text - and another #{} - *inside*
  # the #{... if ary.size>1} construct.
  # Who knew you could do such a thing?

  "#{ary.first}#{" to #{ary.last}" if ary.size>1}"
end

db_server_pairs = DATA.readlines.map{|s| s.scan(/\d+/).map{|s| s.to_i}}
#=> [[8, 7], [10, 7], [9, 7], ... ]

pairs_by_server = db_server_pairs.group_by{|pair| pair.last}
#=> [[7, [[8, 7], [10, 7], [9, 7], ... ]], 3, ... ]

puts "Server\tDatabases"
pairs_by_server.sort.each do |server, pairs|
  # server => 7
  # pairs => [[8, 7], [10, 7], [9, 7], ...]

  my_dbs = pairs.map {|pair| pair.first}
  #=> [8, 10, 9, ... ]

  my_ranges = my_dbs.ranger
  #=> [[8, 9, 10], [20], [30, 31, 32, 33]]

  range_text = my_ranges.map{|r| text_range(r)}.join(", ")
  puts "#{server}\t#{range_text}"
end

__END__
database 8 is on server 7
database 10 is on server 7
database 9 is on server 7
database 5 is on server 9
database 1 is on server 3
database 2 is on server 3
database 133 is on server 3
database 4 is on server 144
database 3 is on server 144
database 20 is on server 7
database 30 is on server 7
database 31 is on server 7
database 32 is on server 7
database 33 is on server 7

···

On Fri, 28 Sep 2007 17:30:46 +0900, Peña, Botp wrote:

i was impressed by James and Raf's solutions, so i combined the two, together using 1.9's group_by, so

--
Jay Levitt |
Boston, MA | My character doesn't like it when they
Faster: jay at jay dot fm | cry or shout or hit.
http://www.jay.fm | - Kristoffer

Well I used the following snippet once

  module Enumerable
    def to_ranges
      compact.sort.uniq.inject() do | result, elem |
        result = [elem..elem] if result.length.zero?
        if [result[-1].end, result[-1].end.succ].include?(elem)
          result[-1] = result[-1].begin..elem
        else
          result.push elem..elem
        end
        result
      end
    end

    def member_includes?(item)
      find{|r| r.include?(item)} ? true : false
    end

    def implant(item)
     (map{|e| e.to_a }.flatten + Array(item)).to_ranges
    end

    def implant!(item)
      replace(implant(item))
    end
  end

You can basically make an array and the "collapse" it

···

On 30-sep-2007, at 23:40, Jay Levitt wrote:

Thanks! For my own educational purposes (and any future newbies reading
this), I renamed a bunch of variables, extracted some temp variables and
came up with the following. It still has one bug (doesn't handle
non-contiguous ranges like [1, 2, 133]), but I think I could fix that on my
own...

--
Julian 'Julik' Tarkhanov
please send all personal mail to
me at julik.nl