Managing Hierarchical Tree-Like Data

Team:
I'm looking for a pragmatic way of creating a tree from a list of
items without knowing the depth of each.

For example:

max_depth = 4 #considers each unique item below the depth of 4 to be one
item_tree = {}
["CarRedFast",
            "CarGreenSlow",
            "Car",
            "CarRedFastAgileSadCheeseBurger",
            "CarRedFastAgileHappyJoy",
            "GreenApplePie",
            "GreenApple"
].each do |i|
            items = i.gsub(/[A-Z][a-z]/) {|i| "_" + i}.gsub(/^_/, "").split("_")
end

This builds an array from items based on the case but many times, I
have needed to display this data in tree form as such:

···

#############################
# Desired output:
            Car(2)
                        Red(1)
                                    Fast(1)
                                                Agile(2)
                                                            HappyJoy
                                                            SadCheeseBurger
                        Green(1)
                                    Slow(0)
            Green(1)
                        Apple(1)
                                    Pie(0)
###############################

I've come across this numerous times and am pretty sure there's a
better way of doing it.

This is pretty much what I have now and it feels SO wrong to write!

#This just feels horrible writing
item_tree = {}
["CarRedFast",
        "CarGreenSlow",
        "Car",
        "CarRedFastAgileSadCheeseBurger",
        "CarRedFastAgileHappyJoy",
        "GreenApplePie",
        "GreenApple"
].each do |i|
        items = i.gsub(/[A-Z][a-z]/) {|i| "_" + i}.gsub(/^_/, "").split("_")
              items.each do |item|
                    if item_tree.include? item[0]
                          if item_tree[item[0]].include? item[1]
                              #yadadada
                          else
                              item_tree[item[0]][item[1]] = {}
                          end
                    else
                          item_tree[item[0]] = {}
                    end
              end
end

Please.. Any suggestions at all.. All comments welcome!! TIA

x1 wrote:

Team:
I'm looking for a pragmatic way of creating a tree from a list of
items without knowing the depth of each.

How about this:
class Hash
  def as_tree( depth=0 )
    map{ |k,subtree|
      me = " "*depth + "#{k} (#{subtree.length})"
      if subtree.empty?
        me
      else
        [ me, subtree.as_tree( depth+1 ) ]
      end
    }.flatten.join( "\n" )
  end
end

hash_of_hashes = lambda { Hash.new {|h,k| h[k] = hash_of_hashes.call} }
item_tree = hash_of_hashes.call

items = [ "CarRedFast", "CarGreenSlow", "Car",
"CarRedFastAgileSadCheeseBurger", "CarRedFastAgileHappyJoy",
"GreenApplePie", "GreenApple" ]
items.each do |item|
  current = item_tree
  item.scan( /[A-Z][a-z]+/ ).each{ |piece|
    current = current[ piece ]
  }
end

puts item_tree.as_tree
#=> Green (1)
#=> Apple (1)
#=> Pie (0)
#=> Car (2)
#=> Red (1)
#=> Fast (1)
#=> Agile (2)
#=> Sad (1)
#=> Cheese (1)
#=> Burger (0)
#=> Happy (1)
#=> Joy (0)
#=> Green (1)
#=> Slow (0)

x1 wrote:

/ ...

Please.. Any suggestions at all..

Sure, no problem. I could write an example of a hash-based tree data
structure that would print itself out in a logical way, but I am having a
problem with your data.

Could you provide a better data example, one that shows its hierarchical
nature more clearly?

In the meantime:

···

-----------------------------------------

#!/usr/bin/ruby -w

# Step 1: create some sample data

array = [
   [ "animal","fish","shark" ],
   [ "animal","fish","trout" ],
   [ "animal","fish","big","whale shark" ],
   [ "animal","moose" ],
   [ "animal","bear" ],
   [ "animal","small","squirrel" ],
   [ "plant","cactus" ],
   [ "plant","corn" ],
   [ "plant","daisy" ],
   [ "plant","big","oak tree" ],
   [ "element","silicon" ],
   [ "element", "dense","uranium" ],
   [ "element","light","helium" ],
   [ "state","Ohio" ],
   [ "state","tiny","Rhode Island" ]
]

# Step 2: assemble data into hash tree

tree = {}

array.each do |record|
   node = tree
   record.each do |field|
      node[field] = {} unless node[field]
      node = node[field]
   end
end

# Step 3: define recursive display function

def showTree(tab,node)
   ts = "\t" * tab
   node.keys.sort.each do |key|
      puts "#{ts}#{key}"
      showTree((tab+1),node[key])
   end
end

# Step 4: display the tree

showTree(0,tree)

-----------------------------------------

Output:

animal
        bear
        fish
                big
                        whale shark
                shark
                trout
        moose
        small
                squirrel
element
        dense
                uranium
        light
                helium
        silicon
plant
        big
                oak tree
        cactus
        corn
        daisy
state
        Ohio
        tiny
                Rhode Island

--
Paul Lutus
http://www.arachnoid.com

x1 wrote:

Team:
I'm looking for a pragmatic way of creating a tree from a list of
items without knowing the depth of each.

Here's a good algorithm for constructing a formal concept lattice.

It's not particularly pragmatic, but it's fast and ultimately
equivalent to what you're trying to do.

'cid 'ooh

Jesus! Give me an hour or so to make sense of this. I'm so glad you
provided this!! A++++++++++++

···

On 10/30/06, Phrogz <gavin@refinery.com> wrote:

x1 wrote:
> Team:
> I'm looking for a pragmatic way of creating a tree from a list of
> items without knowing the depth of each.

How about this:
class Hash
        def as_tree( depth=0 )
                map{ |k,subtree|
                        me = " "*depth + "#{k} (#{subtree.length})"
                        if subtree.empty?
                                me
                        else
                                [ me, subtree.as_tree( depth+1 ) ]
                        end
                }.flatten.join( "\n" )
        end
end

hash_of_hashes = lambda { Hash.new {|h,k| h[k] = hash_of_hashes.call} }
item_tree = hash_of_hashes.call

items = [ "CarRedFast", "CarGreenSlow", "Car",
"CarRedFastAgileSadCheeseBurger", "CarRedFastAgileHappyJoy",
"GreenApplePie", "GreenApple" ]
items.each do |item|
        current = item_tree
        item.scan( /[A-Z][a-z]+/ ).each{ |piece|
                current = current[ piece ]
        }
end

puts item_tree.as_tree
#=> Green (1)
#=> Apple (1)
#=> Pie (0)
#=> Car (2)
#=> Red (1)
#=> Fast (1)
#=> Agile (2)
#=> Sad (1)
#=> Cheese (1)
#=> Burger (0)
#=> Happy (1)
#=> Joy (0)
#=> Green (1)
#=> Slow (0)

Paul Lutus wrote:

array.each do |record|
   node = tree
   record.each do |field|
      node[field] = {} unless node[field]
      node = node[field]
   end
end

I like Paul's simple solution better than my
tricky-but-almost-obfuscated autovivifying Hash. One note (that I don't
think is golfing): the innermost code in the above can be rubyified as:
   node[field] ||= {}
   node = node[field]
or simplified further (golfed?) to:
   node = ( node[field] ||= {} )

Ok --

Phrogz, I find your script very interesting and I'll learn a lot by
going through it over and over. Thank you!

Paul, Sorry for the crappy data (it just happened to be what I was
working w/ at the time)

One of the key items both of you two used which I was not aware of
(and am ignorant for this) is the ability to use:
     current = item_tree #Phrogz
     node = tree # Paul

So what this is doing, is creating a new hash at the new depth,
without having to statically reference it. Nice. Yes, I'm a newb to
OO.

Thanks guys

···

On 10/30/06, x1 <caldridge@gmail.com> wrote:

Jesus! Give me an hour or so to make sense of this. I'm so glad you
provided this!! A++++++++++++

On 10/30/06, Phrogz <gavin@refinery.com> wrote:
> x1 wrote:
> > Team:
> > I'm looking for a pragmatic way of creating a tree from a list of
> > items without knowing the depth of each.
>
> How about this:
> class Hash
> def as_tree( depth=0 )
> map{ |k,subtree|
> me = " "*depth + "#{k} (#{subtree.length})"
> if subtree.empty?
> me
> else
> [ me, subtree.as_tree( depth+1 ) ]
> end
> }.flatten.join( "\n" )
> end
> end
>
> hash_of_hashes = lambda { Hash.new {|h,k| h[k] = hash_of_hashes.call} }
> item_tree = hash_of_hashes.call
>
> items = [ "CarRedFast", "CarGreenSlow", "Car",
> "CarRedFastAgileSadCheeseBurger", "CarRedFastAgileHappyJoy",
> "GreenApplePie", "GreenApple" ]
> items.each do |item|
> current = item_tree
> item.scan( /[A-Z][a-z]+/ ).each{ |piece|
> current = current[ piece ]
> }
> end
>
> puts item_tree.as_tree
> #=> Green (1)
> #=> Apple (1)
> #=> Pie (0)
> #=> Car (2)
> #=> Red (1)
> #=> Fast (1)
> #=> Agile (2)
> #=> Sad (1)
> #=> Cheese (1)
> #=> Burger (0)
> #=> Happy (1)
> #=> Joy (0)
> #=> Green (1)
> #=> Slow (0)
>