ParseTree: how to write a quick and dirty dependency analyzer

Original (and prettier) version at:

  http://www.zenspider.com/ZSS/Products/ParseTree/Examples/Dependencies.html

ParseTree can be found at:

  http://rubyforge.org/projects/parsetree/

Dependency Analyzer
A Simple Example using SexpProcessor and ParseTree

** Introduction

ParseTree includes a useful class called SexpProcessor. SexpProcessor
allows you to write very clean ruby language tools that focus on what
you are interested in, and ignore the rest. Below is a very basic
dependency analyzer. It records each class reference along with the
method it was made from, and outputs a simple report.

It doesn't do too much, but you can't expect too much in 60 lines of
ruby (including empty lines)! However, with a little effort, you can
make this into a truly useful ruby analysis tool.

Read the code first; an explanation follows further below.

** The Code, with walkthrough...

Most of the code is rather boring, so I'll only make note of the
interesting or potentially confusing parts.

   #!/usr/local/bin/ruby -w

   old_classes = []
   new_classes = []

   require 'parse_tree'
   require 'sexp_processor'

   ObjectSpace.each_object(Module) { |klass| old_classes << klass }

   class DependencyAnalyzer < SexpProcessor

     attr_reader :dependencies
     attr_accessor :current_class

     def initialize
       super
       self.auto_shift_type = true
       @dependencies = Hash.new { |h,k| h[k] = [] }
       @current_method = nil
       @current_class = nil
     end

This is the front-end for the whole script. It grabs the parse tree
for each class specified and runs it through DependencyAnalyzer. Once
everything is run through, it iterates through the results and prints
a simple report (shown below).

     def self.process(*klasses)
       analyzer = self.new
       klasses.each do |start_klass|
         analyzer.current_class = start_klass
         analyzer.process(ParseTree.new.parse_tree(start_klass))
       end

       deps = analyzer.dependencies
       deps.keys.sort.each do |dep_to|
         dep_from = deps[dep_to]
         puts "#{dep_to} referenced by:\n #{dep_from.uniq.sort.join("\n ")}"
       end
     end

A :defn node is the top level node for a method definition. It
consists of a name, argument list, and a body. We simply grab the name
and record it for accounting purposes in the :const processor.

     def process_defn(exp)
       name = exp.shift
       @current_method = name
       return s(:defn, name, process(exp.shift), process(exp.shift))
     end

A :const node simply specifies what const to access. In some cases it
is a class and in others it is a regular const at the global or module
scope. We ask Object to do a const_get and figure out if it really is
a class or not. If it is, we add it to the dependency list.

     def process_const(exp)
       name = exp.shift
       const = "#{@current_class}.#{@current_method}"
       is_class = (Object.const_get(name) rescue nil).is_a?(Module)
       @dependencies[name] << const if is_class
       return s(:const, name)
     end
   end
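
The class-or-not check described for :const nodes can be tried in
isolation. This is only a sketch: class_or_module? is a hypothetical
helper name, not part of ParseTree or of the script above, and it
treats modules as well as classes as "class-like" on the assumption
that both are interesting as dependencies.

```ruby
# Hypothetical helper (not part of ParseTree): true when `name` resolves,
# from Object's scope, to a class or module rather than an ordinary value.
def class_or_module?(name)
  value = begin
    Object.const_get(name)
  rescue NameError
    nil                      # unknown or malformed constant name
  end
  value.is_a?(Module)        # Class inherits from Module, so both match
end

p class_or_module?(:String)        # a class
p class_or_module?(:Kernel)        # a module
p class_or_module?(:RUBY_VERSION)  # a plain String constant
p class_or_module?(:NoSuchThing)   # not defined at all
```

Note that merely testing the result for nil would also accept ordinary
constants such as RUBY_VERSION, which is why the sketch asks is_a?(Module).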

Finally, we require all the files specified on the command-line, find
all the new classes introduced, and then tell DependencyAnalyzer to
process them.

   if __FILE__ == $0 then
     ARGV.each { |name| require name }
     ObjectSpace.each_object(Module) { |klass| new_classes << klass }
     DependencyAnalyzer.process(*(new_classes - old_classes))
   end
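
The before/after snapshot trick used here can be seen on its own,
without ParseTree. In this sketch the two constants stand in for
whatever a require would have defined; FreshlyLoaded and FreshHelper
are made-up names used only for illustration.

```ruby
# Snapshot every Module that exists right now (classes are Modules too).
before = []
ObjectSpace.each_object(Module) { |mod| before << mod }

# Stand-in for `require name`: define something new at runtime.
# Both names are hypothetical and exist only for this sketch.
Object.const_set(:FreshlyLoaded, Class.new)
Object.const_set(:FreshHelper, Module.new)

# Snapshot again and subtract to find what was introduced in between.
after = []
ObjectSpace.each_object(Module) { |mod| after << mod }
introduced = after - before

# Anonymous modules (such as singleton classes) have no name; drop them.
names = introduced.map { |mod| mod.name }.compact.sort
puts names
```

The subtraction may also pick up unnamed helper classes created behind
the scenes, so the sketch filters on name before printing.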

** Details

*** SexpProcessor Basics

SexpProcessor has one basic method in it, process. That method takes a
sexp and generally returns a sexp. Internally, it looks at the type of
node it is currently processing, and either generically processes it,
or finds a custom processor that is implemented in a subclass and
dispatches to it. In the example above, process_const and process_defn
are examples of custom processors.

There are more features to process, but that is pretty much it for
basic usage. Read the code if you'd like, it isn't large (about 300
lines).
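
To make the dispatch idea concrete, here is a toy version written from
scratch. It is not the real SexpProcessor (MiniProcessor and
ConstCollector are invented for this sketch, and plain arrays stand in
for sexps), but it follows the same pattern: look at the node type,
call process_<type> if a subclass defines it, otherwise process the
children generically.

```ruby
# A toy stand-in for the dispatch idea; NOT the real SexpProcessor API.
class MiniProcessor
  # A node is an Array whose first element is the node type.
  def process(node)
    return node unless node.is_a?(Array)  # leaves pass through untouched
    type, *body = node
    handler = "process_#{type}"
    if respond_to?(handler, true)
      send(handler, body)                 # custom processor gets the body
    else
      [type, *body.map { |child| process(child) }]  # generic: recurse
    end
  end
end

# Only :const nodes are interesting here; everything else is generic.
class ConstCollector < MiniProcessor
  attr_reader :seen

  def initialize
    @seen = []
  end

  def process_const(body)
    @seen << body.first
    [:const, body.first]
  end
end

tree = [:defn, :greet, [:call, [:const, :Kernel], :puts, [:const, :String]]]
collector = ConstCollector.new
result = collector.process(tree)
p collector.seen
p result == tree
```

The subclass never mentions :defn or :call, yet the constants nested
inside them are still found, because the generic branch keeps walking.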

*** DependencyAnalyzer Details

In short, process_defn simply records the current method name, and
process_const records all constant accesses. It just so happens that
all class and module references are actually const references. Finally,
DependencyAnalyzer.process hooks everything together and then prints a
readable report when done.

Not much eh? That is a good thing in my opinion. It means that in 60
lines you can make a quick and dirty tool. Imagine what you can do in
300 lines...
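
The bookkeeping behind the report is small enough to try by itself.
The sketch below feeds a few made-up references (not real analyzer
output) into the same kind of auto-initializing hash and formats them
the way DependencyAnalyzer.process does.

```ruby
# Dependency table: referenced name => list of "Class.method" referrers.
# The block gives every new key its own empty array on first access.
deps = Hash.new { |hash, key| hash[key] = [] }

# Hypothetical references, in the shape process_const would record them.
[
  [:Sexp,  "Rewriter.process_call"],
  [:Array, "Type.unify"],
  [:Sexp,  "TypedSexp.sexp_types"],
  [:Sexp,  "Rewriter.process_call"],  # duplicate, removed by uniq below
].each { |name, referrer| deps[name] << referrer }

# One report entry per referenced name, referrers deduplicated and sorted.
report = deps.keys.sort.map do |name|
  "#{name} referenced by:\n  #{deps[name].uniq.sort.join("\n  ")}"
end
puts report
```

Deduplicating at report time rather than at insert time keeps
process_const to a single append.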

*** Example Output

Let's run our dependency analyzer against ruby2c, the real motivation
behind the ParseTree framework:

   % ./deps.rb rewriter support ruby_to_c type_checker typed_sexp_processor
   ArgumentError referenced by:
     CompositeSexpProcessor.<<
   Array referenced by:
     RubyToC.process_if
     Type.unify
   Fixnum referenced by:
     TypeChecker.process_lit
   Object referenced by:
     DependencyAnalyzer.process_const
   PP referenced by:
     PP::ObjectMixin.pretty_print_inspect
     PP::PPMethods.pp
   Sexp referenced by:
     Rewriter.process_call
     (6 other Rewriter.process_* methods)
     TypedSexp.sexp_types
   SexpProcessor referenced by:
     CompositeSexpProcessor.<<
   Symbol referenced by:
     PP::PPMethods.pp_object
   Thread referenced by:
     PP::PPMethods.guard_inspect_key
     PP::PPMethods.pp
   Type referenced by:
     RubyToC.process_call
     TypeChecker.bootstrap
     (24 other TypeChecker.process_* methods)
   TypeError referenced by:
     FunctionType.unify_components
     Type.unify
   TypedSexp referenced by:
     R2CRewriter.process_call
     TypedSexp.==

** Unadulterated Code

This code will also be included in the next release of ParseTree.

   #!/usr/local/bin/ruby -w

   old_classes = []; new_classes = []

   require 'pp'
   require 'parse_tree'
   require 'sexp_processor'

   ObjectSpace.each_object(Module) { |klass| old_classes << klass }

   class DependencyAnalyzer < SexpProcessor

     attr_reader :dependencies
     attr_accessor :current_class

     def initialize
       super
       self.auto_shift_type = true
       @dependencies = Hash.new { |h,k| h[k] = [] }
       @current_method = nil
       @current_class = nil
     end

     def self.process(*klasses)
       analyzer = self.new
       klasses.each do |start_klass|
         analyzer.current_class = start_klass
         analyzer.process(ParseTree.new.parse_tree(start_klass))
       end

       deps = analyzer.dependencies
       deps.keys.sort.each do |dep_to|
         dep_from = deps[dep_to]
         puts "#{dep_to} referenced by:\n #{dep_from.uniq.sort.join("\n ")}"
       end
     end

     def process_defn(exp)
       name = exp.shift
       @current_method = name
       return s(:defn, name, process(exp.shift), process(exp.shift))
     end

     def process_const(exp)
       name = exp.shift
       const = "#{@current_class}.#{@current_method}"
       is_class = (Object.const_get(name) rescue nil).is_a?(Module)
       @dependencies[name] << const if is_class
       return s(:const, name)
     end
   end

   if __FILE__ == $0 then
     ARGV.each { |name| require name }
     ObjectSpace.each_object(Module) { |klass| new_classes << klass }
     DependencyAnalyzer.process(*(new_classes - old_classes))
   end


[root@localhost root]# gem list -r

*** REMOTE GEMS ***
Updating Gem source index for: http://gems.rubyforge.org
/usr/local/lib/ruby/1.8/yaml.rb:39: [BUG] rb_gc_mark(): unknown data type 0x38(0x609840) non object
ruby 1.8.1 (2003-12-25) [i686-linux]

Aborted

But if I do the same as a non-root user, gem doesn't throw any error.

What is the issue here?

Thanks,
Mohammad

On Fri, Nov 19, 2004 at 12:24:03AM +0900, Mohammad Khan wrote:

> [root@localhost root]# gem list -r
>
> *** REMOTE GEMS ***
> Updating Gem source index for: http://gems.rubyforge.org
> /usr/local/lib/ruby/1.8/yaml.rb:39: [BUG] rb_gc_mark(): unknown data type 0x38(0x609840) non object
> ruby 1.8.1 (2003-12-25) [i686-linux]
>
> Aborted
>
> But if I do the same as a non-root user, gem doesn't throw any error.
>
> What is the issue here?

Probably one of the many YAML/syck bugs fixed around May.
Could you try a newer version (such as a stable snapshot)?

--
Hassle-free packages for Ruby?
RPA is available from http://www.rubyarchive.org/

> /usr/local/lib/ruby/1.8/yaml.rb:39: [BUG] rb_gc_mark(): unknown data type 0x38(0x609840) non object
> ruby 1.8.1 (2003-12-25) [i686-linux]

update your version of ruby

Guy Decoux

On Thu, 2004-11-18 at 10:36, ts wrote:

> > /usr/local/lib/ruby/1.8/yaml.rb:39: [BUG] rb_gc_mark(): unknown data type 0x38(0x609840) non object
> > ruby 1.8.1 (2003-12-25) [i686-linux]
>
> update your version of ruby
>
> Guy Decoux

Isn't 1.8.1 the latest stable version of ruby?

--
MOhammad

> Isn't 1.8.1 the latest stable version of ruby?

Yes, it's stable, the proof : you have retrieved the bug signaled in
[ruby-talk:100621] and corrected.

Guy Decoux

On Thu, 2004-11-18 at 10:51, ts wrote:

> > Isn't 1.8.1 the latest stable version of ruby?
>
> Yes, it's stable, the proof : you have retrieved the bug signaled in
> [ruby-talk:100621] and corrected.
>
> Guy Decoux

I am using the yaml that came with ruby 1.8.1

[mkhan@localhost 1.8]$ head yaml.rb
# -*- mode: ruby; ruby-indent-level: 4; tab-width: 4 -*- vim: sw=4 ts=4
# $Id: yaml.rb,v 1.9 2003/12/20 02:40:15 nobu Exp $
#
# YAML.rb
#
# Loads the parser/loader and emitter/writer.
#

Am I using the latest stable version of yaml?

--
MOhammad

> Am I using the latest stable version of yaml?

The bug is in the extension syck used by yaml

Guy Decoux

On Thu, 2004-11-18 at 11:16, ts wrote:

> > Am I using the latest stable version of yaml?
>
> The bug is in the extension syck used by yaml
>
> Guy Decoux

Are you saying there is a bug in the latest stable version of yaml?
If not, what is the URL to get the latest version of yaml?

Thanks,
Mohammad

> Are you saying there is a bug in the latest stable version of yaml?

There is a bug in the version of yaml distributed with ruby-1.8.1

> If not, what is the URL to get the latest version of yaml?

Good question :-)

Guy Decoux

Mohammad Khan wrote:

> Are you saying there is a bug in the latest stable version of yaml?
> If not, what is the URL to get the latest version of yaml?

I think you should try the latest 'stable snapshot', which is 1.8.1 from 2003 + bugfixes:
ftp://ftp.ruby-lang.org/pub/ruby/stable-snapshot.tar.gz