Making a simple parser

Hi all,

To automate some of the tests I have to run, I decided to use Ruby to
generate some script files in a particular (very simple) language, based
on several possible input files. My first approach was to use the
inherited hook on a parent class (which I called InputFormat) to hold
all the children in an array. Then each format could become a
child class of InputFormat and return a known data format (I decided to
use an array of hashes because the output is really, really simple) to
the output generator code.

So the idea is something like:

class InputFormat
  @children = []

  def initialize(input)
    @input = input
  end

  def parse
    @children.each { |child|
      child.parse(@input) if child.supported?(@input)
    }
  end

  def self.inherited(child)
    @children << child
  end
end

class AInputFormat < InputFormat
  def supported?
    # check if we can parse this type of file
  end

  def parse
    # parse and generate array of hashes in known format
  end
end

Then on the core file I would have something like:

input = InputFormat.new(ARGV[0])
input.parse

As it turns out, this isn't working because AInputFormat will only
inherit from InputFormat at the time I actually use it, am I right? Any
tips you guys could give me to achieve what I want (generating one
output format from several possible input formats)?

--
Posted via http://www.ruby-forum.com/.

Felipe Balbi wrote in post #993252:

Hi all,

To automate some of the tests I have to run, I decided to use ruby to
generate some script files on a particular (very simple) language based
on several possible input files. My first approach at this was to use
inherited method on a parent class (which I called InputFormat) to hold
all the children in an array. Then, different formats could become a
child class of InputFormat and return a known data format (I decided to
use an array of hashes because the output is really really simple) to
the output generator code.

So the idea is something like:

class InputFormat
  @children = []

  def initialize(input)
    @input = input
  end

  def parse
    @children.each { |child|
      child.parse(@input) if child.supported?(@input)
    }
  end

  def self.inherited(child)
    @children << child
  end
end

class AInputFormat < InputFormat
  def supported?
    # check if we can parse this type of file
  end

  def parse
    # parse and generate array of hashes in known format
  end
end

Then on the core file I would have something like:

input = InputFormat.new(ARGV[0])
input.parse

As it turns out, this isn't working because AInputFormat will only
inherit from InputFormat at the time I actually use it, am I right ?

No:

1)
class A
  def self.inherited(child)
    puts 'inherited called'
  end
end

class B < A
end

--output:--
inherited called

2)
class A
  def self.inherited(child)
    puts 'inherited called'
  end
end

B = Class.new(A)

--output:--
inherited called


hi felipe,

  well, i don't know how many possible input types you have, but if
they're not too very many, you could try a very different approach:

## for this test i created three files, "input.eng," "input.esp," and
"input.fr," each with a few lines of random text...

class Parser
  attr_reader :output

  def initialize(inputfile)
    # this can of course be changed to what best suits your purposes
    @output = []
    @oktypes = %w[.eng .esp .fr]
    checkType(inputfile)
  end

  # make sure the file exists and has an extension we know about
  def checkType(file)
    if File.exist?(file) # File.exists? was removed in Ruby 3.2
      if @oktypes.include?(File.extname(file).downcase)
        parseInput(file)
      else
        puts "Unrecognized File Type"
      end
    else
      puts "File Not Found"
    end
  end

  # pick the parsing method that matches the file extension
  def parseInput(file)
    case File.extname(file).downcase
    when ".eng" then engParse(file)
    when ".esp" then espParse(file)
    when ".fr"  then frParse(file)
    end
  end

  # read the file into @data, one chomped line per element
  def loadData(inputfile)
    @data = File.readlines(inputfile).map { |line| line.chomp }
  end

  ## here's where you do whatever parsing you need to, my examples are
  ## dumb... but the important thing is that you end up with @output

  def engParse(file)
    loadData(file)
    @data.each { |line| @output << line.reverse }
  end

  def espParse(file)
    loadData(file)
    @data.each { |line| @output << line.upcase }
  end

  def frParse(file)
    loadData(file)
    @data.each { |line| @output << line.upcase.reverse }
  end

end #class

test = Parser.new("input.esp")
puts test.output

  this may be WAY too simple for what you're trying to do, but hey,
maybe not! ;)

- j


The above doesn't work, because the @children inside the instance
method "parse" is not the same as the @children inside the class
method "self.inherited". You have to give access to the class instance
variable, and then use that one from the parse method (then you will
see the next problem):

class InputFormat
  class << self
    attr_accessor :children
  end

  def initialize(input)
    @input = input
  end

  def parse
    self.class.children.each { |child| child.parse(@input) if child.supported?(@input) }
  end

  def self.inherited(child)
    (@children ||= []) << child
  end
end

class AInputFormat < InputFormat
  def supported?
    # check if we can parse this type of file
  end

  def parse
    # parse and generate array of hashes in known format
  end
end

ruby-1.8.7-p334 :028 > input = InputFormat.new("test")
=> #<InputFormat:0xb738bf50 @input="test">
ruby-1.8.7-p334 :029 > input.parse
NoMethodError: undefined method `supported?' for AInputFormat:Class
  from (irb):11:in `parse'
  from (irb):11:in `each'
  from (irb):11:in `parse'
  from (irb):29

The next problem, as you see, is that you are defining instance
methods in the subclasses, but are calling them on the class. Maybe
the methods parse and supported? in the children could be class
methods, or maybe what you store in @children could be an instance of
the class.
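
To make the first point concrete, here is a minimal sketch of why the two @children differ. The Registry class and its method names are made up for illustration; they are not part of the code in this thread. An instance variable assigned in the class body belongs to the class object, while the same name inside an instance method belongs to each instance, where it was never assigned and so reads as nil:

```ruby
# Hypothetical class, only here to illustrate instance-variable scope.
class Registry
  @items = [:set_at_class_level] # belongs to the Registry class object

  def self.items
    @items # class method: sees the class-level variable
  end

  def items
    @items # instance method: a different variable, never assigned
  end
end

Registry.items     # => [:set_at_class_level]
Registry.new.items # => nil
```

That nil is what the original parse method ends up calling each on, so it fails with a NoMethodError before any child is consulted.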

Jesus.

On Sat, Apr 16, 2011 at 9:52 PM, Felipe Balbi <balbif@gmail.com> wrote:

Hi all,

To automate some of the tests I have to run, I decided to use ruby to
generate some script files on a particular (very simple) language based
on several possible input files. My first approach at this was to use
inherited method on a parent class (which I called InputFormat) to hold
all the children in an array. Then, different formats could become a
child class of InputFormat and return a known data format (I decided to
use an array of hashes because the output is really really simple) to
the output generator code.

So the idea is something like:

class InputFormat
  @children = []

  def initialize(input)
    @input = input
  end

  def parse
    @children.each { |child|
      child.parse(@input) if child.supported?(@input)
    }
  end

  def self.inherited(child)
    @children << child
  end
end

class AInputFormat < InputFormat
  def supported?
    # check if we can parse this type of file
  end

  def parse
    # parse and generate array of hashes in known format
  end
end

Then on the core file I would have something like:

input = InputFormat.new(ARGV[0])
input.parse

As it turns out, this isn't working because AInputFormat will only
inherit from InputFormat at the time I actually use it, am I right ? Any
tips you guys could give me to achieve what I want ? (from several
possible input formats generate one output format)

I'm pretty unclear about what you are trying to do, but maybe this will
help:

class InputFormat
  @children = []

  def self.children
    @children
  end

  def initialize(input)
    @input = input
  end

  def parse
    InputFormat.children.each { |child|
      child.parse(@input) if child.supported?
    }
  end

  def self.inherited(sub_class)
    @children << sub_class.new('dummy')
  end
end

class InputFormatA < InputFormat
  def supported?
    true
  end

  def parse(str)
    puts "InputFormatA is parsing #{str}"
  end
end

class InputFormatB < InputFormat
  def supported?
    true
  end

  def parse(str)
    puts "InputFormatB is parsing #{str}"
  end
end

input = InputFormat.new('hello world')
input.parse

--output:--
InputFormatA is parsing hello world
InputFormatB is parsing hello world

Note that when inherited() is called, the methods of the subclass are
not defined yet, so the inherited initialize() is called.
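
A small sketch of that timing, assuming a modern Ruby where instance_methods returns symbols. Base, Child and the seen hash are hypothetical names used only for this demonstration. The hook records whether the subclass already defines supported? at the moment inherited fires:

```ruby
# Hypothetical classes, used only to show when the inherited hook runs.
class Base
  @seen = {}

  class << self
    attr_reader :seen
  end

  def self.inherited(sub_class)
    super
    # Record whether the subclass has its own #supported? at hook time.
    @seen[sub_class] = sub_class.instance_methods(false).include?(:supported?)
  end
end

class Child < Base
  def supported?
    true
  end
end

Base.seen[Child]                                    # => false (body not evaluated yet)
Child.instance_methods(false).include?(:supported?) # => true  (after the class body)
```

The hook runs as soon as the class is created, before anything in its body has been evaluated.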


Hi Jake,

jake kaiden wrote in post #993355:

class Parser
  attr_reader :output
  def initialize(inputfile)
  @output = [] ## this can of course be changed to what best suits your purposes
  @oktypes = %W[.eng .esp .fr]
  self.checkType(inputfile)
  end

  def checkType(file)
    if File.exists?(file)
  if ! @oktypes.include?(File.extname(file))
    puts "Unrecognized File Type"
  else
    @oktypes.collect{|type|
    case
    when file.downcase.include?(type)
      self.parseInput(file)
      false
    end
    }
  end
    else
  puts "File Not Found"
    end
  end

  def parseInput(file)
    case
  when file.downcase.include?(".eng")
    self.engParse(file)
  when file.downcase.include?(".esp")
    self.espParse(file)
  when file.downcase.include?(".fr")
    self.frParse(file)
    end
  end

  def loadData(inputfile)
    @data = []
    file = File.open(inputfile, 'r')
    file.collect{|line| @data << line.chomp}
    file.close
  end

## here's where you do whatever parsing you need to, my examples are
## dumb... but the important thing is that you end up with @output

  def engParse(file)
    self.loadData(file)
    @data.collect{|line|
  @output << line.reverse}
  end

  def espParse(file)
    self.loadData(file)
    @data.collect{|line|
  @output << line.upcase}
  end

  def frParse(file)
    self.loadData(file)
    @data.collect{|line|
  @output << line.upcase.reverse}
  end

end #class

Initially I thought about taking this approach, but frankly I don't
know how many input files I will have, so I wanted an
approach where I don't need to touch the core classes or
any core file. I wanted changes to be local to the place where they
are necessary. I mean, when I want to add another input format,
all I would have to do is create a new class and the code
would just work.

With this approach, I would have to keep on adding more and more
methods for doing the actual parsing of different formats, and what
I wanted was to offload that to another class without touching the
caller code.

Oh well, I'll keep on trying. I'm sure there's some pattern for doing
just that, maybe I just didn't implement correctly :-p
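
One possible shape for that pattern, as a sketch rather than a definitive implementation: keep the registry on InputFormat and make supported? and parse class methods on each format. The dispatch method, the two format classes and their toy detection rules are all assumptions made up for this example, and the input here is the text itself rather than a file name, to keep the sketch self-contained:

```ruby
class InputFormat
  @children = []

  class << self
    attr_reader :children
  end

  def self.inherited(child)
    super
    InputFormat.children << child # always register on the base class
  end

  # Hypothetical entry point: returns the rows from the first format that
  # accepts the input, or nil when no registered format does.
  def self.dispatch(input)
    format = InputFormat.children.find { |child| child.supported?(input) }
    format && format.parse(input)
  end
end

# Hypothetical format: whitespace-separated "key=value" pairs, one row per line.
class KeyValueFormat < InputFormat
  def self.supported?(input)
    input.include?("=")
  end

  def self.parse(input)
    input.each_line.map do |line|
      line.split.map { |pair| pair.split("=", 2) }.to_h
    end
  end
end

# Hypothetical format: "name,value" on each line.
class CommaFormat < InputFormat
  def self.supported?(input)
    input.include?(",")
  end

  def self.parse(input)
    input.each_line.map do |line|
      name, value = line.chomp.split(",", 2)
      { name => value }
    end
  end
end

InputFormat.dispatch("a=1 b=2\nc=3") # => [{"a"=>"1", "b"=>"2"}, {"c"=>"3"}]
InputFormat.dispatch("x,10")         # => [{"x"=>"10"}]
InputFormat.dispatch("plain")        # => nil
```

Adding a format then means defining one more subclass; the dispatch code and the caller stay untouched, and every format hands back the same array of hashes.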


7stud -- wrote in post #993588:

Note that when inherited() is called, the methods of the subclass are
not defined yet, so if you create objects of the subclass inside
inherited(), the initialize() method in the parent is called.

And you can get around that problem by letting InputFormat#parse create
the objects:

class InputFormat
  @children = []

  def self.children
    @children
  end

  def initialize(input)
    @input = input
  end

  def parse
    InputFormat.children.each { |child|
      instance = child.new
      instance.parse(@input) if instance.supported?
    }
  end

  def self.inherited(sub_class)
    @children << sub_class
  end
end

class InputFormatA < InputFormat
  def initialize
    puts "Initializing instance of #{self.class}"
  end

  def supported?
    true
  end

  def parse(str)
    puts "InputFormatA is parsing #{str}"
  end
end

class InputFormatB < InputFormat
  def initialize
    puts "Initializing instance of #{self.class}"
  end

  def supported?
    true
  end

  def parse(str)
    puts "InputFormatB is parsing #{str}"
  end
end

input = InputFormat.new('hello world')
input.parse

--output:--
Initializing instance of InputFormatA
InputFormatA is parsing hello world
Initializing instance of InputFormatB
InputFormatB is parsing hello world


Hi,

"Jesús Gabriel y Galán" <jgabrielygalan@gmail.com> wrote in post
#993452:

@children.each { |child|
def supported?

input = InputFormat.new(ARGV[0])
input.parse

As it turns out, this isn't working because AInputFormat will only
inherit from InputFormat at the time I actually use it, am I right ? Any
tips you guys could give me to achieve what I want ? (from several
possible input formats generate one output format)

The above doesn't work, because the @children inside the instance
method "parse" is not the same as the @children inside the class
method "self.inherited". You have to give access to the class instance
variable, and then use that one from the parse method (then you will

aaa, you're right :) Good point.

see the next problem):

class InputFormat
  class << self
    attr_accessor :children
  end

  def initialize(input)
    @input = input
  end

  def parse
    self.class.children.each { |child| child.parse(@input) if child.supported?(@input) }
  end

  def self.inherited(child)
    (@children ||= []) << child
  end
end

class AInputFormat < InputFormat
  def supported?
    # check if we can parse this type of file
  end

  def parse
    # parse and generate array of hashes in known format
  end
end

ruby-1.8.7-p334 :028 > input = InputFormat.new("test")
=> #<InputFormat:0xb738bf50 @input="test">
ruby-1.8.7-p334 :029 > input.parse
NoMethodError: undefined method `supported?' for AInputFormat:Class
  from (irb):11:in `parse'
  from (irb):11:in `each'
  from (irb):11:in `parse'
  from (irb):29

The next problem, as you see, is that you are defining instance
methods in the subclasses, but are calling them on the class. Maybe
the methods parse and supported? in the children could be class
methods, or maybe what you store in @children could be an instance of
the class.

I'm not instantiating AInputFormat in any part of the code... so making
those class methods is the way to go for me :) Thanks for the tip :)

--
balbi


Felipe Balbi wrote in post #993383:

Hi Jake,
I want to add another input format
all I would have to do would be to create a new class and the code
would just work.

With this approach, I would have to keep on adding more and more
methods for doing the actual parsing of different formats and what
I wanted was to offload that to another class without touching the
caller code.

  a very good point - and really what inheritance is for. good luck
with a solution, i (and i imagine those who read this post) will keep
playing with the idea...

  hasta otro...

  -j
