Infix to Postfix Method

I'm still getting to grips with Ruby, but I think I'm doing well. My
last post generated a lot of positive comments/advice which really
helped me, so I thought I'd post some more of my code to try and repeat
that!

The following is a method which will convert an infix expression (1+1)
to a postfix expression (1 1 +). It expects a string and returns an
array. Once again I'd really appreciate your comments.

def postfix(expression)

  precedence = {nil=>0,40=>0,42 =>2, 43=>1, 45=>1,47=>2}

  output = Array.new
  stack = Array.new
  temp = ''

  expression.each_byte do |asc|

    case asc
      when 41
        output.push(temp)
        temp = ''
        loop{
          if stack.empty? == false
            popped = stack.pop
            if popped == 40
              break
            else
              output.push(popped.chr)
            end
          else
            raise 'Missing "("!'
          end
        }
      when 40,42, 43, 45, 47 #*+-/
        output.push(temp)
        temp = ''
        unless asc == 40
          p_asc = precedence[asc]
          loop{
            if (p_asc > precedence[stack.last])
              break
            else
              output.push(stack.pop.chr)
            end
          }
        end
        stack.push(asc)
      else
        temp += asc.chr
    end
  end
  stack.each{|asc| output.push(asc.chr)}
  !output.include?('(') or raise 'Missing ")"!'
  output.delete('')
  output
end

···

--
Posted via http://www.ruby-forum.com/.

Hi~

I'm still getting to grips with Ruby, but I think I'm doing well. My
last post generated a lot of positive comments/advice which really
helped me, so I thought I'd post some more of my code to try and repeat
that!

The following is a method which will convert an infix expression (1+1)
to a postfix expression (1 1 +). It expects a string and returns an
array. Once again I'd really appreciate your comments.

  I have a very similar approach in a little parser I wrote for role based auth system. It uses a stack and converts to postfix before evaluating the expressions. It can parse strings like "(admin | moderator & !blacklist) | root" and compare to roles. I thought you might like to see it as it's very similar to the way your infix to postfix code works.

class RoleExp

   def initialize(expression = nil)
     @expression = expression
     # using @@class vars here so they don't show up in #inspect
     @@symbols ||= {"(" => :lparen, ")" => :rparen, "&" => :and, "|" => :or, "!" => :not}
     @@precedence ||= {:lparen => 0, :rparen => 0, :and => 1, :or => 1, :not => 2}
     compile unless @expression.empty?
   end

   def compile(expression = nil)
     @expression = expression || @expression
     @postfix = []
     stack = []
     @expression.scan(/(\w+|\W)/).flatten.each do |term|
       term = @@symbols[term] || term.strip
       next if term.to_s.empty?
       case term
         when :lparen
           stack.push term
         when :and, :or, :not
           until stack.empty?
             break if @@precedence[term] > @@precedence[stack.last]
             @postfix << stack.pop
           end
           stack.push term
         when :rparen
           while stack.last != :lparen
             @postfix << stack.pop
           end
           stack.pop
         else
         @postfix << term
       end
     end
     @postfix.concat(stack.reverse)
   end

   def compute
     return false if @postfix.empty?
     stack = []
     @postfix.each do |term|
       case term
         when :and
           rhs = stack.pop
           stack.push(stack.pop && rhs)
         when :or
           rhs = stack.pop
           stack.push(stack.pop || rhs)
         when :not
           stack.push(!stack.pop)
         else
           stack.push(yield(term))
       end
     end
     stack.first
   end

end

class User
   attr_accessor :roles
   def initialize(*roles)
     @roles = roles
   end
end

bool = RoleExp.new "(admin | moderator & !blacklist)"
p bool
puts '='*40

user = User.new 'admin', 'moderator'
p user
p bool.compute {|term| user.roles.include? term}
puts '='*40

user.roles << 'blacklist'
p user
p bool.compute {|term| user.roles.include? term}

Cheers-
-- Ezra Zygmuntowicz-- Lead Rails Evangelist
-- ez@engineyard.com
-- Engine Yard, Serious Rails Hosting
-- (866) 518-YARD (9273)

···

On Apr 28, 2007, at 4:34 PM, Peter Marsh wrote:

  precedence = {nil=>0,40=>0,42 =>2, 43=>1, 45=>1,47=>2}

In general it's better not to explicitly specify ASCII values, so I'd
write instead:

    precedence = {nil=>0, ?(=>0, ?*=>2, ?+=>1, ?-=>1, ?/=>2}

Note though that on newer versions of ruby, ?x will give you a string
containing "x" rather than the ascii value of x. So to be more
compatible, instead of writing:

  expression.each_byte do |asc|

you'll want to write:

    expression.each_byte do |c|
      c = c.chr if ?x == "x"

and use c wherever you were using asc before.

(I feel like there should be a better way to do this, but I haven't
found one. Pity there's no String#each_char).

Paul

···

On Sun, Apr 29, 2007 at 08:34:33AM +0900, Peter Marsh wrote:

Paul Brannan wrote:

···

On Sun, Apr 29, 2007 at 08:34:33AM +0900, Peter Marsh wrote:

  precedence = {nil=>0,40=>0,42 =>2, 43=>1, 45=>1,47=>2}

In general it's better not to explicitly specify ASCII values, so I'd
write instead:

Could you explain why please, I can't see how it makes any difference,
really (but I am a noob!).

--
Posted via http://www.ruby-forum.com/.

(I feel like there should be a better way to do this, but I haven't
found one. Pity there's no String#each_char).

There is a String#each_char, but it's spelled a little funny :wink: :

"abc".scan(/./m) do |char|
    p char
end

Paul

···

On 4/28/07, Paul Brannan <pbrannan@atdesk.com> wrote:

Peter Marsh wrote:

Paul Brannan wrote:

  precedence = {nil=>0,40=>0,42 =>2, 43=>1, 45=>1,47=>2}

In general it's better not to explicitly specify ASCII values, so I'd
write instead:

Could you explain why please, I can't see how it makes any difference, really (but I am a noob!).

In a high level language, it's good to use the high level functions. That way if low level representations of things ever change in a newer version, your existing scripts will be more likely to work in the newer version.

Consider this example, even though it's probably not a very good one: What if the upcoming unicode support changed something with the internal representation of strings so that "*" was no longer represented as "42"? Then the entry in the hash {42 => 2} would no longer be applicable, but {?* => 2} will continue to work, because if ruby's internal representation of "*" changes in some version, so will the value of ?*.

It also makes me happy that someone else is working on a converter from infix to postfix. I just (almost) finished implementing Dijkstra's Shunting Yard algorithm (http://en.wikipedia.org/wiki/Shunting_yard_algorithm) in Ruby, as part of a bigger project. If you want to look at it, it's at http://paste-bin.com/11526, but it's 227 lines of poorly commented meaty algorithm. It currently works on mathematical equations with variables. It doesn't do any evaluation, though--just parses to an array.

Anyway, have fun.
Dan

···

On Sun, Apr 29, 2007 at 08:34:33AM +0900, Peter Marsh wrote:

Thanks for that explination, it makes quite a lot of sense!

···

--
Posted via http://www.ruby-forum.com/.

Your implementation is very similar to mine (though I'm parsing boolean
search queries, ala Google) -- mine is a bit simpler:

    InputPriority = Hash.new(1000)
    InputPriority[:and] = 800
    InputPriority[:or] = 500
    InputPriority[:not] = 900
    InputPriority[nil] = 0
    InputPriority[:open] = 1000
    InputPriority[:close] = 1
    StackPriority = InputPriority.dup
    StackPriority[:open] = 1
    StackPriority[:close] = 1000

    def rpnise(itokens)
      stack = []
      instructions = []

      itokens.each do |itoken|
        if InputPriority[itoken] > StackPriority[stack.last]
          stack.push(itoken)
        elsif InputPriority[itoken] <= StackPriority[stack.last]
          while StackPriority[stack.last] >= InputPriority[itoken]
            instructions << stack.pop
          end
          stack.push(itoken)
        end
      end
      instructions.concat(stack.reverse)
    end

After this I sanitize the token stream and use it to build a tree of
BooleanAnd/Or/Not/SubStringQuery objects to represent the search (which
then rewrite themselves in a vague attempt at optimization and sorting).

Demo:
  # Token stream
  irb(main):009:0> toks = analyser.simplify(tokenizer.tokenize("foo AND bar OR (wibble OR wobble NOT foo)"))
  [#<NSearch::SubStringQuery:0x18a4a20 @substr="foo">,
   :and,
   #<NSearch::SubStringQuery:0x18a49f8 @substr="bar">,
   :or,
   :open,
   #<NSearch::SubStringQuery:0x18a49a8 @substr="wibble">,
   :or,
   #<NSearch::SubStringQuery:0x18a4958 @substr="wobble">,
   :and,
   :not,
   #<NSearch::SubStringQuery:0x18a4908 @substr="foo">,
   :close]

   irb(main):010:0> analyser.rpnise(toks)
   [#<NSearch::SubStringQuery:0x188a670 @substr="foo">,
    #<NSearch::SubStringQuery:0x188a648 @substr="bar">,
    :and,
    #<NSearch::SubStringQuery:0x188a5f8 @substr="wibble">,
    #<NSearch::SubStringQuery:0x188a5a8 @substr="wobble">,
    #<NSearch::SubStringQuery:0x188a558 @substr="foo">,
    :not,
    :and,
    :or,
    :or]

You can then follow the instructions, popping each SubStringQuery onto a
stack and popping them off on :and, :or and :not to build a tree of
query objects:

#to_s
  OR(OR(AND(NOT("foo"),"wobble"),"wibble"),AND("bar","foo"))

#to_sql
  (((NOT (Title LIKE "%foo%") AND Title LIKE "%wobble%") OR Title LIKE
    "%wibble%") OR (Title LIKE "%bar%" AND Title LIKE "%foo%"))

I'd recommend against using Ruby's own Generator, btw; it uses callcc
and has a tendancy towards being slow and leaky. If you just need a
#next/#peek method, just turn it into an array and shove it in an
appropriate object which tracks the position.

···

* Dan Zwell (dzwell@gmail.com) wrote:

It also makes me happy that someone else is working on a converter
from infix to postfix. I just (almost) finished implementing
Dijkstra's Shunting Yard algorithm
(http://en.wikipedia.org/wiki/Shunting_yard_algorithm) in Ruby, as
part of a bigger project. If you want to look at it, it's at
http://paste-bin.com/11526, but it's 227 lines of poorly commented
meaty algorithm. It currently works on mathematical equations with
variables. It doesn't do any evaluation, though--just parses to an
array.

--
Thomas 'Freaky' Hurst
    http://hur.st/

Thomas Hurst wrote:

Demo:
  # Token stream
  irb(main):009:0> toks = analyser.simplify(tokenizer.tokenize("foo AND bar OR (wibble OR wobble NOT foo)"))
  [#<NSearch::SubStringQuery:0x18a4a20 @substr="foo">,
   :and,
   #<NSearch::SubStringQuery:0x18a49f8 @substr="bar">,
   :or,
   :open,
   #<NSearch::SubStringQuery:0x18a49a8 @substr="wibble">,
   :or,
   #<NSearch::SubStringQuery:0x18a4958 @substr="wobble">,
   :and,
   :not,
   #<NSearch::SubStringQuery:0x18a4908 @substr="foo">,
   :close]

   irb(main):010:0> analyser.rpnise(toks)
   [#<NSearch::SubStringQuery:0x188a670 @substr="foo">,
    #<NSearch::SubStringQuery:0x188a648 @substr="bar">,
    :and,
    #<NSearch::SubStringQuery:0x188a5f8 @substr="wibble">,
    #<NSearch::SubStringQuery:0x188a5a8 @substr="wobble">,
    #<NSearch::SubStringQuery:0x188a558 @substr="foo">,
    :not,
    :and,
    :or,
    :or]

You can then follow the instructions, popping each SubStringQuery onto a
stack and popping them off on :and, :or and :not to build a tree of
query objects:

#to_s
  OR(OR(AND(NOT("foo"),"wobble"),"wibble"),AND("bar","foo"))

#to_sql
  (((NOT (Title LIKE "%foo%") AND Title LIKE "%wobble%") OR Title LIKE
    "%wibble%") OR (Title LIKE "%bar%" AND Title LIKE "%foo%"))

I'd recommend against using Ruby's own Generator, btw; it uses callcc
and has a tendancy towards being slow and leaky. If you just need a
#next/#peek method, just turn it into an array and shove it in an
appropriate object which tracks the position.

That's pretty neat. I hadn't thought of using this stuff to generate SQL. I'll probably acually use that idea some day. And thanks for the bit about Generator. I'm currently looking through a computational script (ported from c++)--I could never figure out why it used all my ram within a few seconds, but it used Generators several thousand times. I'm rewriting parts of it now, and we'll see what happens.

Dan

That's pretty neat. I hadn't thought of using this stuff to generate
SQL. I'll probably acually use that idea some day.

It can perform the match itself too:

irb(main):025:0> query = NSearch::QueryParser.new.parse('foo bar NOT heh')
#<NSearch::RootQuery:0x1715380
  @query=
   #<NSearch::BooleanAndQuery:0x1714fe8
    @criteria=
     [#<NSearch::SubStringQuery:0x1715240 @substr="bar">,
      #<NSearch::SubStringQuery:0x1715268 @substr="foo">,
      #<NSearch::BooleanNotQuery:0x1715038
       @criteria=#<NSearch::SubStringQuery:0x17151f0 @substr="heh">>]>>

I can then query.rewrite to have it normalize itself (sort, minor
optimization), have it turn itself into various forms (SQL, boolean),
and of course execute it:

irb(main):029:0> query.match('foo bar')
=> true
irb(main):030:0> query.match('foo bar heh')
=> false

I also have a C library and a Ruby extension which uses it which
accelerates matches. We're currently using it in our new search engine.
Performance is on the order of 5 million matches/second on a single
2GHz Opteron. In pure Ruby it's closer to 200,000/sec.

I'm planning to_ferret to produce an equivilent ferret query, and
extending it to provide tagged queries: "category:ruby lang:en OR
lang:de". Obviously the trickiest bit here is making it work in C.

None of this is public, alas, but if there's interest..

And thanks for the bit about Generator. I'm currently looking through
a computational script (ported from c++)--I could never figure out why
it used all my ram within a few seconds, but it used Generators
several thousand times. I'm rewriting parts of it now, and we'll see
what happens.

Yeah, that'll be the callcc. I really hope it's improved for 1.9.

···

* Dan Zwell (dzwell@gmail.com) wrote:

--
Thomas 'Freaky' Hurst
    http://hur.st/