Linefilter - looking for suggestions

I'm writing a small wrapper around ruby, meant to be used as part of a
unix pipeline filter - e.g.

ls -l | rbx 'cols(8,4).mapf(:to_s).formatrow(" ", 30, 8).endl'

It basically consists of a small 'rbx' executable, and a 'linefilter'
library, which contains useful extensions to String - ideally, it'll let
rbx be used as a convenient replacement for awk, sed and perl for quick
one-liners (code below).

Any suggestions for improvements or useful additional methods? One thing
I'm considering is an Array#apply, which invokes a different method on
each member of an array (for instance the above example could have it
inserted as .mapf(:to_s).apply(:ljust, :rjust).formatrow(...)) - do
people like the name?
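
A minimal sketch of what such an Array#apply might look like. The pairing rule (the i-th method goes to the i-th element), the [method, *args] form for passing arguments, and the pass-through of elements without a matching method are all assumptions for illustration, not part of linefilter.

```ruby
class Array
  # Hypothetical Array#apply: send the i-th method to the i-th element.
  # Each spec is a Symbol, or an Array of [method_name, *arguments].
  # Elements without a matching spec are passed through unchanged.
  def apply(*specs)
    each_with_index.map do |element, i|
      spec = specs[i]
      next element if spec.nil?
      name, *args = Array(spec)
      element.send(name, *args)
    end
  end
end

row = ["README", "1024"].apply([:ljust, 10], [:rjust, 8])
puts row.join("|")
```

Since ljust and rjust both need a width, a bare .apply(:ljust, :rjust) would raise an ArgumentError here, which is why the sketch accepts arguments alongside the method name.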

martin

rbx (for want of a better short name) is

#----------------------------------------------------------------------------
#!/usr/bin/ruby

class Array
  def take_while!
    r = []
    while yield(at(0))
      r << shift
    end
    r
  end
end
    
# -- MAIN --

flags = ARGV.take_while! {|i| i =~ /^-/}
command = "'print \$_.to_s.instance_eval {#{ARGV.shift}}'"
files = ARGV.dup

system("ruby -rlinefilter #{flags.join(" ")} -ne #{command} #{files.join(" ")}")

#----------------------------------------------------------------------------

and linefilter contains the following methods (mostly from standard
class extensions):

module Enumerable
  def map_with_index
    a = []
    each_with_index {|e, i| a << yield(e,i)}
    a
  end

  def mapf(method, *args)
    collect do |value|
      value.send(method, *args)
    end
  end

end

class String
  
  def endl
    concat("\n")
  end

  def cols(*args)
    a = split(/\s+/)
    args.map {|i| a[i]}
  end

  def spjoin
    join(" ")
  end

  # from http://www.rubygarden.org/ruby?StringSub

  # 'number' leftmost chars
  def left(number = 1)
    self[0..number-1]
  end

  # 'number' rightmost chars
  def right(number = 1)
    self[-number..-1]
  end

  # 'number' chars starting at position 'from'
  def mid(from, number=1)
    self[from..from+number-1]
  end

  # chars from beginning to 'position'
  def head(position = 0)
    self[0..position]
  end

  # chars following 'position'
  def tail(position = 0)
    self[position+1..-1]
  end

  # Tabs left or right by n chars, using spaces
  def tab(n)
    if n >= 0
      gsub(/^/, ' ' * n)
    else
      gsub(/^ {0,#{-n}}/, "")
    end
  end

  alias_method :indent, :tab

  # Preserves relative tabbing.
  # The first non-empty line ends up with n spaces before nonspace.
  def tabto(n)
    if self =~ /^( *)\S/
      tab(n - $1.length)
    else
      self
    end
  end

end

class Array
  def formatrow(separator, *widths)
    map_with_index {|a,i|
      w = widths[i]
      (a.to_s).slice(0..(w-1)).ljust(w)
    }.join(separator)
  end
end

"Martin DeMello" <martindemello@yahoo.com> schrieb im Newsbeitrag
news:0JEvc.620827$Pk3.157220@pd7tw1no...

I'm writing a small wrapper around ruby, meant to be used as part of a
unix pipeline filter - e.g.

ls -l | rbx 'cols(8,4).mapf(:to_s).formatrow(" ", 30, 8).endl'

I'd prefer to have something in front of cols() that does the splitting -
otherwise it's always space separated which might limit usefulness.

You could do

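class String
  alias split_old split

  # Dispatch on the separator type; :space means "runs of whitespace".
  # Extra arguments (such as a limit) are passed through unchanged, so
  # existing callers of split(sep, limit) and plain split keep working.
  def split(x = nil, *rest)
    case x
      when nil, Regexp, String
        split_old(x, *rest)
      when :space
        split_old(/\s+/, *rest)
      else
        split_old(x.to_s, *rest)
    end
  end
end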

> #!/usr/bin/ruby
>
> class Array
>   def take_while!

This name is misleading since it suggests that the array is manipulated in
place. Better to remove the "!".

Even better, use optparse to process the options.

>     r = []
>     while yield(at(0))
>       r << shift
>     end
>     r
>   end
> end
>
> # -- MAIN --
>
> flags = ARGV.take_while! {|i| i =~ /^-/}
> command = "'print \$_.to_s.instance_eval {#{ARGV.shift}}'"
> files = ARGV.dup
>
> system("ruby -rlinefilter #{flags.join(" ")} -ne #{command} #{files.join(" ")}")

Why do you spawn an extra process here? IMHO that's superfluous. If you
put the command into a block, your main loop will look like this:

while ( line = gets )
  line.chomp!
  command.call line
end

To do that you just need

command = eval %Q{ lambda {|line| puts line.instance_eval(
'#{ARGV.shift}' ) } }
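
Put together, the in-process version could look roughly like the sketch below. The names build_command and run_filter are hypothetical helpers, not part of rbx. As a further assumption, the expression is handed to instance_eval as a string instead of being interpolated between single quotes, which sidesteps trouble with quote characters inside the one-liner. The input is any IO-like object so the loop can be tried without a pipeline.

```ruby
require "stringio"

# Hypothetical helper: wrap the one-liner in a lambda that evaluates it
# with each input line as self.
def build_command(expression)
  lambda { |line| line.instance_eval(expression) }
end

# Hypothetical helper: run the command over every line of an IO-like object
# and print each result.
def run_filter(expression, input, output = $stdout)
  command = build_command(expression)
  while (line = input.gets)
    line.chomp!
    output.puts command.call(line)
  end
end

input = StringIO.new("hello world\nfoo bar\n")
run_filter("upcase.reverse", input)
```

In rbx itself the call would presumably be something like run_filter(ARGV.shift, ARGF), so that the remaining arguments are read as input files.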

Regards

    robert

"Martin DeMello" <martindemello@yahoo.com> schrieb im Newsbeitrag
news:0JEvc.620827$Pk3.157220@pd7tw1no...
> I'm writing a small wrapper around ruby, meant to be used as part of a
> unix pipeline filter - e.g.
>
> ls -l | rbx 'cols(8,4).mapf(:to_s).formatrow(" ", 30, 8).endl'

I'd perfer to have something in front of cols() that does the splitting -
otherwise it's always space separated which might limit usefulness.

Hm - I was trying to avoid an extraneous 'split', since cols always
requires one. Maybe check if the first argument is a string and split on
that, otherwise split on space.
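
A sketch of that idea, assuming the separator is recognised purely by being a String (or Regexp) in first position, while column indices are always Integers:

```ruby
class String
  # Sketch: cols(0, 2) splits on whitespace, cols(":", 0, 2) splits on ":".
  # A leading String or Regexp argument is treated as the separator.
  def cols(*args)
    separator = /\s+/
    separator = args.shift if args.first.is_a?(String) || args.first.is_a?(Regexp)
    fields = split(separator)
    args.map { |i| fields[i] }
  end
end

p "alice 42 admin".cols(0, 2)
p "root:x:0:0".cols(":", 0, 2)
```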

> > class Array
> > def take_while!
>
> This name is misleading since it suggests that array is manipulated in
> place. Better remove the "!".

It is - I'm using 'shift'. I figured optparse was overkill since I
just wanted to pass the options along to ruby.
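
To make the in-place behaviour concrete, here is a small check using an equivalent definition. The emptiness guard is an assumption that is not in the original; it makes the loop stop cleanly even when every argument is a flag.

```ruby
class Array
  # Removes leading elements while the block is true and returns them;
  # the receiver keeps only what is left, hence the "!".
  def take_while!
    taken = []
    taken << shift while !empty? && yield(first)
    taken
  end
end

argv = ["-i", "-rset", "cols(0)", "data.txt"]
flags = argv.take_while! { |arg| arg =~ /^-/ }
p flags  # the leading flags that were removed
p argv   # what remains in the receiver
```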

> Why do you spawn an extra process here? IMHO that's superfluous. If you
> put the command into a block, your main loop will look like this:

Mostly because it started life as a shellscript, and then migrated over
when I couldn't figure out how to do option parsing properly :-)

> while ( line = gets )
>   line.chomp!
>   command.call line
> end
>
> To do that you just need
>
> command = eval %Q{ lambda {|line| puts line.instance_eval(
> '#{ARGV.shift}' ) } }

But how will that let us pass options to the ruby interpreter?

martin

···

Robert Klemme <bob.news@gmx.net> wrote:

"Martin DeMello" <martindemello@yahoo.com> schrieb im Newsbeitrag
news:eTFvc.621056$Pk3.280803@pd7tw1no...

> "Martin DeMello" <martindemello@yahoo.com> schrieb im Newsbeitrag
> news:0JEvc.620827$Pk3.157220@pd7tw1no...
> > I'm writing a small wrapper around ruby, meant to be used as part of

a

> > unix pipeline filter - e.g.
> >
> > ls -l | rbx 'cols(8,4).mapf(:to_s).formatrow(" ", 30, 8).endl'
>
> I'd perfer to have something in front of cols() that does the

splitting -

> otherwise it's always space separated which might limit usefulness.

Hm - I was trying to avoid an extraneous 'split', since cols always
requires one. Maybe check if the first argument is a string and split on
that, otherwise split on space.

I'd find it cleaner if it were not a parameter to cols but an extra
operation before it. You can leave it as it is with two changes if you just
add a cols method to Array (or Enumerable). Then you can use the old
behavior (i.e. implicit split by white space) and additionally you can use
String#split to do the splitting.
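
A sketch of how that might look, assuming cols moves to Array and String#cols simply delegates to it after splitting on whitespace:

```ruby
class Array
  # Pick the elements at the given indices, in the given order.
  def cols(*indices)
    indices.map { |i| self[i] }
  end
end

class String
  # Old behaviour: implicit split on whitespace, then pick columns.
  def cols(*indices)
    split(/\s+/).cols(*indices)
  end
end

p "alice 42 admin".cols(0, 2)
p "root:x:0:0".split(":").cols(0, 2)
```

Array#values_at already performs the same selection, so the Array version here is mainly a shorter, awk-flavoured name for it.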

> > > class Array
> > > def take_while!
> >
> > This name is misleading since it suggests that array is manipulated in
> > place. Better remove the "!".
>
> It is - I'm using 'shift'. I figured optionparse was overkill since I
> just wanted to pass the options along to ruby.

Oh, yeah. Sorry, I overlooked that.

> > Why do you spawn an extra process here? IMHO that's superfluous. If you
> > put the command into a block, your main loop will look like this:
>
> Mostly because it started life as a shellscript, and then migrated over
> when I couldn't figure out how to do option parsing properly :-)

"Historic reasons". :-)

> > while ( line = gets )
> >   line.chomp!
> >   command.call line
> > end
> >
> > To do that you just need
> >
> > command = eval %Q{ lambda {|line| puts line.instance_eval(
> > '#{ARGV.shift}' ) } }
>
> But how will that let us pass options to the ruby interpreter?

Which options do you need to pass on? If it's not too esoteric, you might
be able to set them via global variables.

Regards

    robert

···

Robert Klemme <bob.news@gmx.net> wrote:

> > Hm - I was trying to avoid an extraneous 'split', since cols always
> > requires one. Maybe check if the first argument is a string and split on
> > that, otherwise split on space.
>
> I'd find it more clean if it was not a parameter to cols but an extra
> operation before. You can leave it as it is with two changes if you just
> add method cols to Array (or enumerable). Then you can use the old
> behavior (i.e. implicit split by white space) and additionally you can use
> String#split to do the splitting.

Good point - and it's the nicely polymorphic way to do it too. I'll have
to see if the array/string idea can extend to other methods too. A third
option is to simply use -F, though when you're writing a one-liner, going
back to add it once you realise you need it (by which point you're already
typing in 'cols') will be a pain.

> > But how will that let us pass options to the ruby interpreter?
>
> Which options do you need to pass on? If it's not too esoteric, you might
> be able to set them via global variables.

The main ones I've found useful in practice are -i and -rwhatever
(ideally I'd like an interpreter switch for the $_.instance_eval loop,
parallel to -n and -p, but when I RCRd it it wasn't too popular). You're
right, they probably could be set via globals, which would save us the
extra process spawn. I might think up some options that make sense for
rbx but not for ruby too, in which case optparse is definitely the way
to go.
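
As a rough sketch of handling those two switches in-process: -rLIB can become a plain require, and -i.EXT can be mapped onto the $-i global, which holds the in-place-edit backup extension used by ARGF. The name apply_flags is a hypothetical helper, and both the restriction to -i with an explicit extension and the choice to raise on anything else are assumptions for illustration.

```ruby
# Hypothetical helper: apply a few interpreter-style flags in-process
# instead of passing them to a second ruby process.
def apply_flags(flags)
  flags.each do |flag|
    case flag
    when /\A-r(.+)\z/
      require $1          # -rshellwords  =>  require "shellwords"
    when /\A-i(.+)\z/
      $-i = $1            # -i.bak  =>  in-place edit of ARGF files, ".bak" backups
    else
      raise ArgumentError, "unsupported flag: #{flag}"
    end
  end
end

apply_flags(["-rshellwords"])
p Shellwords.split("one 'two three'")
```

Flags that only make sense for rbx itself would not fit this pass-through scheme, which is where optparse becomes the more natural tool.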

martin

···

Robert Klemme <bob.news@gmx.net> wrote: