Yet another simple command-line option parser

I just put in a good example for:

http://rcrchive.net/rcr/show/317

It is a simple option parser that has option defaults and
converts the options to the right type:

# these klass.from_s methods are the meat of the RCR
require 'time' # adds Time.parse
def Float.from_s(s);s.to_f;end
def Integer.from_s(s);s.to_i;end
def Symbol.from_s(s);s.to_sym;end
def String.from_s(s);s.to_s;end
def Regexp.from_s(s,*other);new(s,*other);end
def Time.from_s(s,*other);Time.parse(s,*other);end

def argv_options(options)
    i = 0
    while arg = ARGV[i]
        if arg[0]==?-
            arg.slice!(0)
            ARGV.slice!(i)
            break if arg=="-" # -- terminates options
            opt = arg.to_sym
            default = options[opt]
            if default
                klass = default.class
                # would need a big case statement w/o RCR 317
                options[opt] = klass.from_s(ARGV.slice!(i))
            elsif default.nil?
                raise("unknown option -#{opt}")
            else # default==false
                options[opt] = true
            end
        else
            i += 1
        end
    end
    options
end

ARGV.replace(%w(
    -n 4
    -multiplier 3.14
    -q
    -title foobar
    -pattern fo+
    -time 5:55PM
    -method downcase
    a b c
))
# option => default (or false for a flag)
argv_options(
    :n => 1,
    :multiplier => 1.0,
    :q => false,
    :title => "hello world!",
    :pattern => /.*/,
    :time => Time.new,
    :method => :to_s
)

If you have an opinion about the usefulness of this RCR, go
vote and give a comment. I didn't have an example before to
show it in action.

···


That's pretty interesting Eric, to grab the type off the default.
I think I'll add that to CommandLine::OptionParser.

However, I'm still not sure if I like the #from_s form,
but I can see the utility of it. For the common cases,
I can use a simple case statement:

  case default
    when Float then Float(arg)
    when Fixnum then Integer(arg)
  end

But, as you can see with even these simple cases, there
are big issues and big questions to answer.
1. Fixnum does not match Integer
2. Do we use to_i or Integer(#) - Integer raises and to_i does not
3. Do we use Float or to_f - Float raises and to_f does not
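
To make points 2 and 3 concrete, here is a small sketch of how the two conversion styles behave on malformed input. The sample strings and the `parses?` helper are made up for illustration; the conversions themselves are standard Ruby.

```ruby
# Lenient conversions: parse a leading number and silently ignore the rest.
lenient_int   = "12abc".to_i     # 12
lenient_float = "3.14xyz".to_f   # 3.14
no_digits     = "abc".to_i       # 0, no error at all

# Hypothetical helper: true if the block converts s without raising.
def parses?(s)
  yield s
  true
rescue ArgumentError
  false
end

# Strict conversions: the whole string must be a valid number.
parses?("12")      { |s| Integer(s) }   # true
parses?("12abc")   { |s| Integer(s) }   # false, Integer raised ArgumentError
parses?("3.14xyz") { |s| Float(s) }     # false, Float raised ArgumentError
```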

Then, there are the tougher cases like

  require 'parsedate'
  case default
    when Time then Time.gm(*ParseDate.parsedate(arg))
    when Fixnum then arg.to_i # what if this yields a Bignum?
  end

where we have to use a helper class and a helper method (not new) to get
the object we want. Or where the conversion method
returns something other than what we requested, like
a Bignum instead of a Fixnum.

Sadly, the #from_s RCR doesn't seem to address any of these issues.

···

--
Jim Freeze

Thanks for the input Jim. Comments below. I'm also putting
this in the RCR comments.

> That's pretty interesting Eric, to grab the type off the default.
> I think I'll add that to CommandLine::OptionParser.
>
> However, I'm still not sure if I like the #from_s form,
> but I can see the utility of it. For the common cases,
> I can use a simple case statement:
>
>   case default
>     when Float then Float(arg)
>     when Fixnum then Integer(arg)
>   end
>
> But, as you can see with even these simple cases, there
> are big issues and big questions to answer.
> 1. Fixnum does not match Integer

No problem. Fixnum inherits from Integer. Do you care whether
Fixnum.from_s returns a Fixnum or a Bignum? It could return
either just like many of the other Fixnum instance methods.

> 2. Do we use to_i or Integer(#) - Integer raises and to_i does not
> 3. Do we use Float or to_f - Float raises and to_f does not

Good point. This RCR should specify this. I would think it
best if an exception is raised when the full string doesn't parse
as the target type. I'll change the implementation to use the
methods that raise exceptions.
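
A minimal sketch of what that change might look like, assuming the strict variants simply delegate to Kernel#Integer and Kernel#Float (the RCR itself does not spell this out):

```ruby
# Strict variants: Kernel#Integer and Kernel#Float raise ArgumentError
# unless the whole string is a valid number.
def Integer.from_s(s); Integer(s); end
def Float.from_s(s);   Float(s);   end

Integer.from_s("4")     # 4
Float.from_s("3.14")    # 3.14

begin
  Integer.from_s("4x")  # raises instead of quietly returning 4
rescue ArgumentError => e
  warn "bad option value: #{e.message}"
end
```

One thing to keep in mind: Integer("010") is read as octal and gives 8, so Integer(s, 10) may be the safer choice when options are always decimal.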

> Then, there are the tougher cases like
>
>   require 'parsedate'
>   case default
>     when Time then Time.gm(*ParseDate.parsedate(arg))

I haven't dealt with dates and times to know what all the
options are. I threw this in at the last minute. If you don't
like the klass.from_s method, you could define your own derived
class (or override that klass.from_s):

require 'parsedate'
class MyTime < Time
  def self.from_s(s)
    gm(*ParseDate.parsedate(s))
  end
end

and then make the default be a MyTime instead of a Time.
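
The parsedate library was removed from the standard library in Ruby 1.9, so on a current Ruby the same idea could be sketched with Time.parse from the time library instead. MyTime is the hypothetical subclass from above; converting the result to UTC is an assumption standing in for Time.gm, and the sample timestamp is made up.

```ruby
require 'time' # adds Time.parse

class MyTime < Time
  # Parse a human-readable string and normalize the result to UTC.
  # Note: this returns a plain Time, not a MyTime instance.
  def self.from_s(s)
    Time.parse(s).utc
  end
end

t = MyTime.from_s("2005-09-13 17:55:00 UTC")
t.year   # 2005
t.hour   # 17
```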

>     when Fixnum then arg.to_i # what if this yields a Bignum?

answered above

···

--- Jim Freeze <jim@freeze.org> wrote:

>   end
>
> where we have to use a helper class and a helper method (not new)
> to get the object we want. Or where the conversion method
> returns something other than what we requested, like
> a Bignum instead of a Fixnum.
>
> Sadly, the #from_s RCR doesn't seem to address any of these
> issues.

--
Jim Freeze


> Thanks for the input Jim. Comments below. I'm also putting
> this in the RCR comments.
>
> No problem. Fixnum inherits from Integer. Do you care whether
> Fixnum.from_s returns a Fixnum or a Bignum? It could return
> either just like many of the other Fixnum instance methods.

I was just thinking it would be strange for me to write

  Fixnum.from_s(...)

and get back a Bignum.

The natural thing to do is to have it return a Bignum, but that
is not what was requested. The symmetric thing to do (see below)
is to have it raise an exception, but that would have little use and
be quite annoying.

> > 2. Do we use to_i or Integer(#) - Integer raises and to_i
> > does not
> > 3. Do we use Float or to_f - Float raises and to_f does not
>
> Good point. This RCR should specify this. I would think it
> best if an exception is raised when the full string doesn't parse
> as the target type. I'll change the implementation to use the
> methods that raise exceptions.

In the two cases above I think an exception should be raised.

> require 'parsedate'
> class MyTime < Time
>   def self.from_s(s)
>     gm(*ParseDate.parsedate(s))
>   end
> end
>
> and then make the default be a MyTime instead of a Time.

Yes, that would work, and be a pain. My thought is that if #from_s was
integrated into Ruby, we wouldn't have this problem. The parsing functionality
would be built into the Time class.

But even so, there will always be cases like those above, but with different
classes. Is there a way to handle them in an aesthetically pleasing way?

···

On 9/13/05, Eric Mahurin <eric_mahurin@yahoo.com> wrote:

--
Jim Freeze

> > Thanks for the input Jim. Comments below. I'm also putting
> > this in the RCR comments.
> >
> > No problem. Fixnum inherits from Integer. Do you care whether
> > Fixnum.from_s returns a Fixnum or a Bignum? It could return
> > either just like many of the other Fixnum instance methods.
>
> I was just thinking it would be strange for me to write
>
>   Fixnum.from_s(...)
>
> and get back a Bignum.
>
> The natural thing to do is to have it return a Bignum, but that
> is not what was requested. The symmetric thing to do (see below)
> is to have it raise an exception, but that would have little use
> and be quite annoying.

From my perspective, the only reason for the distinction
between Fixnum and Bignum is the efficiency (runtime and
memory) of Fixnum - a very good benefit. Otherwise, I think
you should consider them the same. Many of the methods in
Fixnum and Bignum can return either a Fixnum or Bignum.
(Fixnum/Bignum/Integer).from_s would be no exception. I just
think of Fixnum and Bignum being the same type.
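
For readers on a current Ruby: Fixnum and Bignum were unified into Integer in Ruby 2.4, which matches the view above. A small sketch, written so that it does not reference the old constants (the values are arbitrary):

```ruby
small = 2**10
big   = 2**100   # far beyond a 64-bit machine word

# Both are Integers; callers rarely need to care which
# internal representation is in use.
small.is_a?(Integer)   # true
big.is_a?(Integer)     # true

# Parsing follows the same rule: the result is as large as it needs to be.
parsed = Integer("1267650600228229401496703205376")
parsed == big          # true
```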

> > > 2. Do we use to_i or Integer(#) - Integer raises and to_i
> > > does not
> > > 3. Do we use Float or to_f - Float raises and to_f does not
> >
> > Good point. This RCR should specify this. I would think it
> > best if an exception is raised when the full string doesn't
> > parse as the target type. I'll change the implementation to
> > use the methods that raise exceptions.
>
> In the two cases above I think an exception should be raised.
>
> > require 'parsedate'
> > class MyTime < Time
> >   def self.from_s(s)
> >     gm(*ParseDate.parsedate(s))
> >   end
> > end
> >
> > and then make the default be a MyTime instead of a Time.
>
> Yes, that would work, and be a pain. My thought is that if
> #from_s was integrated into Ruby, we wouldn't have this problem.
> The parsing functionality would be built into the Time class.

Maybe I just picked the wrong functionality for Time.from_s.

> But even so, there will always be cases like those above, but
> with different classes. Is there a way to handle them in an
> aesthetically pleasing way?

You'll also find just as many holes when trying to use obj.to_s
to convert an object to a string. Sometimes it isn't going to
do it like you want. You could argue that we don't need any
convention for #to_* methods based on that. I think some
convention for specifying default ways to go to and from
another specific class is a good thing. Preferably those
methods could take optional arguments to modify the behavior.
Or just have additional methods.

If you have a better idea I'm listening. I just proposed about
the simplest way to convert from a specific type to an arbitrary
type (#to_* provides arbitrary type to specific type). There
have been other solutions for going from arbitrary to
arbitrary, but I have yet to see an application for that.

···

--- Jim Freeze <jim@freeze.org> wrote:

On 9/13/05, Eric Mahurin <eric_mahurin@yahoo.com> wrote:


In one way what you are suggesting is just a serialization
method, it just so happens that you are suggesting a string be used
to store the serialized data.

  obj.to_s.from_s == obj #=> true

Which is essentially a serialization of obj, and is OO.
Right now, Marshal is functional

   Marshal.load(Marshal.dump(obj)) == obj #=> true

One could argue that we have this, but with YAML.
  YAML.load(obj.to_yaml) == obj #=> true
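
A runnable version of the two round trips above, as a sketch. The sample hash is made up and sticks to plain strings and numbers so that it loads the same way across YAML library versions.

```ruby
require 'yaml'

obj = { "title" => "foobar", "n" => 4, "multiplier" => 3.14 }

# Marshal: a binary, machine-oriented byte stream.
marshal_copy = Marshal.load(Marshal.dump(obj))
marshal_copy == obj   # true

# YAML: text based, but still a serialization of the whole object.
yaml_copy = YAML.load(obj.to_yaml)
yaml_copy == obj      # true
```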

So, why don't we add a #from_yaml or #yaml_load to every
object? I don't think I want this done automatically, but it
would be nice to have it available so I can extend a class or
object as needed.

I think that every time we visit this subject, it goes back to
similar arguments as to why every object doesn't have puts,
such as "fred".puts. Matz has made statements on this
which I agree with. Personally I think that puts("fred") is more natural.

But, I'm warming up to #from_s more and more, but still not sure yet.

···

On 9/13/05, Eric Mahurin <eric_mahurin@yahoo.com> wrote:

> From my perspective, the only reason for the distinction
> between Fixnum and Bignum is the efficiency (runtime and
> memory) of Fixnum - a very good benefit. Otherwise, I think
> you should consider them the same. Many of the methods in

Agreed

--
Jim Freeze

> In one way what you are suggesting is just a serialization
> method, it just so happens that you are suggesting a string
> be used to store the serialized data.
>
>   obj.to_s.from_s == obj #=> true

With my proposal you would write the above as:

obj.class.from_s(obj.to_s) == obj #=> true
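
As a sketch of that round trip for a few classes where it does make sense, assuming strict from_s definitions along the lines discussed earlier in the thread (the sample values are arbitrary):

```ruby
def Integer.from_s(s); Integer(s); end
def Float.from_s(s);   Float(s);   end
def Symbol.from_s(s);  s.to_sym;   end
def String.from_s(s);  s.to_s;     end

[42, 3.14, :downcase, "foobar"].each do |obj|
  restored = obj.class.from_s(obj.to_s)
  raise "round trip failed for #{obj.inspect}" unless restored == obj
end
```

A Regexp would not survive this particular round trip, since Regexp#to_s yields a string like "(?-mix:fo+)" rather than the original source. That fits the point that from_s is meant for parsing human-written strings, not for undoing to_s in general.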

But, the intent is not to do something like Marshal does. I'm
only proposing klass.from_s methods where it makes sense, not
for handling all classes like Marshal does. Only if you would
want to parse an object of a certain type from a human
readable/writable string would you want to make a from_s class
method for that type.

On top of that, Marshal really doesn't help with type
conversion. It only converts from a type to a machine readable
byte stream and back. It's kind of like changing the storage
from memory to file like mmap does?? YAML seems similar too.

> Which is essentially a serialization of obj, and is OO.
> Right now, Marshal is functional
>
>    Marshal.load(Marshal.dump(obj)) == obj #=> true
>
> One could argue that we have this, but with YAML.
>   YAML.load(obj.to_yaml) == obj #=> true
>
> So, why don't we add a #from_yaml or #yaml_load to every
> object? I don't think I want this done automatically, but it
> would be nice to have it available so I can extend a class or
> object as needed.

I'm not sure what that would buy you.

> I think that every time we visit this subject, it goes back
> to similar arguments as to why every object doesn't have puts,
> such as "fred".puts. Matz has made statements on this
> which I agree with. Personally I think that puts("fred") is
> more natural.

It probably ends up like that because in C++ objects can write
themselves out to a stream and read themselves in from a
stream. I like what is there now too. I haven't used C++ in a
while, so I can't say a whole lot about how it is managed
(where the methods live).

> But, I'm warming up to #from_s more and more, but still not
> sure yet.

Good. I'd like to know what you finally decide on for your
command-line parser.

···

--- Jim Freeze <jim@freeze.org> wrote:

On 9/13/05, Eric Mahurin <eric_mahurin@yahoo.com> wrote:

> > From my perspective, the only reason for the distinction
> > between Fixnum and Bignum is the efficiency (runtime and
> > memory) of Fixnum - a very good benefit. Otherwise, I think
> > you should consider them the same. Many of the methods in
>
> Agreed
>
> --
> Jim Freeze


[snip]

> I think that every time we visit this subject, it goes back to
> similar arguments as to why every object doesn't have puts,
> such as "fred".puts. Matz has made statements on this
> which I agree with. Personally I think that puts("fred") is more natural.

We don't have puts, but we have display

irb(main):005:0> 12.display
12=> nil
irb(main):006:0> [1,2,3].display
123=> nil
irb(main):007:0> "a string".display
a string=> nil
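
Object#display also takes an optional IO argument (it defaults to standard output), which is easy to see with a StringIO. A small sketch:

```ruby
require 'stringio'

io = StringIO.new
12.display(io)           # writes "12"
"a string".display(io)   # writes "a string"
io.string                # "12a string"
```

On Ruby 1.9 and later, an Array displays in its inspect form ("[1, 2, 3]") rather than as the concatenated "123" shown in the irb session above.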

regards,

Brian

···

[snip]

--
http://ruby.brian-schroeder.de/

Stringed instrument chords: http://chordlist.brian-schroeder.de/

My problem with the RCR (which I added to the comments) goes like this:

#from_s is a good design pattern for your own classes. Do it.

The RCR is to change the core language classes. Why?
It's not for convenience, because the same methods already exist, just in different forms.

So I presume it's for consistency. The problem with this is that it still won't be consistent, because some classes do not have an unambiguous string->instance conversion path. String#split exists because there are lots of real-world cases with different delimiters for arrays. Array.from_s would need to
   * account for this (and thus have a different interface, with optional #split type param)
   * not account for it (requiring #split for all "non-standard" string cases)
   * not exist (as your RCR seems to suggest)
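
To illustrate the first two options, here is a purely hypothetical Array.from_s; it is not part of the RCR or of Ruby, and the comma default is an arbitrary assumption:

```ruby
# Hypothetical: Array.from_s is NOT part of the RCR or of Ruby.
def Array.from_s(s, delimiter = ",")
  s.split(delimiter)
end

Array.from_s("a,b,c")        # ["a", "b", "c"]
Array.from_s("a b c", " ")   # ["a", "b", "c"]
Array.from_s("a b c")        # ["a b c"], the default guessed wrong
```

Even with the delimiter settled, the elements stay strings, so the element type is a second ambiguity on top of the first.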

If this is about object serialization and later deserialization, Marshal and YAML and friends exist.

If this is about object deserialization only, then the real world jumps in and says:
   * Too many ambiguous cases to make this clear for more than a few core classes

   * So what's the point?

···

On Sep 13, 2005, at 3:50 PM, Eric Mahurin wrote:

> But, the intent is not to do something like Marshal does. I'm
> only proposing klass.from_s methods where it makes sense, not
> for handling all classes like Marshal does. Only if you would
> want to parse an object of a certain type from a human
> readable/writable string would you want to make a from_s class
> method for that type.

I forgot about that one. This would be great to override if
the object was a big datastructure and you could easily write
it out a chunk at a time rather than convert it to one massive
string first. But, I haven't seen anybody use this or override
it.

The complement to this obj.display(io) method would be a
klass.read(io) method, but it gets even harder to do than the
klass.from_s(s) of my proposal because you don't know how much
to read from the io (what would String.read(io) do? - maybe
stop at a newline?). At some point you just have to go to
traditional parsing. But, maybe it is still useful. I don't
know.
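
One way the "stop at a newline" idea could be sketched, assuming one value per line. The `read_as` helper is hypothetical and not part of the RCR; it just reads a line and hands it to klass.from_s:

```ruby
require 'stringio'

def Integer.from_s(s); Integer(s); end
def String.from_s(s);  s.to_s;     end

# Hypothetical helper: read one line from io and convert it with
# klass.from_s. Returns nil once the input is exhausted.
def read_as(klass, io)
  line = io.gets
  line && klass.from_s(line.chomp)
end

io = StringIO.new("42\nhello world\n")
read_as(Integer, io)   # 42
read_as(String, io)    # "hello world"
read_as(String, io)    # nil, nothing left to read
```

Anything that does not fit on one line would need real parsing, which is the limitation described above.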

···

--- Brian Schröder <ruby.brian@gmail.com> wrote:

> > [snip]
> >
> > I think that every time we visit this subject, it goes back
> > to similar arguments as to why every object doesn't have puts,
> > such as "fred".puts. Matz has made statements on this
> > which I agree with. Personally I think that puts("fred") is
> > more natural.
>
> We don't have puts, but we have display
>
> irb(main):005:0> 12.display
> 12=> nil
> irb(main):006:0> [1,2,3].display
> 123=> nil
> irb(main):007:0> "a string".display
> a string=> nil


It is not for consistency. It is for converting a string to a
somewhat arbitrary class. It doesn't need to be in every
class. Just like #to_i is in some classes and not others.
I've never suggested putting it in Array. I'm also not
suggesting it be used for object deserialization.

To get an idea of the usefulness, see the example I gave. How
would you implement this simple option parser API (what started
this thread)?

ARGV.replace(%w(
    -n 4
    -multiplier 3.14
    -q
    -title foobar
    -pattern fo+
    -time 5:55PM
    -method downcase
    a b c
))
options = argv_options(
    :n => 1,
    :multiplier => 1.0,
    :q => false,
    :title => "hello world!",
    :pattern => /.*/,
    :time => Time.new,
    :method => :to_s,
    :default => 1.23
)

#=> {:default=>1.23, :time=>Wed Sep 14 17:55:00 Central Daylight Time 2005,
#    :n=>4, :multiplier=>3.14, :title=>"foobar",
#    :q=>true, :method=>:downcase, :pattern=>/fo+/}

I don't think you'll find a cleaner/more flexible solution than
the one I gave.
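
For comparison, one possible answer that leaves the core classes untouched is a lookup table keyed on the class of the default. This is only a sketch: `CONVERTERS` and `convert_like` are hypothetical names, and the table has to be extended by hand for each new type, which is the cost klass.from_s is meant to avoid.

```ruby
require 'time' # adds Time.parse

# Hypothetical lookup table: class of the default => conversion lambda.
CONVERTERS = {
  Integer => ->(s) { Integer(s) },
  Float   => ->(s) { Float(s) },
  String  => ->(s) { s },
  Symbol  => ->(s) { s.to_sym },
  Regexp  => ->(s) { Regexp.new(s) },
  Time    => ->(s) { Time.parse(s) },
}

# Convert the string s to the same kind of object as default.
def convert_like(default, s)
  _, converter = CONVERTERS.find { |klass, _| default.is_a?(klass) }
  raise ArgumentError, "no converter for #{default.class}" unless converter
  converter.call(s)
end

convert_like(1, "4")              # 4
convert_like(1.0, "3.14")         # 3.14
convert_like(:to_s, "downcase")   # :downcase
convert_like(/.*/, "fo+")         # /fo+/
```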

···

--- Gavin Kistner <gavin@refinery.com> wrote:

On Sep 13, 2005, at 3:50 PM, Eric Mahurin wrote:
> > But, the intent is not to do something like Marshal does. I'm
> > only proposing klass.from_s methods where it makes sense, not
> > for handling all classes like Marshal does. Only if you would
> > want to parse an object of a certain type from a human
> > readable/writable string would you want to make a from_s class
> > method for that type.
>
> My problem with the RCR (which I added to the comments) goes
> like this:
>
> #from_s is a good design pattern for your own classes. Do it.
>
> The RCR is to change the core language classes. Why?
> It's not for convenience, because the same methods already
> exist, just in different forms.
>
> So I presume it's for consistency. The problem with this is
> that it still won't be consistent, because some classes do not
> have an unambiguous string->instance conversion path.
> String#split exists because there are lots of real-world cases
> with different delimiters for arrays. Array.from_s would need to
>    * account for this (and thus have a different interface, with
>      optional #split type param)
>    * not account for it (requiring #split for all "non-standard"
>      string cases)
>    * not exist (as your RCR seems to suggest)
>
> If this is about object serialization and later
> deserialization, Marshal and YAML and friends exist.
>
> If this is about object deserialization only, then the real
> world jumps in and says:
>    * Too many ambiguous cases to make this clear for more than
>      a few core classes
>
>    * So what's the point?
