FEATURE SUGGESTION: Accept default value for to_f and to_i

I suggest that to_i() and to_f() have an optional parameter added with
the default value of 0 (for backwards compatibility).

This would allow code like

if astring.to_f(nil)
  # valid, so use it
else
  # not a valid float, nil was returned, so handle error
end

Currently, from the output of these functions, a conversion error is
indistinguishable from a valid input of 0.

It would allow even more succinct code when, say, reading in a
configuration value :
delay = configuration['delay'].to_f(DEFAULT_DELAY)

I find this pattern of providing a default that is returned in the event
of an error (instead of throwing an exception or returning nil or 0)
allows for simple, safe and succinct code.

Another example is xpath_value(aRootNode,aXPath,aDefault=nil) which
returns aDefault if there is any problem returning the value selected by
aXPath (it often doesn't matter what the problem is).
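
To make the ambiguity concrete, here is a minimal sketch. The helper name `to_f_or` is hypothetical (it is not part of Ruby) and only illustrates the semantics being proposed. It assumes the `exception: false` keyword of `Kernel#Float`, which exists in Ruby 2.6 and later and so was not available when this thread was written.

```ruby
# Today both of these return 0.0, so the caller cannot tell them apart.
"0".to_f     # 0.0
"abc".to_f   # 0.0

# Hypothetical helper illustrating the proposal: return `default`
# when the string is not a valid float.
def to_f_or(str, default = 0.0)
  # Float(..., exception: false) returns nil instead of raising (Ruby 2.6+).
  value = Float(str, exception: false)
  value.nil? ? default : value
end

to_f_or("0", nil)    # 0.0
to_f_or("abc", nil)  # nil
to_f_or("abc", 2.5)  # 2.5
```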

···

--
Posted via http://www.ruby-forum.com/.

Hi,

>I suggest that to_i() and to_f() have an optional parameter added with
>the default value of 0 (for backwards compatibility).
>
>This would allow code like
>
>if astring.to_f(nil)
>  # valid, so use it
>else
>  # not a valid float, nil was returned, so handle error
>end

I think Float(astring) that raises an exception for invalid string can
do the work for you.

>It would allow even more succinct code when, say, reading in a
>configuration value :
>delay = configuration['delay'].to_f(DEFAULT_DELAY)

Try

  delay = Float(configuration['delay']) rescue DEFAULT_DELAY

              matz.
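
A short sketch of how that one-liner behaves. The method name `delay_from` and the value of `DEFAULT_DELAY` are assumptions made up for the example; only `Float()` and the `rescue` modifier come from the reply above.

```ruby
DEFAULT_DELAY = 1.5  # assumed value, for illustration only

# Strict conversion with a fallback, as suggested above.
# Float() raises ArgumentError for "soon" and TypeError for nil;
# both are StandardErrors, so the rescue modifier catches them.
def delay_from(configuration)
  Float(configuration['delay']) rescue DEFAULT_DELAY
end

delay_from('delay' => '0.25')  # 0.25
delay_from('delay' => 'soon')  # 1.5 (invalid string)
delay_from({})                 # 1.5 (missing key, so Float(nil) raises)
```

Note that the modifier form rescues every StandardError raised by the expression, not just conversion errors.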

···

In message "Re: FEATURE SUGGESTION: Accept default value for to_f and to_i" on Tue, 27 Nov 2007 11:15:19 +0900, Mr Magpie <gazmcgheesubs@yahoo.com.au> writes:

if (num = astring.to_f) == 0
  # may or may not be valid
  begin
    num = Float(astring)
  rescue
    # not a valid float, Float() raised, so handle error
  end
end

# valid num, so use it

You can wrap it in a "monkey patch" if you like.

T.
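
One way that wrapping might look, as a plain method rather than a patch to String. The name `parse_float` is hypothetical. Be aware of one limitation of this approach: `String#to_f` accepts a leading numeric prefix, so only the zero case is re-checked and a string such as "10a" still comes back as 10.0.

```ruby
# Hypothetical wrapper around the snippet above.
# Returns a Float, or nil when a zero result turns out to be bad input.
def parse_float(astring)
  num = astring.to_f
  return num unless num == 0   # fast path: non-zero result, no exception raised
  begin
    Float(astring)             # strict check, only for the ambiguous zero case
  rescue ArgumentError
    nil
  end
end

parse_float("3.5")   # 3.5
parse_float("0")     # 0.0
parse_float("abc")   # nil
parse_float("10a")   # 10.0, because to_f accepts a numeric prefix
```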

···

On Nov 26, 9:15 pm, Mr Magpie <gazmcghees...@yahoo.com.au> wrote:

I suggest that to_i() and to_f() have an optional parameter added with
the default value of 0 (for backwards compatibility).

This would allow code like

if astring.to_f(nil)
  # valid, so use it
else
  # not a valid float, nil was returned, so handle error
end

Yukihiro Matsumoto wrote:

I think Float(astring) that raises an exception for invalid string can
do the work for you.

>It would allow even more succinct code when, say, reading in a
>configuration value :
>delay = configuration['delay'].to_f(DEFAULT_DELAY)

Try

  delay = Float(configuration['delay']) rescue DEFAULT_DELAY

              matz.

Thank you for replying, matz. I very much like the approach of the
existing to_i and to_f methods of never throwing exceptions, as it allows
things like single-line method chaining without fear of exceptions in the
majority of cases where 0 is the desired default.

I'm merely suggesting that the error default value be customisable, to
distinguish bad input from valid input,
e.g. "0".to_i and "dds".to_i both return 0. Sometimes that's fine, but
other times we want to know whether the input was a valid integer or
not.

I believe exceptions are a performance drag, and these little functions
are often called thousands of times in a loop for processing input, so
I'd prefer to avoid a method that potentially causes exceptions. I think
they would be among the more performance critical of all Ruby methods,
which is why I'm suggesting these changes for the C based core rather
than just making my own monkey patch.

Anyway, thanks for Ruby and replying to me.

···


but, unless you use #to_i exactly as it is now that is still the case? with your suggestion this code

  'forty-two'.to_i(nil).abs

raises a NameError

so i think the point is that you either have

   'forty-two'.to_i.abs # let zero be the default

    Integer 'forty-two' # need to handle exceptions

and nothing in between because a to_i with a default that is non-numeric doesn't provide anything over the built-in Integer or Float.
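
For reference, a small sketch of what the chained call does when the conversion result is nil. No proposed API is used here: nil simply stands in for what a hypothetical 'forty-two'.to_i(nil) would return, and the method name `abs_or_error_class` is made up for the example.

```ruby
# nil stands in for the result of the hypothetical 'forty-two'.to_i(nil).
def abs_or_error_class(value)
  value.abs
rescue NoMethodError => e
  e.class
end

abs_or_error_class(-42)   # 42
abs_or_error_class(nil)   # NoMethodError

# NoMethodError is a subclass of NameError, so "rescue NameError" catches it too.
NoMethodError.ancestors.include?(NameError)   # true
```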

fwiw i use this a lot:

···

On Nov 26, 2007, at 9:19 PM, Mr Magpie wrote:

> I believe exceptions are a performance drag, and these little functions
> are often called thousands of times in a loop for processing input, so
> I'd prefer to avoid a method that potentially causes exceptions. I think

#
# convert a string to an integer of any base
#
   def strtod(s, opts = {})
     base = getopt 'base', opts, 10
     case base
       when 2
         s = "0b#{ s }" unless s =~ %r/^0b\d/
       when 8
         s = "0#{ s }" unless s =~ %r/^0\d/
       when 16
         s = "0x#{ s }" unless s =~ %r/^0x[0-9a-f]/i
     end
     Integer s
   end
#
# convert a string to an integer
#
   def atoi(s, opts = {})
     strtod("#{ s }".gsub(%r/^0+/,''), 'base' => 10)
   end

a @ http://codeforpeople.com/
--
share your knowledge. it's a way to achieve immortality.
h.h. the 14th dalai lama

Why not opts.fetch('base', 10)? Does getopt do something fancy?

Regards,
Jordan
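
Since `getopt` is not shown in the thread, its behaviour is unknown. The sketch below only illustrates the `Hash#fetch` alternative asked about, together with the optional base argument of `Kernel#Integer`. The method name `str_to_int` is hypothetical and is not the helper posted above.

```ruby
# Hash#fetch returns the stored value, or the given default when the key is absent.
{ 'base' => 16 }.fetch('base', 10)   # 16
{}.fetch('base', 10)                 # 10

# Hypothetical variant using fetch and Integer's base argument
# instead of prefixing the string with 0b / 0 / 0x.
def str_to_int(s, opts = {})
  Integer(s, opts.fetch('base', 10))
end

str_to_int('ff', 'base' => 16)   # 255
str_to_int('101', 'base' => 2)   # 5
str_to_int('42')                 # 42
```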

···

On Nov 26, 10:33 pm, "ara.t.howard" <ara.t.how...@gmail.com> wrote:

     base = getopt 'base', opts, 10

ara.t.howard wrote:

>> I believe exceptions are a performance drag, and these little
>> functions are often called thousands of times in a loop for processing
>> input, so I'd prefer to avoid a method that potentially causes
>> exceptions. I think
>
> but, unless you use #to_i exactly as it is now that is still the
> case? with your suggestion this code
>
>   'forty-two'.to_i(nil).abs
>
> raises a NameError

Of course if you were chaining class-specific methods like abs you would
have to use a default supporting that method.

> so i think the point is that you either have
>
>    'forty-two'.to_i.abs # let zero be the default
>
>     Integer 'forty-two' # need to handle exceptions
>
> and nothing in between because a to_i with a default that is
> non-numeric doesn't provide anything over the built-in Integer or Float.

Yes, that is the choice at present.

The benefits my suggestion provides are:
1) allows an application-specific default (of any type) to be supplied,
reducing the code required.
2) allows bad input to be unambiguously detected (it can distinguish
"fds".to_i from "0".to_i)
3) because the result of to_i always evaluates to true, you can't do
     num.to_i ? 'valid int' : 'invalid int'
but with my suggestion you could do
     num.to_i(false) ? 'valid int' : 'invalid int'
4) would be a minuscule change to the existing optimised C, unlike some
monkey patch I could do
5) would avoid performance-sapping exceptions
6) would avoid expensive regular expressions
7) as a default parameter, wouldn't affect existing code.

I don't think any other approach satisfies all of the above.

Thanks for your reply and the code examples.

Regards,

magpie
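
Point 3 rests on the fact that 0 is truthy in Ruby (only nil and false are falsy). A minimal sketch of that, assuming the `exception: false` keyword of `Kernel#Integer` from Ruby 2.6 and later as a stand-in for the proposed nil/false default. The method name `describe` is made up for the example.

```ruby
# Only nil and false are falsy in Ruby, so 0 counts as true.
def describe(value)
  value ? 'valid int' : 'invalid int'
end

describe("fds".to_i)                         # "valid int", although the input was bad
describe(Integer("fds", exception: false))   # "invalid int"
describe(Integer("0", exception: false))     # "valid int"
```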

···

On Nov 26, 2007, at 9:19 PM, Mr Magpie wrote:


I could be dense; well, I probably am. No, I'm sure about it. :wink: But
let me give it a go anyhow...

All of the functionality you mention can be had now, it's just that it
wouldn't be as fast. So most of the points are moot. Only 5 & 6
remain. Also, 7 isn't exactly true, since it would require an extra
compare operation in the back-end to see if a default was given and
return that, or else return 0. But that is probably negligible.

Regarding 5 & 6. I benchmarked some code against the default to_i/_f.
Here are the code and the results:

$ cat test.rb && ./test.rb
#!/usr/bin/env ruby

class String
  def to_i2(default=0)
    Integer(self) rescue default
  end
  def to_f2(default=0)
    Float(self) rescue default
  end
  def num?
    self =~ /^[-+.0-9]+$/
  end
  def to_i3(default=0)
    self.num? ? self.to_i : default
  end
  def to_f3(default=0)
    self.num? ? self.to_f : default
  end
end

require 'benchmark'

s1 = "10"
s2 = "10a"
s3 = "1.0"
s4 = "1.0a"

n = 1000000
Benchmark.bm { |x|
  x.report("to_i valid ") { n.times { s1.to_i } }
  x.report("to_i invalid ") { n.times { s2.to_i } }
  x.report("to_f valid ") { n.times { s3.to_f } }
  x.report("to_f invalid ") { n.times { s4.to_f } }
  x.report("to_i2 valid ") { n.times { s1.to_i2 } }
  x.report("to_i2 invalid") { n.times { s2.to_i2 } }
  x.report("to_f2 valid ") { n.times { s3.to_f2 } }
  x.report("to_f2 invalid") { n.times { s4.to_f2 } }
  x.report("to_i3 valid ") { n.times { s1.to_i3 } }
  x.report("to_i3 invalid") { n.times { s2.to_i3 } }
  x.report("to_f3 valid ") { n.times { s3.to_f3 } }
  x.report("to_f3 invalid") { n.times { s4.to_f3 } }
}

                    user    system     total        real
to_i valid      1.160000  0.110000  1.270000 (  1.307932)
to_i invalid    1.180000  0.100000  1.280000 (  1.318455)
to_f valid      1.570000  0.190000  1.760000 (  1.788322)
to_f invalid    1.980000  0.090000  2.070000 (  2.105102)
to_i2 valid     2.310000  0.350000  2.660000 (  2.703812)
to_i2 invalid  39.640000  1.240000 40.880000 ( 42.264511)
to_f2 valid     2.880000  0.310000  3.190000 (  3.377140)
to_f2 invalid  40.680000  1.100000 41.780000 ( 43.211592)
to_i3 valid     6.470000  0.390000  6.860000 (  6.975072)
to_i3 invalid   3.400000  0.350000  3.750000 (  3.959219)
to_f3 valid     7.250000  0.320000  7.570000 (  7.605764)
to_f3 invalid   3.600000  0.380000  3.980000 (  4.005525)

As you can see, you were correct about point 5 when it is the
exceptional case; however, regarding point 6, the regexp versions stay
within an order of magnitude of the built-in to_i/to_f (roughly 2x to
5.5x slower in these runs). That's not too awful.
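
One caveat on the regexp used for the benchmark: /^[-+.0-9]+$/ is permissive. It accepts strings such as "1.2.3", and ^ and $ match at line boundaries in Ruby rather than at the ends of the string. A stricter sketch follows; the constant and method names are hypothetical, and `Regexp#match?` assumes Ruby 2.4 or later.

```ruby
# Stricter patterns, anchored to the whole string with \A and \z.
INT_RE   = /\A[-+]?\d+\z/
FLOAT_RE = /\A[-+]?\d+(?:\.\d+)?\z/

def int_or(str, default = 0)
  INT_RE.match?(str) ? str.to_i : default
end

def float_or(str, default = 0.0)
  FLOAT_RE.match?(str) ? str.to_f : default
end

int_or("10")              # 10
int_or("10a", nil)        # nil
float_or("1.0")           # 1.0
float_or("1.2.3", nil)    # nil
"1.2.3" =~ /^[-+.0-9]+$/  # 0, i.e. the benchmark pattern accepts it
```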

If I may make three counter-points against your suggestion:

1.) It is weird and completely unintuitive for to_i to return anything
*other than an integer*! Maybe it's just me, but that would be like
calling to_a and getting back a String. Holy return types Batman, what
gives?

2.) Would a non-zero default really be used enough (or in cases where
the speed of using something like the code I listed above with regexps
is not fast enough) to warrant inclusion? Do you have any real-world
examples that are not just corner cases?

3.) (Like Ara said...) If you're worried about the performance of
exceptions, how helpful is it to do something like: "10a".to_i(nil) %
2? That's either going to terminate with a NoMethodError, or you'll
have to rescue it (eating just as many cycles).

Regards,
Jordan


Hi,

···

In message "Re: FEATURE SUGGESTION: Accept default value for to_f and to_i" on Tue, 27 Nov 2007 14:46:50 +0900, Mr Magpie <gazmcgheesubs@yahoo.com.au> writes:

>3) because the result of to_i always evaluates to true, you can't do
>    num.to_i ? 'valid int' : 'invalid int'
>but with my suggestion you could do
>    num.to_i(false) ? 'valid int' : 'invalid int'

Argument for String#to_i is already taken for base specification, i.e.

  "abcd".to_i(16) # => 43981

              matz.
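
A few more calls showing the existing radix argument of `String#to_i`, as a quick illustration of why the first positional argument is not free for a default:

```ruby
# The optional argument of String#to_i is the radix (2..36).
"abcd".to_i(16)   # 43981
"101".to_i(2)     # 5
"z".to_i(36)      # 35
"abcd".to_i       # 0, no leading decimal digits in base 10
```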

> 6) would avoid expensive regular expressions

First, you'd have to conjure up some expensive regular expressions.
You'll find that regular expressions are much more efficient than you
might think.

Pointless micro-benchmark time. String input of 'ab', 1 million
iterations.

                                                      user    system     total        real
string.to_i                                       0.625000  0.000000  0.625000 (  0.657000)
Integer(string) rescue 57                        32.422000  0.782000 33.204000 ( 34.844000)
/^-?\d+$/===string ? string.to_i : 57             1.125000  0.000000  1.125000 (  1.218000)
string.to_f                                       0.718000  0.000000  0.718000 (  0.843000)
Float(string) rescue 57                          32.281000  0.765000 33.046000 ( 34.750000)
/^-?\d+(?=\.\d+)?$/===string ? string.to_f : 57   0.672000  0.000000  0.672000 (  0.734000)

The only advantage to your proposal is to optimise an exceptional case.
If it's not an exceptional case, regex validation gives you almost as
much speed as you'd get with raw C.

Once you've written an application with this functionality, benchmarked
it, and found that that validation of string data as numeric is your
problem, you can go off and write a C extension to do what you want.
Raising this discussion before that point is just wasting your time.

Dan.
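
For anyone wanting to repeat the measurement, a sketch of how such a micro-benchmark might be set up. The method and constant names are made up, the iteration count is reduced, and no timings are claimed here since they depend on the Ruby version and machine. One assumption is flagged in the code: the float pattern uses a non-capturing group rather than the lookahead shown in the table.

```ruby
require 'benchmark'

INT_PATTERN = /^-?\d+$/
# Assumption: a non-capturing group is used for the fraction. The posted
# float pattern uses a lookahead, (?=\.\d+)?, which does not consume the
# fractional digits and so would not match a string such as "1.5".
FLOAT_PATTERN = /^-?\d+(?:\.\d+)?$/

def int_via_rescue(string)
  Integer(string) rescue 57
end

def int_via_regex(string)
  INT_PATTERN === string ? string.to_i : 57
end

def float_via_regex(string)
  FLOAT_PATTERN === string ? string.to_f : 57
end

string = 'ab'
n = 100_000  # fewer than the 1 million iterations above, to keep the run short

Benchmark.bm(18) do |x|
  x.report('string.to_i')     { n.times { string.to_i } }
  x.report('Integer rescue')  { n.times { int_via_rescue(string) } }
  x.report('regex then to_i') { n.times { int_via_regex(string) } }
  x.report('regex then to_f') { n.times { float_via_regex(string) } }
end
```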

Not wanting to enter into the discussion, I believe that the OP's idea
is a sound one. It might, however, be better to allow the default
behavior to be expressed by a block.

def to_i &blk
    return conversion if valid   # pseudocode: the parsed integer, when valid
    return blk.call if blk       # otherwise let the block decide
    # The tricky part here:
    # nil or 0, well 0 for backward compatibility
    0
end

Now I would use this very often

s.to_i do raise MyError, "What a numba??" end

better to raise MyError than what #Integer(str) raises, right ;).

cheers
R.
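
A runnable sketch of the block idea, written as a standalone method instead of a change to String#to_i. The name `to_i_with_fallback` is hypothetical, and the strict check assumes `Integer(str, 10, exception: false)` from Ruby 2.6 and later.

```ruby
class MyError < StandardError; end

# Hypothetical helper (not a core method): convert strictly, and let an
# optional block decide what happens when the input is not a valid integer.
def to_i_with_fallback(str)
  value = Integer(str, 10, exception: false)   # nil on bad input
  return value unless value.nil?
  return yield if block_given?
  0   # no block given: keep the backward-compatible default
end

to_i_with_fallback("42")          # 42
to_i_with_fallback("x")           # 0
to_i_with_fallback("x") { nil }   # nil

begin
  to_i_with_fallback("x") { raise MyError, "What a numba??" }
rescue MyError => e
  e.message   # "What a numba??"
end
```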

···

On Nov 27, 2007 8:32 AM, Yukihiro Matsumoto <matz@ruby-lang.org> wrote:

--

http://ruby-smalltalk.blogspot.com/

---
All truth passes through three stages. First, it is ridiculed. Second,
it is violently opposed. Third, it is accepted as being self-evident.
Schopenhauer (attr.)

Another point you did not mention (as far as I can see): optimizing
the performance of the /exceptional/ case is likely to yield only
minor benefits if at all.

Kind regards

robert

···

2007/11/27, MonkeeSage <MonkeeSage@gmail.com>:


--
use.inject do |as, often| as.you_can - without end

> All of the functionality you mention can be had now, it's just that it
> wouldn't be as fast. So most of the points are moot. Only 5 & 6
> remain. Also, 7 isn't exactly true, since it would require an extra
> compare operation in the back-end to see if a default was given and
> return that, or else return 0. But that is probably negligible.

Wow, thanks for doing the numbers Jordan.

I know it can be done now, but such basic functionality is best done
fast and right, i.e. in C. There would be zillions of tight loops in
frameworks, libraries and people's applications out there that do
string-to-number conversions, e.g. SQL results to a Fixnum.

Some have said that performance is less of an issue in the exceptional
case, but just how exceptional bad input is depends on the application,
and it shouldn't cause a 20x time difference. E.g. if 1 in 20 input
values are bad, the conversion takes twice as long.
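
As a rough check of that arithmetic, here is a sketch using the "total" column of the rescue-based to_i2 rows from the earlier benchmark (2.66 s valid, 40.88 s invalid per million calls). The 1-in-20 mix is the assumption stated above; the constant and method names are made up.

```ruby
# Totals per 1,000,000 calls, taken from the to_i2 rows of the earlier benchmark.
VALID_SECONDS   = 2.66
INVALID_SECONDS = 40.88

# Expected time per million calls when a given fraction of inputs is bad.
def expected_seconds(bad_fraction)
  (1 - bad_fraction) * VALID_SECONDS + bad_fraction * INVALID_SECONDS
end

expected_seconds(1.0 / 20)                   # about 4.57
expected_seconds(1.0 / 20) / VALID_SECONDS   # about 1.72, a bit under twice as long
```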

<very useful numbers deleted, see previous post>

> If I may make three counter-points against your suggestion:
>
> 1.) It is weird and completely unintuitive for to_i to return anything
> *other than an integer*! Maybe it's just me, but that would be like
> calling to_a and getting back a String. Holy return types Batman, what
> gives?

I get this, but it would only do so because "you asked for it". This
kind of thing isn't uncommon in Ruby though.

> 2.) Would a non-zero default really be used enough (or in cases where
> the speed of using something like the code I listed above with regexps
> is not fast enough) to warrant inclusion? Do you have any real-world
> examples that are not just corner cases?

If I was implementing Ruby I would lean towards nil as the default (0
would come a close second best in my mind). It would allow the 'or'
operators to be used for any defaults, e.g. (aString.to_i || 0) would
achieve a default of 0.

The most common example that comes to mind is reading in configuration,
where you are reading a value from a string source, e.g. XML, and if a
value isn't provided you return a sensible default which isn't
normally 0.

> 3.) (Like Ara said...) If you're worried about the performance of
> exceptions, how helpful is it to do something like: "10a".to_i(nil) %
> 2? That's either going to terminate with a NoMethodError, or you'll
> have to rescue it (eating just as many cycles).

In that example, you asked for a nil default, and that's what you got.

matz reminds us that to_i already takes a base argument. I guess the
default value would have to be the second default argument - not so
pretty.

Robert suggests a block handler. I don't know what the performance
implications are of blocks, but I guess it would work, and obviously
allow more advanced handling. Most of the time however I would just
return a value, not do any logic.

<Suggestion>

Because of the existing base argument on to_i, and the need to keep such
basic methods simple and fast, and the 7 points I listed previously, I
propose the following :

as_i(default=nil) and as_f(default=nil) methods added to Fixnum, Float,
String
For Float.as_i, NaN, Infinity etc would return the default.

If I'm outnumbered on the default argument, then as_i and as_f could
simply be equivalent to to_i and to_f, just with a nil default. I would
then use (aString.as_i || DEFAULT_VALUE).

If enough people would use an optional block and it's not a significant
performance drag, that could be added too.

</Suggestion>

Thanks again Jordan for the numbers,

magpie.
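
A rough sketch of what the suggested as_i / as_f could look like in plain Ruby, written as a monkey patch purely for illustration. These methods do not exist in Ruby; the strict parsing assumes the `exception: false` keyword of `Integer()` and `Float()` (Ruby 2.6 and later), and `DEFAULT_VALUE` is an assumed constant.

```ruby
# Hypothetical as_i / as_f, for illustration only.
class String
  def as_i(default = nil)
    value = Integer(self, 10, exception: false)
    value.nil? ? default : value
  end

  def as_f(default = nil)
    value = Float(self, exception: false)
    value.nil? ? default : value
  end
end

class Float
  # NaN and the infinities have no integer value, so they yield the default.
  def as_i(default = nil)
    finite? ? to_i : default
  end
end

DEFAULT_VALUE = 30  # assumed, for illustration

"15".as_i                       # 15
"soon".as_i                     # nil
("soon".as_i || DEFAULT_VALUE)  # 30
"1.5x".as_f(0.0)                # 0.0
Float::NAN.as_i(0)              # 0
```

With a nil default, the `||` idiom from the suggestion works because a successful conversion, including 0, is always truthy.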

···


NP. I was curious about the performance penalty myself. Might I
suggest, if you think it is truly worthy, that you write a small Ruby
extension in C to add as_i/as_f to class String? You could get the
behavior and speed you desire and still be compatible with MRI, and
if enough people found it useful it could find its way into the
standard lib.

Regards,
Jordan

···

On Nov 27, 8:18 pm, Mr Magpie <gazmcghees...@yahoo.com.au> wrote: