Scanning a string for decimal numbers

David Vallner wrote:

On Tuesday 14 February 2006 19:07 Jeppe Jakobsen wrote:

Thank you for clearing things up for me, but could you explain what
the last part of the expression Wilson provided me with means?

it's (?=[^\d])

This is equivalent to (?=\D)

A negative lookahead might work, too: (?!\d)

That's a positive zero-width lookahead. I think. Gotta love
regexspeak.

In English: look for a single character that's not a decimal digit,
and don't include it in the match.
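To make the difference between the two lookaheads concrete, here is a small sketch. The sample string is made up for illustration. The positive form needs an actual non-digit character after the number, so a number at the very end of the string is lost; the negative form only requires that no digit follows.

```ruby
text = "24 + 5"   # hypothetical input, ends with a number

# Positive lookahead: a non-digit character must follow the match.
positive = text.scan(/\d+(?=\D)/)

# Negative lookahead: the next character must not be a digit,
# which is also satisfied at the end of the string.
negative = text.scan(/\d+(?!\d)/)

p positive  # the trailing "5" is missing here
p negative  # both numbers are found
```

If the input can end with a number, the negative lookahead is probably the safer choice.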

I'd go with this quite simple regexp

/[-+]?\d+(?:,\d+)?/

If numbers like "1," should be detected, too, then just change the "+" in
the last group to "*".

If one wants to avoid matching numbers with leading zeros, it
becomes more complicated, but that does not seem to be worth the effort
in this case.
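A quick sketch of how that regexp behaves with scan, using a made-up input string. Because the pattern has no capturing group, scan returns a flat array of strings. Converting the comma to a period before to_f is my own addition, not something suggested above.

```ruby
number = /[-+]?\d+(?:,\d+)?/

text = "24,4 + 55,2 + 55 -44,0"   # hypothetical input
matches = text.scan(number)

# to_f expects a period as the decimal separator, so swap the comma first.
floats = matches.map { |s| s.tr(",", ".").to_f }

# The variant with "*" in the last group also accepts a trailing comma.
loose = "1, and 2,5".scan(/[-+]?\d+(?:,\d*)?/)

p matches
p floats
p loose
```

Note that a sign is only picked up when it sits directly in front of the digits; in "+ 55,2" the space keeps the "+" out of the match.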

Kind regards

    robert

OK, but wouldn't this regex do the same for me?

/[-+]?\d+\.?\d+/

Except that it will return an array containing my digit?

···

2006/2/12, Wilson Bilkovich <wilsonb@gmail.com>:

The scan process returns an array of arrays, so:
digits[0] is an Array containing '24.4'.
You could do:
digits.flatten!
just before digits[0], and get what you expect.
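Put together, the steps look roughly like this. It is only a sketch built from the irb session quoted further down; the variable names other than digits are mine.

```ruby
a = '24.5 + 24 + 24. + 24.4.'

# With a capturing group, scan returns one inner array per match.
digits = a.scan(/[-+]?(\d+(?:\.\d+)?)(?=[^\d])/)

# digits[0] is an Array, and Array has no to_f, hence the NoMethodError.
nested_fails = begin
  digits[0].to_f
  false
rescue NoMethodError
  true
end

# Flatten to plain strings first, then convert each one.
floats = digits.flatten.map(&:to_f)

p digits
p floats
```

Note that the sign sits outside the capturing group, so it would not be part of the captured strings.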

On 2/12/06, Jeppe Jakobsen <jeppe88@gmail.com> wrote:
> Yes that worked, but I intend to convert the digits of my array to floats,
> and I get a NoMethodError on to_f now when I do this:
>
> digits[0] = digits[0].to_f
>
> I don't understand that :-/
>
>
> 2006/2/12, Wilson Bilkovich <wilsonb@gmail.com>:
> >
> > Well, that's what I get for dashing off a quick e-mail before dinner.
> > The last problem Alexis mentioned is caused by the overly-specific
> > lookahead at the end. Here's a version that fixes that:
> >
> > irb(main):013:0> a = '24.5 + 24 + 24. + 24.4.'
> > => "24.5 + 24 + 24. + 24.4."
> > irb(main):014:0> a.scan /[-+]?(\d+(?:\.\d+)?)(?=[^\d])/
> > => [["24.5"], ["24"], ["24"], ["24.4"]]
> > irb(main):015:0>
> >
> > One of the characters '-' or '+', optionally
> > Followed by at least one digit.
> > Followed by an optional group containing a period, and one or more digits.
> > The capturing group ends when the next character is something other
> > than a digit.
> >
> > The (?:) mess is there so that '24.' doesn't end up with the period on the
> > end.
> >
> > On 2/11/06, Jeppe Jakobsen <jeppe88@gmail.com> wrote:
> > > Seems I accidentally got my text marked as a quote in my last mail, so
> > > I'll just send it again:
> > >
> > > Let me see if I got it right then. I'd like to use periods only for my
> > > decimal numbers. I also need normal integers, so 24. being accepted won't
> > > matter. Will this fix the problems you presented?:
> > > /[-+]?(\d+\.?\d*)(?=\s|$)/
> > >
> > >
> > > I don't know if it takes care of the last problem, because I didn't
> > > understand it.
> > >
> > >
> > > 2006/2/12, Jeppe Jakobsen <jeppe88@gmail.com>:
> > > >
> > > > 2006/2/12, Alexis Reigel <mail@koffeinfrei.org>:
> > > > >
> > > > > >
> > > > > > This should handle periods or commas as the separator.
> > > > > >
> > > > > > a = "24,4 + 55,2 + 55 - 44,0"
> > > > > > => "24,4 + 55,2 + 55 - 44,0"
> > > > > > a.scan /(\d+,?.?\d*)(?=\s|$)/
> > > > > > => [["24,4"], ["55,2"], ["55"], ["44,0"]]
> > > > > >
> > > > >
> > > > > Some problems here:
> > > > > - signs are disregarded ("-24,4" becomes "24,4")
> > > > > - Invalid numbers are accepted: eg. "24,.4" "24,." "24." "24,"
> > > > > - "." should be escaped. As you used it here, it means "any character"
> > > > > (except newline), so many invalid numbers are accepted (e.g. "24w"...)
> > > > > - If something other than whitespace follows the number, it is either
> > > > > not accepted or accepted wrongly, e.g. "24.4." becomes "4." instead of
> > > > > "24.4"
> > > > > - ...
> > > > >
> > > > >
> > > > > Alexis.
> > > >
> > > >
> > >
> > >
> >
> >
>
>
> --
> "winners never quit, quitters never win"
>
>

--
"winners never quit, quitters never win"

Yes, as long as the numbers are always at least two digits.
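To illustrate the two-digit caveat for /[-+]?\d+\.?\d+/: both \d+ parts need at least one digit each, so a single-digit number can never match. The input string and the comparison pattern below are assumptions chosen for the example.

```ruby
text = "5 + 24.5 + 7 + 10"   # hypothetical input

# Requires at least two digits in total, so "5" and "7" are skipped.
strict = text.scan(/[-+]?\d+\.?\d+/)

# Making the fractional part an optional group accepts single digits too.
relaxed = text.scan(/[-+]?\d+(?:\.\d+)?/)

p strict
p relaxed
```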

···

On 2/12/06, Jeppe Jakobsen <jeppe88@gmail.com> wrote:

OK, but wouldn't this regex do the same for me?

/[-+]?\d+\.?\d+/

Except that it will return an array containing my digit?
