Parsing periods of time: Code and questions

Hi!

I am presently working on a new version of my feed aggregator
‘rubric’. This time I am focussing on security issues so among other
changes the config file will no longer be a Ruby script. For parsing
a string that defines a period of time I wrote the following little
routine (GPL applies, long lines but still < 80 characters):

    def to_sec(argument)
      # Integers are taken to be seconds already.
      return argument if argument.is_a?(Integer)
      if argument.class == String
        case argument
        # Sums: split at the first '+' or ',' and add both sides.
        when /^(.*?)[+,](.*)$/ then to_sec($1) + to_sec($2)
        # Products: a leading number times whatever follows the '*'.
        when /^\s*([0-9_]+)\s*\*(.+)$/ then $1.to_i * to_sec($2)
        when /^\s*[0-9_]+\s*(s(ec(ond)?s?)?)?\s*$/ then argument.to_i
        when /^\s*([0-9_]+)\s*m(in(ute)?s?)?\s*$/ then $1.to_i * 60
        when /^\s*([0-9_]+)\s*h(ours?)?\s*$/ then $1.to_i * 3600
        when /^\s*([0-9_]+)\s*d(ays?)?\s*$/ then $1.to_i * 86400
        when /^\s*([0-9_]+)\s*w(eeks?)?\s*$/ then $1.to_i * 604800
        when /^\s*([0-9_]+)\s*months?\s*$/ then $1.to_i * 2419200
        else 0
        end
      end
    end

Two questions arise:

  1. Should one require integral values or would it be better to allow
    the use of decimal numbers?
  2. If decimal numbers are allowed would it be a good idea to allow
    the use of exponential notation?

What the above code allows you to do: You can specify

n seconds as: n, n s, n sec, n secs, n second, n seconds
n minutes as: n m, n min, n mins, n minute, n minutes
n hours   as: n h, n hour, n hours
n days    as: n d, n day, n days
n weeks   as: n w, n week, n weeks
n months  as: n month, n months

where a month has 28 days. You can use multiplications so that the
following values are identical:

  • 1 month
  • 4 weeks
  • 4 * 7 * 24 hours
  • 4 * 7 * 24 * 60 minutes
  • 4 * 7 * 24 * 60 * 60 seconds

You can also use additions, which can be written either in a mathematical
way using a ‘+’ sign or the way you would read them aloud (using commas),
so that the following values are identical:

  • 2 month + 3 weeks + 5 days + 7 hours + 11 minutes + 13 seconds
  • 2 month, 3 weeks, 5 days, 7 hours, 11 minutes, 13 seconds
  • 2 * 4 weeks, 3 * 7 days, 5 * 24 hours, 7 * 60 minutes,
    11 * 60 seconds, 13 seconds

Even ‘2 * 4 * 7 * 24 * 60 * 60 s, 3 * 7 * 24 * 60 * 60 s,
5 * 24 * 60 * 60 s, 7 * 60 * 60 s, 11 * 60 s, 13 s’ is possible
(although it is not recommended :-).
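
To make the equivalences above concrete, here is a small arithmetic check in plain Ruby. It does not call to_sec at all; the constant names are made up for this illustration, and the 28-day month is the convention stated in the post.

```ruby
# Unit sizes in seconds. The constant names are illustrative only.
MINUTE = 60
HOUR   = 60 * MINUTE
DAY    = 24 * HOUR
WEEK   = 7 * DAY
MONTH  = 4 * WEEK   # 28-day month, as stated in the post

# Several spellings of one 28-day month.
one_month = [MONTH, 4 * WEEK, 4 * 7 * DAY, 4 * 7 * 24 * HOUR,
             4 * 7 * 24 * 60 * MINUTE, 4 * 7 * 24 * 60 * 60]

# The addition example, summed term by term.
total = 2 * MONTH + 3 * WEEK + 5 * DAY + 7 * HOUR + 11 * MINUTE + 13

puts one_month.uniq.inspect   # prints [2419200]
puts total                    # prints 7110673
```

So every spelling of the long sum should come out as 7110673 seconds if the parser follows the rules described.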

Josef ‘Jupp’ SCHUGT

http://oss.erdfunkstelle.de/ruby/ - German comp.lang.ruby-FAQ
http://rubyforge.org/users/jupp/ - Ruby projects at Rubyforge
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Germany 2004: To boldly spy where no GESTAPO / STASI has spied before

“Josef ‘Jupp’ SCHUGT” jupp@gmx.de wrote in message
news:20040120230057.GA3604@jupp%gmx.de

Two questions arise:

  1. Should one require integral values or would it be better to allow
    the use of decimal numbers?

Allowing decimal numbers seems better to me. Seconds may in fact be
fractional values.

  2. If decimal numbers are allowed would it be a good idea to allow
    the use of exponential notation?

Why not? The standard conversion does it anyway. But maybe I’d change
the code above to reduce the number of regexps. Maybe something along
these lines:

    def to_sec(argument)
      return argument if argument.is_a?(Integer)
      if argument.class == String
        # Non-capturing groups, so that scan returns whole tokens.
        tokens = argument.scan(/ [*+\/-]
                               | [+-]?\d+(?:\.\d+)?(?:[eE][+-]?\d+)?
                               | \w+
                               /x)
        # process tokens
      end
    end

Well, a full-blown version would use its own parser…
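
One way the "process tokens" step could look is sketched below. This is only an illustration under several assumptions, not the code actually used in rubric: the names UNITS, TOKEN and period_to_sec are invented here, only '+', ',' and '*' are treated as operators, numbers carry no sign, and adjacent factors are simply multiplied together. Decimal and exponential notation are handled by Ruby's Kernel#Float.

```ruby
# Hypothetical unit table: pattern for the unit name => length in seconds.
UNITS = {
  /\As(ec(ond)?s?)?\z/ => 1,
  /\Am(in(ute)?s?)?\z/ => 60,
  /\Ah(ours?)?\z/      => 3600,
  /\Ad(ays?)?\z/       => 86_400,
  /\Aw(eeks?)?\z/      => 604_800,
  /\Amonths?\z/        => 2_419_200   # 28-day month, as in the original post
}

# Operators, unsigned decimal numbers with optional exponent, unit words.
TOKEN = / [*+,] | \d+(?:\.\d+)?(?:[eE][+-]?\d+)? | [a-z]+ /ix

def period_to_sec(text)
  total   = 0.0
  product = nil                      # running product of the current term
  text.scan(TOKEN).each do |tok|
    case tok
    when "+", ","                    # a term ends here: add it to the sum
      total += product if product
      product = nil
    when "*"                         # factors are multiplied implicitly
    when /\A\d/                      # a number
      product = (product || 1.0) * Float(tok)
    else                             # a unit name
      _, seconds = UNITS.find { |pattern, _| pattern =~ tok.downcase }
      raise ArgumentError, "unknown unit: #{tok}" unless seconds
      product = (product || 1.0) * seconds
    end
  end
  total += product if product
  total
end

puts period_to_sec("1.5 h")               # prints 5400.0
puts period_to_sec("1e3")                 # prints 1000.0
puts period_to_sec("2 * 4 weeks, 13 s")   # prints 4838413.0
```

Note that scan silently skips characters it cannot match, so a real version would still need to reject malformed input explicitly.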

Regards

robert

  1. Should one require integral values or would it be better to allow
    the use of decimal numbers?

Only integers.

  2. If decimal numbers are allowed would it be a good idea to allow
    the use of exponential notation?

No.

I think the above should stay simple, so that the developer’s effort can go
into the following wish :).

where a month has 28 days.

Wish: take into account the month in which the item was retrieved when
calculating the month’s length. That way, an item retrieved on the 13th will be
deleted on the 13th of the next month (with a hold time of 1 month).
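
A calendar-aware hold time along these lines could be sketched with Date#>> from Ruby's standard date library, which shifts a date by whole months. The helper name expires_on is hypothetical and only serves the illustration.

```ruby
require "date"

# Hypothetical helper: the date on which an item retrieved on `retrieved`
# expires after a hold time of `months` calendar months.
def expires_on(retrieved, months = 1)
  retrieved >> months   # Date#>> moves forward by whole calendar months
end

puts expires_on(Date.new(2004, 1, 13))   # prints 2004-02-13
puts expires_on(Date.new(2004, 1, 31))   # prints 2004-02-29
```

When the target month has no such day, Date#>> falls back to the last day of that month, which is why the second call lands on 29 February.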

All sounds really interesting and useful. One question: Are you sure
about making a month arbitrarily 28 days? I imagine that for somebody
using this method for the first time it would be very easy for this to
cause a bug that might take a while to debug.
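
The size of the surprise is easy to show with a small sketch. The start date is an arbitrary example and the variable names are invented for it; 2419200 is the month length used in the posted code.

```ruby
require "date"

MONTH_SECONDS = 2_419_200           # 28 days, the value used in the post
start = Date.new(2004, 1, 13)       # arbitrary example date

fixed    = start + MONTH_SECONDS / 86_400   # Date#+ adds whole days
calendar = start >> 1                       # one calendar month later

puts fixed      # prints 2004-02-10
puts calendar   # prints 2004-02-13
puts 12 * 28    # prints 336, the days in twelve 28-day "months"
```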

Francis
