Why does 'Date::strptime' accept invalid strings as valid month names?

Hi,

I was checking whether I can use the *Date::strptime* method to test if a string
is a *valid* month name, so I tried it with a few random strings. At first it
met my expectations:

Date::strptime("January",'%B')
# => #<Date: 2014-01-01 ((2456659j,0s,0n),+0s,2299161j)>
Date::strptime("JaNuary",'%B')
# => #<Date: 2014-01-01 ((2456659j,0s,0n),+0s,2299161j)>
Date::strptime("JaNua",'%B')
# => #<Date: 2014-01-01 ((2456659j,0s,0n),+0s,2299161j)>
Date::strptime("Foo",'%B')
# ArgumentError: invalid date
Date::strptime("Jan",'%B')
# => #<Date: 2014-01-01 ((2456659j,0s,0n),+0s,2299161j)>
Date::strptime("Janu",'%B')
# => #<Date: 2014-01-01 ((2456659j,0s,0n),+0s,2299161j)>

But the following results are beyond me:

Date::strptime("Marche",'%B')
# => #<Date: 2014-03-01 ((2456718j,0s,0n),+0s,2299161j)>
Date::strptime("Marcheee",'%B')
# => #<Date: 2014-03-01 ((2456718j,0s,0n),+0s,2299161j)>
Date::strptime("Mayil",'%B')
# => #<Date: 2014-05-01 ((2456779j,0s,0n),+0s,2299161j)>
Date::strptime("Juneixyt",'%B')
# => #<Date: 2014-06-01 ((2456810j,0s,0n),+0s,2299161j)>

Does that mean checking for a valid English month name this way is not safe? :(

--

Regards,
Arup Rakshit

Debugging is twice as hard as writing the code in the first place. Therefore,
if you write the code as cleverly as possible, you are, by definition, not
smart enough to debug it.

--Brian Kernighan

It looks like MRI compares the input only up to the length of each month
name (see ruby/ext/date/date_strptime.c at a3a6da5ec51c17390a6c8c6fe733a3a59eca4a69 · ruby/ruby · GitHub):

    case 'B':
    case 'b':
    case 'h':
      {
          int i;

          for (i = 0; i < (int)sizeof_array(month_names); i++) {
              size_t l = strlen(month_names[i]);
              if (strncasecmp(month_names[i], &str[si], l) == 0) {
                  si += l;
                  set_hash("mon", INT2FIX((i % 12) + 1));
                  goto matched;
              }
          }
          fail();
      }

So if your string starts with the name of a month, it will match, even
if further characters follow it.
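
For what it's worth, here is a minimal sketch of a stricter check, assuming the goal is to accept only an exact English month name (full or abbreviated, in any letter case). The helper valid_month_name? and the constant STRICT_MONTH_NAMES are made-up names for illustration; Date::MONTHNAMES and Date::ABBR_MONTHNAMES come from Ruby's date library.

```ruby
require 'date'

# Both arrays start with nil at index 0, so compact drops that placeholder.
# STRICT_MONTH_NAMES is a hypothetical constant, not part of the date library.
STRICT_MONTH_NAMES =
  (Date::MONTHNAMES.compact + Date::ABBR_MONTHNAMES.compact).map(&:downcase).freeze

# Hypothetical helper: true only when the whole string equals a month name,
# ignoring case and surrounding whitespace.
def valid_month_name?(str)
  STRICT_MONTH_NAMES.include?(str.to_s.strip.downcase)
end

valid_month_name?("JaNuary")   # => true
valid_month_name?("Jan")       # => true
valid_month_name?("JaNua")     # => false
valid_month_name?("Marcheee")  # => false
```

Because this compares the whole string rather than a prefix, trailing junk such as "Marcheee" is rejected instead of being silently ignored.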

Jesus.

On Mon, Jun 9, 2014 at 4:35 PM, Arup Rakshit <aruprakshit@rocketmail.com> wrote:

Does that mean checking for a valid English month name this way is not safe? :(

Some more exceptions. How are the ones below valid as well? Junk in the middle
of the actual name does not make Ruby raise *invalid date* either, as it is
supposed to:

Date::strptime("Jantuary",'%B')
# => #<Date: 2014-01-01 ((2456659j,0s,0n),+0s,2299161j)>
Date::strptime("Janxtuary",'%B')
# => #<Date: 2014-01-01 ((2456659j,0s,0n),+0s,2299161j)>

On Monday, June 09, 2014 05:56:14 PM Jesús Gabriel y Galán wrote:

So if your string starts with the name of a month, it will match, even
if further characters follow it.

Jesus.

You can see in the link I pasted before, that it's trying to match the
following names:

static const char *month_names[] = {
    "January", "February", "March", "April",
    "May", "June", "July", "August", "September",
    "October", "November", "December",
    "Jan", "Feb", "Mar", "Apr", "May", "Jun",
    "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
};

So, even though the full name January doesn't match, the abbreviation Jan does, and so it is parsed.
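
To make that concrete, here is a rough Ruby rendering of the same prefix comparison over the table above. It is only an illustration of the idea, not the code MRI runs; MONTH_TABLE and match_month_prefix are hypothetical names.

```ruby
# Same order as the C table: twelve full names, then twelve abbreviations.
MONTH_TABLE = %w[
  January February March April May June July August September
  October November December
  Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
].freeze

# Returns [month_number, leftover] for the first table entry that is a
# case-insensitive prefix of str, or nil when no entry matches.
def match_month_prefix(str)
  MONTH_TABLE.each_with_index do |name, i|
    # Compare only the first name.length characters, like strncasecmp does.
    if str[0, name.length].casecmp(name) == 0
      return [(i % 12) + 1, str[name.length..-1]]
    end
  end
  nil
end

match_month_prefix("Jantuary")  # => [1, "tuary"]  ("January" fails, "Jan" matches)
match_month_prefix("Marcheee")  # => [3, "eee"]
match_month_prefix("Foo")       # => nil
```

The leftover part shows exactly which characters the month directive never looked at.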

Jesus.

On Mon, Jun 9, 2014 at 5:20 PM, Arup Rakshit <aruprakshit@rocketmail.com> wrote:

Date::strptime("Jantuary",'%B')
# => #<Date: 2014-01-01 ((2456659j,0s,0n),+0s,2299161j)>
Date::strptime("Janxtuary",'%B')
# => #<Date: 2014-01-01 ((2456659j,0s,0n),+0s,2299161j)>

Another thing to note here, for Time objects. I used *Time#strftime* by mistake
with the *%Q* directive and got the output below, which made me check the docs,
where I found it is not listed. I expected an error for an invalid *format
string*, but it gave me the *string* back.

time = Time.now
time.strftime("%Q") # => "%Q"

On Monday, June 09, 2014 06:23:56 PM Jesús Gabriel y Galán wrote:

So, even though the full name January doesn't match, the abbreviation Jan does, and so it is parsed.

Jesus.

From the doco:

The directives begin with a percent (%) character. Any text not listed as a
directive will be passed through to the output string.

I interpret that to mean that "%<anything-not-actually-a-directive>" will pass through just like your example:

On Jun 10, 2014, at 12:05, Arup Rakshit <aruprakshit@rocketmail.com> wrote:

time = Time.now
time.strftime("%Q") # => "%Q"
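
A small sketch of that pass-through behaviour, assuming a fixed UTC time so the results do not depend on the clock or the local zone. The helper passes_through? is a hypothetical name; it only checks whether a single directive comes back unchanged.

```ruby
# Hypothetical helper: true when Time#strftime returns the directive
# unchanged, i.e. it was not recognised as a conversion.
def passes_through?(directive)
  Time.at(0).utc.strftime(directive) == directive
end

epoch = Time.at(0).utc
epoch.strftime("%Y")   # => "1970"
epoch.strftime("%Q")   # => "%Q"

passes_through?("%Q")  # => true
passes_through?("%Y")  # => false
```

A check like this could be used to flag a mistyped directive up front, since strftime itself raises no error for it.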

Thanks, Ryan. Yes, it is mentioned in the docs; somehow I skipped that while I
was reading them.

On Tuesday, June 10, 2014 04:10:49 PM Ryan Davis wrote:

From the doco:
> The directives begin with a percent (%) character. Any text not listed as a
> directive will be passed through to the output string.
