Refactoring underlying C code for MRI Ruby Time and Date leap year - am I missing something

I've been looking at the C code for the Date class (and at some of the
C code for the Time class) and it seems to be possible to make some of
this quite a bit faster (from 5 to 70 times faster, depending on the
function, computer, compiler options, etc), partly by replacing some
floating point calculations by integer calculations, and partly by
using some different algorithms.

While doing that I found these leap year functions in
ruby-2.4.0/ext/date/date_core.c

#define NMOD(x,y) ((y)-(-((x)+1)%(y))-1)
#define MOD(n,d) ((n)<0 ? NMOD((n),(d)) : (n)%(d))

inline static int
c_gregorian_leap_p(int y)
{
    return (MOD(y, 4) == 0 && y % 100 != 0) || MOD(y, 400) == 0;
}

and in ruby-2.4.0/time.c

static int
leap_year_p(long y)
{
    return ((y % 4 == 0) && (y % 100 != 0)) || (y % 400 == 0);
}

Unless I'm missing something (which is not impossible):

1. Instead of using "MOD", the date_core.c code could just use the
time.c code, of course with int y instead of long y. I don't see that
there is any need to use "MOD" instead of "%": all we are interested
in here is whether y is or is not exactly divisible by 4, 100 or 400,
and it doesn't matter whether, for example,
    "y % 4" is implemented as "y - 4 * trunc(y / 4)" or "y - 4 * floor(y / 4)"

2. In the time.c code, suppose y is not exactly divisible by 4:

the first equality in ((y % 4 == 0) && (y % 100 != 0)) evaluates as false,
and, as I understand it (and unless the compiler notices this)
(y % 400 == 0) is then also evaluated as false.

Compare with this:
    return (y % 4 == 0) && ((y % 100 != 0) || (y % 400 == 0));
Here (y % 4 == 0) evaluates as false, and that's the end of it.

Using gcc to compile and test functions using both formats, the
re-factored return expression seems a little faster, suggesting that
the time.c code is unecessarily evaluating "(y % 400 == 0)" for the
three-quarters of years which are not exactly divisible by 4.

So as I see it, the time.c (and date_core.c) code isn't wrong, because
it gives the correct result but it is a little slower than it could
be, and this could be remedied by simply moving two brackets.

Unless I'm missing something, hence this post to Ruby-Talk before I
try to report this.

I've been looking at the C code for the Date class (and at some of the
C code for the Time class) and it seems to be possible to make some of
this quite a bit faster (from 5 to 70 times faster, depending on the
function, computer, compiler options, etc), partly by replacing some
floating point calculations by integer calculations, and partly by
using some different algorithms.

While doing that I found these leap year functions in
ruby-2.4.0/ext/date/date_core.c

#define NMOD(x,y) ((y)-(-((x)+1)%(y))-1)
#define MOD(n,d) ((n)<0 ? NMOD((n),(d)) : (n)%(d))

inline static int
c_gregorian_leap_p(int y)
{
   return (MOD(y, 4) == 0 && y % 100 != 0) || MOD(y, 400) == 0;
}

I think that this is accounting for the fact that the Gregorian calendar has no year "0", after 1 B.C. comes A.D. 1
See Proleptic Gregorian calendar - Wikipedia for too much detail. ;-/

You seem to be correct in that the grouping of terms could be changed to:

   return MOD(y, 4) == 0 && (y % 100 != 0 || MOD(y, 400) == 0);

to avoid the needless check against 400 most of the time.

-Rob

···

On 2017-Apr-3, at 16:18 , Colin Bartlett <colinb2r@googlemail.com> wrote:

and in ruby-2.4.0/time.c

static int
leap_year_p(long y)
{
   return ((y % 4 == 0) && (y % 100 != 0)) || (y % 400 == 0);
}

Unless I'm missing something (which is not impossible):

1. Instead of using "MOD", the date_core.c code could just use the
time.c code, of course with int y instead of long y. I don't see that
there is any need to use "MOD" instead of "%": all we are interested
in here is whether y is or is not exactly divisible by 4, 100 or 400,
and it doesn't matter whether, for example,
   "y % 4" is implemented as "y - 4 * trunc(y / 4)" or "y - 4 * floor(y / 4)"

2. In the time.c code, suppose y is not exactly divisible by 4:

the first equality in ((y % 4 == 0) && (y % 100 != 0)) evaluates as false,
and, as I understand it (and unless the compiler notices this)
(y % 400 == 0) is then also evaluated as false.

Compare with this:
   return (y % 4 == 0) && ((y % 100 != 0) || (y % 400 == 0));
Here (y % 4 == 0) evaluates as false, and that's the end of it.

Using gcc to compile and test functions using both formats, the
re-factored return expression seems a little faster, suggesting that
the time.c code is unecessarily evaluating "(y % 400 == 0)" for the
three-quarters of years which are not exactly divisible by 4.

So as I see it, the time.c (and date_core.c) code isn't wrong, because
it gives the correct result but it is a little slower than it could
be, and this could be remedied by simply moving two brackets.

Unless I'm missing something, hence this post to Ruby-Talk before I
try to report this.

Unsubscribe: <mailto:ruby-talk-request@ruby-lang.org?subject=unsubscribe>
<http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-talk&gt;

I've been looking at the C code for the Date class (and at some of the
C code for the Time class) and it seems to be possible to make some of
this quite a bit faster (from 5 to 70 times faster, depending on the
function, computer, compiler options, etc), partly by replacing some
floating point calculations by integer calculations, and partly by
using some different algorithms.

While doing that I found these leap year functions in
ruby-2.4.0/ext/date/date_core.c

#define NMOD(x,y) ((y)-(-((x)+1)%(y))-1)
#define MOD(n,d) ((n)<0 ? NMOD((n),(d)) : (n)%(d))

inline static int
c_gregorian_leap_p(int y)
{
    return (MOD(y, 4) == 0 && y % 100 != 0) || MOD(y, 400) == 0;
}

and in ruby-2.4.0/time.c

static int
leap_year_p(long y)
{
    return ((y % 4 == 0) && (y % 100 != 0)) || (y % 400 == 0);
}

Unless I'm missing something (which is not impossible):

1. Instead of using "MOD", the date_core.c code could just use the
time.c code, of course with int y instead of long y.

Is it guaranteed that for all x : long this holds:

x % 100 == ((int) x) % 100

? Put differently, since 100 is not a pure power of two wouldn't we
change the outcome of that calculation?

I don't see that
there is any need to use "MOD" instead of "%": all we are interested
in here is whether y is or is not exactly divisible by 4, 100 or 400,
and it doesn't matter whether, for example,
    "y % 4" is implemented as "y - 4 * trunc(y / 4)" or "y - 4 * floor(y / 4)"

If you care for efficiency and we do not have negative values we could
also replace x % 4 == 0 with

x & 0x3 == 0

But I would consider that a micro optimization that the compiler might
do anyway.

So as I see it, the time.c (and date_core.c) code isn't wrong, because
it gives the correct result but it is a little slower than it could
be, and this could be remedied by simply moving two brackets.

Can you show some figures about the speed difference? It would also be
interesting what difference it makes for real Ruby programs.

Kind regards

robert

···

On Mon, Apr 3, 2017 at 10:18 PM, Colin Bartlett <colinb2r@googlemail.com> wrote:

--
[guy, jim, charlie].each {|him| remember.him do |as, often| as.you_can
- without end}
http://blog.rubybestpractices.com/

Personally, I would unit test all preprocessor procedures.

···

On Tue, Apr 4, 2017 at 10:14 AM, Rob Biedenharn <rob.biedenharn@gmail.com> wrote:

On 2017-Apr-3, at 16:18 , Colin Bartlett <colinb2r@googlemail.com> wrote:

I've been looking at the C code for the Date class (and at some of the
C code for the Time class) and it seems to be possible to make some of
this quite a bit faster (from 5 to 70 times faster, depending on the
function, computer, compiler options, etc), partly by replacing some
floating point calculations by integer calculations, and partly by
using some different algorithms.

While doing that I found these leap year functions in
ruby-2.4.0/ext/date/date_core.c

#define NMOD(x,y) ((y)-(-((x)+1)%(y))-1)
#define MOD(n,d) ((n)<0 ? NMOD((n),(d)) : (n)%(d))

inline static int
c_gregorian_leap_p(int y)
{
   return (MOD(y, 4) == 0 && y % 100 != 0) || MOD(y, 400) == 0;
}

I think that this is accounting for the fact that the Gregorian calendar has no year "0", after 1 B.C. comes A.D. 1
See Proleptic Gregorian calendar - Wikipedia for too much detail. ;-/

You seem to be correct in that the grouping of terms could be changed to:

   return MOD(y, 4) == 0 && (y % 100 != 0 || MOD(y, 400) == 0);

to avoid the needless check against 400 most of the time.

-Rob

and in ruby-2.4.0/time.c

static int
leap_year_p(long y)
{
   return ((y % 4 == 0) && (y % 100 != 0)) || (y % 400 == 0);
}

Unless I'm missing something (which is not impossible):

1. Instead of using "MOD", the date_core.c code could just use the
time.c code, of course with int y instead of long y. I don't see that
there is any need to use "MOD" instead of "%": all we are interested
in here is whether y is or is not exactly divisible by 4, 100 or 400,
and it doesn't matter whether, for example,
   "y % 4" is implemented as "y - 4 * trunc(y / 4)" or "y - 4 * floor(y / 4)"

2. In the time.c code, suppose y is not exactly divisible by 4:

the first equality in ((y % 4 == 0) && (y % 100 != 0)) evaluates as false,
and, as I understand it (and unless the compiler notices this)
(y % 400 == 0) is then also evaluated as false.

Compare with this:
   return (y % 4 == 0) && ((y % 100 != 0) || (y % 400 == 0));
Here (y % 4 == 0) evaluates as false, and that's the end of it.

Using gcc to compile and test functions using both formats, the
re-factored return expression seems a little faster, suggesting that
the time.c code is unecessarily evaluating "(y % 400 == 0)" for the
three-quarters of years which are not exactly divisible by 4.

So as I see it, the time.c (and date_core.c) code isn't wrong, because
it gives the correct result but it is a little slower than it could
be, and this could be remedied by simply moving two brackets.

Unless I'm missing something, hence this post to Ruby-Talk before I
try to report this.

Unsubscribe: <mailto:ruby-talk-request@ruby-lang.org?subject=unsubscribe>
<http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-talk&gt;

Unsubscribe: <mailto:ruby-talk-request@ruby-lang.org?subject=unsubscribe>
<http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-talk&gt;

I think that this is accounting for the fact that the Gregorian calendar
has no year "0", after 1 B.C. comes A.D. 1
See Proleptic Gregorian calendar - Wikipedia for too
much detail. ;-/

My apologies for my delay in replying. The proleptic Gregorian Calendar is
something I hadn't looked at in detail, but after quickly looking at that
article, as I understand it if the year before year 1 CE is year 1 BCE then
that year 1 BCE is a leap year, so I don't think that the absence of year 0
is the reason for Date using "MOD". Also Date is using the proleptic
Gregorian calendar in the version with a year 0.
require "date"
v = Date.civil(1, 1, 1, -1.0/0.0) #=> #<Date: 0001-01-01
((1721426j,0s,0n),+0s,-Infj)>
v - 1 #=> #<Date: 0000-12-31 ((1721425j,0s,0n),+0s,-Infj)>

You seem to be correct in that the grouping of terms could be changed to:

   return MOD(y, 4) == 0 && (y % 100 != 0 || MOD(y, 400) == 0);

to avoid the needless check against 400 most of the time.

Thanks - I was at least 99.9% sure of that, which made me wonder what - if
anything - I was missing. As I say in my reply to Robert Klemme below, I
must confess that my main objection to the Date formulation is why not just
use the Time formulation, and my main objection to the Time formulation is
that it offends my mathematical sense of aesthetics!

···

On Tue, Apr 4, 2017 at 6:14 PM, Rob Biedenharn <rob.biedenharn@gmail.com> wrote:

First, my apologies for my delay in replying. Thanks for your comments.

Is it guaranteed that for all x : long this holds:
x % 100 == ((int) x) % 100
Put differently, since 100 is not a pure power of two
wouldn't we change the outcome of that calculation?

I suspect that it is not guaranteed that that identity holds, but I wasn't
suggesting converting a "long" to an "int". (More precisely, I didn't
intend to suggest that conversion.) We may be talking at cross-purposes
here, so to be explicit:

ruby-2.4.0/time.c uses:
static int
leap_year_p(long y)
{
    return ((y % 4 == 0) && (y % 100 != 0)) || (y % 400 == 0);
}

ruby-2.4.0/ext/date/date_core.c uses:
#define NMOD(x,y) ((y)-(-((x)+1)%(y))-1)
#define MOD(n,d) ((n)<0 ? NMOD((n),(d)) : (n)%(d))
inline static int
c_gregorian_leap_p(int y)
{
    return (MOD(y, 4) == 0 && y % 100 != 0) || MOD(y, 400) == 0;
}

But why not just use:
inline static int
c_gregorian_leap_p(int y)
{
    return ((y % 4) == 0 && y % 100 != 0) || (y % 400) == 0;
}

Also, why use MOD(y, 4) and MOD(y, 400), but "y % 100" and not "MOD(y,
100)"?

***** ***** *****
I must confess that my main objection to the Date c_gregorian_leap_p code
is that it offends my sense of mathematical aesthetics! It uses a somewhat
more complicated - and aesthetically mixed - calculation than Time
leap_year_p for, I suspect, no particularly good reason.

Aesthetics, rather than speed, is also my main objection to the Time
leap_year_p code: from my perspective if a year isn't exactly divisible by
4 it's definitely not a leap year, so now ignore it, but that's not what
the Time code does. If the Time code was actually faster than what I
consider to be aesthetically better code, then I'd start thinking about
whether the speed increase is worth the worse aesthetics, but as it doesn't
seem faster I prefer the - from my view - aesthetically better code.

I strongly suspect the impact of this on the speed of Ruby will be minimal,
so I wouldn't be raising it if it weren't that I think there are other
changes which could be made to code in Date (and maybe also in Time) which
might be significantly faster, and - at any rate in some cases - give
shorter and/or cleaner code. (For example on a modernish laptap the Date
jd_to_ordinal C code takes about 800 nanoseconds, and refactoring reduces
this to about 40 nanoseconds; on an oldish netbook the timings are about
1400 nanoseconds reduced to about 160 nanoseconds.)

That said, if anyone is still using an older version of Date which is
written in pure Ruby rather than C, then my recollection from years ago
(when I first noticed this) is that the change makes a small but measurable
difference:

def self.gregorian_leap? (y) y % 4 == 0 && y % 100 != 0 || y % 400 == 0 end

If you care for efficiency and we do not have negative values we could
also replace x % 4 == 0 with
x & 0x3 == 0
But I would consider that a micro optimization that the compiler might
do anyway.

Yes, I've tried this function:
c_gregorian_leap_p_maybe_faster(int y)
    if (y < 0)
        y = -400 <= y ? y + 400 : -(y + 400);
    if ((y & 3) != 0)
        return 0;
    and so on.

My unexpert general view on trying to optimize code is you should
concentrate on writing clear code and leave it to the compiler to optimize
it, *unless* you are sure that you know better than the compiler writers
for your particular use case. (An example is although integer division by a
constant may well be optimized by a compiler, if you know that a
calculation will only use a small range of integers, your own optimized
division calculation may be somewhat faster than a compiler's more general
optimzation.)

I have benchmarked this C code, with mixed results. My general view of
benchmarking is that I only trust the results when they seem to confirm
what I think ought to be the case on general grounds (all other things
being equal, having more operations should not speed up the code, etc), so
I present the following with more than somewhat trepidation.

I used MinGW on Microsoft Windows and gcc on Lubuntu with optimizing
options -O0, -O1, -O2, -O3.
* lkly means dates that are between 1901 and 2099, which I consider most
likely;
* ulky means dates that are unlikely, that is dates that are not most
likely.
The reason for this distinction is that for likely dates we can sometimes
use two or three inequalities instead of a division or modulo operation.

* gregorian_leap_p: (MOD(y, 4) == 0 && y % 100 != 0) || MOD(y, 400) == 0
* gregorian_leap_p_as_time: (y % 4 == 0 && y % 100 != 0) || y % 400 == 0
* c_gregorian_leap_p_refactor: y % 4 == 0 && (y % 100 != 0 || y % 400 == 0)
* c_gregorian_leap_p_perhaps_faster: optimization for likely years
* gregorian_leap_p_maybe_faster: optimization for likely years; y&3 for
y%4;

average times per call in nanoseconds - "?" marks timings that seem
anomalous:
(I don't know how Ruby Talk will format the following: it should be OK if
copied and pasted into an editor using a fixed width font.)

Laptop - 2.50 GHz CPU; gcc -O0; gcc -O1; gcc -O2; gcc -O3
Lenovo ThinkPad T420s; =========; =========; =========; =========
Microsoft Windows 7 - 64bit; lkly ulky; lkly ulky; lkly ulky; lkly ukly
gregorian_leap_p; 3.7 3.7; 2.7 2.6; 2.5 2.4; 2.4 2.4
gregorian_leap_p_as_time; 3.0 3.0; 2.0 1.9; 2.5 2.4; 2.4 2.4
gregorian_leap_p_refactor; 2.6 2.6; 2.2? 2.1?; 1.5 1.4; 1.7 1.7
gregorian_leap_p_maybe_faster; 2.4 3.2; 2.3? 2.4?; 1.2 1.4; 1.0 1.4
gregorian_leap_p_perhaps_faster; 2.2 3.0; 1.3 1.6; 1.0 1.4; 1.1 1.5

Netbook - 1.66 GHz CPU; gcc -O0; gcc -O1; gcc -O2; gcc -O3
Samsung N220; =========; =========; =========; =========
Lubuntu Linux 32bit; lkly ulky; lkly ulky; lkly ulky; lkly ukly
gregorian_leap_p; 29 29; 14 14; 14 14; 14 14
gregorian_leap_p_as_time; 24 23; 13 13; 14 14; 14 14
gregorian_leap_p_refactor; 15 15; 9 9; 7 7; 7 7
gregorian_leap_p_maybe_faster; 14 17; 6 9; 6 9; 5 7
gregorian_leap_p_perhaps_faster; 15 18; 5 7; 5 8; 4 7

···

On Thu, Apr 6, 2017 at 2:27 PM, Robert Klemme <shortcutter@googlemail.com> wrote: