Date arithmetic and Zune bug



I read with disbelief about how most of Microsoft's 30 GB Zune players hung on Dec 31st, due to not being able to handle day number 366 of a leap year, and started thinking about some discussions we've had here over the years about the fastests/tightest/most elegant ways to convert between year-month-day and day number.

The code I wrote to handle this problem was written in Jan 2000, then we had another discussion in 2005, at which time one of the IBM regulars (probably Hack?) pointed me at an IBM report which used very similar ideas.

Anyway, my code from 2005 uses about 10-15 regular instructions, plus two unsigned multiplications, to do ymd2days, so about 10-20 clock cycles:

/* #define USE_UNSIGNED_MASK */

#ifdef USE_UNSIGNED_SHIFT
typedef unsigned mask_t;

# define INC_IF_MASK(x,mask) ((x) += (mask) & 1)
# define DEC_IF_MASK(x,mask) ((x) -= (mask) & 1)
#else
typedef int mask_t;

# define INC_IF_MASK(x,mask) ((x) -= (mask))
# define DEC_IF_MASK(x,mask) ((x) += (mask))
#endif

/* These operations are valid for any date after 1584,
* according to the Gregorian (modern) calendar
*/

#define STARTYEAR 1200 /* Must be divisible by 400! */

/* Table to simplify conversions between mm-dd and day_in_year */
static short daysToMonth[13] =
{-1,30,60,91,121,152,183,213,244,274,305,336,366};

unsigned ymd2days(unsigned y, unsigned m, unsigned d)
{
unsigned days, y100;

/* Start by setting March = month zero: */
mask_t mask = (mask_t) (m -= 3);
/* Mask will now be in the (mask_t) -2 to 9 range, a shift by 4 is enough to make it -1 or 0: */
mask >>= 4;

DEC_IF_MASK(y, mask); /* Decrement year if jan/feb */
m += (mask & 12); /* and increment the month: The year starts in March! */
y -= STARTYEAR;
y100 = y / 100; /* Centuries for leap year calc */
days = y * 365 + (y >> 2) - y100 + (y100 >> 2);
days += (unsigned) daysToMonth[m] + d;
return days;
}

The reverse conversion is the fun one: It turns out that the fastest method is to guess the year, then correct it if off by one. :-)

The code below will work on any platform which provides unsigned integers with at least 29 bits of precision, but it will be slightly faster with 30+ bit twos-complement signed ints.

void days2ymd(unsigned days, unsigned *yy, unsigned *mm, unsigned *dd)
{
unsigned y400, y100, y, m, d, gd;
mask_t mask;

/* Start by subtracting any full 400-year cycles: */
y400 = days / 146097;
days -= y400 * 146097;

#if 0
/* Very good approximation to the number of years, will be wrong
* (too high) 257 times
* in a 400-year cycle, i.e. in 0.18% of all calls.
*/

/* A good compiler will replace this with a scaled reciprocal MUL! */
y = (days + 1) * 400 / 146096;
#else
/* Useful and faster approximation to the number of years, will be
* wrong (too high) 910 times
* in a 400-year cycle, i.e. in 0.62% of all calls.
*
* Requires unsigned values with up to 29 significant bits!
*/
y = (days * 2871 + 1983) >> 20;
/* y = (days * 22967 + 59235) >> 23; */
/* peter (at) horizon.com suggested this! */
#endif

/* Calculate # of centuries:
* Since y will be in the 0 to 400 range, the following
* approximation can be used instead of a division by 100:
* y100 = y / 100; ~ (y * 41) >> 12;
*/
y100 = (y * 41) >> 12;

/* # of days in those full years */
gd = y * 365 /* Normal years */
+ (y >> 2) /* + leap years every 4 years */
- y100 /* - missing century leap years */
+ (y100 >> 2); /* + every 400 years */

/* test for the small chance of a wrong year: */
if (gd > days) {
y--; /* y will be in the [0-399] range! */
y100 = (y * 41) >> 12;
/* The 400-year correction can be skipped! */
gd = y * 365 + (y >> 2) - y100 /* + (y100 >> 2) */;
}

/* Calculate the offset into the current year: */
days -= gd;
/* Correct for starting date and 400-year cycles: */
y += STARTYEAR + y400 * 400;

#if 1
/* Make a FAST guess at the current month, can be too low: */
m = days >> 5;

mask = (mask_t) daysToMonth[m+1] - (mask_t) days;
/* mask will be in the -18 to 18 range, use shift by 5: */
mask >>= 5;
/* If the guess was wrong then the mask will be -1, otherwise 0: */

/* Increment month if needed */
INC_IF_MASK(m, mask);

/* The remainder is the day of the month */
d = days - (unsigned) daysToMonth[m];

/* Correct for the March 1 starting point: */
mask = (mask_t) (9 - m) >> 4;
m += 3;
INC_IF_MASK(y, mask);
m -= (unsigned) (mask & 12);

#else
/* Based on code from an IBM report: Slightly slower, but
* uses no table lookups!
*/
m = ((days + 31) * 1071);
d = (((m & 0x7fff) * 62669) >> 26) + 1;
m >>= 15;

mask = (mask_t) (10 - m) >> 4;
m += 2;
INC_IF_MASK(y, mask);
m -= (unsigned) (mask & 12);
#endif

*yy = y; *mm = m; *dd = d;
}

This second function uses four integer multiplications, one of them is a reciprocal to avoid division, the rest are all shifts/adds/subs etc.

Running time is on the order of 30-70 cycles, depending upon the speed of integer muls.

Terje


--
- <Terje.Mathisen@xxxxxxxxxxxxx>
"almost all programming can be viewed as an exercise in caching"
.



Relevant Pages