[fpc-devel] Magic numbers used in TryEncodeDate

Fri Apr 21 13:47:46 CEST 2006

Hi,

Here is the explanation I promised:

Graeme Geldenhuys wrote:
> What is the meaning of these magic numbers used in the TryEncodeDate method?
> I would like to document them for FPC once I know the answer.
> 
> 146097
> 1461
> 153
> 693900
> 
> I saw the constant DateDelta, but that is never used in TryEncodeDate
> (whereas in Delphi it is).
> [...]
> Function TryEncodeDate(Year,Month,Day : Word; Var Date : TDateTime) : Boolean;
> var
>   c, ya: cardinal;
> begin
>   Result:=(Year>0) and (Year<10000) and
>           (Month in [1..12]) and
>           (Day>0) and (Day<=MonthDays[IsleapYear(Year),Month]);
>  If Result then
>    begin
>      if month > 2 then
>       Dec(Month,3)
>      else
>       begin
>         Inc(Month,9);
>         Dec(Year);
>       end;
>      c:= Year DIV 100;
>      ya:= Year - 100*c;
>      Date := (146097*c) SHR 2 + (1461*ya) SHR 2 +
> (153*cardinal(Month)+2) DIV 5 + cardinal(Day) - 693900;
>    end
> end;

Given are Year, Month and Day. We intend to convert this to the number of days since a certain data, such as 1/1/1, 1/1/1900 or 1/1/1970, or 30/12/1899 for TDateTime. Obviously, if we have some formula that enumerates the days, regardless of the starting date, we can always add or subtract a 'delta' to change the starting date.

Intuitively we might think of a formula that looks like:

 365 * Year + 30 * Month + Day

Of course this formula is terribly wrong, and that is (mainly) because of two reasons:
1) not every year has 365 days in it
2) not every month has 30 days in it

Let us first deal with 2). As noted, not every month has 30 days, but even worse, some have 30, some 31 and one has 28/29: February. We are looking for a more accurate formula than the 30 * Month above. It should represent the number of days in the months before Month in the year. A simple trick that will allow us to give a simple formula, and will help us dealing with leap Februaries, is: let a year run from March until February. The following map is taken:

 December 2005 -> 'month'  9 of 'year' 2005
 January  2006 -> 'month' 10 of 'year' 2005
 February 2006 -> 'month' 11 of 'year' 2005
 March    2006 -> 'month'  0 of 'year' 2006
 April    2006 -> 'month'  1 of 'year' 2006
 ...
 December 2006 -> 'month'  9 of 'year' 2006
 January  2007 -> 'month' 10 of 'year' 2006
 February 2007 -> 'month' 11 of 'year' 2006
 March    2007 -> 'month'  0 of 'year' 2007

(I will use 'year' and 'month' instead of year and month to differentiate them.)

Now, the number of days in the current 'year' in the 'months' 0 .. Month-1 are given by:

 March:      0 -> 0
 April:      1 -> 31  (the 31 days of March)
 May:        2 -> 61  (the 31 days of March + the 30 days of April)
 June:       3 -> 92  (the 31 days of March + the 30 days of April + the 30 days of May)
 July:       4 -> 122 (... + 30 days of June)
 August:     5 -> 153 (... + 31 days of July)
 September:  6 -> 184 (... + 31 days of August)
 October:    7 -> 214 (... + 30 days of September)
 November:   8 -> 245 (... + 31 days of October)
 December:   9 -> 275 (... + 30 days of November)
 January:   10 -> 306 (... + 31 days of December)
 February:  11 -> 337 (... + 31 days of January)

This function is almost linear. The trick is to round a linear function, such that the increments are sometimes 30 and sometimes 31. For example take

 floor( 30.6 * Month + 0.4 ) = ( 153*Month + 2 ) DIV 5

This formula exactly gives the above function! To get the day of the 'year', we thus have the formula

 floor( 30.6 * Month + 0.4 ) + Day

Let us now deal with 1): we need a formula for the number of days in the previous 'years'. Remember that a year, called a leap year, has 366 days instead of 365 when:
 it is a multiple of 4,
 but not a multiple of 100,
 unless it is a multiple of 400.
So 2000 and 2004 are leap years, but 1900 is not.

So every period of 400 years has exactly the same number of days, because the leap year rule is periodic with period 400. The number of days is

 365*400 + 400/4 - 400/100 + 400/400 = 146097

So every century has about 146097/4 = 36524.25 days in it. Most centuries have 36524 days in it; only every fourth century has 36525 days in it. Similarly, every fourth year (within a century, e.g. 1901..1999) has 366 instead of 365 days in it.

Recall that we used 'years' running from March until February the next year. This means that the leap day of a leap year is actually the last day before that year. For example, the 'year' 2000 runs from March 2000 until February 2001, and the leap day of 200 is 29 February 2000, the last day of the 'year' 1999.

We now dissect a Year into two pieces: the century (c) and the year-within-the-century (ya). That is, we take Year = 100*c + ya, where 0 <= ya < 100. (Note: This does not correspond with the usual definition of century, which says e.g. that the 21th century runs from 2001 until 2100.)

We now write

  the number of days in the years before Year =
= the number of days in the centuries before c + the number of days in years before ya in the current century

For us, a century starts at March c*100 and ends at February (c+1)*100. E.g. c = 19 runs from 1 March 1900 until 29 February 2000. Hence, the extra leap year of every fourth century has its leap day not in e.g. c = 20, but as the last day of c = 19. Since we are looking at the quantity 'the number of days in the centuries before c', this leap day appears first in this quantity for c's divisible by 4, e.g. for c = 20. This explains in detail why 'the number of days in the centuries before c' can be given by

 floor( 36524.25 * c ) = (146097 * c) SHR 2

This function 'jumps' by 36525 instead of 36524 exactly when c is a multiple of 4.

Similarly, the leap day of every fourth year within a century is part of the 'year' before it. E.g. our 'year' 1903 runs (from 1 March 1903) until the leap day 29 February 1904. So now we see that 'the number of days in years before ya in the current century' can be given by

 floor( 365.25 * ya ) = ( (365*4+1) * ya ) SHR 2 = (1461 * ya) SHR 2

This function 'jumps' by 366 instead of 365 exactly when ya is a multiple of 4.

Putting this together, we find that 'the number of days in the years before Year' is given by

 floor( 36524.25 * c ) + floor( 365.25 * ya ) = (146097 * c) SHR 2 + (1461 * ya) SHR 2

Putting 1) and 2) together we find that the day enumeration is given by

 (146097 * c) SHR 2 + (1461 * ya) SHR 2 + ( 153*Month + 2 ) DIV 5 + Day + Delta,

where Delta depends on the choice of the starting date. For TDateTime, we want 31 December 1899 to be day 1 (by definition). Note that this date has has 'year' 1899 and 'month' 9, so solve Delta from

 1 = (146097 * 18) SHR 2 + (1461 * 99) SHR 2 + ( 153*9 + 2 ) DIV 5 + 31 + Delta =
   = 657436              + 36159             + 275                 + 31 + Delta =
   = 693901                                                             + Delta,

hence

 Delta = 693900

The DateDelta in Delphi is used to convert between Delphi 1 and Delphi 2 TDateTime. In D1, TDateTime was the number of days elapsed since 1 January 1; in D2 and onwards it is the number of days elapsed since 30 December 1899. So they introduced DateDelta, the number of days from 1/1/0001 to 31/12/1899, which is 693594.

The DateDelta comes close to our 693900, but it is not exactly the same, as it also incorporates the shift of a year by two months and the inclusion of the year 0 in our (146097*c) SHR 2 term. (Sidenote: actually the year 0 never existed; the year before 1 AD is 1 BC and not a magic year 0.)

Looking closer at the routine, things go wrong there for pre-12 December 1899 dates. The constants in the formula are ok, but the calculation is done using cardinals, which are unsigned! EncodeDate(1,1,1) returns 4294273703 instead of the negative number -693593.

Another note, about Math.DivMod: it seems to be implemented only for Words at the moment:

 procedure DivMod(Dividend: Integer; Divisor: Word;  var Result, Remainder: Word);

This is no real problem for this routine -- we can make c and ya Words and cast them to Longints in the formula (this also solves the pre-30/12/1899 problem, though that can also be done to cast the whole part before the minus sign at the end) -- but why is there no Cardinal (or Longint) version? And: is it {$INLINE}? Or does the compiler already optimize things like

 result   :=a div b;
 remainder:=a mod b;

or the here used

 result   :=a div b;
 remainder:=a-result*b;

BTW, we might need to check the DecodeDate function and others too :)

Regards,

Bram

PS Sorry if this mail arrives twice; I'm resending it because I haven't seen it appearing on the list yet in about an hour; also I used another email address than the one I'm subscribed with at the list, but funny enough yesterday, when I did that too, it worked :S