SP-Lang date/time¤
Type `datetime` is a value that represents a date and time in UTC, using a broken time structure.
Broken time means that year, month, day, hour, minute, second and microsecond are stored in dedicated fields, unlike e.g. a UNIX timestamp.
- Timezone: UTC
- Resolution: microseconds (six decimal digits)
Bit layout¤
The `datetime` is stored in a 64-bit unsigned integer (`ui64`), in little-endian format (native to Intel/AMD 64-bit).
Position | Component | Bits | Mask | Type* | Range | Remark |
---|---|---|---|---|---|---|
60-63 | | 4 | | | 0…15 | OK (0)/Error (8)/Reserved |
46-59 | year | 14 | | si16 | -8190…8191 | |
42-45 | month | 4 | 0x0F | ui8 | 1…12 | Indexed from 1 |
37-41 | day | 5 | 0x1F | ui8 | 1…31 | Indexed from 1 |
32-36 | hour | 5 | 0x1F | ui8 | 0…24 | |
26-31 | minute | 6 | 0x3F | ui8 | 0…59 | |
20-25 | second | 6 | 0x3F | ui8 | 0…60 | 60 is for leap second |
0-19 | microsecond | 20 | | ui32 | 0…1000000 | |
Note
*) Type is the recommended (minimal) byte-aligned type for the respective component.
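The layout can be illustrated with a minimal Python sketch that packs and unpacks the components using the bit widths from the table, with microsecond in the lowest bits. The names `FIELDS`, `pack_datetime` and `unpack_datetime` are hypothetical and not part of SP-Lang, and treating the 14-bit year as two's complement is an assumption.

```python
# Hypothetical helpers, not the SP-Lang API.
# (name, shift, width in bits), following the bit layout table.
FIELDS = (
    ("year", 46, 14),
    ("month", 42, 4),
    ("day", 37, 5),
    ("hour", 32, 5),
    ("minute", 26, 6),
    ("second", 20, 6),
    ("microsecond", 0, 20),
)


def pack_datetime(year, month, day, hour=0, minute=0, second=0, microsecond=0):
    """Pack broken-time UTC components into one 64-bit unsigned integer."""
    values = {
        "year": year, "month": month, "day": day, "hour": hour,
        "minute": minute, "second": second, "microsecond": microsecond,
    }
    result = 0
    for name, shift, width in FIELDS:
        mask = (1 << width) - 1
        result |= (values[name] & mask) << shift
    return result


def unpack_datetime(value):
    """Split a 64-bit datetime value back into its components."""
    out = {}
    for name, shift, width in FIELDS:
        out[name] = (value >> shift) & ((1 << width) - 1)
    # ASSUMPTION: the signed 14-bit year uses two's complement.
    if out["year"] >= 1 << 13:
        out["year"] -= 1 << 14
    return out
```

The status bits at the top of the value are left at zero here, which corresponds to OK (0) in the table.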
Timezone details¤
Timezone information originates from pytz, and thus from the IANA Time Zone Database.
Note
The time zone database has precision down to the minute, which means that seconds and microseconds remain untouched when converting from/to UTC.
The timezone data is represented by a filesystem directory structure, commonly located at `/usr/share/splang` or at the location specified by the `SPLANG_SHARE_DIR` environment variable.
The actual timezone data is stored in the `tzinfo` subfolder.
The timezone data is generated by the script `generate_datetime_timezones.py` during installation of SP-Lang.
Example of the `tzinfo` folder:
```
.
└── tzinfo
    ├── Europe
    │   ├── Amsterdam.sptl
    │   ├── Amsterdam.sptb
    │   ├── Andorra.sptl
    │   ├── Andorra.sptb
```
`.sptl` and `.sptb` files contain speed-optimized binary tables that support fast lookups for local time <-> UTC conversions.
`.sptl` is for little-endian CPU architectures (x86 and x86-64), `.sptb` is for big-endian architectures.
The file is memory-mapped into the SP-Lang process memory space, aligned on a 64-byte boundary, so that it can be used directly as a lookup table.
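As an illustration only, the following Python sketch resolves the table file for a zone and memory-maps it read-only. The functions `timezone_file_path` and `map_timezone_file` are hypothetical names, and mapping a zone such as `Europe/Amsterdam` directly to subdirectories of `tzinfo` is an assumption based on the example folder.

```python
import mmap
import os
import sys


def timezone_file_path(zone, share_dir=None):
    """Return the table file path for a zone name such as "Europe/Amsterdam"."""
    if share_dir is None:
        # Default location and environment variable as described in the text.
        share_dir = os.environ.get("SPLANG_SHARE_DIR", "/usr/share/splang")
    # .sptl for little-endian hosts, .sptb for big-endian hosts.
    ext = ".sptl" if sys.byteorder == "little" else ".sptb"
    return os.path.join(share_dir, "tzinfo", *zone.split("/")) + ext


def map_timezone_file(path):
    """Memory-map the file read-only; the caller closes the returned mmap."""
    with open(path, "rb") as f:
        return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
```

The mapping stays valid after the file object is closed, so the returned object can be sliced like a `bytes` value for lookups.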
Common structures¤
- `ym`: Year & month, `ym = (year << 4) + month`
- `dhm`: Day, hour & minute, `dhm = (day << 11) + (hour << 6) + minute`

Both structures are bit-wise parts of the `datetime` scalar value and can be extracted from `datetime` using `AND` and `SHR`.
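A short Python sketch of that extraction, assuming the month field starts at bit 42 and the minute field at bit 26 as in the bit layout table. The function names are hypothetical, and the year is treated as non-negative for simplicity.

```python
def extract_ym(dt):
    """Year (14 bits) and month (4 bits): shift right by 42, keep 18 bits."""
    return (dt >> 42) & 0x3FFFF


def extract_dhm(dt):
    """Day (5 bits), hour (5 bits), minute (6 bits): shift right by 26, keep 16 bits."""
    return (dt >> 26) & 0xFFFF
```

Because the fields are adjacent in the 64-bit value, each structure needs a single shift and a single mask.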
Timezone file header¤
The header length is 64 bytes.
Unspecified bytes are set to `0` and reserved for future use.
- Position `00...03`: `SPt` / magic identifier
- Position `04`: `<` for little-endian CPU architecture, `>` for big-endian
- Position `05`: Version (currently `1`, ASCII character)
- Position `08...09`: Minimal year/month (`min_ym`) in this file, month MUST BE 1
- Position `10...11`: Maximal year/month (`max_ym`) in this file
- Position `12...15`: The position of the "parser table" in the file, in units of 64 bytes (multiply by 64 to get the byte offset), typically `1` because the parser table is stored directly after the header
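The header fields listed above could be read with a sketch like the following. It is not the SP-Lang implementation: `parse_timezone_header` is a hypothetical name, and it assumes that the year/month fields are unsigned 16-bit integers, that the table position is an unsigned 32-bit integer, and that all of them use the byte order announced at position `04`. Only the first three magic bytes are checked, and the version byte is returned uninterpreted.

```python
import struct


def parse_timezone_header(data):
    """Parse the 64-byte header of a timezone table file (illustrative only)."""
    header = bytes(data[:64])
    if len(header) < 64:
        raise ValueError("header must be 64 bytes long")
    if header[0:3] != b"SPt":
        raise ValueError("bad magic identifier")
    endian = chr(header[4])
    if endian not in ("<", ">"):
        raise ValueError("unknown byte-order marker")
    # ASSUMPTION: two unsigned 16-bit fields and one unsigned 32-bit field,
    # stored in the byte order given by the marker at position 04.
    min_ym, max_ym, table_pos = struct.unpack_from(endian + "HHI", header, 8)
    return {
        "endian": endian,
        "version": header[5:6],  # raw byte, left uninterpreted
        "min_ym": min_ym,
        "max_ym": max_ym,
        "parser_table_offset": table_pos * 64,  # stored in 64-byte units
    }
```

With the typical stored value of `1`, the parser table starts at byte 64, directly after the header.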
Timezone parser table¤
The parser table is a lookup table used for conversion from the local date/time into UTC.
The table is organised into rows/years and columns/months.
Each cell is 4 bytes (32 bits) wide, so a row is 64 bytes long.
The first 12 cells are "primary parser cells"; their number reflects the number of the month (1...12).
The remaining 4 cells are "parser next cells"; the number `nX` is the index.
Primary parser cell¤
The position of the cell for a given date/time is calculated as `pos = (ym - min_ym) << 5`, which means that the year and month, minus the minimal year & month value of the table, are used to locate the cell.
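A worked sketch of this calculation follows. Reading `pos` as a bit offset from the start of the parser table is an assumption, motivated by the 32-bit cell width (a shift by 5 multiplies by 32); the function name is hypothetical.

```python
def primary_cell_position(year, month, min_ym):
    """Return (pos, byte_offset) of the primary parser cell for year/month."""
    ym = (year << 4) + month
    pos = (ym - min_ym) << 5  # formula from the text
    # ASSUMPTION: pos counts bits, so dividing by 8 gives the byte offset
    # relative to the start of the parser table.
    return pos, pos >> 3
```

Since `min_ym` always has month 1 and `ym` reserves 16 slots per year, `ym - min_ym` advances by 16 cells (one 64-byte row) per year and by one cell per month.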
Structure of the cell:
- `16` bits: range, 16bits, `dhm`
- `3` bits: `next`
- `7` bits: hour offset from UTC
- `6` bits: minute offset from UTC
`dhm` denotes the day, hour and minute within the year/month at which a time change (e.g. daylight-saving time start/end) is observed.
For a typical month, where no time change is observed, the `dhm` value represents the maximum in the given month.
If the `dhm` of an input date/time is numerically lower than the `dhm` from the primary cell, then the `hour` and `minute` information is used to adjust the date/time from local to UTC.
If the `dhm` is greater, then `next` contains the number of the "parser next cell", located at the end of the relevant parser table row.
Parser next cell¤
The "parser next cell" contains a "continuation" of the information for a month where a time change is observed. The "continuation" means the offset from UTC that applies once the local time has passed the time change boundary.
Structure of the cell:
- `16` bits: range, 16bits, `dhm`
- `3` bits: not used, set to `0`
- `7` bits: hour offset from UTC
- `6` bits: minute offset from UTC
`dhm` denotes the day, hour and minute within the year/month at which the NEXT time change (e.g. daylight-saving time start/end) is observed.
Because only a single time change per month is currently supported, this field is set to the maximum `dhm` for the given month.
The `hour` and `minute` information is used to adjust the date/time from local to UTC.
Note
Currently, only one time change per month is supported, which seems to be fully sufficient for all data in the IANA Time Zone Database.
Empty/unused next cells are zeroed.
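The choice between a primary cell and its "parser next cell" can be sketched as follows. This is a hedged illustration, not the actual implementation: `ParserCell`, `decode_cell` and `select_offset` are hypothetical names; the bit order inside a cell (fields packed from the most significant bits in the listed order) is an assumption; `next` is assumed to be a 0-based index into the four next cells; and an input `dhm` equal to the cell `dhm` is assumed to fall on the "next" side. The function returns the raw hour/minute fields and leaves their sign convention and application to the caller.

```python
from typing import NamedTuple, Sequence, Tuple


class ParserCell(NamedTuple):
    dhm: int     # boundary: (day << 11) + (hour << 6) + minute
    next: int    # number of the parser next cell (primary cells only)
    hour: int    # hour offset from UTC
    minute: int  # minute offset from UTC


def decode_cell(raw: int) -> ParserCell:
    """ASSUMPTION: dhm in bits 16-31, next in 13-15, hour in 6-12, minute in 0-5."""
    return ParserCell(
        dhm=(raw >> 16) & 0xFFFF,
        next=(raw >> 13) & 0x7,
        hour=(raw >> 6) & 0x7F,
        minute=raw & 0x3F,
    )


def select_offset(dhm: int, primary: ParserCell,
                  next_cells: Sequence[ParserCell]) -> Tuple[int, int]:
    """Pick the (hour, minute) offset fields that apply to a local dhm."""
    if dhm < primary.dhm:
        # Before the time change boundary: the primary cell applies.
        return primary.hour, primary.minute
    # At or after the boundary: follow `next` to the continuation cell.
    cell = next_cells[primary.next]
    return cell.hour, cell.minute
```

For a month without a time change, the primary `dhm` is the maximum for that month, so the first branch is always taken.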
Errors¤
If bit 63 of a `datetime` is set, then the date/time value represents an error.
The expression that produced this value likely failed in some way.
The error code is stored in the lower 32 bits.
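A minimal Python sketch of that check; the names `ERROR_BIT`, `is_error` and `error_code` are hypothetical and used only for illustration.

```python
ERROR_BIT = 1 << 63  # the highest bit of the 64-bit value


def is_error(dt):
    """True if the datetime value represents an error."""
    return bool(dt & ERROR_BIT)


def error_code(dt):
    """Error code from the lower 32 bits; meaningful only if is_error(dt)."""
    return dt & 0xFFFFFFFF
```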
Mixed types¤
Since `datetime` is a 64-bit unsigned integer, it could happen - yet this is NOT recommended - that another date/time representation is used.
The following table shows how to automatically detect which format is used for a date/time representation.
Representation | 1st Jan 2000 | 1st Jan 2100 | Lower range | Upper range |
---|---|---|---|---|
UNIX timestamp | 946 681 200 | 4 102 441 200 | 0 | 10 000 000 000 |
UNIX timestamp (milli) | 946 681 200 000 | 4 102 441 200 000 | 100 000 000 000 | 10 000 000 000 000 |
UNIX timestamp (micro) | 946 681 200 000 000 | 4 102 441 200 000 000 | 100 000 000 000 000 | 10 000 000 000 000 000 |
SP-Lang datetime | 140 742 023 840 793 010 | 147 778 898 258 559 000 | 100 000 000 000 000 000 | - |
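The ranges in the table translate into a simple classifier. The sketch below is an illustration under assumptions: the names are hypothetical, lower bounds are treated as inclusive and upper bounds as exclusive, and values that fall into the gaps between ranges are reported as unknown (`None`).

```python
# (name, lower bound inclusive, upper bound exclusive or None for open-ended)
RANGES = (
    ("unix", 0, 10**10),
    ("unix_milli", 10**11, 10**13),
    ("unix_micro", 10**14, 10**16),
    ("splang_datetime", 10**17, None),
)


def detect_representation(value):
    """Guess the date/time representation of an unsigned integer value."""
    for name, low, high in RANGES:
        if value >= low and (high is None or value < high):
            return name
    return None  # value lies in a gap between the documented ranges
```

Note that a `datetime` carrying an error (bit 63 set) also falls into the last range, so the error bit should be checked separately.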