SP-Lang date/time

The datetime type is a value that represents a date and time in UTC, using a broken time structure. Broken time means that year, month, day, hour, minute, second and microsecond are stored in dedicated fields, unlike e.g. a UNIX timestamp, which is a single counter.

  • Timezone: UTC
  • Resolution: microseconds (six decimal digits)

Bit layout

The datetime is stored in a 64-bit unsigned integer (ui64), in little-endian format, which is native to Intel/AMD 64-bit CPUs.

Schema of the date/time bit layout

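| Position | Component | Bits | Mask | Type* | Range | Remark |
|---|---|---|---|---|---|---|
| 60-63 | | 4 | | | 0…15 | OK (0)/Error (8)/Reserved |
| 46-59 | year | 14 | | si16 | -8190…8191 | |
| 42-45 | month | 4 | 0x0F | ui8 | 1…12 | Indexed from 1 |
| 37-41 | day | 5 | 0x1F | ui8 | 1…31 | Indexed from 1 |
| 32-36 | hour | 5 | 0x1F | ui8 | 0…24 | |
| 26-31 | minute | 6 | 0x3F | ui8 | 0…59 | |
| 20-25 | second | 6 | 0x3F | ui8 | 0…60 | 60 is for leap second |
| 0-19 | microsecond | 20 | | ui32 | 0…1000000 | |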

Note

*) Type is the recommended minimal byte-aligned type for the respective component.
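
The bit layout can be illustrated with a minimal Python sketch that packs and unpacks the fields. The function names `pack_datetime` and `unpack_datetime` are hypothetical and not part of SP-Lang. The sketch assumes that the year occupies 14 bits starting at bit 46 and is stored in two's complement, which the layout above suggests but does not state explicitly.

```python
# (shift, width) of each component, following the bit layout table.
FIELDS = {
    "year": (46, 14),
    "month": (42, 4),
    "day": (37, 5),
    "hour": (32, 5),
    "minute": (26, 6),
    "second": (20, 6),
    "microsecond": (0, 20),
}


def pack_datetime(year, month, day, hour=0, minute=0, second=0, microsecond=0):
    """Pack broken time components into one 64-bit integer (flag bits left at 0)."""
    values = {
        "year": year, "month": month, "day": day, "hour": hour,
        "minute": minute, "second": second, "microsecond": microsecond,
    }
    dt = 0
    for name, (shift, width) in FIELDS.items():
        mask = (1 << width) - 1
        dt |= (values[name] & mask) << shift
    return dt


def unpack_datetime(dt):
    """Extract the components with SHR and AND."""
    out = {}
    for name, (shift, width) in FIELDS.items():
        out[name] = (dt >> shift) & ((1 << width) - 1)
    # ASSUMPTION: the 14-bit year is two's complement, so sign-extend it.
    if out["year"] >= 1 << 13:
        out["year"] -= 1 << 14
    return out
```

For example, `unpack_datetime(pack_datetime(2000, 1, 1))` returns the original components again, with the time fields set to 0.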

Timezone details

Timezone information originates from pytz, or more precisely from the IANA Time Zone Database.

Note

The time zone database has precision down to the minute, which means that seconds and microseconds remain untouched when converting from/to UTC.

The timezone data is represented by a filesystem directory structure, commonly located at /usr/share/splang or at the location specified by the SPLANG_SHARE_DIR environment variable. The actual timezone data is stored in the tzinfo subfolder. It is generated by the script generate_datetime_timezones.py during the installation of SP-Lang.

Example of the tzinfo folder

```
.
└── tzinfo
    ├── Europe
    │   ├── Amsterdam.sptl
    │   ├── Amsterdam.sptb
    │   ├── Andorra.sptl
    │   ├── Andorra.sptb
```

.sptl and .sptb files contain speed-optimized binary tables that support fast lookups for local time <-> UTC conversions. .sptl is for little-endian CPU architectures (x86 and x86-64), .sptb is for big-endian architectures.

The file is memory-mapped into the SP-Lang process memory space, aligned on a 64-byte boundary, so that it can be used directly as a lookup table.
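
The following Python sketch shows how such a file could be located and memory-mapped. The helper names `tz_table_path` and `map_tz_table` are hypothetical. The sketch assumes that SPLANG_SHARE_DIR, when set, takes precedence over /usr/share/splang, and that zone names map directly onto subfolders of tzinfo as in the example above.

```python
import mmap
import os
import sys


def tz_table_path(zone, share_dir=None, byteorder=sys.byteorder):
    """Build the path of the table file for a zone such as 'Europe/Amsterdam'."""
    if share_dir is None:
        # ASSUMPTION: the environment variable overrides the common location.
        share_dir = os.environ.get("SPLANG_SHARE_DIR", "/usr/share/splang")
    # .sptl for little-endian CPUs, .sptb for big-endian CPUs.
    suffix = ".sptl" if byteorder == "little" else ".sptb"
    return os.path.join(share_dir, "tzinfo", *zone.split("/")) + suffix


def map_tz_table(path):
    """Memory-map the file read-only; mappings are page-aligned,
    which also satisfies a 64-byte alignment."""
    with open(path, "rb") as f:
        return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
```

The returned mapping can be indexed and sliced like a bytes object, so table cells can be read without copying the whole file.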

Common structures

  • ym: Year & month, ym = (year << 4) + month
  • dhm: Day, hour & minute, dhm = (day << 11) + (hour << 6) + minute

Both structures are bit-wise parts of the datetime scalar value and can be extracted from datetime using AND and SHR.
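
As a minimal sketch of that extraction in Python (the function names are hypothetical): the shifts 42 and 26 are the lowest bit positions of month and minute in the bit layout, and the masks cover 18 bits (14-bit year plus 4-bit month) and 16 bits (5-bit day, 5-bit hour, 6-bit minute).

```python
def extract_ym(dt):
    """Year & month: (year << 4) + month, taken from the datetime value."""
    return (dt >> 42) & 0x3FFFF  # 18 bits: year (14) + month (4)


def extract_dhm(dt):
    """Day, hour & minute: (day << 11) + (hour << 6) + minute."""
    return (dt >> 26) & 0xFFFF  # 16 bits: day (5) + hour (5) + minute (6)
```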

Timezone file header

The header is 64 bytes long. Unspecified bytes are set to 0 and reserved for future use.

  • Position 00...03: SPt / magic identifier
  • Position 04: < for little-endian CPU architecture, > for big-endian
  • Position 05: Version (currently 1 ASCII character)
  • Position 08...09: Minimal year/month (min_ym) in this file, month MUST BE 1
  • Position 10...11: Maximal year/month (max_ym) in this file
  • Position 12...15: The position of the "parser table" in the file, in units of 64 bytes (the byte offset is this value multiplied by 64); typically 1 because the parser table is stored directly after the header
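
A header with these fields could be read as in the following Python sketch. The function name `parse_tz_header` and the returned dictionary keys are hypothetical. The sketch assumes that min_ym and max_ym are unsigned 16-bit integers and the parser table position is an unsigned 32-bit integer, stored in the byte order announced at position 04; only the first three magic bytes are checked, because the text above names just SPt.

```python
import struct

HEADER_SIZE = 64


def parse_tz_header(buf):
    """Parse the 64-byte header of a timezone table file (sketch)."""
    if len(buf) < HEADER_SIZE:
        raise ValueError("header is shorter than 64 bytes")
    if buf[0:3] != b"SPt":
        raise ValueError("magic identifier not found")
    order = chr(buf[4])  # '<' little-endian, '>' big-endian
    if order not in ("<", ">"):
        raise ValueError("unknown byte order marker")
    version = chr(buf[5])
    # ASSUMPTION: two unsigned 16-bit values and one unsigned 32-bit value.
    min_ym, max_ym, table_pos = struct.unpack_from(order + "HHI", buf, 8)
    return {
        "byteorder": order,
        "version": version,
        "min_ym": min_ym,
        "max_ym": max_ym,
        "parser_table_offset": table_pos * 64,  # stored in 64-byte units
    }
```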

Timezone parser table

The parser table is a lookup table used for conversion from the local date/time into UTC.

Organisation of the parser table

The table is organised into rows/years and columns/months.
The cell is 4 bytes (32bits) wide, the row is then 64 bytes long.

The first 12 cells are "primary parser cells" (in light blue color); the number reflects the number of the month (1...12). The remaining 4 cells are "parser next cells"; the number in nX is the index.

Primary parser cell

The position of the cell for a given date/time is calculated as pos = (ym - min_ym) << 5, which means that the year and month, minus the minimal year & month value of the table, are used to locate the cell.
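
The position formula can be tried out with a short Python sketch (hypothetical function names). Because min_ym always has month 1, ym - min_ym equals 16 per year plus the zero-based month, which matches a row of 16 cells. The unit of pos is not stated here; the sketch assumes it is a bit offset, since a shift by 5 corresponds to the 32-bit cell width and one year then spans 512 bits, i.e. the 64-byte row.

```python
def primary_cell_pos(ym, min_ym):
    """Position of the primary parser cell, as given by the formula."""
    return (ym - min_ym) << 5


def primary_cell_byte_offset(ym, min_ym):
    """ASSUMPTION: pos is a bit offset, so divide by 8 to get bytes
    relative to the start of the parser table."""
    return primary_cell_pos(ym, min_ym) >> 3
```

With a table starting at January 1970, the cell for March 1970 is 8 bytes into the table and the cell for January 1971 is 64 bytes into it, i.e. at the start of the second row.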

Structure of the cell:

  • 16 bits: range, dhm
  • 3 bits: next
  • 7 bits: hour offset from UTC
  • 6 bits: minute offset from UTC

dhm denotes the day, hour and minute in the year/month when the time change (e.g. Daylight-saving time start/end) is observed. For a typical month, where no time change is observed, the dhm value represents the maximum in the given month.

If the dhm of an input date/time is numerically lower than the dhm from the primary cell, then the hour and minute offsets from that cell are used to adjust the date/time from local to UTC.

If the dhm is greater, then next contains the number of the "parser next cell", which is present at the end of the relevant parser table row.
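
A 32-bit cell could be split into its fields as in the Python sketch below. The function name `unpack_parser_cell` is hypothetical. The order of the fields inside the cell is not specified in this section; the sketch assumes they are laid out from the most significant bit to the least significant bit in the order listed above, and it returns the hour offset as a raw 7-bit value without interpreting its sign.

```python
def unpack_parser_cell(cell):
    """Split a 32-bit parser cell into (dhm, next, hour, minute).

    ASSUMPTION: fields are ordered from the most significant bits down:
    dhm (16 bits), next (3 bits), hour offset (7 bits), minute offset (6 bits).
    """
    dhm = (cell >> 16) & 0xFFFF
    nxt = (cell >> 13) & 0x7
    hour = (cell >> 6) & 0x7F    # raw value, sign handling not covered here
    minute = cell & 0x3F
    return dhm, nxt, hour, minute
```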

Parser next cell

The "parser next cell" contains a "continuation" of the information for a month where a time change is observed. The "continuation" means the offset from UTC that applies once the local time has passed the time change boundary.

Structure of the cell:

  • 16 bits: range, dhm
  • 3 bits: not used, set to 0
  • 7 bits: hour offset from UTC
  • 6 bits: minute offset from UTC

dhm denotes the day, hour and minute in the year/month when the NEXT time change (e.g. Daylight-saving time start/end) is observed. Because only a single time change per month is currently supported, this field is set to the maximum dhm for the given month.

The hour and minute information is used to adjust date/time from local to UTC.

Note

Currently, only one time change per month is supported, which seems to be fully sufficient for all data in the IANA Time Zone Database.

Empty/unused next cells are zeroed.
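
The lookup across primary and next cells can be sketched in Python as follows. The function `select_offset` is hypothetical and works on one table row that has already been decoded into 16 tuples of (dhm, next, hour, minute). Two points are assumptions, because the text leaves them open: next is treated as a zero-based index into the four next cells at the end of the row, and an input dhm equal to the cell dhm is treated like a greater one.

```python
def select_offset(row, month, dhm):
    """Return the (hour, minute) offset from UTC for a local date/time.

    row   -- 16 decoded cells: 12 primary cells followed by 4 next cells
    month -- 1...12
    dhm   -- (day << 11) + (hour << 6) + minute of the local date/time
    """
    cell_dhm, nxt, hour, minute = row[month - 1]
    if dhm < cell_dhm:
        # Before the time change (or no change in this month at all).
        return hour, minute
    # ASSUMPTION: 'next' indexes the next cells from 0; equality lands here.
    _, _, hour, minute = row[12 + nxt]
    return hour, minute
```

A month without a time change stores the maximum dhm in its primary cell, so the first branch is always taken for valid input.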

Errors

If datetime bit 63 is set, then the date/time value represents an error. Likely the expression that produced this value failed in some way.

The error code is stored in lower 32bits.
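
A minimal Python sketch of this check (the function names are hypothetical):

```python
ERROR_BIT = 1 << 63


def is_error(dt):
    """True if bit 63 of the datetime value is set."""
    return bool(dt & ERROR_BIT)


def error_code(dt):
    """Return the error code from the lower 32 bits, or None for a valid value."""
    if not is_error(dt):
        return None
    return dt & 0xFFFFFFFF
```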

Mixed types

Since datetime is a 64-bit unsigned integer, it could happen, although this is NOT recommended, that another date/time representation is used. The following table shows how to automatically detect which format is used for a date/time representation.

| Representation | 1st Jan 2000 | 1st Jan 2100 | Lower range | Upper range |
|---|---|---|---|---|
| UNIX timestamp | 946 681 200 | 4 102 441 200 | 0 | 10 000 000 000 |
| UNIX timestamp (milli) | 946 681 200 000 | 4 102 441 200 000 | 100 000 000 000 | 10 000 000 000 000 |
| UNIX timestamp (micro) | 946 681 200 000 000 | 4 102 441 200 000 000 | 100 000 000 000 000 | 10 000 000 000 000 000 |
| SP-Lang datetime | 140 742 023 840 793 010 | 147 778 898 258 559 000 | 100 000 000 000 000 000 | - |
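
The detection by range could look like the following Python sketch. The function name `detect_representation` and the returned labels are hypothetical. The sketch assumes that the lower bound of each range is inclusive and the upper bound is exclusive; values that fall into the gaps between the ranges are reported as unknown.

```python
# (label, lower range, upper range) as listed in the table; None = no upper bound.
RANGES = [
    ("unix", 0, 10**10),
    ("unix_milli", 10**11, 10**13),
    ("unix_micro", 10**14, 10**16),
    ("splang_datetime", 10**17, None),
]


def detect_representation(value):
    """Guess the date/time representation of an integer, or return None."""
    for label, lower, upper in RANGES:
        # ASSUMPTION: lower bound inclusive, upper bound exclusive.
        if value >= lower and (upper is None or value < upper):
            return label
    return None
```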