Archive

The LogMan.io Receiver archive is an immutable, column-oriented, append-only data storage for the received raw logs.

Each commlink feeds data into a stream. A stream is an infinite table with fields. The stream name is composed of the received. prefix, the name of the tenant and the name of the commlink (e.g. received.mytenant.udp-8889).

The archive stream contains the following fields for each log entry:

  • raw: Raw log (string, digitally signed)
  • row_id: Primary identifier of the row unique across all streams (64bit unsigned integer)
  • collected_at: Date and time of the log collection at the collector
  • received_at: Date and time when the log was received by the receiver
  • source: Description of the log source (string)

The source field contains:

  • for TCP inputs: <ip address> <port> S (S is for a stream)
  • for UDP inputs: <ip address> <port> D (D is for a datagram)
  • for file inputs: a filename
  • for other inputs: optional specification of the source

Example of the source field for a log delivered over UDP:

192.168.100.1 61562 D

The log was collected from IP address 192.168.100.1 and port UDP/61562.
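The network forms of the source field can be split mechanically. The following is a minimal sketch, not part of LogMan.io; the names NetworkSource and parse_network_source are hypothetical, and any value that does not match the "<ip address> <port> S|D" layout (a filename, for instance) is returned as None.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NetworkSource:
    ip: str
    port: int
    transport: str  # "tcp" for S (stream), "udp" for D (datagram)


def parse_network_source(source: str) -> Optional[NetworkSource]:
    """Parse '<ip address> <port> S|D'; return None for any other form."""
    parts = source.split()
    if len(parts) != 3 or parts[2] not in ("S", "D") or not parts[1].isdigit():
        return None  # file inputs and other inputs are not network sources
    transport = "tcp" if parts[2] == "S" else "udp"
    return NetworkSource(ip=parts[0], port=int(parts[1]), transport=transport)


print(parse_network_source("192.168.100.1 61562 D"))
```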

Partition

Every stream is divided into partitions. Partitions of the same stream can be located on different receiver instances.

Info

Partitions can share identical periods of time. This means that data entries from the same span of time could be found in more than one partition.

Each partition has a number (part_no), starting from 0. This number increases monotonically with each new partition in the archive, across streams. The partition number is unique within the cluster.

The partition number is encoded into the partition name. The partition name is a 6-character name that starts with aaaaaa (partition #0) and continues with aaaaab (partition #1) and so on.
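The mapping between a partition number and its name can be sketched as a positional encoding. The text only confirms that aaaaaa is partition #0 and aaaaab is partition #1. The 16-letter alphabet a to p used below is an assumption, inferred from the sample partition names on this page (none of them uses a letter beyond p), and the function names are hypothetical.

```python
ALPHABET = "abcdefghijklmnop"  # ASSUMPTION: 16 symbols, i.e. 4 bits per character
NAME_LENGTH = 6


def encode_part_name(number: int) -> str:
    """Turn a partition number into a 6-character partition name."""
    if not 0 <= number < len(ALPHABET) ** NAME_LENGTH:
        raise ValueError("partition number does not fit into 6 characters")
    chars = []
    for _ in range(NAME_LENGTH):
        number, digit = divmod(number, len(ALPHABET))
        chars.append(ALPHABET[digit])
    return "".join(reversed(chars))  # most significant character first


def decode_part_name(name: str) -> int:
    """Turn a partition name back into its partition number."""
    number = 0
    for ch in name:
        number = number * len(ALPHABET) + ALPHABET.index(ch)
    return number


print(encode_part_name(0), encode_part_name(1))
```

If the real encoding uses a different alphabet, only the ALPHABET constant needs to change.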

The partition can be inspected in ZooKeeper:

/lmio/receiver/db/received.mytenant.udp-8889/aaaaaa.part

partno: 0  # The partition number, translates to aaaaaa
count: 4307  # Number of rows in this partition
size: 142138  # Size of the partition in bytes (uncompressed)

created_at:
  iso: '2023-07-01T15:22:53.265267'
  unix_ms: 1688224973265267

closed_at:
  iso: '2023-07-01T15:22:53.283168'
  unix_ms: 1688224973283167

extra:
  address: 192.168.100.1 49542  # Address of the collector
  identity: ABCDEF1234567890  # Identity of the collector
  stream: udp-8889
  tenant: mytenant

columns:
  raw:
    type: string

  collected_at:
    summary:
      max:
        iso: '2023-06-29T20:33:18.220173'
        unix_ms: 1688070798220173
      min:
        iso: '2023-06-29T18:25:03.363870'
        unix_ms: 1688063103363870
    type: timestamp

  received_at:
    summary:
      max:
        iso: '2023-06-29T20:33:18.549359'
        unix_ms: 1688070798549359
      min:
        iso: '2023-06-29T18:25:03.433202'
        unix_ms: 1688063103433202
    type: timestamp

  source:
    summary:
      token:
        count: 2
    type: token:rle
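When reading this metadata programmatically, note that the sample unix_ms values have 16 digits. They match their iso counterparts only when interpreted as microseconds since the Unix epoch in UTC, so the sketch below makes that assumption; the helper name unix_ms_to_iso is hypothetical.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def unix_ms_to_iso(value: int) -> str:
    """Convert a unix_ms value from the partition metadata to an ISO string."""
    # ASSUMPTION: the value counts microseconds since the Unix epoch (UTC).
    # Integer timedelta arithmetic avoids floating-point rounding.
    moment = EPOCH + timedelta(microseconds=value)
    return moment.replace(tzinfo=None).isoformat()


print(unix_ms_to_iso(1688224973265267))  # created_at of the sample partition
```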

Tip

Because the partition name is globally unique, it is possible to move partitions from different nodes of the cluster to a shared storage, e.g. a NAS or a cloud storage. The lifecycle is designed so that partition names do not collide; data from different receivers are therefore not overwritten but reassembled correctly on the shared storage.

Lifecycle

The partition lifecycle is defined by phases.

Ingest partitions are the partitions that receive data. Once the ingest is completed, i.e. rotated to a new partition, the former partition is closed. A closed partition cannot be reopened.

When the partition is closed, the partition lifecycle starts. Each phase is configured to point to a specific directory on the filesystem.

The lifecycle is defined on the stream level, in the /lmio/receiver/db/received... entry in ZooKeeper.

Tip

Partitions can also be moved manually into a desired phase by an API call.

Default lifecycle

The default lifecycle consists of three phases: hot, warm and cold.

graph LR
  I(Ingest) --> H[Hot];
  H --1 week--> W[Warm];
  W --3 months--> D(Delete);
  H --immediately-->C[Cold];
  C --18 months--> CD(Delete);

The ingest is done into the hot phase. Once the ingest is completed and the partition is closed, the partition is copied into the cold phase. After a week, the partition is moved to the warm phase. It means that the partition is duplicated - one copy is in the cold phase storage, the second copy is in the warm phase storage.

The partition on the warm phase storage is deleted after 3 months.

The partition on the cold phase storage is compressed using xz/LZMA. The partition is deleted from the cold phase after 18 months.

Default lifecycle definition

define:
  type: jizera/stream

ingest: # (1)
  phase: hot
  rotate_size: 30G
  rotate_time: daily

lifecycle:

  hot:
    - move: # (2)
        age: 1w
        phase: warm

    - copy: # (3)
        phase: cold

  warm:
    - delete: # (4)
        age: 3M

  cold:
    - compress:  # (5)
        type: xz
        preset: 6
        threads: 4

    - delete: # (6)
        age: 18M
  1. Ingest new logs into the hot phase.
  2. After one week, move the partition from the hot phase to the warm phase.
  3. Copy the partition into the cold phase immediately after the ingest is closed.
  4. Delete the partition from the warm phase after 3 months.
  5. Compress the partition immediately on arrival in the cold phase.
  6. Delete the partition from the cold phase after 18 months.
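To check the reasoning about where copies of a partition reside over time, the default lifecycle can be modelled in a few lines. This is only a sketch: it assumes that every age is measured from the moment the partition is closed, which the text does not state explicitly, and the name phases_holding_copy is hypothetical.

```python
from datetime import timedelta

MONTH = timedelta(days=31)        # the "M" age unit counts 31 days
HOT_TO_WARM = timedelta(weeks=1)  # hot: move to warm at age 1w
WARM_DELETE = 3 * MONTH           # warm: delete at age 3M
COLD_DELETE = 18 * MONTH          # cold: delete at age 18M


def phases_holding_copy(age: timedelta) -> set:
    """Return the phases that hold a copy of a closed partition of this age."""
    # ASSUMPTION: age is the time elapsed since the partition was closed.
    phases = set()
    if age < HOT_TO_WARM:
        phases.add("hot")
    elif age < WARM_DELETE:
        phases.add("warm")
    if age < COLD_DELETE:
        phases.add("cold")  # copied immediately after closing
    return phases


for days in (1, 30, 100, 600):
    print(days, sorted(phases_holding_copy(timedelta(days=days))))
```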

Recommended storage tiers for the phases:

  • Hot phase should be located on SSDs.
  • Warm phase should be located on HDDs.
  • Cold phase is an archive; it can be located on a NAS or slow HDDs.

Note

For more information, visit the Administration manual, chapter about Disk storage.

Lifecycle rules

  • move: Move the partition to the specified phase at the specified age.
  • copy: Copy the partition to the specified phase at the specified age.
  • delete: Delete the partition at the specified age.

The age can be e.g. "3h" (three hours), "5M" (five months), "1y" (one year) and so on.

Supported age suffixes:

  • y: year (365 days)
  • M: month (31 days)
  • w: week
  • d: day
  • h: hour
  • m: minute

Note

If age is not specified, then the age is set to 0, which means that the lifecycle action is taken immediately.
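The age notation can be turned into a duration as follows. This is a minimal sketch, assuming an age is a single whole number followed by one of the suffixes listed above; whether the receiver accepts other forms is not covered here, and the name parse_age is hypothetical.

```python
import re
from datetime import timedelta
from typing import Optional

AGE_UNITS = {
    "y": timedelta(days=365),
    "M": timedelta(days=31),
    "w": timedelta(weeks=1),
    "d": timedelta(days=1),
    "h": timedelta(hours=1),
    "m": timedelta(minutes=1),
}


def parse_age(age: Optional[str] = None) -> timedelta:
    """Convert an age such as '3h', '5M' or '1y' into a timedelta."""
    if age is None:
        return timedelta(0)  # no age given: the action is taken immediately
    match = re.fullmatch(r"(\d+)([yMwdhm])", age)
    if match is None:
        raise ValueError(f"unsupported age: {age!r}")
    return int(match.group(1)) * AGE_UNITS[match.group(2)]


print(parse_age("18M"))
```

Note that the suffixes are case-sensitive: M is a month, m is a minute.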

Compression rule

compress: Compress the data on arrival in the phase.

Currently, type: xz is supported with the following options:

preset: The xz compression preset.

The compression preset levels can be categorised roughly into three categories:

0 ... 2

Fast presets with relatively low memory usage. 1 and 2 should give compression speed and ratios comparable to bzip2 1 and bzip2 9, respectively.

3 ... 5

Good compression ratio with low to medium memory usage. These are significantly slower than levels 0-2.

6 ... 9

Excellent compression with medium to high memory usage. These are also slower than the lower preset levels.

The default is 6.

Unless you want to maximize the compression ratio, you probably don't want a higher preset level than 7 due to speed and memory usage.

threads: Maximum number of CPU threads used for compression.

The default is 1.

Set to 0 to use as many threads as there are processor cores.
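To get a feeling for the effect of the preset before changing the lifecycle, the presets can be compared on sample data with Python's standard lzma module. This is an illustrative sketch only: the sample log line is made up, the module has no equivalent of the threads option, and real logs will compress differently.

```python
import lzma


def xz_compress(data: bytes, preset: int = 6) -> bytes:
    """Compress data into the .xz container with the given preset (0-9)."""
    return lzma.compress(data, format=lzma.FORMAT_XZ, preset=preset)


# Made-up, highly repetitive sample; real log data is less regular.
sample = b"2023-06-29T18:25:03 host sshd[42]: Accepted publickey for user\n" * 2000

for preset in (0, 6, 9):
    packed = xz_compress(sample, preset)
    saved = 1 - len(packed) / len(sample)
    print(f"preset {preset}: {len(packed)} bytes, {saved:.1%} saved")
```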

Manual decompression

You can use xz --decompress or unxz from XZ Utils. On Windows, you can use 7-Zip to decompress archive files. Always work on a copy of the files: copy all files out of the archive first, and don't modify (decompress) files in the archive itself.
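The same rule can be followed from a script. The sketch below reads a compressed column file and writes the decompressed data into a separate working directory, so the file in the archive is never touched. The function name decompress_copy and the paths in the usage comment are hypothetical.

```python
import lzma
import shutil
from pathlib import Path


def decompress_copy(archived_file: Path, work_dir: Path) -> Path:
    """Decompress an .xz file from the archive into work_dir; return the new path."""
    if archived_file.suffix != ".xz":
        raise ValueError(f"not an .xz file: {archived_file}")
    work_dir.mkdir(parents=True, exist_ok=True)
    target = work_dir / archived_file.stem  # col-raw.data.xz -> col-raw.data
    # The archived file is opened read-only; only the copy is written.
    with lzma.open(archived_file, "rb") as src, open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return target


# Usage (hypothetical paths):
# decompress_copy(Path("/data/nas/receiver/.../col-raw.data.xz"), Path("/tmp/work"))
```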

Replication rule

replica: Specify the number of data copies (replicas) that should be present in the phase.

Replicas are stored on different receiver instances, so the number of replicas should NOT be greater than the number of receivers in the cluster that operate the given phase. Otherwise, the excess replica will not be created because no available receiver instance is found.

Replication in the hot phase

define:
  type: jizera/stream

lifecycle:

  hot:
    - replica:
        factor: 2

...

factor: The number of copies of the data in the phase; the default value is 1.

Rotation

Partition rotation is a mechanism that closes ingest partitions under specific conditions. When an ingest partition is closed, new data are stored in a newly created ingest partition. This ensures more or less even slicing of the infinite stream of data.

The rotation is configured on the stream level by:

  • rotate_time: the period (e.g. daily) for which the partition can stay in the ingest mode
  • rotate_size: the maximum size of the partition; the T, G, M and k suffixes are supported, using base 10.

Both options can be applied simultaneously.

The default stream rotation is daily and 30G.
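Because rotate_size uses base 10 suffixes, 30G means 30,000,000,000 bytes rather than 30 GiB. A minimal sketch of the conversion follows; it assumes a whole number with an optional single suffix and treats a value without a suffix as plain bytes. The name parse_size is hypothetical.

```python
import re

SIZE_MULTIPLIERS = {"k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12}


def parse_size(size: str) -> int:
    """Convert a rotate_size value such as '30G' into a number of bytes."""
    match = re.fullmatch(r"(\d+)([kMGT]?)", size)
    if match is None:
        raise ValueError(f"unsupported size: {size!r}")
    number, suffix = match.groups()
    # ASSUMPTION: a value without a suffix is a plain byte count.
    return int(number) * SIZE_MULTIPLIERS.get(suffix, 1)


print(parse_size("30G"))
```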

Roadmap

Only the daily option is currently available for rotate_time.

Data vending

The data can be extracted from the archive (e.g. for third-party processing, migration and so on) by copying out the data directories of the partitions in scope.

Use ZooKeeper to identify which partitions are in scope of the vending and where they are physically located on the storage.

The raw column can be processed directly by third-party tools. When the data are compressed by the lifecycle configuration, decompression may be needed first.

Note

This means that you don't need to move a partition from e.g. the cold phase into the warm or hot phase.

Replay of the data

The archived logs can be replayed to subsequent central components.

Non-repudiation

The archive is cryptographically secured and designed for traceability and non-repudiation. Digital signatures are used to verify the authenticity and integrity of the data, providing assurance that the logs have not been tampered with and were indeed generated by the stated log source.

This digital signature-based approach to maintaining logs is an essential aspect of secure logging practices and a cornerstone of a robust information security management system. These logs are vital tools for forensic analysis during an incident response, detecting anomalies or malicious activities, auditing, and regulatory compliance.

We use the following cryptographic algorithms to ensure the security of logs: SHA256, ECDSA.

The hash function, SHA256, is applied to each raw log entry. This function takes the input raw log entry and produces a fixed-size string of bytes. The output (or hash) is, for practical purposes, unique to the input data; a slight alteration in the input will produce a dramatically different output, a characteristic known as the "avalanche effect".
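The avalanche effect is easy to observe with Python's standard hashlib module. The sketch below hashes two made-up log lines that differ in a single character and counts how many of the 256 output bits differ. It illustrates the property of SHA256 only and does not reproduce the receiver's signing code.

```python
import hashlib


def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bits in which the SHA256 digests of a and b differ."""
    digest_a = int.from_bytes(hashlib.sha256(a).digest(), "big")
    digest_b = int.from_bytes(hashlib.sha256(b).digest(), "big")
    return bin(digest_a ^ digest_b).count("1")


# Made-up log lines; only the last character differs.
original = b"Jun 29 18:25:03 host sshd[42]: Accepted password for alice"
altered = b"Jun 29 18:25:03 host sshd[42]: Accepted password for alicf"

print(differing_bits(original, altered), "of 256 bits differ")
```

For unrelated inputs, roughly half of the bits are expected to differ.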

This unique hash is then signed using a private signing key through the ECDSA algorithm, which generates a digital signature that is unique to both the data and the key. This digital signature is stored alongside the raw log data, certifying that the log data originated from the specified log source and has not been tampered with during storage.

Digital signatures of raw columns are stored in the ZooKeeper (the canonical location) and in the filesystem, under the filename col-raw.sig. Each partition is also equipped with a unique SSL signing certificate, named signing-cert.der. This certificate, in conjunction with the digital signature, can be used to verify that the col-raw.data (the original raw logs) has not been altered, thus ensuring data integrity.

Important

Please note that the associated private signing key is not stored anywhere but in the process memory for security purposes. The private key is removed as soon as the partition has finished its data ingest.

The signing certificate is issued by an internal Certificate Authority (CA). The CA's certificate is available in ZooKeeper at /lmio/receiver/ca/cert.der.

Digital signature verification

You can verify the digital signature by using the following OpenSSL commands:

$ openssl x509 -inform der -in signing-cert.der -pubkey -noout > signing-publickey.pem
$ openssl dgst -sha256 -verify signing-publickey.pem -signature col-raw.sig col-raw.data
Verified OK

These commands extract the public key from the certificate (signing-cert.der), and then use that public key to verify the signature (col-raw.sig) against the data file (col-raw.data). If the data file matches the signature, you'll see a Verified OK message.

Additionally, verify signing-cert.der itself; this certificate has to be issued by the internal CA.

Practical example

This practical example shows the archive applied to a log stream from Microsoft 365. The "cold" phase is stored on a NAS mounted to /data/nas, with XZ compression enabled.

Statistics

  • Date range: 3 months
  • Rotation: daily (typically one partition is created per day)
  • Total size: 8.3M compressed, compression ratio: 92%
  • Total file count: 1062

Content of directories

tladmin@lm01:/data/nas/receiver/received.default.o365-01$ ls -l
total 0
drwxr-x--- Jul 25 20:59 aaaebd.part
drwxr-x--- Jul 25 21:02 aaaebe.part
drwxr-x--- Jul 26 21:02 aaaebg.part
drwxr-x--- Jul 27 21:03 aaaeph.part
drwxr-x--- Jul 28 21:03 aaagaf.part
drwxr-x--- Jul 29 21:04 aaagfn.part
drwxr-x--- Jul 30 21:05 aaagjm.part
drwxr-x--- Jul 31 21:05 aaagog.part
drwxr-x--- Aug  1 21:05 aaahik.part
drwxr-x--- Aug  2 21:05 aaahmb.part
drwxr-x--- Aug  3 12:49 aaaifj.part
drwxr-x--- Aug  3 17:50 aaaima.part
drwxr-x--- Aug  3 18:46 aaaiok.part
drwxr-x--- Aug  4 18:46 aaajaf.part
drwxr-x--- Aug  5 18:46 aaajbk.part
drwxr-x--- Aug  6 18:47 aaajcj.part
drwxr-x--- Aug  7 11:33 aaajde.part
drwxr-x--- Aug  7 11:34 aaajeg.part
drwxr-x--- Aug  7 12:22 aaajeh.part
drwxr-x--- Aug  7 13:51 aaajem.part
drwxr-x--- Aug  8 09:50 aaajen.part
drwxr-x--- Aug  8 09:59 aaajfk.part
drwxr-x--- Aug  8 10:06 aaajfo.part
....
drwxr-x--- Oct 25 15:44 aadcne.part
drwxr-x--- Oct 26 06:23 aadcnp.part
drwxr-x--- Oct 26 09:54 aadcof.part
drwxr-x--- Oct 27 09:54 aadcpc.part
tladmin@lm01:/data/nas/receiver/received.default.o365-01/aadcpc.part$ ls -l
total 104
-r--------  1824 Oct 27 09:54 col-collected_at.data.xz
-r-------- 66892 Oct 27 09:54 col-raw.data.xz
-r--------  2076 Oct 27 09:54 col-raw.pos.xz
-r--------    72 Oct 27 09:54 col-raw.sig
-r--------  1864 Oct 27 09:54 col-received_at.data.xz
-r--------    32 Oct 27 09:54 col-source-token.data.xz
-r--------    68 Oct 27 09:54 col-source-token.pos.xz
-r--------    68 Oct 27 09:54 col-source.data.xz
-r--------   496 Oct 27 09:54 signing-cert.der.xz
-r--------  1299 Oct 27 09:54 summary.yaml