Welcome!

SP-Lang Documentation¤

Welcome to the documentation for SP-Lang. SP-Lang stands for Stream Processing Language. SP-Lang is designed to be an intuitive, easy-to-use language, even for people who don't have experience with programming. We strive to make it as simple to use as spreadsheet macros or SQL, allowing you to perform powerful data processing tasks with minimal effort.

The key goal of SP-Lang is that it does a lot of the heavy lifting for you, so you can focus on what you want to accomplish rather than worrying about the details of how to implement it. This low-code approach means that you can get up and running quickly, without having to learn a lot of complex programming concepts.

We hope that this documentation will provide you with all the information you need to get started with our language and start taking advantage of its powerful stream processing capabilities. Thank you for choosing our language, and we look forward to seeing what you can accomplish with it!

Made with ❤️ by TeskaLabs

SP-Lang is a technology built at TeskaLabs.

Introduction¤

SP-Lang is a functional language that uses the YAML syntax.

SP-Lang delivers very high performance because it is compiled to machine code. Together with extensive optimizations, this puts its performance in the same category as C, Go, or Rust, that is, among the highest achievable.

For that reason, SP-Lang is a natural candidate for cost-effective processing of massive data streams in cloud or on-premises applications.

Hello world! in SP-Lang

!ADD
- Hello
- " "
- world
- "!"

The same example in the visual form of SP-Lang

Your first steps with SP-Lang start in the tutorial.

Features of SP-Lang¤

📜 Declarative language
🔗 Functional language
🔐 Strongly typed
💡 Type inference
🐍 Interpreted in Python
🚀 Compiled by LLVM
Syntax is based on YAML

Dedication¤

This work is dedicated to the memory of my mother, whose belief in me was as steadfast as it was unconditional. Though the technicalities of a language design were beyond her realm, her unwavering faith in me was the beacon that guided me through.

Her spirit, love, and resilience remain within me, inspiring and driving me forward. I hope that this language, a fruit of my labor and love, is a testament to her indomitable spirit.

I would also like to extend my sincere gratitude to all the contributors whose dedication and expertise helped shape this project. Your collaboration has made this creation possible.

Thank you, Mom. This is for you.
Ales Teska

SP-Lang Tutorial¤

Introduction¤

Welcome to the SP-Lang tutorial. SP-Lang, short for Stream Processing Language, is a domain-specific language (DSL). It's based on YAML, a human-readable data serialization language. This tutorial aims to introduce the basic elements of SP-Lang.

Hello World¤

Let's get started with a simple example:

---
Hello world!

In SP-Lang, the triple dashes (---) signal the start of the code.

Hello world! here is a value that you want to return. In this case, it's our friendly "Hello world!" greeting.

SP-Lang is based on YAML¤

SP-Lang is built on YAML (a recursive acronym for "YAML Ain't Markup Language"). YAML emphasizes simplicity and readability, making it a great foundation for SP-Lang.

Important

YAML relies heavily on indentation, which is significant in its syntax. As a best practice, we recommend using two spaces for indentation. Note that YAML does not allow TAB characters for indentation.

Comments¤

As you progress with writing your code, it's beneficial to leave comments. This makes it easier for others (and your future self) to understand what your code does.

---
# This is a comment.
Hello world!

Comments in SP-Lang begin with a #. SP-Lang ignores anything that follows the # on the same line, making it useful for adding notes or describing the code.

SP-Lang Expressions¤

Expressions in SP-Lang are commands that perform operations. Let's look at an arithmetic example that calculates 5+8:

---
!ADD
- 5
- 8

The above expression sums two numbers, 5 and 8, to get the result 13.

Expressions in SP-Lang start with an exclamation mark (!).

Tip

The term "Expression" is an alternative term for a function.

In this example, !ADD is the expression for arithmetic addition that sums up the provided numbers.

The numbers you want to add are provided as a list because !ADD is a Sequence expression. This means that it can sum multiple input values:

---
!ADD
- 5
- 8
- 9
- 15

This list of input values is created using a dash - at the beginning of the line containing the value. Each line represents an individual item in the list.

You can also write expressions in a more concise way using the flow form, which can be freely combined with the default style of SP-Lang code:

---
!ADD [5, 8, 9, 15]

Mapping expressions¤

Another type of expression is a mapping expression. Instead of a list of inputs, mapping expressions use input names, which can be found in the expression's documentation.

---
!ENDSWITH
what: "FooBar"
postfix: "Bar"

The !ENDSWITH expression checks whether the value of the input what ends with the value of the input postfix. It returns true if it does, and false if it doesn't.

The flow form can also be used with mapping expressions:

---
!ENDSWITH {what: "FooBar", postfix: "Bar"}

Compose expressions¤

SP-Lang lets you combine expressions to create more complex and powerful solutions. You can "plug" the output of one expression into the input of another.

---
!MUL
- 5
- !ADD [6, 2, 3]
- 9
- !SUB [10, 5]

This example is equivalent to the arithmetic operation 5 * (6 + 2 + 3) * 9 * (10 - 5).
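To make the evaluation order concrete, here is a minimal Python sketch that mirrors the composed expression above. It only models the arithmetic of !ADD, !SUB and !MUL with nested tuples; the `evaluate` function and the tuple encoding are illustrative assumptions, not how SP-Lang itself is implemented.

```python
from functools import reduce
import operator

# Hypothetical stand-ins for the SP-Lang sequence expressions used above.
OPERATIONS = {
    "ADD": operator.add,
    "SUB": operator.sub,
    "MUL": operator.mul,
}

def evaluate(node):
    """Evaluate a nested (name, [inputs]) tuple; plain numbers evaluate to themselves."""
    if not isinstance(node, tuple):
        return node
    name, inputs = node
    # Inner expressions are evaluated first, then their outputs are combined.
    values = [evaluate(item) for item in inputs]
    return reduce(OPERATIONS[name], values)

expression = ("MUL", [5, ("ADD", [6, 2, 3]), 9, ("SUB", [10, 5])])
result = evaluate(expression)
print(result)  # value of 5 * (6 + 2 + 3) * 9 * (10 - 5)
```

The output of each inner expression becomes one input of the outer !MUL, which is the "plugging" described above.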

Arguments¤

Arguments are how data is passed into SP-Lang. Depending on the context of the call, an expression can have zero, one, or more arguments. Each argument has a unique name.

You can access the value of an argument with the !ARG expression.

In the following example, the prescribed argument for the expression is name:

---
!ADD ["Hi ", !ARG name, "!"]

This would take the value of name and insert it into the string, forming a personalized greeting.

Conclusion¤

In this tutorial, we've covered the basics of SP-Lang, including how to write simple expressions, compose expressions, and use arguments. With these basics, you're ready to start exploring more complex policy definitions in SP-Lang. As you continue, remember to make ample use of the documentation to understand the various expressions and their required inputs.

Happy coding!

SP-Lang Syntax¤

Info

SP-Lang syntax uses YAML 1.2.

Comments¤

A comment is marked by a # indicator.

# This file contains no
# SP-Lang, only comments.

Numbers¤

Integer¤

canonical: 12345
positive decimal: +12345
negative decimal: -12345
octal: 0o14
hexadecimal: 0xC

Floating Point¤

fixed: 1230.15
canonical: 1.23015e+3
exponential: 12.3015e+02
negative infinity: -.inf
not a number: .nan

Strings¤

string: '012345'
string without quotes: You can specify string without any quotation as well
emoji: 😀🚀⭐

Quoted strings:

unicode: "Sosa did fine.\u263A"
control: "\b1998\t1999\t2000\n"
hex esc: "\x0d\x0a is \r\n"

single: '"Howdy!" he cried.'
quoted: ' # Not a ''comment''.'

Multiline strings:

|
   _____ _____        _                       
  / ____|  __ \      | |                      
 | (___ | |__) |_____| |     __ _ _ __   __ _ 
  \___ \|  ___/______| |    / _` | '_ \ / _` |
  ____) | |          | |___| (_| | | | | (_| |
 |_____/|_|          |______\__,_|_| |_|\__, |
                                         __/ |
                                        |___/

The literal style (indicated by |) preserves initial spaces.

>
  Mark McGwire's
  year was crippled
  by a knee injury.

The folded style (denoted by >) removes any YAML indentation and folds single line breaks into spaces.

Booleans¤

True boolean: true
False boolean: false

Expressions¤

All SP-Lang expressions (aka functions) start with !. SP-Lang expressions are therefore YAML tags (!TAG).

Expressions can be of three types:

Mapping
Sequence
Scalar

Mapping expression¤

Example:

!ENDSWITH
what: FooBar
postfix: Bar

A flow form example:

!ENDSWITH {what: FooBar, postfix: Bar}

YAML specification

See chapter 10.2. Mapping Styles

Sequence expression¤

Example:

!ADD  
- 1  
- 2  
- 3

A flow form example:

!ADD [1, 2, 3]

YAML specification

See chapter 10.1. Sequence Styles

A sequence expression can also be defined using the with argument:

!ADD
with: [1, 2, 3]

Tip

This is actually a mapping form of the sequence expression.

Scalar expressions¤

Example:

!ITEM EVENT potatoes

YAML specification

See chapter 9. Scalar Styles

Anchors and Aliases¤

SP-Lang leverages YAML anchors and aliases, which let you refer to the result of another expression by name. An anchor is a string starting with "&" that annotates an expression. The result of the annotated expression can then be reused through an alias, which is a string starting with "*" followed by the anchor's name. One anchor can be referenced by many aliases.

Example:

!ADD
- 1
- &subcount !MUL
  - 2
  - 3
- *subcount
- *subcount

This equals 1+(2*3)+(2*3)+(2*3), that is, 19.

Structure of the SP-Lang file¤

SP-Lang uses three dashes (---) to separate expressions from document content. This also serves to signal the start of SP-Lang code. Three dots (...) indicate the end of a file without starting a new one, for use in communication channels.

The file extension of SP-Lang is .yaml.

Example of the SP-Lang file:

multiplication.yaml

---
# Let's do some basic math
!MUL
- 1
- 2
- 3

Note

An SP-Lang file always starts with a --- line.

Info

One file can contain multiple expressions separated by the YAML separator (---).

Language

SP-Lang language design¤

Properties¤

Compiled or interpreted

SP-Lang is both:

Compiled by LLVM
Interpreted in Python

📜 Declarative¤

Most computer languages are imperative. This means that most of the code goes towards explaining to the computer how to execute some task. SP-Lang, on the other hand, is declarative. The maker describes “what” they want their logic to do, not exactly “how” or “when” it is to be done. Then the compiler will figure out how to do it. This allows the compiler to heavily optimize by deferring work until needed, pre-fetching and reusing cached data, etc.

🔗 Functional¤

SP-Lang favors pure functions without side effects. This results in logic that is easier to understand and gives the compiler the most freedom to optimize.

🔀 Stateless¤

There is no state to modify, and therefore there are no variables, just constants. You pass data through various expressions to build the final result.

More information

🔐 Strongly typed¤

The types of all the values are known at compile time. This allows for the early detection of errors and enables stronger optimizations.

💡 Type inference¤

Types are derived from their use without being declared. For example, setting a variable to a number results in that variable's type being established as a number. This further reduces complexity for the maker, without the performance penalty known from interpreted languages.

For advanced users who require more control over the type system, SP-Lang provides mechanisms to explicitly specify types or interact with the type system when necessary. This flexibility allows advanced users to fine-tune their code for maximum performance and reliability, while still benefiting from the convenience of type inference.

🎓Turing completeness¤

SP-Lang is designed to be Turing complete.

SP-Lang Performance¤

Introduction¤

SP-Lang is designed to deliver very high performance.

Internally, it compiles the provided expressions into machine code using LLVM IR and a large degree of optimization, made possible by the functional structure of the language. It offers extremely high single CPU core throughput, with a seamless ability to scale processing across the available CPU cores and take full advantage of modern CPU architectures.

Performance tests measure throughput in EPS (events per second), that is, the number of events processed by an SP-Lang expression in one second. EPS is measured for a single CPU core.

Performance tests are automated using a CI/CD framework and are therefore completely reproducible.
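As a rough illustration of what an EPS figure means, the following Python sketch times a single-threaded loop over a batch of events. The `measure_eps` helper and the sample workload are assumptions made for this example; it is not the test harness used to produce the numbers on this page, and pure Python will be far slower than compiled SP-Lang.

```python
import time

def measure_eps(process, events):
    """Return how many events per second `process` handles in a single thread."""
    started = time.perf_counter()
    for event in events:
        process(event)
    elapsed = time.perf_counter() - started
    # Guard against a zero reading from a coarse timer on very small batches.
    return len(events) / elapsed if elapsed > 0 else float("inf")

# Hypothetical workload: check whether a URL ends with a blocked domain.
blocklist = (".000a.biz", ".2zzz.ru")
events = ["http://example.2zzz.ru", "http://example.org"] * 50_000

eps = measure_eps(lambda url: url.endswith(blocklist), events)
print(f"{eps:.0f} EPS")
```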

Multi-string matching¤

This expression locates elements of a finite set of strings within an input text. It is suited, for example, to classifying malicious URLs (provided by a blocklist) in the output of a firewall.

!IN
  where: !ARG url
  what:
  - ".000a.biz"
  - ".001edizioni.com"
  < 64 domains in total >
  - ".2win-tech.com"
  - ".2zzz.ru"

The list is provided by the blackweb project.

Single CPU Core on HW-M1-20: 1423686 EPS
Single CPU Core on HW-I7-15: 807685 EPS

JSON parsing¤

!JSON.PARSE
what: |
  {
    < https://github.com/TeskaLabs/cysimdjson/blob/main/perftest/jsonexamples/test.json >
  }

Note

Fast JSON parsing is powered by the cysimdjson and simdjson projects.

Single CPU Core on HW-M1-20: 968502 EPS
Single CPU Core on HW-I7-15: 562862 EPS

IETF Syslog parsing¤

This is the IETF Syslog aka RFC5424 parser implemented in SP-Lang:

!PARSE.TUPLE # Header
- !PARSE.EXACTLY {what: '<'}
- !PARSE.DIGITS
- !PARSE.EXACTLY {what: '>'}
- !PARSE.DIGITS
- !PARSE.EXACTLY {what: ' '}

- !PARSE.TUPLE # Timestamp
  - !PARSE.DIGITS # Year
  - !PARSE.EXACTLY {what: '-'}
  - !PARSE.DIGITS # Month
  - !PARSE.EXACTLY {what: '-'}
  - !PARSE.DIGITS # Day
  - !PARSE.EXACTLY {what: 'T'}
  - !PARSE.DIGITS # Hours
  - !PARSE.EXACTLY {what: ':'}
  - !PARSE.DIGITS # Minutes
  - !PARSE.EXACTLY {what: ':'}
  - !PARSE.DIGITS # Seconds
  - !PARSE.EXACTLY {what: '.'}
  - !PARSE.DIGITS # Subseconds
  - !PARSE.EXACTLY {what: 'Z'}

- !PARSE.EXACTLY {what: ' '} # HOSTNAME
- !PARSE.UNTIL   {what: ' '}

- !PARSE.EXACTLY {what: ' '} # APP-NAME
- !PARSE.UNTIL   {what: ' '}

- !PARSE.EXACTLY {what: ' '} # PROCID
- !PARSE.UNTIL   {what: ' '}

- !PARSE.EXACTLY {what: ' '} # MSGID
- !PARSE.UNTIL   {what: ' '}

- !PARSE.EXACTLY {what: ' '} # STRUCTURED-DATA
- !PARSE.OPTIONAL
  what: !PARSE.TUPLE
    - !PARSE.EXACTLY {what: '['}
    - !PARSE.UNTIL   {what: ' '} # SD-ID
    - !PARSE.REPEAT  
      what:
      !PARSE.TUPLE # SD-PARAM
      - !PARSE.EXACTLY {what: ' '}
      - !PARSE.UNTIL   {what: '='} # PARAM-NAME
      - !PARSE.EXACTLY {what: '='}
      - !PARSE.BETWEEN 
         what: '"'
         escaped: '\\"]'

Single CPU Core on HW-M1-20: 304004 EPS
Single CPU Core on HW-I7-15: 181494 EPS

Reference Hardware¤

HW-M1-20¤

Machine: MacBook Air (M1, 2020)
CPU: Apple M1, launched in 2020

HW-I7-15¤

Machine: MacBook Pro (15-inch, 2016)
CPU: 2.6 GHz Quad-Core Intel Core i7, I7-6700HQ, launched in 2015

Schema¤

Schemas in SP-Lang describe the type and other properties of fields in dynamically typed containers such as JSON or Python dictionaries.

It is important to provide type information to SP-Lang because it is used as an input for type inference and hence for optimal performance.

Schema definition¤

YAML representation of the schema:

---
define:
  type: splang/schema

fields:
  field1:
    type: str
    aliases: ["FieldOne"]

  field2:
    type: ui64

Options¤

Option `type`¤

Defines the data type for the given attribute, such as str, si64 and so on. Refer to the SP-Lang type system for more information.

This option is mandatory.

Option `aliases`¤

Defines field aliases for the given attribute, which can be used in declarations as synonyms.

If field1 has a field alias named FieldOne, the following declarations are equivalent, provided the schema is properly defined:

!GET
what: field1
from: !ARG input

!GET
what: FieldOne
from: !ARG input
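The alias lookup can be pictured with a small Python sketch. The `resolve_field` and `get` helpers below are hypothetical and only illustrate the idea that an alias and the canonical field name lead to the same value; they are not part of the SP-Lang API.

```python
# Schema fields as in the YAML definition above, written as a Python dictionary.
SCHEMA_FIELDS = {
    "field1": {"type": "str", "aliases": ["FieldOne"]},
    "field2": {"type": "ui64"},
}

def resolve_field(name, fields=SCHEMA_FIELDS):
    """Map a field name or one of its aliases to the canonical field name."""
    if name in fields:
        return name
    for canonical, options in fields.items():
        if name in options.get("aliases", []):
            return canonical
    raise KeyError(name)

def get(what, source):
    """Hypothetical stand-in for !GET that honors schema aliases."""
    return source[resolve_field(what)]

event = {"field1": "hello", "field2": 42}
print(get("field1", event), get("FieldOne", event))
```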

Option `unit`¤

Defines the unit of the attribute, if needed, such as for timestamps. In this case, the unit can be auto (for automatic detection), seconds, or microseconds.

Function declaration (Python)¤

An example of an SP-Lang function declaration that uses MYSCHEMA.yaml:

splang.FunctionDeclaration(
    name="main",
    returns="bool",
    arguments={
        'myArgument': 'json<MYSCHEMA>'
    },
)

and MYSCHEMA.yaml itself:

---
define:
  type: splang/schema

fields:
  field1:
    type: str

  field2:
    type: ui64

In-place schemas¤

SP-Lang allows you to specify a schema directly in the FunctionDeclaration Python code:

splang.FunctionDeclaration(
    name="main",
    returns="bool",
    arguments={
        'myArgument': 'json<INPLACESCHEMA>'
    },
    schemas=[
        ('INPLACESCHEMA', {
            "field1": "str",
            "field2": "si32",
            "field3": "ui64",
        })
    ]
)

This is done with a tuple: the first item is the schema name, and the second is a dictionary of fields.

Memory Management¤

Memory management in SP-Lang is based on the concept of memory arenas.

Diagram: Memory arena layout

A memory arena is a larger pre-allocated chunk of memory that is available for a given lifecycle (i.e. one event processing cycle). When any code related to the event processing needs memory, it asks for a slice from the memory arena. This slice is provided swiftly because it is always taken from the beginning of the free space within the arena (the offset). Deallocation happens at once for the whole arena; it is called the "reset" of the memory arena. This makes the memory arena concept very efficient: it doesn't introduce memory fragmentation, and it couples nicely with the static single assignment concept of SP-Lang.

A memory arena also supports a list of destructors, which allows integration with traditional allocations (e.g. malloc) for third-party technologies that are not compatible with the memory arena (e.g. the PCRE2 library). Destructors are executed during the arena reset.

A memory arena can be extended by another memory chunk if the current chunk is depleted.
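The arena behavior described above can be sketched in a few lines of Python. This is only a conceptual model under stated assumptions: the `Arena` class, its method names and the fixed chunk size are invented for illustration, and the real SP-Lang arena lives in native code.

```python
class Arena:
    """A bump allocator: slices are cut from the start of the free space."""

    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size
        self.chunks = [bytearray(chunk_size)]  # pre-allocated memory
        self.offset = 0                        # start of the free space in the last chunk
        self.destructors = []

    def alloc(self, size):
        if size > self.chunk_size:
            raise MemoryError("request is larger than one chunk")
        if self.offset + size > self.chunk_size:
            # The current chunk is depleted, so extend the arena by another chunk.
            self.chunks.append(bytearray(self.chunk_size))
            self.offset = 0
        start = self.offset
        self.offset += size
        return memoryview(self.chunks[-1])[start:start + size]

    def add_destructor(self, callback):
        """Register cleanup for resources allocated outside the arena."""
        self.destructors.append(callback)

    def reset(self):
        """Release everything at once: run destructors and rewind the offset."""
        for callback in self.destructors:
            callback()
        self.destructors.clear()
        del self.chunks[1:]
        self.offset = 0


arena = Arena(chunk_size=16)
first = arena.alloc(8)
second = arena.alloc(8)
third = arena.alloc(4)      # does not fit, so a second chunk is added
arena.reset()               # one event processing cycle is over
```

Because allocation is just an offset increment and there is no per-slice free, no fragmentation can build up between resets.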

Data types

SP-Lang data types¤

In SP-Lang, the type system plays a critical role in ensuring the correctness and efficiency of expression execution. SP-Lang employs type inference, which means that the type system operates behind the scenes, delivering high performance without burdening the user with its complexities. This approach allows for a seamless and user-friendly experience, while advanced users can still access the type system for more fine-grained control and optimization.

Info

A type system is a set of rules that define how data types are classified, combined, and manipulated in a language. It helps catch potential errors early on, improving code reliability, and ensures that operations are performed only on compatible data types.

Scalar types¤

Scalar types are the basic building blocks of a language, which represent single values. They are essential for working with different kinds of data and performing various operations.

Integers¤

Integers are whole numbers, like -5, 0, or 42, that can be used for counting or simple arithmetic operations. Integers can be signed or unsigned.

Signed type	Name	Unsigned type	Name	Bits	Bytes
`si8`	Signed 8bit integer	`ui8`	Unsigned 8bit integer	8	1
`si16`	Signed 16bit integer	`ui16`	Unsigned 16bit integer	16	2
`si32`	Signed 32bit integer	`ui32`	Unsigned 32bit integer	32	4
`si64`	Signed 64bit integer	`ui64`	Unsigned 64bit integer	64	8
`si128`	Signed 128bit integer	`ui128`	Unsigned 128bit integer	128	16
`si256`	Signed 256bit integer	`ui256`	Unsigned 256bit integer	256	32

A preferred (default) integer type is si64 (signed 64bit integer), followed by ui64 (unsigned 64bit integer). This is because SP-Lang is designed primarily for 64bit CPUs.

int is the alias for si64.

Warning

256bit sizes are not fully supported yet.

Boolean¤

A Boolean (bool) is a type that has one of two possible values denoted True and False.

Floating-Point¤

Floating-point numbers are decimal numbers, such as 3.14 or -0.5, that are useful for calculations involving fractions or more precise values.

Type	Name	Bytes
`fp16`	16bit float	2
`fp32`	32bit float	4
`fp64`	64bit float	8
`fp128`	128bit float	16

Warning

fp16 and fp128 are not fully supported.

Warning

The alias float translates to fp64, which maps to the LLVM double type (and is therefore different from the LLVM float type).

Complex scalar types¤

Complex scalar types are designed for values that have some internal structure (so technically they are records or tuples) but can still fit into a scalar type (e.g. for performance or optimization purposes).

Date/Time¤

datetime

This is a value that represents a date and time in UTC, using a broken time structure. Broken time means that year, month, day, hour, minute, second and microsecond are stored in dedicated fields, unlike e.g. a UNIX timestamp.

Timezone: UTC
Resolution: microseconds (six decimal digits)
64bit unsigned integer, aka ui64

Broken time components

y / year
m / month
d / day
H / hour
M / minute
S / second
u / microsecond

More detailed description of date/time is here.

IP Address¤

This data type contains an IPv4 or IPv6 address.

ip

Underlying scalar type: ui128

RFC 4291

IPv4 addresses are mapped into the IPv6 space as prescribed in RFC 4291 "IPv4-Mapped IPv6 Address".
For example, the IPv4 address 12.23.45.67 will be mapped into the IPv6 address ::ffff:c17:2d43.
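The mapping can be checked with Python's standard ipaddress module. The helper name `map_ipv4_to_ui128` is made up for this sketch; it simply places the 32-bit IPv4 value under the ::ffff:0:0/96 prefix and returns the resulting 128-bit integer.

```python
import ipaddress

def map_ipv4_to_ui128(address):
    """Return the 128-bit integer of the IPv4-mapped IPv6 address (::ffff:a.b.c.d)."""
    ipv4 = int(ipaddress.IPv4Address(address))  # 32-bit value of the IPv4 address
    return (0xFFFF << 32) | ipv4                # 80 zero bits, 16 one bits, then IPv4

value = map_ipv4_to_ui128("12.23.45.67")
print(hex(value))
```

The octets 12, 23, 45 and 67 are 0x0c, 0x17, 0x2d and 0x43 in hexadecimal, which is where the groups c17 and 2d43 in the example come from.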

MAC Address¤

This data type contains a MAC address (EUI-48).

What is MAC Address?

A MAC address (short for medium access control address) is a unique identifier assigned to a network interface, such as a network card.

mac

Underlying scalar type: ui64, only 6 octets are used in EUI-48.
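As an illustration of how six octets fit into a 64-bit integer, here is a short Python sketch. The byte order is an assumption of this example (most significant octet first, the two upper bytes left at zero); the documentation above does not specify how SP-Lang orders the octets, and the helper names are hypothetical.

```python
def mac_to_ui64(mac):
    """Pack an EUI-48 address such as '00:1a:2b:3c:4d:5e' into an integer."""
    octets = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    if len(octets) != 6:
        raise ValueError("EUI-48 needs exactly 6 octets")
    # Assumption: big-endian packing, so the value always fits into the lower 48 bits.
    return int.from_bytes(octets, "big")

def ui64_to_mac(value):
    """Format the lower 48 bits of an integer as a colon-separated MAC address."""
    return ":".join(f"{octet:02x}" for octet in value.to_bytes(6, "big"))

packed = mac_to_ui64("00:1a:2b:3c:4d:5e")
print(hex(packed), ui64_to_mac(packed))
```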

Geographical coordinate¤

This type represents a geographical coordinate, specifically longitude and latitude.

geopoint

Underlying scalar type: ui64

More detailed description of geopoint is here.

Generic types¤

Generic types are used in the early stages of SP-Lang parsing, optimization and compilation. The complementary kind of type is a specific type. SP-Lang resolves generic types into specific types by a mechanism called type inference. If a generic type cannot be resolved into a specific type, the compilation fails and you need to provide more information for the type inference.

The name of a generic type starts with a capital T. Also, if a container type contains a generic type, the container type or structural type itself is considered generic.

Container types¤

List¤

[Ti]

Ti refers to a type of the item in the list

A list contains zero, one, or many items, all of the same type.

The type constructor is !LIST expression.

Set¤

{Ti}

Ti refers to a type of the item in the set

The type constructor is !SET expression.

Dictionary¤

{Tk:Tv}

Tk refers to a type of the key
Tv refers to a type of the value

The type constructor is !DICT expression.

Bag¤

[(Tk,Tv)]

Tk refers to a type of the key
Tv refers to a type of the value

A bag (aka multimap) is a container that allows duplicate keys, unlike a dictionary, which only allows unique keys.

Tip

The bag is essentially a list of 2-tuples (pairs).
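The difference between a dictionary and a bag can be shown with plain Python containers. This is only an analogy: the sample keys and the `bag_get_all` helper are invented for the example, and Python lists do not enforce the single key and value types that [(Tk,Tv)] implies.

```python
pairs = [("Set-Cookie", "a=1"), ("Host", "example.org"), ("Set-Cookie", "b=2")]

# A dictionary keeps one value per key, so a repeated key overwrites the earlier value.
as_dict = {}
# A bag keeps every (key, value) pair, so duplicate keys survive.
as_bag = []

for key, value in pairs:
    as_dict[key] = value
    as_bag.append((key, value))

def bag_get_all(bag, key):
    """Return all values stored under `key`, in insertion order."""
    return [value for stored_key, value in bag if stored_key == key]

print(as_dict["Set-Cookie"])
print(bag_get_all(as_bag, "Set-Cookie"))
```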

Product types¤

A product type is a compound type, formed by combining other types into a structure.

Tuple¤

Signature: (T1, T2, T3, ...)

The type constructor is !TUPLE expression.

It is equivalent to a structure type in LLVM IR.

Tip

A tuple with no members, written (), is the unit type.

Record¤

Signature: (name1: T1, name2: T2, name3: T3, ...)

The type constructor is !RECORD expression.

It is equivalent to a C struct.

Sum type¤

A Sum type is a data structure used to hold a value that could take on several different types.

Any¤

any

The any type is a special type that represents a value that can have any type.

Warning

The any type shouldn't be used as a preferred type because it has an overhead. Still, it is rather helpful for typing a dictionary that combines types (e.g. {str:any}) and in other situations where the type of the value is not known at compile time.

The value contained in any type is always located in the memory (e.g., memory pool); for this reason, this type is slower than others, which store value preferably in CPU registers.

The any is a recursive type; it can contain itself because it contains all other types in the type universe. For this reason, it is impossible to calculate the generic or even maximum size of the any variable.

Object types¤

String¤

str

Must be in UTF-8 encoding.

Note

str can be cast to [ui8] (list of ui8) in a 'toll-free' manner because the two are binary equivalent.

Bytes¤

Work in progress

Planned

Enum¤

Work in progress

Planned

Regex¤

regex

Contains compiled pattern for a regular expression.

If the regex pattern is constant, then it is compiled during the respective expression compile time. In the case of dynamic regex pattern, the regex compilation happens during the expression evaluation.

JSON¤

json<SCHEMA>

A JSON object, the result of JSON parsing. It is a schema-based type.

Function Type¤

Function¤

(arg1:T1,arg2:T2,arg3:T3)->Tr

T1, T2, T3 are the types of the function inputs arg1, arg2 and arg3 respectively.
Tr specifies the output type of the function

Pythonic types¤

Pythonic types are object types that provide an interface to Python.

Python Dictionary¤

pydict<SCHEMA>

A Python dictionary. It is a schema-based type.

Python Object¤

pyobj

A generic Python object.

Python List¤

pylist

A Python list.

Python Tuple¤

pytuple

A Python tuple.

Casting¤

Use the !CAST expression to change the type of a value.

!CAST
what: 1234
type: fp32

or an equivalent shortcut:

!!fp32 1234

Note

Cast is also a great helper for type inference: it can be used to indicate the type explicitly, if needed.

Schema-based types¤

A schema is the SP-Lang concept for bridging schema-less systems, such as JSON or Python, with the strongly typed SP-Lang. A schema is basically a directory that maps fields to their types and other properties. For more information, continue to the chapter about SP-Lang schemas.

An SP-Lang schema-based type specifies the schema by a schema name: json<SCHEMANAME>. The schema name is used to locate the schema definition, e.g. in the library.

List of schema-based types:

pydict<...>
json<...>

Built-in schemas¤

ANY: This schema declares any member to be of type any.
VOID: This schema has no member, use in-place type definition to specify types of fields.

SP-Lang date/time¤

Type datetime is a value that represents a date and time in UTC, using a broken time structure. Broken time means that year, month, day, hour, minute, second and microsecond are stored in dedicated fields, unlike e.g. a UNIX timestamp.

Timezone: UTC
Resolution: microseconds (six decimal digits)

Useful tools

Bit layout¤

The datetime is stored in 64bit unsigned integer (ui64); little-endian format, Intel/AMD 64bit native.

Position	Component	Bits	Mask	Type*	Range	Remark
60-63		4			0…15	OK (0)/Error (8)/Reserved
46-59	year	14		`si16`	-8190…8191
42-45	month	4	0x0F	`ui8`	1…12	Indexed from 1
37-41	day	5	0x1F	`ui8`	1…31	Indexed from 1
32-36	hour	5	0x1F	`ui8`	0…24
26-31	minute	6	0x3F	`ui8`	0…59
20-25	second	6	0x3F	`ui8`	0…60	60 is for leap second
0-19	microsecond	20		`ui32`	0…999999

Note

*) Type is recommended/minimal byte-aligned type for a respective component.

Timezone details¤

Timezone information originates from pytz, which in turn is based on the IANA Time Zone Database.

Note

The time zone database has precision down to the minute, which means that seconds and microseconds remain untouched when converting from/to UTC.

The timezone data is represented by a filesystem directory structure, commonly located at /usr/share/splang or at the location specified by the SPLANG_SHARE_DIR environment variable. The actual timezone data is stored in the tzinfo subfolder. The timezone data is generated by the script generate_datetime_timezones.py during the installation of SP-Lang.

Example of the tzinfo folder

```
.
└── tzinfo
    ├── Europe
    │   ├── Amsterdam.sptl
    │   ├── Amsterdam.sptb
    │   ├── Andorra.sptl
    │   ├── Andorra.sptb
```

.sptl and .sptb files contain speed-optimized binary tables that support fast lookups for local time <-> UTC conversions. .sptl is for little-endian CPU architectures (x86 and x86-64), .sptb is for big-endian architectures.

The file is memory-mapped into the SP-Lang process memory space, aligned on a 64-byte boundary, so that it can be used directly as a lookup table.

Common structures¤

ym: Year & month, ym = (year << 4) + month
dhm: Day, hour & minute, dhm = (day << 11) + (hour << 6) + minute

Both structures are bit-wise parts of the datetime scalar value and can be extracted from datetime using AND and SHR.
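The bit layout and the ym and dhm structures can be exercised with a short Python sketch. It assumes that the year occupies 14 bits starting at bit 46 (which is what the ym formula and the Bits column imply) and that negative years use two's complement; both points, as well as the function names, are assumptions of this example rather than a documented API.

```python
def datetime_pack(year, month, day, hour=0, minute=0, second=0, microsecond=0):
    """Pack broken-time components into a 64bit unsigned integer."""
    return (
        ((year & 0x3FFF) << 46)  # 14 bits; assumption: two's complement for negative years
        | (month << 42)
        | (day << 37)
        | (hour << 32)
        | (minute << 26)
        | (second << 20)
        | microsecond
    )

def datetime_unpack(value):
    """Return (year, month, day, hour, minute, second, microsecond)."""
    year = (value >> 46) & 0x3FFF
    if year >= 0x2000:           # sign-extend the 14-bit year
        year -= 0x4000
    return (
        year,
        (value >> 42) & 0x0F,
        (value >> 37) & 0x1F,
        (value >> 32) & 0x1F,
        (value >> 26) & 0x3F,
        (value >> 20) & 0x3F,
        value & 0xFFFFF,
    )

def datetime_ym(value):
    """ym = (year << 4) + month, extracted with SHR and AND."""
    return (value >> 42) & 0x3FFFF

def datetime_dhm(value):
    """dhm = (day << 11) + (hour << 6) + minute, extracted with SHR and AND."""
    return (value >> 26) & 0xFFFF

def datetime_is_error(value):
    """Bit 63 marks an error value."""
    return bool(value >> 63)

moment = datetime_pack(2023, 6, 15, 13, 45, 30, 123456)
print(datetime_unpack(moment), datetime_ym(moment), datetime_dhm(moment))
```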

Timezone file header¤

The header length is 64 bytes. Unspecified bytes are set to 0 and reserved for future use.

Position 00...03: SPt / magic identifier
Position 04: < for little-endian CPU architecture, > for big-endian
Position 05: Version (currently 1 ASCII character)
Position 08...09: Minimal year/month (min_ym) in this file, month MUST BE 1
Position 10...11: Maximal year/month (max_ym) in this file
Position 12...15: The position of the "parser table" in the file, in units of 64 bytes (multiply by 64 to get the byte offset); typically 1 because the parser table is stored directly after the header

Timezone parser table¤

The parser table is a lookup table used for conversion from the local date/time into UTC.

The table is organised into rows/years and columns/months.
The cell is 4 bytes (32bits) wide, the row is then 64 bytes long.

The first 12 cells are "primary parser cells" (in light blue color); the number reflects the number of the month (1...12). The remaining 4 cells are "parser next cells"; the number nX is the index.

Primary parser cell¤

The position of the cell for a given date/time is calculated as pos = (ym - min_ym) << 5, which means that the year and month, minus the minimal year and month value for the table, are used to locate the cell.

Structure of the cell:

16 bits: range (dhm)
3 bits: next
7 bits: hour offset from UTC
6 bits: minute offset from UTC

dhm denotes the day, hour and minute in the year/month when the time change (e.g. daylight-saving time start/end) is observed. For a typical month, in which no time change is observed, the dhm value represents the maximum in the given month.

If the dhm of an input date/time is numerically lower than the dhm from the primary cell, then the hour and minute information is used to adjust the date/time from local to UTC.

If the dhm is greater, then next contains the number of the "parser next cell", located at the end of the relevant parser table row.

Parser next cell¤

The "parser next cell" contains a "continuation" of the information for a month in which a time change is observed. The "continuation" means the offset from UTC that applies once the local time has passed the time change boundary.

Structure of the cell:

16 bits: range (dhm)
3 bits: not used, set to 0
7 bits: hour offset from UTC
6 bits: minute offset from UTC

dhm denotes the day, hour and minute in the year/month when the NEXT time change (e.g. daylight-saving time start/end) is observed. Because only a single time change per month is currently supported, this field is set to the maximum dhm for the given month.

The hour and minute information is used to adjust date/time from local to UTC.

Note

Currently, only one time change per month is supported, which seems to be fully sufficient for all info in IANA time zone database.

Empty/unused next cells are zeroed.

Errors¤

If datetime bit 63 is set, then the date/time value represents an error. Likely the expression that produced this value failed in some way.

The error code is stored in lower 32bits.

Mixed types¤

Since datetime is a 64bit unsigned integer, it could happen (although this is NOT recommended) that another date/time representation is stored in its place. The following table shows how to automatically detect which format is used for a date/time representation.

Representation	1st Jan 2000 (UTC)	1st Jan 2100 (UTC)	Lower range	Upper range
UNIX timestamp	946 684 800	4 102 444 800	0	10 000 000 000
UNIX timestamp (milli)	946 684 800 000	4 102 444 800 000	100 000 000 000	10 000 000 000 000
UNIX timestamp (micro)	946 684 800 000 000	4 102 444 800 000 000	100 000 000 000 000	10 000 000 000 000 000
SP-Lang datetime	140 742 023 840 792 576	147 778 898 258 558 976	100 000 000 000 000 000	-
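The ranges in the table translate directly into a detection routine. The following Python sketch is one possible reading of the table: the function name and the returned labels are hypothetical, lower bounds are treated as inclusive and upper bounds as exclusive, and values that fall into a gap between ranges are reported as unknown.

```python
def detect_datetime_representation(value):
    """Guess which date/time representation an unsigned integer uses."""
    if 0 <= value < 10_000_000_000:
        return "unix_seconds"
    if 100_000_000_000 <= value < 10_000_000_000_000:
        return "unix_milliseconds"
    if 100_000_000_000_000 <= value < 10_000_000_000_000_000:
        return "unix_microseconds"
    if value >= 100_000_000_000_000_000:
        return "splang_datetime"
    return None  # the value lies in a gap between the ranges

for sample in (946_684_800, 946_684_800_000, 946_684_800_000_000):
    print(sample, detect_datetime_representation(sample))
```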

SP-Lang geopoint¤

The geopoint type is a composite data type designed to efficiently store and represent geographical coordinates, specifically longitude and latitude, in a compact binary format. It combines the longitude and latitude into a single 64-bit integer, utilizing a fixed-point encoding to ensure precision and efficient storage. The geopoint type provides a balance between precision and storage efficiency, making it an ideal choice for modern 64-bit CPU architectures.

Format¤

The higher 32 bits represent the encoded longitude, and the lower 32 bits represent the encoded latitude. Both longitude and latitude are encoded as unsigned 32-bit integers (ui32).

Longitude¤

Scale factor for longitude is: (2^32 / 360) = ~11930464.711

Encoding: encoded_longitude = (longitude + 180) * (2^32 / 360)

Decoding: longitude = (encoded_longitude / (2^32 / 360)) - 180

Latitude¤

Scale factor for latitude is: (2^32 / 180) = ~23860929.422

Encoding: encoded_latitude = (latitude + 90) * (2^32 / 180)

Decoding: latitude = (encoded_latitude / (2^32 / 180)) - 90
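
The encoding and decoding formulas can be tried out with a short Python sketch. The function names are hypothetical; truncating to an integer and clamping the upper edge (longitude 180, latitude 90) to the largest ui32 value are assumptions, since the source does not say how rounding and the upper bound are handled.

```python
LON_SCALE = 2**32 / 360   # scale factor for longitude
LAT_SCALE = 2**32 / 180   # scale factor for latitude
UI32_MAX = 2**32 - 1


def encode_geopoint(longitude: float, latitude: float) -> int:
    """Combine longitude (higher 32 bits) and latitude (lower 32 bits)."""
    lon = min(int((longitude + 180) * LON_SCALE), UI32_MAX)
    lat = min(int((latitude + 90) * LAT_SCALE), UI32_MAX)
    return (lon << 32) | lat


def decode_geopoint(value: int):
    """Return (longitude, latitude) in degrees."""
    longitude = (value >> 32) / LON_SCALE - 180
    latitude = (value & UI32_MAX) / LAT_SCALE - 90
    return longitude, latitude
```

For example, `decode_geopoint(encode_geopoint(14.4378, 50.0755))` returns the original coordinates up to the fixed-point step size.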

Precision¤

The encoded longitude has a precision of approximately 4.76 meters at the equator.

The encoded latitude has a precision of approximately 1.19 meters.

Strings¤

Strings in SP-Lang use UTF-8 encoding. The string type representation is str.

String representation¤

A string is represented by a P-String, that is, by a record with the following items:

Length of the string in bytes as 64bit unsigned number.
Pointer to the start of a string data.

String is also an array of bytes

Value of str is binary compatible with [ui8], a list of ui8.

Compatibility with Null-terminated strings¤

Value of str MUST NOT end with \0 (NULL).

The additional \0 can be placed just after string data but not included in a string length. It provides direct compatibility with NULL-terminated string systems. It is however not guaranteed by str implicitly.

A NULL-terminated string can be "converted" into str by creating a new str using strlen() and the actual pointer to the string data. Alternatively, a complete copy can be created as well.

String data¤

String data is the memory space that contains the actual string value.

The string data could be:

placed just after str structure
completely independent string buffer (“string view”)

The string data may be shared with many str structures, including references to the portions of the string data (aka substrings).
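
The idea of a length plus a pointer into shared string data can be illustrated with a small Python sketch. The class and function names are hypothetical, and a `bytes` object with an offset stands in for the raw pointer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StrView:
    data: bytes   # shared string data (stand-in for the pointer target)
    offset: int   # where this string starts inside the data
    length: int   # length in bytes; a trailing \0 is never counted

    def value(self) -> str:
        return self.data[self.offset:self.offset + self.length].decode("utf-8")

    def substring(self, start: int, length: int) -> "StrView":
        # The substring references the same string data, nothing is copied.
        return StrView(self.data, self.offset + start, length)


def from_c_string(buffer: bytes) -> StrView:
    """'Convert' a NULL-terminated buffer: the length is what strlen() would return."""
    return StrView(buffer, 0, buffer.index(b"\0"))
```

Here `from_c_string(b"Hello world\0").substring(6, 5)` is a view of `world` that shares the original buffer.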

Details of container types¤

List¤

The list represents a finite number of ordered items, where the same item may occur more than once.

Set¤

The set is a composition of the Internal list and the hash table.

Dict¤

The dict (aka dictionary) is a composition of the set of keys (itself a hash table and a list, called the Key set with the Key list) and a list of values (called the Value list).

Hash table¤

The Set and Dict types use a hash table.

The hash table is designed so that it maps the 64bit hash of the key directly into an index of the item. A perfect hash strategy is applied, so no collision resolution is implemented for a constructed hash table. If the hash table construction algorithm detects a collision, the algorithm is restarted with a different seed value. This approach leverages the relatively low collision rate of xxhash64.

A hash table can be generated lazily, only when it is needed (e.g. for !IN and !GET expressions). This applies to objects created dynamically at runtime. Static sets and dictionaries provide a prepared hash table.

A hash table is searched using a binary search.

The hashing functions used are:

XXH3 64bit with seed for str
xor with seed for si64, si32, si16, si8, ui64, ui32, ui16, ui8
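
The construction and lookup strategy described above can be sketched in Python. This is an illustration, not the SP-Lang implementation: the function names are hypothetical, salted `hashlib.blake2b` is used as a stand-in for seeded XXH3 (which is not in the Python standard library), and the keys are assumed to be unique strings.

```python
import bisect
import hashlib


def hash64(key: str, seed: int) -> int:
    """Seeded 64-bit hash; blake2b is only a stand-in for XXH3 64bit."""
    digest = hashlib.blake2b(
        key.encode("utf-8"), digest_size=8, salt=seed.to_bytes(8, "little")
    ).digest()
    return int.from_bytes(digest, "little")


def build_table(keys):
    """Return (seed, sorted list of (hash, index)); restart on any collision."""
    assert len(set(keys)) == len(keys), "keys must be unique"
    seed = 0
    while True:
        hashes = [hash64(key, seed) for key in keys]
        if len(set(hashes)) == len(hashes):
            return seed, sorted((h, i) for i, h in enumerate(hashes))
        seed += 1  # collision detected: try again with a different seed


def find(keys, seed, table, key):
    """Binary search for the key's hash; return the item index or None."""
    h = hash64(key, seed)
    pos = bisect.bisect_left(table, (h, -1))
    if pos < len(table) and table[pos][0] == h and keys[table[pos][1]] == key:
        return table[pos][1]
    return None
```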

Expressions

SP-Lang Expressions¤

Expressions in SP-Lang are written as YAML tags.

List of expressions¤

Expression	Type	Category	Description
`!COUNT`	sequence	aggregate	Counts the number of items.
`!MIN`	sequence	aggregate	Calculates the minimum from a list of items.
`!MAX`	sequence	aggregate	Calculates the maximum from a list of items.
`!AVG`	sequence	aggregate	Calculates the average (arithmetic mean) of items in a list.
`!MEDIAN`	sequence	aggregate	Finds the median (middle value) of a list of items.
`!MODE`	sequence	aggregate	Finds the value that appears most often.
`!RANGE`	sequence	aggregate	Finds the difference between the highest and smallest value.
`!ADD`	sequence	arithmetic	Addition.
`!SUB`	sequence	arithmetic	Subtraction.
`!MUL`	sequence	arithmetic	Multiplication.
`!DIV`	sequence	arithmetic	Division.
`!MOD`	sequence	arithmetic	Modulo.
`!POW`	sequence	arithmetic	Exponentiation.
`!ABS`	mapping	arithmetic	Absolute value.
`!SHL`	mapping	bitwise	Left logical shift.
`!SHR`	mapping	bitwise	Right logical shift.
`!SAL`	mapping	bitwise	Left arithmetic shift.
`!ROL`	mapping	bitwise	Circular rotation to the left.
`!ROR`	mapping	bitwise	Circular rotation to the right.
`!EQ`	sequence	comparisons	Equal to.
`!NE`	sequence	comparisons	Not equal to.
`!LT`	sequence	comparisons	Less than.
`!LE`	sequence	comparisons	Less than or equal to.
`!GT`	sequence	comparisons	Greater than.
`!GE`	sequence	comparisons	Greater than or equal to.
`!IN`	mapping	comparisons	Membership test.
`!IF`	mapping	control	Simple conditional branching.
`!WHEN`	sequence	control	Powerful branching.
`!MATCH`	mapping	control	Pattern matching.
`!TRY`	sequence	control	Execute till first non-error expression.
`!MAP`	mapping	control	Apply the expression on each element in a sequence.
`!REDUCE`	mapping	control	Reduce the elements of a list into a single value.
`!INCLUDE`	scalar	directives	Inserts the content of another file.
`!ARGUMENT`	scalar	function	Gets a function argument.
`!ARG`	scalar	function	Gets a function argument.
`!FUNCTION`	mapping	function	Defines a new function.
`!FN`	mapping	function	Defines a new function.
`!SELF`	mapping	function	Applies the current function, used for recursion.
`!IP.FORMAT`	mapping	ip	Converts an IP address into a string.
`!IP.INSUBNET`	mapping	ip	Check if IP address falls into a subnet.
`!GET`	mapping	json	Gets a single value from JSON.
`!JSON.PARSE`	mapping	json	Parses JSON.
`!LIST`	mapping	list	Creates a list of items.
`!GET`	mapping	list	Gets a single item from a list.
`!AND`	sequence	logic	Conjunction.
`!OR`	sequence	logic	Disjunction.
`!NOT`	sequence	logic	Negation.
`!LOOKUP`	mapping	lookup	Creates a new lookup.
`!GET`	mapping	lookup	Gets items from a lookup.
`!IN`	mapping	lookup	Checks if an item is in a lookup.
`!RECORD`	mapping	record	A collection of named items.
`!GET`	mapping	record	Gets the item from a record.
`!REGEX`		regex	Regular expression search.
`!REGEX.REPLACE`	mapping	regex	Regular expression replace.
`!REGEX.SPLIT`	mapping	regex	Split a string by a regular expression.
`!REGEX.FINDALL`	mapping	regex	Find all occurrences by a regular expression.
`!REGEX.PARSE`	mapping	regex	Parse by a regular expression.
`!SET`	mapping	set	Set of items.
`!IN`	mapping	set	Membership test.
`!IN`	mapping	string	Tests if a string contains a substring.
`!STARTSWITH`	mapping	string	Tests whether a string starts with a selected prefix.
`!ENDSWITH`	mapping	string	Tests whether a string ends with a selected suffix.
`!SUBSTRING`	mapping	string	Extracts part of a string.
`!LOWER`	mapping	string	Transforms a string into lowercase.
`!UPPER`	mapping	string	Transforms a string into uppercase.
`!CUT`	mapping	string	Cuts the string and returns a selected part.
`!SPLIT`	mapping	string	Splits a string into a list.
`!RSPLIT`	mapping	string	Splits a string from right into a list.
`!JOIN`	mapping	string	Joins a list of strings.
`!TUPLE`	mapping	tuple	A collection of items.
`!GET`	mapping	tuple	Get item from a tuple.
`!CAST`	mapping	utility	Converts type of the argument into another.
`!HASH`	mapping	utility	Calculates a digest.
`!DEBUG`	mapping	utility	Debugs the expression.

Aggregate expressions¤

Overview¤

An aggregate expression is a type of function that performs calculations on a set of values and returns a single value as a result. These expressions are commonly used to summarize or condense data.

!COUNT: Counts the number of items.
!MAX, !MIN: Calculates the maximum / minimum.
!AVG: Calculates the average (arithmetic mean).
!MEDIAN: Finds the middle value.
!MODE: Finds the value that appears most often.
!RANGE: Finds the difference between the highest and smallest value.

`!COUNT`¤

Counts the number of items in a list.

Type: Sequence

Example

!COUNT
- Frodo Baggins
- Sam Gamgee
- Gandalf
- Legolas
- Gimli
- Aragorn
- Boromir of Gondor
- Merry Brandybuck
- Pippin Took

Returns 9.

`!MAX`¤

Returns a maximum value from the sequence.

Type: Sequence

Example

!MAX
- 1.5
- 2.6
- 5.1
- 3.05
- 4.45

The result of this expression is 5.1.

`!MIN`¤

Returns a minimum value from the sequence.

Type: Sequence

Example

!MIN
- 2.6
- 3.05
- 4.45
- 0.5
- 5.1

The result of this expression is 0.5.

`!AVG`¤

Calculate the average / arithmetic mean.

Type: Sequence

Info

Read more about Arithmetic mean on Wikipedia.

Example

!AVG
- 6
- 2
- 4

Calculation of the average (6+2+4)/3, the result is 4.

`!MEDIAN`¤

The median is the middle value in a list of numbers; half of the values are greater than the median, and half are less than the median. If the list has an even number of elements, the median is the average of the two middle values.

Type: Sequence
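
The definition of the median, including the even-length case, can be checked with Python's standard `statistics` module. This illustrates the arithmetic only and is not SP-Lang code.

```python
import statistics

# Odd number of elements: the middle value of the sorted list.
odd_median = statistics.median([1, 3, 2])

# Even number of elements: the average of the two middle values.
even_median = statistics.median([1, 2, 3, 4])
```

Here `odd_median` is 2 and `even_median` is 2.5.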

`!MODE`¤

The mode is the value or values that occur most frequently in a list. It can be used to represent the central tendency of a data set.

Type: Sequence
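
Because a list can have more than one most frequent value, the "value or values" wording can be illustrated with Python's `statistics.multimode`. How SP-Lang reports several modes is not stated in the source; the sketch only shows the definition.

```python
import statistics

# A single most frequent value.
single = statistics.multimode([4, 1, 1, 2])

# Two values tie for the highest frequency, so both are modes.
tied = statistics.multimode([1, 2, 2, 3, 3])
```

Here `single` is `[1]` and `tied` is `[2, 3]`.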

`!RANGE`¤

Calculates the difference between the largest and smallest values.

Type: Sequence
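
The calculation is simply the maximum minus the minimum, as this short Python illustration shows (the helper name is hypothetical):

```python
def value_range(items):
    """Difference between the largest and the smallest value."""
    return max(items) - min(items)


result = value_range([3, 9, 4, 1])  # 9 - 1
```

Here `result` is 8.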

Arithmetic expressions¤

Overview¤

Arithmetic expressions are used for basic arithmetic operations with data.

!ADD: Addition.
!SUB: Subtraction.
!MUL: Multiplication.
!DIV: Division.
!MOD: Modulo (remainder after division).
!POW: Exponentiation.
!ABS: Absolute value.

`!ADD`¤

Type: Sequence

You can add the following types:

Numbers (Integers and floats)
Strings
Lists
Sets
Tuples
Records

Example

!ADD
- 4
- -5
- 6

Calculates 4+(-5)+6, the result is 5.

`!SUB`¤

Type: Sequence

Example

!SUB
- 3
- 1
- -5

Calculates 3-1-(-5), the result is 7.

`!MUL`¤

Type: Sequence

Example

!MUL
- 7
- 11
- 13

Calculates 7*11*13, the result is 1001 (which happens to be the Scheherazade constant).

`!DIV`¤

Type: Sequence

Example

!DIV
- 21
- 1.5

Calculates 21/1.5, the result is 14.0.

Division by zero¤

Division by zero produces an error, which can cascade through the expression.

!TRY expression can be used to handle this situation. The first item in !TRY is a !DIV that can produce division by zero error. The second item is a value that will be returned when such an error occurs.

!TRY
- !DIV
  - !ARG input
  - 0.0
- 5.0

`!MOD`¤

Type: Sequence

Calculate the signed remainder of a division (aka modulo operation).
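
For negative operands, there are two common conventions for a signed remainder, and the source does not say which one !MOD follows. The Python sketch below only shows the difference between them, so that results for negative inputs can be checked against expectations.

```python
import math

# Floored division: the result takes the sign of the divisor (Python's % operator).
floored = -7 % 3

# Truncated division: the result takes the sign of the dividend (C-style fmod).
truncated = math.fmod(-7, 3)
```

Here `floored` is 2 and `truncated` is -1.0; for non-negative operands both conventions agree.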

`!POW`¤

Type: Sequence

Calculate the exponent.

Example

!POW
- 2
- 8

Calculates 2^8, the result is 256.

`!ABS`¤

Type: Mapping

!ABS
what: <x>

Calculate the absolute value of input x, which is the non-negative value of x without regard to its sign.

Example

!ABS
what: -8.5

The result is 8.5.

Bitwise expressions¤

Overview¤

The bit shifts treat a value as a series of bits, the binary digits of the value are moved, or shifted, to the left or right.

!SHL, !SHR: Left and right logical shifts.
!SAL, !SAR: Left and right arithmetic shifts.
!ROL, !ROR: Circular rotations to the left and right.

There are also bitwise !AND, !OR and !NOT expressions, described in the Logic chapter.

`!SHL`¤

Left logical shift.

Type: Mapping.

!SHL
what: <...>
by: <...>

Tip

Left shifts could be used as fast multiplication by 2, 4, 8 and so on.

Example

!SHL
what: 9
by: 2

9 is represented by the binary value 1001. The left logical shift moves the bits to the left by 2. The result is 100100, which is 36 in the base-ten system. This is the same result as 9 * (2^2).

`!SHR`¤

Right logical shift.

Type: Mapping.

!SHR
what: <...>
by: <...>

Tip

Right shifts could be used as fast division by 2, 4, 8 and so on.

Example

!SHR
what: 16
by: 3

16 is represented by 10000. The logical shift moves the bits to the right by 3. The result is 10, which is 2 in base-ten system. This is the same result as 16 / (2^3).

`!SAL`¤

Left arithmetic shift.

Type: Mapping.

!SAL
what: <...>
by: <...>

Example

!SAL
what: 60
by: 2

`!SAR`¤

Right arithmetic shift.

Type: Mapping.

!SAR
what: <...>
by: <...>
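
The difference between logical and arithmetic shifts only shows for values with the sign bit set. The Python sketch below models it on a fixed 64-bit width; the width and the function names are assumptions for the illustration, since the source does not state the operand width.

```python
BITS = 64
MASK = (1 << BITS) - 1


def sal(value: int, by: int) -> int:
    """Left arithmetic shift; bits shifted past the width are dropped."""
    return (value << by) & MASK


def shr(value: int, by: int) -> int:
    """Right logical shift; zeros are shifted in from the left."""
    return (value & MASK) >> by


def sar(value: int, by: int) -> int:
    """Right arithmetic shift; the sign bit is copied in from the left."""
    value &= MASK
    if value >> (BITS - 1):   # sign bit set: reinterpret as a negative number
        value -= 1 << BITS
    return (value >> by) & MASK
```

For example, `sal(60, 2)` gives 240, while `sar` applied to the two's complement form of -16 gives the two's complement form of -4.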

`!ROL`¤

Circular left rotation.

Type: Mapping.

!ROL
what: <...>
by: <...>

`!ROR`¤

Circular right rotation.

Type: Mapping.

!ROR
what: <...>
by: <...>
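
In a circular rotation, the bits pushed out at one end re-enter at the other end. A minimal Python sketch follows; the function names and the default 64-bit width are assumptions, as the source does not state the operand width.

```python
def rol(value: int, by: int, bits: int = 64) -> int:
    """Rotate left: bits leaving on the left re-enter on the right."""
    mask = (1 << bits) - 1
    by %= bits
    value &= mask
    return ((value << by) | (value >> (bits - by))) & mask


def ror(value: int, by: int, bits: int = 64) -> int:
    """Rotate right, expressed as a left rotation by the complementary amount."""
    return rol(value, bits - (by % bits), bits)
```

On a 4-bit width, `rol(0b1001, 2, bits=4)` yields `0b0110`, and rotating right by the same amount restores the original value.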

Comparisons expressions¤

Overview¤

A test expression evaluates its inputs and returns the boolean value true or false based on the result of the test.

!EQ: Equal
!NE: Not equal
!LT: Less than
!LE: Less than or equal to
!GT: Greater than
!GE: Greater than or equal to
!IN: Membership test

`!EQ`¤

Equal to.

Type: Sequence.

Example

!EQ
- !ARG count
- 3

This compares count argument with 3, returns count == 3

`!NE`¤

Not equal to.

Type: Sequence.

This is negative counterpart to !EQ.

Example

!NE
- !ARG name
- Frodo

This compares name argument with Frodo, returns name != Frodo.

`!LT`¤

Less than.

Type: Sequence.

Example

!LT
- !ARG count
- 5

Example of a count < 5 test.

`!LE`¤

Less than or equal to.

Type: Sequence.

Example

!LE
- 2
- !ARG count
- 5

Example of a range 2 <= count <= 5 test.
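
A comparison with more than two items can be read as a chain in which every neighbouring pair must satisfy the test. A Python sketch of that reading follows (the function name is hypothetical):

```python
def le_chain(*items) -> bool:
    """True if each item is less than or equal to the next one."""
    return all(a <= b for a, b in zip(items, items[1:]))
```

With this reading, `le_chain(2, count, 5)` is the same test as `2 <= count <= 5`.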

`!GT`¤

Greater than.

Type: Sequence.

Example

!GT [!ARG count, 5]

Example of a count > 5 test using a compacted YAML form.

`!GE`¤

Greater than or equal to.

Type: Sequence.

Example

!GE
- !ARG count
- 5

Example of a count >= 5 test.

`!IN`¤

Membership test.

Type: Mapping.

!IN
what: <...>
where: <...>

The !IN expression checks whether the value what exists in the value where. The value where is a string, a container (list, set, dictionary), a structural type, etc. It evaluates to true if the value what is found in where, and to false otherwise.

Example

!IN
what: 5
where:
  - 1
  - 2
  - 3
  - 4
  - 5

Check for a presence of the value 5 in the list where. Returns "true".

Example

!IN
what: "Willy"
where: "John Willy Boo"

Check for a presence of the substring "Willy" in the John Willy Boo value. Returns true.

Control expressions¤

Overview¤

SP-Lang provides a variety of control flow statements.

!IF: Simple conditional branching.
!WHEN: Powerful branching.
!MATCH: Pattern matching.
!TRY: Execute till first non-error expression.
!MAP: Apply the expression on each element in a sequence.
!REDUCE: Reduce the elements of a list into a single value.

`!IF`¤

Simple conditional branching.

Type: Mapping.

The !IF expression is a decision-making expression that guides the evaluation based on a specified test.

!IF
test: <expression>
then: <expression>
else: <expression>

Based on the value of test, the branch is evaluated:

then in case of test !EQ true
else in case of test !EQ false

Both then and else have to return the same type, which will be also the type of the !IF return value.

Example

!IF
test:
  !EQ
  - !ARG input
  - 2
then:
  It is two.
else:
  It is NOT two.

`!WHEN`¤

Powerful branching.

Type: Sequence.

!WHEN expression is considerably more powerful than !IF expression. Cases can match many different patterns, including interval matches, tuples, and so on.

!WHEN
- test: <expression>
  then: <expression>

- test: <expression>
  then: <expression>

- test: <expression>
  then: <expression>

- ...

- else: <expression>

If else is not provided, then !WHEN returns false.

Example

Example of !WHEN use for exact match, range match and set match:

!WHEN

# Exact value match
- test:
    !EQ
    - !ARG key
    - 34
  then:
    "thirty four"

# Range match
- test:
    !LT
    - 40
    - !ARG key
    - 50
  then:
    "forty to fifty (exclusive)"

# In-set match
- test:
    !IN
    what: !ARG key
    where:
      - 75
      - 77
      - 79
  then:
    "seventy five, seven, nine"

- else:
    "unknown"

`!MATCH`¤

Pattern matching.

Type: Mapping.

!MATCH
what: <what-expression>
with:
  <value>: <expression>
  <value>: <expression>
  ...
else:
  <expression>

!MATCH expression evaluates the what-expression, matching the expression's value to a case clause, and executes expression associated with that case.

The else branch of the !MATCH is optional. The expression fails with error when no matching <value> is found and else branch is missing.

Example

!MATCH
what: !ARG value
with:
    1: "one"
    2: "two"
    3: "three"
else:
    "other number"

Use of !MATCH to structure the code

!MATCH
what: !ARG code
with:
    1: !INCLUDE code-1.yaml
    2: !INCLUDE code-2.yaml
else:
    !INCLUDE code-else.yaml

`!TRY`¤

Execute till first non-error expression.

Type: Sequence

!TRY
- <expression>
- <expression>
- <expression>
...

Iterates through the expressions (top down). If an expression returns a non-null (not None) result, the iteration stops and that value is returned. Otherwise, the iteration continues to the next expression.

Returns None (error) when end of the list is reached.
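
The evaluation order can be modelled in Python with a list of callables. This is a sketch of the described semantics, not of the SP-Lang implementation: the function names are hypothetical, and a Python arithmetic exception is treated as the None (error) result.

```python
def try_first(*expressions):
    """Return the first non-None result; None if every expression fails."""
    for expression in expressions:
        try:
            result = expression()
        except ArithmeticError:   # e.g. division by zero counts as an error
            result = None
        if result is not None:
            return result
    return None


def safe_div(value: float, divisor: float) -> float:
    # First item: the division; second item: the fallback value.
    return try_first(lambda: value / divisor, lambda: 5.0)
```

Here `safe_div(10.0, 0.0)` falls back to 5.0, while `safe_div(10.0, 4.0)` returns 2.5 from the first expression.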

Note: The obsoleted name of this expression was !FIRST. It was obsoleted in November 2022.

`!MAP`¤

Apply the expression on each element in a sequence.

Type: Mapping.

!MAP
what: <sequence>
apply: <expression>

The apply expression is applied on each element in the what sequence with the argument x containing the respective item value. The result is a new list with transformed elements.

Example

!MAP
what: [1, 2, 3, 4, 5, 6, 7]
apply:
    !ADD [!ARG x, 10]

The result is [11, 12, 13, 14, 15, 16, 17].

`!REDUCE`¤

Reduce the elements of a list into a single value.

Type: Mapping.

!REDUCE
what: <expression>
apply: <expression>
initval: <expression>
fold: <left|right>

The apply expression is applied on each element in the what sequence with the argument a containing an aggregation of the reduce operation and argument b containing the respective item value.

The initval expression provides the initial value for the a argument.

An optional fold value specifies "left folding" (left, default) or "right folding" (right).

Example

!REDUCE
what: [1, 2, 3, 4, 5, 6, 7]
initval: -10
apply:
  !ADD [!ARG a, !ARG b]

Calculates a sum of the sequence with an initial value -10.
Result is 18 = -10 + 1 + 2 + 3 + 4 + 5 + 6 + 7.
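
The same reduction can be expressed with Python's `functools.reduce`. The wrapper name is hypothetical, and the treatment of right folding (items are visited from the end, with a still holding the aggregation) is an assumption, since the source does not spell out the argument order for right folds.

```python
from functools import reduce


def sp_reduce(what, apply, initval, fold="left"):
    """apply(a, b): a is the running aggregation, b is the current item."""
    items = what if fold == "left" else list(reversed(what))
    return reduce(apply, items, initval)


total = sp_reduce([1, 2, 3, 4, 5, 6, 7], lambda a, b: a + b, -10)
```

Here `total` is 18. The folding direction matters only for non-commutative operations: joining `["a", "b", "c"]` gives `abc` with a left fold and `cba` with this right fold.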

Date/time expressions¤

Overview¤

Date and time is expressed in SP-Lang by a datetime type. It has a microsecond resolution and a range from year 8190 B.C. to a year 8191. It is in the UTC timezone.

Info

For more information about datetime type, continue here.

!NOW: A current date and time.
!DATETIME: Construct a new datetime.
!DATETIME.FORMAT: Format datetime.
!DATETIME.PARSE: Parse a date and time from a string.
!GET: Get a datetime component.

`!NOW`¤

Type: Mapping.

Get a current date and time.

!NOW

`!DATETIME`¤

Type: Mapping.

Constructs the datetime from components such as year, month, day and so on.

!DATETIME
year: <year>
month: <month>
day: <day>
hour: <hour>
minute: <minute>
second: <second>
microsecond: <microsecond>
timezone: <timezone>

year is an integer number in range -8190 … 8191.
month is an integer number in range 1 … 12.
day is an integer number in range 1 … 31, respective to a number of days in a given month.
hour is an integer number in 0 … 24, it is optional and default value is 0.
minute is an integer number in 0 … 59, it is optional and default value is 0.
second is an integer number in 0 … 60, it is optional and default value is 0.
microsecond is an integer number in 0 … 1000000, it is optional and default value is 0.
timezone is IANA Time Zone Database name of the timezone. It is optional and a default timezone is UTC.

Example: UTC date/time

!DATETIME
year: 2021
month: 10
day: 13
hour: 12
minute: 34
second: 56
microsecond: 987654

Example: default values

!DATETIME
year: 2021
month: 10
day: 13

Example: timezones

!DATETIME
year: 2021
month: 10
day: 13
timezone: Europe/Prague

!DATETIME
year: 2021
month: 10
day: 13
timezone: "+05:00"
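
Since datetime is kept in UTC, a timezone given at construction only describes how the components are to be interpreted. The Python sketch below shows the equivalent conversion for the "+05:00" example using the standard `datetime` module; it illustrates the arithmetic, not the SP-Lang implementation.

```python
from datetime import datetime, timedelta, timezone

# Midnight of 2021-10-13 in a zone that is five hours ahead of UTC.
local = datetime(2021, 10, 13, tzinfo=timezone(timedelta(hours=5)))

# The same instant expressed in UTC.
utc = local.astimezone(timezone.utc)
```

Here `utc` is 2021-10-12 19:00:00 UTC, that is, five hours earlier on the previous day.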

`!DATETIME.FORMAT`¤

Type: Mapping.

Format a date and time information based on the datetime.

!DATETIME.FORMAT
with: <datetime>
format: <format>
timezone: <string>

The datetime contains the information about the date and time to be used for formatting. The format is a string that specifies the format of the output. The timezone is optional; if provided, the time is printed in the local time of that timezone, otherwise the UTC timezone is used.

Format¤

Directive	Component
`%H`	Hour (24-hour clock) as a zero-padded decimal number.
`%M`	Minute as a zero-padded decimal number.
`%S`	Second as a zero-padded decimal number.
`%f`	Microsecond as a decimal number, zero-padded to 6 digits.
`%I`	Hour (12-hour clock) as a zero-padded decimal number.
`%p`	Locale’s equivalent of either AM or PM.
`%d`	Day of the month as a zero-padded decimal number.
`%m`	Month as a zero-padded decimal number.
`%y`	Year without century as a zero-padded decimal number.
`%Y`	Year with century as a decimal number
`%z`	UTC offset
`%a`	Weekday as abbreviated name.
`%A`	Weekday as full name.
`%w`	Weekday as a decimal number, where 0 is Sunday and 6 is Saturday.
`%b`	Month as abbreviated name.
`%B`	Month as full name.
`%j`	Day of the year as a zero-padded decimal number.
`%U`	Week number of the year (Sunday as the first day of the week) as a zero-padded decimal number. All days in a new year preceding the first Sunday are considered to be in week 0.
`%W`	Week number of the year (Monday as the first day of the week) as a zero-padded decimal number. All days in a new year preceding the first Monday are considered to be in week 0.
`%c`	Date and time representation.
`%x`	Date representation.
`%X`	Time representation.
`%%`	A literal '%' character.

Example

!DATETIME.FORMAT
with: !NOW
format: "%Y-%m-%d %H:%M:%S"
timezone: "Europe/Prague"

Prints the current local time as e.g. 2022-12-31 12:34:56 using the timezone "Europe/Prague".
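
The directives in the table follow the familiar strftime conventions, so Python's `datetime.strftime` can be used to preview what a format string produces. This assumes that the SP-Lang directives behave the same way as their Python counterparts, which the source does not explicitly guarantee.

```python
from datetime import datetime

moment = datetime(2022, 12, 31, 12, 34, 56, 987654)

text = moment.strftime("%Y-%m-%d %H:%M:%S")   # date and 24-hour time
day_of_year = moment.strftime("%j")           # zero-padded day of the year
micro = moment.strftime("%f")                 # microseconds, 6 digits
```

Here `text` is `2022-12-31 12:34:56`, `day_of_year` is `365` and `micro` is `987654`.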

`!DATETIME.PARSE`¤

Type: Mapping.

Parse a date and time from a string.

!DATETIME.PARSE
what: <string>
format: <format>
timezone: <timezone>

Parse what string input using format string. The timezone information is optional, if provided, then it specifies local timezone of the what string.

See Format chapter above for more information about format.

Example

!DATETIME.PARSE
what: "2021-06-29T16:51:43-08"
format: "%Y-%m-%dT%H:%M:%S%z"
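
For comparison, the same kind of input can be parsed with Python's `datetime.strptime`. Two details of the sketch are specific to Python and are assumptions with respect to SP-Lang: a four-digit year is matched by `%Y`, and Python's `%z` needs the offset with minutes, so the input is written with `-0800`.

```python
from datetime import datetime, timezone

parsed = datetime.strptime("2021-06-29T16:51:43-0800", "%Y-%m-%dT%H:%M:%S%z")

# The parsed value carries the offset; converting gives the UTC instant.
as_utc = parsed.astimezone(timezone.utc)
```

Here `as_utc` is 2021-06-30 00:51:43 UTC.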

`!GET`¤

Type: Mapping.

Extract the date/time component such as hour, minute, day etc. from datetime.

!GET
what: <string>
from: <datetime>
timezone: <timezone>

Extract the what component from the datetime. The timezone is optional; if it is not provided, the UTC timezone is used.

Components¤

Directive	Component
`year`, `y`	Year
`month`, `m`	Month
`day`, `d`	Day
`hour`, `H`	Hour
`minute`, `M`	Minute
`second`, `S`	Second
`microsecond`, `f`	Microsecond
`weekday`, `w`	Day of the week

Example

Get hours component of the current timestamp, using the "Europe/Prague" timezone.

!GET
what: H
from: !NOW
timezone: "Europe/Prague"

Example: Get a current year

!GET { what: year, from: !NOW }

Dictionary expressions¤

Overview¤

The dict (aka dictionary) stores a collection of (key, value) pairs, such that each possible key appears at most once in the collection. All keys in the dictionary must be of the same type, and so must all values.

An item is a (key, value) pair, represented as a tuple.

Hint

You may know this structure under alternative names "associative array" or "map".

!DICT: Dictionary.
!GET: Get the value from a dictionary.
!IN: Membership test.

`!DICT`¤

Dictionary.

Type: Mapping

!DICT
with:
  <key1>: <value1>
  <key2>: <value2>
  <key3>: <value3>
  ...

Hint

Use !COUNT to determine number of items in the dictionary.

Example

There are several ways to specify a dictionary in SP-Lang:

!DICT
with:
  key1: "One"
  key2: "Two"
  key3: "Three"

Implicit dictionary:

---
key1: "One"
key2: "Two"
key3: "Three"

Concise dictionary using !!dict and YAML flow style:

!!dict {key1: "One", key2: "Two", key3: "Three"}

Type specification¤

The type of dictionary is denoted as {Tk:Tv}, where Tk is a type of the key and Tv is a type of value. For more info about the dictionary type, continue to the relevant chapter in a type system.

The dictionary will try to infer its type based on the items added. The type of the first item will likely provide the key type Tk and the value type Tv. If the dictionary is empty, its inferred type is {str:si64}.

You can override this by using the explicit type specification:

!DICT
type: "{str:any}"
with:
  <key1>: <value1>
  <key2>: <value2>
  <key3>: <value3>
  ...

type is an optional argument containing a string with the dictionary signature that will be used instead of type inference from children.

In the above example, the type of the dictionary is {str:any}, the type of key is str and the type of values is any.

`!GET`¤

Get the value from a dictionary.

Type: Mapping.

!GET
what: <key>
from: <dict>
default: <value>

Get the item from the dict (dictionary) identified by the key.

If the key is not found, return default or error if default is not provided. default is optional.

Example

!GET
what: 3
from:
  !DICT
  with:
    1: "One"
    2: "Two"
    3: "Three"

Returns Three.

`!IN`¤

Membership test.

Type: Mapping.

!IN
what: <key>
where: <dict>

Check if key is present in the dict.

Note

The expression !IN is described in the Comparisons chapter.

Example

!IN
what: 3
where:
  !DICT
  with:
    1: "One"
    2: "Two"
    3: "Three"

Directives¤

Overview¤

Note

SP-Lang directives are expanded during compilation. They are not expressions.

!INCLUDE: Inserts the content of another file.

`!INCLUDE`¤

Insert the content of another file.

Type: Scalar, Directive.

The !INCLUDE directive is used to paste the content of a given file into the current file. If the included file is not found, SP-Lang renders an error.

Synopsis:

!INCLUDE <filename>

The filename is a name of the file in the library to be included.

It could be:

an absolute path, starting with / from the root of the library,
a relative path to the location of the file containing the !INCLUDE statement

.yaml extension is optional and will be added to the filename if missing.
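
The path rules above can be summarised in a small Python sketch. The function name and the example paths are hypothetical; the sketch only models the rules as described (absolute from the library root, relative to the including file, optional .yaml extension) and does not touch any real library.

```python
import posixpath


def resolve_include(filename: str, including_file: str) -> str:
    """Return the library path that an !INCLUDE would refer to."""
    if not filename.endswith(".yaml"):
        filename += ".yaml"                      # the extension is optional
    if filename.startswith("/"):
        return posixpath.normpath(filename)      # absolute: from the library root
    base = posixpath.dirname(including_file)     # relative: next to the includer
    return posixpath.normpath(posixpath.join(base, filename))
```

For example, `resolve_include("inc_group1", "/rules/main.yaml")` gives `/rules/inc_group1.yaml`.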

Example

This is a simple inclusion of the other_file.yaml:

!INCLUDE other_file.yaml

Example

In this example, !INCLUDE is used to decompose a larger expression into logically separated files:

!MATCH
what: !GET {...}
with:
  'group1': !INCLUDE inc_group1
  'group2': !INCLUDE inc_group2

Function expressions¤

Overview¤

!ARGUMENT, !ARG: Gets a function argument.
!FUNCTION, !FN: Defines a new function.
!SELF: Applies the current function, used for recursion.

`!ARGUMENT`, `!ARG`¤

Provides access to the argument name.

Type: Scalar.

Synopsis:

!ARGUMENT name

or

!ARG name

Tip

!ARG is a concise version of !ARGUMENT.

`!FUNCTION`, `!FN`¤

The !FUNCTION expression defines a new function. It is typically used as a top-level expression.

Type: Mapping.

Info

SP-Lang expressions are implicitly placed inside a function definition. This means that in the majority of cases, the !FUNCTION expression can be omitted and only the do section is provided.

Synopsis:

!FUNCTION
name: <name of function>
arguments:
  arg1: <type>
  arg2: <type>
  ...
returns: <type>
schemas: <dictionary of schemas>
do:
  <expression>

Tip

!FN is a concise version of !FUNCTION.

Example

!FUNCTION
arguments:
  a: si64
  b: si32
  c: si32
  d: si32
returns: fp64
do:
  !MUL
  - !ARGUMENT a
  - !ARGUMENT b
  - !ARGUMENT c
  - !ARGUMENT d

This expression defines a function that takes four arguments (a, b, c, and d) with respective data types (si64, si32, si32, and si32) and returns a result of type fp64. The function multiplies the four input arguments (a, b, c, and d) and returns the product as a floating-point number (fp64).

`!SELF`¤

The !SELF provides an ability to recursively apply "self" aka a current function.

Type: Mapping.

Synopsis:

!SELF
arg1: <value>
arg2: <value>
...

Note

!SELF expression is the so called Y combinator.

Example

!FUNCTION
arguments: {x: int}
returns: int
do:
  !IF # recurse while x > 1
    test: !GT [!ARG x, 1]
    then: !MUL [!SELF {x: !SUB [!ARG x, 1]}, !ARG x]
    else: 1

This expression defines a recursive function that takes a single integer argument x and returns an integer result. The function calculates the factorial of the input argument x using an if-else statement. If the input value x is greater than 1, the function multiplies x by the factorial of (x - 1), computed by calling itself recursively. If the input value x is 1 or less, the function returns 1.
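
For readers who know Python, the same recursion looks as follows. It is a plain translation of the described logic, with the recursive call playing the role of !SELF.

```python
def factorial(x: int) -> int:
    if x > 1:
        # Corresponds to !MUL [!SELF {x: x - 1}, x]
        return factorial(x - 1) * x
    else:
        return 1
```

For example, `factorial(5)` returns 120.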

IP Address expressions¤

Overview¤

IP addresses are represented internally as a number, a 128-bit unsigned integer. Such a type can contain both IPv6 and IPv4 addresses. IPv4 addresses are mapped into the IPv6 space using RFC 4291 "IPv4-Mapped IPv6 Address".

!IP.FORMAT: Converts an IP address into a string.
!IP.INSUBNET: Check if IP address falls into a subnet.
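
The 128-bit representation can be explored with Python's standard `ipaddress` module. The function names are hypothetical, and rendering a mapped address back in dotted IPv4 notation is an assumption about how the formatting behaves.

```python
import ipaddress


def to_ip128(text: str) -> int:
    """Return the 128-bit integer; IPv4 is mapped as ::ffff:a.b.c.d (RFC 4291)."""
    address = ipaddress.ip_address(text)
    if address.version == 4:
        return (0xFFFF << 32) | int(address)
    return int(address)


def format_ip128(value: int) -> str:
    """Convert the integer back to text, unwrapping IPv4-mapped addresses."""
    address = ipaddress.IPv6Address(value)
    mapped = address.ipv4_mapped
    return str(mapped) if mapped is not None else str(address)
```

For example, `to_ip128("192.168.1.1")` equals the integer value of the IPv6 address `::ffff:192.168.1.1`.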

`!IP.FORMAT`¤

Convert an IP address into a string.

Type: Mapping.

Synopsis:

!IP.FORMAT
what: <ip>

Convert the internal representation of the IP address into a string.

`!IP.INSUBNET`¤

Check if IP address falls into a subnet.

Type: Mapping.

Synopsis:

!IP.INSUBNET
what: <ip>
subnet: <subnet>

!IP.INSUBNET
what: <ip>
subnet:
  - <string>
  - <string>
  - <string>

Tests whether the what IP address belongs to the subnet or to any of the subnets. Returns true if it does, otherwise false.

Example with a single subnet

!IP.INSUBNET
what: 192.168.1.1
subnet: 192.168.0.0/16

Example with multiple subnets

!IP.INSUBNET
what: 1.2.3.4
subnet:
    - 10.0.0.0/8
    - 172.16.0.0/12
    - 192.168.0.0/16

This test checks whether the IP address is from the IPv4 private address space as defined in RFC 1918.

More compact form:

!IP.INSUBNET
what: 1.2.3.4
subnet: [10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16]
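
The same membership test can be reproduced with Python's `ipaddress` module, which is handy for checking the expected outcome of a rule. The function name is hypothetical; it accepts either a single subnet or a list of subnets, mirroring the two forms of the synopsis.

```python
import ipaddress


def in_subnet(what: str, subnet) -> bool:
    """True if the address belongs to the subnet or to any of the subnets."""
    subnets = [subnet] if isinstance(subnet, str) else subnet
    address = ipaddress.ip_address(what)
    return any(address in ipaddress.ip_network(s) for s in subnets)


PRIVATE = ["10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16"]
```

Here `in_subnet("192.168.1.1", "192.168.0.0/16")` is true, while `in_subnet("1.2.3.4", PRIVATE)` is false because 1.2.3.4 is not a private address.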

Parse of the IP address¤

IP address is parsed automatically from a string. If needed, you may explicitly cast string-based IP address into the ip type:

!CAST
type: ip
what: 192.168.1.1

Parse of the IP subnet¤

An IP subnet is parsed automatically from a string. If needed, you may explicitly cast a string-based IP subnet into the ipsubnet type:

!CAST
type: ipsubnet
what: 192.168.1.1/16

JSON¤

Overview¤

SP-Lang offers high-speed access to JSON data objects.

!GET: Gets a single value from JSON.
!JSON.PARSE: Parses JSON.

`!GET`¤

Get a single value from JSON.

Type: Mapping.

Synopsis:

!GET
what: <item>
type: <type>
from: <json>
default: <value>

Get the item specified by the what from the from JSON object. If the item is not found, return default or error if default is not provided. default is optional.

You may optionally specify the item type by type.

Example

JSON (aka !ARG jsonmessage):

{
"foo.bar": "Example"
}

Get the field foo.bar from a JSON above:

!GET
what: foo.bar
from: !ARG jsonmessage

JSON Pointer¤

If you want to access the item in the nested JSON, you need to use a JSON Pointer (e.g. /foo/bar) as a what for that.

The schema will be applied to infer the type of the item but for more complex access, the type argument is recommended.

Nested JSON

Nested JSON (aka !ARG jsonmessage):

{
    "foo": {
        "bar": "Example"
        }
}

Example of extraction of the string from the nested JSON:

!GET
what: /foo/bar
type: str
from: !ARG jsonmessage
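
The difference between a plain key such as foo.bar and a JSON Pointer such as /foo/bar can be demonstrated with a short Python resolver. The function name is hypothetical; the sketch follows the JSON Pointer syntax of RFC 6901 and returns the default (None if omitted) where SP-Lang would return default or an error.

```python
import json


def json_pointer_get(document, pointer: str, default=None):
    """Walk the document along the /-separated reference tokens."""
    current = document
    for token in pointer.split("/")[1:]:
        token = token.replace("~1", "/").replace("~0", "~")  # RFC 6901 escapes
        if isinstance(current, dict) and token in current:
            current = current[token]
        elif isinstance(current, list) and token.isdigit() and int(token) < len(current):
            current = current[int(token)]
        else:
            return default
    return current


message = json.loads('{"foo": {"bar": "Example"}, "foo.bar": "Flat"}')
```

Here `json_pointer_get(message, "/foo/bar")` reaches the nested value `Example`, while `message["foo.bar"]` reads the top-level key and gives `Flat`.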

`!JSON.PARSE`¤

Parse JSON.

Type: Mapping.

Synopsis:

!JSON.PARSE
what: <str>
schema: <schema>

Parse JSON string. The result can be used with e.g. !GET operator.

The optional argument schema specifies the schema to be applied. The default schema is the built-in ANY.

Example

!JSON.PARSE
what: |
  {
      "string1": "Hello World!",
      "string2": "Goodbye ..."
  }

List expressions¤

Overview¤

The list is one of the basic data structures provided by SP-Lang. The list contains a finite number of ordered items, where the same item may occur more than once. Items in the list must be of the same type.

Note

The list is sometimes inaccurately called an array.

!LIST: Creates a list of items.
!GET: Gets a single item from a list.

`!LIST`¤

Create a list of items.

Type: Implicit sequence, Mapping.

Synopsis:

!LIST
- ...
- ...

Hint

Use !COUNT to determine number of items in the list.

There are several ways to specify a list in SP-Lang:

Example

!LIST
- "One"
- "Two"
- "Three"
- "Four"
- "Five"

Example

Implicit list using YAML block sequences:

- "One"
- "Two"
- "Three"
- "Four"
- "Five"

Example

Implicit list using YAML flow sequences:

["One", "Two", "Three", "Four", "Five"]

Example

The mapping form:

!LIST
with:
- "One"
- "Two"
- "Three"
- "Four"
- "Five"

`!GET`¤

Get a single item from a list.

Type: Mapping.

Synopsis:

!GET
what: <index of the item in the list>
from: <list>

index is an integer (number). It can be negative; in that case, it specifies an item from the end of the list. Items are indexed from 0, which means that the first item in the list has index 0.

If the index is out of bounds of the list, the statement returns an error.

Example

!GET
what: 3
from:
    !LIST
    - 1
    - 5
    - 30
    - 50
    - 80
    - 120

Returns 50.

Example

!GET
what: -1
from:
    !LIST
    - 1
    - 5
    - 30
    - 50
    - 80
    - 120

Returns the last item in the list, which is 120.
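
The indexing rules can be sketched in Python, whose lists happen to follow the same conventions (zero-based, negative indexes counting from the end). This is an analogy under that assumption, not the SP-Lang runtime; get_item is a hypothetical helper name.

```python
def get_item(items, index):
    """Return items[index]; negative indexes count from the end (hypothetical helper)."""
    if not -len(items) <= index < len(items):
        raise IndexError(f"index {index} is out of bounds for {len(items)} items")
    return items[index]


values = [1, 5, 30, 50, 80, 120]
print(get_item(values, 3))   # 50
print(get_item(values, -1))  # 120
```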

Logic expressions¤

Overview¤

Logic expressions are commonly used to create more restrictive and precise conditions, such as filtering events or triggering specific actions based on a set of criteria. Logic expressions operate with the truth values true and false.

Logic expressions are representations of boolean algebra

For more information, continue to boolean algebra page at Wikipedia.

!AND: Conjunction.
!OR: Disjunction.
!NOT: Negation.

`!AND`¤

The logical !AND expression is used to combine two or more conditions that must all be true for the entire expression to be true. It is used to create more restrictive and precise conditions.

Type: Sequence

Synopsis:

!AND
- <condition 1>
- <condition 2>
- ...

In a logical !AND expression, conditions (condition 1, condition 2, ...) can be any expressions that evaluate to a boolean value (true or false). The conditions are evaluated from top to bottom, and the evaluation process stops as soon as a false condition is found, following the concept of short-circuit evaluation.

Logical conjunction

For more information, continue to Logical conjunction page at Wikipedia.

Example

!AND
- !EQ
    - !ARG vendor
    - TeskaLabs
- !EQ
    - !ARG product
    - LogMan.io
- !EQ
    - !ARG version
    - v23.10

In this example, if all of the conditions evaluate to true, the entire logical !AND expression will be true. If any of the conditions are false, the logical !AND expression will be false.
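
Short-circuit evaluation can be illustrated with a small Python sketch. The event content and the helper names logical_and and field_equals are assumptions made for this illustration only; the sketch records which conditions were actually evaluated.

```python
def logical_and(conditions):
    """Evaluate zero-argument callables top to bottom; stop at the first false one."""
    for condition in conditions:
        if not condition():
            return False
    return True


# Hypothetical event: the product does not match, the other fields do.
event = {"vendor": "TeskaLabs", "product": "Other", "version": "v23.10"}
evaluated = []  # records which conditions were actually evaluated


def field_equals(name, expected):
    def check():
        evaluated.append(name)
        return event.get(name) == expected
    return check


result = logical_and([
    field_equals("vendor", "TeskaLabs"),
    field_equals("product", "LogMan.io"),
    field_equals("version", "v23.10"),
])
print(result)     # False
print(evaluated)  # ['vendor', 'product']
```

The version condition is never evaluated, because the product condition already decided the result.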

Bitwise `!AND`¤

When !AND is applied to integer types instead of booleans, it provides a bitwise AND.

Example

!AND
- !ARG PRI
- 7

In this example, the argument PRI is masked with 7 (in binary 00000111).
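
As a worked illustration in Python, assume that PRI is a syslog priority value, which is computed as facility * 8 + severity. Masking with 7 then keeps the severity. The sample value 165 is an assumption chosen for this sketch.

```python
pri = 165  # hypothetical syslog PRI value: facility 20 * 8 + severity 5

severity = pri & 7   # keep only the lowest three bits (binary 00000111)
facility = pri >> 3  # the remaining higher bits

print(f"{pri:08b} & {7:08b} = {severity:08b}")  # 10100101 & 00000111 = 00000101
print(severity, facility)                        # 5 20
```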

`!OR`¤

The logical !OR expression is used to combine two or more conditions where at least one of the conditions must be true for the entire expression to be true. It is used to create more flexible and inclusive conditions.

Type: Sequence

Synopsis:

!OR
- <condition 1>
- <condition 2>
- ...

Conditions (condition 1, condition 2, ...) can be any expressions that evaluate to a boolean value (true or false). The conditions are evaluated from top to bottom, and the evaluation process stops as soon as a true condition is found, following the concept of short-circuit evaluation.

Logical disjunction

For more information, continue to Logical disjunction page at Wikipedia.

Example

!OR
- !EQ
    - !ARG description
    - unauthorized access
- !EQ
    - !ARG reason
    - brute force
- !EQ
    - !ARG message
    - malware detected

In this example, the expression is true when any of the following conditions is met:

The description field matches the string "unauthorized access"
The reason field matches the string "brute force"
The message field matches the string "malware detected"

Bitwise `!OR`¤

When !OR is applied to integer types instead of booleans, it provides a bitwise OR.

Example

!OR
- 1  # Read access (binary 001, decimal 1)
- 4  # Execute access (binary 100, decimal 4)

In this example, the expression is evaluated to 5.

This is because, in a bitwise !OR operation, each corresponding bit in the binary representation of the two numbers is combined using the !OR expression:

001 (read access)
100 (execute access)
---
101 (combined permissions)

The expression calculates the permissions with the resulting value (binary 101, decimal 5) from the bitwise OR operation, combining both read and execute access.
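
The same arithmetic can be checked in Python. The constant names READ and EXECUTE are hypothetical labels for the two values used in the example.

```python
from functools import reduce
from operator import or_

READ = 0b001     # decimal 1, as in the example above
EXECUTE = 0b100  # decimal 4, as in the example above

combined = reduce(or_, [READ, EXECUTE])
print(combined, format(combined, "03b"))  # 5 101

# A single flag can be tested again with a bitwise AND.
print(bool(combined & EXECUTE))  # True
```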

`!NOT`¤

The logical !NOT expression is used to invert the truth value of a single condition. It is used to exclude specific conditions when certain conditions are not met.

Type: Mapping.

Synopsis:

!NOT
what: <expression>

Negation

For more information, continue to Negation page at Wikipedia.

Bitwise `!NOT`¤

When an integer is provided, !NOT returns the value of what with all bits flipped.

Tip

If you want to test that an integer is not zero, use the !NE test expression.

Lookup expressions¤

Overview¤

!LOOKUP: Creates a new lookup.
!GET: Gets items from a lookup.
!IN: Checks if an item is in a lookup.

`!LOOKUP`¤

Type: Mapping.

`!GET`¤

Get item from a lookup.

Type: Mapping.

`!IN`¤

Check if the item is in the lookup.

Type: Mapping.

Record expressions¤

Overview¤

The record is one of the basic data structures provided by SP-Lang. A record is a collection of items, possibly of different types. Items of a record are named by a label (in contrast to a tuple).

Note

The record is built on top of !TUPLE.

!RECORD: A collection of named items.
!GET: Gets the item from a record.

`!RECORD`¤

A collection of named items.

Type: Mapping.

Synopsis:

!RECORD
with:
  item1: <item 1>
  item2: <item 2>
  ...

item1 and item2 are labels of respective items in the record.

There is no limit on the number of items in the record. The order of the items is preserved.

Example

!RECORD
with:
  name: John Doe
  age: 37
  height: 175.4

Use of the YAML flow form:

!RECORD {with: {name: John Doe, age: 37, height: 175.4} }

Use of the !!record tag:

!!record {name: John Doe, age: 37, height: 175.4}

Enforce specific type of the item:

!RECORD
with:
  name: John Doe
  age: !!ui8 37
  height: 175.4

Field age will have a type ui8.

`!GET`¤

Get the item from a record.

Type: Mapping.

Synopsis:

!GET
what: <name or index of the item>
from: <record>

If what is a string, then it is a name of the field in the record.

If what is an integer (number), then it is the index of the item in the record. what can be negative; in that case, it specifies an item from the end of the record. Items are indexed from 0, which means that the first item in the record has index 0. If what is out of bounds of the record, the statement returns an error.

Using names of items:

!GET
what: name
from:
  !RECORD
  with:
    name: John Doe
    age: 32
    height: 127.5

Returns John Doe.

Using the index of items:

!GET
what: 1
from:
  !RECORD
  with:
    name: John Doe
    age: 32
    height: 127.5

Returns 32, the value of the age item.

Using the negative index of items:

!GET
what: -1
from:
  !RECORD
  with:
    name: John Doe
    age: 32
    height: 127.5

Returns 127.5, the value of the height item.
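
The two access modes can be sketched in Python with an ordinary dictionary, which also preserves the order of its items. This is an analogy only; record_get is a hypothetical helper, not part of SP-Lang.

```python
def record_get(record, what):
    """Look up a record item by label (str) or by position (int); hypothetical helper."""
    if isinstance(what, str):
        return record[what]
    values = list(record.values())  # dicts keep insertion order in Python 3.7+
    return values[what]             # raises IndexError when out of bounds


person = {"name": "John Doe", "age": 32, "height": 127.5}
print(record_get(person, "name"))  # John Doe
print(record_get(person, 1))       # 32
print(record_get(person, -1))      # 127.5
```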

Regex expressions¤

Overview¤

Tip

Use Regexr to develop and test regular expressions.

!REGEX: Regular expression search.
!REGEX.REPLACE: Regular expression replace.
!REGEX.SPLIT: Split a string by a regular expression.
!REGEX.FINDALL: Find all occurrences by a regular expression.
!REGEX.PARSE: Parse by a regular expression.

`!REGEX`¤

Regular expression search.

Type: Mapping.

Synopsis:

!REGEX
what: <string>
regex: <regex>
hit: <hit>
miss: <miss>

Scan through what string looking for any location where regular expression regex produces a match. If there is a match, then returns hit, otherwise miss is returned.

The expression hit is optional, default value is true.

The expression miss is optional, default value is false.

Example

```yaml
!IF
test:
  !REGEX
  what: "Hello world!"
  regex: "world"
then: "Yes :-)"
else: "No ;-("
```

Another form:

!REGEX
what: "Hello world!"
regex: "world"
hit: "Yes :-)"
miss: "No ;-("
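
The hit and miss defaults can be mimicked in Python with re.search. The sketch assumes that the regular expression dialect behaves like Python's re module for this simple pattern; regex_search is a hypothetical helper.

```python
import re


def regex_search(what, regex, hit=True, miss=False):
    """Return `hit` when `regex` matches anywhere in `what`, else `miss`."""
    return hit if re.search(regex, what) else miss


print(regex_search("Hello world!", "world"))                       # True
print(regex_search("Hello world!", "world", "Yes :-)", "No ;-("))  # Yes :-)
print(regex_search("Hello Mars!", "world", "Yes :-)", "No ;-("))   # No ;-(
```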

`!REGEX.REPLACE`¤

Regular expression replace.

Type: Mapping.

Synopsis:

!REGEX.REPLACE
what: <string>
regex: <regex>
by: <string>

Replace regular expression regex matches in what by value of by.

Example

!REGEX.REPLACE
what: "Hello world!"
regex: "world"
by: "Mars"

Returns: Hello Mars!

`!REGEX.SPLIT`¤

Split a string by a regular expression.

Type: Mapping.

Synopsis:

!REGEX.SPLIT
what: <string>
regex: <regex>
max: <integer>

Split string what by regular expression regex.

The optional argument max specifies the maximum number of splits.

Example

!REGEX.SPLIT
what: "07/14/2007 12:34:56"
regex: "[/ :]"

Returns: ['07', '14', '2007', '12', '34', '56']
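
For comparison, Python's re.split produces the same list for this input. The second call shows a limit on the number of splits; that max behaves exactly like Python's maxsplit is an assumption of this sketch.

```python
import re

timestamp = "07/14/2007 12:34:56"

all_parts = re.split(r"[/ :]", timestamp)
print(all_parts)
# ['07', '14', '2007', '12', '34', '56']

# With at most two splits, the remainder stays in one piece.
limited_parts = re.split(r"[/ :]", timestamp, maxsplit=2)
print(limited_parts)
# ['07', '14', '2007 12:34:56']
```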

`!REGEX.FINDALL`¤

Find all occurrences by a regular expression.

Type: Mapping.

Synopsis:

!REGEX.FINDALL
what: <string>
regex: <regex>

Find all matches of regex in the string what.

Example

!REGEX.FINDALL
what: "Frodo, Sam, Gandalf, Legolas, Gimli, Aragorn, Boromir, Merry, Pippin"
regex: \w+

Returns: ['Frodo', 'Sam', 'Gandalf', 'Legolas', 'Gimli', 'Aragorn', 'Boromir', 'Merry', 'Pippin']

`!REGEX.PARSE`¤

Parse by a regular expression.

Type: Mapping.

See the chapter !PARSE.REGEX

Set expressions¤

Overview¤

The set stores unique items, without any particular order. Items in the set must be of the same type. The set is one of the basic data structures provided by SP-Lang.

A set is best suited for testing a value for membership rather than retrieving a specific element.

!SET: Set of items.
!IN: Membership test.

`!SET`¤

Set of items.

Type: Implicit sequence, Mapping.

Synopsis:

!SET
- ...
- ...

Hint

Use !COUNT to determine the number of items in the set.

There are several ways to specify a set in SP-Lang:

Example

!SET
- "One"
- "Two"
- "Three"
- "Four"
- "Five"

Unordered set

YAML unordered set:

!!set
? Yellow pork
? Pink grass
? White snow

YAML flow sequences

Concise set using YAML flow sequences:

!SET ["One", "Two", "Three", "Four", "Five"]

Example

The mapping form:

!SET
with:
- "One"
- "Two"
- "Three"
- "Four"
- "Five"

`!IN`¤

Membership test.

Type: Mapping.

Synopsis:

!IN
what: <item>
where: <set>

Check if item is present in the set.

The expression !IN is described in the Comparisons chapter.

Example

!IN
what: 3
where:
  !SET
  with:
    - 1
    - 2
    - 5
    - 8

String expressions¤

Overview¤

!IN: Tests if a string contains a substring.
!STARTSWITH: Tests whether a string starts with a selected prefix.
!ENDSWITH: Tests whether a string ends with a selected suffix.
!SUBSTRING: Extracts part of a string.
!LOWER, !UPPER: Transforms a string into lowercase / uppercase.
!CUT: Cuts the string and returns a selected part.
!SPLIT, !RSPLIT: Splits a string into a list.
!JOIN: Joins a list of strings.

`!IN`¤

The !IN expression is used to check if a string what exists in a string where or not.

Type: Mapping.

Synopsis:

!IN
what: <...>
where: <...>

Evaluate to true if it finds a substring what in the string where and false otherwise.

Example

!IN
what: "Willy"
where: "John Willy Boo"

Check for a presence of the substring "Willy" in the where value. Returns true.

Multi-string variant¤

There is a special variant of the !IN operator for checking whether any of the strings provided in the what value (a list in this case) is in the string. It is an efficient, optimized implementation of a multi-string matcher.

!IN
what:
  - "John"
  - "Boo"
  - "ly"
where: "John Willy Boo"

This is a very efficient way of checking whether at least one substring is present in the where string. It uses the Incremental String Matching algorithm for fast pattern matching in strings, which makes it an ideal tool for complex filtering, either standalone or as an optimization technique.

Example of !REGEX optimization by multi-string !IN:

    !AND
    - !IN
      where: !ARG message
      what:
      - "msgbox"
      - "showmod"
      - "showhelp"
      - "prompt"
      - "write"
      - "test"
      - "mail.com"
    - !REGEX
      what: !ARG message
      regex: '(msgbox|showmod(?:al|eless)dialog|showhelp|prompt|write)|(test[0-9])|([a-z]@mail\.com)'

This approach is recommended for stream applications where you need to filter an extensive amount of data, assuming that only a small portion of the data matches the patterns. Applying the !REGEX expression directly slows processing down significantly, because it is a complex regular expression. The idea is to "pre-filter" data with a simpler but faster condition so that only a fraction of the data reaches the expensive !REGEX. The typical performance improvement is 5x-10x.

For that reason, the !IN must be a perfect superset of the !REGEX, which means:

!IN -> true, !REGEX -> true: true
!IN -> true, !REGEX -> false: false (this should be a minority of cases)
!IN -> false, !REGEX -> false: false (prefiltering, this should be a majority of cases)
!IN -> false, !REGEX -> true: this combination MUST BE avoided, adapt the !IN and/or !REGEX accordingly.
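
The pre-filter pattern and the superset rule can be sketched in Python. The substrings and the regular expression are copied from the example; the sample messages and the helper names prefilter and matches are assumptions. Python's any() is a naive scan and only stands in for the optimized multi-string matcher.

```python
import re

# Substrings and pattern copied from the example above.
NEEDLES = ["msgbox", "showmod", "showhelp", "prompt", "write", "test", "mail.com"]
PATTERN = re.compile(
    r"(msgbox|showmod(?:al|eless)dialog|showhelp|prompt|write)|(test[0-9])|([a-z]@mail\.com)"
)


def prefilter(message):
    """Cheap check: is at least one substring present?"""
    return any(needle in message for needle in NEEDLES)


def matches(message):
    """Run the expensive regex only when the cheap check passes."""
    return prefilter(message) and PATTERN.search(message) is not None


samples = ["open msgbox now", "this is a test", "user@mail.com", "nothing here"]
for sample in samples:
    # The superset rule: a regex match without a prefilter hit must never happen.
    assert prefilter(sample) or PATTERN.search(sample) is None
    print(sample, "->", matches(sample))
```

The message "this is a test" passes the pre-filter but fails the regular expression, which is the expected minority case.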

`!STARTSWITH`¤

Returns true if what string begins with prefix.

Type: Mapping

Synopsis:

!STARTSWITH
what: <...>
prefix: <...>

Example

!STARTSWITH
what: "FooBar"
prefix: "Foo"

Multi-string variant¤

Work in progress

Not implemented yet.

!STARTSWITH
what: <...>
prefix: [<prefix1>, <prefix2>, ...]

In multi-string variant, a list of strings is defined. The expression evaluates to true if at least one prefix string matches the start of the what string.

`!ENDSWITH`¤

Returns true if what string ends with postfix.

Type: Mapping

Synopsis:

!ENDSWITH
what: <...>
postfix: <...>

Example

!ENDSWITH
what: "autoexec.bat"
postfix: ".bat"

Multi-string variant¤

Work in progress

Not implemented yet.

!ENDSWITH
what: <...>
postfix: [<postfix1>, <postfix2>, ...]

In multi-string variant, a list of strings is defined. The expression evaluates to true if at least one postfix string matches the end of the what string.

`!SUBSTRING`¤

Return part of the string what, in between from and to index.

Type: Mapping

Synopsis:

!SUBSTRING
what: <...>
from: <...>
to: <...>

Info

The first character of the string is located at position from=0.

Example

!SUBSTRING
what: "FooBar"
from: 1
to: 3

Returns oo.

`!LOWER`¤

Transform a string or a list of strings to lowercase.

Type: Mapping

Synopsis:

!LOWER
what: <...>

Example

!LOWER
what: "FooBar"

Returns foobar.

Example

!LOWER
what: ["FooBar", "Baz"]

Returns list of values ["foobar", "baz"].

`!UPPER`¤

Transform a string to uppercase.

Type: Mapping

Synopsis:

!UPPER
what: <...>

Example

!UPPER
what: "FooBar"

Returns FOOBAR.

`!CUT`¤

Cut the string by a delimiter and return the piece identified by field index (starts with 0).

Type: Mapping

Synopsis:

!CUT
what: <string>
delimiter: <string>
field: <int>

The string what is split using the delimiter argument. The argument field specifies the index of the piece to return, starting with 0.
If field is negative, the piece is counted from the end of the string; for example, -2 means the second-to-last piece.

Example

!CUT
what: "Apple,Orange,Melon,Citrus,Pear"
delimiter: ","
field: 2

Will return value "Melon".

Example

!CUT
what: "Apple,Orange,Melon,Citrus,Pear"
delimiter: ","
field: -2

Will return value "Citrus".
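
In Python terms, the two examples correspond to splitting and indexing the resulting list. This is a sketch of the described behaviour only; cut is a hypothetical helper name.

```python
def cut(what, delimiter, field):
    """Split `what` by `delimiter` and return the piece at index `field`."""
    return what.split(delimiter)[field]


fruits = "Apple,Orange,Melon,Citrus,Pear"
print(cut(fruits, ",", 2))   # Melon
print(cut(fruits, ",", -2))  # Citrus
```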

`!SPLIT`¤

Splits a string into a list of strings.

Type: Mapping

Synopsis:

!SPLIT
what: <string>
delimiter: <string>
maxsplit: <number>

The string what is split using the delimiter argument. The optional maxsplit argument specifies how many splits to do.

Example

!SPLIT
what: "hello,world"
delimiter: ","

The result is a list: ["hello", "world"].

`!RSPLIT`¤

Splits a string from the right (end of the string) into a list of strings.

Type: Mapping

Synopsis:

!RSPLIT
what: <string>
delimiter: <string>
maxsplit: <number>

The string what is split using the delimiter argument. The optional maxsplit argument specifies how many splits to do.
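
The difference between splitting from the left and from the right only shows when the number of splits is limited. The Python sketch below assumes that !SPLIT and !RSPLIT follow the same convention as Python's str.split and str.rsplit; the sample path is made up for the illustration.

```python
path = "var/log/app/error.log"

# Splitting from the left keeps the unsplit remainder at the end.
left = path.split("/", 1)
print(left)   # ['var', 'log/app/error.log']

# Splitting from the right keeps the unsplit remainder at the start.
right = path.rsplit("/", 1)
print(right)  # ['var/log/app', 'error.log']

# Without a limit, both directions give the same list.
print(path.split("/") == path.rsplit("/"))  # True
```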

`!JOIN`¤

Join a list of strings into a single string.

Type: Mapping

Synopsis:

!JOIN
items:
  - <...>
  - <...>
delimiter: <string>
miss: ""

Default delimiter is space (" ").

If the item is None, then the value of miss parameter is used, by default it is empty string. If miss is None and any of items is None, the result of the whole join is None.

Example

!JOIN
items:
  - "Foo"
  - "Bar"
delimiter: ","

Tuple expressions¤

Overview¤

The tuple is one of the basic data structures provided by SP-Lang. A tuple is a collection of items, possibly of different types.

!TUPLE: A collection of items.
!GET: Get item from a tuple.

`!TUPLE`¤

A collection of items.

Type: Mapping.

Synopsis:

!TUPLE
with:
  - ...
  - ...
  ...

There is no limit on the number of items in the tuple. The order of the items is preserved.

Example

!TUPLE
with:
  - John Doe
  - 37
  - 175.4

Example

Use of the !!tuple notation:

!!tuple
- 1
- a
- 1.2

Example

Even more concise version of the !!tuple using flow syntax:

!!tuple ['John Doe', 37, 175.4]

Example

Enforce specific type of the item:

!TUPLE
with:
  - John Doe
  - !!ui8 37
  - 175.4

Item #1 will have a type ui8.

`!GET`¤

Get item from a tuple.

Type: Mapping.

Synopsis:

!GET
what: <index of the item>
from: <tuple>

The argument what is an integer (number) that represents the index in the tuple. It can be negative; in that case, it specifies an item from the end of the tuple.

Items are indexed from 0, which means that the first item in the tuple has index 0.

If what is out of bounds of the tuple, the statement returns an error.

Example

!GET
what: 1
from:
  !TUPLE
  with:
    - John Doe
    - 32
    - 127.5

Returns `32`.

Example

Using the negative index of items:

!GET
what: -1
from:
  !TUPLE
  with:
    - John Doe
    - 32
    - 127.5

Returns 127.5.

Utility expressions¤

Overview¤

!CAST: Converts type of the argument into another.
!HASH: Calculates a digest.
!DEBUG: Debugs the expression.

`!CAST`¤

Convert type of the argument into another.

Type: Mapping.

Synopsis:

!CAST
what: <input>
type: <type>

Explicitly convert type of what into the type of type.

SP-Lang automatically converts types of arguments so that the user doesn't need to think about types at all. This feature is called implicit casting.

If you explicitly need a type conversion, use the !CAST expression. It is a very powerful method that does a lot of heavy lifting.

For more details, see chapter about types.

Example

!CAST
what: "10.3"
type: fp64

This is an explicit casting of the string into a floating-point number.

`!HASH`¤

Calculate a digest.

Type: Mapping.

Synopsis:

!HASH
what: <input>
seed: <integer>
type: <type of hash>

Calculate the hash of the what value.

seed specifies the initial hash seed.

type specifies a hashing function, the default value is XXH64.

Supported hashing functions¤

XXH64: xxHash, 64bit, non-cryptographic, extremely fast hash algorithm
XXH3: xxHash, 64bit, non-cryptographic, further optimized for small inputs

More information about xxHash is available at xxhash.com.

Example

!HASH
what: "Hello world!"
seed: 5

`!DEBUG`¤

Print the content of the input and pass the value unchanged on the output.

Type: Mapping.

Parsec

PARSEC expressions¤

Parsec expressions group represents the concept of parser combinator.

They provide a way to combine basic parsers in order to construct more complex parsers for specific rules. In this context, a parser is a function that takes a single string as input and produces a structured output that indicates successful parsing, or provides an error message if the parsing process fails.

Parsec expressions are divided into two groups: parsers and combinators.

Parsers can be seen as the fundamental units or building blocks. They are responsible for recognizing and processing specific patterns or elements within the input string.

Combinators are operators (higher order functions) that allow the combination and composition of parsers.

Every expression starts with the !PARSE. prefix.

Parser expressions¤

Overview¤

Parser expressions are functions for parsing a certain sequence of characters.

Basic parsers can differentiate between digits, letters and spaces:

!PARSE.DIGIT, !PARSE.DIGITS: Parse single or multiple digits.
!PARSE.LETTER, !PARSE.LETTERS: Parse single or multiple letters.
!PARSE.SPACE, !PARSE.SPACES: Parse single or multiple whitespace characters.
!PARSE.CHAR, !PARSE.CHARS: Parse single or multiple characters.

The following expressions are used for parsing characters from custom set of characters and looking for specific characters in input strings:

!PARSE.EXACTLY: Parse only specific sequence of characters.
!PARSE.UNTIL: Parse till a specific character is found.
!PARSE.BETWEEN: Parse between two characters.
!PARSE.ONEOF: Parse only one of allowed characters.
!PARSE.NONEOF: Parse every character except forbidden ones.
!PARSE.REGEX: Parse characters matching a regular expression.

The following expressions are used for parsing dates and times in various formats:

!PARSE.DATETIME: Parse date and time.
!PARSE.MONTH: Parse month in various formats.
!PARSE.FRAC: Parse decimal numbers (which is useful for parsing microseconds).

The following expressions are used for parsing specific types of strings:

!PARSE.IP: Parse IP address.
!PARSE.MAC: Parse MAC address.

`!PARSE.DIGIT`¤

Parse a single digit.

Type: Parser.

Synopsis:

!PARSE.DIGIT

Example

Input string: 2

!PARSE.DIGIT

`!PARSE.DIGITS`¤

Parse a sequence of digits.

Type: Parser.

Synopsis:

!PARSE.DIGITS
min: <...>
max: <...>
exactly: <...>

exactly specifies the exact number of digits to parse.
min and max specify the minimal and maximal number of digits to parse. They cannot be combined with exactly parameter.
If none of fields min, max and exactly is specified, as many digits as possible are parsed.

Warning

exactly field can't be used together with min or max fields. And of course max value can't be less than min value.

Example

Input string: 123

!PARSE.DIGITS
max: 4

More examples

Parse as many digits as possible:

!PARSE.DIGITS

Parse exactly 3 digits:

!PARSE.DIGITS
exactly: 3

Parse at least 2 digits, but not more than 4:

!PARSE.DIGITS
min: 2
max: 4

`!PARSE.LETTER`¤

Parse a single letter.

By letters, we mean latin letters from A to Z, both uppercase and lowercase.

Type: Parser.

Synopsis:

!PARSE.LETTER

Example

Input string: A

!PARSE.LETTER

`!PARSE.LETTERS`¤

Parse a sequence of letters.

By letters, we mean latin letters from A to Z, both uppercase and lowercase.

Type: Parser.

Synopsis:

!PARSE.LETTERS
min: <...>
max: <...>
exactly: <...>

Fields min, max and exactly are optional.

Warning

exactly field can't be used together with min or max fields. And of course max value can't be less than min value.

Example

Input string: cat

!PARSE.LETTERS
max: 4

More examples

Parse as many letters as possible:

!PARSE.LETTERS

Parse exactly 3 letters:

!PARSE.LETTERS
exactly: 3

Parse at least 2 letters, but not more than 4:

!PARSE.LETTERS
min: 2
max: 4

`!PARSE.SPACE`¤

Parse a single space character.

Type: Parser.

Synopsis:

!PARSE.SPACE

`!PARSE.SPACES`¤

Parse a sequence of space characters.

As many space characters as possible are parsed.

Type: Parser.

Synopsis:

!PARSE.SPACES

`!PARSE.CHAR`¤

Parse a single character of any type.

Type: Parser.

Synopsis:

!PARSE.CHAR

Example

Input string: @

!PARSE.CHAR

`!PARSE.CHARS`¤

Parse a sequence of characters.

Type: Parser.

Synopsis:

!PARSE.CHARS
min: <...>
max: <...>
exactly: <...>

Fields min, max and exactly are optional.

Warning

exactly field can't be used together with min or max fields. And of course max value can't be less than min value.

Example

Input string: name@123_

!PARSE.CHARS
max: 8

Tip

Use !PARSE.CHARS with default settings to parse till the end of the string.

More examples

Parse as many chars as possible:

!PARSE.CHARS

Parse exactly 3 chars:

!PARSE.CHARS
exactly: 3

Parse at least 2 chars, but not more than 4:

!PARSE.CHARS
min: 2
max: 4

`!PARSE.EXACTLY`¤

Parse a precisely defined sequence of characters.

Type: Parser.

Synopsis:

!PARSE.EXACTLY
what: <...>

or shorter version:

!PARSE.EXACTLY <...>

Example

Input string: Hello world!

!PARSE.EXACTLY
what: "Hello"

`!PARSE.UNTIL`¤

Parse a sequence of characters until a specific character is found.

Type: Parser.

Synopsis:

!PARSE.UNTIL
what: <...>
stop: <before/after>
eof: <true/false>
escape: <...>

or shorter version:

!PARSE.UNTIL <...>

what: Specifies one (and only one) character to search for in the input string.
stop: Indicates whether the stop character should be parsed or not. Possible values: before or after (default).
eof: Indicates if we should parse till the end of the string if what symbol is not found. Possible values: true or false (default).
escape: Indicates escape character.

Info

Field what must be a single character. But some whitespace characters can also be used such as tab. To search for a sequence of characters, see the expression !PARSE.CHARS.LOOKAHEAD.

Example

Input string: 60290:11

!PARSE.UNTIL
what: ":"

More examples

Parse until : symbol and stop before it:

!PARSE.UNTIL
what: ":"
stop: "before"

Parse until space symbol and stop after it:

!PARSE.UNTIL ' '

Parse until , symbol or parse till the end of the string if it's not found:

!PARSE.UNTIL
what: ","
eof: true

Parse until tab symbol (upper and lower case are supported):

!PARSE.UNTIL TAB

Parse until newline symbol (upper and lower case are supported):

!PARSE.UNTIL NEWLINE

Parse until a vertical bar, escaping internal vertical bars:
Input string: CRED_REFR\|success\|fail|

!PARSE.UNTIL
what: '|'
escape: '\'
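
The options stop, eof and escape can be sketched in Python as a function that returns the parsed part and the remaining input. The helper parse_until is hypothetical. Whether SP-Lang removes escape characters from the result is not stated here, so this sketch keeps them as they are.

```python
def parse_until(text, what, stop="after", eof=False, escape=None):
    """Return (parsed, rest) for the first unescaped `what` in `text`, or None."""
    position = 0
    while position < len(text):
        char = text[position]
        if escape is not None and char == escape:
            position += 2  # skip the escape character and the character it protects
            continue
        if char == what:
            parsed = text[:position]
            # "after" consumes the stop character, "before" leaves it in the rest.
            rest = text[position + 1:] if stop == "after" else text[position:]
            return parsed, rest
        position += 1
    # The stop character was not found.
    return (text, "") if eof else None


print(parse_until("60290:11", ":"))                 # ('60290', '11')
print(parse_until("60290:11", ":", stop="before"))  # ('60290', ':11')
print(parse_until("no comma here", ",", eof=True))  # ('no comma here', '')
print(parse_until(r"CRED_REFR\|success\|fail|", "|", escape="\\"))
# ('CRED_REFR\\|success\\|fail', '')
```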

`!PARSE.BETWEEN`¤

Parse a sequence of characters between two specific characters.

Type: Parser.

Synopsis:

!PARSE.BETWEEN
what: <...>
start: <...>
stop: <...>
escape: <...>

or shorter version:

!PARSE.BETWEEN <...>

what - indicates between which same characters we should parse.
start, stop - indicates between which different characters we should parse.
escape - indicates escape character.

Example

Input string: [10/May/2023:08:15:54 +0000]

!PARSE.BETWEEN
start: '['
stop: ']'

More examples

Parse between double-quotes:

!PARSE.BETWEEN
what: '"'

Parse between double-quotes, short form:

!PARSE.BETWEEN '"'

Parse between double-quotes, escape internal double-quotes:
Input string: "one, \"two\", three"

!PARSE.BETWEEN
what: '"'
escape: '\'

`!PARSE.ONEOF`¤

Parse a single character from a selected set of characters.

Type: Parser.

Synopsis:

!PARSE.ONEOF
what: <...>

or shorter version:

!PARSE.ONEOF <...>

Example

Input strings:

process finished with status 0
process finished with status 1
process finished with status x

!PARSE.KVLIST
- "process finished with status "
- !PARSE.ONEOF
  what: "01x"

`!PARSE.NONEOF`¤

Parse a single character that is not in a selected set of characters.

Type: Parser.

Synopsis:

!PARSE.NONEOF
what: <...>

or shorter version:

!PARSE.NONEOF <...>

Example

Input string: Wow!

!PARSE.NONEOF
what: ",;:[]()"

`!PARSE.REGEX`¤

Parse a sequence of characters that matches a regular expression.

Type: Parser.

Synopsis:

!PARSE.REGEX
what: <...>

Example

Input string: FTVW23_L-C: Message...

Output: FTVW23_L-C

!PARSE.REGEX
what: '[a-zA-Z0-9_\-0]+'

`!PARSE.DATETIME`¤

Parse datetime.

Type: Parser.

Synopsis:

!PARSE.DATETIME
- year: <...>
- month: <...>
- day: <...>
- hour: <...>
- minute: <...>
- second: <...>
- microsecond: <...>
- nanosecond: <...>
- timezone: <...>

Fields month and day are required.
Field year is optional. If not specified, the smart year function will be used. Both 2 and 4-digit numbers are supported.
Fields hour, minute, second, microsecond, nanosecond are optional. If not specified, the default value 0 will be used.
Specifying the microsecond field as microsecond? makes it optional: microseconds are parsed only if they are present in the input string.
Field timezone is optional. If not specified, the default value UTC will be used. Read more about timezone parsing in the Timezone section.

Common datetime formats

Use Shortcuts for parsing datetime formats RFC 3339, RFC 3164 and ISO 8601.

UNIX time

For parsing datetime in UNIX time, use !PARSE.DATETIME EPOCH.

Tip

Use !PARSE.MONTH for parsing a month.

Tip

Use !PARSE.FRAC for parsing microseconds and nanoseconds. Note that this expression consumes . and , as well. Do not parse them separately.

Example

Input string: 2022-10-13T12:34:56.987654

!PARSE.DATETIME
- year: !PARSE.DIGITS
- '-'
- month: !PARSE.MONTH 'number'
- '-'
- day: !PARSE.DIGITS
- 'T'
- hour: !PARSE.DIGITS
- ':'
- minute: !PARSE.DIGITS
- ':'
- second: !PARSE.DIGITS
- microsecond: !PARSE.FRAC
        base: "micro"

Two-digit year

Parse datetime with two-digit year:

Input string: 22-10-13T12:34:56.987654

!PARSE.DATETIME
- year: !PARSE.DIGITS  # Year can be either 4-digit or 2-digit
- '-'
- month: !PARSE.MONTH "number"
- '-'
- day: !PARSE.DIGITS
- 'T'
- hour: !PARSE.DIGITS
- ':'
- minute: !PARSE.DIGITS
- ':'
- second: !PARSE.DIGITS
- microsecond: !PARSE.FRAC
            base: micro

No year, optional microseconds

Parse datetime without a year, with short month form and optional microseconds:

Input strings:

Aug 17 12:00:00
Aug 17 12:00:00.123
Aug 17 12:00:00.123456

!PARSE.DATETIME
# There is no year in input string, smart year function is used.
- month: !PARSE.MONTH 'short'
- !PARSE.SPACE
- day: !PARSE.DIGITS
- !PARSE.SPACE
- hour: !PARSE.DIGITS
- ":"
- minute: !PARSE.DIGITS
- ":"
- second: !PARSE.DIGITS
- microsecond?: !PARSE.FRAC  # Parsing of microseconds is optional here
                base: "micro"
                max: 6

In this case, year is automatically determined by the smart year function, which basically means that the current year is used.

Milliseconds

Parse datetime with milliseconds:

Input string: 2023-03-23T07:00:00.734

!PARSE.DATETIME
- year: !PARSE.DIGITS
- "-"
- month: !PARSE.DIGITS
- "-"
- day: !PARSE.DIGITS
- "T"
- hour: !PARSE.DIGITS
- ":"
- minute: !PARSE.DIGITS
- ":"
- second: !PARSE.DIGITS
- microsecond: !PARSE.FRAC
            base: milli
            max: 3

Nanoseconds

Parse datetime with nanoseconds:

Input string: 2023-03-23T07:00:00.734323900

!PARSE.DATETIME
- year: !PARSE.DIGITS
- "-"
- month: !PARSE.DIGITS
- "-"
- day: !PARSE.DIGITS
- "T"
- hour: !PARSE.DIGITS
- ":"
- minute: !PARSE.DIGITS
- ":"
- second: !PARSE.DIGITS
- nanosecond: !PARSE.FRAC
            base: "nano"
            max: 9

Timezone¤

Timezone can be either specified in the log or it can be missing. There are two approaches for that:

Timezone is parsed from the input string. In that case, use the suitable parsing expression for the timezone part.
```
!PARSE.DATETIME
- timezone: !PARSE.UNTIL " "
```
Permissible formats of timezones are: Z, UTC, +02:00, -0600.
Timezone is fixed. In that case, specify it as IANA timezone.
```
!PARSE.DATETIME
- timezone: "Europe/Prague"
```
This comes in handy when the timezone is missing in the log or when it is used incorrectly.

Timezone from input

Parse datetime which contains timezone in input strings.

Input strings:

2024-04-15T12:00:00+04:00 ...(other log content)
2024-04-15T12:00:00+02:00 ...
2024-04-15T12:00:00+00:00 ...
2024-04-15T12:00:00-02:00 ...

!PARSE.DATETIME
- year: !PARSE.DIGITS
- '-'
- month: !PARSE.MONTH 'number'
- '-'
- day: !PARSE.DIGITS
- 'T'
- hour: !PARSE.DIGITS
- ':'
- minute: !PARSE.DIGITS
- ':'
- second: !PARSE.DIGITS
- timezone: !PARSE.UNTIL " "  # Read timezone from '+04:00', '+02:00', etc.

Smart year¤

The smart year function is designed to predict the complete year from a provided month by taking into account the current year and month to determine the most likely corresponding four-digit year.
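
One plausible way to implement such a prediction is sketched below in Python. The rule used here (a month later than the current month belongs to the previous year) is an assumption made for illustration; the actual SP-Lang heuristic is not specified in this chapter. The helper name smart_year is hypothetical.

```python
from datetime import date


def smart_year(month, today):
    """Guess the year for a timestamp that carries only a month (assumed heuristic)."""
    # A month later than the current one most likely belongs to the previous year,
    # for example a "Dec" log line that is processed in January.
    if month > today.month:
        return today.year - 1
    return today.year


print(smart_year(8, date(2024, 9, 3)))   # 2024
print(smart_year(12, date(2025, 1, 2)))  # 2024
```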

Shortcuts¤

Shortcut forms are available (in both lower/upper variants):

ISO 8601¤

!PARSE.DATETIME ISO8601

This expression parses datetimes defined by ISO 8601. Timezone can be parsed from the input string or, if not present, it can be set in the lmio-parsec configuration.

Example of datetimes that can be parsed using the shortcut:

2024-04-12T10:16:21Z
20240412T101621Z
2024-12-11T11:17:21.123456+00:00
2024-04-12T03:16:21−07:00
2024-04-12T03:16:21

RFC 3339¤

!PARSE.DATETIME RFC3339

This expression parses datetimes defined by RFC 3339. Timezone is always parsed from the input string.

Example of datetimes that can be parsed using the shortcut:

1985-04-12T23:20:50.52Z
1996-12-19T16:39:57-08:00
2021-06-29 16:51:43.987654+02:00

RFC 3164¤

!PARSE.DATETIME RFC3164

This expression parses datetimes defined by RFC 3164. The year is provided by the smart year function. The timezone can be set in the LogMan.io Parsec configuration; if it is not set, UTC is assumed.

Example of datetimes that can be parsed using the shortcut:

Apr 24 15:25:20
Oct  3 20:33:02
AUG  4 10:20:20

Epoch¤

!PARSE.DATETIME EPOCH

!PARSE.DATETIME epoch

This expression parses datetimes expressed as Unix time. It accepts several representations: seconds, milliseconds, microseconds, and seconds with a fractional part (milliseconds or microseconds).

Example of datetimes that can be parsed using the shortcut:

1731410205 - seconds
1727951634.687 - seconds with a fractional part (here milliseconds)
1728564383160 - milliseconds
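
A parser that accepts all of these representations has to decide which unit a number is in. The Python sketch below shows one way to do that by magnitude. The thresholds are an assumption chosen for illustration and are not taken from SP-Lang; the name `parse_epoch` is hypothetical.

```python
from datetime import datetime, timedelta, timezone
from decimal import Decimal

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def parse_epoch(text: str) -> datetime:
    """Interpret a Unix timestamp given in seconds, milliseconds or microseconds.

    Assumed rule: a value with a decimal point, or below 1e11, is in seconds;
    a value below 1e14 is in milliseconds; anything larger is in microseconds.
    """
    value = Decimal(text)
    if "." in text or abs(value) < Decimal("1e11"):
        micros = value * 1_000_000
    elif abs(value) < Decimal("1e14"):
        micros = value * 1_000
    else:
        micros = value
    return EPOCH + timedelta(microseconds=int(micros))


for sample in ("1731410205", "1727951634.687", "1728564383160"):
    print(sample, "->", parse_epoch(sample).isoformat())
```

Using `Decimal` rather than `float` keeps the fractional digits exact, so `.687` becomes exactly 687000 microseconds.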

`!PARSE.MONTH`¤

Parse a month.

Type: Parser.

Synopsis:

!PARSE.MONTH
what: <...>

or shorter version:

!PARSE.MONTH <...>

The what parameter specifies the format of the month. Possible values are:

number: numeric representation, e.g. 01 for January, 12 for December
short: three-letter representation, e.g. Jan for January, Dec for December
full: full name representation, e.g. January, December

Tip

Use !PARSE.MONTH to parse month name as part of !PARSE.DATETIME.

Example

Input string: 10/Jan/2023:08:15:54

!PARSE.MONTH 'short'

Input string: 10/01/2023:08:15:54

!PARSE.MONTH 'number'

Input string: 10/January/2023:08:15:54

!PARSE.MONTH 'full'
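
To make the three formats concrete, here is a small Python model that maps each of them to a month number. It is only a sketch of the behaviour described above, not the SP-Lang implementation; the case-insensitive matching and the name `parse_month` are assumptions.

```python
MONTHS = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]


def parse_month(text: str, what: str):
    """Return (month_number, consumed_characters) for the start of `text`."""
    if what == "number":
        digits = text[:2]
        if len(digits) == 2 and all(ch in "0123456789" for ch in digits):
            if 1 <= int(digits) <= 12:
                return int(digits), 2
        raise ValueError(f"no numeric month at: {text!r}")
    if what not in ("short", "full"):
        raise ValueError(f"unknown format: {what!r}")
    for number, name in enumerate(MONTHS, start=1):
        token = name[:3] if what == "short" else name
        # Assumption: names are matched regardless of letter case.
        if text[: len(token)].lower() == token.lower():
            return number, len(token)
    raise ValueError(f"no month name at: {text!r}")


# The month part of each input string above, starting right after '10/'.
print(parse_month("Jan/2023:08:15:54", "short"))
print(parse_month("01/2023:08:15:54", "number"))
print(parse_month("January/2023:08:15:54", "full"))
```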

`!PARSE.FRAC`¤

Parse a fraction.

Warn

Fraction parsing also consumes the separator in front of the digits, which can be a dot (.) or a comma (,).

Type: Parser.

Synopsis:

!PARSE.FRAC
base: <...>
max: <...>

base: Indicates a base of the fraction. Possible values are:
- milli for 10^-3 base
- micro for 10^-6 base
- nano for 10^-9 base
max: Indicates the maximum number of digits to parse. If max is not specified, the default depends on the base value: 3 for milli, 6 for micro, 9 for nano.

Tip

Use !PARSE.FRAC to parse microseconds or nanoseconds as part of !PARSE.DATETIME.

Example

Input strings:

Aug 22 05:40:14.264
Aug 22 05:40:14.264023

!PARSE.FRAC
base: "micro"

or the full form:

!PARSE.FRAC
base: "micro"
max: 6
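
The interplay of base and max can be modelled in a few lines of Python. This is a sketch under assumptions, not the SP-Lang implementation: it assumes that a shorter digit run is scaled up to the chosen base (so `.264` with base micro means 264000 microseconds), and the name `parse_frac` is hypothetical.

```python
BASE_DIGITS = {"milli": 3, "micro": 6, "nano": 9}


def parse_frac(text: str, base: str, max_digits=None):
    """Return (value_in_base_units, consumed_characters) for the start of `text`."""
    width = BASE_DIGITS[base]
    limit = width if max_digits is None else max_digits
    # The separator (dot or comma) is part of the fraction.
    if not text or text[0] not in ".,":
        raise ValueError("a fraction must start with '.' or ','")
    digits = ""
    for ch in text[1:]:
        if ch not in "0123456789" or len(digits) == limit:
            break
        digits += ch
    if not digits:
        raise ValueError("no digits after the separator")
    # Assumption: right-pad (or cut) the digits to the width of the base.
    value = int(digits.ljust(width, "0")[:width])
    return value, 1 + len(digits)


# Fraction parts of the two input strings above, both read as microseconds.
print(parse_frac(".264", "micro"))
print(parse_frac(".264023", "micro"))
```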

`!PARSE.IP`¤

Parse IP address in both IPv4 and IPv6 formats.

Returns numeric representation of the IP address.

Type: Parser.

Synopsis:

!PARSE.IP

Example

Input string: 193.178.72.2

!PARSE.IP
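
For intuition about what a numeric representation of an IP address looks like, the Python standard library can convert an address to an integer. This is only an illustration; the exact numeric layout SP-Lang uses internally is not described here and may differ. The name `ip_to_int` is hypothetical.

```python
import ipaddress


def ip_to_int(text: str) -> int:
    """Convert an IPv4 or IPv6 address in text form to its integer value."""
    return int(ipaddress.ip_address(text))


# Each IPv4 octet contributes 8 bits, most significant octet first.
print(ip_to_int("193.178.72.2"))
print(ip_to_int("::1"))
```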

`!PARSE.MAC`¤

Parse MAC address in the format XX:XX:XX:XX:XX:XX.

Returns numeric representation of the MAC address.

Type: Parser.

Synopsis:

!PARSE.MAC

Example

Input string: 4d:3b:4c:bc:e5:6d

!PARSE.MAC
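
A MAC address in the XX:XX:XX:XX:XX:XX format is six bytes, so it fits into a 48-bit integer. The Python sketch below shows that conversion for illustration; it is not the SP-Lang implementation, and the name `mac_to_int` is hypothetical.

```python
import string


def mac_to_int(text: str) -> int:
    """Convert 'XX:XX:XX:XX:XX:XX' to a 48-bit integer."""
    parts = text.split(":")
    if len(parts) != 6:
        raise ValueError(f"expected six groups: {text!r}")
    for part in parts:
        if len(part) != 2 or any(ch not in string.hexdigits for ch in part):
            raise ValueError(f"invalid group {part!r} in {text!r}")
    return int("".join(parts), 16)


print(hex(mac_to_int("4d:3b:4c:bc:e5:6d")))
```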

Combinator expressions¤

Overview¤

Combinators are functions for composing parsec expressions (parsers or other combinators) together. They specify how parsing is applied and what the output type is. They can be used for flow control of parsing (applying conditional or repeated expressions) and also for lookahead searching in the input string.

Output selectors determine the type of output:

!PARSE.KVLIST: Parse a sequence of keys and values into the bag type.
!PARSE.KV: Parse a key and a value from the input string.
!PARSE.TUPLE: Parse into the tuple type.
!PARSE.RECORD: Parse into the record structure.

Flow control expressions apply a sequence of parser expressions based on certain conditions:

!PARSE.REPEAT: Performs the same sequence of expressions multiple times, similar to a "for" statement in other languages.
!PARSE.SEPARATED: Parses a sequence of values divided by a separator.
!PARSE.OPTIONAL: Adds an optional parser function, similar to an "if/else" statement in other languages.
!PARSE.TRIE: Performs a sequence of expressions based on the input string prefix.

Lookahead expressions:

!PARSE.CHARS.LOOKAHEAD: Parse until a certain sequence of characters is found in the string.

`!PARSE.KVLIST`¤

Parse a list of key-value pairs.

The !PARSE.KVLIST expression iterates through its list of elements and collects the key-value pairs into a bag.

Type: Combinator

Synopsis:

!PARSE.KVLIST
- <...>
- key: <...>

Non-key elements are parsed, but not collected:

!PARSE.KVLIST
- <...>  # parsed, but not collected
- key1: <...>  # parsed and collected
- key2: <...>  # parsed and collected

Nested !PARSE.KVLIST expressions are joined to the parent one:

!PARSE.KVLIST
- <...>
- !PARSE.KVLIST  # expression is joined to the parent one
  - key3: <...>
  - <...>
- key4: <...>

Example

Input string:

<141>May  9 10:00:00 myhost.com notice tmm1[22731]: User 'user' was logged in.

!PARSE.KVLIST
- '<'
- PRI: !PARSE.DIGITS
- '>'
- TIMESTAMP: !PARSE.DATETIME
                - month: !PARSE.MONTH 'short'
                - !PARSE.SPACES
                - day: !PARSE.DIGITS # Day
                - !PARSE.SPACES
                - hour: !PARSE.DIGITS # Hours
                - ':'
                - minute: !PARSE.DIGITS # Minutes
                - ':'
                - second: !PARSE.DIGITS # Seconds

- !PARSE.SPACES
- HOSTNAME: !PARSE.UNTIL ' '
- LEVEL: !PARSE.UNTIL ' '
- PROCESS.NAME: !PARSE.UNTIL '['
- PROCESS.PID: !PARSE.DIGITS
- ']:'
- !PARSE.SPACES
- MESSAGE: !PARSE.CHARS

Output:

[
    (PRI, 141),
    (TIMESTAMP, 140994182325993472),
    (HOSTNAME, myhost.com),
    (LEVEL, notice),
    (PROCESS.NAME, tmm1),
    (PROCESS.PID, 22731),
    (MESSAGE, User 'user' was logged in.)
]

`!PARSE.KV`¤

Parse a key and a value from a string into a key-value pair, optionally adding a prefix to the key.

Type: Combinator

Synopsis:

!PARSE.KV
- prefix: <...>
- key: <...>
- value: <...>
- <...> # optional elements

prefix is optional. If specified, the prefix will be added to the key.
key and value are required.

Tip

Use a combination of !PARSE.REPEAT and !PARSE.KV to parse repeated key-value pairs. (see examples)

Example

Input string: eventID= "1011"

!PARSE.KV
- key: !PARSE.UNTIL '='
- !PARSE.SPACE
- value: !PARSE.BETWEEN {what: '"'}

Output: (eventID, 1011)

Parse key and value with a specified prefix

Input string: eventID= "1011"

!PARSE.KV
- key: !PARSE.UNTIL {what: '='}
- prefix: SD.PARAM.
- !PARSE.SPACE
- value: !PARSE.BETWEEN {what: '"'}

Output: (SD.PARAM.eventID, 1011)

Usage together with !PARSE.REPEAT

Input string: devid="FEVM020000191439" vd="root" itime=1665629867

!PARSE.REPEAT
what: !PARSE.KV
    - !PARSE.OPTIONAL
        what: !PARSE.SPACE
    - key: !PARSE.UNTIL '='
    - value: !TRY
            - !PARSE.BETWEEN '"'
            - !PARSE.UNTIL { what: ' ', eof: true}

Output:

[
    (devid, FEVM020000191439),
    (vd, root),
    (itime, 1665629867)
]

`!PARSE.TUPLE`¤

Type: Combinator

Parse a list of values into a tuple.

The !PARSE.TUPLE expression iterates through its list of elements and collects the values into a tuple.

Synopsis:

!PARSE.TUPLE
- <...>
- <...>
- <...>

Example

Input string: Hello world!

!PARSE.TUPLE
- 'Hello'
- !PARSE.SPACE
- 'world'
- '!'

Output: ('Hello', ' ', 'world', '!')

`!PARSE.RECORD`¤

Parse a list of values into a record structure.

The !PARSE.RECORD expression iterates through its list of elements and collects the values into a record structure.

Type: Combinator

Synopsis:

!PARSE.RECORD
- <...>
- element1: <...>
- element2: <...>
- <...>

Example

Input string: <165>1

!PARSE.RECORD
- '<'
- severity: !PARSE.DIGITS
- '>'
- version: !PARSE.DIGITS
- ' '

Output: {'output.severity': 165, 'output.version': 1}

`!PARSE.REPEAT`¤

Parse a repeated pattern.

Type: Combinator.

Synopsis:

!PARSE.REPEAT
what: <expression>
min: <...>
max: <...>
exactly: <...>

If none of min, max, or exactly is specified, what is repeated as many times as possible.
exactly sets the exact number of repetitions.
min and max set the minimum and maximum number of repetitions.

Example

Input string: host:myhost;ip:192.0.0.1;user:root;

!PARSE.KVLIST
- !PARSE.REPEAT
    what: !PARSE.KV
        - key: !PARSE.UNTIL ':'
        - value: !PARSE.UNTIL ';'

This will repeat the !PARSE.KV expression as many times as possible.

Output:

[
    (host, myhost),
    (ip, 192.0.0.1),
    (user, root)
]

Parse an exact number of repetitions

Input string: hello hello hello Anna!

!PARSE.KVLIST
- !PARSE.REPEAT
    what: !PARSE.EXACTLY 'hello '
    exactly: 3
- NAME: !PARSE.UNTIL '!'

Output: [(NAME, Anna)]

Parse a number of repetitions between min and max

Input strings:

hello hello Anna!
hello hello hello Anna!
hello hello hello hello Anna!

!PARSE.KVLIST
- !PARSE.REPEAT
    what: !PARSE.EXACTLY 'hello '
    min: 2
    max: 4
- NAME: !PARSE.UNTIL '!'

Output: [(NAME, Anna)]
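
The effect of min, max and exactly can be modelled with a few lines of Python. This is a sketch of the semantics described above, not the SP-Lang implementation. In this model a parser is a function that takes the text and a position and returns the new position, or None on failure; the names `literal` and `repeat` are hypothetical.

```python
def literal(expected: str):
    """Parser that matches one exact string, similar in spirit to !PARSE.EXACTLY."""
    def parser(text: str, pos: int):
        return pos + len(expected) if text.startswith(expected, pos) else None
    return parser


def repeat(what, min_count=0, max_count=None, exactly=None):
    """Apply `what` repeatedly; fail if fewer than `min_count` repetitions match."""
    if exactly is not None:
        min_count = max_count = exactly

    def parser(text: str, pos: int):
        count = 0
        while max_count is None or count < max_count:
            next_pos = what(text, pos)
            # Stop on failure, and also on an empty match to avoid looping forever.
            if next_pos is None or next_pos == pos:
                break
            pos, count = next_pos, count + 1
        return pos if count >= min_count else None

    return parser


hello = literal("hello ")
text = "hello hello hello Anna!"
end = repeat(hello, exactly=3)(text, 0)
print(text[end:])  # the part left for the NAME parser
```

With min and max set to 2 and 4, all three input strings above are accepted, while a single `hello ` is rejected.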

`!PARSE.SEPARATED`¤

Parse a sequence with a separator.

Type: Combinator.

Synopsis:

!PARSE.SEPARATED
what: <...>
sep: <...>
min: <...>
max: <...>
end: <...>

min and max are optional.
end indicates if trailing separator is required. By default, it is optional.

Example

Input string: 0->1->2->3

!PARSE.SEPARATED
what: !PARSE.DIGITS
sep: !PARSE.EXACTLY {what: "->"}
min: 3

Output: [0, 1, 2, 3]

Note: the trailing separator is optional, so input string 0->1->2->3-> is also valid.

More examples

Parse between min and max values of what separated by sep, with a required trailing separator:

Input string: 11,22,33,44,55,66,

!PARSE.SEPARATED
what: !PARSE.DIGITS
sep: !PARSE.EXACTLY {what: ","}
end: True
min: 3
max: 7

Parse between min and max values of what separated by sep, without a trailing separator:

Input string: 0..1..2..3

!PARSE.SEPARATED
what: !PARSE.DIGITS
sep: !PARSE.EXACTLY {what: ".."}
end: False
min: 3
max: 5
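
The following Python model shows how min, max and end interact for digit sequences. It is a sketch under assumptions, not the SP-Lang implementation: `end=None` stands for the default (trailing separator optional), `end=True` requires it, and `end=False` is assumed to leave a trailing separator unconsumed. The name `parse_separated` is hypothetical.

```python
def parse_separated(text, sep, min_count=0, max_count=None, end=None):
    """Parse digit runs divided by `sep`; return (values, consumed_characters)."""
    values, pos = [], 0
    trailing = False  # True when the last thing consumed was a separator
    while max_count is None or len(values) < max_count:
        stop = pos
        while stop < len(text) and text[stop] in "0123456789":
            stop += 1
        if stop == pos:
            break  # no further value
        values.append(int(text[pos:stop]))
        pos, trailing = stop, False
        if not text.startswith(sep, pos):
            break
        pos, trailing = pos + len(sep), True
    if len(values) < min_count:
        raise ValueError(f"expected at least {min_count} values, got {len(values)}")
    if end is True and not trailing:
        raise ValueError("trailing separator is required")
    if end is False and trailing:
        pos -= len(sep)  # assumption: give the separator back
    return values, pos


print(parse_separated("0->1->2->3", "->", min_count=3))
print(parse_separated("11,22,33,44,55,66,", ",", min_count=3, max_count=7, end=True))
```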

`!PARSE.OPTIONAL`¤

Parse optional pattern.

Type: Combinator

The !PARSE.OPTIONAL expression tries to parse the input string using the specified parser. If the parser fails, the parsing position rolls back to where it was before the attempt.

Synopsis:

!PARSE.OPTIONAL
what: <...>

or shorter version:

!PARSE.OPTIONAL <...>

Example

Input strings:

mymachine myproc[10]: DHCPACK to
mymachine myproc[10]DHCPACK to

!PARSE.KVLIST
- HOSTNAME: !PARSE.UNTIL ' ' # mymachine
- TAG: !PARSE.UNTIL '[' # myproc
- PID: !PARSE.DIGITS  # 10
- !PARSE.EXACTLY ']'

# Parsing of optional characters
- !PARSE.OPTIONAL ':'
- !PARSE.OPTIONAL
    what: !PARSE.SPACE

- NAME: !PARSE.UNTIL ' '
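
The rollback behaviour can be modelled in Python as follows. This is a sketch of the idea, not the SP-Lang implementation. A parser here is a function that takes the text and a position and returns the new position, or None on failure; the names `literal`, `optional` and `sequence` are hypothetical.

```python
def literal(expected: str):
    """Parser that matches one exact string."""
    def parser(text: str, pos: int):
        return pos + len(expected) if text.startswith(expected, pos) else None
    return parser


def optional(what):
    """Try `what`; on failure keep the original position instead of failing."""
    def parser(text: str, pos: int):
        next_pos = what(text, pos)
        return pos if next_pos is None else next_pos
    return parser


def sequence(*parsers):
    """Run parsers one after another; fail as soon as one of them fails."""
    def parser(text: str, pos: int):
        for step in parsers:
            pos = step(text, pos)
            if pos is None:
                return None
        return pos
    return parser


# Models the part of the example after PID: ']' then optional ':' and ' '.
after_pid = sequence(literal("]"), optional(literal(":")), optional(literal(" ")))
for tail in ("]: DHCPACK to", "]DHCPACK to"):
    print(tail[after_pid(tail, 0):])
```

Both variants end up positioned at `DHCPACK`, so the following NAME parser sees the same input in either case.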

`!PARSE.TRIE`¤

Type: Combinator.

Parse using starting prefix.

The !PARSE.TRIE expression chooses one of the specified prefixes and parses the rest of the input string using the corresponding parser. If an empty prefix is specified, its parser is used when none of the other prefixes match.

Synopsis:

!PARSE.TRIE
- <prefix1>: <...>
- <prefix2>: <...>
...

Tip

Use !PARSE.TRIE to parse log messages that come in multiple variants.

Example

Input strings:

Received disconnect from 10.17.248.1 port 60290:11: disconnected by user
Disconnected from user root 10.17.248.1 port 60290

!PARSE.TRIE
- 'Received disconnect from ': !PARSE.KVLIST
                            - CLIENT_IP: !PARSE.UNTIL ' '
                            - 'port '
                            - CLIENT_PORT: !PARSE.DIGITS
                            - ':'
                            - !PARSE.CHARS
- 'Disconnected from user ': !PARSE.KVLIST
                            - USERNAME: !PARSE.UNTIL ' '
                            - CLIENT_IP: !PARSE.UNTIL ' '
                            - 'port '
                            - CLIENT_PORT: !PARSE.DIGITS

Specify a fallback with an empty prefix

Input string: Failed password for root from 218.92.0.190

!PARSE.TRIE
- 'Received disconnect from ': !PARSE.KVLIST
                            - CLIENT_IP: !PARSE.UNTIL ' '
                            - 'port '
                            - CLIENT_PORT: !PARSE.DIGITS
                            - ':'
                            - !PARSE.CHARS
- 'Disconnected from user ': !PARSE.KVLIST
                            - USERNAME: !PARSE.UNTIL ' '
                            - CLIENT_IP: !PARSE.UNTIL ' '
                            - 'port '
                            - CLIENT_PORT: !PARSE.DIGITS
- '': !PARSE.KVLIST
    - tags: ["trie-match-fail"]

Output: [(tags, ["trie-match-fail"])]
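
Prefix-based dispatch with an empty-prefix fallback can be modelled in Python like this. It is a sketch of the behaviour described above, not the SP-Lang implementation (which, as the name suggests, presumably uses a trie rather than a linear scan). Preferring the longest matching prefix is an assumption, and the name `trie_dispatch` is hypothetical.

```python
def trie_dispatch(text: str, branches: dict):
    """Pick the branch whose prefix starts `text` and apply it to the remainder."""
    candidates = [prefix for prefix in branches if prefix and text.startswith(prefix)]
    if candidates:
        prefix = max(candidates, key=len)  # assumption: longest prefix wins
    elif "" in branches:
        prefix = ""  # fallback branch
    else:
        return None
    return branches[prefix](text[len(prefix):])


branches = {
    "Received disconnect from ": lambda rest: ("received-disconnect", rest),
    "Disconnected from user ": lambda rest: ("disconnected", rest),
    "": lambda rest: ("trie-match-fail", rest),
}

print(trie_dispatch("Disconnected from user root 10.17.248.1 port 60290", branches))
print(trie_dispatch("Failed password for root from 218.92.0.190", branches))
```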

`!PARSE.CHARS.LOOKAHEAD`¤

Parse chars applying lookahead group.

Parse chars until specified lookahead group is found and stop before it.

Type: Combinator

Synopsis:

!PARSE.CHARS.LOOKAHEAD
what:
- <...>
- <...>
- <...>
...
eof: <true/false>

eof: indicates whether to parse to the end of the string when the what lookahead group is not found. Possible values: true (default) or false.

Example

Input string: Rule Name cs=Proxy

!PARSE.CHARS.LOOKAHEAD
what:
- " "
- !PARSE.LETTERS
- '='

Output: Rule Name
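
The lookahead group in this example (a space, then letters, then `=`) corresponds closely to a regular expression. The Python sketch below models the example with the standard `re` module; it is an illustration, not the SP-Lang implementation, and the name `chars_lookahead` is hypothetical.

```python
import re

# A space, one or more letters, and '=': the lookahead group from the example.
LOOKAHEAD_GROUP = re.compile(r" [A-Za-z]+=")


def chars_lookahead(text: str, eof: bool = True):
    """Return the characters in front of the first lookahead match.

    If there is no match, return the whole text when `eof` is true,
    otherwise return None to signal failure.
    """
    match = LOOKAHEAD_GROUP.search(text)
    if match:
        return text[: match.start()]
    return text if eof else None


print(chars_lookahead("Rule Name cs=Proxy"))  # prints "Rule Name"
```

Note that the space inside `Rule Name` does not stop the parser, because `Name` is not followed by `=`.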

Visual Programming

Visual programming in SP-Lang¤

SP-Lang lets users create expressions by manipulating expression elements graphically rather than by specifying them textually.

Example of the Syslog parser implemented in the visual SP-Lang:
