Combinator expressions
Overview
Combinators are functions that compose parsec expressions (parsers or other combinators) together. They specify how parsing is applied and what the output type is. They can be used for flow control of parsing (applying conditional or repeated expressions) and for lookahead searching in the input string.
Output selectors determine the type of output:
!PARSE.KVLIST: Parse a sequence of keys and values into a bag type.
!PARSE.KV: Parse a key and a value from the input string.
!PARSE.TUPLE: Parse into a tuple type.
!PARSE.RECORD: Parse into a record structure.
Flow control expressions run a sequence of parser expressions based on certain conditions:
!PARSE.REPEAT: Performs the same sequence of expressions multiple times, similar to a "for" statement in other languages.
!PARSE.SEPARATED: Parses a sequence of values divided by a separator.
!PARSE.OPTIONAL: Adds an optional parser function, similar to an "if/else" statement in other languages.
!PARSE.TRIE: Performs a sequence of expressions based on the input string prefix.
Lookahead expressions:
!PARSE.CHARS.LOOKAHEAD: Parse until a certain sequence of characters is found in the string.
!PARSE.KVLIST: Parse list of key-value pairs
Type: Combinator
Iterating through a list of elements, the !PARSE.KVLIST expression collects key-value pairs into a bag.
Synopsis:
!PARSE.KVLIST
- <...>
- key: <...>
Non-key elements are parsed, but not collected:
!PARSE.KVLIST
- <...> # parsed, but not collected
- key1: <...> # parsed and collected
- key2: <...> # parsed and collected
Nested !PARSE.KVLIST expressions are joined to the parent one:
!PARSE.KVLIST
- <...>
- !PARSE.KVLIST # expression is joined to the parent one
  - key3: <...>
  - <...>
- key4: <...>
Example
Input string:
<141>May 9 10:00:00 myhost.com notice tmm1[22731]: User 'user' was logged in.
!PARSE.KVLIST
- '<'
- PRI: !PARSE.DIGITS
- '>'
- TIMESTAMP: !PARSE.DATETIME
    - month: !PARSE.MONTH 'short'
    - !PARSE.SPACES
    - day: !PARSE.DIGITS # Day
    - !PARSE.SPACES
    - hour: !PARSE.DIGITS # Hours
    - ':'
    - minute: !PARSE.DIGITS # Minutes
    - ':'
    - second: !PARSE.DIGITS # Seconds
- !PARSE.SPACES
- HOSTNAME: !PARSE.UNTIL ' '
- LEVEL: !PARSE.UNTIL ' '
- PROCESS.NAME: !PARSE.UNTIL '['
- PROCESS.PID: !PARSE.DIGITS
- ']:'
- !PARSE.SPACES
- MESSAGE: !PARSE.CHARS
Output:
[
(PRI, 141),
(TIMESTAMP, 140994182325993472),
(HOSTNAME, myhost.com),
(LEVEL, notice),
(PROCESS.NAME, tmm1),
(PROCESS.PID, 22731),
(MESSAGE, User 'user' was logged in.)
]
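The collection rule described above (every element is parsed, only keyed elements are collected) can be modelled in a few lines of Python. This is a toy sketch, not the SP-Lang implementation: the helper names `exactly`, `digits`, `until` and `kvlist` are hypothetical, a parser is assumed to be a function `(text, pos) -> (value, new_pos)` that returns `None` on failure, and `until` is assumed to consume its stop character, as the example above suggests. Joining of nested lists is not modelled.

```python
# Toy model of !PARSE.KVLIST (hypothetical helpers, not the SP-Lang implementation).
# A parser is a function (text, pos) -> (value, new_pos), or None on failure.

def exactly(s):
    def parse(text, pos):
        return (s, pos + len(s)) if text.startswith(s, pos) else None
    return parse

def digits(text, pos):
    end = pos
    while end < len(text) and text[end].isdigit():
        end += 1
    return (int(text[pos:end]), end) if end > pos else None

def until(stop):
    # Assumption: the stop character is consumed but not returned.
    def parse(text, pos):
        idx = text.find(stop, pos)
        return (text[pos:idx], idx + len(stop)) if idx >= 0 else None
    return parse

def kvlist(*items):
    # An item is a bare parser (parsed, not collected)
    # or a (key, parser) pair (parsed and collected).
    def parse(text, pos):
        bag = []
        for item in items:
            key, parser = item if isinstance(item, tuple) else (None, item)
            result = parser(text, pos)
            if result is None:
                return None
            value, pos = result
            if key is not None:
                bag.append((key, value))
        return bag, pos
    return parse

parser = kvlist(
    exactly("<"),
    ("PRI", digits),
    exactly(">"),
    ("HOSTNAME", until(" ")),
    ("LEVEL", until(" ")),
)
result, _ = parser("<141>myhost.com notice ", 0)
print(result)  # [('PRI', 141), ('HOSTNAME', 'myhost.com'), ('LEVEL', 'notice')]
```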
!PARSE.KV: Parse key-value pair
Type: Combinator
Parse a key and a value from a string into a key-value pair, optionally adding a prefix to the key.
Synopsis:
!PARSE.KV
- prefix: <...>
- key: <...>
- value: <...>
- <...> # optional elements
prefix is optional. If specified, the prefix will be added to the key.
key and value are required.
Tip
Use a combination of !PARSE.REPEAT and !PARSE.KV to parse repeated key-value pairs (see the examples).
Example
Input string: eventID= "1011"
!PARSE.KV
- key: !PARSE.UNTIL '='
- !PARSE.SPACE
- value: !PARSE.BETWEEN {what: '"'}
Output: (eventID, 1011)
Parse key and value with a specified prefix
Input string: eventID= "1011"
!PARSE.KV
- key: !PARSE.UNTIL {what: '='}
  prefix: SD.PARAM.
- !PARSE.SPACE
- value: !PARSE.BETWEEN {what: '"'}
Output: (SD.PARAM.eventID, 1011)
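How the prefix ends up in the output pair can be sketched in Python. This is a simplified model under stated assumptions, not the SP-Lang implementation: `until`, `space`, `between` and `kv` are hypothetical helpers, optional elements are only allowed between the key and the value, and the value is returned as a string.

```python
# Toy model of !PARSE.KV with an optional prefix (hypothetical helpers).
# A parser is a function (text, pos) -> (value, new_pos), or None on failure.

def until(stop):
    def parse(text, pos):
        idx = text.find(stop, pos)
        return (text[pos:idx], idx + len(stop)) if idx >= 0 else None
    return parse

def space(text, pos):
    return (" ", pos + 1) if text.startswith(" ", pos) else None

def between(quote):
    # Value enclosed in a pair of `quote` characters.
    def parse(text, pos):
        if not text.startswith(quote, pos):
            return None
        end = text.find(quote, pos + len(quote))
        if end < 0:
            return None
        return text[pos + len(quote):end], end + len(quote)
    return parse

def kv(key, value, middle=(), prefix=""):
    # Parse the key, skip the in-between elements, then parse the value.
    def parse(text, pos):
        parsed_key = key(text, pos)
        if parsed_key is None:
            return None
        key_text, pos = parsed_key
        for parser in middle:
            skipped = parser(text, pos)
            if skipped is None:
                return None
            pos = skipped[1]
        parsed_value = value(text, pos)
        if parsed_value is None:
            return None
        return (prefix + key_text, parsed_value[0]), parsed_value[1]
    return parse

parser = kv(until("="), between('"'), middle=[space], prefix="SD.PARAM.")
pair, _ = parser('eventID= "1011"', 0)
print(pair)  # ('SD.PARAM.eventID', '1011')
```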
Usage together with !PARSE.REPEAT
Input string: devid="FEVM020000191439" vd="root" itime=1665629867
!PARSE.REPEAT
what: !PARSE.KV
  - !PARSE.OPTIONAL
    what: !PARSE.SPACE
  - key: !PARSE.UNTIL '='
  - value: !TRY
      - !PARSE.BETWEEN '"'
      - !PARSE.UNTIL { what: ' ', eof: true}
Output:
[
(devid, FEVM020000191439),
(vd, root),
(itime, 1665629867)
]
!PARSE.TUPLE: Parse list of values to tuple
Type: Combinator
Iterating through a list of elements, the !PARSE.TUPLE expression collects values into a tuple.
Synopsis:
!PARSE.TUPLE
- <...>
- <...>
- <...>
Example
Input string: Hello world!
!PARSE.TUPLE
- 'Hello'
- !PARSE.SPACE
- 'world'
- '!'
Output: ('Hello', ' ', 'world', '!')
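Unlike a key-value list, a tuple keeps every parsed value in order, including unnamed ones. A minimal Python sketch of that behaviour follows; `exactly`, `space` and `tuple_of` are hypothetical names used only for illustration, not SP-Lang functions.

```python
# Toy model of !PARSE.TUPLE (hypothetical helpers, not the SP-Lang implementation).
# A parser is a function (text, pos) -> (value, new_pos), or None on failure.

def exactly(s):
    def parse(text, pos):
        return (s, pos + len(s)) if text.startswith(s, pos) else None
    return parse

def space(text, pos):
    return (" ", pos + 1) if text.startswith(" ", pos) else None

def tuple_of(*parsers):
    # Run the parsers in order and keep every value.
    def parse(text, pos):
        values = []
        for parser in parsers:
            result = parser(text, pos)
            if result is None:
                return None
            value, pos = result
            values.append(value)
        return tuple(values), pos
    return parse

parser = tuple_of(exactly("Hello"), space, exactly("world"), exactly("!"))
result, _ = parser("Hello world!", 0)
print(result)  # ('Hello', ' ', 'world', '!')
```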
!PARSE.RECORD: Parse list of values to record structure
Iterating through a list of elements, the !PARSE.RECORD expression collects values into a record structure.
Type: Combinator
Synopsis:
!PARSE.RECORD
- <...>
- element1: <...>
- element2: <...>
- <...>
Example
Input string: <165>1
!PARSE.RECORD
- '<'
- severity: !PARSE.DIGITS
- '>'
- version: !PARSE.DIGITS
- ' '
Output: {'output.severity': 165, 'output.version': 1}
!PARSE.REPEAT: Parse a repeated pattern
Type: Combinator.
Synopsis:
!PARSE.REPEAT
what: <expression>
min: <...>
max: <...>
exactly: <...>
- If none of min, max, exactly is specified, what will be repeated as many times as possible.
- exactly determines the exact number of repetitions.
- min and max set the minimal and maximal number of repetitions.
Example
Input string: host:myhost;ip:192.0.0.1;user:root;
!PARSE.KVLIST
- !PARSE.REPEAT
  what: !PARSE.KV
    - key: !PARSE.UNTIL ':'
    - value: !PARSE.UNTIL ';'
This will repeat the !PARSE.KV expression as many times as possible.
Output:
[
(host, myhost),
(ip, 192.0.0.1),
(user, root)
]
Parse an exact number of repetitions
Input string: hello hello hello Anna!
!PARSE.KVLIST
- !PARSE.REPEAT
  what: !PARSE.EXACTLY 'hello '
  exactly: 3
- NAME: !PARSE.UNTIL '!'
Output: [(NAME, Anna)]
Parse with a minimal and maximal number of repetitions
Input strings:
hello hello Anna!
hello hello hello Anna!
hello hello hello hello Anna!
!PARSE.KVLIST
- !PARSE.REPEAT
  what: !PARSE.EXACTLY 'hello '
  min: 2
  max: 4
- NAME: !PARSE.UNTIL '!'
Output: [(NAME, Anna)]
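The interplay of min, max and exactly can be illustrated with a small Python model. It is a sketch under assumptions, not the SP-Lang implementation: `repeat` and `exactly` are hypothetical helpers, the repetition is greedy up to the maximum, and the parser fails when fewer than the minimum number of repetitions match.

```python
# Toy model of !PARSE.REPEAT (hypothetical helpers, not the SP-Lang implementation).
# A parser is a function (text, pos) -> (value, new_pos), or None on failure.

def exactly(s):
    def parse(text, pos):
        return (s, pos + len(s)) if text.startswith(s, pos) else None
    return parse

def repeat(what, min_count=None, max_count=None, exactly_count=None):
    if exactly_count is not None:
        min_count = max_count = exactly_count
    def parse(text, pos):
        values = []
        while max_count is None or len(values) < max_count:
            result = what(text, pos)
            # Stop on failure, or when nothing was consumed (avoids an endless loop).
            if result is None or result[1] == pos:
                break
            value, pos = result
            values.append(value)
        if min_count is not None and len(values) < min_count:
            return None
        return values, pos
    return parse

hello = exactly("hello ")

three = repeat(hello, exactly_count=3)
print(three("hello hello hello Anna!", 0)[1])  # 18, the position of "Anna!"

ranged = repeat(hello, min_count=2, max_count=4)
print(ranged("hello Anna!", 0))  # None, only one repetition was found
print(ranged("hello hello hello hello Anna!", 0)[1])  # 24
```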
!PARSE.SEPARATED: Parse a sequence with a separator
Type: Combinator.
Synopsis:
!PARSE.SEPARATED
what: <...>
sep: <...>
min: <...>
max: <...>
end: <...>
min and max are optional.
end indicates if a trailing separator is required. By default, it is optional.
Example
Input string: 0->1->2->3
!PARSE.SEPARATED
what: !PARSE.DIGITS
sep: !PARSE.EXACTLY {what: "->"}
min: 3
Output: [0, 1, 2, 3]
Note: the trailing separator is optional, so the input string 0->1->2->3-> is also valid.
More examples
Parse what values separated by sep in the [min;max] interval, trailing separator is required:
Input string: 11,22,33,44,55,66,
!PARSE.SEPARATED
what: !PARSE.DIGITS
sep: !PARSE.EXACTLY {what: ","}
end: True
min: 3
max: 7
Parse what values separated by sep in the [min;max] interval, trailing separator is not present:
Input string: 0..1..2..3
!PARSE.SEPARATED
what: !PARSE.DIGITS
sep: !PARSE.EXACTLY {what: ".."}
end: False
min: 3
max: 5
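The three modes of the trailing separator can be modelled in Python as follows. This is a sketch, not the SP-Lang implementation: `separated`, `digits` and `exactly` are hypothetical helpers, and treating end: False as "leave a trailing separator unconsumed" is an assumption, since the text above only states that the separator is not present in that case.

```python
# Toy model of !PARSE.SEPARATED (hypothetical helpers, not the SP-Lang implementation).
# A parser is a function (text, pos) -> (value, new_pos), or None on failure.

def exactly(s):
    def parse(text, pos):
        return (s, pos + len(s)) if text.startswith(s, pos) else None
    return parse

def digits(text, pos):
    end = pos
    while end < len(text) and text[end].isdigit():
        end += 1
    return (int(text[pos:end]), end) if end > pos else None

def separated(what, sep, min_count=None, max_count=None, end=None):
    # end=None: trailing separator optional, end=True: required,
    # end=False: a trailing separator is left unconsumed (assumption).
    def parse(text, pos):
        values = []
        trailing = False   # True when the last thing consumed was a separator
        sep_start = pos    # where the last consumed separator began
        while max_count is None or len(values) < max_count:
            item = what(text, pos)
            if item is None:
                break
            value, pos = item
            values.append(value)
            trailing = False
            separator = sep(text, pos)
            if separator is None:
                break
            sep_start, pos, trailing = pos, separator[1], True
        if min_count is not None and len(values) < min_count:
            return None
        if end is True and not trailing:
            return None
        if end is False and trailing:
            pos = sep_start
        return values, pos
    return parse

arrows = separated(digits, exactly("->"), min_count=3)
print(arrows("0->1->2->3", 0)[0])    # [0, 1, 2, 3]
print(arrows("0->1->2->3->", 0)[0])  # [0, 1, 2, 3]

commas = separated(digits, exactly(","), min_count=3, max_count=7, end=True)
print(commas("11,22,33,44,55,66,", 0)[0])  # [11, 22, 33, 44, 55, 66]
print(commas("11,22,33", 0))               # None, the trailing separator is missing
```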
!PARSE.OPTIONAL: Parse optional pattern
Type: Combinator
The !PARSE.OPTIONAL expression tries to parse the input string using the specified parser.
If the parser fails, the starting position rolls back to the initial one.
Synopsis:
!PARSE.OPTIONAL
what: <...>
or shorter version:
!PARSE.OPTIONAL <...>
Example
Input strings:
mymachine myproc[10]: DHCPACK to
mymachine myproc[10]DHCPACK to
!PARSE.KVLIST
- HOSTNAME: !PARSE.UNTIL ' ' # mymachine
- TAG: !PARSE.UNTIL '[' # myproc
- PID: !PARSE.DIGITS # 10
- !PARSE.EXACTLY ']'
# Parsing of optional characters
- !PARSE.OPTIONAL ':'
- !PARSE.OPTIONAL
  what: !PARSE.SPACE
- NAME: !PARSE.UNTIL ' '
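The rollback behaviour can be sketched in Python: when the wrapped parser fails, the optional wrapper succeeds without consuming anything. The helpers `optional`, `sequence`, `exactly` and `until` are hypothetical and only model the idea; they are not SP-Lang functions. The sketch parses just the tail of the two input strings above.

```python
# Toy model of !PARSE.OPTIONAL (hypothetical helpers, not the SP-Lang implementation).
# A parser is a function (text, pos) -> (value, new_pos), or None on failure.

def exactly(s):
    def parse(text, pos):
        return (s, pos + len(s)) if text.startswith(s, pos) else None
    return parse

def until(stop):
    def parse(text, pos):
        idx = text.find(stop, pos)
        return (text[pos:idx], idx + len(stop)) if idx >= 0 else None
    return parse

def optional(what):
    def parse(text, pos):
        result = what(text, pos)
        if result is None:
            return None, pos  # no value, position stays at the initial one
        return result
    return parse

def sequence(*parsers):
    def parse(text, pos):
        values = []
        for parser in parsers:
            result = parser(text, pos)
            if result is None:
                return None
            value, pos = result
            values.append(value)
        return values, pos
    return parse

tail = sequence(
    exactly("]"),
    optional(exactly(":")),
    optional(exactly(" ")),
    until(" "),
)
for line in ("]: DHCPACK to", "]DHCPACK to"):
    values, _ = tail(line, 0)
    print(values[-1])  # DHCPACK for both inputs
```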
!PARSE.TRIE: Parse using starting prefix
Type: Combinator.
The !PARSE.TRIE expression chooses one of the specified prefixes and parses the rest of the input string using the corresponding parser.
If an empty prefix is specified, the corresponding parser is used when none of the other prefixes match.
Synopsis:
!PARSE.TRIE
- <prefix1>: <...>
- <prefix2>: <...>
...
Tip
Use !PARSE.TRIE to parse log messages that come in multiple variants.
Example
Input strings:
Received disconnect from 10.17.248.1 port 60290:11: disconnected by user
Disconnected from user root 10.17.248.1 port 60290
!PARSE.TRIE
- 'Received disconnect from ': !PARSE.KVLIST
    - CLIENT_IP: !PARSE.UNTIL ' '
    - 'port '
    - CLIENT_PORT: !PARSE.DIGITS
    - ':'
    - !PARSE.CHARS
- 'Disconnected from user ': !PARSE.KVLIST
    - USERNAME: !PARSE.UNTIL ' '
    - CLIENT_IP: !PARSE.UNTIL ' '
    - 'port '
    - CLIENT_PORT: !PARSE.DIGITS
Specify an empty prefix as a fallback
Input string: Failed password for root from 218.92.0.190
!PARSE.TRIE
- 'Received disconnect from ': !PARSE.KVLIST
    - CLIENT_IP: !PARSE.UNTIL ' '
    - 'port '
    - CLIENT_PORT: !PARSE.DIGITS
    - ':'
    - !PARSE.CHARS
- 'Disconnected from user ': !PARSE.KVLIST
    - USERNAME: !PARSE.UNTIL ' '
    - CLIENT_IP: !PARSE.UNTIL ' '
    - 'port '
    - CLIENT_PORT: !PARSE.DIGITS
- '': !PARSE.KVLIST
    - tags: ["trie-match-fail"]
Output: [(tags, ["trie-match-fail"])]
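Prefix dispatch with an empty-prefix fallback can be modelled in Python as shown here. The sketch uses a linear scan over the prefixes rather than an actual trie, and preferring the longest matching prefix is an assumption. `trie`, `rest_as` and `constant` are hypothetical helpers, not SP-Lang functions.

```python
# Toy model of !PARSE.TRIE (hypothetical helpers, not the SP-Lang implementation).
# A parser is a function (text, pos) -> (value, new_pos), or None on failure.

def rest_as(key):
    # Stand-in branch parser: collect the rest of the input under `key`.
    def parse(text, pos):
        return [(key, text[pos:])], len(text)
    return parse

def constant(value):
    # Stand-in fallback parser: return a fixed value, consume nothing.
    def parse(text, pos):
        return value, pos
    return parse

def trie(branches):
    # branches maps a prefix to the parser for the rest of the input;
    # the empty prefix "" is the fallback branch.
    def parse(text, pos):
        prefixes = sorted((p for p in branches if p), key=len, reverse=True)
        for prefix in prefixes:  # assumption: the longest prefix wins
            if text.startswith(prefix, pos):
                return branches[prefix](text, pos + len(prefix))
        if "" in branches:
            return branches[""](text, pos)
        return None
    return parse

parser = trie({
    "Received disconnect from ": rest_as("CLIENT"),
    "Disconnected from user ": rest_as("USER_AND_CLIENT"),
    "": constant([("tags", ["trie-match-fail"])]),
})
print(parser("Failed password for root from 218.92.0.190", 0)[0])
# [('tags', ['trie-match-fail'])]
print(parser("Disconnected from user root 10.17.248.1 port 60290", 0)[0])
# [('USER_AND_CLIENT', 'root 10.17.248.1 port 60290')]
```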
!PARSE.CHARS.LOOKAHEAD: Parse chars applying lookahead group
Type: Combinator
Parse characters until the specified lookahead group is found, and stop before it.
Synopsis:
!PARSE.CHARS.LOOKAHEAD
what:
- <...>
- <...>
- <...>
...
eof: <true/false>
eof - indicates whether to parse to the end of the string if the what lookahead group is not found. Possible values: true (default) or false.
Example
Input string: Rule Name cs=Proxy
!PARSE.CHARS.LOOKAHEAD
what:
- " "
- !PARSE.LETTERS
- '='
Output: Rule Name
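The lookahead search can be sketched in Python as a scan that tests the whole group at each position and stops at the first position where it matches. This is a model under assumptions, not the SP-Lang implementation: `chars_lookahead`, `exactly` and `letters` are hypothetical helpers, and returning `None` when eof is false and the group is never found is an assumption.

```python
# Toy model of !PARSE.CHARS.LOOKAHEAD (hypothetical helpers, not the SP-Lang implementation).
# A parser is a function (text, pos) -> (value, new_pos), or None on failure.

def exactly(s):
    def parse(text, pos):
        return (s, pos + len(s)) if text.startswith(s, pos) else None
    return parse

def letters(text, pos):
    end = pos
    while end < len(text) and text[end].isalpha():
        end += 1
    return (text[pos:end], end) if end > pos else None

def chars_lookahead(what, eof=True):
    def group_matches(text, pos):
        # The group matches when all of its parsers succeed in sequence.
        for parser in what:
            result = parser(text, pos)
            if result is None:
                return False
            pos = result[1]
        return True

    def parse(text, pos):
        for i in range(pos, len(text) + 1):
            if group_matches(text, i):
                return text[pos:i], i  # stop before the lookahead group
        # Group not found: take the rest of the string only if eof is true.
        return (text[pos:], len(text)) if eof else None
    return parse

parser = chars_lookahead([exactly(" "), letters, exactly("=")])
value, position = parser("Rule Name cs=Proxy", 0)
print(value)  # Rule Name
```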