Declarations for defining baselines¶
Baseline declarations are loaded from the Library folder specified in the configuration, such as /Baseliners.
Note
The Baseliner uses /Schemas/ECS.yaml by default, so /Schemas/ECS.yaml must also be present in the Library.
Declaration¶
This is an example of a baseline definition, located in the /Baseliners folder in the Library:
```yaml
---
define:
  name: Dataset
  description: Creates baseline for each dataset and trigger alarms if the actual number deviates
  type: baseliner

baseline:
  region: Czech Republic
  period: day
  learning: 4
  classes: [workdays, weekends, holidays]
  max_history_days: 180  # (optional) Maximum days of counter history to retain

evaluate:
  key: event.dataset
  timestamp: "@timestamp"

analyze:
  when: 1h  # How often to perform the analysis (1 hour is the default)
  test:
    !AND
    - !LT
      - !ARG VALUE
      - 1
    - !GT
      - !ARG MEAN
      - 10.0

trigger:
  - event:
      # Threat description
      # https://www.elastic.co/guide/en/ecs/master/ecs-threat.html
      threat.framework: "MITRE ATT&CK"
      threat.indicator.sightings: !ITEM EVENT value
      threat.indicator.confidence: "High"
      threat.indicator.name: !ITEM EVENT dimension

  - notification:
      type: email
      to: ["myemail@example.co"]
      template: "/Templates/Email/Notification_baseliner_dimension.md"
      variables:
        name: "Logs are not coming to the dataset within the given UTC hour."
        dimension: !ITEM EVENT dimension
        hour: !ITEM EVENT hour
```
Sections¶
baseline¶
```yaml
baseline:  # (1)
  region: Czech Republic  # (2)
  period: day  # (3)
  learning: 4  # (4)
  classes: [workdays, weekends, holidays]  # (5)
  max_history_days: 180  # (6)
```
1. Defines how the given baseline is built.
2. Defines in which region the activity is happening (for calculating holidays and so on).
3. Defines the timespan for the baseline. The period can be either `day` or `week`.
4. Defines the number of periods (here, days) that occur from the baseliner beginning to receive input until the user can see the baseline analysis. Additional details below.
5. Defines which days in the week to monitor. Classes can include any or all of: `workdays`, `weekends`, and `holidays`.
6. `max_history_days` (optional): Maximum number of days of counter history to retain for baseline building. Counter records older than this limit are automatically deleted to prevent memory leaks and excessive storage usage. If not specified, defaults to the configuration value `max_baseline_history_days` (default: 60 days if not configured). This parameter helps manage long-term data retention while ensuring sufficient history for accurate baseline calculations.
learning
The learning field defines the learning phase.
The learning phase is the time from the first occurrence of the dimension value in the input of the Baseliner instance until the point when the baseline is shown to the user and the analysis takes place. In the declaration, learning is the number of periods. The learning phase is calculated separately for holidays, weekends and working days. Baselines are rebuilt overnight (housekeeping).
In this example, the period is day, so learning is 4 days. Considering the calendar, a learning phase of 4 days beginning on Friday means 4 working days, and thus ends on Wednesday night.
max_history_days
The max_history_days field defines the maximum number of days of counter history to retain for baseline building. This parameter helps prevent memory leaks and excessive storage usage by automatically deleting counter records older than the specified limit.
- Default behavior: If not specified in the baseline declaration, the system uses the configuration value `max_baseline_history_days` from the baseliner configuration (default: 60 days if not configured).
- Automatic cleanup: During baseline building (typically overnight), counter records older than `max_history_days` are automatically deleted from MongoDB.
- Impact on baseline: Only counter records within the `max_history_days` window are used for baseline calculations. Older data is excluded to keep the baseline relevant to recent activity patterns.
- Recommendation: Set this value based on your baseline requirements. For example:
    - 60-90 days: Suitable for most use cases; provides a good balance between history and performance
    - 180 days (6 months): For baselines that need longer-term patterns, such as seasonal variations
    - 30 days: For high-volume environments where storage is a concern
Storage Management
The max_history_days parameter is particularly important for high-volume baselines or long-running systems. Without this limit, counter records can accumulate indefinitely, leading to increased storage usage and slower baseline calculations.
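If you prefer to set the global default rather than overriding each baseline, use the `max_baseline_history_days` configuration value mentioned above. As a sketch, assuming the usual INI-style service configuration (the `[baseliner]` section name is an assumption here; check your deployment's configuration reference):

```ini
[baseliner]
; Global default retention for counter history, in days.
; Individual baselines can override it with max_history_days.
max_baseline_history_days=90
```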
logsource (optional)¶
The logsource section specifies the types of event lanes that incoming events should be read from. It is used for categorization of the log source and is derived from Sigma rules.
```yaml
logsource:
  vendor:
    - microsoft
  product:
    - windows
```
The logsource section can contain:
- `vendor`: The vendor of the log source (e.g., `microsoft`, `linux`, `cisco`)
- `product`: The product name (e.g., `windows`, `exchange`, `fortigate`)
- `service` (optional): The service name (e.g., `syslog`, `audit`, `activitylogs`)
Each field accepts a list of values. Events from event lanes that match any of the specified vendor, product, or service values will be considered for the baseline.
Example: Windows Events Baseline
```yaml
logsource:
  vendor:
    - microsoft
  product:
    - windows
```
This configuration will process events from event lanes that are categorized as Microsoft Windows logs.
Example: Multiple Products
```yaml
logsource:
  vendor:
    - microsoft
  product:
    - windows
    - exchange
    - office365
```
This configuration will process events from Microsoft Windows, Exchange, or Office 365 event lanes.
signal (optional)¶
The optional signal section controls whether the baseline sends a signal to Alert Management and how tickets are grouped. Set `default: false` in the signal section when the baseline is for analysis or correlation input only (no direct ticket creation). Use the `grouping` option to set which attributes are used for ticket grouping. See Signal for details.
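For instance, a baseline intended purely as correlation input could disable direct ticket creation. A minimal sketch based on the options named above (see the Signal chapter for the full grouping syntax):

```yaml
signal:
  default: false  # analysis/correlation input only; no ticket is created directly
```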
predicate (optional)¶
The predicate section filters incoming events to be considered as activity in the baseline.
Write filters with TeskaLabs SP-Lang. Visit Predicates or the SP-Lang documentation for details.
Combining logsource and predicate
The logsource section filters events at the event lane level (which event lanes to read from), while the predicate section filters individual events within those lanes. Both filters are applied, so an event must match both the logsource criteria and the predicate expression to be included in the baseline.
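As a sketch, a baseline that reads only Microsoft Windows event lanes and counts only failed events might combine the two filters like this (the `event.outcome` field and its `"failure"` value are illustrative, not required by Baseliner):

```yaml
logsource:
  vendor:
    - microsoft
  product:
    - windows

predicate:
  !EQ
  - !ITEM EVENT event.outcome
  - "failure"
```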
evaluate¶
This section specifies which attributes from the event are going to be used in the baseline build.
```yaml
evaluate:
  key: event.dataset  # (1)
  timestamp: "@timestamp"  # (2)
```
1. Specifies the attribute/entity to monitor.
2. Specifies the attribute in which the time dimension of the event activity is stored.
analyze¶
The test section in analyze specifies when to run the trigger, if the actual activity (!ARG VALUE) deviates from the baseline. Write tests in SP-Lang.
```yaml
analyze:
  test:
    !AND  # (1)
    - !LT  # (2)
      - !ARG VALUE  # (3)
      - 1
    - !GT  # (4)
      - !ARG MEAN  # (5)
      - 10.0
```
1. All expressions nested under `!AND` must be true for the test to pass. Here, if the value is less than 1 and the mean is greater than 10, the trigger is run.
2. "Less than"
3. Get (`!ARG`) the value (`VALUE`). If the value is less than `1` as specified, the `!LT` expression is true.
4. "Greater than"
5. Get (`!ARG`) the mean (`MEAN`). If the mean is greater than `10.0` as specified, the `!GT` expression is true.
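In plain terms, the test passes only when both comparisons hold. A minimal Python sketch of the same boolean logic (this illustrates only the semantics, not how Baseliner evaluates SP-Lang):

```python
def baseline_test(value: int, mean: float) -> bool:
    """!AND of !LT(VALUE, 1) and !GT(MEAN, 10.0)."""
    return value < 1 and mean > 10.0

# A dataset that usually averages 25 events per hour but received none this hour:
print(baseline_test(value=0, mean=25.0))  # True: the trigger runs

# A quiet dataset (mean 5) that received no events is ignored:
print(baseline_test(value=0, mean=5.0))   # False
```

The mean condition prevents alarms for dimensions that are normally near-silent anyway.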
The following attributes are available, used in SP-Lang notation:
```
TENANT: "str"
VALUE: "ui64"
STDEV: "fp64"
MEAN: "fp64"
MEDIAN: "fp64"
VARIANCE: "fp64"
MIN: "ui64"
MAX: "ui64"
SUM: "ui64"
HOUR: "ui64"
KEY: "str"
CLASS: "str"
```
The `when` option defines how often to perform the analysis.
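For example, the options can be combined so that the analysis runs more often and the test compares the value against a statistical band. This is a sketch: it assumes the standard SP-Lang arithmetic expressions `!SUB` and `!MUL`, and the `30m` interval notation is an assumption based on the `1h` default:

```yaml
analyze:
  when: 30m  # run the analysis every 30 minutes instead of the default 1 hour
  test:
    # True when the value drops more than three standard deviations below the mean
    !LT
    - !ARG VALUE
    - !SUB
      - !ARG MEAN
      - !MUL
        - 3.0
        - !ARG STDEV
```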
trigger¶
The trigger section defines the activity that is triggered to run after a successful analysis. (More about triggers.)
Baseliner creates events
Upon every analysis (every hour), Baseliner creates an event to summarize its analysis. These Baseliner-created events are available to use (as EVENT) with expressions such as !ARG and !ITEM, meaning you can pull values from the events for your trigger activities.
These Baseliner-created events include the fields:
- `tenant`: The name of the tenant the baseline belongs to.
- `dimension`: The dimension the baseline belongs to, as specified in evaluate.
- `class`: The class the baseline was calculated from. Options include: `workdays`, `weekends`, and `holidays`.
- `hour`: The number of the UTC hour the analysis happened in.
- `value`: The value of the current counter of events for the given UTC hour.
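For illustration, the payload of such a Baseliner-created event could look like this (all values are made up; the dimension value comes from the `key` configured in evaluate):

```yaml
tenant: mytenant
dimension: linux-syslog   # value of the monitored key, here event.dataset
class: workdays
hour: 14                  # the 14:00-15:00 UTC hour
value: 0                  # no events arrived in this hour
```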
Notification trigger¶
A notification trigger sends a message, such as an email. See Email notifications for more details about sending email notifications and using email templates.
An example of a notification trigger:
```yaml
trigger:
  - notification:
      type: email  # (1)
      to: ["myemail@example.co"]  # (2)
      template: "/Templates/Email/Notification_baseliner_dimension.md"  # (3)
      variables:  # (4)
        name: "Logs are not coming to the dataset within the given UTC hour."
        dimension: !ITEM EVENT dimension  # (5)
        hour: !ITEM EVENT hour
```
1. Specifies an email notification.
2. Recipient address.
3. Filepath to the email template in the LogMan.io Library.
4. Begins the section that gives directions for how to fill the blank fields from the email template. The blank fields in the template used in this example are `name`, `dimension`, and `hour`.
5. Uses SP-Lang to get information (`!ITEM`) from the Baseliner-created `EVENT` (detailed above). In this case, the template field `dimension` will be filled with the value of `dimension` taken from the Baseliner-created event.
Event trigger¶
You can use an event trigger to create a log or event, which you'll be able to see in the TeskaLabs LogMan.io UI.
Example of an event trigger:
```yaml
- event:  # (1)
    threat.framework: "MITRE ATT&CK"
    threat.indicator.sightings: !ITEM EVENT value
    threat.indicator.confidence: "High"
    threat.indicator.name: !ITEM EVENT dimension
```
1. This new event is a threat description using the threat fields from the Elastic Common Schema (ECS).
Analysis in UI¶
By default, the LogMan.io UI provides displays of analyses for user and host.
Specify the analysis in the schema (default: /Schemas/ECS.yaml) like this:
```yaml
host.id:
  type: "str"
  analysis: host
user.id:
  type: "str"
  analysis: user
...
```
If the tenant is then configured to use this schema (ECS by default), the host.id and user.id fields in Discover will show a link to the given baseline.
Analysis host uses the baseline named Host by default:
```yaml
---
define:
  name: Host
```
Analysis user uses the baseline named User by default:
```yaml
---
define:
  name: User
```
If a specific analysis cannot locate its associated baseline, the UI will display an empty screen for that analysis.
Note
Both baselines needed for analysis are distributed as part of the LogMan.io Common Library.