Skip to main content

AI Adaptive

info

AI adaptive alerting automatically establishes metric baselines based on historical data and dynamically calculates normal fluctuation ranges, eliminating the need for manual fixed threshold settings. When metrics deviate from the baseline beyond a preset percentage, it triggers alerts, suitable for monitoring metrics with periodic fluctuation patterns (such as CPU usage, request volume, etc.).

Quick Start

Step 1: Enter the New Page

Enter Alert Rules, click New Alert Rule, select AI adaptive detection type and enter the configuration page.

Step 2: Configure Detection Rules

  1. Select the resource domain in Effective Scope
  2. In Metric Selection, select the metric to monitor, such as CPU usage, set the aggregation method (such as latest value) and add grouping dimension (such as host)
  3. Add Filter Conditions as needed (such as host name = specific host) to narrow the monitoring scope
  4. In Trigger Conditions, set the comparison method for the prediction data interval, and fill in the percentage thresholds for deviation from the baseline at each level
  5. Adjust the fluctuation tolerance multiple in Algorithm Configuration as needed

Step 3: Fill in Alert Content

  1. Fill in Alert Title, supports using variables like ${host.customizedName}
  2. Edit Notification Content as needed, or keep the default template
  3. Select Notification Strategy; if not created, click Create Notification Strategy

Step 4: Set Effective Time and Save

  1. Select Effective Time (default 7×24 hours full day)
  2. Confirm Start/Stop Status is enabled
  3. Click Save to complete creation; if you need to reuse, click Save to Custom Template

Feature Description

1777023829137

Detection Rules

Basic Configuration

FieldRequiredDescription
Effective ScopeYesSelect the resource domain monitored by this rule, the rule only applies to data within the selected scope
Metric SelectionYesSelect the monitored metric and aggregation method (such as average, maximum), and can precisely target monitoring objects (such as specific hosts) through filter conditions
info

The Detection Data chart at the top of the page will display the historical trend of the selected metric in real time, as well as the baseline range calculated by AI (blue interval). It is recommended to confirm whether the baseline meets expectations through the chart before saving, to assist in calibrating threshold settings.

Trigger Conditions

In AI adaptive mode, the threshold is not a fixed value, but a deviation percentage relative to the AI baseline.

FieldRequiredDescription
Prediction Interval DurationYesWithin the last N minutes, trigger if the predicted data interval of the metric meets the condition; determines the length of the detection time window
Comparison MethodYesSupports operators like >, <, etc., combined with deviation percentages at each level to judge whether to trigger
Critical ThresholdNoIf the percentage of metric value higher than baseline threshold > set value, generateCritical level alert
Error ThresholdNoIf the percentage of metric value higher than baseline threshold > set value, generateError level alert
Warning ThresholdNoIf the percentage of metric value higher than baseline threshold > set value, generateWarning level alert
Normal RecoveryNoAfter N consecutive detections with no events, the alert automatically returns toNormal state, default 3 times

Advanced Configuration

FieldRequiredDescription
Algorithm ConfigurationNoOn top of the baseline, additionally tolerate changes within N times the normal signal fluctuation range. The larger the value, the more tolerant of glitches and fewer false alarms; the smaller the value, the more sensitive to anomalies. Default is 1 time
Data GapNoAfter enabling, when there isno metric data at all within the detection interval, it also triggers an alert, suitable for collection scenarios that must continuously report. Default Off
Data DelayNoAfter enabling, the query time window is shifted back by the specified duration to avoid missed reports due to data storage delay. DefaultEnabled, offset 1 minute
Aggregation RuleNoWhen there are multiple time series under the grouping dimension, define the aggregation granularity. Default aggregation by host, each host independently alerts; aggregation by network area, each network area generates an alert containing multiple hosts
tip

It is recommended to start with the default value of 1 for the fluctuation tolerance multiple in Algorithm Configuration, and gradually adjust based on actual alert conditions: increase appropriately if there are many false alarms, and decrease appropriately if there are many missed alarms.

Alert Content

FieldRequiredDescription
Alert TitleYesThe title displayed when the alert is triggered, supports dynamic filling using variables
Notification ContentNoThe body sent to recipients, supports rich text editing and variable insertion. If not filled, the default template is used
Notification StrategyNoSpecify who to notify and through which channel after the alert is triggered. When not configured, alerts are only recorded, not sent
LabelsNoAdd custom labels to rules for easy classification, filtering, and batch management

Status & Effective Time

FieldRequiredDescription
Effective TimeYesAll Time (7×24 hours): The rule always runs; Periodic Time: Effective repeatedly by day of week; Custom Time: Specify specific time periods. Default All Time
Start/Stop StatusYesWhen enabled, the rule normally detects and generates alerts; when disabled, the rule pauses and does not generate any detection or notification. DefaultEnabled

Common Scenarios

Scenario: CPU usage abnormal increase alert Select CPU usage / Average value as the metric, specify target hosts through filter conditions, set trigger conditions to trigger Warning when above baseline by 30%, Error when above by 80%, and let AI automatically identify normal ranges for business peaks and valleys, eliminating the need to manually set different thresholds for day and night.

Scenario: Frequent false alarms for metrics with periodic fluctuations Appropriately increase the fluctuation tolerance multiple in Algorithm Configuration (such as from 1 to 2), allowing larger normal fluctuations without triggering alerts, reducing noise alerts caused by sudden but reasonable business traffic changes.

Scenario: Unified monitoring of similar metrics across multiple hosts Select multiple hosts in filter conditions, and set Aggregation Rule to aggregate by Host. The system will establish baselines for each host separately and trigger alerts independently, rather than mixing data from all hosts for calculation.

Notes

warning

AI baseline relies on historical data . For metrics that have just started collecting recently and have insufficient data, the baseline may not be accurate. It is recommended to accumulate a certain amount of historical data before enabling AI adaptive rules, and use fixed threshold rules as a transition in the early stage.