This repository has been archived by the owner on Feb 12, 2022. It is now read-only.
Jacob Shriver edited this page Jun 19, 2020 · 36 revisions

User Guide


Working with Alerts

Argus evaluates alerts on metric data and notifies users when trigger thresholds are exceeded. Alerts are scheduled and executed per the CRON entry specified when the alert was created. An alert is associated with at least one trigger and one notification. You can associate a trigger with more than one notification, and a notification with more than one trigger.

You can access the Alert interface via the link at the top of the Argus interface. See Alerting Examples for instructions on creating alerts.
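The relationships described above (an alert owns at least one trigger and one notification, and triggers and notifications can be associated many-to-many) can be pictured as a small data model. This is a minimal Python sketch, not the Argus schema: all class, field, and function names are hypothetical and only mirror the terms used in this guide.

```python
from dataclasses import dataclass, field


@dataclass
class Trigger:
    name: str


@dataclass
class Notification:
    name: str
    # A notification can be associated with more than one trigger.
    triggers: list = field(default_factory=list)


@dataclass
class Alert:
    name: str
    cron_entry: str
    triggers: list = field(default_factory=list)
    notifications: list = field(default_factory=list)

    def is_complete(self) -> bool:
        """An alert needs at least one trigger and one notification."""
        return bool(self.triggers) and bool(self.notifications)


def notifications_for(alert: Alert, trigger: Trigger) -> list:
    """Return every notification of the alert that is associated with the trigger."""
    return [n for n in alert.notifications if trigger in n.triggers]


warn = Trigger("warn")
crit = Trigger("crit")
email = Notification("email-team", triggers=[warn, crit])  # one notification, two triggers
audit = Notification("audit-log", triggers=[crit])         # "crit" has two notifications
alert = Alert("kpi-alert", "* * * * *", [warn, crit], [email, audit])

print([n.name for n in notifications_for(alert, crit)])
```

In this sketch the association is stored on the notification side only; that is a modeling choice for illustration, not a statement about how Argus persists it.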

Alert Structure

An alert is the configuration (metadata, triggering conditions, and notification types) that informs you when something interesting happens. The following components define the alert's overall identity and behavior.

| Component | Description |
| --- | --- |
| CRON entry | Time and frequency of the alert (the alert's schedule), based on the Quartz CronTrigger format |
| Enabled | Whether the alert is enabled for evaluation |
| Expression | Metric expression used to retrieve the data against which trigger conditions are evaluated |
| Missing data notification | Notify the owner if the metric expression returns no data |
| Name | Alert name. Variable interpolation is available. |
| Owner | Owner of the alert |
| Shared | Boolean that indicates whether the alert is visible to other users |

Trigger Structure

A trigger defines a threshold value as a condition. When the trigger condition is met, a notification is sent. An alert can have multiple triggers, and you can associate a trigger with more than one notification.

| Component | Description |
| --- | --- |
| ID | ID of the trigger |
| Inertia | The timespan for which the condition must be met before the trigger is fired. At least 2 qualifying data points must occur within the inertia window to fire the trigger. |
| Name | Trigger name. Variable interpolation is available. |
| Primary threshold | Operand for the comparison conditions |
| Secondary threshold | Operand for the BETWEEN and NOT_BETWEEN operators. BETWEEN is inclusive of both operand values, while NOT_BETWEEN is exclusive of those values. |
| Type | Trigger comparison operator. Choose "no data" to trigger on missing data and be able to use configured notifications (normally missing-data alerts go only to the alert owner). The threshold fields are not used with the "no data" type. |
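The threshold and inertia rules above can be read as follows. This is a minimal Python sketch under stated assumptions, not Argus's evaluation code: the function names are hypothetical, only operator names that appear in this guide are handled, the two thresholds are assumed to be accepted in either order, and the inertia logic is one plausible reading of "the condition must hold for the whole timespan, with at least 2 qualifying data points".

```python
def condition_met(trigger_type, value, primary, secondary=None):
    """Check one data point against a trigger's threshold(s)."""
    if trigger_type == "GREATER_THAN":
        return value > primary
    if trigger_type in ("BETWEEN", "NOT_BETWEEN"):
        low, high = sorted((primary, secondary))  # assumption: order does not matter
        inside = low <= value <= high             # BETWEEN includes both bounds
        # NOT_BETWEEN excludes the bounds: a value equal to a bound does not fire.
        return inside if trigger_type == "BETWEEN" else not inside
    raise ValueError(f"unsupported trigger type: {trigger_type}")


def first_fire_time(points, inertia_ms, predicate):
    """Return the timestamp at which the trigger would first fire, or None.

    points: (timestamp_ms, value) pairs sorted by time.
    Fires once an unbroken run of qualifying points spans at least
    inertia_ms and contains at least 2 points.
    """
    run_start = None
    run_length = 0
    for timestamp, value in points:
        if predicate(value):
            if run_start is None:
                run_start = timestamp
            run_length += 1
            if run_length >= 2 and timestamp - run_start >= inertia_ms:
                return timestamp
        else:
            # A non-qualifying point breaks the run.
            run_start = None
            run_length = 0
    return None


points = [(0, 50), (60_000, 70), (120_000, 80), (180_000, 40), (240_000, 90)]
above_60 = lambda v: condition_met("GREATER_THAN", v, 60)
print(first_fire_time(points, 60_000, above_60))   # run 60_000..120_000 spans the inertia
print(first_fire_time(points, 120_000, above_60))  # the run is broken at 180_000
```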

Notification Structure

A notification defines how you are notified when a trigger fires. You can associate a notification with more than one trigger.

| Component | Description |
| --- | --- |
| Snooze (Cooldown period) | Timespan during which no further notifications are sent. The cooldown starts immediately after a notification is sent; notifications are sent again once it expires. |
| Triggers | The triggers to associate with the notification |
| Metrics to Annotate | Metric identifiers provided as a comma-separated list of metric expressions. When a notification is sent for an alert, an annotation is created on the specified metric. You can retrieve annotations via a RESTful endpoint or view them on a dashboard. The format for each metric expression is <scope>:<metricName>:<aggregator>. "avg" is the default for the aggregator field. |
| Name | Notification name. Variable interpolation is available. |
| Type | Notification type (see below) |
| Custom Text | Free-text field. Variable interpolation is available. |
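To make the "Metrics to Annotate" format and the cooldown rule concrete, here is a minimal Python sketch. It is not Argus code: the function names are hypothetical, it assumes the aggregator may simply be omitted (in which case "avg" applies), and it does not handle tags or any other expression syntax.

```python
def parse_annotation_targets(field_text):
    """Split a comma-separated list of <scope>:<metricName>:<aggregator> entries."""
    targets = []
    for item in field_text.split(","):
        item = item.strip()
        if not item:
            continue
        parts = item.split(":")
        if len(parts) == 2:
            scope, metric = parts
            aggregator = "avg"  # default stated in the table above
        elif len(parts) == 3:
            scope, metric, aggregator = parts
        else:
            raise ValueError(f"expected <scope>:<metricName>[:<aggregator>], got {item!r}")
        targets.append((scope, metric, aggregator))
    return targets


def in_cooldown(last_sent_ms, cooldown_ms, now_ms):
    """True while a notification is still snoozed after its last send."""
    return last_sent_ms is not None and now_ms < last_sent_ms + cooldown_ms


print(parse_annotation_targets("argus.core:alert.evaluation.kpi:min, argus.core:alert.count"))
print(in_cooldown(last_sent_ms=1_000, cooldown_ms=10_000, now_ms=5_000))
```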

Supported Notifiers

Argus supports the following notifiers:

  • Audit Notifier: Writes the notification to the Argus database. The Subscriptions field is left blank.
  • Email Notifier: Sends an email to the subscribers listed in the Subscriptions field, which contains a comma-separated list of email addresses.
  • Salesforce Chatter Notifier: Sends an alert to one or more Chatter groups. The Subscriptions field contains a comma-separated list of Chatter group IDs.
  • Callback Notifier: Sends the templated notification to an HTTP endpoint.

Discovering a Triggering Event

When an alert is triggered, a notification is delivered (according to the selected notifier) and also logged in the alert History tab. The notification shows the metric expression associated with the alert, along with a link to the triggering event. Even if the original metric expression used relative time, the logged metric expression contains the actual timespan for the triggering event.
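The idea of replacing a relative start time with the actual timespan can be sketched as follows. This is an illustration only: the function name is hypothetical, the set of time units and the "<start>:<end>:" epoch-millisecond layout of the result are assumptions, and the exact form Argus logs may differ.

```python
import re

# Assumed unit suffixes for relative times, in milliseconds.
_UNIT_MS = {"s": 1_000, "m": 60_000, "h": 3_600_000, "d": 86_400_000}


def resolve_relative_start(expression, evaluated_at_ms):
    """Rewrite a leading relative time such as "-1h:" into absolute start and end times."""
    match = re.match(r"^-(\d+)([smhd]):(.*)$", expression)
    if match is None:
        return expression  # already absolute, or a form this sketch does not handle
    amount, unit, rest = match.groups()
    start_ms = evaluated_at_ms - int(amount) * _UNIT_MS[unit]
    return f"{start_ms}:{evaluated_at_ms}:{rest}"


# 1538774100000 ms corresponds to 2018-10-05T21:15:00Z.
print(resolve_relative_start("-1h:argus.core:alert.evaluation.kpi{host=*}:avg", 1538774100000))
```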

Templating with Interpolation of Variables

Argus uses FreeMarker v2.3.28 internally as the templating engine. Interpolated variables include the following kinds:

  • ${scope} - The expression’s scope
  • ${metric} - The expression’s metric
  • ${tag.<tag_name>} - Any tag used in the expression. If <tag_name> contains a dash, escape the dash with a preceding backslash so that it is not interpreted as a minus sign. Example: ${tag.foo\-yoyo} (for the "foo-yoyo" tag)
  • ${device} - Any device tags used in the expression (deprecated; use ${tag.device} instead)
  • ${alert.name} - Name of the alert
  • ${alert.expression} - The expression associated with the alert
  • ${alert.cronEntry} - The cron entry associated with the alert
  • ${alert.enabled} - Boolean that indicates whether the alert is enabled
  • ${trigger.name} - Name of the trigger
  • ${trigger.type} - Type of trigger (for example: >, <, nodata)
  • ${trigger.threshold} - Primary Threshold of the trigger
  • ${trigger.secondaryThreshold} - Secondary Threshold of the trigger (if it exists)
  • ${trigger.inertia} - Inertia of the Trigger
  • ${triggerValue} - Value at which the trigger was fired
  • ${triggerTimestamp} - Timestamp at which the trigger was fired. Default Format: MMM d, yyyy h:mm:ss a
  • ${notification.name} - Name of the notification
  • ${notification.cooldownPeriod} - Cooldown period of the notification, in milliseconds
  • ${notification.SRActionable} - Boolean to state if SR should be actionable or not
  • ${notification.severityLevel} - Severity level of the trigger

For example, if your expression is

-1h:argus.core:alert.evaluation.kpi.*{host=bar, tagA=*}:min

the trigger name can be written as

trigger-${scope}-${metric}-${tag.host}-${tag.tagA}

If the trigger fires on a metric called “alert.evaluation.kpi.foo”, and tagA has the value “baz”, notifications will show the trigger name as

trigger-argus.core-alert.evaluation.kpi.foo-bar-baz
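The substitution in this example can be reproduced with a small stand-in for the template engine. The sketch below is plain Python, not FreeMarker: the interpolate function is hypothetical and handles only simple ${name} and ${name.subname} lookups, with none of FreeMarker's built-ins, directives, or escaping rules.

```python
import re


def interpolate(template, variables):
    """Replace ${a} and ${a.b} placeholders with values from a nested dict."""
    def lookup(match):
        value = variables
        for key in match.group(1).split("."):
            value = value[key]
        return str(value)

    return re.sub(r"\$\{([A-Za-z_][\w.]*)\}", lookup, template)


# Values taken from the example above: the matched metric and its tags.
variables = {
    "scope": "argus.core",
    "metric": "alert.evaluation.kpi.foo",
    "tag": {"host": "bar", "tagA": "baz"},
}
name = interpolate("trigger-${scope}-${metric}-${tag.host}-${tag.tagA}", variables)
print(name)  # trigger-argus.core-alert.evaluation.kpi.foo-bar-baz
```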

You can create a template like the following and use it in your Notification tab's Custom Text fields to produce more uniform and informative alert messages:

Alert Name = ${alert.name?upper_case},
Alert Expression = ${alert.expression},
Alert cronEntry = ${alert.cronEntry},
Alert enabled = ${alert.enabled?then('alert enabled', 'alert not enabled')},
Alert Expression = ${alert.expression},
Trigger Name = ${trigger.name},
Trigger type = ${trigger.type},
Trigger threshold = ${trigger.threshold},
Trigger secondaryThreshold = ${trigger.secondaryThreshold},
Trigger Inertia = ${trigger.inertia},
Trigger Value = ${triggerValue},
Trigger Timestamp = ${triggerTimestamp?datetime?iso('GMT')},
Notification Name = ${notification.name?cap_first},
Notification cooldownPeriod = ${notification.cooldownPeriod},
Notification SRActionable = ${notification.SRActionable?then('SR Actionable','Not SR Actionable')},
Notification severityLevel = ${notification.severityLevel}

The template will produce alert messages with information similar to the following:

Alert Name = ALERT-123-argus.core, 
Alert Expression = -1h:argus.core:alert.evaluation.kpi{host=*}:avg, 
Alert cronEntry = * * * * *, 
Alert enabled = alert enabled, 
Alert Expression = -1h:argus.core:alert.evaluation.kpi{host=*}:avg, 
Trigger Name = trigger-1234-argus.core, 
Trigger type = GREATER_THAN, 
Trigger threshold = 60, 
Trigger secondaryThreshold = 0, 
Trigger Inertia = 10,000, 
Trigger Value = 240,000, 
Trigger Timestamp = 2018-10-05T21:15:00Z, 
Notification Name = New-notification-1531947321887, 
Notification cooldownPeriod = 10, 
Notification SRActionable = Not SR Actionable, 
Notification severityLevel = 2

Note the conditional expressions in the template (the "Alert enabled" and "Notification SRActionable" lines), which use the ?then built-in. FreeMarker conditional directives such as <#if> are also supported and can be used to generate richer alert messages. For example, this template

<#if trigger.threshold <= 4> Primary Threshold is at most 4 </#if>,
<#if (trigger.secondaryThreshold == 7.1)> Secondary Threshold is 7.1 </#if>,
<#if trigger.inertia == 5 && (trigger.threshold > 5)> Inertia is 5, Primary Threshold more than 5 <#elseif (trigger.threshold > 5)>Primary Threshold more than 5 <#elseif trigger.inertia == 5> Inertia is 5 </#if>,
<#if trigger.name?matches('trigger_name') && triggerValue < 2.0> Trigger name matches and trigger value is < 2 </#if>,
<#if triggerValue?round == 2> Trigger fired, rounded value is 2 </#if>,
<#assign dt = triggerTimestamp?datetime> Trigger fired date-time: ${dt?iso('GMT')},
 Firing time at GMT-02:30: ${dt?iso('GMT-02:30')}

could result in the following

 Primary Threshold is at most 4 , 
 Secondary Threshold is 7.1 , 
 Inertia is 5 , 
 Trigger name matches and trigger value is < 2 , 
 Trigger fired, rounded value is 2 , 
 Trigger fired date-time: 2014-12-11T17:40:00Z, 
 Firing time at GMT-02:30: 2014-12-11T15:10:00-02:30
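To see which branches of the template produce this output, the conditions can be traced with concrete values. The Python sketch below mirrors the template's conditions only; it does not run FreeMarker, and the sample values (threshold 3, trigger value 1.9, and so on) are assumptions chosen to be consistent with the output shown, not values from a real alert.

```python
import re
from datetime import datetime, timedelta, timezone

# Hypothetical sample values consistent with the output above.
trigger = {"name": "trigger_name", "threshold": 3, "secondaryThreshold": 7.1, "inertia": 5}
trigger_value = 1.9
fired_at = datetime(2014, 12, 11, 17, 40, tzinfo=timezone.utc)

branches = []  # records which template conditions hold
if trigger["threshold"] <= 4:
    branches.append("threshold <= 4")
if trigger["secondaryThreshold"] == 7.1:
    branches.append("secondaryThreshold == 7.1")
if trigger["inertia"] == 5 and trigger["threshold"] > 5:
    branches.append("inertia == 5 and threshold > 5")
elif trigger["threshold"] > 5:
    branches.append("threshold > 5")
elif trigger["inertia"] == 5:
    branches.append("inertia == 5")
# ?matches tests the whole string against a regular expression.
if re.fullmatch("trigger_name", trigger["name"]) and trigger_value < 2.0:
    branches.append("name matches and triggerValue < 2.0")
# Python's round() differs from FreeMarker's ?round only for exact .5 values.
if round(trigger_value) == 2:
    branches.append("rounded triggerValue == 2")

utc_text = fired_at.isoformat().replace("+00:00", "Z")
offset = timezone(timedelta(hours=-2, minutes=-30))
offset_text = fired_at.astimezone(offset).isoformat()

print(branches)
print(utc_text)     # 2014-12-11T17:40:00Z
print(offset_text)  # 2014-12-11T15:10:00-02:30
```

The last two values show that rendering with a GMT-02:30 offset describes the same instant with a different offset; it does not shift the firing time itself.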

See the FreeMarker Template Language Reference for complete documentation on FreeMarker directives, built-ins, and more.
