Alert Rules and Fault Signatures

Zebrium's unique strength is its ability to detect incidents autonomously by creating virtual traces across anomalies in logs and metrics. It is not limited to auto-detection, however: any existing knowledge about critical problems can also be captured as user-defined alert rules or fault signatures.

To create a query-based alert rule, set the filters and search terms that define the results of interest, then save the query as an alert rule with a user-defined threshold and an alert destination (e.g. a webhook or email).

create_alert_rule

Each resulting alert gives you a count of unique event types and a clickable link to view the events behind that notification.

alert_rule_example
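
To make the moving parts of a query-based rule concrete, here is a minimal Python sketch. It is not Zebrium's API or internal implementation: the `AlertRule` class, its field names, the event dictionary keys (`etype`, `message`, `severity`), and the reading of the threshold as a minimum number of matching events are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AlertRule:
    """Hypothetical model of a saved query-based alert rule."""
    name: str
    filters: dict        # field name -> required value, e.g. {"severity": "ERROR"}
    search_terms: list   # substrings that must all appear in the event message
    threshold: int       # assumption: fire when at least this many events match
    destination: str     # e.g. a webhook URL or an email address

    def matches(self, event: dict) -> bool:
        # Every filter must match exactly, and every search term must be present.
        if any(event.get(key) != value for key, value in self.filters.items()):
            return False
        message = event.get("message", "")
        return all(term in message for term in self.search_terms)

    def evaluate(self, events: list) -> Optional[dict]:
        # Return an alert summary if the threshold is met, otherwise None.
        hits = [event for event in events if self.matches(event)]
        if len(hits) < self.threshold:
            return None
        return {
            "rule": self.name,
            "matched_events": len(hits),
            "unique_event_types": len({event.get("etype") for event in hits}),
            "destination": self.destination,
        }
```

Calling `evaluate` on a batch of events returns `None` while the threshold is not met, and otherwise a small summary containing the count of unique event types, which mirrors the information the alerts described above carry.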

Zebrium also offers a powerful alternative to query-based alert rules. If you want to build a rule based on a sequence of very specific event types, the Signature capability lets you do that. You can also specify predicates or conditions on fields within those event types, set a time window within which the events must occur, and suppress repeated alerts when there are multiple occurrences within, for example, an hour or a day.

To take advantage of this, tag the events of interest as "favorites" (click the star button to the left of an event in the log tab), then go to the signatures tab. There you can label the event types, pick the fields you want to set conditions on, and specify settings such as the time range and the suppression of repeated alerts.

signature_creation
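
The ingredients of a signature (an ordered sequence of event types, conditions on their fields, a time window, and suppression of repeats) can be sketched in a few lines of Python. This is only an illustration of the idea, not how Zebrium evaluates signatures: the `Step` and `Signature` classes, the `ts` and `etype` event keys, and the choice to measure the window from the first matched event are all assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Step:
    """One event type in the sequence, with an optional condition on its fields."""
    etype: str
    predicate: Callable[[dict], bool] = lambda event: True


@dataclass
class Signature:
    """Hypothetical signature: ordered steps, a time window, and alert suppression."""
    name: str
    steps: list        # list of Step, in the order the events must occur
    window_s: float    # all steps must occur within this many seconds
    suppress_s: float  # minimum gap between two alerts for this signature
    last_alert_ts: Optional[float] = None

    def _step_matches(self, index: int, event: dict) -> bool:
        step = self.steps[index]
        return event.get("etype") == step.etype and step.predicate(event)

    def find_match(self, events: list) -> Optional[float]:
        """Return the timestamp of the event that completes the sequence, or None."""
        ordered = sorted(events, key=lambda e: e["ts"])
        for i, start in enumerate(ordered):
            if not self._step_matches(0, start):
                continue
            matched, end_ts = 1, start["ts"]
            for event in ordered[i + 1:]:
                # Stop once the sequence is complete or the window has elapsed.
                if matched == len(self.steps) or event["ts"] - start["ts"] > self.window_s:
                    break
                if self._step_matches(matched, event):
                    matched, end_ts = matched + 1, event["ts"]
            if matched == len(self.steps):
                return end_ts
        return None

    def check(self, events: list) -> bool:
        """Return True if an alert should fire, honoring the suppression period."""
        ts = self.find_match(events)
        if ts is None:
            return False
        if self.last_alert_ts is not None and ts - self.last_alert_ts < self.suppress_s:
            return False  # a recent alert already covers this occurrence
        self.last_alert_ts = ts
        return True
```

For example, a signature with the steps `Step("disk_warn", lambda e: e["usage"] >= 90)` and `Step("service_crash")`, a 60 second window, and a one hour suppression period would fire on the first disk warning that is followed by a crash within a minute, then stay quiet for repeat occurrences during the next hour.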

Finally, you can name the rule, assign it a priority, and categorize it. High priority alerts are automatically sent to the default Slack channel as well as to any custom webhooks specified by the user.

signature_definition
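
The routing behavior described above can be summarized as a small Python sketch. The function name, its parameters, and the priority label `"high"` are hypothetical. The text only states what happens to high priority alerts, so returning no destinations for other priorities is an assumption of this sketch rather than documented behavior.

```python
def route_alert(priority: str, default_slack_channel: str, custom_webhooks: list) -> list:
    """Return the destinations an alert is pushed to automatically."""
    # Assumption: only high priority alerts are pushed without user action.
    if priority.lower() != "high":
        return []
    # High priority: the default Slack channel plus every user-specified webhook.
    return [default_slack_channel, *custom_webhooks]
```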

High priority signatures can also be viewed as a heatmap in the logs tab, which lets you filter or navigate to the events matching the signature hits.

signature_heatmap