Anomaly Detection as a foundation of Autonomous Monitoring

We believe the future of monitoring, especially for platforms like Kubernetes, is truly autonomous. Cloud native applications are increasingly distributed, evolving faster and failing in new ways, making it harder to monitor, troubleshoot and resolve incidents. Traditional approaches such as dashboards, carefully tuned alert rules and searches through logs are reactive and time intensive, hurting productivity, the user experience and MTTR. 

We believe the future of monitoring, especially for platforms like Kubernetes, is truly autonomous. Cloud native applications are increasingly distributed, evolving faster and failing in new ways, making it harder to monitor, troubleshoot and resolve incidents. Traditional approaches such as dashboards, carefully tuned alert rules and searches through logs are reactive and time intensive, hurting productivity, the user experience and MTTR. We believe machine learning can do much better – detecting anomalous patterns automatically, creating highly diagnostic incident alerts and shortening time to resolution.

Read More

A Prometheus fork for cloud scale anomaly detection across metrics & logs

At Zebrium, we provide an Autonomous Monitoring service that automatically detects anomalies within logs and metrics. We started by correlating anomalies across log streams to automatically raise incidents that require our user's attention. Now, we have taken the step to augment our incident detection by detecting anomalies within a group of related metrics and correlate those with anomalies found in logs.

Introduction

At Zebrium, we provide an Autonomous Monitoring service that automatically detects anomalies within logs and metrics. We started by correlating anomalies across log streams to automatically raise incidents that require our user's attention. Now, we have taken the step to augment our incident detection by detecting anomalies within a group of related metrics and correlate those with anomalies found in logs.

Read More

What Is an ML Detected Software Incident?

Based on our experience with hundreds of incidents across nearly a hundred unique application stacks, we have developed deep insights into the specific ways modern software breaks. This led us to thoughtfully design a multi-layer machine learning stack that can reliably detect these patterns, and identify the collection of events that describes each incident. In simple terms, here is what we have learned about real-world incidents when software breaks.

Based on our experience with hundreds of incidents across nearly a hundred unique application stacks, we have developed deep insights into the specific ways modern software breaks. This led us to thoughtfully design a multi-layer machine learning stack that can reliably detect these patterns, and identify the collection of events that describes each incident. In simple terms, here is what we have learned about real-world incidents when software breaks. You can also try it yourself by signing up for a free account.

Read More

Do your logs feel like a magic 8 ball?

Logs are the source of truth when trying to uncover latent problems in a software system. But is searching logs to find the root cause the right approach?

Logs are the source of truth when trying to uncover latent problems in a software system. They are usually too messy and voluminous to analyze proactively, so they are used mostly for reactive troubleshooting once a problem is known to have occurred.

Read More

Using machine learning to detect anomalies in logs

At Zebrium, we have a saying: “Structure First”. We talk a lot about structuring because it allows us to do amazing things with log data. But most people don’t know what we mean when we say the word “structure”, or why it allows for amazing things like anomaly detection. This is a gentle and intuitive introduction to “structure” as we mean it.

At Zebrium, we have a saying: “Structure First”. We talk a lot about structuring because it allows us to do amazing things with log data. But most people don’t know what we mean when we say the word “structure”, or why it allows for amazing things like anomaly detection. This is a gentle and intuitive introduction to “structure” as we mean it.

Read More

Featured Posts

AUTONOMOUS INCIDENT
DETECTION
 
TRY IT NOW!
 
FREE SIGN-UP