Anomaly Detection as a Foundation of Autonomous Monitoring

We believe the future of monitoring, especially for platforms like Kubernetes, is truly autonomous. Cloud-native applications are increasingly distributed, evolve faster and fail in new ways, making it harder to monitor, troubleshoot and resolve incidents. Traditional approaches such as dashboards, carefully tuned alert rules and searches through logs are reactive and time-intensive, hurting productivity and the user experience and driving up MTTR. We believe machine learning can do much better: detecting anomalous patterns automatically, creating highly diagnostic incident alerts and shortening time to resolution.

The Future of Monitoring is Autonomous

TL;DR

Monitoring today puts far too much burden on DevOps, requiring too much time and skill to build and maintain complex dashboards and fragile alert rules. Fortunately, unassisted machine learning can be used to autonomously detect critical incidents and find their root cause. Read about it below, or try our free Autonomous Monitoring Platform now and start auto-detecting incidents in less than 5 minutes.
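
To make the idea concrete, here is a minimal sketch of unassisted anomaly detection on log events. It is an illustration under our own assumptions, not the platform's actual algorithm: it flags event types whose rate in a recent window deviates sharply from a historical baseline, using a crude Poisson-style score.

```python
from collections import Counter
from math import sqrt

def anomalous_events(baseline_events, window_events, threshold=3.0):
    """Flag event types whose rate in the current window deviates
    sharply from the historical baseline (a crude Poisson z-score)."""
    base, win = Counter(baseline_events), Counter(window_events)
    n_base = max(sum(base.values()), 1)
    n_win = max(sum(win.values()), 1)
    anomalies = []
    for event, count in win.items():
        rate = base.get(event, 0.5) / n_base     # smoothed baseline rate
        expected = rate * n_win
        z = (count - expected) / sqrt(expected)  # Poisson: variance ~= mean
        if z > threshold:
            anomalies.append((event, count, round(z, 1)))
    return sorted(anomalies, key=lambda a: -a[2])

# A burst of connection errors stands out against a quiet baseline:
baseline = ["db_ok"] * 980 + ["retry"] * 20
window = ["db_ok"] * 95 + ["conn_refused"] * 5
print(anomalous_events(baseline, window))  # [('conn_refused', 5, 22.1)]
```

A real system would of course first have to reduce raw log lines to stable event types, which is where the structure-learning discussed further down comes in.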

The Longer Version

Monitoring today is extremely human-driven. The only thing we've automated so far is watching metrics and events and firing alerts when something goes wrong. Everything else requires a lot of manual effort from expert operators who intuitively know and understand the system being monitored: deploying collectors, building parsing rules, configuring dashboards and alerts, and troubleshooting and resolving incidents.

Do your logs feel like a magic 8 ball?

Logs are the source of truth when trying to uncover latent problems in a software system. But is searching logs to find the root cause the right approach? Logs are usually too messy and voluminous to analyze proactively, so in practice they are used mostly for reactive troubleshooting once a problem is already known to have occurred.

Autonomous log monitoring for Kubernetes

Kubernetes makes it easy to deploy, manage and scale large distributed applications. But what happens when something goes wrong with an app? And how do you even know? We hear variations on this all the time: “It was only when customers started complaining that we realized our service had degraded” and “A simple authentication problem stopped customers from logging in. It took six hours to resolve.”

Catching Faults Missed by APM and Monitoring Tools

As software gets more complex, it gets harder to test all possible failure modes within a reasonable time. Monitoring can catch known problems, albeit only with pre-defined instrumentation, but new (unknown) software problems are much harder to catch.
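
As one hedged illustration of how unknown failure modes might still be caught, the sketch below (our own toy example, not any product's implementation) alerts on log event types never seen during normal operation. The digit/hex stripping is a deliberately naive stand-in for learned log templates.

```python
import re

def event_type(line: str) -> str:
    """Collapse variable fields (numbers, hex ids) so lines of the
    same kind map to the same event type."""
    return re.sub(r"0x[0-9a-fA-F]+|\d+", "<*>", line.lower())

class NoveltyDetector:
    """Alert once on any event type never seen during the learning period."""
    def __init__(self):
        self.known = set()

    def learn(self, lines):
        self.known.update(event_type(line) for line in lines)

    def check(self, line):
        t = event_type(line)
        if t in self.known:
            return None
        self.known.add(t)  # alert only once per novel type
        return t           # a candidate "unknown unknown"

detector = NoveltyDetector()
detector.learn(["GET /health 200 in 3ms", "GET /health 200 in 5ms"])
print(detector.check("GET /health 200 in 4ms"))          # None (known shape)
print(detector.check("panic: nil pointer at 0x7f3a12"))  # novel -> alert
```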

A quick, free and easy way to find anomalies in your logs

Deploying into Production: The Need for a Red Light

As scale and complexity grow, there are diminishing returns from pre-deployment testing. A test writer cannot envision the combinatorial explosion of coincidences that yield calamity. We must accept that deploying into production is the only definitive test.

Using ML to auto-learn changing log structures

Software log messages are potential goldmines of information, but their lack of explicit structure makes them difficult to analyze programmatically. Tasks as common as reading (or alerting on) a metric embedded in a log message require carefully crafted regexes that can easily capture the wrong data by accident, or break silently when log formats change across software versions. But there's an even bigger prize buried within logs: the possibility of using event patterns to learn what's normal and what's anomalous.
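
To make this concrete, here is a rough sketch of learning templates directly from log lines; the grouping heuristic and names are our own illustrative assumptions, and real template miners (e.g. Drain) cluster far more robustly. Lines of the same shape are grouped, and any token position whose value varies within a group is masked as a variable.

```python
from collections import defaultdict

def learn_templates(lines):
    """Group lines by token count and leading token, then mask every
    position whose value varies within the group."""
    groups = defaultdict(list)
    for line in lines:
        tokens = line.split()
        groups[(len(tokens), tokens[0])].append(tokens)
    return [
        " ".join(col[0] if len(set(col)) == 1 else "<*>"  # constant vs variable
                 for col in zip(*rows))
        for rows in groups.values()
    ]

logs = [
    "connection from 10.0.0.7 port 52100 accepted",
    "connection from 10.0.0.9 port 52244 accepted",
    "disk /dev/sda1 usage at 91 percent",
    "disk /dev/sda1 usage at 97 percent",
]
print(learn_templates(logs))
# ['connection from <*> port <*> accepted', 'disk /dev/sda1 usage at <*> percent']
```

Once every line maps to a stable template, metrics can be read from the masked positions without hand-written regexes, and per-template event frequencies become the raw material for learning what's normal.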

Why understand log structure at all?
