Virtual tracing: A simpler alternative to distributed tracing for troubleshooting

July 21, 2020 | Larry Lancaster

Distributed tracing is commonly used in Application Performance Monitoring (APM) to monitor and manage application performance, giving a view into what parts of a transaction call chain are slowest. It is a powerful tool for monitoring call completion times and examining particular requests and transactions.

Quite beyond APM, it seems natural to expect tracing to yield a ‘troubleshooting tool to rule them all’....

Zebrium + Grafana = Awesome

June 19, 2020 | Larry Lancaster

Lots of people like to construct dashboards in Grafana for monitoring and alerting: it’s fast, sleek, and practical. Zebrium is awesome for analytics in part because we lay everything down into tables in a scale-out MPP relational column store at ingest. Each event type gets its own table with typed columns for the parameters; metrics data is also stored in tables, as are anomalies and incidents.

Is Autonomous monitoring the anomaly detection you actually wanted?

April 15, 2020 | Larry Lancaster

Automatically Spot Critical Incidents and Show Me Root Cause

That's what I wanted from a tool when I first heard of anomaly detection. I wanted it to do this based only on the logs and metrics it ingests, and alert me right away, with all this context baked in...

Using machine learning to detect anomalies in logs

November 25, 2019 | Larry Lancaster

At Zebrium, we have a saying: “Structure First”. We talk a lot about structuring because it allows us to do amazing things with log data. But most people don’t know what we mean when we say the word “structure”, or why it allows for amazing things like anomaly detection. This is a gentle and intuitive introduction to “structure” as we mean it.

Deploying into Production: The need for a Red Light

July 23, 2019 | Larry Lancaster

As scale and complexity grow, there are diminishing returns from pre-deployment testing. A test writer cannot envision the combinatoric explosion of coincidences that yield calamity. We must accept that deploying into production is the only definitive test.

Structure is Strategic

October 31, 2018 | Larry Lancaster

We structure machine data at scale

Zebrium helps dev and test engineers find hidden issues in tests that “pass”, find root cause faster than ever, and validate builds with self-maintaining problem signatures. We ingest, structure, and auto-analyze machine data (logs, stats, and config) collected from test runs.

 

