2:14am: the alert fires, and the clock starts.

ECS task error rate jumps to 47%. Permission denied errors cascading through logs. You have 3 hours to find the cause or customers wake up to downtime.

The Situation

PagerDuty wakes you at 2:14am. Your ECS cluster is failing. Error rate 47%. You pull up CloudWatch. Hundreds of "permission denied" errors in the task logs. But which permission? Which service? Which IAM change broke this?

You dig into CloudTrail. Hundreds of API calls. You check GitHub. Three commits in the last 4 hours. One of them manually updated an IAM role. You grep through logs, trying to correlate the timeline. By 5:30am you piece it together: someone accidentally removed a critical IAM permission at 2:09am. But you're exhausted, the facts are fuzzy, and the postmortem will be guesswork.
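Part of that manual search can be scripted. The following is a minimal sketch, assuming the CloudTrail records have already been exported as dictionaries carrying the standard `eventTime`, `eventSource`, and `eventName` fields. The sample records, the date, the helper names, and the 30-minute window are hypothetical and chosen only for illustration.

```python
from datetime import datetime, timedelta, timezone

# IAM API calls that change what a role is allowed to do.
IAM_MUTATIONS = {
    "PutRolePolicy", "DeleteRolePolicy",
    "AttachRolePolicy", "DetachRolePolicy",
    "CreatePolicyVersion", "UpdateAssumeRolePolicy",
}

def parse_time(value):
    """Parse a CloudTrail-style UTC timestamp such as 2024-01-01T02:09:00Z."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)

def iam_changes_before(records, alert_time, window=timedelta(minutes=30)):
    """Return IAM-mutating records within `window` before the alert, newest first."""
    start = alert_time - window
    hits = [
        r for r in records
        if r.get("eventSource") == "iam.amazonaws.com"
        and r.get("eventName") in IAM_MUTATIONS
        and start <= parse_time(r["eventTime"]) <= alert_time
    ]
    return sorted(hits, key=lambda r: parse_time(r["eventTime"]), reverse=True)

# Hypothetical sample records, not real log data.
records = [
    {"eventTime": "2024-01-01T01:05:00Z", "eventSource": "iam.amazonaws.com", "eventName": "AttachRolePolicy"},
    {"eventTime": "2024-01-01T02:09:00Z", "eventSource": "iam.amazonaws.com", "eventName": "PutRolePolicy"},
    {"eventTime": "2024-01-01T02:11:00Z", "eventSource": "ecs.amazonaws.com", "eventName": "RunTask"},
]
alert = datetime(2024, 1, 1, 2, 14, tzinfo=timezone.utc)
candidates = iam_changes_before(records, alert)
for r in candidates:
    print(r["eventTime"], r["eventName"])
```

Even with a filter like this, you still have to match each candidate to a commit and to the failing service by hand, which is where the hours go.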

3 hours to find the root cause. Every minute costs money and customer trust.

The Company

Series B fintech. Every minute of downtime is expensive. Multi-region deployment. High-velocity team. Multiple services sharing IAM roles.

You don't have time for 3 hours of manual investigation. You need the answer now.

With Escher

You ask: "I had an outage at 2:14am. ECS tasks failing with permission denied. What changed?"

  1. 0-1 min: Escher ingests CloudTrail logs, ECS metrics, CloudWatch errors, IAM role history, and Git commit history.
  2. 1-2 min: Correlates the 2:14am error spike to the 2:09am IAM permission removal. Identifies which role was changed and which permission was deleted.
  3. 2-3 min: Generates the root cause: commit abc123 removed the iam:PassRole permission, so the task execution role can no longer be passed to ECS tasks. Surfaces the exact change responsible.
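
The correlation in step 2 can be pictured with a small sketch. This is not Escher's implementation. It is an illustrative example that assumes a per-sample error-rate series and a merged list of change events, and the function names, the 5% threshold, the date, and the sample data are all hypothetical.

```python
from datetime import datetime, timezone

def spike_onset(series, threshold=0.05):
    """Return the timestamp of the first sample whose error rate exceeds threshold."""
    for ts, rate in series:
        if rate > threshold:
            return ts
    return None

def most_recent_change(changes, onset):
    """Return the latest change at or before the spike onset, or None."""
    earlier = [c for c in changes if c["time"] <= onset]
    return max(earlier, key=lambda c: c["time"], default=None)

def utc(hour, minute):
    """Build a UTC timestamp on a hypothetical incident date."""
    return datetime(2024, 1, 1, hour, minute, tzinfo=timezone.utc)

# Hypothetical inputs: (timestamp, error rate) samples and merged change events.
error_rates = [(utc(2, 10), 0.01), (utc(2, 12), 0.02), (utc(2, 14), 0.47)]
changes = [
    {"time": utc(0, 40), "source": "git", "summary": "unrelated commit"},
    {"time": utc(2, 9), "source": "iam", "summary": "permission removed from role"},
]

onset = spike_onset(error_rates)
suspect = most_recent_change(changes, onset)
lead = onset - suspect["time"]
print(f"{suspect['summary']} ({lead.total_seconds() / 60:.0f} min before onset)")
```

Picking the nearest preceding change is only a first-pass heuristic. A real investigation would also check that the change touches a role the failing tasks actually use.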

The Outcome

Before

3+ hours
of searching

With Escher

3 minutes
exact cause

Real impact: MTTR drops 60%. Postmortem has facts instead of guesses. Revert plan ready in seconds. Customer impact minimized. Your team gets back to bed.

Try It Yourself

I had an outage at 2:14am. ECS tasks failing with permission denied. What changed?
Start asking Escher

Stop guessing. Start knowing.

Get root cause answers in minutes, not hours.

Get started free