Incident postmortem template.

A blameless framework for documenting incidents, finding root causes, and preventing recurrence with actionable follow‑ups.

Template sections

What to capture

  • Incident summary with impact, duration, and severity.
  • Timeline of key events: detection, mitigation, recovery.
  • Root cause analysis (5 Whys / fishbone) and contributing factors.
  • Customer and stakeholder communications.
  • Action items with owners, dates, and verification steps.
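
The timeline entries above translate directly into the durations most summaries report. The following is a minimal sketch, assuming four timestamps per incident; the `Timeline` class, its field names, and the sample dates are illustrative and not part of the template.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Timeline:
    """Key timestamps for one incident (field names are hypothetical)."""
    started: datetime    # when impact began
    detected: datetime   # first alert or report
    mitigated: datetime  # impact stopped or reduced
    recovered: datetime  # service fully restored

    def validate(self) -> None:
        # Events must be in chronological order to give sane durations.
        order = [self.started, self.detected, self.mitigated, self.recovered]
        if order != sorted(order):
            raise ValueError("timeline events are out of order")

    def time_to_detect(self) -> timedelta:
        return self.detected - self.started

    def time_to_mitigate(self) -> timedelta:
        return self.mitigated - self.detected

    def duration(self) -> timedelta:
        return self.recovered - self.started


# Made-up example data.
timeline = Timeline(
    started=datetime(2024, 1, 10, 14, 0),
    detected=datetime(2024, 1, 10, 14, 12),
    mitigated=datetime(2024, 1, 10, 14, 47),
    recovered=datetime(2024, 1, 10, 15, 30),
)
timeline.validate()
print(timeline.time_to_detect())    # 0:12:00
print(timeline.time_to_mitigate())  # 0:35:00
print(timeline.duration())          # 1:30:00
```

Validating the order first catches copy-paste mistakes in the timeline before they turn into negative durations in the summary.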

Good practices

Run effective postmortems

  • Keep tone blameless; focus on systems and guardrails.
  • Automate data collection: logs, dashboards, and alerts.
  • Track actions to completion and verify with tests or playbooks.
  • Share learnings broadly to prevent similar incidents.
  • Conduct the review meeting within 48 hours of resolution.
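
The 48-hour guideline is easy to check automatically. This is a small sketch, assuming the resolution time is recorded as a timezone-aware timestamp; the function names and the sample date are hypothetical.

```python
from datetime import datetime, timedelta, timezone

REVIEW_WINDOW = timedelta(hours=48)  # taken from the guideline above


def review_deadline(resolved_at: datetime) -> datetime:
    """Latest time the review meeting should take place."""
    return resolved_at + REVIEW_WINDOW


def review_overdue(resolved_at: datetime, now: datetime) -> bool:
    """True once the review window has passed."""
    return now > review_deadline(resolved_at)


# Made-up example data.
resolved = datetime(2024, 1, 10, 15, 30, tzinfo=timezone.utc)
print(review_deadline(resolved).isoformat())  # 2024-01-12T15:30:00+00:00
```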

Incident severity levels

Classification framework

  • SEV-1 (Critical): Full service outage, data loss, security breach.
  • SEV-2 (High): Major feature degradation, significant user impact.
  • SEV-3 (Medium): Partial feature issues, workaround available.
  • SEV-4 (Low): Minor bugs, cosmetic issues, limited impact.
  • SEV-1 and SEV-2 incidents always require postmortems.
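
The classification above can be encoded so that tooling applies the postmortem rule consistently. This is a sketch only: the `Severity` enum and `requires_postmortem` helper are assumed names, and teams may choose to require postmortems for lower severities as well.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Lower number means higher severity, as in the list above."""
    SEV1 = 1  # Critical: full outage, data loss, security breach
    SEV2 = 2  # High: major feature degradation
    SEV3 = 3  # Medium: partial issues, workaround available
    SEV4 = 4  # Low: minor bugs, cosmetic issues


def requires_postmortem(severity: Severity) -> bool:
    # SEV-1 and SEV-2 always require a postmortem.
    return severity <= Severity.SEV2


for sev in Severity:
    print(sev.name, requires_postmortem(sev))
```

Using an `IntEnum` keeps the ordering comparison explicit, so the threshold lives in one place if it ever changes.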

Root cause analysis

Finding the real cause

  • Use the 5 Whys technique to drill down to system-level failures.
  • Identify contributing factors beyond the immediate trigger.
  • Look for process gaps, missing monitoring, or unclear ownership.
  • Avoid blaming individuals; focus on improving systems.
  • Document what went well, not just what failed.

Action items

Drive improvements

  • Each action item must have an owner and due date.
  • Prioritize actions based on impact and likelihood of recurrence.
  • Include both immediate fixes and long-term improvements.
  • Track action items in the team's backlog or incident tracker.
  • Verify completion with tests, runbooks, or monitoring updates.
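
The rules above can be checked before a postmortem is published. The sketch below flags items that lack an owner, due date, or verification step, and ranks the rest. The 1 to 5 scales and the impact × likelihood score are assumptions made for illustration, not something the template prescribes, and the sample items and owner names are made up.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class ActionItem:
    title: str
    owner: str
    due: Optional[date]
    impact: int          # assumed scale: 1 (low) to 5 (high)
    likelihood: int      # assumed scale: 1 (unlikely to recur) to 5 (likely)
    verification: str = ""  # test, runbook, or monitoring update

    def problems(self) -> List[str]:
        """List the required fields this item is missing."""
        found = []
        if not self.owner:
            found.append("missing owner")
        if self.due is None:
            found.append("missing due date")
        if not self.verification:
            found.append("missing verification step")
        return found

    def priority(self) -> int:
        # Assumed scoring: higher impact and likelihood rank first.
        return self.impact * self.likelihood


items = [
    ActionItem("Add alert on queue depth", "alice", date(2024, 2, 1), 4, 4,
               "alert fires in staging test"),
    ActionItem("Update failover runbook", "", None, 3, 2),
    ActionItem("Add circuit breaker", "bob", date(2024, 3, 15), 5, 3,
               "load test"),
]

for item in sorted(items, key=ActionItem.priority, reverse=True):
    print(item.priority(), item.title, item.problems())
```

Running this lists the alerting item first (score 16), then the circuit breaker (15), then the runbook update (6), which is also the only item reported as incomplete.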

Strengthen your incident response.

We can help implement runbooks, on‑call rotations, and reliability metrics that fit your platform and culture.
