Incident postmortem template

A blameless framework for documenting incidents, identifying root causes, and preventing recurrence through actionable follow-ups.

Template sections

What to capture

Incident summary with impact, duration, and severity.
Timeline of key events: detection, mitigation, recovery.
Root cause analysis (5 Whys / fishbone) and contributing factors.
Customer and stakeholder communications.
Action items with owners, dates, and verification steps.
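
The sections listed above could be captured in a structured record so that gaps are easy to spot before the review. The following is a minimal sketch, not a prescribed schema: the class names, field names, and the choice to measure duration from first detection to last recovery are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class TimelineEvent:
    at: datetime
    kind: str  # assumed values: "detection", "mitigation", "recovery"
    note: str = ""


@dataclass
class Postmortem:
    # Field names are hypothetical; they mirror the sections listed above.
    summary: str = ""
    impact: str = ""
    severity: str = ""  # e.g. "SEV-2"
    timeline: List[TimelineEvent] = field(default_factory=list)
    root_causes: List[str] = field(default_factory=list)
    contributing_factors: List[str] = field(default_factory=list)
    communications: List[str] = field(default_factory=list)
    action_items: List[str] = field(default_factory=list)

    def duration(self) -> Optional[timedelta]:
        """Time from the first detection event to the last recovery event."""
        detected = [e.at for e in self.timeline if e.kind == "detection"]
        recovered = [e.at for e in self.timeline if e.kind == "recovery"]
        if not detected or not recovered:
            return None
        return max(recovered) - min(detected)

    def missing_sections(self) -> List[str]:
        """Names of sections that are still empty."""
        required = {
            "summary": self.summary,
            "impact": self.impact,
            "severity": self.severity,
            "timeline": self.timeline,
            "root_causes": self.root_causes,
            "communications": self.communications,
            "action_items": self.action_items,
        }
        return [name for name, value in required.items() if not value]
```

Calling missing_sections() on a draft gives a quick checklist of what still needs to be written before the document is shared.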

Good practices

Run effective postmortems

Keep tone blameless; focus on systems and guardrails.
Automate data collection: logs, dashboards, and alerts.
Track actions to completion and verify with tests or playbooks.
Share learnings broadly to prevent similar incidents.
Conduct the review meeting within 48 hours of resolution.
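
The 48-hour review window above is easy to check mechanically. This is a small sketch under the assumption that the resolution time is recorded as a timezone-aware datetime; the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# The 48-hour window comes from the guideline above.
REVIEW_WINDOW = timedelta(hours=48)


def review_deadline(resolved_at: datetime) -> datetime:
    """Latest time the review meeting should take place."""
    return resolved_at + REVIEW_WINDOW


def review_is_overdue(resolved_at: datetime, now: datetime,
                      review_held: bool = False) -> bool:
    """True when no review has been held and the window has passed."""
    return (not review_held) and now > review_deadline(resolved_at)
```

A reminder job could call review_is_overdue for each resolved incident and notify the incident owner when it returns True.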

Incident severity levels

Classification framework

SEV-1 (Critical): Full service outage, data loss, security breach.
SEV-2 (High): Major feature degradation, significant user impact.
SEV-3 (Medium): Partial feature issues, workaround available.
SEV-4 (Low): Minor bugs, cosmetic issues, limited impact.
SEV-1 and SEV-2 incidents always require postmortems.
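
The classification above can be expressed as a small lookup so that tooling applies it consistently. The sketch below is illustrative only: the boolean flags passed to classify are hypothetical inputs, and a real classification usually involves human judgment. The template states only that SEV-1 and SEV-2 always require postmortems, so a False result here means "not mandatory", not "not allowed".

```python
from enum import IntEnum


class Severity(IntEnum):
    SEV1 = 1  # Critical: full service outage, data loss, security breach
    SEV2 = 2  # High: major feature degradation, significant user impact
    SEV3 = 3  # Medium: partial feature issues, workaround available
    SEV4 = 4  # Low: minor bugs, cosmetic issues, limited impact

    @property
    def label(self) -> str:
        return f"SEV-{self.value}"


def classify(full_outage: bool = False, data_loss: bool = False,
             security_breach: bool = False, major_degradation: bool = False,
             partial_issue: bool = False) -> Severity:
    """Map hypothetical incident flags to a severity, most severe first."""
    if full_outage or data_loss or security_breach:
        return Severity.SEV1
    if major_degradation:
        return Severity.SEV2
    if partial_issue:
        return Severity.SEV3
    return Severity.SEV4


def postmortem_required(severity: Severity) -> bool:
    """SEV-1 and SEV-2 always require a postmortem; lower levels are optional."""
    return severity <= Severity.SEV2
```

Using an IntEnum keeps the levels ordered, so "SEV-2 or worse" is a simple comparison.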

Root cause analysis

Finding the real cause

Use the 5 Whys technique to drill down to system failures.
Identify contributing factors beyond the immediate trigger.
Look for process gaps, missing monitoring, or unclear ownership.
Avoid blaming individuals; focus on improving systems.
Document what went well, not just what failed.
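
A 5 Whys analysis is a chain in which each answer becomes the subject of the next "why". One way to record that chain is sketched below; the FiveWhys class and its methods are hypothetical helpers, and treating the last answer as the root cause is a simplifying assumption, since real analyses often branch into several contributing factors.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FiveWhys:
    problem: str
    answers: List[str] = field(default_factory=list)

    def ask(self, answer: str) -> "FiveWhys":
        """Record the answer to 'why?' asked about the previous statement."""
        self.answers.append(answer)
        return self  # returning self allows calls to be chained

    def root_cause(self) -> Optional[str]:
        """The deepest answer so far, or None if no 'why' has been asked."""
        return self.answers[-1] if self.answers else None

    def render(self) -> str:
        """Indented text form suitable for pasting into the postmortem."""
        lines = [f"Problem: {self.problem}"]
        for depth, answer in enumerate(self.answers, start=1):
            lines.append(f"{'  ' * depth}Why {depth}: {answer}")
        return "\n".join(lines)
```

Phrasing each answer as a statement about a system or process, not a person, keeps the chain consistent with the blameless tone.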

Action items

Drive improvements

Each action item must have an owner and due date.
Prioritize actions based on impact and likelihood of recurrence.
Include both immediate fixes and long-term improvements.
Track action items in the team's backlog or incident tracker.
Verify completion with tests, runbooks, or monitoring updates.
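
The rules above (an owner and due date on every item, prioritization by impact and likelihood of recurrence, and a verification step) can be checked automatically. The following is a rough sketch: the ActionItem fields, the 1 to 5 scoring scales, and the choice to multiply impact by likelihood are assumptions for illustration, not part of the template.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class ActionItem:
    title: str
    owner: Optional[str] = None
    due: Optional[date] = None
    impact: int = 1       # assumed scale: 1 (low) to 5 (high)
    likelihood: int = 1   # assumed scale: 1 (unlikely to recur) to 5 (likely)
    verification: str = ""  # e.g. a test, runbook, or monitoring update

    def problems(self) -> List[str]:
        """Reasons this item is not yet ready to be tracked."""
        issues = []
        if not self.owner:
            issues.append("missing owner")
        if self.due is None:
            issues.append("missing due date")
        if not self.verification:
            issues.append("missing verification step")
        return issues

    def priority(self) -> int:
        """Simple score: higher means address sooner."""
        return self.impact * self.likelihood


def prioritized(items: List[ActionItem]) -> List[ActionItem]:
    """Items sorted from highest to lowest priority score."""
    return sorted(items, key=lambda item: item.priority(), reverse=True)
```

Running problems() over every item before closing the postmortem catches follow-ups that would otherwise have no owner, date, or way to confirm completion.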