Postmortem Template

Owner: Anchor MSP Operations Lead
Last reviewed: 2026-05-24

Purpose

Provide a standard template for post-incident reviews. Every Critical and High severity incident requires a completed postmortem within 5 business days of resolution. Medium severity incidents require a postmortem at the Incident Commander's discretion.
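The 5-business-day deadline can be computed mechanically. The sketch below is illustrative only: it assumes business days are Monday through Friday, ignores public holidays, and starts counting on the day after resolution. None of these assumptions is stated in this document, and the function name `postmortem_due_date` is hypothetical.

```python
from datetime import date, timedelta


def postmortem_due_date(resolved_on: date, business_days: int = 5) -> date:
    """Return the date `business_days` working days after the resolution date.

    Assumptions (not defined by the template): Monday-Friday working week,
    no holiday calendar, counting starts the day after resolution.
    """
    current = resolved_on
    remaining = business_days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday=0 ... Friday=4
            remaining -= 1
    return current


# An incident resolved on Wednesday 2026-05-20 would be due the following Wednesday.
print(postmortem_due_date(date(2026, 5, 20)))  # 2026-05-27
```

If a holiday calendar applies, the weekday check is the place to add it.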

Scope

All incidents affecting production systems managed by Anchor that meet the postmortem threshold defined in the Incident Response Procedure.

Template

Copy the template below for each new postmortem. Fill in all sections. Leave no section blank -- if a section does not apply, write "N/A" with a brief explanation.


Incident Summary

Field              | Value
-------------------+----------------------------------------
Incident ID        | INC-XXXX
Date               | YYYY-MM-DD
Duration           | Total time from detection to resolution
Severity           | Critical / High / Medium
Systems Affected   | List all affected systems
Incident Commander | Name
Postmortem Author  | Name
Postmortem Date    | YYYY-MM-DD

One-line summary: A single sentence describing the incident.

Timeline

Provide a chronological list of events from detection through resolution. Use UTC timestamps.

Time (UTC) | Event
-----------+--------------------------------------
HH:MM      | First alert fired / issue detected
HH:MM      | Incident Commander assigned
HH:MM      | Severity classified as [level]
HH:MM      | Containment action taken: [describe]
HH:MM      | Root cause identified: [describe]
HH:MM      | Fix deployed / remediation applied
HH:MM      | Services restored and verified
HH:MM      | Monitoring confirmed green
HH:MM      | Incident closed
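
The Duration field in the Incident Summary can be derived from the first and last relevant timeline entries. The following is a minimal sketch, assuming full UTC date-and-time strings are available (HH:MM alone is ambiguous when an incident crosses midnight). Which timeline row counts as "resolution" is a judgment this document does not specify, and the function name `incident_duration` is hypothetical.

```python
from datetime import datetime, timezone


def incident_duration(detected_utc: str, resolved_utc: str) -> str:
    """Return the elapsed time between two 'YYYY-MM-DD HH:MM' UTC timestamps."""
    fmt = "%Y-%m-%d %H:%M"
    start = datetime.strptime(detected_utc, fmt).replace(tzinfo=timezone.utc)
    end = datetime.strptime(resolved_utc, fmt).replace(tzinfo=timezone.utc)
    if end < start:
        raise ValueError("resolution timestamp precedes detection timestamp")
    total_minutes = int((end - start).total_seconds() // 60)
    hours, minutes = divmod(total_minutes, 60)
    return f"{hours}h {minutes:02d}m"


# Detection late on one day, resolution early on the next.
print(incident_duration("2026-05-19 22:40", "2026-05-20 01:15"))  # 2h 35m
```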

Root Cause Analysis

What Happened

Describe the technical sequence of events that caused the incident. Be specific. Include relevant system names, configuration values, and error messages.

Why It Happened

Describe the underlying reason the incident occurred. Go beyond the immediate technical cause to identify systemic factors. Ask "why" until you reach a root cause that can be addressed with a preventive action.

Contributing Factors

List factors that did not directly cause the incident but made it more likely or more severe:

  1. [Factor 1]
  2. [Factor 2]
  3. [Factor 3]

Impact Assessment

Impact Area       | Details
------------------+------------------------------------------------------
Users affected    | Number and description of affected users/clients
Data impact       | Any data loss, corruption, or unauthorized access
Downtime          | Total service unavailability duration
Financial impact  | Estimated cost (SLA credits, lost revenue, remediation)
Reputation impact | Client trust, public visibility

What Went Well

List things that worked during the incident response. This reinforces good practices.

  1. [Item 1]
  2. [Item 2]
  3. [Item 3]

What Could Be Improved

List things that did not work well or could be done better next time. Be candid. Postmortems are blameless.

  1. [Item 1]
  2. [Item 2]
  3. [Item 3]

Action Items

Every postmortem must produce at least one action item. Action items are specific, assigned, and time-bound.

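# | Action Item       | Owner  | Due Date   | Status
--+-------------------+--------+------------+-------
1 | [Specific action] | [Name] | YYYY-MM-DD | Open
2 | [Specific action] | [Name] | YYYY-MM-DD | Open
3 | [Specific action] | [Name] | YYYY-MM-DD | Open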

Action item status values: Open, In Progress, Complete.
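
The action item rules above (at least one item; each specific, assigned, and time-bound; status limited to the listed values) lend themselves to an automated check before review. This is a sketch under assumptions: the `ActionItem` structure and `validate_action_items` function are hypothetical, and treating a leading "[" as an unfilled placeholder is inferred from the bracketed placeholders in this template.

```python
from dataclasses import dataclass
from datetime import date

# Status values listed in the template.
ALLOWED_STATUSES = {"Open", "In Progress", "Complete"}


@dataclass
class ActionItem:
    action: str
    owner: str
    due_date: str  # expected as YYYY-MM-DD
    status: str = "Open"


def validate_action_items(items: list[ActionItem]) -> list[str]:
    """Return a list of human-readable problems; an empty list means the items pass."""
    problems = []
    if not items:
        problems.append("postmortem has no action items")
    for n, item in enumerate(items, start=1):
        # Assumption: text still starting with "[" is an unfilled template placeholder.
        if not item.action.strip() or item.action.startswith("["):
            problems.append(f"item {n}: action is missing or still a placeholder")
        if not item.owner.strip() or item.owner.startswith("["):
            problems.append(f"item {n}: owner is missing or still a placeholder")
        try:
            date.fromisoformat(item.due_date)
        except ValueError:
            problems.append(f"item {n}: due date does not parse as an ISO date")
        if item.status not in ALLOWED_STATUSES:
            problems.append(f"item {n}: status '{item.status}' is not allowed")
    return problems


draft = [ActionItem("[Specific action]", "[Name]", "YYYY-MM-DD", "Open")]
for problem in validate_action_items(draft):
    print(problem)
```

Run against an unedited copy of the template row, the check reports the placeholder action, the placeholder owner, and the unparseable due date.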

Appendix

Include links and references to supporting materials:

  • Logs: Link to relevant Loki/Grafana log queries
  • Dashboards: Link to Grafana dashboards showing the incident period
  • Alerts: Link to Alertmanager alert history
  • Communications: Link to Slack incident thread
  • Forensics: Link to forensic package in R2 (if applicable)
  • Related incidents: Links to previous related postmortems

Postmortem Review

The completed postmortem is reviewed by the Operations Lead before it is finalized. The reviewer confirms:

  1. The timeline is accurate and complete.
  2. The root cause analysis is thorough (not just symptoms).
  3. Action items are specific, assigned, and have due dates.
  4. The postmortem is blameless and focuses on systemic improvements.

Reviewed by: [Name]
Review date: YYYY-MM-DD

Exceptions

No exceptions. All Critical and High severity incidents require a completed postmortem.