Status Page Management

Automatic on P0; manual updates during incidents. An accurate, current status page builds trust; a stale one destroys it. The status page is what the client checks at 03:00 when the team is awake but silent.

Owners: Tech Lead, Incident Communicator
Phase it lives in: How We Build → After We Build
The corpus principle this enacts: Trust is the product.

How to do this

  • Automatic on P0 — monitoring fires, status page updates within minutes. No human in the loop for the initial banner.
  • Manual updates — at minimum every 30 minutes during an incident. "Investigating," "Identified," "Monitoring," "Resolved." No information is itself information; silence is the worst signal.
  • Plain language: "Some users are experiencing slow grading submissions." Not "transient latency anomaly in submission queue."
  • Date-stamped — every update has a timestamp the reader can trust.
  • Resolution post — when the incident is resolved, the status page says so explicitly. Not just "issue resolved" — "Submissions are processing normally. We've identified the cause and are running a follow-up to prevent recurrence."

What good practice looks like

The JWT outage's timeline included an automatic status page update at 09:43 (when PagerDuty fired) and a resolution post at 10:05 (4 minutes after the revert). The client saw the team responding in real time and knew when the incident was resolved, and the postmortem reference followed within 48 hours. The trust cost was real but contained: the team owned the timeline, not the customer's panic.

A team that leaves the status page green during an incident because they're still investigating produces the worst outcome: customers see contradictory information (red on their screen, green on the page), conclude the team is incompetent or dishonest, and the next incident starts from a deeper trust deficit.
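
The green-page failure mode can be caught mechanically. The sketch below assumes two inputs that a team would have to supply from its own monitoring and status page tooling: a count of open P0 alerts and a flag for whether the page currently shows everything as operational. The function name `page_contradicts_monitoring` is hypothetical.

```python
def page_contradicts_monitoring(open_p0_alerts: int, page_shows_operational: bool) -> bool:
    """Return True when the page says "all good" while a P0 alert is open.

    This is the contradiction customers notice first: red on their
    screen, green on the status page.
    """
    return open_p0_alerts > 0 and page_shows_operational


# Example: one open P0 alert while the page is still green.
if page_contradicts_monitoring(open_p0_alerts=1, page_shows_operational=True):
    print("Status page is green during an open P0; post an update now.")
```

The check is deliberately one-directional: it flags a green page during an open P0, but not a page that still shows an incident after alerts have cleared, since the resolution post is a human decision.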

The discipline: status page accuracy is more important than status page beauty. A team that posts "we're investigating, will update in 20 min" within 5 minutes is a team customers learn to trust.

200apps · How We Work · NWIRE