SLA Definition

An SLA is an external promise the team makes to the client: contractual, measurable, with consequences for missing it. SLAs fall into five categories, and each is derived from an SLO with a safety margin.

Owners: PO, Tech Lead
Phase it lives in: After We Build (Volume V)
The corpus principle this enacts: The chain is the infrastructure that makes trust computable.

Where it lives in the chain

The five categories

| Category | What it promises | Measured by |
|---|---|---|
| Availability | Percentage of time the system is operational. "99.5% uptime per month" = ≤3.6h downtime. | Monitoring, not user reports. |
| Response time | How quickly the system responds. Per critical flow. | p95 latency under named threshold. |
| Support response time | How quickly L1 acknowledges, L2 investigates, L3 resolves. | Per priority level. |
| Resolution time | How quickly a reported issue is resolved. P0: 4h. P1: 1 business day. P2: 5 business days. | Contractual; not aspirational. |
| Data integrity | The commitment that data is stored correctly and not lost. Hardest to recover from when breached: data loss erodes trust faster than any other failure. | Sampled checks; audit log integrity. |
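
The availability and response-time rows reduce to simple arithmetic. The following is a minimal sketch in Python, assuming a 30-day month for the downtime budget and the nearest-rank method for p95. The function names and the 2000 ms threshold are illustrative assumptions, not part of any monitoring tool or of the contract itself.

```python
from typing import Sequence


def downtime_budget_hours(availability_pct: float, days_in_month: int = 30) -> float:
    """Hours of downtime a monthly availability target allows.

    Assumes a 30-day month by default; a real contract should name the period.
    """
    total_hours = days_in_month * 24
    return (100.0 - availability_pct) / 100.0 * total_hours


def p95(latencies_ms: Sequence[float]) -> float:
    """95th-percentile latency, nearest-rank method (an assumed convention)."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies_ms)
    # ceil(0.95 * n) computed with integers to avoid float rounding surprises
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def meets_latency_target(latencies_ms: Sequence[float], threshold_ms: float) -> bool:
    """True when the p95 of the samples is at or under the named threshold."""
    return p95(latencies_ms) <= threshold_ms


# 99.5% over a 30-day month leaves 0.5% of 720 hours
print(f"{downtime_budget_hours(99.5):.1f} h")  # prints "3.6 h"

# Hypothetical samples for one critical flow, checked against a 2000 ms threshold
samples = [100.0] * 19 + [5000.0]
print(meets_latency_target(samples, threshold_ms=2000.0))  # prints "True"
```

One slow request out of twenty does not move the p95 here, which is why the table measures response time by percentile rather than by worst case.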

How to derive an SLA from an SLO

The SLO says "99% of submissions in under 2 seconds." The SLA promises the client 95%. The gap, a margin of 4 percentage points, is the team's operational safety buffer.

  • If the SLO is breached, the team has time to fix before the SLA is breached.
  • If the SLA is breached, the client relationship absorbs the cost.

A team with no margin between SLO and SLA is a team where every operational hiccup becomes a contractual breach.
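
The relationship between the two thresholds can be sketched as a small classifier. This is an illustration only, assuming both targets are expressed as a percentage of submissions under the same latency threshold. The names `margin_points` and `classify` and the status strings are hypothetical.

```python
def margin_points(slo_pct: float, sla_pct: float) -> float:
    """Safety margin between the internal SLO and the external SLA, in percentage points."""
    if sla_pct > slo_pct:
        raise ValueError("the SLA should not be stricter than the SLO it derives from")
    return slo_pct - sla_pct


def classify(observed_pct: float, slo_pct: float = 99.0, sla_pct: float = 95.0) -> str:
    """Place an observed success rate relative to the SLO and the SLA."""
    if observed_pct < sla_pct:
        return "sla_breached"   # contractual breach: the client relationship absorbs the cost
    if observed_pct < slo_pct:
        return "slo_breached"   # internal breach: time to fix before the SLA is hit
    return "healthy"


print(margin_points(99.0, 95.0))  # prints "4.0"
print(classify(97.0))             # prints "slo_breached"

# With no margin (SLO == SLA), the same kind of miss is immediately contractual
print(classify(98.9, slo_pct=99.0, sla_pct=99.0))  # prints "sla_breached"
```

The last call shows the zero-margin case: there is no "slo_breached" state left to absorb an operational hiccup.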

What good practice looks like

The SLA is a living contract, not a marketing document. The PO maintains it, reviews it quarterly with the client (SLA Review), and adjusts it as the system matures. Tightening the SLA as confidence grows is a sign of a healthy relationship; loosening it after a breach is a sign of an honest one.

The SLA names what happens on breach — credits, escalation, named recovery commitments. Without that, the SLA is a number with no teeth. The breach protocol (SLA Monitoring) names the steps when the threshold approaches and when it crosses.
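
The "approaches" and "crosses" states can be made explicit in monitoring code. The sketch below is an assumption-laden illustration, not the SLA Monitoring protocol itself: the 80% warning fraction, the status names, and the action lists are placeholders that a real contract would define.

```python
def budget_status(downtime_hours: float, budget_hours: float, warn_fraction: float = 0.8) -> str:
    """Classify consumed downtime against the monthly budget.

    warn_fraction is a hypothetical early-warning level, not a contractual value.
    """
    if budget_hours <= 0:
        raise ValueError("budget_hours must be positive")
    used = downtime_hours / budget_hours
    if used > 1.0:
        return "crossed"
    if used >= warn_fraction:
        return "approaching"
    return "ok"


# Placeholder steps; the actual breach protocol belongs in the SLA and in SLA Monitoring
ACTIONS = {
    "ok": [],
    "approaching": ["notify PO and Tech Lead", "freeze risky changes"],
    "crossed": ["apply credits", "escalate", "issue named recovery commitment"],
}

status = budget_status(downtime_hours=3.0, budget_hours=3.6)
print(status)           # prints "approaching"
print(ACTIONS[status])  # prints the placeholder steps for that state
```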

200apps · How We Work · NWIRE