Ongoing Operations & Client · master area
SLA Definition
An SLA is an external promise the team makes to the client — contractual, measurable, with consequences for missing it. Five categories. Derived from SLOs with a safety margin.
Owners: PO, Tech Lead Phase it lives in: After We Build (Volume V) The corpus principle this enacts: The chain is the infrastructure that makes trust computable.
Where it lives in the chain
The five categories
| Category | What it promises | Measured by |
|---|---|---|
| Availability | Percentage of time the system is operational. "99.5% uptime per month" = ≤3.6h downtime. | Monitoring, not user reports. |
| Response time | How quickly the system responds. Per critical flow. | p95 latency under named threshold. |
| Support response time | How quickly L1 acknowledges, L2 investigates, L3 resolves. | Per priority level. |
| Resolution time | How quickly a reported issue is resolved. P0: 4h. P1: 1 business day. P2: 5 business days. | Contractual; not aspirational. |
| Data integrity | The commitment that data is stored correctly and not lost. Hardest to recover from when breached — data loss erodes trust faster than any other failure. | Sampled checks; audit log integrity. |
How to derive an SLA from an SLO
The SLO says "99% of submissions in under 2 seconds." The SLA promises the client 95%. The gap — the 4% margin — is the team's operational safety buffer.
- If the SLO is breached, the team has time to fix before the SLA is breached.
- If the SLA is breached, the client relationship absorbs the cost.
A team with no margin between SLO and SLA is a team where every operational hiccup becomes a contractual breach.
What good practice looks like
The SLA is a living contract, not a marketing document. The PO maintains it, reviews it quarterly with the client (SLA Review), and adjusts it as the system matures. Tightening the SLA as confidence grows is a sign of a healthy relationship; loosening it after a breach is a sign of an honest one.
The SLA names what happens on breach — credits, escalation, named recovery commitments. Without that, the SLA is a number with no teeth. The breach protocol (SLA Monitoring) names the steps when the threshold approaches and when it crosses.
Related crafts
- SLA Monitoring & Breach Protocol
- SLA Review Cadence
- Ilities Selection — where the technical underpinnings of the SLA are named