part three · feature flags
Feature Flags
Wrapping new behavior so rollback is one switch.
A feature flag is a runtime switch that wraps new behavior. With flags, rollback is one click: no redeploy, no rebuild. The team can ship code to production days before the feature is enabled, enable it for a pilot group before everyone else, and disable it at the first sign of trouble.
This is the foundation that makes trunk-based development (Part 2) safe.
The lifecycle
Every flag has a lifecycle. The corpus is opinionated about each step.
| Step | What happens | Owner |
|---|---|---|
| Create | Flag is registered in the platform with a name, description, default state | Developer + Tech Lead |
| Wire | Code conditionally executes based on the flag | Developer |
| Test | Both flag-on and flag-off paths are tested in staging | QA |
| Enable | Flag is turned on for the target audience (pilot, percentage, all) | PO + Tech Lead |
| Stabilise | Flag remains in code while behavior is observed | Tech Lead |
| Clean up | Flag is removed; the new behavior becomes the only path | Developer |
The last step is the one most teams skip. A flag that has been on for six months is no longer a flag — it is a legacy if statement. Cleanup is a story like any other.
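To make the "skipped last step" visible, the lifecycle can be modelled as data. The following is a minimal sketch, not the corpus's actual tooling: the step names and owners are copied from the table above, while `STEPS`, `OWNER`, and `nextStep` are hypothetical names introduced for illustration.

```typescript
// Steps in the order given by the lifecycle table.
const STEPS = ["Create", "Wire", "Test", "Enable", "Stabilise", "Clean up"] as const;
type Step = (typeof STEPS)[number];

// Owners copied from the table; the Record shape is an assumption.
const OWNER: Record<Step, string> = {
  "Create": "Developer + Tech Lead",
  "Wire": "Developer",
  "Test": "QA",
  "Enable": "PO + Tech Lead",
  "Stabilise": "Tech Lead",
  "Clean up": "Developer",
};

// Returns the step that follows `current`, or null after the last step.
function nextStep(current: Step): Step | null {
  const index = STEPS.indexOf(current);
  return STEPS[index + 1] ?? null;
}
```

With a record like this, a flag that sits at "Stabilise" still has a named next step and a named owner, which is what turns cleanup into a story rather than an intention.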
What flags are for
Three jobs.
- Rollback without redeploy. The most common reason. New behavior misbehaves; flag off; investigate calmly.
- Gradual enablement. Pilot one customer; if good, enable 5%; if good, 25%; if good, all.
- A/B testing. Compare two paths simultaneously. (Less common in the corpus pattern; we tend to predict and check rather than A/B.)
Naming flags
Flag names follow the same domain-language discipline as code (Part 1): they use the brief's vocabulary.
| Bad flag name | Good flag name |
|---|---|
| feature_x | grading.hebrew-names |
| new_ui | grading.flow-v2 |
| experiment_3 | grading.keyboard-shortcuts |
The format is {area}.{specific-thing}. Areas group flags so the platform's UI is navigable.
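The format can be checked mechanically at registration time. This is a sketch under assumptions: the text only fixes the `{area}.{specific-thing}` shape, so the character rules here (lowercase words joined by hyphens) are inferred from the good examples in the table, and `parseFlagName` is a hypothetical helper.

```typescript
// Assumed rule: lowercase alphanumeric words joined by hyphens,
// with exactly one dot between the area and the specific thing.
const FLAG_NAME = /^[a-z][a-z0-9]*(?:-[a-z0-9]+)*\.[a-z][a-z0-9]*(?:-[a-z0-9]+)*$/;

// Splits a valid flag name into its two parts; returns null for anything else.
function parseFlagName(name: string): { area: string; thing: string } | null {
  if (!FLAG_NAME.test(name)) return null;
  const dot = name.indexOf(".");
  return { area: name.slice(0, dot), thing: name.slice(dot + 1) };
}
```

Under these rules `grading.flow-v2` parses into the area `grading` and the thing `flow-v2`, while `feature_x` is rejected because it has no area.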
Wiring patterns
The corpus uses a small wrapper. Every flag check goes through it.
```typescript
// Bad: scattered inline checks
if (process.env.FLAG_HEBREW === 'on') { /* ... */ }

// Good: typed flag client
import { flag } from '@/flags'

if (flag('grading.hebrew-names', { user: gal })) {
  // new behavior
} else {
  // old behavior
}
```

The wrapper:
- Is the only place that talks to the platform SDK.
- Carries context (user, session) so flags can target.
- Logs flag evaluations as observability events.
- Defaults to off if the platform is unreachable.
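The four properties above can be sketched in a few lines. This is an illustration, not the corpus's real `@/flags` module: `FlagPlatform`, `FlagContext`, `FlagEvaluation`, and `createFlagClient` are hypothetical names, and the platform SDK is represented by a stand-in interface rather than any real vendor API.

```typescript
// Context carried with every check so flags can target (assumed shape).
interface FlagContext {
  user?: string;
  session?: string;
}

// Stand-in for the platform SDK; a real SDK would be adapted to this shape.
interface FlagPlatform {
  isEnabled(name: string, context: FlagContext): boolean;
}

// Observability event emitted for every evaluation.
interface FlagEvaluation {
  name: string;
  context: FlagContext;
  enabled: boolean;
  fallback: boolean; // true when the platform could not be reached
}

// Builds the single `flag` function that the rest of the code calls.
function createFlagClient(
  platform: FlagPlatform,
  log: (evaluation: FlagEvaluation) => void,
) {
  return function flag(name: string, context: FlagContext = {}): boolean {
    let enabled = false;
    let fallback = false;
    try {
      enabled = platform.isEnabled(name, context);
    } catch {
      // Platform unreachable: default to off so the old path runs.
      fallback = true;
    }
    log({ name, context, enabled, fallback });
    return enabled;
  };
}
```

Injecting the platform and the logger keeps the SDK dependency in one place and lets tests swap in a fake platform without touching call sites.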
What both paths need
A flag wraps two paths — old and new. Both must be tested. Both must be runnable in staging. The flag-on path is what the cycle ships. The flag-off path is the rollback target.
The QA verification (Part 5) explicitly tests both. "I tested with the flag on" is half a verification.
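One way to keep either path from being skipped is a helper that runs the same check once per flag state. The sketch below assumes the flag check is injected into the code under test. `displayName` is an invented example of flagged behavior, and `verifyBothPaths` is a hypothetical helper, not part of any test framework.

```typescript
type FlagCheck = (name: string) => boolean;

// Invented code under test: picks a path based on the injected flag check.
function displayName(
  student: { latin: string; hebrew?: string },
  flag: FlagCheck,
): string {
  if (flag("grading.hebrew-names") && student.hebrew) {
    return student.hebrew; // new behavior
  }
  return student.latin; // old behavior, the rollback target
}

// Calls `check` twice: once with the named flag on, once with it off.
function verifyBothPaths(
  name: string,
  check: (flag: FlagCheck, enabled: boolean) => void,
): void {
  for (const enabled of [true, false]) {
    check((requested) => requested === name && enabled, enabled);
  }
}
```

A verification written against `verifyBothPaths` cannot pass with only the flag-on path exercised, because the flag-off run is part of the same call.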
Targeting
Flags can target by:
- User — specific named users (the pilot).
- Account — specific customer organisations.
- Percentage — random N% of all traffic.
- Attribute — any dimension the platform supports (region, plan, role).
The targeting is part of the rollout plan (Part 7). It is named in advance, not invented during enablement.
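Because the targeting is named in advance, it can be written down as data and evaluated by a small function. The sketch below makes several assumptions: the `Targeting` and `Subject` shapes are hypothetical, the four rules are combined with OR, and the percentage rule uses a deterministic hash of flag name and user instead of a random draw, so a given user gets the same answer on every request.

```typescript
// Hypothetical targeting rule; field names are assumptions, not a platform schema.
interface Targeting {
  users?: string[];                    // specific named users (the pilot)
  accounts?: string[];                 // specific customer organisations
  percentage?: number;                 // 0 to 100
  attributes?: Record<string, string>; // e.g. { region: "eu" }
}

interface Subject {
  user: string;
  account: string;
  attributes: Record<string, string>;
}

// Maps a key to a stable bucket in 0..99 with a simple polynomial hash.
function bucket(key: string): number {
  let hash = 0;
  for (let i = 0; i < key.length; i++) {
    hash = (hash * 31 + key.charCodeAt(i)) >>> 0;
  }
  return hash % 100;
}

// True if any rule matches; an empty rule targets nobody.
function isTargeted(flagName: string, rule: Targeting, subject: Subject): boolean {
  if (rule.users && rule.users.indexOf(subject.user) !== -1) return true;
  if (rule.accounts && rule.accounts.indexOf(subject.account) !== -1) return true;
  const wanted = rule.attributes ?? {};
  const keys = Object.keys(wanted);
  if (keys.length > 0 && keys.every((key) => subject.attributes[key] === wanted[key])) {
    return true;
  }
  // Hashing flag name plus user keeps each user's bucket stable per flag.
  return bucket(`${flagName}:${subject.user}`) < (rule.percentage ?? 0);
}
```

Including the flag name in the hash key means the same users are not always first in line for every rollout.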
Cleanup as a Volume V act
A flag that has been on at 100% for two cycles should be cleaned up. The corpus pattern: cleanup stories appear in the next cycle's slice as part of the unfinished business the model update surfaces.
Failure to clean up flags produces:
- Cognitive load — every reader of the code wonders if the flag is still meaningful.
- Configuration drift — the platform has hundreds of stale flags; finding the active ones is hard.
- Real bugs — when someone toggles a stale flag thinking it does something different.
A flag-cleanup discipline is part of operational hygiene. The portfolio review (Volume V Part 9) reads flag count and age trends as a chain-health signal.
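The two-cycle rule is simple enough to automate as a report. This is a sketch only: `FlagRecord` and its fields are hypothetical, and it assumes the platform (or a script around it) can say how many completed cycles a flag has spent at 100%.

```typescript
// Hypothetical record of a flag's current state.
interface FlagRecord {
  name: string;
  percentage: number;   // current enablement, 0 to 100
  cyclesAtFull: number; // completed cycles spent at 100%
}

// The threshold stated in the text: on at 100% for two cycles.
const CLEANUP_AFTER_CYCLES = 2;

// Returns the names of flags that are due a cleanup story.
function cleanupCandidates(flags: FlagRecord[]): string[] {
  return flags
    .filter((f) => f.percentage === 100 && f.cyclesAtFull >= CLEANUP_AFTER_CYCLES)
    .map((f) => f.name);
}
```

The length of this list over time is one possible input to the flag count and age trends that the portfolio review reads.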
When flags are wrong
Some changes should not be flagged. The flag adds noise without value.
- Pure refactors. No behavior change. No need to flag.
- Changes too tightly coupled to flag. If the flag-on and flag-off paths diverge so much that the codebase has to maintain two implementations, the flag has become a fork.
- Schema changes. A migration cannot be flagged at the data layer the same way as a UI change. Schema changes are managed differently — see Part 8 — Runbooks & Rollback.
The Tech Lead decides what is flagged. The default: flag user-facing behavior changes; do not flag pure refactors or schema work. Deviations are recorded.