Feature Flags

Wrapping new behavior so rollback is one switch.

A feature flag is a runtime switch that wraps new behavior. With flags, rollback is one click: no redeploy, no rebuild. The team can ship code to production days before the feature is enabled, enable it for a pilot group before everyone else, and disable it at the first sign of trouble.

This is the foundation that makes trunk-based development (Part 2) safe.

The lifecycle

Every flag has a lifecycle. The corpus is opinionated about each step.

Step      | What happens                                                                | Owner
----------|-----------------------------------------------------------------------------|----------------------
Create    | Flag is registered in the platform with a name, description, default state  | Developer + Tech Lead
Wire      | Code conditionally executes based on the flag                               | Developer
Test      | Both flag-on and flag-off paths are tested in staging                       | QA
Enable    | Flag is turned on for the target audience (pilot, percentage, all)          | PO + Tech Lead
Stabilise | Flag remains in code while behavior is observed                             | Tech Lead
Clean up  | Flag is removed; the new behavior becomes the only path                     | Developer

The last step is the one most teams skip. A flag that has been on for six months is no longer a flag — it is a legacy if statement. Cleanup is a story like any other.
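
One way to keep the lifecycle honest is to model it as an ordered list of steps, so that skipping a step (for example, enabling before testing) is rejected. The sketch below is illustrative only: the names `FlagStep`, `STEP_ORDER`, `STEP_OWNER`, `nextStep`, and `canAdvance` are hypothetical and not part of any platform API.

```typescript
// Hypothetical model of the lifecycle table; all names are illustrative.
type FlagStep = 'create' | 'wire' | 'test' | 'enable' | 'stabilise' | 'cleanup';

const STEP_ORDER: readonly FlagStep[] = [
  'create', 'wire', 'test', 'enable', 'stabilise', 'cleanup',
];

// Owners as listed in the lifecycle table.
const STEP_OWNER: Record<FlagStep, string> = {
  create: 'Developer + Tech Lead',
  wire: 'Developer',
  test: 'QA',
  enable: 'PO + Tech Lead',
  stabilise: 'Tech Lead',
  cleanup: 'Developer',
};

// Returns the step that follows `current`, or null once the flag is cleaned up.
function nextStep(current: FlagStep): FlagStep | null {
  return STEP_ORDER[STEP_ORDER.indexOf(current) + 1] ?? null;
}

// A transition is valid only if it moves exactly one step forward,
// so a flag cannot be enabled before both paths are tested.
function canAdvance(from: FlagStep, to: FlagStep): boolean {
  return nextStep(from) === to;
}
```

For instance, `canAdvance('wire', 'enable')` is false because the test step would be skipped.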

What flags are for

Three jobs.

  1. Rollback without redeploy. The most common reason. New behavior misbehaves; flag off; investigate calmly.
  2. Gradual enablement. Pilot one customer; if good, enable 5%; if good, 25%; if good, all.
  3. A/B testing. Compare two paths simultaneously. (Less common in the corpus pattern; we tend to predict and check rather than A/B.)
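
The gradual-enablement job can be sketched as a ladder that climbs one rung after each good observation and drops to off after a bad one. This is a minimal sketch under assumptions: the `Stage` type, the `LADDER` constant, and `nextStage` are hypothetical names, and the `healthy` input stands in for whatever check the team actually uses.

```typescript
// Hypothetical rollout ladder taken from the list above: pilot, 5%, 25%, all.
type Stage =
  | { kind: 'off' }
  | { kind: 'pilot' }
  | { kind: 'percentage'; percent: number };

const LADDER: readonly Stage[] = [
  { kind: 'pilot' },
  { kind: 'percentage', percent: 5 },
  { kind: 'percentage', percent: 25 },
  { kind: 'percentage', percent: 100 },
];

// `rung` is the current index into LADDER, or -1 when the flag is off.
// A bad observation switches the flag off (rollback without redeploy);
// a good one climbs one rung and stays on the top rung once everyone is enabled.
function nextStage(rung: number, healthy: boolean): { rung: number; stage: Stage } {
  if (!healthy) {
    return { rung: -1, stage: { kind: 'off' } };
  }
  const next = Math.min(rung + 1, LADDER.length - 1);
  return { rung: next, stage: LADDER[next] };
}
```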

Naming flags

The same domain-language discipline as code (Part 1). Flag names use the brief's vocabulary.

Bad flag name | Good flag name
--------------|---------------------------
feature_x     | grading.hebrew-names
new_ui        | grading.flow-v2
experiment_3  | grading.keyboard-shortcuts

The format is {area}.{specific-thing}. Areas group flags so the platform's UI is navigable.
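
A naming rule like this is easy to check mechanically, for example at flag registration. The sketch below assumes lowercase kebab-case on each side of a single dot; the text only specifies `{area}.{specific-thing}`, so the exact character rules and the names `isValidFlagName` and `parseFlagName` are assumptions.

```typescript
// Assumption: lowercase kebab-case segments separated by exactly one dot.
// The document only specifies {area}.{specific-thing}; the rest is illustrative.
const FLAG_NAME = /^[a-z][a-z0-9]*(-[a-z0-9]+)*\.[a-z][a-z0-9]*(-[a-z0-9]+)*$/;

function isValidFlagName(flagName: string): boolean {
  return FLAG_NAME.test(flagName);
}

// Splits a valid name into its area and specific part; throws on invalid names.
function parseFlagName(flagName: string): { area: string; thing: string } {
  if (!isValidFlagName(flagName)) {
    throw new Error(`Flag name must look like {area}.{specific-thing}: ${flagName}`);
  }
  const [area, thing] = flagName.split('.');
  return { area, thing };
}
```

Under these rules the three good names from the table pass and the three bad names fail, because underscores and a missing area are both rejected.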

Wiring patterns

The corpus uses a small wrapper. Every flag check goes through it.

typescript
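// Bad: scattered inline checks against raw environment variables
if (process.env.FLAG_HEBREW === 'on') {
  // new behavior
}

// Good: typed flag client; every check goes through the wrapper
import { flag } from '@/flags'

// `gal` stands for the current user object, defined elsewhere
if (flag('grading.hebrew-names', { user: gal })) {
  // new behavior
} else {
  // old behavior
}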

The wrapper:

  • Is the only place that talks to the platform SDK.
  • Carries context (user, session) so flags can target.
  • Logs flag evaluations as observability events.
  • Defaults to off if the platform is unreachable.
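
The four properties above could be satisfied by a wrapper shaped roughly like the following. This is a sketch, not the corpus implementation: the platform SDK is not named in this document, so `PlatformClient`, `FlagContext`, `FlagEvaluation`, and `createFlagClient` are hypothetical, and the SDK is assumed to throw when the platform is unreachable.

```typescript
// Hypothetical types: the real platform SDK is not named in this document.
interface FlagContext {
  user?: { id: string };
  sessionId?: string;
}

interface PlatformClient {
  // Assumed to throw when the platform is unreachable.
  isEnabled(flagName: string, context: FlagContext): boolean;
}

interface FlagEvaluation {
  flagName: string;
  enabled: boolean;
  context: FlagContext;
  fallback: boolean; // true when the default was used instead of the platform's answer
}

// Builds the single `flag` function the rest of the codebase imports.
function createFlagClient(
  platform: PlatformClient,
  logEvaluation: (evaluation: FlagEvaluation) => void,
) {
  return function flag(flagName: string, context: FlagContext = {}): boolean {
    let enabled = false;
    let fallback = false;
    try {
      // The only call site that talks to the platform SDK.
      enabled = platform.isEnabled(flagName, context);
    } catch {
      // Platform unreachable: default to off.
      fallback = true;
    }
    // Every evaluation is emitted as an observability event.
    logEvaluation({ flagName, enabled, context, fallback });
    return enabled;
  };
}
```

Injecting the platform client and the logger keeps the wrapper testable without a live platform, which also makes the default-to-off behavior easy to verify.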

What both paths need

A flag wraps two paths — old and new. Both must be tested. Both must be runnable in staging. The flag-on path is what the cycle ships. The flag-off path is the rollback target.

The QA verification (Part 5) explicitly tests both. "I tested with the flag on" is half a verification.
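
In automated tests, the same rule means every flagged behavior gets one case per flag state. The sketch below uses a made-up function, `formatGrade`, purely for illustration; the `flagOn` parameter stands in for a call to the flag wrapper so the example has no dependencies.

```typescript
// Hypothetical behavior behind a flag; `flagOn` stands in for a call to the
// flag wrapper so the example is self-contained.
function formatGrade(score: number, flagOn: boolean): string {
  if (flagOn) {
    return `${score}%`; // new behavior
  }
  return `${score}/100`; // old behavior, the rollback target
}

// One case per flag state; covering only `true` would be half a verification.
const bothPaths: Array<{ flagOn: boolean; expected: string }> = [
  { flagOn: true, expected: '87%' },
  { flagOn: false, expected: '87/100' },
];

// Runs both cases and returns a description of each failure (empty when all pass).
function verifyBothPaths(): string[] {
  const failures: string[] = [];
  for (const { flagOn, expected } of bothPaths) {
    const actual = formatGrade(87, flagOn);
    if (actual !== expected) {
      failures.push(`flag ${flagOn ? 'on' : 'off'}: expected ${expected}, got ${actual}`);
    }
  }
  return failures;
}
```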

Targeting

Flags can target by:

  • User — specific named users (the pilot).
  • Account — specific customer organisations.
  • Percentage — random N% of all traffic.
  • Attribute — any dimension the platform supports (region, plan, role).

The targeting is part of the rollout plan (Part 7). It is named in advance, not invented during enablement.
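
A targeting plan named in advance can be written down as data and evaluated by one function. The sketch below is hypothetical throughout: `TargetingPlan`, `EvaluationSubject`, `bucket`, and `isTargeted` are illustrative names, a match on any rule is assumed to enable the flag, and the percentage rule uses a deterministic hash rather than a random draw so that a given user stays in the same group across evaluations. A real platform may combine rules differently.

```typescript
// Hypothetical shapes mirroring the four targeting dimensions listed above.
interface EvaluationSubject {
  userId: string;
  accountId: string;
  attributes: Record<string, string>;
}

interface TargetingPlan {
  users: string[];                      // named pilot users
  accounts: string[];                   // customer organisations
  percentage: number;                   // 0 to 100
  attributes: Record<string, string[]>; // e.g. { region: ['il'] }
}

// Simple non-cryptographic string hash, used only to place each user in a
// stable bucket from 0 to 99 per flag.
function bucket(flagName: string, userId: string): number {
  const key = `${flagName}:${userId}`;
  let hash = 0;
  for (let i = 0; i < key.length; i++) {
    hash = (hash * 31 + key.charCodeAt(i)) >>> 0; // keep within 32 bits
  }
  return hash % 100;
}

// Assumption: a match on any one rule enables the flag for this subject.
function isTargeted(flagName: string, plan: TargetingPlan, subject: EvaluationSubject): boolean {
  if (plan.users.indexOf(subject.userId) !== -1) return true;       // User
  if (plan.accounts.indexOf(subject.accountId) !== -1) return true; // Account
  for (const key of Object.keys(plan.attributes)) {                 // Attribute
    const value = subject.attributes[key];
    if (value !== undefined && plan.attributes[key].indexOf(value) !== -1) return true;
  }
  return bucket(flagName, subject.userId) < plan.percentage;        // Percentage
}
```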

Cleanup as a Volume V act

A flag that has been on at 100% for two cycles should be cleaned up. The corpus pattern: cleanup stories appear in the next cycle's slice as part of the unfinished business the model update surfaces.

Failure to clean up flags produces:

  • Cognitive load — every reader of the code wonders if the flag is still meaningful.
  • Configuration drift — the platform has hundreds of stale flags; finding the active ones is hard.
  • Real bugs — when someone toggles a stale flag thinking it does something different.

A flag-cleanup discipline is part of operational hygiene. The portfolio review (Volume V Part 9) reads flag count and age trends as a chain-health signal.
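
The two-cycle rule lends itself to a small report that lists flags due for cleanup. This is a sketch under assumptions: the `FlagRecord` shape and the `dueForCleanup` function are hypothetical, since the platform's real export format is not described here.

```typescript
// Hypothetical record shape; the platform's real export format is not described here.
interface FlagRecord {
  flagName: string;
  percentEnabled: number;       // 0 to 100
  cyclesAtCurrentState: number; // completed cycles since the last change
}

// From the rule above: on at 100% for two cycles means the flag should be cleaned up.
const CLEANUP_AFTER_CYCLES = 2;

// Returns the names of flags that are due for a cleanup story, sorted for stable output.
function dueForCleanup(flags: FlagRecord[]): string[] {
  return flags
    .filter((f) => f.percentEnabled === 100 && f.cyclesAtCurrentState >= CLEANUP_AFTER_CYCLES)
    .map((f) => f.flagName)
    .sort();
}
```

The length of the returned list, tracked per cycle, gives a simple count trend of the kind a portfolio review could read.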

When flags are wrong

Some changes should not be flagged. The flag adds noise without value.

  • Pure refactors. No behavior change. No need to flag.
  • Changes too tightly coupled to flag. If the flag-on and flag-off paths diverge so much that the codebase has to maintain two implementations, the flag has become a fork.
  • Schema changes. A migration cannot be flagged at the data layer the same way as a UI change. Schema changes are managed differently — see Part 8 — Runbooks & Rollback.

The Tech Lead decides what is flagged. The default: flag user-facing behavior changes; do not flag pure refactors or schema work. Deviations are recorded.

200apps · How We Work · NWIRE