ChangeOps Governance

ChangeOps turns FleetForge’s delivery, policy, and replay guarantees into release gates. Instead of shipping code or prompt changes on trust alone, every change bundle must earn an explicit decision backed by telemetry, evals, and replay evidence.

Why ChangeOps gates exist

FleetForge operates fleets of agents that interact with users, budgets, and regulated data. Gates provide:

  • Shared criteria – the same policy engine that guards production runs evaluates change bundles, keeping review standards consistent.
  • Data-backed approvals – novelty scores, replay parity, eval results, and budget impact travel with each change so approvers see objective evidence.
  • Auditability – decisions and follow-ups persist to Postgres, emit OTEL telemetry, and surface as artifacts that downstream tooling can consume.

Gate inputs

Gate evaluations expect a change bundle (typically JSON) that includes:

  • Diff summary – the files, prompts, or adapters modified, plus novelty metrics.
  • Evaluation outputs – recent eval pack runs covering safety, cost, and regression criteria.
  • Replay artifacts – replay IDs and parity checks that prove deterministic behavior still holds.
  • Budget analysis – projected cost deltas or policy budget impacts.

Pipelines usually assemble this bundle before invoking the gate; see the ChangeOps CI how-to for a concrete example.
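
As a rough illustration of the four inputs above, the sketch below assembles a bundle in Python and serializes it to JSON. Every field name (diff_summary, eval_outputs, replay, budget, novelty_score, and so on) is a placeholder chosen for this example, not the schema the gate engine expects; consult the ChangeOps reference for the real shape.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class ChangeBundle:
    """Hypothetical bundle layout; the field names are illustrative assumptions."""
    diff_summary: dict   # files, prompts, or adapters modified, plus novelty metrics
    eval_outputs: list   # recent eval pack runs
    replay: dict         # replay IDs and parity checks
    budget: dict         # projected cost deltas or policy budget impacts


def missing_sections(bundle: ChangeBundle) -> list:
    """Return the names of bundle sections that are empty."""
    return [name for name, value in asdict(bundle).items() if not value]


bundle = ChangeBundle(
    diff_summary={"files": ["prompts/triage.md"], "novelty_score": 0.12},
    eval_outputs=[{"pack": "safety", "passed": True}],
    replay={"replay_ids": ["replay-001"], "parity": True},
    budget={"projected_cost_delta_pct": 0.4},
)

# Serialize for a later `fleetforge-ctl gates check --input bundle.json` call.
bundle_json = json.dumps(asdict(bundle), indent=2)
```

Checking for empty sections before invoking the gate is a cheap way to catch an incomplete pipeline step early, independent of whatever validation the gate itself performs.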

Gate outputs & decision artifacts

The ChangeOps engine returns one of three outcomes:

  • allow – safe to ship; metadata includes the evidence that unlocked approval.
  • follow_up – additional action required (for example, missing eval coverage); approvers record the follow-up via the CLI or API.
  • deny – gate fails and the change must be reworked before a new evaluation.

Every decision writes:

  • Database records in changeops_gates (decision snapshot) and changeops_followups (acknowledgements or overrides).
  • Artifacts tagged change_gate_decision so CI, release tooling, or auditors can fetch the evidence without replaying the pipeline.
  • Telemetry spans in the fleetforge.changeops namespace, visible in the UI and exported via OTEL.

Set FLEETFORGE_TRANSPARENCY_SCOPE to gates, runs, or artifacts to control which events enqueue SCITT transparency jobs: gate decisions only, entire runs, or every artifact. The flag defaults to gates and pairs with FLEETFORGE_TRANSPARENCY_WRITER, so regulated tenants can opt into per-run or per-artifact receipts gradually. The writer is an Enterprise-only feature: set FLEETFORGE_LICENSE_TIER=enterprise before enabling it; otherwise the runtime logs a warning and keeps the feature disabled.
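
The interaction between these three variables can be sketched as follows. This is a reading of the paragraph above, not the runtime's code: the function name is hypothetical, and treating any non-empty FLEETFORGE_TRANSPARENCY_WRITER value as "writer enabled" is an assumption, since the accepted values are not described here.

```python
import logging
from typing import Mapping, Optional

VALID_SCOPES = ("gates", "runs", "artifacts")


def resolve_transparency_scope(env: Mapping[str, str]) -> Optional[str]:
    """Return the effective scope, or None when the writer stays disabled."""
    scope = env.get("FLEETFORGE_TRANSPARENCY_SCOPE", "gates")  # documented default
    if scope not in VALID_SCOPES:
        raise ValueError(f"unknown transparency scope: {scope!r}")

    # Assumption: any non-empty value turns the writer on.
    if not env.get("FLEETFORGE_TRANSPARENCY_WRITER"):
        return None

    if env.get("FLEETFORGE_LICENSE_TIER") != "enterprise":
        # Mirrors the documented behavior: warn and keep the feature disabled.
        logging.warning("transparency writer requires the enterprise license tier")
        return None

    return scope


enabled = resolve_transparency_scope({
    "FLEETFORGE_TRANSPARENCY_WRITER": "on",   # placeholder value
    "FLEETFORGE_LICENSE_TIER": "enterprise",
})
```

With no scope set, the example resolves to the gates default; dropping the license tier yields None, matching the "logs a warning and keeps the feature disabled" behavior.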

Lifecycle overview

  1. Prepare – collect diffs, evals, replay IDs, and budgets into a bundle.
  2. Evaluate – call fleetforge-ctl gates check --input bundle.json (or the corresponding API) to obtain a decision.
  3. Act – block merges on deny, route follow_up to approvers, merge on allow.
  4. Record – approvers run gates followup when action is needed; CI stores the decision artifact alongside release notes.
  5. Audit – operators list or export gates (gates list) and drill into the artifacts or telemetry when investigating releases.
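
The Act step maps each of the three outcomes to a pipeline action. A minimal sketch of that mapping is shown below; the action labels (merge, route_to_approvers, block_merge) are names invented for this example, and how a pipeline extracts the decision string from the CLI or API response is left out because the output format is not described here.

```python
from enum import Enum


class Decision(str, Enum):
    ALLOW = "allow"
    FOLLOW_UP = "follow_up"
    DENY = "deny"


# Hypothetical action labels for a CI pipeline to branch on.
ACTIONS = {
    Decision.ALLOW: "merge",
    Decision.FOLLOW_UP: "route_to_approvers",
    Decision.DENY: "block_merge",
}


def next_action(decision: str) -> str:
    """Map a gate outcome to a pipeline action.

    Unknown outcomes raise ValueError so the pipeline fails closed
    instead of merging on an unrecognized decision.
    """
    return ACTIONS[Decision(decision)]
```

Failing closed on anything other than the three documented outcomes keeps a malformed or truncated response from being treated as an approval.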

Implementation components

  • core/changeops/ – deterministic gate engine.
  • core/storage/migrations/0014_changeops_gates.sql – schema for decisions and follow-ups.
  • core/ctl/ – CLI surface (gates check, gates followup, gates list).
  • Runtime API (CheckChangeGate, RecordGateFollowup, ListChangeGates) – see the ChangeOps reference for full details.

Getting started

  • Define mandatory eval packs, replay thresholds, and budget limits for each surface (prompts, tools, adapters).
  • Integrate the gate check into CI/CD so every PR attaches a decision artifact before merge (follow the ChangeOps CI how-to).
  • Share gate IDs in release notes and store artifacts with the deployment bundle for long-term auditability.
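
One way to capture the per-surface requirements from the first step is a small policy table that CI can consult before calling the gate. The sketch below is illustrative only: the SurfacePolicy type, the pack names, and the numeric limits are all assumptions, not FleetForge defaults or its configuration format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SurfacePolicy:
    """Hypothetical per-surface requirements."""
    required_eval_packs: frozenset
    max_token_drift: float      # replay threshold, as a fraction
    max_cost_delta_pct: float   # budget limit, in percent


# Placeholder values; choose limits that fit your own fleet.
POLICIES = {
    "prompts": SurfacePolicy(frozenset({"safety", "regression"}), 0.01, 5.0),
    "tools": SurfacePolicy(frozenset({"safety", "cost"}), 0.01, 2.0),
    "adapters": SurfacePolicy(frozenset({"regression"}), 0.02, 2.0),
}


def missing_eval_packs(surface: str, packs_run: list) -> list:
    """Return required eval packs that have not been run for this surface."""
    policy = POLICIES[surface]
    return sorted(policy.required_eval_packs - set(packs_run))
```

For example, a prompt change that only ran the safety pack would report regression as missing, which is the kind of gap the follow_up outcome is meant to surface.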

Acceptance tests

Use Status & Acceptance → Receipts-as-gates for the canonical gate checklist. The Hello Fleet walkthrough there produces the capability tokens, attestations, C2PA manifest, and replay telemetry that ChangeOps consumes; extend that single scenario whenever new evidence is required instead of inventing bespoke demos.

Receipts-as-gates (CI / GitHub Action)

To make trust evidence sticky, FleetForge ships a GitHub Action (.github/actions/fleetforge-receipts) that blocks merges when:

  • Replay drift exceeds a configurable threshold (default 1% token delta, 0.5% cost delta).
  • Any step in the referenced run is missing capability tokens or C2PA manifests.
  • Policy verdicts differ between the live run and the replay envelope.
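
The first condition can be sketched as a relative-drift comparison between the live run and its replay. How the action actually computes drift is not specified here, so the formula below (absolute difference divided by the live value) and the RunStats type are assumptions; only the default thresholds come from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunStats:
    """Hypothetical summary of a run; field names are illustrative."""
    tokens: int
    cost_usd: float


def relative_drift(live: float, replay: float) -> float:
    """Absolute difference relative to the live value."""
    if live == 0:
        return 0.0 if replay == 0 else float("inf")
    return abs(replay - live) / live


def drift_violations(live: RunStats, replay: RunStats,
                     max_token_drift: float = 0.01,
                     max_cost_drift: float = 0.005) -> list:
    """Return which thresholds were exceeded (empty list means no drift failure)."""
    violations = []
    if relative_drift(live.tokens, replay.tokens) > max_token_drift:
        violations.append("tokens")
    if relative_drift(live.cost_usd, replay.cost_usd) > max_cost_drift:
        violations.append("cost")
    return violations


live = RunStats(tokens=10_000, cost_usd=2.00)
replay = RunStats(tokens=10_050, cost_usd=2.02)
result = drift_violations(live, replay)
```

In this example the token drift is 0.5%, which is within the 1% default, while the cost drift is about 1%, which exceeds the 0.5% default, so only the cost threshold trips.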

The action fetches fleetforge-ctl receipt --run-id <RUN_ID> plus replay stats, fails the workflow if thresholds trip, and posts an attestation diff as a PR comment so reviewers see exactly which steps drifted. Use it alongside your existing gates check invocation:

name: fleetforge-evidence
on:
  pull_request:
jobs:
  trust:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: fleetforge/actions/setup@v1
      - run: fleetforge-ctl gates check --input bundle.json
      - uses: fleetforge/actions/verify-receipts@v1
        with:
          # steps.submit must resolve to an earlier step with id "submit"
          # that exposes a run_id output; that step is not shown here.
          run-id: ${{ steps.submit.outputs.run_id }}
          max-token-drift: '0.01'   # 1% token delta
          max-cost-drift: '0.005'   # 0.5% cost delta

Keeping receipts in CI makes attestation-backed workflows a merge requirement, not an optional audit exercise.