Day-in-the-life workflows
FleetForge’s “Hello Fleet” storyline covers both the individual contributor (IC) loop on a laptop and the ChangeOps gate that runs on every pull request. Use this guide when you need a concrete, reproducible path for fleetforge-ctl receipts, verification, replay, and CI evidence.
IC workflow (local dev)
- Run the canonical demo once
Bring up the demo runtime (run just demo or use the quickstart Compose stack), submit the Hello Fleet DAG, and capture the printed RUN_ID. Keep that ID handy and reuse it everywhere (CLI, /demo, receipts, replay).
# export a signed receipt (choose any manifest path under build/ or tmp/)
fleetforge-ctl receipt --run-id "$RUN_ID" --manifest "build/receipts/${RUN_ID}.c2pa.json"
# verify an artifact + manifest locally
fleetforge-ctl verify tmp/hello-fleet/artifacts/demo-output.bin \
--manifest tmp/hello-fleet/artifacts/demo-output.bin.c2pa.json
# prove determinism
fleetforge-ctl replay "$RUN_ID"
- Tighten a guardrail and re-run
Flip the budget preset or run the agent_team_openai variant so you intentionally trip a policy denial. This yields the policy_decisions artifact, surfaces the signed TrustDecision, and gives you a concrete “bad path” that CI can catch later.
- Understand the evidence you just generated
  - Receipts are C2PA Content Credentials bound to each artifact, so verifiers can audit “who did what/when” without your runtime.
  - fleetforge-ctl verify walks the manifest + capability-token chain locally; pair it with fleetforge-ctl replay to surface drift (tokens, spend, or tool I/O) immediately.
  - Telemetry follows the OpenTelemetry GenAI semantic conventions (with trust.* attributes) so downstream collectors label runs with attestation IDs, capability tokens, and policy verdicts.
- Optional – sign like prod
Local dev defaults to the env-ed25519 signer. When you need parity with production, point the CLI at AWS/GCP/Azure KMS via the existing shims (FLEETFORGE_TRUST_SIGNING_KEY, FLEETFORGE_SCITT_*, etc.) without changing the workflow above.
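The receipt, verify, and replay steps above can be chained into one fail-fast helper. This is a minimal sketch that uses only the three invocations shown in this guide. The function name hello_fleet_evidence and the FLEETFORGE_CTL override variable are hypothetical and exist only for illustration; they are not part of FleetForge. It also assumes the manifest sits next to the artifact with a .c2pa.json suffix, as in the verify example above.

```shell
#!/usr/bin/env bash
# Sketch: chain receipt -> verify -> replay for one run, stopping at the first failure.
# hello_fleet_evidence and FLEETFORGE_CTL are hypothetical names for illustration;
# FLEETFORGE_CTL only lets you swap in a stub (for example "echo") for a dry run.

hello_fleet_evidence() {
  local run_id="$1" artifact="$2"
  local ctl="${FLEETFORGE_CTL:-fleetforge-ctl}"

  if [ -z "$run_id" ] || [ -z "$artifact" ]; then
    echo "usage: hello_fleet_evidence RUN_ID ARTIFACT_PATH" >&2
    return 2
  fi

  # Export the signed receipt for the run.
  "$ctl" receipt --run-id "$run_id" --manifest "build/receipts/${run_id}.c2pa.json" || return 1

  # Verify the artifact against the manifest stored next to it (assumed layout).
  "$ctl" verify "$artifact" --manifest "${artifact}.c2pa.json" || return 1

  # Replay the run to check determinism.
  "$ctl" replay "$run_id" || return 1
}
```

For a dry run that only prints the three commands, call it as FLEETFORGE_CTL=echo hello_fleet_evidence "$RUN_ID" tmp/hello-fleet/artifacts/demo-output.bin.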
CI gate (pull requests)
Treat every pull request as two checks that run back-to-back:
- ChangeOps decision (policy/evidence gate) – Pipelines build a JSON bundle (diff summary, novelty, eval coverage, replay/canary telemetry, budget deltas) and request a ruling:
  fleetforge-ctl gates check --input change.json --json decision.json
  The gate returns allow, follow_up, or deny. Missing replay evidence or thin coverage yields structured reasons (for example “coverage below minimum” or “missing replay telemetry”) so CI can fail fast. Gate inputs should include:
  - Replay telemetry with attestation_match=true, tool_io_match=true, and ≤1% token drift.
  - Canary/promote telemetry listing the attestation IDs it observed so reviewers can trace what passed or failed.
- Receipts-as-gates (artifact/replay verifier) – A GitHub Action (or equivalent) fetches fleetforge-ctl receipt --run-id … plus replay stats, enforces drift thresholds (defaults: 1% tokens, 0.5% cost), and posts an attestation diff on the PR if anything moved. Pair it with the step above (details in Receipts-as-gates).
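As a rough illustration of the first check, the sketch below pre-checks replay telemetry before it goes into change.json and maps the gate ruling to a CI exit status. Both function names are hypothetical. The exit codes for follow_up and for unrecognized rulings are assumptions, not documented FleetForge behavior. Reading the ruling out of decision.json is left to the caller because its schema is not described here, and token drift is assumed to be expressed as a percentage.

```shell
#!/usr/bin/env bash
# Sketch: helpers around the ChangeOps gate. Names and exit codes are hypothetical.

# Pre-check replay telemetry against the gate input expectations listed above:
# attestation_match=true, tool_io_match=true, and token drift of at most 1%.
replay_evidence_ok() {
  local attestation_match="$1" tool_io_match="$2" token_drift_pct="$3"
  [ "$attestation_match" = "true" ] || return 1
  [ "$tool_io_match" = "true" ] || return 1
  # awk handles the decimal comparison; non-numeric drift is rejected.
  awk -v d="$token_drift_pct" 'BEGIN {
    if (d !~ /^[0-9]+(\.[0-9]+)?$/) exit 1
    if (d + 0 > 1.0) exit 1
    exit 0
  }'
}

# Map the gate ruling (allow, follow_up, deny) to an exit status for CI.
gate_exit_status() {
  case "$1" in
    allow)     return 0 ;;  # evidence accepted, the PR can proceed
    deny)      return 1 ;;  # block the PR
    follow_up) return 2 ;;  # assumption: distinct status so the pipeline decides how to treat it
    *)         echo "unrecognized gate decision: '$1'" >&2
               return 3 ;;  # assumption: fail closed on anything unexpected
  esac
}
```

A pipeline step could call replay_evidence_ok before assembling the bundle, run gates check, and then pass the extracted ruling to gate_exit_status so that deny fails the job.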
Operational knobs you’ll set once
- Apply core/storage/migrations/0014_changeops_gates.sql and rotate the CI service token that calls gates check.
- Export gate decisions to your observability stack so reviewers can correlate incidents to specific gate_ids.
Transparency hooks
- Emit SCITT statements or store receipts in a transparency log by enabling the FLEETFORGE_TRANSPARENCY_* env vars; local runs can stub these while production points at HTTP-backed logs.
Putting it together
On a laptop: run Hello Fleet → export receipt → verify → replay (no drift) and capture one intentional guardrail denial. On every PR: assemble change.json → gates check returns allow/follow_up/deny → the receipts/verifier action enforces replay + receipt quality → PR blocks when evidence is missing or drift exceeds thresholds.
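The blocking rule at the end of the PR flow (missing evidence, or drift past thresholds) can be sketched as a single shell check. This is an illustration under stated assumptions, not the actual verifier action: receipts_gate, MAX_TOKEN_DRIFT_PCT, and MAX_COST_DRIFT_PCT are hypothetical names, and the drift values are assumed to be percentages already extracted from the replay stats, whose format is not described here. The defaults mirror the 1% token and 0.5% cost thresholds mentioned earlier.

```shell
#!/usr/bin/env bash
# Sketch of the PR blocking rule: block when the receipt is missing or drift
# exceeds thresholds. receipts_gate and the MAX_*_DRIFT_PCT variables are
# hypothetical names; drift inputs are assumed to be percentages.

receipts_gate() {
  local receipt_path="$1" token_drift_pct="$2" cost_drift_pct="$3"
  local max_tokens="${MAX_TOKEN_DRIFT_PCT:-1.0}"   # default from this guide: 1% tokens
  local max_cost="${MAX_COST_DRIFT_PCT:-0.5}"      # default from this guide: 0.5% cost
  local status=0

  # A missing or empty receipt counts as missing evidence.
  if [ ! -s "$receipt_path" ]; then
    echo "block: receipt missing or empty: $receipt_path"
    return 1
  fi

  # awk handles the decimal comparison; exit 2 flags non-numeric input.
  awk -v t="$token_drift_pct" -v c="$cost_drift_pct" \
      -v mt="$max_tokens" -v mc="$max_cost" 'BEGIN {
        if (t !~ /^[0-9]+(\.[0-9]+)?$/ || c !~ /^[0-9]+(\.[0-9]+)?$/) exit 2
        if (t + 0 > mt + 0 || c + 0 > mc + 0) exit 1
        exit 0
      }' || status=$?

  case "$status" in
    0) echo "pass"; return 0 ;;
    1) echo "block: drift exceeds thresholds (tokens ${token_drift_pct}%, cost ${cost_drift_pct}%)"; return 1 ;;
    *) echo "block: drift values are not numeric"; return 1 ;;
  esac
}
```

Drift exactly at a threshold passes, since the rule blocks only when drift exceeds it. Non-numeric input blocks as well, so a malformed stats file cannot slip through as a pass.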
Why this matters to Security/Risk
The gate enforces controls mapped to OWASP’s LLM Top-10 (prompt-injection guardrails, tool ACLs, output handling) and surfaces receipts plus replay evidence as proof—not just logs—so Risk/Security reviewers inherit deterministic, auditable workflows.