
Roadmap & Status

Canonical snapshot for FleetForge execution. This page unifies the former Trust Mesh Growth Plan, GA Execution Plan, and near-term M1/M2 roadmap so Product, Docs, and Engineering reference a single source. Pair it with docs/reference/status-and-acceptance.md for the detailed evidence tables and Hello Fleet acceptance run. This page summarizes timing; proof lives in Status & Acceptance.

Snapshot — what already ships

  • Receipts + provenance – Capability tokens, Wasm policy verdicts, C2PA manifests, optional SCITT anchors, and AIBOM exports land together in the Hello Fleet acceptance flow (docs/quickstart/hello-fleet.md, core/runtime/src/capability.rs, core/trust/src/c2pa.rs).
  • Deterministic operations – Postgres state machine + transactional outbox + replay tooling keeps LangGraph/AutoGen/CrewAI adapters reproducible with SLO-aware budgets (core/runtime/src/scheduler.rs, core/runtime/src/replay.rs).
  • Transparency alignment – Attestation vault, SCITT scaffolding, and OpenTelemetry trust.* attributes make every step independently verifiable (core/trust/src/vault_object_store.rs, core/runtime/src/otel.rs).
  • Adapter + workload breadth – Runtimes for LangGraph, AutoGen, CrewAI, batch (Airflow, Argo), serverless (Step Functions), and analytics (dbt) reuse the same trust plane so any AI-touched workflow can join the mesh (docs/reference/adapters.md).
  • Beyond tracing – LangSmith/Phoenix/Weave/Traceloop benchmark quality; FleetForge adds enforceable policy plus attestations so Risk, Governance, Data, and Content teams can clear AI workflows, not just observe them (README.md, docs/concepts/policy.md).

Canonical acceptance & evidence

  • The Status & Acceptance tracker is the single source of truth for readiness, FG tests, and evidence locations. Every roadmap item below links back to it for verification.
  • docs/quickstart/hello-fleet.md hosts the only walkthrough reviewers need to reproduce receipts. Extend it for new evidence rather than minting bespoke demos.
  • docs/reference/attestation-vault.md and docs/governance/signer-profiles.md document the APIs, schema, and signer rotation rules that other teams reference in roadmap items below.

Status snapshot

Legend: ✅ Delivered · ⚙️ Feature-gated · ⚠️ In progress · ⏳ Next · ⛔ Not started

| Capability | Status | Notes |
| --- | --- | --- |
| Per-step capability tokens + attestations | ✅ Delivered | Emitted for every Hello Fleet step with trust.capability_token_id and trust.attestation_id span attributes. |
| Budget-aware SLO scheduler | ⚠️ In progress | Throttling metrics + CLI surfacing land with M2; drift target ≤1% tokens. |
| Telemetry versioning policy | ⚠️ In progress | Dual emission + trust.semconv.version rolling out alongside the OTEL stability guidance. |
| Policy interoperability (OPA bundles + Cedar) | ⚠️ In progress | Bundle ingestion, Cedar translations, and the policy SDK remain on deck. |
| Portable capability credentials (Biscuit + VC) | ⚙️ Feature-gated | Biscuit v2 exports + W3C VCs tie into C2PA manifests and SCITT statements. |
| Deterministic replay controls | ⚠️ In progress | Virtual time, sealed IO, and dataset digests extend strict replay guarantees. |
| Observability & transparency alignment | ⚠️ In progress | SCITT writer + Attestation Vault API round out trust graph exports. |

Now → Next (6–12 weeks)

M1 — Per-step attestations, capability tokens, and trust.* spans

  • Guarantee each tool/LLM step mints a scoped capability token, links attestation IDs, and emits trust.capability_token_id / trust.attestation_id on OTEL spans.
  • Wire /demo, Hello Fleet, and fleetforge-ctl receipt so the same receipts appear inline—no alternate storylines.
  • Acceptance: Status & Acceptance → Identity-secured steps shows ✅ with links to core/runtime/src/capability.rs evidence.
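As a rough illustration of the M1 contract, a per-step token mint and the trust.* attributes it stamps on a span might look like the sketch below. Every type and function name here is a placeholder for illustration, not the actual API in core/runtime/src/capability.rs.

```rust
use std::collections::BTreeMap;

// Hypothetical stand-ins for FleetForge's real capability types;
// names are illustrative only.
#[derive(Debug, Clone)]
struct CapabilityToken {
    id: String,
    scope: Vec<String>,     // tools/resources this step may touch
    attestation_id: String, // links the token to its attestation record
}

fn mint_step_token(step: &str, scope: &[&str]) -> CapabilityToken {
    CapabilityToken {
        id: format!("cap-{step}"),
        scope: scope.iter().map(|s| s.to_string()).collect(),
        attestation_id: format!("att-{step}"),
    }
}

// Attach the two trust.* attributes named in the roadmap to a span's
// attribute map (represented here as a plain BTreeMap).
fn trust_attributes(token: &CapabilityToken) -> BTreeMap<String, String> {
    let mut attrs = BTreeMap::new();
    attrs.insert("trust.capability_token_id".into(), token.id.clone());
    attrs.insert("trust.attestation_id".into(), token.attestation_id.clone());
    attrs
}

fn main() {
    let token = mint_step_token("search", &["tool:web_search"]);
    println!("{:?}", trust_attributes(&token));
}
```

The invariant the acceptance row checks is that both attributes are present on every step span, never just one.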

M2 — Budget-aware SLOs in the scheduler

  • Enforce per-run budget SLOs with visible throttling (queue slack metrics, budget_exceeded guardrails) exposed via CLI, /demo, and Hello Fleet.
  • Keep replay drift ≤1% tokens by surfacing throttling + budget telemetry alongside receipts.
  • Acceptance: Status & Acceptance → Budget-aware SLO scheduling row upgrades once the scheduler metrics + demos ship.
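A minimal sketch of the admission decision described above, assuming a token-denominated budget with a throttle threshold; the real scheduler lives in core/runtime/src/scheduler.rs and these names are assumptions, not its API.

```rust
// Illustrative budget-aware admission check: admit freely below the
// throttle threshold, throttle with a visible queue-slack metric above
// it, and deny outright when the budget_exceeded guardrail fires.
#[derive(Debug, PartialEq)]
enum Verdict {
    Admit,
    Throttle { queue_slack: u64 }, // surfaced as a queue-slack metric
    Deny,                          // budget_exceeded guardrail
}

struct RunBudget {
    token_limit: u64,
    tokens_used: u64,
    throttle_at: f64, // fraction of budget at which throttling begins
}

fn admit(budget: &RunBudget, step_cost: u64) -> Verdict {
    let projected = budget.tokens_used + step_cost;
    if projected > budget.token_limit {
        Verdict::Deny
    } else if (projected as f64) > budget.throttle_at * budget.token_limit as f64 {
        Verdict::Throttle { queue_slack: budget.token_limit - projected }
    } else {
        Verdict::Admit
    }
}

fn main() {
    let budget = RunBudget { token_limit: 1_000, tokens_used: 850, throttle_at: 0.8 };
    println!("{:?}", admit(&budget, 100)); // throttled: 950 of 1000 projected
}
```

Emitting the throttle verdict (rather than silently delaying) is what lets the CLI, /demo, and Hello Fleet all show the same budget telemetry next to receipts.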

Telemetry versioning policy (OTel stability alignment)

  • Dual-emit OpenTelemetry GenAI attributes during upgrades, populate trust.semconv.version, and document the expected OTEL_SEMCONV_STABILITY_OPT_IN=gen-ai behavior for collectors.
  • Ship operator guidance (docs + CLI) showing how to pin or upgrade semantics without breaking dashboards.
  • Acceptance: Status & Acceptance → Telemetry compatibility + transparency roadmap references the semconv versioning docs and env vars once dual-emit lands.
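Dual emission can be sketched as writing both attribute spellings plus the version marker during the migration window. The gen_ai.request.model name follows the OpenTelemetry GenAI conventions; the legacy llm.* spelling shown here is an illustrative assumption about the pre-stability name, not a FleetForge-specific key.

```rust
use std::collections::BTreeMap;

// Sketch: dual-emit GenAI span attributes during a semconv upgrade so
// dashboards pinned to either convention keep working.
fn dual_emit(model: &str, semconv_version: &str) -> BTreeMap<String, String> {
    let mut attrs = BTreeMap::new();
    // Stable name, expected when OTEL_SEMCONV_STABILITY_OPT_IN=gen-ai is set.
    attrs.insert("gen_ai.request.model".into(), model.to_string());
    // Previous name, kept only for the migration window (illustrative).
    attrs.insert("llm.request.model".into(), model.to_string());
    // Record which convention version produced these attributes.
    attrs.insert("trust.semconv.version".into(), semconv_version.to_string());
    attrs
}

fn main() {
    println!("{:?}", dual_emit("example-model", "1.27.0"));
}
```

Collectors can then drop one spelling per environment instead of forcing a fleet-wide flag day.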

Next → Later (Quarterly themes)

Policy interoperability (OPA bundles + Cedar + SDK)

  • Import existing policy packs via OPA bundles, translate Cedar schemas, and record provenance (repo URL, signer, lint status) for every pack.
  • Ship fleetforge-ctl policy inspect and SDK helpers (TS/Python/Rust) so teams can lint/test/publish packs with evidence.
  • Acceptance: Status & Acceptance → Policy interoperability hits ✅ when bundle ingestion + SDK tooling ship.

Portable capability credentials (Biscuit v2 + W3C VCs)

  • Back capability tokens with Biscuit v2, add attenuation APIs, and export W3C VCs that reference C2PA manifests + optional SCITT statements for offline verification.
  • Store signed denials so least-privilege behavior is provable outside the runtime.
  • Acceptance: Status & Acceptance → Portable capability credentials references Biscuit exports + VC tooling.
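The key property behind Biscuit-style attenuation is that a holder can derive a narrower token offline but never a broader one. The toy model below captures only that monotonic-narrowing property; real Biscuit v2 tokens carry cryptographically chained signed blocks, and none of these names come from the Biscuit API.

```rust
use std::collections::BTreeSet;

// Toy capability with a set-valued scope; attenuation intersects scopes,
// so a derived token can never exceed its parent's authority.
#[derive(Debug, Clone)]
struct Capability {
    scope: BTreeSet<String>,
}

impl Capability {
    fn new(scope: &[&str]) -> Self {
        Capability { scope: scope.iter().map(|s| s.to_string()).collect() }
    }

    // Keep only the intersection of parent scope and requested scope.
    fn attenuate(&self, requested: &[&str]) -> Capability {
        let requested: BTreeSet<String> =
            requested.iter().map(|s| s.to_string()).collect();
        Capability {
            scope: self.scope.intersection(&requested).cloned().collect(),
        }
    }

    fn allows(&self, action: &str) -> bool {
        self.scope.contains(action)
    }
}

fn main() {
    let root = Capability::new(&["read:runs", "write:runs", "read:receipts"]);
    // Requesting an out-of-scope grant silently yields nothing extra.
    let narrowed = root.attenuate(&["read:receipts", "admin:everything"]);
    println!("{narrowed:?}");
}
```

Because narrowing is offline and verifiable, external parties can check a delegated credential without calling back into the runtime, which is the point of the VC export work.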

Deterministic replay controls (virtual time + sealed IO + dataset digests)

  • Add virtual time checkpoints, seal outbound IO with digests recorded in receipts, and extend AIBOM with dataset/vector-store digests.
  • Provide a strict replay mode that fails if runs touch unsealed endpoints, ensuring forensics-friendly replays.
  • Acceptance: Status & Acceptance → Deterministic replay controls documents the strict-mode walkthrough.
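Strict-mode IO sealing can be sketched as a lookup against digests captured at record time: replay fails on any endpoint that was never sealed, or whose response no longer matches. DefaultHasher below stands in for a real content digest (receipts would record something like SHA-256), and the type and endpoint names are illustrative, not FleetForge's API.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

// Stand-in content digest; a real implementation would use SHA-256.
fn digest(payload: &str) -> u64 {
    let mut h = DefaultHasher::new();
    payload.hash(&mut h);
    h.finish()
}

struct SealedIo {
    recorded: HashMap<String, u64>, // endpoint -> digest captured at record time
}

impl SealedIo {
    // Strict replay: error on unsealed endpoints and on digest drift.
    fn replay(&self, endpoint: &str, payload: &str) -> Result<(), String> {
        match self.recorded.get(endpoint) {
            None => Err(format!("strict replay: unsealed endpoint {endpoint}")),
            Some(&d) if d != digest(payload) => {
                Err(format!("strict replay: digest drift on {endpoint}"))
            }
            Some(_) => Ok(()),
        }
    }
}

fn main() {
    let mut recorded = HashMap::new();
    recorded.insert("https://api.example.com/search".to_string(), digest("{\"hits\":3}"));
    let sealed = SealedIo { recorded };
    println!("{:?}", sealed.replay("https://api.example.com/search", "{\"hits\":3}"));
}
```

Recording the digests in receipts is what lets a later forensic replay prove it saw byte-identical inputs rather than merely similar ones.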

Observability & transparency alignment (SCITT + Attestation Vault API)

  • Land the SCITT writer and the Attestation Vault API so trust graph exports are independently verifiable (docs/reference/attestation-vault.md).
  • Acceptance: Status & Acceptance → Observability & transparency alignment upgrades once the SCITT writer + vault API ship.

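Since SCITT anchors are optional per the snapshot above, a vault entry needs to distinguish anchored from unanchored attestations. The minimal sketch below assumes that shape; the field names are placeholders, not the schema from docs/reference/attestation-vault.md.

```rust
// Hypothetical vault entry with an optional SCITT transparency anchor.
#[derive(Debug)]
struct VaultEntry {
    attestation_id: String,
    receipt_digest: String,
    scitt_entry_id: Option<String>, // set once the SCITT writer anchors it
}

impl VaultEntry {
    // An entry becomes externally verifiable once it carries an anchor
    // in a transparency log; until then it is only runtime-verifiable.
    fn externally_verifiable(&self) -> bool {
        self.scitt_entry_id.is_some()
    }
}

fn main() {
    let entry = VaultEntry {
        attestation_id: "att-001".into(),
        receipt_digest: "digest-placeholder".into(),
        scitt_entry_id: None,
    };
    println!("externally verifiable: {}", entry.externally_verifiable());
}
```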
Risk register (watch items)

  1. Policy island perception – Without Cedar/OPA parity, Rego→Wasm can feel bespoke; prioritize bundle ingestion + linting exports.
  2. Capability portability – Runtime-only tokens block external verification. Biscuit v2 exports + VC projections remain critical for regulated buyers.
  3. Replay drift – Strict replay mode needs IO sealing + virtual time to satisfy forensics. The scheduler budget work in M2 feeds directly into this promise.
  4. Telemetry churn – GenAI semantic conventions move quickly; publish a compatibility policy and shim layers so dashboards survive upgrades.
  5. Surface-area sprawl – Keep landing/demo/docs aligned (README hero, /, /demo, Hello Fleet). Any new adapters or packs must update this page, the Status & Acceptance tracker, and /reference/adapters together.

References

  • docs/concepts/north-star.md – Product thesis + differentiation.
  • docs/reference/status-and-acceptance.md – Canonical readiness checklist.
  • docs/reference/attestation-vault.md, docs/governance/signer-profiles.md – APIs and ops runbooks for provenance storage.
  • README.md – Trust Mesh overview for newcomers; ensure hero copy matches the “Trust Mesh for AI workflows and agent fleets” positioning.