Phase 6 – Telemetry Integrations

Phase 6 aligns FleetForge telemetry with the OpenTelemetry GenAI semantic conventions and extends exports for Langfuse/LangSmith/Phoenix so operators get SLO dashboards out of the box.

Highlights

GenAI spans now tag requests with gen_ai.system, gen_ai.operation.name, gen_ai.request.model, gen_ai.response.model, and duration/usage attributes for LLM, tool, and agent executions.
Metrics are exposed via OTEL counters and histograms:
- gen_ai.prompt.tokens, gen_ai.completion.tokens, gen_ai.tokens.total
- gen_ai.cost.usd, gen_ai.request.duration
- fleetforge.policy.events for guardrail bursts
LangGraph agent spans mirror LLM telemetry so adapter runs inherit the same dashboards and cost tracking.
External exporters (Langfuse, LangSmith, Phoenix) receive the enriched payloads without additional configuration.
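
As a rough illustration of the span tagging described above, the sketch below assembles the attribute set for a single GenAI span as a plain dictionary. The helper name `build_genai_span_attributes` and the example values are hypothetical, and the `gen_ai.usage.*` keys are taken from the upstream OpenTelemetry GenAI conventions rather than from FleetForge itself, so treat them as assumptions.

```python
from typing import Dict, Optional, Union

AttrValue = Union[str, int, float]


def build_genai_span_attributes(
    system: str,
    operation: str,
    request_model: str,
    response_model: Optional[str] = None,
    input_tokens: Optional[int] = None,
    output_tokens: Optional[int] = None,
) -> Dict[str, AttrValue]:
    """Return a flat attribute dict for one GenAI span (hypothetical helper)."""
    attrs: Dict[str, AttrValue] = {
        "gen_ai.system": system,
        "gen_ai.operation.name": operation,
        "gen_ai.request.model": request_model,
    }
    # The response model may differ from the requested one (for example when
    # a provider resolves an alias), so it is only set once it is known.
    if response_model is not None:
        attrs["gen_ai.response.model"] = response_model
    # Assumption: usage keys follow the upstream OTel GenAI conventions.
    if input_tokens is not None:
        attrs["gen_ai.usage.input_tokens"] = input_tokens
    if output_tokens is not None:
        attrs["gen_ai.usage.output_tokens"] = output_tokens
    return attrs


# Illustrative values only; the model names are placeholders.
attrs = build_genai_span_attributes(
    system="openai",
    operation="chat",
    request_model="example-model",
    response_model="example-model-v2",
    input_tokens=120,
    output_tokens=48,
)
```

Keeping the attributes in one flat mapping means the same dictionary can be attached to LLM, tool, and agent spans alike.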

Acceptance Criteria

Span attributes and metrics conform to OTEL GenAI naming so downstream tools (Grafana/Langfuse/LangSmith/Phoenix) ingest them without mapping.
Dashboards can chart SLO burn (duration + error rates), cost burn, and policy hits straight from the OTLP stream.
GenAI metrics aggregate per gen_ai.system and gen_ai.request.model, covering both built-in LLM steps and LangGraph adapters.
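
To make the per-provider, per-model aggregation concrete, here is a minimal sketch that groups request samples by `gen_ai.system` and `gen_ai.request.model` and derives an error-budget burn rate. The sample layout, the `error` flag, and both function names are assumptions made for illustration; a real dashboard would compute the same quantities with queries over the OTLP stream.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Key = Tuple[str, str]  # (gen_ai.system, gen_ai.request.model)


def aggregate_by_system_and_model(samples: Iterable[dict]) -> Dict[Key, dict]:
    """Sum request count, error count, and USD cost per (system, model)."""
    totals: Dict[Key, dict] = defaultdict(
        lambda: {"requests": 0, "errors": 0, "cost_usd": 0.0}
    )
    for sample in samples:
        key = (sample["gen_ai.system"], sample["gen_ai.request.model"])
        bucket = totals[key]
        bucket["requests"] += 1
        # "error" is a hypothetical flag marking a failed request.
        bucket["errors"] += 1 if sample.get("error") else 0
        bucket["cost_usd"] += sample.get("gen_ai.cost.usd", 0.0)
    return dict(totals)


def burn_rate(errors: int, requests: int, slo_target: float) -> float:
    """Observed error rate divided by the error budget (1 - SLO target)."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if requests == 0:
        return 0.0
    return (errors / requests) / (1.0 - slo_target)


# Illustrative samples; providers, models, and costs are placeholders.
samples = [
    {"gen_ai.system": "openai", "gen_ai.request.model": "example-model",
     "gen_ai.cost.usd": 0.25, "error": False},
    {"gen_ai.system": "openai", "gen_ai.request.model": "example-model",
     "gen_ai.cost.usd": 0.5, "error": True},
    {"gen_ai.system": "anthropic", "gen_ai.request.model": "other-model",
     "gen_ai.cost.usd": 0.125, "error": False},
]
totals = aggregate_by_system_and_model(samples)
```

A burn rate above 1 means the error budget is being consumed faster than the SLO allows; at exactly 1 the budget is used up precisely at the end of the SLO window.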

Metric Reference

Metric	Type	Description
`gen_ai.prompt.tokens`	Counter	Prompt tokens consumed per request
`gen_ai.completion.tokens`	Counter	Completion tokens emitted per request
`gen_ai.tokens.total`	Counter	Prompt + completion tokens
`gen_ai.cost.usd`	Counter	Provider-reported USD cost
`gen_ai.request.duration`	Histogram (ms)	End-to-end request latency
`fleetforge.policy.events`	Counter	Guardrail/policy decision count

All metrics surface gen_ai.system and gen_ai.request.model labels so they can be sliced per provider/model. Policy events carry fleetforge.policy.effect and fleetforge.policy.pack.
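
The following sketch shows how one request could map onto the metrics in the table, with the labels attached to every data point. It reads "all metrics" literally, so the policy event also carries the provider/model labels. The `DataPoint` type, the function names, and the example `effect` and `pack` values are hypothetical; this models the shape of the data rather than any FleetForge API.

```python
from typing import Dict, List, NamedTuple


class DataPoint(NamedTuple):
    name: str
    value: float
    labels: Dict[str, str]


def request_datapoints(
    system: str,
    request_model: str,
    prompt_tokens: int,
    completion_tokens: int,
    cost_usd: float,
    duration_ms: float,
) -> List[DataPoint]:
    """Build the per-request GenAI data points listed in the table above."""
    labels = {"gen_ai.system": system, "gen_ai.request.model": request_model}
    values = {
        "gen_ai.prompt.tokens": prompt_tokens,
        "gen_ai.completion.tokens": completion_tokens,
        # Total is simply prompt + completion tokens.
        "gen_ai.tokens.total": prompt_tokens + completion_tokens,
        "gen_ai.cost.usd": cost_usd,
        "gen_ai.request.duration": duration_ms,  # histogram, in milliseconds
    }
    # Copy the labels so each data point owns its own dict.
    return [DataPoint(name, value, dict(labels)) for name, value in values.items()]


def policy_datapoint(
    system: str, request_model: str, effect: str, pack: str
) -> DataPoint:
    """Build one fleetforge.policy.events increment with its extra labels."""
    labels = {
        "gen_ai.system": system,
        "gen_ai.request.model": request_model,
        "fleetforge.policy.effect": effect,
        "fleetforge.policy.pack": pack,
    }
    return DataPoint("fleetforge.policy.events", 1, labels)


# Illustrative values only.
points = request_datapoints("openai", "example-model", 120, 48, 0.25, 850.0)
event = policy_datapoint("openai", "example-model", effect="deny", pack="example-pack")
```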

Notes

Ensure an OTLP collector is running and that OTEL_EXPORTER_OTLP_ENDPOINT points to it so the new metrics reach Grafana/ClickHouse.
Existing Langfuse/LangSmith/Phoenix exporters automatically include the added attributes; no additional configuration is required.
Dashboards should combine the new metrics with existing queue/budget metrics to visualise end-to-end burn and SLO attainment.
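
For the OTLP endpoint note above, a small preflight check can catch a missing or malformed OTEL_EXPORTER_OTLP_ENDPOINT before any telemetry is silently dropped. This is a sketch using only the standard library; `resolve_otlp_endpoint` is a hypothetical helper, not part of FleetForge or the OpenTelemetry SDK, and the localhost URL is only an example value.

```python
import os
from typing import Mapping, Optional
from urllib.parse import urlparse

ENDPOINT_VAR = "OTEL_EXPORTER_OTLP_ENDPOINT"


def resolve_otlp_endpoint(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the configured OTLP endpoint or raise if it is unusable."""
    env = os.environ if env is None else env
    endpoint = env.get(ENDPOINT_VAR, "").strip()
    if not endpoint:
        raise RuntimeError(f"{ENDPOINT_VAR} is not set; metrics would not be exported")
    parsed = urlparse(endpoint)
    # OTLP endpoints are expected to be http(s) URLs with a host.
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        raise RuntimeError(f"{ENDPOINT_VAR} is not a valid http(s) URL: {endpoint!r}")
    return endpoint


# Example with an explicit mapping instead of the real process environment.
endpoint = resolve_otlp_endpoint({ENDPOINT_VAR: "http://localhost:4317"})
```

Passing the environment as an argument keeps the check easy to exercise in tests without mutating `os.environ`.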
