Tracing and audit logs

Use this page to enable and validate tracing behavior in OngoingAI Gateway. It covers trace capture, audit event signals, and failure semantics.

Trace coverage

  • Captures one trace record for each proxied provider request.
  • Exposes trace and analytics data through HTTP API endpoints.
  • Emits audit signals for gateway auth denies and gateway key lifecycle actions.
  • Preserves streaming metadata such as chunk count and time to first token.
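
As a quick illustration of the read API mentioned above, both endpoints used later in the validation checklist can be queried directly. A minimal sketch, assuming a local gateway on port 8080 and that gateway auth is disabled (otherwise add the X-OngoingAI-Gateway-Key header):

Bash
# List recent traces and fetch the analytics summary.
# Add -H "X-OngoingAI-Gateway-Key: <your key>" to each call if auth.enabled=true.
curl -s "http://localhost:8080/api/traces?limit=5"
curl -s "http://localhost:8080/api/analytics/summary"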

Operational fit

  • You need request-level debugging data for incidents.
  • You need auditability for access decisions and key operations.
  • You need tenant-scoped usage visibility by key, model, and provider.

Capture lifecycle

  1. Capture middleware records request and response exchange data.
  2. Trace building extracts model, token usage, latency, cost, and tenant metadata.
  3. Sensitive headers are redacted before persistence.
  4. Async trace writer stores traces with a bounded queue.
  5. API handlers return trace and analytics views from the trace store.
  6. Auth and key handlers emit structured audit log events.

If trace persistence falls behind, proxy forwarding continues. Trace records may be dropped when the queue is full, and failures are logged.

Body capture and PII handling are controlled by the tracing and pii settings:

YAML
tracing:
  capture_bodies: false
  body_max_size: 1048576
pii:
  mode: "" # auto (off when body capture is disabled)

With capture_bodies=false, bodies are not stored in traces. The gateway still parses provider responses to extract model and usage metadata.
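
One way to confirm this is to fetch a single trace detail and check that body fields are empty while usage metadata is still populated. A sketch, assuming a local gateway, jq installed, and the field names request_body and response_body used elsewhere on this page; the exact response shape may differ in your version:

Bash
# TRACE_ID is a trace ID returned from /api/traces.
# With capture_bodies=false, both body fields should come back empty or null.
curl -s "http://localhost:8080/api/traces/TRACE_ID" | jq '{request_body, response_body}'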

Deployment patterns

  • Metadata-only capture in production: capture_bodies=false.
  • Incident window capture in controlled environments: capture_bodies=true with pii.mode=redact_storage.
  • Lower body risk profile: reduce body_max_size.

Example setups

Metadata-only baseline

YAML
tracing:
  capture_bodies: false
  body_max_size: 1048576

Short-term payload capture with storage redaction

YAML
tracing:
  capture_bodies: true
  body_max_size: 262144
pii:
  mode: redact_storage
  policy_id: default/v1

Validation checklist

  1. Send one proxied request through /openai/... or /anthropic/....

  2. If auth.enabled=true, include your gateway key header on API reads. Default header name is X-OngoingAI-Gateway-Key.

  3. Query traces:

    Bash
    curl "http://localhost:8080/api/traces?limit=10"
  4. Query one trace detail by ID:

    Bash
    curl "http://localhost:8080/api/traces/TRACE_ID"

    Placeholder:

    • TRACE_ID: Trace ID returned from /api/traces.
  5. Query analytics summary:

    Bash
    curl "http://localhost:8080/api/analytics/summary"

You should see:

  • At least one trace item for routed provider traffic.
  • Token and cost aggregates in analytics summary.
  • Streaming traces with time_to_first_token_ms and stream_chunks metadata when streaming endpoints are used.
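
To produce a streaming trace for the last check, send a streaming request through one of the provider routes. A sketch, assuming an OpenAI-style chat completions path behind /openai/ and a client-supplied provider key; the path suffix, model name, and auth headers are illustrative, so adjust them to match your deployment:

Bash
# Illustrative streaming request through the gateway's OpenAI route.
curl -sN "http://localhost:8080/openai/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{"model": "gpt-4o-mini", "stream": true, "messages": [{"role": "user", "content": "ping"}]}'
# The resulting trace should carry time_to_first_token_ms and stream_chunks metadata.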

Troubleshooting

/api/traces returns no items

  • Symptom: Trace list is empty.
  • Cause: Requests did not go through provider routes, or upstream request failed before capture.
  • Fix: Send traffic through /openai/... or /anthropic/..., then query traces again.

Logs show "trace queue is full; dropping trace"

  • Symptom: Trace drops are logged under high load.
  • Cause: Trace writer queue is saturated while storage writes lag.
  • Fix: Reduce capture load (for example disable body capture), increase storage throughput, and verify store health.
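
To watch for drops directly, grep the log line quoted above from wherever your gateway's logs land. A sketch, assuming the gateway runs as a systemd unit named ongoingai-gateway; substitute your own unit name or log source:

Bash
# Count trace-drop events in the last hour; the unit name is an assumption for this example.
journalctl -u ongoingai-gateway --since "1 hour ago" | grep -c "dropping trace"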

Trace body fields are empty

  • Symptom: request_body and response_body are empty in trace detail.
  • Cause: tracing.capture_bodies=false, or the redaction policy prevented bodies from being persisted.
  • Fix: Enable body capture when needed and confirm privacy settings.

Audit events are missing in logs

  • Symptom: No auth deny or key lifecycle audit lines appear.
  • Cause: Relevant events were not triggered, or structured logs are not being collected.
  • Fix: Trigger a known deny case (401/403) or key lifecycle action, then verify log ingestion.
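
A deliberate deny is easy to produce against the trace API itself. A sketch, assuming auth.enabled=true and the default gateway key header name:

Bash
# An invalid gateway key should return 401/403 and emit an audit event in the gateway logs.
curl -si -H "X-OngoingAI-Gateway-Key: invalid-key" \
  "http://localhost:8080/api/traces?limit=1" | head -n 1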

Next steps