OpenTelemetry Traces

Optional tracing with OpenTelemetry to monitor sessions, LLM calls, and tool execution.

Implemented in src/architect/telemetry/otel.py. Available since v1.0.0 (Base plan v4 Phase D — D4).

Requirement: This module requires the telemetry extra. Install with:

pip install architect-ai-cli[telemetry]

Without this extra, a transparent NoopTracer is used with no performance impact.


Concept

The ArchitectTracer emits OpenTelemetry spans at three levels:

  1. Session span: encompasses the entire execution (architect run "...")
  2. LLM call spans: each model call (tokens, cost, model)
  3. Tool spans: each tool execution (name, success, duration)


Configuration

YAML Config

telemetry:
  enabled: true
  exporter: otlp                        # otlp | console | json-file
  endpoint: http://localhost:4317       # for otlp (gRPC)
  trace_file: .architect/traces.json    # for json-file

Optional dependencies

# Install the telemetry extra
pip install architect-ai-cli[telemetry]

# Or install manually
pip install opentelemetry-api opentelemetry-sdk opentelemetry-exporter-otlp
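One common way to detect at runtime whether the extra is installed is a guarded import (a sketch; the actual check inside otel.py may differ):

```python
# Detect whether the OpenTelemetry API is importable. A factory like
# create_tracer() can use a flag like this to decide between the real
# tracer and the no-op fallback.
try:
    from opentelemetry import trace  # noqa: F401
    OTEL_AVAILABLE = True
except ImportError:
    OTEL_AVAILABLE = False
```

With this pattern the rest of the code never has to branch on availability: it just receives whichever tracer the factory returned.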

Exporters

OTLP (OpenTelemetry Protocol)

Sends spans via gRPC to the configured endpoint. Compatible with:

  • Jaeger (tracing backend)
  • Grafana Tempo (observability)
  • Datadog, Honeycomb, Lightstep, etc.
  • Any OpenTelemetry collector

telemetry:
  enabled: true
  exporter: otlp
  endpoint: http://localhost:4317   # collector or Jaeger

Console

Prints formatted spans to stderr. Ideal for debugging.

telemetry:
  enabled: true
  exporter: console

JSON File

Writes spans as JSON to a file. Useful for offline analysis.

telemetry:
  enabled: true
  exporter: json-file
  trace_file: .architect/traces.json
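The exact on-disk layout depends on the exporter implementation; assuming one JSON object per line with a "name" field, an offline summary could look like this (span_counts and the field name are illustrative):

```python
import json
from collections import Counter
from pathlib import Path

def span_counts(path: str) -> Counter:
    """Count spans by name in a JSON-lines trace file.

    Assumes one JSON object per line with a "name" field; adjust
    to whatever the json-file exporter actually writes.
    """
    counts: Counter = Counter()
    for line in Path(path).read_text().splitlines():
        if not line.strip():
            continue
        span = json.loads(line)
        counts[span.get("name", "unknown")] += 1
    return counts
```

For example, span_counts(".architect/traces.json") would report how many LLM-call and tool spans a session produced.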

Semantic attributes

Span attributes follow OpenTelemetry's GenAI Semantic Conventions:

Session span

Attribute               Description
architect.task          User task (first 200 chars)
architect.agent         Agent name
gen_ai.request.model    LLM model
architect.session_id    Session ID

LLM call span

Attribute                     Description
gen_ai.request.model          Model used
gen_ai.usage.input_tokens     Input tokens
gen_ai.usage.output_tokens    Output tokens
gen_ai.usage.cost             Cost in USD
architect.step                Step number

Tool span

Attribute                     Description
architect.tool_name           Tool name
architect.tool_success        Whether it executed successfully
architect.tool_duration_ms    Duration in milliseconds

API

create_tracer()

Factory that returns ArchitectTracer or NoopTracer based on configuration and OpenTelemetry availability.

def create_tracer(
    enabled: bool = False,
    exporter: str = "console",
    endpoint: str = "http://localhost:4317",
    trace_file: str | None = None,
) -> ArchitectTracer | NoopTracer:

ArchitectTracer

class ArchitectTracer:
    def start_session(self, task: str, agent: str, model: str, session_id: str = "") -> ContextManager:
        """Session-level span."""

    def trace_llm_call(self, model: str, tokens_in: int, tokens_out: int, cost: float, step: int) -> ContextManager:
        """Span per LLM call."""

    def trace_tool(self, tool_name: str, success: bool, duration_ms: float, **attrs) -> ContextManager:
        """Span per tool execution."""

    def shutdown(self) -> None:
        """Flush and close the tracer provider."""
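Since trace_tool() receives the outcome as plain arguments, the caller measures duration and success around the tool invocation itself. A hypothetical wrapper could look like this (timed_tool and StubTracer are illustrative, not part of the API):

```python
import time
from contextlib import contextmanager

class StubTracer:
    """Stand-in exposing the same trace_tool() signature as ArchitectTracer."""
    def __init__(self):
        self.calls = []

    def trace_tool(self, tool_name, success, duration_ms, **attrs):
        self.calls.append((tool_name, success, duration_ms))

@contextmanager
def timed_tool(tracer, tool_name):
    """Measure a tool invocation and report it as a tool span."""
    start = time.perf_counter()
    success = True
    try:
        yield
    except Exception:
        success = False
        raise
    finally:
        duration_ms = (time.perf_counter() - start) * 1000
        tracer.trace_tool(tool_name=tool_name, success=success,
                          duration_ms=duration_ms)
```

Usage: `with timed_tool(tracer, "read_file"): ...` records the span even when the tool raises.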

NoopTracer / NoopSpan

No-op implementation for when OpenTelemetry is not available:

class NoopSpan:
    def set_attribute(self, key, value): pass
    def __enter__(self): return self
    def __exit__(self, *args): pass

class NoopTracer:
    def start_session(self, **kwargs): return NoopSpan()
    def trace_llm_call(self, **kwargs): return NoopSpan()
    def trace_tool(self, **kwargs): return NoopSpan()
    def shutdown(self): pass
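Because the no-op classes mirror the real tracer's interface, call sites never branch on availability. A quick sanity check (the classes are re-declared here so the snippet is self-contained, and the attribute values are illustrative):

```python
class NoopSpan:
    def set_attribute(self, key, value): pass
    def __enter__(self): return self
    def __exit__(self, *args): pass

class NoopTracer:
    def start_session(self, **kwargs): return NoopSpan()
    def trace_llm_call(self, **kwargs): return NoopSpan()
    def trace_tool(self, **kwargs): return NoopSpan()
    def shutdown(self): pass

# Identical call shape to the real ArchitectTracer, zero side effects.
tracer = NoopTracer()
with tracer.start_session(task="demo", agent="default", model="some-model") as span:
    span.set_attribute("architect.session_id", "s-123")
    with tracer.trace_llm_call(model="some-model", tokens_in=10,
                               tokens_out=5, cost=0.0, step=1):
        pass
tracer.shutdown()
```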

Constants

SERVICE_NAME = "architect"
SERVICE_VERSION = "1.0.0"

Wiring in CLI

# In cli.py (run command)
tracer = create_tracer(
    enabled=config.telemetry.enabled,
    exporter=config.telemetry.exporter,
    endpoint=config.telemetry.endpoint,
    trace_file=config.telemetry.trace_file,
)

with tracer.start_session(task=prompt, agent=agent_name, model=model, session_id=session_id):
    state = loop.run(prompt, stream=use_stream)

tracer.shutdown()
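If loop.run() raises, a shutdown() call placed after the with block never executes and buffered spans may be lost. A defensive variant wraps the session in try/finally (a sketch with stand-in classes; the real wiring uses the objects shown above):

```python
class _Span:
    def __enter__(self): return self
    def __exit__(self, *args): return False  # do not swallow exceptions

class _Tracer:
    def __init__(self): self.shut_down = False
    def start_session(self, **kwargs): return _Span()
    def shutdown(self): self.shut_down = True

tracer = _Tracer()
try:
    with tracer.start_session(task="demo", agent="default",
                              model="some-model", session_id="s-123"):
        raise RuntimeError("simulated loop.run() crash")
except RuntimeError:
    pass
finally:
    tracer.shutdown()  # spans are flushed even on failure
```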

Example with Jaeger

# Start Jaeger
docker run -d --name jaeger \
  -p 16686:16686 \
  -p 4317:4317 \
  jaegertracing/all-in-one:latest

# Configure architect
cat > config.yaml << 'EOF'
telemetry:
  enabled: true
  exporter: otlp
  endpoint: http://localhost:4317
EOF

# Run with telemetry
architect run "refactor utils.py" -c config.yaml --mode yolo

# View traces in Jaeger UI
open http://localhost:16686
# → Service "architect" → search for recent traces

Files

File                                      Contents
src/architect/telemetry/__init__.py       Module exports
src/architect/telemetry/otel.py           ArchitectTracer, NoopTracer, NoopSpan, create_tracer()
src/architect/config/schema.py            TelemetryConfig (Pydantic model)
src/architect/cli.py                      Wiring: create_tracer() + start_session() + shutdown()
tests/test_telemetry/test_telemetry.py    20 tests (9 skipped without OpenTelemetry)
tests/test_bugfixes/test_bugfixes.py      BUG-5 tests (wiring)