Logging system

Describes the full logging architecture of the project: three independent pipelines, the custom HUMAN level, the visual format with icons, and the integration with the agent loop.

v1.1.0: HUMAN pipeline messages now support i18n (English by default, Spanish configurable). The examples on this page show the English format (default). See i18n.md.


Architecture: three pipelines

The system uses structlog on top of the Python stdlib with three independent pipelines. Each one has its own handler, level, and format.

structlog.configure(
    processors=[..., wrap_for_formatter],  <- always wrap_for_formatter
    logger_factory=LoggerFactory(),        <- stdlib loggers
)
    |
    v
logging.root
    |-- [1] FileHandler        (JSON Lines, DEBUG+)       <- only if --log-file
    |-- [2] HumanLogHandler    (stderr, HUMAN=25 only)    <- always active (except --quiet/--json)
    +-- [3] StreamHandler      (stderr, WARNING+ / -v)    <- technical console, excludes HUMAN

Pipeline 1 — JSON file (optional)

Activated with --log-file PATH. Captures all events (DEBUG+) in JSON Lines format.

architect run "..." --log-file logs/session.jsonl
cat logs/session.jsonl | jq 'select(.event == "agent.tool_call.execute")'
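
The same filtering can be done from Python with the standard library alone. A minimal sketch that relies only on the documented event field (any other keys depend on what each log call passed):

import json

# Iterate over the JSON Lines session log and keep only tool-execution events.
with open("logs/session.jsonl") as f:
    for line in f:
        entry = json.loads(line)
        if entry.get("event") == "agent.tool_call.execute":
            print(entry)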

Pipeline 2 — Human handler (agent traceability)

Active by default. Only processes events at HUMAN level (25). Produces readable output with icons on stderr.

v1.1.0: The HUMAN output language depends on the language configuration (default: en). See i18n.md.

🔄 Step 1 → LLM call (5 messages)
   ✓ LLM responded with 2 tool calls

   🔧 read_file → src/main.py
      ✓ OK

   🔧 edit_file → src/main.py (3→5 lines)
      ✓ OK
      🔍 Hook python-lint: ✓

🔄 Step 2 → LLM call (9 messages)
   ✓ LLM responded with final text

✅ Agent complete (2 steps)
   Reason: LLM decided it was done
   Cost: $0.0042

With language: es in the configuration:

🔄 Paso 1 → Llamada al LLM (5 mensajes)
   ✓ LLM respondió con 2 tool calls
   ...
✅ Agente completado (2 pasos)
   Razón: LLM decidió que terminó
   Coste: $0.0042

Disabled with --quiet or --json.

Pipeline 3 — Technical console

Controlled by -v / -vv / -vvv. Shows technical logs (INFO/DEBUG) on stderr. Excludes HUMAN events to avoid duplicates.

Flag        Level     What it shows
(no -v)     WARNING   Problems only
-v          INFO      System operations, config, registrations
-vv         DEBUG     Full args, LLM responses, timing
-vvv        DEBUG     Everything, including HTTP
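
This mapping corresponds to the _verbose_to_level() helper used later in setup.py. A minimal sketch of what it might look like (the real implementation may differ, e.g. in how -vvv enables HTTP logging):

import logging

def _verbose_to_level(verbose: int) -> int:
    """Map the -v count to a stdlib level; two or more -v flags map to DEBUG."""
    if verbose <= 0:
        return logging.WARNING
    if verbose == 1:
        return logging.INFO
    return logging.DEBUG  # -vv, -vvv, ...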

HUMAN level (25)

Custom level between INFO (20) and WARNING (30):

# logging/levels.py
HUMAN = 25
logging.addLevelName(HUMAN, "HUMAN")

HUMAN events capture the agent's traceability: what it is doing step by step. They are not technical logs but information for the end user.
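
Because HUMAN sits between INFO and WARNING, it works with the plain stdlib API once registered. A small sketch (logger name and messages are illustrative; the routing comments follow the pipeline rules above):

import logging

HUMAN = 25
logging.addLevelName(HUMAN, "HUMAN")

log = logging.getLogger("architect.agent")
log.log(HUMAN, "agent.llm.call")   # level 25: HumanLogHandler (and the JSON file, if enabled)
log.info("registered 12 tools")    # INFO (20): technical console only when -v is passed
log.warning("budget at 80%")       # WARNING (30): technical console even without -v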


HumanFormatter — visual event formatting

Each event type has its own format with icons:

Loop events

The examples show the English format (default). With language: es, the messages are shown in Spanish.

Event                        Format (EN)                                    Icon
agent.llm.call               🔄 Step N → LLM call (M messages)              🔄
agent.llm.response (tools)   ✓ LLM responded with N tool calls
agent.llm.response (text)    ✓ LLM responded with final text
agent.complete               ✅ Agent complete (N steps) + reason + cost

Tool events

Event                              Format                              Icon
agent.tool_call.execute (local)    🔧 tool → args_summary              🔧
agent.tool_call.execute (MCP)      🌐 tool → summary (MCP: server)     🌐
agent.tool_call.complete (ok)      ✓ OK
agent.tool_call.complete (error)   ✗ ERROR: message
agent.hook.complete (named)        🔍 Hook name: ✓/⚠️ detail            🔍

Safety nets

Event                     Format (EN)                    Icon
safety.user_interrupt     ⚠️ Interrupted by user         ⚠️
safety.max_steps          ⚠️ Step limit reached (N/M)    ⚠️
safety.budget_exceeded    ⚠️ Budget exceeded ($X/$Y)     ⚠️
safety.timeout            ⚠️ Timeout reached             ⚠️
safety.context_full       ⚠️ Context full                ⚠️

Errors and lifecycle

Event                           Format (EN)                              Icon
agent.llm_error                 ❌ LLM error: message
agent.step_timeout              ⚠️ Step timeout (Ns)                     ⚠️
agent.closing                   🔄 Closing (reason, N steps)             🔄
agent.loop.complete (success)   (N steps, M tool calls) + cost
agent.loop.complete (partial)   ⚡ Stopped (status -- reason, N steps)

Pipeline (v1.1.0)

Event                    Format                                              Icon
pipeline.step_start      ━ Pipeline step 1/3: analyze (agent: plan) ━━━━━
pipeline.step_skipped    ⏭️ Step 'deploy' skipped (condition not met)        ⏭️
pipeline.step_done       ✓ Step 'analyze' → success ($0.0234, 12.5s)         ✓/✗

Ralph Loop (v1.1.0)

Event                    Format                                                 Icon
ralph.iteration_start    ━ Ralph iteration 1/5 (check: pytest tests/) ━━━
ralph.checks_result      🧪 Checks: 3/5 passed (or 5/5 passed ✓)                🧪
ralph.iteration_done     ✓ Iteration 1 → success ($0.0234, 45.2s)               ✓/✗
ralph.complete           ✅ Ralph complete — 2 iterations, success ($0.0423)    ✅/⚠️

Auto-Reviewer (v1.1.0)

Event                Format                                               Icon
reviewer.start       ━ Auto-Review (142 diff lines) ━━━━━━━━━━━━━
reviewer.complete    ✓ Review complete: approved, 2 issues, score 8/10    ✓/✗

Parallel Runs (v1.1.0)

Event                    Format                                                           Icon
parallel.worker_done     ✓ Worker 1 (gpt-4.1) → success ($0.0456, 120.3s)                 ✓/✗
parallel.worker_error    ✗ Worker 3 → error: timeout
parallel.complete        ⚡ Parallel complete — 3 workers: 2 success, 1 failed ($0.0857)

Competitive Eval (v1.1.0)

Event                     Format                                                     Icon
competitive.model_done    🏆 gpt-4.1: #1 (score: 85, 5/5 checks, $0.0456)            🏆/🥈/🥉
competitive.ranking       🏁 Ranking final: gpt-4.1 > claude-sonnet > gpt-4.1-mini   🏁

Context

Event                      Format                                   Icon
context.compressing        📦 Compressing context -- N exchanges    📦
context.window_enforced    📦 Context window: removed N messages    📦
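
Internally this is a dispatch on the event name. A much-reduced sketch of what HumanFormatter.format_event() might look like for two of the events above (the method name comes from the handler description below; the signature and kwarg names are assumptions):

class HumanFormatter:
    """Turn a HUMAN-level event dict into one readable line (reduced sketch)."""

    def format_event(self, event: str, kw: dict) -> str:
        if event == "agent.llm.call":
            return f"🔄 Step {kw['step'] + 1} → LLM call ({kw['messages_count']} messages)"
        if event == "agent.tool_call.complete":
            return "   ✓ OK" if kw.get("success") else f"   ✗ ERROR: {kw.get('error')}"
        # ... one branch per event family: tools, hooks, safety nets, pipeline, ralph, ...
        return f"{event} {kw}"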

Args summarizer (_summarize_args)

Each tool has an optimized summary so the user can understand at a glance what the agent is doing:

Tool                        Summary example
read_file                   src/main.py
write_file                  src/main.py (42 lines)
edit_file                   src/main.py (3→5 lines)
apply_patch                 src/main.py (+5 -3)
search_code                 "validate_path" in src/
grep                        "import jwt" in src/
run_command                 pytest tests/ -x
MCP tools                   first argument truncated to 60 chars
Unknown tool without args   (no args)
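
A reduced sketch of how _summarize_args might build these strings (argument names such as path, query, and command are assumptions; the real function also adds details like line counts):

def _summarize_args(tool: str, args: dict) -> str:
    """One-line argument summary shown after the tool name (sketch)."""
    if not args:
        return "(no args)"
    if tool in ("read_file", "write_file", "edit_file", "apply_patch"):
        return str(args.get("path", ""))                      # e.g. src/main.py
    if tool in ("search_code", "grep"):
        return f'"{args.get("query", "")}" in {args.get("path", "")}'
    if tool == "run_command":
        return str(args.get("command", ""))
    # MCP and unknown tools: first argument value, truncated to 60 chars
    return str(next(iter(args.values())))[:60]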

HumanLog — typed helper

The AgentLoop emits HUMAN events through HumanLog, which provides typed methods:

hlog = HumanLog(structlog.get_logger())

hlog.llm_call(step=0, messages_count=5)          # 🔄 Step 1 → LLM (5 messages)
hlog.llm_response(tool_calls=2)                   # ✓ LLM responded with 2 tool calls
hlog.tool_call("read_file", {"path": "main.py"})  # 🔧 read_file → main.py
hlog.tool_call("mcp_docs_search", {"q": "..."}, is_mcp=True, mcp_server="docs")
                                                    # 🌐 mcp_docs_search → ... (MCP: docs)
hlog.tool_result("read_file", success=True)        # ✓ OK
hlog.hook_complete("edit_file", hook="ruff", success=True)
                                                    # 🔍 Hook ruff: ✓
hlog.agent_done(step=3, cost="$0.0042")            # ✅ Agent complete (3 steps)
hlog.safety_net("max_steps", step=50, max_steps=50)
                                                    # ⚠️ Step limit reached
hlog.closing("max_steps", steps=50)                # 🔄 Closing (max_steps, 50 steps)
hlog.llm_error("timeout")                          # ❌ LLM error: timeout
hlog.step_timeout(seconds=60)                      # ⚠️ Step timeout (60s)
hlog.loop_complete("success", None, 3, 5)          # (3 steps, 5 tool calls)

# Pipeline (v1.1.0)
hlog.pipeline_step_start("analyze", "plan", 1, 3)  # ━ Pipeline step 1/3: analyze ━━━
hlog.pipeline_step_skipped("deploy")                # ⏭️ Step 'deploy' skipped
hlog.pipeline_step_done("analyze", "success", 0.02, 12.5)  # ✓ Step 'analyze' → success

# Ralph Loop (v1.1.0)
hlog.ralph_iteration_start(1, 5, "pytest tests/")   # ━ Ralph iteration 1/5 ━━━
hlog.ralph_checks_result(1, 3, 5, False)             # 🧪 Checks: 3/5 passed
hlog.ralph_iteration_done(1, "partial", 0.02, 45.2)  # ✗ Iteration 1 → partial
hlog.ralph_complete(2, "success", 0.04)              # ✅ Ralph complete — 2 iterations

# Auto-Reviewer (v1.1.0)
hlog.reviewer_start(142)                             # ━ Auto-Review (142 diff lines) ━━━
hlog.reviewer_complete(True, 2, "8/10")              # ✓ Review complete: approved

# Parallel Runs (v1.1.0)
hlog.parallel_worker_done(1, "gpt-4.1", "success", 0.04, 120.3)
                                                      # ✓ Worker 1 (gpt-4.1) → success
hlog.parallel_worker_error(3, "timeout")              # ✗ Worker 3 → error: timeout
hlog.parallel_complete(3, 2, 1, 0.08)                 # ⚡ Parallel complete — 3 workers

# Competitive Eval (v1.1.0)
hlog.competitive_model_done("gpt-4.1", 1, 85, 0.04, 5, 5)
                                                      # 🏆 gpt-4.1: #1 (score: 85)
hlog.competitive_ranking([{"model": "gpt-4.1"}, {"model": "claude-sonnet"}])
                                                      # 🏁 Ranking final: gpt-4.1 > claude-sonnet
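
Each typed method is a thin wrapper that emits one HUMAN-level event with fixed kwargs. A minimal sketch of two of them (attribute and kwarg names are assumptions; it assumes the bound logger exposes a stdlib-style .log() method):

HUMAN = 25  # see logging/levels.py

class HumanLog:
    """Typed facade over a structlog logger for agent-traceability events (sketch)."""

    def __init__(self, logger):
        self._log = logger

    def llm_call(self, step: int, messages_count: int) -> None:
        # Rendered by HumanFormatter as "🔄 Step N → LLM call (M messages)"
        self._log.log(HUMAN, "agent.llm.call", step=step, messages_count=messages_count)

    def tool_result(self, tool: str, success: bool, error: str | None = None) -> None:
        # Rendered as "✓ OK" or "✗ ERROR: message"
        self._log.log(HUMAN, "agent.tool_call.complete", tool=tool, success=success, error=error)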

HumanLogHandler — structured event extraction

HumanLogHandler is a stdlib logging.Handler that:

  1. Filters only events at the exact HUMAN level (25)
  2. Extracts the event dict from record.msg (placed by wrap_for_formatter)
  3. Passes the event to HumanFormatter.format_event()
  4. Writes the formatted result to stderr

Event dict extraction

When structlog uses wrap_for_formatter, the event is stored as a dict in record.msg:

def emit(self, record):
    if isinstance(record.msg, dict) and not record.args:
        # structlog event: extract from dict
        event = record.msg["event"]        # "agent.llm.call"
        kw = {k: v for k, v in record.msg.items() if k not in _STRUCTLOG_META}
    else:
        # Fallback: extract from record attributes
        event = getattr(record, "event", None) or record.getMessage()

The filtered structlog fields (_STRUCTLOG_META) are: event, level, log_level, logger, logger_name, timestamp. These are processor metadata, not event kwargs.
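
As a set literal (the form shown here is a sketch; the member list matches the fields above):

# Processor metadata injected by structlog; stripped before formatting.
_STRUCTLOG_META = {"event", "level", "log_level", "logger", "logger_name", "timestamp"}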


Configuration (logging/setup.py)

configure_logging(config, json_output, quiet)

def configure_logging(config: LoggingConfig, json_output=False, quiet=False):
    # 1. Clear previous configuration
    logging.root.handlers.clear()
    structlog.reset_defaults()

    # Human and technical console output are suppressed by --quiet / --json
    show_human = show_console = not (quiet or json_output)

    # 2. Pipeline 1: JSON file (if config.file is set)
    if config.file:
        file_handler = FileHandler(config.file)
        file_handler.setFormatter(ProcessorFormatter(processor=JSONRenderer()))
        logging.root.addHandler(file_handler)

    # 3. Pipeline 2: Human handler (if not --quiet or --json)
    if show_human:
        human_handler = HumanLogHandler(stream=sys.stderr)
        human_handler.setLevel(HUMAN)
        human_handler.addFilter(lambda r: r.levelno == HUMAN)
        logging.root.addHandler(human_handler)

    # 4. Pipeline 3: Technical console (if not --quiet or --json)
    if show_console:
        console_handler = StreamHandler(sys.stderr)
        console_handler.setLevel(_verbose_to_level(config.verbose))
        console_handler.addFilter(lambda r: r.levelno != HUMAN)  # exclude HUMAN
        console_handler.setFormatter(ProcessorFormatter(processor=ConsoleRenderer()))
        logging.root.addHandler(console_handler)

    # 5. structlog: ALWAYS wrap_for_formatter
    structlog.configure(
        processors=[..., wrap_for_formatter],
        logger_factory=LoggerFactory(),
    )
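
A typical call from the CLI entry point might look like this (the LoggingConfig fields file and verbose come from the snippet above; the constructor form is an assumption):

# --log-file logs/session.jsonl -vv, with human output enabled
configure_logging(
    LoggingConfig(file="logs/session.jsonl", verbose=2),
    json_output=False,
    quiet=False,
)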

Why always wrap_for_formatter

The final structlog processor is always ProcessorFormatter.wrap_for_formatter, regardless of whether --log-file is used or not. This ensures events flow as structured dicts through the stdlib handler system, which allows HumanLogHandler to extract the event dict from record.msg.

If ConsoleRenderer were used directly in the processor chain (as was done before v0.15.3), events would be rendered to plain text before reaching the handlers, and HumanLogHandler would not be able to extract the event names to format them.
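
The effect is easy to check in isolation: with wrap_for_formatter as the final processor, the LogRecord that reaches a handler carries the event dict in record.msg. A standalone demo (not project code):

import logging
import sys

import structlog
from structlog.stdlib import BoundLogger, LoggerFactory, ProcessorFormatter

class ShowMsgType(logging.Handler):
    def emit(self, record):
        # record.msg is the structlog event dict, not a rendered string
        print(type(record.msg), record.msg, file=sys.stderr)

logging.root.addHandler(ShowMsgType())
logging.root.setLevel(logging.INFO)

structlog.configure(
    processors=[ProcessorFormatter.wrap_for_formatter],
    wrapper_class=BoundLogger,
    logger_factory=LoggerFactory(),
)
structlog.get_logger().info("agent.llm.call", step=0)
# prints: <class 'dict'> {'step': 0, 'event': 'agent.llm.call'}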


Verbose levels

Verbose       Console level   What the user sees
0 (default)   WARNING         Only HUMAN logs (agent steps) + errors
1 (-v)        INFO            HUMAN + system operations
2 (-vv)       DEBUG           HUMAN + all technical detail
3+ (-vvv)     DEBUG           HUMAN + HTTP + payloads

HUMAN logs are always shown (except with --quiet / --json), regardless of -v.


Relationship with OpenTelemetry (v1.0.0)

Starting with v1.0.0, architect supports OpenTelemetry traces as a complement to structured logging. Traces and logs are independent systems:

System                      Purpose                                    Configuration
Logging (structlog)         Agent events, debugging, human output      logging: in config + -v flags
Telemetry (OpenTelemetry)   Session/LLM/tool spans for observability   telemetry: in config

Logs go to stderr (human/technical) and JSON file. Traces go to OTLP, console, or a separate JSON file. They do not mix.

See telemetry.md for OpenTelemetry configuration.


Module files

File                Contents
logging/levels.py   Definition of HUMAN = 25
logging/human.py    HumanFormatter, HumanLogHandler, HumanLog, _summarize_args
logging/setup.py    configure_logging(), configure_logging_basic(), get_logger()
telemetry/otel.py   ArchitectTracer, NoopTracer (independent system)