Configuration Reference

Note: The references (v4-A1), (v4-B1), etc. refer to phases of the internal development plan (Base plan v4). All of them are included in v1.0.0.

Layer System

Configuration is resolved across 4 layers (lowest to highest priority):

1. Pydantic defaults (code)

2. YAML file (-c config.yaml)

3. Environment variables (ARCHITECT_*)

4. CLI flags (--model, --workspace, etc.)

The deep_merge() function in config/loader.py combines layers recursively: nested dicts are merged instead of replaced. This way you can override llm.model from the CLI without losing llm.timeout from the YAML.
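A minimal sketch of such a recursive merge (illustrative only; the real implementation in config/loader.py may differ in details):

```python
def deep_merge(base: dict, override: dict) -> dict:
    """Merge override into base: nested dicts merge, scalar values replace."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(merged.get(key), dict) and isinstance(value, dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged

yaml_layer = {"llm": {"model": "gpt-4o-mini", "timeout": 60}}
cli_layer = {"llm": {"model": "gpt-4o"}}
# The CLI layer wins on llm.model, but llm.timeout from the YAML survives.
assert deep_merge(yaml_layer, cli_layer) == {"llm": {"model": "gpt-4o", "timeout": 60}}
```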


Environment Variables

| Variable | Config Field | Example |
| --- | --- | --- |
| LITELLM_API_KEY | Read by LiteLLM directly (not by architect) | sk-... |
| ARCHITECT_MODEL | llm.model | gpt-4o |
| ARCHITECT_API_BASE | llm.api_base | http://localhost:8000 |
| ARCHITECT_LOG_LEVEL | logging.level | debug |
| ARCHITECT_WORKSPACE | workspace.root | /home/user/project |

LITELLM_API_KEY is the default API key variable. If you need to read the key from a different variable, set llm.api_key_env in the YAML.
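Conceptually, the env layer maps each flat ARCHITECT_* variable onto a nested config key before merging. A hypothetical sketch (the variable names come from the table above; the mapping table and function signature are assumptions, not the tool's actual code):

```python
import os

# Assumed mapping for illustration; mirrors the table above.
ENV_MAP = {
    "ARCHITECT_MODEL": ("llm", "model"),
    "ARCHITECT_API_BASE": ("llm", "api_base"),
    "ARCHITECT_LOG_LEVEL": ("logging", "level"),
    "ARCHITECT_WORKSPACE": ("workspace", "root"),
}

def load_env_overrides(environ=os.environ) -> dict:
    """Build a nested override dict from any ARCHITECT_* variables that are set."""
    overrides: dict = {}
    for var, (section, field) in ENV_MAP.items():
        if var in environ:
            overrides.setdefault(section, {})[field] = environ[var]
    return overrides

assert load_env_overrides({"ARCHITECT_MODEL": "gpt-4o"}) == {"llm": {"model": "gpt-4o"}}
```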


CLI Flags that Override Config

| Flag | Overridden Field |
| --- | --- |
| --model MODEL | llm.model |
| --api-base URL | llm.api_base |
| --api-key KEY | llm.api_key_env → direct key |
| --timeout N | Total session timeout (watchdog); does not override llm.timeout (per-request) |
| --no-stream | llm.stream = False |
| --workspace PATH | workspace.root |
| --max-steps N | agent_config.max_steps |
| --mode MODE | agent_config.confirm_mode |
| -v / -vv / -vvv | logging.verbose (count) |
| --log-level LEVEL | logging.level |
| --log-file PATH | logging.file |
| --self-eval MODE | evaluation.mode (off/basic/full) |
| --allow-commands | commands.enabled = True |
| --no-commands | commands.enabled = False |
| --budget FLOAT | costs.budget_usd |
| --cache | llm_cache.enabled = True |
| --no-cache | llm_cache.enabled = False |
| --json | JSON output to stdout (disables streaming) |
| --dry-run | Dry-run mode: simulates without executing write tools |
| --report FORMAT | Report format: json, markdown, github |
| --report-file PATH | Write the report to a file (otherwise stdout) |
| --session ID | Resume an existing session by ID |
| --confirm-mode MODE | Override confirm mode (yolo/confirm-all/confirm-sensitive) |
| --context-git-diff REF | Inject the git diff against REF as additional context |
| --exit-code-on-partial | Return exit code 2 if status=partial (default in CI) |

Additional Commands:

| Command | Description |
| --- | --- |
| architect loop TASK --check CMD | Ralph Loop: iterate until checks pass |
| architect pipeline FILE | Pipeline: execute a multi-step YAML workflow |
| architect parallel --task CMD | Parallel: execute in parallel worktrees |
| architect parallel-cleanup | Clean up worktrees from parallel runs |
| architect eval TASK --models CSV --check CMD | Competitive multi-model evaluation |
| architect init --preset NAME | Initialize a project with a configuration preset |
| --health (in architect run) | Code health analysis before/after the run |

Complete YAML Schema

# ==============================================================================
# LLM
# ==============================================================================
llm:
  provider: litellm        # always "litellm"
  mode: direct             # "direct" | "proxy" (LiteLLM Proxy Server)
  model: gpt-4o-mini       # any LiteLLM model

  # api_base: http://localhost:8000   # custom endpoint (Proxy, Ollama, etc.)

  api_key_env: LITELLM_API_KEY       # env var containing the API key

  timeout: 60              # seconds per LLM call
  retries: 2               # retries on transient errors (not auth)
  stream: true             # streaming by default; disabled with --no-stream/--json/--quiet
  prompt_caching: false    # mark system prompt with cache_control → 50-90% savings on Anthropic/OpenAI

# ==============================================================================
# Agents (custom or overrides of defaults)
# ==============================================================================
agents:
  # Partial override of a default:
  build:
    confirm_mode: confirm-all    # only overrides this field
    max_steps: 10

  # Completely new agent:
  deploy:
    system_prompt: |
      You are a specialized deployment agent.
      ...
    allowed_tools:
      - read_file
      - list_files
      - write_file
    confirm_mode: confirm-all
    max_steps: 15

# ==============================================================================
# Logging
# ==============================================================================
logging:
  level: human             # "debug" | "info" | "human" | "warn" | "error"
                           # v3: "human" shows agent traceability
  verbose: 0               # 0=human logs only, 1=info, 2=debug, 3+=all
  # file: logs/architect.jsonl   # JSON Lines; always full DEBUG

# ==============================================================================
# Workspace
# ==============================================================================
workspace:
  root: .                  # root directory; all file ops confined here
  allow_delete: false      # true = enable delete_file tool

# ==============================================================================
# MCP (Model Context Protocol — remote tools)
# ==============================================================================
mcp:
  servers:
    - name: github
      url: http://localhost:3001
      token_env: GITHUB_TOKEN         # env var with Bearer token

    - name: database
      url: https://mcp.example.com/db
      token_env: DB_TOKEN

    # inline token (not recommended in production):
    # - name: internal
    #   url: http://internal:8080
    #   token: "hardcoded-token"

# ==============================================================================
# Indexer — repository tree in the system prompt (F10)
# ==============================================================================
indexer:
  enabled: true            # false = no tree in the prompt; search tools remain available
  max_file_size: 1000000   # bytes; files larger than this are omitted from the index
  exclude_dirs: []         # additional dirs to exclude (besides .git, node_modules, etc.)
  # exclude_dirs:
  #   - vendor
  #   - .terraform
  exclude_patterns: []     # additional patterns to exclude (besides *.pyc, *.min.js, etc.)
  # exclude_patterns:
  #   - "*.generated.py"
  #   - "*.pb.go"
  use_cache: true          # disk-based index cache, 5-minute TTL

# ==============================================================================
# Context — context window management (F11)
# ==============================================================================
context:
  # Level 1: truncate long tool results
  max_tool_result_tokens: 2000   # ~4 chars/token; 0 = disable truncation

  # Level 2: compress old steps with the LLM
  summarize_after_steps: 8       # 0 = disable compression
  keep_recent_steps: 4           # recent steps to preserve intact

  # Level 3: hard limit on total context window
  max_context_tokens: 80000      # 0 = no limit (dangerous for long tasks)
  # Reference: gpt-4o/mini → 80000, claude-sonnet-4-6 → 150000

  # Parallel tool calls
  parallel_tools: true           # false = always sequential

# ==============================================================================
# Evaluation — result self-evaluation (F12)
# ==============================================================================
evaluation:
  mode: off                # "off" | "basic" | "full"
                           # CLI override: --self-eval basic|full
  max_retries: 2           # retries in "full" mode (range: 1-5)
  confidence_threshold: 0.8  # confidence threshold to accept result (0.0-1.0)

# ==============================================================================
# Commands — system command execution (F13)
# ==============================================================================
commands:
  enabled: true            # false = do not register run_command; --allow-commands/--no-commands
  default_timeout: 30      # default seconds (1-600)
  max_output_lines: 200    # stdout/stderr lines before truncation (10-5000)
  blocked_patterns: []     # extra regexes to block (in addition to built-ins)
  # blocked_patterns:
  #   - "git push --force"
  #   - "docker rm"
  safe_commands: []        # additional commands classified as 'safe'
  allowed_only: false      # if true, only safe/dev; dangerous rejected in execute()

# ==============================================================================
# Costs — LLM call cost tracking (F14)
# ==============================================================================
costs:
  enabled: true            # false = no cost tracking
  # prices_file: my_prices.json  # custom prices (same format as default_prices.json)
  # budget_usd: 1.0        # stop if exceeding $1.00; Override: --budget 1.0
  # warn_at_usd: 0.5       # log warning upon reaching $0.50

# ==============================================================================
# LLM Cache — local LLM response cache for development (F14)
# ==============================================================================
llm_cache:
  enabled: false           # true = enable; Override: --cache / --no-cache
  dir: ~/.architect/cache  # directory to store entries
  ttl_hours: 24            # validity of each entry (1-8760 hours)

# ==============================================================================
# Hooks — full lifecycle (v4-A1, backward compat v3-M4)
# ==============================================================================
hooks:
  # Pre-hooks: run BEFORE the action. Exit code 2 = BLOCK.
  pre_tool_use:
    - name: validate-secrets
      command: "bash scripts/check-secrets.sh"
      matcher: "write_file|edit_file"      # regex to filter tools
      file_patterns: ["*.py", "*.env"]
      timeout: 5

  # Post-hooks: run AFTER the action.
  post_tool_use:
    - name: python-lint
      command: "ruff check {file} --no-fix"    # {file} is replaced with the edited path
      file_patterns: ["*.py"]                    # glob patterns
      timeout: 15                                # seconds (1-300, default: 10)
      enabled: true                              # false = skip this hook
    - name: python-typecheck
      command: "mypy {file} --no-error-summary"
      file_patterns: ["*.py"]
      timeout: 30

  # Session hooks (notification only, cannot block)
  session_start: []
  session_end: []
  on_error: []
  agent_complete: []
  budget_warning: []
  context_compress: []

  # Pre/post LLM call
  pre_llm_call: []
  post_llm_call: []

  # Backward compatibility v3-M4: post_edit maps to post_tool_use
  # with automatic matcher for edit_file/write_file/apply_patch
  post_edit:
    - name: legacy-lint
      command: "ruff check {file}"
      file_patterns: ["*.py"]
      timeout: 15

  # Fields for each hook:
  # name:          str           — descriptive name
  # command:       str           — shell command ({file} is replaced)
  # matcher:       str = "*"    — regex/glob to filter tools
  # file_patterns: list[str]    — glob patterns to filter files
  # timeout:       int = 10     — seconds (1-300)
  # async:         bool = false — true = run in background without blocking
  # enabled:       bool = true  — false = skip

# ==============================================================================
# Guardrails — deterministic security (v4-A2)
# ==============================================================================
guardrails:
  enabled: false              # true = enable guardrails
  protected_files: []         # globs: [".env", "*.pem", "secrets/**"]
  blocked_commands: []        # regexes: ["git push --force", "docker rm"]
  max_files_modified: null    # limit of distinct files per session (null = no limit)
  max_lines_changed: null     # limit of accumulated lines changed
  max_commands_executed: null  # limit of commands executed
  require_test_after_edit: false  # force test every N edits

  code_rules: []              # simple static analysis rules
  # - pattern: "eval\\("
  #   message: "Usage of eval() detected"
  #   severity: block          # block | warn

  quality_gates: []           # final verification upon completion
  # - name: tests
  #   command: "pytest tests/ -x"
  #   required: true           # true = blocks if it fails
  #   timeout: 120

# ==============================================================================
# Skills — project context and workflows (v4-A3)
# ==============================================================================
skills:
  auto_discover: true         # auto-discover skills in .architect/skills/
  inject_by_glob: true        # inject skills based on active files

# ==============================================================================
# Memory — procedural memory (v4-A4)
# ==============================================================================
memory:
  enabled: false              # true = enable correction detection
  auto_detect_corrections: true  # automatically detect corrections in user messages

# ==============================================================================
# Sessions — persistence and resume (v4-B1)
# ==============================================================================
sessions:
  auto_save: true             # save state after each step (default: true)
  cleanup_after_days: 7       # days after which `architect cleanup` removes sessions

# ==============================================================================
# Ralph Loop — automatic iteration with checks (v4-C1)
# ==============================================================================
ralph:
  max_iterations: 25           # maximum iterations (1-100)
  max_cost: null               # maximum total cost in USD (null = no limit)
  max_time: null               # maximum total time in seconds (null = no limit)
  completion_tag: COMPLETE     # tag the agent emits when declaring completion
  agent: build                 # agent to use in each iteration

# ==============================================================================
# Parallel Runs — parallel execution in git worktrees (v4-C2)
# ==============================================================================
parallel:
  workers: 3                   # number of parallel workers (1-10)
  agent: build                 # agent to use in each worker
  max_steps: 50                # maximum steps per worker
  budget_per_worker: null      # USD per worker (null = no limit)
  timeout_per_worker: null     # seconds per worker (null = 600s)

# ==============================================================================
# Checkpoints — git restore points (v4-C4)
# ==============================================================================
checkpoints:
  enabled: false               # true = enable automatic checkpoints in the AgentLoop
  every_n_steps: 5             # create checkpoint every N steps (1-50)

# ==============================================================================
# Auto-Review — automatic post-build review
# ==============================================================================
auto_review:
  enabled: false               # true = enable auto-review after completion
  review_model: null           # model for the reviewer (null = same as builder)
  max_fix_passes: 1            # fix passes (0 = report only, 1-3 = fix)

# ==============================================================================
# Telemetry — OpenTelemetry traces (v1.0.0)
# ==============================================================================
telemetry:
  enabled: false               # true = enable OpenTelemetry traces
  exporter: console            # "otlp" | "console" | "json-file"
  endpoint: http://localhost:4317  # gRPC endpoint for OTLP
  trace_file: null             # file path for json-file (e.g.: .architect/traces.json)

# ==============================================================================
# Health — code health analysis (v1.0.0)
# ==============================================================================
health:
  enabled: false               # true = automatic analysis (no --health flag needed)
  include_patterns:            # glob patterns of files to analyze
    - "**/*.py"
  exclude_dirs:                # directories to exclude from analysis
    - .git
    - venv
    - __pycache__
    - node_modules

The load_config() Function

def load_config(
    config_path: Path | None = None,
    cli_args: dict | None = None,
) -> AppConfig:
    # 1. Load YAML (empty if config_path=None)
    yaml_dict = load_yaml_config(config_path)

    # 2. Read ARCHITECT_* env vars
    env_dict = load_env_overrides()

    # 3. Merge: yaml ← env
    merged = deep_merge(yaml_dict, env_dict)

    # 4. Apply CLI flags
    if cli_args:
        merged = apply_cli_overrides(merged, cli_args)

    # 5. Validate with Pydantic (extra="forbid")
    return AppConfig(**merged)

If the YAML contains unknown keys, Pydantic raises ValidationError → the CLI displays the error and exits with code 3 (EXIT_CONFIG_ERROR).
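The same strict-key behavior can be mimicked in plain Python (the real loader relies on Pydantic's extra="forbid"; the section set and function name below are illustrative assumptions):

```python
import sys

EXIT_CONFIG_ERROR = 3

# Abbreviated set of valid top-level sections, assumed for illustration.
KNOWN_SECTIONS = {"llm", "workspace", "logging", "context", "evaluation"}

def reject_unknown_keys(merged: dict) -> None:
    """Mimic extra="forbid": any unrecognized top-level key is fatal."""
    unknown = sorted(set(merged) - KNOWN_SECTIONS)
    if unknown:
        print(f"Configuration error: unknown keys {unknown}", file=sys.stderr)
        sys.exit(EXIT_CONFIG_ERROR)

reject_unknown_keys({"llm": {"model": "gpt-4o"}})   # passes silently
# reject_unknown_keys({"lm": {}})  # would exit with code 3
```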


Common Configuration Examples

Minimal (API key in env only)

export LITELLM_API_KEY=sk-...
architect run "analyze the project" -a resume

OpenAI with Explicit Config

llm:
  model: gpt-4o
  api_key_env: OPENAI_API_KEY
  timeout: 120
  retries: 3

workspace:
  root: /my/project
  allow_delete: false

Anthropic Claude

llm:
  model: claude-sonnet-4-6
  api_key_env: ANTHROPIC_API_KEY
  stream: true

context:
  max_context_tokens: 150000   # Claude has a larger window

Ollama (local, no API key)

llm:
  model: ollama/llama3
  api_base: http://localhost:11434
  retries: 0    # local, no need for retries
  timeout: 300  # local models can be slow

context:
  parallel_tools: false   # no parallelism for slow local models

LiteLLM Proxy (teams)

llm:
  mode: proxy
  model: gpt-4o-mini
  api_base: http://proxy.internal:8000
  api_key_env: LITELLM_PROXY_KEY

CI/CD (yolo mode, no confirmations, with evaluation)

llm:
  model: gpt-4o-mini
  timeout: 120
  retries: 3
  stream: false

workspace:
  root: .

logging:
  verbose: 0
  level: warn

evaluation:
  mode: basic              # evaluate the result in CI
  confidence_threshold: 0.7  # less strict than interactive

architect run "update obsolete imports in src/" \
  --mode yolo --quiet --json \
  -c ci/architect.yaml

Large Repos (with context optimization)

indexer:
  exclude_dirs:
    - vendor
    - .terraform
    - coverage
  exclude_patterns:
    - "*.generated.py"
    - "*.pb.go"
  use_cache: true

context:
  max_tool_result_tokens: 1000   # more aggressive for large repos
  summarize_after_steps: 5       # compress faster
  keep_recent_steps: 3
  max_context_tokens: 60000      # more conservative
  parallel_tools: true

With Command Execution (F13) and Costs (F14)

llm:
  model: claude-sonnet-4-6
  api_key_env: ANTHROPIC_API_KEY
  prompt_caching: true     # saves tokens on repeated calls to the same system prompt

commands:
  enabled: true
  default_timeout: 60
  max_output_lines: 200
  safe_commands:
    - "pnpm test"
    - "cargo check"

costs:
  enabled: true
  budget_usd: 2.0          # maximum $2 per run
  warn_at_usd: 1.0         # warning upon reaching $1

# Local cache for development: avoids repeated LLM calls
llm_cache:
  enabled: false           # enable with --cache in CLI during development
  ttl_hours: 24
# With local cache enabled and budget from CLI
architect run "PROMPT" -a build --cache --budget 1.5 --show-costs

With Lifecycle Hooks (v4-A1)

hooks:
  post_tool_use:
    - name: python-lint
      command: "ruff check {file} --no-fix"
      file_patterns: ["*.py"]
      timeout: 15
    - name: python-typecheck
      command: "mypy {file} --no-error-summary"
      file_patterns: ["*.py"]
      timeout: 30
  pre_tool_use:
    - name: no-secrets
      command: "bash scripts/check-secrets.sh"
      matcher: "write_file|edit_file"
      timeout: 5
# Hooks run automatically — the LLM sees the lint/typecheck output
# and can self-correct errors. Pre-hooks can block actions.
architect run "refactor utils.py" -a build --mode yolo -c config.yaml

With Guardrails (v4-A2)

guardrails:
  enabled: true
  protected_files: [".env", "*.pem", "deploy/**"]
  blocked_commands: ["git push", "docker rm"]
  max_files_modified: 10
  max_lines_changed: 500
  require_test_after_edit: true
  code_rules:
    - pattern: "eval\\("
      message: "Do not use eval()"
      severity: block
  quality_gates:
    - name: tests
      command: "pytest tests/ -x"
      required: true
      timeout: 120

With Skills and Memory (v4-A3/A4)

skills:
  auto_discover: true
  inject_by_glob: true

memory:
  enabled: true
  auto_detect_corrections: true

CI/CD with Reports and Sessions (v4-B)

llm:
  model: gpt-4o-mini
  stream: false
  prompt_caching: true

commands:
  enabled: true
  allowed_only: true

costs:
  enabled: true
  budget_usd: 2.00

sessions:
  auto_save: true
  cleanup_after_days: 30
# Run with report and PR diff context
architect run "review the PR changes" \
  --mode yolo --quiet \
  --context-git-diff origin/main \
  --report github --report-file pr-report.md \
  --budget 2.00 \
  -c ci/architect.yaml

# Resume if left partial
architect resume SESSION_ID --budget 2.00

# Clean up old sessions in CI
architect cleanup --older-than 30

Full Config with Self-Eval

llm:
  model: gpt-4o
  api_key_env: OPENAI_API_KEY
  timeout: 120

workspace:
  root: .

indexer:
  enabled: true
  use_cache: true

context:
  max_tool_result_tokens: 2000
  summarize_after_steps: 8
  max_context_tokens: 80000
  parallel_tools: true

evaluation:
  mode: full               # automatically retry if it fails
  max_retries: 2
  confidence_threshold: 0.85
# Or use just the CLI flag (overrides evaluation.mode from the YAML)
architect run "generate tests for src/auth.py" -a build --self-eval full

Ralph Loop with Checks (v4-C1)

ralph:
  max_iterations: 10
  max_cost: 5.0
  agent: build
# Iterate until tests pass
architect loop "fix the failing tests in src/auth.py" \
  --check "pytest tests/test_auth.py -x" \
  --max-iterations 10

# With multiple checks
architect loop "implement email validation" \
  --check "pytest tests/" \
  --check "ruff check src/" \
  --max-cost 2.0

Parallel Execution (v4-C2)

parallel:
  workers: 3
  agent: build
  budget_per_worker: 1.0
  timeout_per_worker: 300
# Same task with different models
architect parallel "optimize SQL queries" \
  --models gpt-4o,claude-sonnet-4-6,deepseek-chat

# Different tasks in parallel
architect parallel \
  --task "tests for auth" \
  --task "tests for users" \
  --task "tests for billing" \
  --workers 3 --budget-per-worker 1.0

# Clean up worktrees afterwards
architect parallel-cleanup

Multi-Step YAML Pipeline (v4-C3)

# pipeline.yaml
name: implement-and-test
steps:
  - name: implement
    prompt: "Implement the feature described in {{task}}"
    agent: build
    checkpoint: true

  - name: test
    prompt: "Generate tests for the changes from the previous step"
    agent: build
    checks:
      - "pytest tests/ -x"
    checkpoint: true

  - name: review
    prompt: "Review the changes made"
    agent: review
    output_var: review_result

variables:
  task: "add JWT authentication"
# Execute pipeline
architect pipeline pipeline.yaml

# Execute from a specific step
architect pipeline pipeline.yaml --from-step test

# Dry-run the pipeline
architect pipeline pipeline.yaml --dry-run

Checkpoints and Rollback (v4-C4)

checkpoints:
  enabled: true
  every_n_steps: 5
# View created checkpoints
git log --oneline --grep="architect:checkpoint"

# Manual rollback to a checkpoint
git reset --hard <commit_hash>

Competitive Evaluation (v1.0.0)

# Compare models on the same task
architect eval "optimize SQL queries" \
  --models gpt-4o,claude-sonnet-4-6,deepseek-chat \
  --check "pytest tests/test_queries.py -q" \
  --check "ruff check src/" \
  --budget-per-model 1.0 \
  --report-file eval_report.md

Initialization with Presets (v1.0.0)

# Generate config for a Python project
architect init --preset python

# Maximum security config
architect init --preset paranoid --overwrite

Telemetry with Jaeger (v1.0.0)

telemetry:
  enabled: true
  exporter: otlp
  endpoint: http://localhost:4317
# Run with tracing
architect run "implement feature" -c config.yaml --mode yolo
# → Traces visible in Jaeger UI: http://localhost:16686

Health Delta (v1.0.0)

health:
  enabled: true
  include_patterns: ["**/*.py"]
# Or use the flag directly
architect run "refactor utils.py" --health --mode yolo
# → Displays markdown table with metrics delta

Auto-Review Post-Build

auto_review:
  enabled: true
  review_model: claude-sonnet-4-6
  max_fix_passes: 1
# Auto-review activates automatically upon completing a task
# with auto_review.enabled: true. No additional CLI flags required.
architect run "implement feature X" --mode yolo -c config.yaml
# → Build completes → Automatic review → Fix-pass if there are issues