Security Model — Architect CLI

Comprehensive documentation of the security layers, threat model, attack surface, and hardening recommendations.

Threat model

Architect gives the LLM control over real system tools: file reading/writing, command execution, and connection to remote servers (MCP). This implies concrete risks that the system mitigates with multiple defensive layers.

Threat actors

Actor	Vector	Primary mitigation
Adversarial LLM / hallucination	The model attempts to read `/etc/passwd`, execute `rm -rf /`, or escape the workspace	Path traversal prevention + blocklist + workspace sandboxing
Prompt injection (indirect)	A workspace file contains instructions that manipulate the LLM	Workspace confinement + confirmation of sensitive operations
Malicious MCP server	The MCP server returns data containing prompt injection	Argument sanitization in logs + tool result isolation
User with insecure config	`--mode yolo` + `--allow-commands` without restrictions	Hard blocklist (non-bypassable) + `allowed_only` mode + secure defaults

Attack surface

                    +----------------+
                    |     LLM        |  <- Prompt injection (indirect)
                    +-------+--------+
                            | tool calls
                    +-------v--------+
                    | ExecutionEngine |  <- Validation + confirmation
                    +-------+--------+
           +----------------+----------------+
           |                |                |
    +------v-------+  +----v------+  +------v-------+
    |  Filesystem  |  | run_command|  |     MCP      |
    |  Tools       |  |  (shell)  |  |   (HTTP)     |
    +--------------+  +-----------+  +--------------+
           |                |                |
    validate_path()   4 security layers  Bearer token
    workspace jail    blocklist+classify  session ID

Layer 1 — Workspace confinement (Path Traversal Prevention)

File: src/architect/execution/validators.py

Every filesystem operation mandatorily passes through validate_path(). It is the most critical barrier in the system.

Mechanism

def validate_path(path: str, workspace_root: Path) -> Path:
    workspace_resolved = workspace_root.resolve()
    full_path = (workspace_root / path).resolve()  # resolve() eliminates ../ and symlinks

    if not full_path.is_relative_to(workspace_resolved):
        raise PathTraversalError(...)

    return full_path

What it prevents

Attempt	Result
`../../etc/passwd`	`PathTraversalError` — `.resolve()` collapses the `..` and detects the escape
`/etc/shadow` (absolute path)	`PathTraversalError` — when concatenated with workspace, resolve detects escape
`src/../../.env`	`PathTraversalError` — resolve normalizes and detects exit
Symlink to `/root`	`PathTraversalError` — `.resolve()` follows the real symlink

Guarantees

Path.resolve() resolves symlinks, . and .. to the real filesystem path
is_relative_to() verifies that the resolved path starts with the resolved workspace
Includes a string comparison fallback for Python < 3.9
Every filesystem tool (read_file, write_file, edit_file, delete_file, list_files, apply_patch, search_code, grep, find_files) calls validate_path() before any operation

Errors returned to the LLM (never exceptions to the caller)

When PathTraversalError occurs, the tool returns:

ToolResult(success=False, output="", error="Security error: Path '../../etc/passwd' escapes the workspace.")

The LLM receives the error as a tool call result and can reason about it. The loop never breaks due to a security error.

Layer 2 — Command execution security (4 layers)

File: src/architect/tools/commands.py

run_command is the most dangerous tool in the system. It implements 4 independent security layers.

Layer 2.1 — Hard blocklist (regex)

Patterns that are never executed, regardless of the confirmation mode or user configuration:

BLOCKED_PATTERNS = [
    r"\brm\s+-rf\s+/",          # rm -rf /
    r"\brm\s+-rf\s+~",          # rm -rf ~
    r"\bsudo\b",                 # privilege escalation
    r"\bsu\b",                   # user switching
    r"\bchmod\s+777\b",          # insecure permissions
    r"\bcurl\b.*\|\s*(ba)?sh",   # curl | bash
    r"\bwget\b.*\|\s*(ba)?sh",   # wget | bash
    r"\bdd\b.*\bof=/dev/",       # writing to devices
    r">\s*/dev/sd",              # redirection to disks
    r"\bmkfs\b",                 # disk formatting
    r"\b:()\s*\{\s*:\|:&\s*\};?:", # fork bomb
    r"\bpkill\s+-9\s+-f\b",     # mass kill by name
    r"\bkillall\s+-9\b",        # mass kill
]

Matching with re.search() and re.IGNORECASE flag
They are additive: config can add extra blocked_patterns, but never remove the built-in ones
If matched, returns ToolResult(success=False) immediately — the LLM receives the rejection

Layer 2.2 — Dynamic classification

Each command is classified into 3 categories:

Category	Criteria	Examples
`safe`	Prefix match with `SAFE_COMMANDS` (28 commands)	`ls`, `cat`, `git status`, `grep`, `pip list`, `kubectl get`
`dev`	Prefix match with `DEV_PREFIXES` (30+ prefixes)	`pytest`, `mypy`, `ruff`, `cargo test`, `npm test`, `make`
`dangerous`	Everything else	`python script.py`, `bash deploy.sh`, `docker run`

The classification determines whether confirmation is required based on the mode:

Classification	`yolo`	`confirm-sensitive`	`confirm-all`
`safe`	No	No	Yes
`dev`	No	Yes	Yes
`dangerous`	No	Yes	Yes

In yolo mode, confirmation is never requested — not even for dangerous commands. Security is guaranteed by the blocklist (Layer 2.1), which blocks truly dangerous commands (rm -rf /, sudo, etc.) regardless of the mode. dangerous commands are simply “unrecognized” in the safe/dev lists, not necessarily dangerous.

For environments where you want to reject dangerous commands without confirmation, use allowed_only: true (see below).

Layer 2.3 — Timeouts and output truncation

subprocess.run(
    command,
    shell=True,
    timeout=effective_timeout,  # default 30s, configurable up to 600s
    capture_output=True,
    stdin=subprocess.DEVNULL,   # headless: never waits for input
)

stdin=subprocess.DEVNULL: prevents an interactive command from blocking the agent
Timeout: configurable via commands.default_timeout (1-600s)
Output truncation: max_output_lines (default 200) — preserves the beginning and end of output, prevents saturating the LLM’s context window

Layer 2.4 — Directory sandboxing

def _resolve_cwd(self, cwd: str | None) -> Path:
    if cwd is None:
        return self.workspace_root
    return validate_path(cwd, self.workspace_root)  # Reuses validate_path()

The process’s working directory is always within the workspace. If the LLM attempts to execute a command with cwd: "../../", validate_path() blocks it.

`allowed_only` mode

Extra configuration for restrictive environments:

commands:
  allowed_only: true  # Only safe + dev; dangerous = rejected without confirmation

With allowed_only: true, dangerous commands are rejected directly in execute(), without reaching the confirmation prompt. Useful in CI/CD where there is no TTY.

Complete deactivation

commands:
  enabled: false  # The run_command tool is not registered

Or via CLI: --no-commands. In this case, the tool is not even available to the LLM.

Layer 3 — Confirmation policies

File: src/architect/execution/policies.py

Modes

Mode	Behavior	Recommended use
`confirm-all`	Confirms every tool call	Production, first time
`confirm-sensitive`	Only confirms tools with `sensitive=True`	Default, normal development
`yolo`	No confirmation for any tool or command	Trusted tasks, CI

Sensitive tools (built-in)

Tool	`sensitive`	Reason
`read_file`	`false`	Read-only
`list_files`	`false`	Read-only
`write_file`	`true`	Modifies files
`edit_file`	`true`	Modifies files
`apply_patch`	`true`	Modifies files
`delete_file`	`true`	Deletes files
`run_command`	`true`	Arbitrary execution (dynamic classification)
`search_code`	`false`	Read-only
`grep`	`false`	Read-only
`find_files`	`false`	Read-only

Headless protection (CI/CD)

if not sys.stdin.isatty():
    raise NoTTYError(
        "Confirmation is required to execute '{tool_name}' "
        "but no TTY is available (headless/CI environment). "
        "Solutions: 1) Use --mode yolo, 2) Use --dry-run, "
        "3) Change the agent configuration to confirm_mode: yolo"
    )

If a tool requires confirmation but there is no TTY (CI, Docker, cron), the system fails safely with NoTTYError instead of executing without confirmation.

Layer 4 — Delete protection

File: src/architect/config/schema.py — WorkspaceConfig

workspace:
  allow_delete: false  # Default

delete_file is disabled by default. It requires explicit allow_delete: true configuration to function. Even with --mode yolo, if allow_delete is false, the tool returns an error.

Layer 5 — ExecutionEngine security

File: src/architect/execution/engine.py

The ExecutionEngine is the mandatory pass-through point for all tool execution. It applies a 7-step pipeline:

Tool call -> Find in registry -> Validate args (Pydantic) -> Confirmation policy
         -> Execute (or dry-run) -> Log result -> Return ToolResult

Invariants

Never throws exceptions to the caller — always returns ToolResult. Triple try-catch:
- Catch for each individual step (registry, validation, confirmation, execution)
- Last-resort catch in execute_tool_call()
- Tools internally also catch their own exceptions
Argument sanitization for logs — _sanitize_args_for_log() truncates values > 200 chars:
```
sanitized[key] = value[:200] + f"... ({len(value)} chars total)"
```
This prevents sensitive content (API keys in files, tokens) from appearing in full in logs.
Dry-run — --dry-run simulates execution without real effects. The engine returns ToolResult with [DRY-RUN] without executing the tool.

Layer 6 — API key and token security

LLM API keys

File: src/architect/llm/adapter.py

api_key = os.environ.get(self.config.api_key_env)  # Reads from env var
os.environ["LITELLM_API_KEY"] = api_key              # Configures for LiteLLM

API keys are never stored in config files — they are only referenced via api_key_env
If the env var does not exist, the adapter logs a warning but does not fail immediately
Logs do not include the API key value, only the env var name: env_var=self.config.api_key_env
LiteLLM handles the key internally; architect does not propagate it to tools or outputs

MCP tokens

File: src/architect/mcp/client.py

def _resolve_token(self) -> str | None:
    if self.config.token:
        return self.config.token        # 1. Direct token
    if self.config.token_env:
        return os.environ.get(self.config.token_env)  # 2. Env var
    return None

Support for direct token in config or via env var (recommended: env var)
The token is sent as Authorization: Bearer {token} in HTTP headers
Initialization logs use has_token=self.token is not None (boolean, not the value)
The server’s session ID is logged truncated: session_id[:12] + "..."
_sanitize_args() truncates values > 100 chars in tool call logs

Layer 7 — Agent loop security

Safety nets

File: src/architect/core/loop.py

The while True loop has 5 safety nets that prevent infinite execution:

Safety net	Mechanism	Default
`max_steps`	Iteration counter	50 (build), 20 (plan/review), 15 (resume)
`budget`	`CostTracker` + `BudgetExceededError`	No limit (configurable)
`timeout`	`StepTimeout` (SIGALRM per step)	Configurable
`context_full`	`ContextManager.is_critically_full()`	80k tokens default
`shutdown`	`GracefulShutdown` (SIGINT/SIGTERM)	Always active

Graceful shutdown

File: src/architect/core/shutdown.py

SIGINT (Ctrl+C) #1  ->  Sets should_stop = True, loop ends after current step completes
SIGINT (Ctrl+C) #2  ->  Immediate sys.exit(130)
SIGTERM             ->  Same as first SIGINT (for Docker/K8s)

Does not cut in the middle of a file operation
Allows _graceful_close(): one final LLM call without tools to generate a summary
Exit code 130 (POSIX standard: 128 + SIGINT)

Step timeout

File: src/architect/core/timeout.py

with StepTimeout(seconds=60):
    response = llm.completion(messages)
    result = engine.execute_tool_call(...)

Uses signal.SIGALRM on Linux/macOS
On Windows: no-op (degrades gracefully)
Raises StepTimeoutError which the loop catches and records as StopReason

Layer 8 — Configuration security

Strict validation with Pydantic v2

File: src/architect/config/schema.py

All configuration models use model_config = {"extra": "forbid"}. This means that:

Any unknown field in the YAML is a validation error
Undocumented options cannot be injected
Types are strictly validated (Literal, int with ge/le, etc.)

Secure defaults

Configuration	Default	Reason
`workspace.allow_delete`	`false`	Prevent accidental deletion
`commands.allowed_only`	`false`	Allows dangerous with confirmation
`confirm_mode`	`"confirm-sensitive"`	Balance between security/usability
`llm_cache.enabled`	`false`	Cache only for development
`evaluation.mode`	`"off"`	Do not consume extra tokens
`commands.default_timeout`	`30`	Prevent hung processes
`commands.max_output_lines`	`200`	Prevent context flooding
`llm.retries`	`2`	Only transient errors

Configuration precedence

CLI flags  >  environment variables  >  config.yaml  >  Pydantic defaults

CLI flags always win. The user at the terminal has the final say.

Layer 9 — Argument validation (Pydantic)

Each tool defines an args_model (Pydantic BaseModel) that validates arguments before execution:

class RunCommandArgs(BaseModel):
    command: str
    cwd: str | None = None
    timeout: int = 30
    env: dict[str, str] | None = None

LLM arguments are validated before executing the tool
If validation fails, ToolResult(success=False, error="Invalid arguments: ...") is returned
The LLM receives the error and can correct its next call

Layer 10 — Post-edit hooks (generated code security)

File: src/architect/core/hooks.py

Hooks automatically verify the code that the agent writes:

hooks:
  post_edit:
    - name: lint
      command: "ruff check {file} --fix"
      file_patterns: ["*.py"]
      timeout: 15

Hook security

Timeout per hook: default 15s, configurable (1-300s) — prevents hung processes
stdin=subprocess.DEVNULL: hooks cannot request interactive input
cwd=workspace_root: hooks execute within the workspace
Truncated output: maximum 1000 chars to avoid saturating the context
Never break the loop: hook errors are logged and return None, the agent continues
Environment variable: ARCHITECT_EDITED_FILE is injected into the hook
Edit tools only: executed exclusively for edit_file, write_file, apply_patch

Layer 11 — Logging and sanitization

3 logging pipelines

Pipeline	Destination	Content
Human	stderr	Agent events with icons (terminal only)
Console	stderr	Technical logs (structlog)
JSON file	file	Complete logs in JSON-lines (audit)

Sanitization in logs

ExecutionEngine._sanitize_args_for_log(): truncates values > 200 chars
MCPClient._sanitize_args(): truncates values > 100 chars
Session IDs: truncated to 12 chars in logs
API keys: only the env var name is logged, never the value
stdout reserved exclusively for final result and JSON — logs go to stderr

Layer 12 — Agent security (registry)

File: src/architect/agents/registry.py

Each agent has tool restrictions defined in its configuration:

Agent	Allowed tools	`confirm_mode`	`max_steps`
`build`	All	`confirm-sensitive`	50
`plan`	Read-only	`yolo`	20
`review`	Read-only	`yolo`	20
`resume`	Read-only	`yolo`	15

plan, review, and resume use yolo mode because they have no write tools — there is nothing to confirm
build is the only agent with write and execution tools, and uses confirm-sensitive by default
allowed_tools in the registry defines exactly which tools each agent can use
A plan agent cannot call write_file even if the LLM attempts it — the engine returns ToolNotFoundError

Layer 13 — MCP security (Model Context Protocol)