Extensibility Foundations
The infrastructure everything else builds on. Hooks, guardrails, skills and procedural memory form the backbone of agent extensibility.
- [x] Complete Hook System — 10 lifecycle events, action blocking/modification, async execution and timeouts.
- [x] First-Class Guardrails — Protected files, blocked commands, edit limits, code rules and mandatory quality gates.
- [x] .architect.md + Skills — Auto-injected project context and a reusable skills ecosystem, with skills activated by glob pattern.
- [x] Procedural Memory — User correction detection, disk persistence and automatic injection in future sessions.
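As an illustration of what the protected-files and blocked-commands guardrails could look like, here is a minimal sketch. The class, field and method names are hypothetical and are not taken from architect's real configuration schema; the matching rules (glob patterns for files, substring matching for commands) are assumptions as well.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatch


@dataclass
class Guardrails:
    # Hypothetical field names; not architect's actual schema.
    protected_files: list[str] = field(default_factory=list)
    blocked_commands: list[str] = field(default_factory=list)

    def check_edit(self, path: str) -> tuple[bool, str]:
        """Return (allowed, reason) for an attempted file edit."""
        for pattern in self.protected_files:
            if fnmatch(path, pattern):
                return False, f"{path} matches protected pattern {pattern!r}"
        return True, "ok"

    def check_command(self, command: str) -> tuple[bool, str]:
        """Return (allowed, reason) for an attempted shell command."""
        for blocked in self.blocked_commands:
            if blocked in command:
                return False, f"command contains blocked fragment {blocked!r}"
        return True, "ok"


rails = Guardrails(
    protected_files=[".env", "secrets/*"],
    blocked_commands=["rm -rf", "git push --force"],
)
```

A caller would run the check before every tool invocation and refuse the action when the first element of the result is False, passing the reason back to the agent.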
Persistence and Reporting
Features that make architect viable for long tasks and CI/CD environments. Persistent sessions, structured reports and native flags for pipelines.
- [x] Session Resume — State persistence to disk. If a session is interrupted, it resumes from the last point.
- [x] Execution Report — Reports in JSON, Markdown and GitHub PR comment with timeline, costs and quality gates.
- [x] CI/CD Native Flags — --json, --budget, --timeout, --context-git-diff. Semantic exit codes.
- [x] Dry Run / Preview — The agent plans without executing. Read tools stay active; write tools are intercepted and recorded as a plan.
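The dry-run behavior described above (reads execute, writes become plan entries) can be sketched roughly as follows. This is only an illustration under assumptions: the tool names, the read/write split and the `DryRunExecutor` class are hypothetical and do not describe architect's internals.

```python
from typing import Any, Callable

# Hypothetical set of tools that are safe to run during a dry run.
READ_TOOLS = {"read_file", "list_files"}


class DryRunExecutor:
    def __init__(self, tools: dict[str, Callable[..., Any]], dry_run: bool) -> None:
        self.tools = tools
        self.dry_run = dry_run
        self.plan: list[dict[str, Any]] = []

    def call(self, name: str, **kwargs: Any) -> Any:
        if self.dry_run and name not in READ_TOOLS:
            # Record the intended write instead of performing it.
            self.plan.append({"tool": name, "args": kwargs})
            return None
        return self.tools[name](**kwargs)


# Usage with an in-memory "filesystem" standing in for real tools.
files = {"a.txt": "hello"}
tools = {
    "read_file": lambda path: files[path],
    "write_file": lambda path, content: files.__setitem__(path, content),
}
executor = DryRunExecutor(tools, dry_run=True)
content = executor.call("read_file", path="a.txt")
executor.call("write_file", path="a.txt", content="changed")
```

After this run the file content is unchanged and `executor.plan` holds the single intercepted write, which is the information a preview would show to the user.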
Advanced Automation
The features that turn architect into a serious automation tool. Autonomous loops, parallel execution and multi-step workflows.
- [x] Native Ralph Loop — Automatic correction loop: run, verify external checks, re-run with errors. Configurable with budget and time limit.
- [x] Parallel Runs + Worktrees — Multiple agents in isolated git worktrees. Fan-out (same task, several models) or task distribution.
- [x] Pipeline Mode — Multi-step YAML workflows with variables, conditions, checkpoints and resume from any step.
- [x] Checkpoints & Rollback — Git-based restore points. Rollback to any previous step.
- [x] Auto-Review — Writer/reviewer pattern: on completion, a reviewer analyzes changes and generates automatic corrections.
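The run, verify, re-run cycle of the Ralph Loop can be sketched in a few lines. The function below is a simplified illustration, not architect's implementation: the callback signatures are hypothetical, and an iteration cap stands in for the budget option mentioned above.

```python
import time
from typing import Callable, Optional


def correction_loop(
    run_agent: Callable[[Optional[str]], None],
    run_checks: Callable[[], Optional[str]],
    max_iterations: int = 5,
    time_limit_s: float = 600.0,
) -> bool:
    """Run, verify, and re-run with the check errors until checks pass.

    run_agent receives the previous error output (None on the first pass).
    run_checks returns None on success or an error string on failure.
    """
    deadline = time.monotonic() + time_limit_s
    errors: Optional[str] = None
    for _ in range(max_iterations):
        if time.monotonic() > deadline:
            return False  # time limit exhausted
        run_agent(errors)
        errors = run_checks()
        if errors is None:
            return True  # external checks passed
    return False  # iteration cap exhausted


# Usage with fakes: the checks start passing on the third attempt.
attempts: list[Optional[str]] = []


def fake_agent(errors: Optional[str]) -> None:
    attempts.append(errors)


def fake_checks() -> Optional[str]:
    return None if len(attempts) >= 3 else f"test failed (attempt {len(attempts)})"


ok = correction_loop(fake_agent, fake_checks)
```

Feeding the previous failure output into the next run is what distinguishes this loop from a plain retry: each iteration starts with concrete errors to fix.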
Extras and Specialization
Advanced features that complete the platform: sub-agents, health metrics, competitive evaluation between models and observability.
- [x] Sub-Agents / Dispatch — Delegate sub-tasks to agents with independent context that return a summary.
- [x] Code Health Delta — Before/after health metrics with tools such as radon and eslint. Complexity diff in the report.
- [x] Competitive Eval — Same task with different models + comparative report of quality, cost and speed.
- [x] OpenTelemetry Traces — Spans for sessions, LLM calls, tools and hooks. Export to Jaeger, Grafana Tempo, etc.
- [x] Preset Configs — Predefined templates: python, node-react, ci, paranoid.
v1.0.0 Stable Release
Exhaustive Testing & Hardening
A battery of integration tests, stress tests and post-release edge-case fixes. The core is stabilized before new features are added.
Backend Abstraction + Claude SDK
Abstraction layer for LLM providers and native integration with Claude Agent SDK as execution engine, keeping architect's control layer on top.
- [ ] Backend Abstraction Layer — Unified interface for LLM providers with health checks, per-backend metrics and transparent switching.
- [ ] Claude Agent SDK Backend — Uses Claude Code's native tools as the execution engine, with architect's control layer on top.
Architect as MCP Server
Architect as a native MCP server for bidirectional integration with Claude Code and other ecosystem agents.
- [ ] Architect MCP Server — Native MCP server exposing architect capabilities (build, review, plan) as remote tools for bidirectional integration with Claude Code and other agents.
Ralph v2 + Guardrails v2 + Reports v2
Deepening of core systems: resumable loops, per-agent guardrails with audit trail, parallel pipelines and reports in standard CI/CD formats.
- [ ] Ralph Loop v2 — Resumable: if a long loop is interrupted, it resumes from the last iteration. Escalation strategies: if the loop has been failing for 5+ iterations, it automatically changes approach.
- [ ] Guardrails v2 — Scoped per agent (build can touch code, deploy only infra). Immutable JSONL audit trail. allowed_paths as inverse of protected_files.
- [ ] Pipeline Engine v2 — Parallel steps, declarative error handling (on_failure: retry | skip | abort), includes to reuse steps between pipelines.
- [ ] Reports & Audit Engine — JUnit XML for standard CI/CD dashboards, GitHub PR format with collapsible sections, cost breakdown per step.
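To make the planned on_failure policies (retry, skip, abort) concrete, here is one possible interpretation as a small step runner. Since the feature is not yet released, everything here is an assumption: the step dictionary shape, the `max_retries` parameter, the default policy of abort, and the choice to abort once retries are exhausted.

```python
from typing import Any


def run_steps(steps: list[dict[str, Any]], max_retries: int = 2) -> list[str]:
    """Run steps in order, applying each step's on_failure policy.

    Each step is a dict with 'name', 'run' (a callable) and an optional
    'on_failure' of 'retry', 'skip' or 'abort' (default 'abort').
    """
    log: list[str] = []
    for step in steps:
        policy = step.get("on_failure", "abort")
        attempts = 1 + (max_retries if policy == "retry" else 0)
        for attempt in range(1, attempts + 1):
            try:
                step["run"]()
                log.append(f"{step['name']}: ok")
                break
            except Exception as exc:
                if policy == "retry" and attempt < attempts:
                    continue  # try the same step again
                if policy == "skip":
                    log.append(f"{step['name']}: skipped ({exc})")
                    break  # move on to the next step
                # 'abort', or 'retry' with no attempts left: stop the pipeline.
                log.append(f"{step['name']}: aborted ({exc})")
                return log
    return log


calls = {"flaky": 0}


def flaky() -> None:
    # Fails on the first call only, to exercise the retry policy.
    calls["flaky"] += 1
    if calls["flaky"] < 2:
        raise RuntimeError("transient")


def fail() -> None:
    raise RuntimeError("boom")


log = run_steps([
    {"name": "build", "run": flaky, "on_failure": "retry"},
    {"name": "lint", "run": fail, "on_failure": "skip"},
    {"name": "deploy", "run": fail, "on_failure": "abort"},
    {"name": "notify", "run": lambda: None},
])
```

In this run the build step succeeds on its second attempt, lint is skipped, and the failed deploy aborts the pipeline before notify is reached.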
Output Modes + Fallback + Integration Tests
Production resilience: configurable output modes, automatic fallback between backends and end-to-end integration test suite.
- [ ] CLI Output Modes — Configurable and extensible CLI output modes for different usage contexts.
- [ ] Backend Health & Fallback — Health check and automatic fallback between backends. If the primary provider goes down, architect switches to fallback without intervention.
- [ ] Integration Test Suite — End-to-end integration test suite to validate complete flows: build, review, loops, pipelines and parallel.
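The automatic fallback between backends described in this phase could work along these lines. This is a hedged sketch of the general pattern, not the planned design: the function name, the `(name, callable)` backend representation and the exception type are all hypothetical.

```python
from typing import Callable, Sequence


class AllBackendsFailed(RuntimeError):
    """Raised when every configured backend has failed."""


def complete_with_fallback(
    prompt: str,
    backends: Sequence[tuple[str, Callable[[str], str]]],
) -> tuple[str, str]:
    """Try each backend in order; return (backend_name, response)."""
    errors: list[str] = []
    for name, call in backends:
        try:
            return name, call(prompt)
        except Exception as exc:
            # Remember the failure and fall through to the next backend.
            errors.append(f"{name}: {exc}")
    raise AllBackendsFailed("; ".join(errors))


def primary(prompt: str) -> str:
    raise ConnectionError("provider unavailable")


def secondary(prompt: str) -> str:
    return f"echo: {prompt}"


used, response = complete_with_fallback(
    "hi", [("primary", primary), ("secondary", secondary)]
)
```

A production version would likely add a separate health check so that a backend known to be down is skipped without paying for a failed request first.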
Future
Ideas under evaluation for after the main phases launch. Subject to change based on community feedback.
- [?] Docker Sandbox — Run the agent in an ephemeral container for total host system isolation.
- [?] Watch Mode — Daemon that observes the workspace and automatically reacts to configured triggers.
- [?] Interactive Streaming — Mid-task instruction injection with Ctrl+M in interactive mode.
- [?] Hierarchical .architect.md — Per-directory skills that merge based on the active file context.
Missing something from the blueprints?
Architecture is a collaborative effort. Propose new tools or agents in our repository.