Analyzers
vigil uses a modular analyzer system. Each analyzer focuses on a detection category and produces findings independently. This document describes the implemented analyzers.
For the general analyzer architecture (protocol, registration, flow), see Architecture.
DependencyAnalyzer (CAT-01)
Module: src/vigil/analyzers/deps/
Category: dependency
Active rules: DEP-001, DEP-002, DEP-003, DEP-005, DEP-007
Detects hallucinated dependencies (slopsquatting), typosquatting, suspicious packages, nonexistent versions, and packages without a source repository.
Supported dependency files
| File | Ecosystem | Parser |
|---|---|---|
| requirements.txt | PyPI | parse_requirements_txt() |
| requirements-dev.txt, requirements-*.txt | PyPI | parse_requirements_txt() |
| pyproject.toml ([project.dependencies], [project.optional-dependencies]) | PyPI | parse_pyproject_toml() |
| package.json (dependencies, devDependencies) | npm | parse_package_json() |
Files are discovered automatically with find_and_parse_all(), which traverses the directory tree while skipping .venv/, node_modules/, .git/, etc.
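As an illustration of traversal with pruning, here is a minimal sketch. It is not vigil's actual implementation: `SKIP_DIRS`, `DEPENDENCY_PATTERNS`, `is_dependency_file`, and `find_dependency_files` are hypothetical names, and the skip list only contains the directories named above.

```python
import fnmatch
import os
from pathlib import Path

# Hypothetical skip list; the real one likely contains more entries.
SKIP_DIRS = {".venv", "node_modules", ".git"}

# File name patterns taken from the table of supported dependency files.
DEPENDENCY_PATTERNS = (
    "requirements.txt",
    "requirements-*.txt",
    "pyproject.toml",
    "package.json",
)


def is_dependency_file(name: str) -> bool:
    """Return True if the file name matches a supported dependency file."""
    return any(fnmatch.fnmatchcase(name, p) for p in DEPENDENCY_PATTERNS)


def find_dependency_files(root: str) -> list:
    """Walk the tree under root and collect dependency files."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Mutating dirnames in place stops os.walk() from descending
        # into the skipped directories at all.
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        for name in filenames:
            if is_dependency_file(name):
                found.append(Path(dirpath) / name)
    return sorted(found)
```

Pruning `dirnames` in place, rather than filtering results afterwards, avoids reading large directories such as node_modules/ in the first place.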
Implemented rules
DEP-001 — Hallucinated dependency (CRITICAL)
Verifies that each declared package exists in the public registry (PyPI or npm). If it does not exist, it is very likely a name hallucinated by the AI agent.
# requirements.txt
flask==3.0.0
python-jwt-utils==1.0.0 # Does NOT exist on PyPI -> DEP-001 CRITICAL
Requires network: Yes. Skipped in --offline mode.
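The existence check can be sketched as follows. The two URLs are the public JSON endpoints of PyPI and the npm registry; `registry_url` and `is_hallucinated` are hypothetical helpers, and treating only a 404 as "does not exist" is an assumption chosen to match the resilience behavior of the registry client.

```python
from urllib.parse import quote


def registry_url(name: str, ecosystem: str) -> str:
    """Build the metadata URL for a package (hypothetical helper)."""
    if ecosystem == "pypi":
        return f"https://pypi.org/pypi/{quote(name, safe='')}/json"
    if ecosystem == "npm":
        # Scoped packages such as @types/node need the slash encoded.
        return f"https://registry.npmjs.org/{quote(name, safe='@')}"
    raise ValueError(f"unsupported ecosystem: {ecosystem}")


def is_hallucinated(status_code: int) -> bool:
    """Only a definitive 404 is treated as a missing package.

    Any other status (including server errors) is treated as
    "assume the package exists" to avoid false positives.
    """
    return status_code == 404
```

For the example above, `registry_url("python-jwt-utils", "pypi")` yields `https://pypi.org/pypi/python-jwt-utils/json`, and a 404 response to that URL would raise DEP-001.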
DEP-002 — Suspiciously new dependency (HIGH)
Checks the package creation date. If it was created less than deps.min_age_days days ago (default: 30), it may be a malicious package registered as part of a slopsquatting attack.
Requires network: Yes. Skipped in --offline mode.
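The age comparison itself is simple. A minimal sketch, assuming the creation date has already been extracted from registry metadata as a timezone-aware datetime (`is_suspiciously_new` is a hypothetical name):

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_suspiciously_new(
    created_at: datetime,
    min_age_days: int = 30,
    now: Optional[datetime] = None,
) -> bool:
    """Return True if the package is younger than min_age_days."""
    # Allow the caller to inject "now" so the check is testable.
    if now is None:
        now = datetime.now(timezone.utc)
    return now - created_at < timedelta(days=min_age_days)
```

A package that is exactly `min_age_days` old is not flagged in this sketch, matching the "less than" wording of the rule.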
DEP-003 — Typosquatting candidate (HIGH)
Compares each dependency name against a corpus of popular packages using normalized Levenshtein distance. If the similarity is >= deps.similarity_threshold (default: 0.85), it is a typosquatting candidate.
# requirements.txt
requets==2.31.0 # Similarity 0.875 with "requests" -> DEP-003 HIGH
Requires network: No. Works in --offline mode.
Normalization: For PyPI, hyphens (-), underscores (_), and dots (.) are treated as equivalent (PEP 503). my-package, my_package, and my.package are normalized to the same name before comparison.
Corpus: A built-in corpus of ~100 PyPI packages and ~70 npm packages is used as a fallback. When the files data/popular_pypi.json and data/popular_npm.json are generated (PHASE 6), those will be used instead.
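The comparison can be sketched as below. This is an illustration, not vigil's code: all function names are hypothetical, and normalizing the distance by the length of the longer name is an assumption, chosen because it reproduces the 0.875 score in the example above.

```python
import re


def normalize_pypi_name(name: str) -> str:
    """PEP 503 normalization: runs of -, _ and . become a single hyphen."""
    return re.sub(r"[-_.]+", "-", name).lower()


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance, one row at a time."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Assumed normalization: 1 - distance / length of the longer name."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1 - levenshtein(a, b) / longest


def typosquat_candidates(name, corpus, threshold=0.85):
    """Return (popular_name, score) pairs at or above the threshold."""
    normalized = normalize_pypi_name(name)
    matches = []
    for popular in corpus:
        target = normalize_pypi_name(popular)
        if target == normalized:
            return []  # the name is itself a popular package
        score = similarity(normalized, target)
        if score >= threshold:
            matches.append((popular, score))
    return sorted(matches, key=lambda m: m[1], reverse=True)
```

With these definitions, `similarity("requets", "requests")` is 1 - 1/8 = 0.875, which clears the default threshold of 0.85.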
DEP-005 — No source repository (MEDIUM)
Verifies that the package has a source code repository linked in its metadata. Packages without a repository are harder to audit.
Requires network: Yes. Skipped in --offline mode.
DEP-007 — Nonexistent version (CRITICAL)
Verifies that the exact pinned version exists in the registry. Only applies to exact versions (==1.2.3 on PyPI, 1.2.3 without a range prefix on npm).
# requirements.txt
flask==99.0.0 # Version does not exist -> DEP-007 CRITICAL
Requires network: Yes. Skipped in --offline mode.
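Deciding whether a version specifier is an exact pin can be sketched as follows. The helper names are hypothetical, and the patterns are deliberately simplified: they do not cover every PEP 440 or semver form.

```python
import re
from typing import Optional

# Simplified: plain MAJOR.MINOR.PATCH with an optional pre-release tag.
_NPM_EXACT = re.compile(r"^\d+\.\d+\.\d+(?:-[0-9A-Za-z.-]+)?$")


def exact_pinned_version(spec: str, ecosystem: str) -> Optional[str]:
    """Return the pinned version, or None if spec is a range."""
    spec = spec.strip()
    if ecosystem == "pypi":
        # "==1.2.*" is a prefix match and "===" is arbitrary equality,
        # so neither is treated as an exact pin here.
        if spec.startswith("==") and not spec.startswith("===") and "*" not in spec:
            return spec[2:].strip()
        return None
    if ecosystem == "npm":
        # Prefixes such as ^, ~, >= make the spec a range.
        return spec if _NPM_EXACT.match(spec) else None
    raise ValueError(f"unsupported ecosystem: {ecosystem}")


def version_exists(pinned: str, published_versions) -> bool:
    """Naive string membership; ignores equivalent spellings like 1.0 vs 1.0.0."""
    return pinned in published_versions
```

For the example above, `exact_pinned_version("==99.0.0", "pypi")` returns `"99.0.0"`, which would then be looked up in the list of published versions.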
Deferred rules
| Rule | Reason | Estimate |
|---|---|---|
| DEP-004 (unpopular) | Requires download statistics API, not available in basic PyPI/npm metadata | V1 or PHASE 6 |
| DEP-006 (missing import) | Requires AST import parser, out of scope for V0 (regex-based) | V1 |
Analysis flow
- Discovery: find_and_parse_all() traverses directories with os.walk() + pruning, looking for dependency files.
- Parsing: Each file is parsed into a list of DeclaredDependency with name, version, source file, line, and ecosystem.
- Deduplication: Duplicates are removed by name+ecosystem (e.g., same package in requirements.txt and pyproject.toml).
- Registry verification (if online): For each unique package, PyPI/npm is queried via RegistryClient. DEP-001, DEP-002, DEP-005, DEP-007 are applied.
- Similarity verification (always): For each unique package, popular packages with similar names are searched. DEP-003 is applied.
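The deduplication step can be sketched as below. The dataclass is a stand-in for vigil's DeclaredDependency: its field names are guessed from the description above, and keeping the first occurrence of each name+ecosystem pair is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DeclaredDependency:
    """Stand-in type; field names are assumptions, not vigil's API."""
    name: str
    version: Optional[str]
    source_file: str
    line: int
    ecosystem: str


def deduplicate(deps):
    """Keep the first declaration for each (name, ecosystem) pair."""
    seen = {}
    for dep in deps:
        key = (dep.name.lower(), dep.ecosystem)
        # setdefault() leaves an existing entry untouched,
        # so later duplicates are dropped.
        seen.setdefault(key, dep)
    return list(seen.values())
```

Deduplicating before registry verification means each unique package costs at most one registry lookup, however many files declare it.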
Registry Client
The RegistryClient handles HTTP queries to PyPI and npm:
- Disk cache: ~/.cache/vigil/registry/ with individual JSON files per package.
- Configurable TTL: Default 24 hours (deps.cache_ttl_hours).
- Lazy init: The httpx client is created only when the first request is made.
- Context manager: Supports with RegistryClient() as client: for automatic cleanup.
- Resilience: On network errors, the package is assumed to exist (avoids false positives on unstable connections).
# Clear cache
rm -rf ~/.cache/vigil/registry/
# Force fresh requests
# (set cache_ttl_hours: 0 in .vigil.yaml)
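A per-package disk cache with a TTL can be sketched as follows. `DiskCache`, its file naming scheme, and the `stored_at`/`data` entry layout are all hypothetical; vigil's on-disk format may differ.

```python
import json
import time
from pathlib import Path
from typing import Any, Optional


class DiskCache:
    """One JSON file per package, expired after ttl_hours (sketch only)."""

    def __init__(self, directory, ttl_hours: float = 24):
        self.directory = Path(directory)
        self.ttl_seconds = ttl_hours * 3600

    def _path(self, ecosystem: str, name: str) -> Path:
        # Scoped npm names contain "/", which cannot appear in a file name.
        safe_name = name.replace("/", "__")
        return self.directory / f"{ecosystem}_{safe_name}.json"

    def get(self, ecosystem: str, name: str, now: Optional[float] = None) -> Any:
        """Return cached data, or None if missing, unreadable, or expired."""
        now = time.time() if now is None else now
        try:
            entry = json.loads(self._path(ecosystem, name).read_text(encoding="utf-8"))
            expired = now - entry["stored_at"] >= self.ttl_seconds
            return None if expired else entry["data"]
        except (OSError, ValueError, KeyError, TypeError):
            return None

    def set(self, ecosystem: str, name: str, data: Any, now: Optional[float] = None) -> None:
        """Write data together with the time it was stored."""
        self.directory.mkdir(parents=True, exist_ok=True)
        entry = {"stored_at": time.time() if now is None else now, "data": data}
        self._path(ecosystem, name).write_text(json.dumps(entry), encoding="utf-8")
```

In this sketch a TTL of 0 makes every entry count as expired, which mirrors the effect of setting cache_ttl_hours: 0 to force fresh requests.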
Relevant configuration
deps:
  # Verify against registries (false = static checks only)
  verify_registry: true
  # Minimum age in days (DEP-002)
  min_age_days: 30
  # Similarity threshold for typosquatting (DEP-003)
  # 0.85 = catches 1-character typos in names of 8+ characters
  similarity_threshold: 0.85
  # Registry cache TTL
  cache_ttl_hours: 24
  # Offline mode (no HTTP)
  offline_mode: false
Offline mode
With --offline or deps.offline_mode: true:
| Rule | Behavior |
|---|---|
| DEP-001 | Skipped (requires registry verification) |
| DEP-002 | Skipped (requires creation date from registry) |
| DEP-003 | Active (local comparison against corpus) |
| DEP-005 | Skipped (requires registry metadata) |
| DEP-007 | Skipped (requires version list from registry) |
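The table above amounts to a simple rule filter. A minimal sketch, with hypothetical names, that also treats verify_registry: false as "static checks only":

```python
# Rules that need registry data versus rules that run locally.
NETWORK_RULES = {"DEP-001", "DEP-002", "DEP-005", "DEP-007"}
LOCAL_RULES = {"DEP-003"}


def active_rules(offline: bool, verify_registry: bool = True) -> list:
    """Return the rule IDs that run for the given settings."""
    rules = set(LOCAL_RULES)
    # Network rules run only when online and registry checks are enabled.
    if not offline and verify_registry:
        rules |= NETWORK_RULES
    return sorted(rules)
```

With `offline=True`, only DEP-003 remains, matching the table.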
Pending analyzers
AuthAnalyzer (CAT-02) — PHASE 2
Will detect insecure authentication patterns in FastAPI, Flask, and Express via regex:
- Endpoints without auth middleware (AUTH-001, AUTH-002)
- CORS with * (AUTH-005)
- JWT with hardcoded secrets (AUTH-004)
- Cookies without security flags (AUTH-006)
- Password comparison that is not timing-safe (AUTH-007)
SecretsAnalyzer (CAT-03) — PHASE 2
Will detect secrets and credentials in code:
- Placeholders copied from docs/examples (SEC-001)
- Secrets with low entropy (SEC-002)
- Connection strings with credentials (SEC-003)
- Environment variables with sensitive defaults (SEC-004)
- Secret files outside .gitignore (SEC-005)
- Values copied from .env.example (SEC-006)
TestQualityAnalyzer (CAT-06) — PHASE 3
Will detect tests that provide false coverage:
- Tests without assertions (TEST-001)
- Trivial assertions (TEST-002)
- Generic exception catching in tests (TEST-003)
- Tests skipped without reason (TEST-004)
- API tests without status code verification (TEST-005)
- Mocks that replicate the implementation (TEST-006)