Senior-engineer code review as an agent skill. Run /codeprobe audit . and get a full health report in seconds.
- 9 review categories -- security, SOLID, architecture, error handling, performance, test quality, code smells, design patterns, framework best practices
- Severity-scored findings with file locations and copy-pasteable fix prompts
- Auto-detects your stack -- Python, TypeScript, React/Next.js, PHP/Laravel, SQL, and more
- Strictly read-only -- never modifies your code
- Works with 45+ agents -- Claude Code, Cursor, Codex, Windsurf, Cline, and more
Every /codeprobe audit opens with a health dashboard (category scores, codebase stats, hot-spot files), then lists detailed P0–P3 findings with fix prompts, and saves the whole report to ./codeprobe-reports/<project>-<cmd>-<timestamp>.md.
Overall Health: 58/100 [Critical]
Category Scores
| Category | Score | Bar | Status |
|---|---|---|---|
| Security | 72/100 | ██████████████░░░░░░ | Needs Attention |
| SOLID | 45/100 | █████████░░░░░░░░░░░ | Critical |
| Architecture | 55/100 | ███████████░░░░░░░░░ | Critical |
| Error Handling | 38/100 | ████████░░░░░░░░░░░░ | Critical |
| Performance | 62/100 | ████████████░░░░░░░░ | Needs Attention |
| Test Quality | 48/100 | ██████████░░░░░░░░░░ | Critical |
| Code Smells | 65/100 | █████████████░░░░░░░ | Needs Attention |
| Patterns | 78/100 | ████████████████░░░░ | Needs Attention |
| Framework | 86/100 | █████████████████░░░ | Healthy |
Codebase: 391 files · 64,146 LOC · Python 3.11 / FastAPI + Next.js / PostgreSQL / Redis
Hot spots:
- `services/order_processor.py` -- 7 categories flagged (SOLID, Error Handling, Performance, Code Smells, Test Quality, Patterns, Architecture)
- `api/routers/checkout.py` -- 5 categories flagged (Security, SOLID, Error Handling, Performance, Test Quality)
- `frontend/src/pages/dashboard.tsx` -- 4 categories flagged (Architecture, Code Smells, Patterns, Test Quality)
Executive Summary: Systemic issues across four categories. The biggest blocker is error handling (38) — 14 swallowed exceptions in payment paths, no top-level try/except in any of the three async workers, and transaction boundaries leak across service calls. SOLID (45) is dominated by OrderProcessor as a 1,420-LOC god class handling pricing, inventory, fulfillment, and notifications. Architecture (55) has a bidirectional dependency between services/ and api/ that makes the checkout path near-impossible to test in isolation. Security (72) is mostly sound but string-concatenation SQL in reports/query_builder.py:88 needs to be fixed before the next release.
Critical (P0 — 3 findings):
- SEC-001 | `reports/query_builder.py:88` -- SQL built with string concatenation against the unsanitized `filters` request param. Direct injection via `POST /reports`. Fix: parameterize with SQLAlchemy `text(...)` + bind params, or move the query to the ORM.
- ERR-007 | `workers/payment_worker.py:42-139` -- `process_refund` wraps the Stripe call in a bare `try/except Exception: pass`. Failed refunds are silently dropped from the retry queue. Fix: catch `stripe.error.StripeError` explicitly, enqueue the payload on a dead-letter queue, alert on `InvalidRequestError`.
- ARCH-003 | `services/order_processor.py` ↔ `api/routers/checkout.py` -- Bidirectional coupling: the service imports router types for validation while the router constructs service internals directly. Blocks both service-level unit tests and independent router evolution. Fix: introduce `checkout/schemas.py` as a dependency-free Pydantic layer both sides depend on.
--> Report saved to ./codeprobe-reports/growth-engine-audit-2026-04-23-221047.md
npx skills add nishilbhave/codeprobe
Then run /codeprobe audit . in any project.
Manage: npx skills update • npx skills remove
Optional: Python 3.8+ enables codebase statistics in the /codeprobe audit dashboard.
Reports are saved to ./codeprobe-reports/<project>-<cmd>-<timestamp>.md in your current directory (e.g. growth-engine-audit-2026-04-23-221047.md) — the filename tells you which project and which command the report came from at a glance. Add codeprobe-reports/ to your .gitignore to keep them out of source control.
| Command | Description |
|---|---|
| `/codeprobe audit <path>` | Full audit -- health dashboard (scores, file statistics, hot spots) plus detailed findings with fix prompts |
| `/codeprobe quick <path>` | Top 5 most impactful issues with fix prompts |
| `/codeprobe security <path>` | Security vulnerability detection |
| `/codeprobe solid <path>` | SOLID principles analysis |
| `/codeprobe architecture <path>` | Architecture and dependency analysis |
| `/codeprobe performance <path>` | Performance audit |
| `/codeprobe errors <path>` | Error handling audit |
| `/codeprobe tests <path>` | Test quality audit |
| `/codeprobe smells <path>` | Code smell detection |
| `/codeprobe patterns <path>` | Design patterns analysis |
| `/codeprobe framework <path>` | Framework best practices |
If no path is given, the current working directory is used.
The <path> argument works the same way for every command above. It can be a directory or a single file, and can be relative or absolute.
# Current directory (same as passing no path)
/codeprobe audit .
# A subdirectory
/codeprobe audit ./src/backend
# An absolute path
/codeprobe audit /Users/me/projects/myapp
# A single file
/codeprobe audit ./src/api/auth.ts
# Scope a single category to a subfolder
/codeprobe security ./src/api
/codeprobe solid ./backend/services
/codeprobe quick ./src/checkout
Notes on paths:
- Relative paths are resolved against the directory Claude Code was started in.
- Only files inside the given path are analyzed — everything else is ignored.
- The report is always saved to `./codeprobe-reports/<project>-<cmd>-<timestamp>.md` in your current working directory, regardless of which path you scanned. `cd` into the project first if you want the report to land there. The `<project>` segment is derived from the basename of the path you scanned (extension stripped for single-file scans), so reports from different projects in the same directory remain distinguishable.
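The naming rule above can be sketched in a few lines of Python. This is an illustration only, not the skill's actual code: the function name `report_filename` and the `single_file` flag are hypothetical, the timestamp format is inferred from the example filename `growth-engine-audit-2026-04-23-221047.md`, and the scanned path is assumed to be already resolved to an absolute path (so that `.` has a usable basename).

```python
from datetime import datetime
from pathlib import PurePosixPath


def report_filename(scanned_abs: str, command: str, now: datetime,
                    single_file: bool = False) -> str:
    """Build '<project>-<cmd>-<timestamp>.md' for an already-resolved scan path."""
    path = PurePosixPath(scanned_abs)
    # Basename of the scanned path; drop the extension for single-file scans.
    project = path.stem if single_file else path.name
    # Format inferred from the README example: YYYY-MM-DD-HHMMSS.
    stamp = now.strftime("%Y-%m-%d-%H%M%S")
    return f"{project}-{command}-{stamp}.md"


when = datetime(2026, 4, 23, 22, 10, 47)
print(report_filename("/Users/me/projects/growth-engine", "audit", when))
# growth-engine-audit-2026-04-23-221047.md
print(report_filename("/Users/me/projects/myapp/src/api/auth.ts", "security", when,
                      single_file=True))
# auth-security-2026-04-23-221047.md
```

Because the project segment comes from the scanned path rather than from the current directory, two scans launched from the same directory still produce distinguishable filenames.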
The system uses an orchestrator + sub-skill architecture:
- Orchestrator (`skills/codeprobe/SKILL.md`) -- Routes commands, detects your tech stack, loads config, and invokes specialized sub-skills.
- Sub-skills -- Domain experts that each analyze one category:
  - `codeprobe-security` -- SQL injection, XSS, hardcoded secrets, auth issues
  - `codeprobe-error-handling` -- Swallowed exceptions, missing try/catch, transaction safety
  - `codeprobe-solid` -- Single Responsibility, Open/Closed, Liskov, Interface Segregation, Dependency Inversion
  - `codeprobe-architecture` -- Coupling, layering violations, circular dependencies, god objects
  - `codeprobe-patterns` -- Design pattern opportunities and anti-patterns
  - `codeprobe-performance` -- N+1 queries, unbounded queries, algorithmic efficiency, caching
  - `codeprobe-code-smells` -- Long methods, deep nesting, duplicate code, primitive obsession
  - `codeprobe-testing` -- Missing tests, test smells, mock abuse, coverage gaps
  - `codeprobe-framework` -- Laravel, React/Next.js, Python/Django framework idiom violations
- Reference guides (`skills/codeprobe/references/`) -- Stack-specific best practices loaded based on auto-detected languages.
- Scripts (`skills/codeprobe/scripts/`) -- Deterministic analysis utilities:
  - `file_stats.py` -- LOC, file counts, method counts per file
  - `complexity_scorer.py` -- Cyclomatic complexity per function
  - `dependency_mapper.py` -- Import graph and circular dependency detection
  - `generate_report.py` -- Markdown report generation from audit findings
Stack detection is automatic. The orchestrator scans for file extensions and project markers (e.g., next.config.*, migrations/ directory) and loads the appropriate reference guides.
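As a rough picture of what extension-and-marker detection can look like, here is a minimal Python sketch. It is not the orchestrator's implementation: the function `detect_stack` and all three lookup tables are hypothetical, and the mapping of each extension or marker to a stack label is an assumption made for illustration. Only the `next.config.*` and `migrations/` markers come from the description above.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

# Hypothetical lookup tables; the real mappings are not documented here.
EXTENSION_STACKS = {".py": "Python", ".ts": "TypeScript", ".tsx": "TypeScript",
                    ".php": "PHP", ".sql": "SQL"}
FILE_MARKERS = {"next.config.*": "Next.js"}   # glob patterns matched on file names
DIR_MARKERS = {"migrations": "SQL"}           # directory names anywhere in the path


def detect_stack(relative_paths):
    """Return the sorted stack labels suggested by a list of project file paths."""
    found = set()
    for raw in relative_paths:
        path = PurePosixPath(raw)
        # 1. File extension.
        if path.suffix in EXTENSION_STACKS:
            found.add(EXTENSION_STACKS[path.suffix])
        # 2. Marker files such as next.config.js or next.config.mjs.
        for pattern, stack in FILE_MARKERS.items():
            if fnmatch(path.name, pattern):
                found.add(stack)
        # 3. Marker directories among the parent components.
        for part in path.parts[:-1]:
            if part in DIR_MARKERS:
                found.add(DIR_MARKERS[part])
    return sorted(found)


files = ["next.config.js", "src/app.tsx", "db/migrations/001_init.sql", "api/main.py"]
print(detect_stack(files))
# ['Next.js', 'Python', 'SQL', 'TypeScript']
```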
Each category is scored independently:
crit_penalty = min(50, critical_count * 15)
major_penalty = min(30, major_count * 6)
minor_penalty = min(10, minor_count * 2)
category_score = max(0, 100 - crit_penalty - major_penalty - minor_penalty)
Suggestions do not affect scores. The overall score is a weighted average of active categories:
| Category | Weight |
|---|---|
| Security | 20% |
| SOLID | 15% |
| Architecture | 15% |
| Error Handling | 12% |
| Performance | 12% |
| Test Quality | 10% |
| Code Smells | 8% |
| Design Patterns | 4% |
| Framework | 4% |
| Score Range | Status |
|---|---|
| 80-100 | Healthy |
| 60-79 | Needs Attention |
| 0-59 | Critical |
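The scoring rules above translate directly into code. The Python sketch below is an illustration rather than the skill's own implementation: the function names are hypothetical, and rounding the overall score to the nearest integer is an assumption. The formula, weights, and status thresholds are taken from this section, and the sample scores come from the example dashboard earlier in this README.

```python
WEIGHTS = {
    "Security": 20, "SOLID": 15, "Architecture": 15, "Error Handling": 12,
    "Performance": 12, "Test Quality": 10, "Code Smells": 8,
    "Design Patterns": 4, "Framework": 4,
}


def category_score(critical: int, major: int, minor: int) -> int:
    """Apply the capped per-severity penalties to a starting score of 100."""
    crit_penalty = min(50, critical * 15)
    major_penalty = min(30, major * 6)
    minor_penalty = min(10, minor * 2)
    return max(0, 100 - crit_penalty - major_penalty - minor_penalty)


def status(score: float) -> str:
    """Map a score to the status bands in the table above."""
    if score >= 80:
        return "Healthy"
    if score >= 60:
        return "Needs Attention"
    return "Critical"


def overall_score(scores: dict) -> float:
    """Weighted average over the categories present in `scores`."""
    total_weight = sum(WEIGHTS[name] for name in scores)
    return sum(WEIGHTS[name] * value for name, value in scores.items()) / total_weight


print(category_score(critical=1, major=2, minor=3))  # 67

sample = {"Security": 72, "SOLID": 45, "Architecture": 55, "Error Handling": 38,
          "Performance": 62, "Test Quality": 48, "Code Smells": 65,
          "Design Patterns": 78, "Framework": 86}
overall = round(overall_score(sample))
print(overall, status(overall))  # 58 Critical
```

With the caps as stated, the combined penalty tops out at 50 + 30 + 10 = 90, so the lowest category score the formula can produce is 10.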
Create a .codeprobe-config.json in your project root to customize behavior:
{
"severity_overrides": {
"long_method_loc": 50,
"large_class_loc": 500,
"deep_nesting_max": 4,
"max_constructor_deps": 6
},
"skip_categories": ["codeprobe-testing"],
"skip_rules": ["SPEC-GEN-001"],
"framework": "laravel",
"extra_references": [],
"report_format": "markdown"
}
All fields are optional. If the file is absent, defaults apply. If skip_categories is set, the weights of the remaining categories are renormalized so they sum to 100%.
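To make the renormalization concrete, here is a minimal Python sketch. It is an assumption-laden illustration, not the skill's code: the function `active_weights` is hypothetical, and so is the pairing of each sub-skill name with a weight from the scoring table (for example, `codeprobe-testing` with Test Quality at 10%).

```python
import json

# Assumed pairing of sub-skill names with the weights from the scoring table.
BASE_WEIGHTS = {
    "codeprobe-security": 20, "codeprobe-solid": 15, "codeprobe-architecture": 15,
    "codeprobe-error-handling": 12, "codeprobe-performance": 12,
    "codeprobe-testing": 10, "codeprobe-code-smells": 8,
    "codeprobe-patterns": 4, "codeprobe-framework": 4,
}


def active_weights(config_text: str = "") -> dict:
    """Drop skipped categories and rescale the rest to sum to 100."""
    # An absent or empty config means no categories are skipped.
    config = json.loads(config_text) if config_text else {}
    skipped = set(config.get("skip_categories", []))
    active = {name: w for name, w in BASE_WEIGHTS.items() if name not in skipped}
    total = sum(active.values())
    if total == 0:
        return {}
    return {name: 100 * w / total for name, w in active.items()}


weights = active_weights('{"skip_categories": ["codeprobe-testing"]}')
print(round(weights["codeprobe-security"], 2))  # 22.22
```

Skipping Test Quality leaves 90 points of weight, so under these assumptions Security moves from 20% to 20/90, about 22.2%, and every other remaining category scales by the same factor.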
Auto-detected languages and frameworks with dedicated reference guides:
- Python -- PEP standards, Django/Flask patterns, type hinting
- JavaScript / TypeScript -- ES modules, async patterns, type safety
- React / Next.js -- Component patterns, hooks, SSR/SSG
- PHP / Laravel -- Eloquent, service patterns, blade templates
- SQL / Database -- Query optimization, schema design, migrations
- API Design -- REST conventions, validation, error responses
Additional languages recognized for file statistics: Java, Ruby, Go, Rust, Vue, Svelte, Shell, CSS/SCSS, HTML.
When used on Claude.ai (without filesystem access), the skill runs in degraded mode: it analyzes pasted or uploaded code directly, skips codebase statistics and diff review, and notes the limitation. Findings and scoring still work normally.
MIT
Nishil