# Scoring Overview
PRISM scoring evaluates the quality of your AI-assisted coding sessions across five dimensions and ten metrics.
## How scoring works

- You code with Claude Code (Prism plugin active)
- Telemetry flows to S3 via the ingest pipeline
- The scoring worker picks up unscored sessions
- Each session is scored on 5 dimensions (10 metrics total)
- Scores persist to Postgres with coaching notes
- You view results via `/prism:score`, `/prism:report`, or the dashboard
## Two-tier scoring

Prism uses two scoring methods:
| Tier | Method | Accuracy | Cost | When used |
|---|---|---|---|---|
| Primary | LLM scorer (Anthropic Sonnet) | ~90% | ~$0.0008/session | Default when API key available |
| Fallback | Heuristic scorer (Rust-native) | ~70% | Free | When LLM unavailable or rate-limited |
The LLM scorer reads the full session transcript and evaluates each metric against a detailed rubric. The heuristic scorer uses regex patterns and keyword matching for fast, free scoring.
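The tier selection described above can be sketched as a simple fallback. This is an illustration only: `llm_score`, `heuristic_score`, and the keyword list are hypothetical stand-ins, not Prism's actual API.

```rust
// Sketch of two-tier fallback. `llm_score` and `heuristic_score`
// are hypothetical stand-ins for Prism's real scorers.
#[derive(Debug, PartialEq)]
enum Tier {
    Llm,
    Heuristic,
}

fn llm_score(_transcript: &str, api_key: Option<&str>) -> Option<f64> {
    // Returns None when no API key is available (or the call is
    // rate-limited); placeholder score otherwise.
    api_key.map(|_| 8.5)
}

fn heuristic_score(transcript: &str) -> f64 {
    // Keyword-matching stand-in for the regex-based heuristic scorer.
    let hits = ["test", "review", "plan"]
        .iter()
        .filter(|kw| transcript.contains(*kw))
        .count();
    (hits as f64 * 3.0).min(10.0)
}

fn score_session(transcript: &str, api_key: Option<&str>) -> (f64, Tier) {
    match llm_score(transcript, api_key) {
        Some(s) => (s, Tier::Llm),
        None => (heuristic_score(transcript), Tier::Heuristic),
    }
}

fn main() {
    // With no API key, scoring falls back to the heuristic tier.
    let (score, tier) = score_session("please review and test this plan", None);
    println!("{score} via {tier:?}");
}
```

The key property is that a session is never left unscored: the free heuristic tier always produces a result when the LLM tier cannot.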
See Two-Tier Scoring for implementation details.
## What gets scored

Every dimension has two metrics, each scored 0–10:
| Dimension | Metric 1 | Metric 2 |
|---|---|---|
| Prompt Quality (PQ) | Specificity | Decomposition |
| Iteration Efficiency (IE) | Convergence | Recovery |
| Verification Discipline (VD) | Review | Validation |
| Tool Use (TU) | Selection | Context |
| Advanced Features (AF) | Delegation | Configuration |
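The two-metrics-per-dimension layout can be modeled directly. In this sketch, a dimension's score is the mean of its two metrics; that aggregation is an assumption for illustration, not necessarily Prism's exact formula.

```rust
// Sketch of the 5-dimension / 10-metric layout. Each metric is 0-10.
// Averaging the two metrics is an illustrative assumption.
struct Dimension {
    name: &'static str,
    metric_1: f64,
    metric_2: f64,
}

impl Dimension {
    fn score(&self) -> f64 {
        (self.metric_1 + self.metric_2) / 2.0
    }
}

fn main() {
    let pq = Dimension {
        name: "Prompt Quality",
        metric_1: 8.0,
        metric_2: 6.0,
    };
    println!("{}: {}", pq.name, pq.score());
}
```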
## Session-level PRISM

The overall PRISM score for a session uses recency-weighted averaging: later turns in a session count more than earlier ones, reflecting improvement over the course of the session.
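As an illustration of recency weighting, here is a sketch where turn `i` gets weight `i + 1`. The linear weighting scheme is an assumption; Prism's actual weights are not specified here.

```rust
// Recency-weighted average: turn i gets weight i + 1, so later turns
// count more. Linear weights are an illustrative assumption.
fn recency_weighted(turn_scores: &[f64]) -> f64 {
    let (num, den) = turn_scores
        .iter()
        .enumerate()
        .fold((0.0, 0.0), |(n, d), (i, s)| {
            let w = (i + 1) as f64;
            (n + w * s, d + w)
        });
    num / den
}

fn main() {
    // A session that improves from 4 to 8 scores above its plain mean of 6.
    let scores = [4.0, 6.0, 8.0];
    let avg = recency_weighted(&scores);
    println!("{avg:.2}");
}
```

With scores `[4, 6, 8]` and weights `1, 2, 3`, the result is `40 / 6 ≈ 6.67`, above the unweighted mean of 6, rewarding the within-session improvement.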
## Coaching notes

Every scored session includes coaching notes: specific, actionable tips for the weakest dimension. These appear in:

- `/prism:score` command output
- The dashboard PRISM insights page
- `/prism:report` review
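Targeting the weakest dimension can be sketched as a minimum over dimension scores. This is illustrative only, not Prism's implementation.

```rust
// Pick the lowest-scoring dimension as the coaching target.
fn weakest<'a>(dims: &[(&'a str, f64)]) -> &'a str {
    dims.iter()
        .min_by(|a, b| a.1.partial_cmp(&b.1).unwrap())
        .map(|(name, _)| *name)
        .unwrap_or("")
}

fn main() {
    let dims = [("PQ", 7.5), ("IE", 6.0), ("VD", 4.5), ("TU", 8.0), ("AF", 5.0)];
    println!("Coach on: {}", weakest(&dims));
}
```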