Two-Tier Scoring

Prism uses two scoring methods to balance accuracy and cost: an LLM scorer that favors accuracy, and a heuristic scorer that favors speed and zero cost.

The LLM scorer reads the full session transcript and evaluates each of the 10 metrics against a detailed rubric.

| Property | Value |
| --- | --- |
| Model | Anthropic Sonnet |
| Accuracy | ~90% |
| Cost | ~$0.0008 per session |
| Latency | 2–5 seconds |
| Used when | API key available, rate limit not exceeded |

How it works:

  1. Session transcript is formatted with turn boundaries, tool calls, and file edits
  2. The rubric for all 10 metrics is embedded in the system prompt
  3. The LLM evaluates each metric on the 0–10 scale
  4. Coaching notes are generated for the weakest dimension
  5. Results are structured as JSON and persisted to Postgres
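
Steps 1 and 5 above can be sketched as follows. This is a minimal illustration and not Prism's actual code: the turn dictionary layout, the `scores` and `coaching_notes` JSON keys, and both function names are assumptions made for the example. The model call itself is left out.

```python
import json


def format_transcript(turns):
    """Render turns with explicit boundaries, tool calls, and file edits.

    The turn layout (role/text/tool_calls/file_edits) is hypothetical.
    """
    lines = []
    for i, turn in enumerate(turns, start=1):
        lines.append(f"--- turn {i} ({turn['role']}) ---")
        lines.append(turn["text"])
        for call in turn.get("tool_calls", []):
            lines.append(f"[tool call] {call}")
        for path in turn.get("file_edits", []):
            lines.append(f"[file edit] {path}")
    return "\n".join(lines)


def parse_llm_result(raw, expected_metrics):
    """Validate the model's JSON reply before it is persisted.

    Assumes a reply shaped like {"scores": {...}, "coaching_notes": "..."}.
    """
    data = json.loads(raw)
    scores = data["scores"]
    missing = [m for m in expected_metrics if m not in scores]
    if missing:
        raise ValueError(f"missing metrics: {missing}")
    for name, value in scores.items():
        if not 0 <= value <= 10:
            raise ValueError(f"{name} out of range: {value}")
    # The weakest metric is the one coaching notes would focus on.
    weakest = min(expected_metrics, key=lambda m: scores[m])
    return {
        "scores": scores,
        "weakest": weakest,
        "coaching_notes": data.get("coaching_notes", ""),
    }
```

Validating the reply before writing it to Postgres means a malformed or partial response can be rejected and routed to the heuristic fallback instead of being stored as a score.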

Advantages: understands nuance, context, and intent. Can detect subtle patterns like scope creep or missed verification opportunities.

The heuristic scorer uses regex patterns, keyword matching, and structural analysis for free, instant scoring.

| Property | Value |
| --- | --- |
| Implementation | Rust-native |
| Accuracy | ~70% |
| Cost | Free |
| Latency | <10ms |
| Used when | LLM unavailable or rate-limited; also for real-time PQ scoring in the plugin |

How it works:

  1. Prompt text is analyzed for specificity markers (file paths, function names, etc.)
  2. Decomposition is measured by verb count, bundling phrases, and list items
  3. Session-level patterns are detected (retry storms, correction cascades)
  4. Point values are assigned per signal and summed to a 0–10 scale
  5. Coaching notes are generated from templates based on the lowest-scoring areas
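
The signal-and-sum approach in steps 1, 2, and 4 can be illustrated with a short sketch. It is written in Python for readability even though the real scorer is Rust-native, and every pattern, verb list, baseline, and point value below is a hypothetical stand-in, not Prism's actual rule set.

```python
import re

# Hypothetical specificity markers: file paths and function names.
FILE_PATH = re.compile(r"\b[\w./-]+\.(?:rs|py|ts|js|go|md|toml|json)\b")
FUNC_NAME = re.compile(r"\b\w+\(\)")

# Hypothetical decomposition signals: bundling phrases and action verbs.
BUNDLING = re.compile(r"\b(?:and also|as well as|while you're at it)\b", re.IGNORECASE)
ACTION_VERBS = {"add", "fix", "refactor", "rename", "remove", "update", "write", "implement"}


def score_prompt(prompt):
    """Sum per-signal points and clamp the total to the 0-10 scale."""
    points = 5  # neutral baseline (assumption)
    if FILE_PATH.search(prompt):
        points += 2  # names a concrete file
    if FUNC_NAME.search(prompt):
        points += 1  # names a concrete function
    words = re.findall(r"[a-z']+", prompt.lower())
    verb_count = sum(1 for w in words if w in ACTION_VERBS)
    if verb_count > 2:
        points -= 2  # many action verbs suggest several tasks in one prompt
    if BUNDLING.search(prompt):
        points -= 1  # explicit bundling phrase
    return max(0, min(10, points))


print(score_prompt("Fix the off-by-one error in src/parser.rs inside parse_config()"))  # 8
print(score_prompt("Fix the login bug, refactor the auth module, and also update the docs"))  # 2
```

Because each check is a single regex search or set lookup, a scorer of this shape fits comfortably within a per-prompt latency budget.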

Advantages: instant, free, works offline. Used for real-time PQ coaching in the UserPromptSubmit hook.

| Context | Scorer | Reason |
| --- | --- | --- |
| UserPromptSubmit hook | Heuristic | Must be instant (<100ms), runs on every prompt |
| /prism:advisor command | Heuristic | Interactive: needs instant feedback |
| /prism:score command | Reads from Postgres | Displays pre-computed scores |
| Background scoring worker | LLM (primary) | Accuracy matters, async processing |
| Background scoring worker (fallback) | Heuristic | LLM unavailable or rate-limited |
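
The routing above amounts to a small dispatch function. The sketch below is one way to express it; the context strings, return labels, and the `llm_available` flag are names chosen for this example and are not taken from Prism's codebase.

```python
def choose_scorer(context, llm_available=True):
    """Pick a scoring path for a given calling context.

    Returns "heuristic", "llm", or "stored" (read pre-computed scores).
    """
    if context in ("UserPromptSubmit", "/prism:advisor"):
        # Interactive paths must answer instantly, so they never wait on the LLM.
        return "heuristic"
    if context == "/prism:score":
        # No scoring happens here; results are read back from Postgres.
        return "stored"
    if context == "background_worker":
        # Prefer accuracy when possible, fall back when the LLM is
        # unavailable or rate-limited.
        return "llm" if llm_available else "heuristic"
    raise ValueError(f"unknown context: {context}")
```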

After scoring, both methods apply adjustments based on session-level signals:

| Signal | Adjustment |
| --- | --- |
| Correction turns detected | Penalty to IE (Recovery) |
| Retry storm detected | Penalty to IE (Convergence) |
| Single-turn session with good output | Bonus to PQ and IE |
| No verification prompts in session | Penalty to VD |
| CLAUDE.md present in project | Bonus to AF (Configuration) |
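
A table-driven sketch of this adjustment pass is shown below. The source does not state how large each penalty or bonus is, so the deltas here are placeholders. The signal names, the way scores are keyed, and the rule that a bare dimension label such as `IE` applies to every metric under that dimension are likewise assumptions made for illustration.

```python
# Hypothetical signal names mapped to (target, delta) pairs.
# Delta magnitudes are placeholders; the source does not specify them.
ADJUSTMENTS = {
    "correction_turns": [("IE (Recovery)", -1.0)],
    "retry_storm": [("IE (Convergence)", -1.0)],
    "single_turn_good_output": [("PQ", 0.5), ("IE", 0.5)],
    "no_verification_prompts": [("VD", -1.0)],
    "claude_md_present": [("AF (Configuration)", 0.5)],
}


def _matches(key, target):
    """A target matches its exact key, or any metric under that dimension."""
    return key == target or key.startswith(target + " (")


def apply_adjustments(scores, signals):
    """Return a new score dict with adjustments applied and clamped to 0-10."""
    adjusted = dict(scores)  # leave the caller's dict untouched
    for signal in signals:
        for target, delta in ADJUSTMENTS.get(signal, []):
            for key in adjusted:
                if _matches(key, target):
                    adjusted[key] = max(0.0, min(10.0, adjusted[key] + delta))
    return adjusted
```

Keeping the rules in a data table rather than in branching code lets both scorers share one adjustment pass, which matches the statement that both methods apply the same adjustments.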

All scores are stored in the prism.prism_scores Postgres table:

  • Session ID, timestamp, org ID, developer ID
  • All 10 metric scores (0–10)
  • Composite PRISM score (weighted average)
  • Scoring method (LLM or heuristic)
  • Coaching notes (text)
  • Anti-patterns detected (JSON array)
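
The stored fields and the weighted-average composite can be sketched as a record type plus a helper. The field names below are guesses derived from the list above, not the actual column names of `prism.prism_scores`, and the weights passed to `composite_score` are hypothetical because the source does not give them.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class ScoreRecord:
    """One scored session; field names are illustrative, not the real schema."""
    session_id: str
    timestamp: datetime
    org_id: str
    developer_id: str
    metric_scores: dict  # metric name -> score on the 0-10 scale
    composite: float  # weighted average of metric_scores
    scoring_method: str  # "llm" or "heuristic"
    coaching_notes: str = ""
    anti_patterns: list = field(default_factory=list)  # stored as a JSON array


def composite_score(metric_scores, weights):
    """Weighted average: sum(score * weight) / sum(weight)."""
    total_weight = sum(weights[m] for m in metric_scores)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    weighted = sum(metric_scores[m] * weights[m] for m in metric_scores)
    return weighted / total_weight
```

Dividing by the total weight keeps the composite on the same 0-10 scale as the individual metrics, whatever weights are chosen.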