Models
Veriva runs Claude Haiku 4.5, Sonnet 4.6, and Opus 4.7 — one isn't right for every job. The pipeline routes each stage to the model that gives the best cost/quality fit, with admin overrides per-org.
No model lock-in
bedrockCreate call reads the active model from the routing config. Switching a stage to a different model is a config change, not a code change.Default routing
| Slot | Default model | Used in |
|---|---|---|
sanitize | Haiku 4.5 | Stage 1 — ambiguous injection detection |
enrichSummarizer | Haiku 4.5 | Stage 3 — repo / author summary compression |
layer2.small | Haiku 4.5 | Stage 5 — diffs under 200 lines |
layer2.standard | Sonnet 4.6 | Stage 5 — diffs 200–2000 lines |
layer2.large | Opus 4.7 | Stage 5 — diffs over 2000 lines |
crossCheck | Haiku 4.5 | Stage 6 — race the AI Review verdict |
deepAudit | Opus 4.7 | Stage 7 — reasoning-heavy second look |
fixGenerator | Sonnet 4.6 | Stage 8 — patch generation |
fixValidator | Haiku 4.5 | Stage 8 — verify patch compiles / makes sense |
Why this mix
Haiku is fast and cheap — perfect for the disputer in cross-check, the validator after auto-fix, and the summarizer at enrich. Putting Haiku opposite Sonnet at cross-check makes the disagreement signal real: same-model self-critique tends to agree with itself.
Sonnet is the workhorse for the typical PR. Opus runs only when the stakes are high — large diffs the small model would lose context on, or Deep Audit on disputed / critical findings — to keep cost bounded.
Cost tracking
Every bedrockCreate call routes through the cost tracker and writes to AICallLog. Per-org daily totals roll up into DailyOrgMetrics. The cost-budget check in the pipeline (Pro: $0.50/PR, Ultra: $2.00/PR) reads the running total mid-flight and short-circuits remaining LLM stages on overrun.
Prompt caching
Bedrock prompt caching is enabled on the system-prompt + repo-profile portion of every AI Review call. The cache point sits before the per-PR diff payload, so repeat-PR-same-repo cases hit the cache and cut Sonnet/Opus cost by 60–90%. Cache hit rate shows on the org overview metrics page.
Overriding routing
Customers don't configure routing directly — it's an admin-controlled global default. Enterprise customers can request a per-org override (e.g., "use Opus everywhere" for compliance reasons), set via PlanOverride.
Region failover
us-east-1. Secondary is us-west-2. The Bedrock client circuit breaker fails over automatically when the primary region opens. Both regions carry provisioned throughput.