Pipeline
Every PR runs through one explicit pipeline whether it enters via the GitHub App webhook, the CLI, the VS Code extension, or the GitHub Action. The same runPipeline() function backs all four entry points, which means findings are reproducible and replayable across surfaces.
One pipeline, four entry points
All four entry points resolve to runPipeline(). The CLI and VS Code extension call it in-process; the GitHub Action calls our API. The ten stages execute in the same order regardless.
Ten-stage pipeline + post-merge outcome
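The shared entry point can be pictured as a fixed, ordered stage list folded over one accumulating context. A minimal sketch, assuming nothing beyond this page: stage names are real, the context shape and stage bodies are illustrative.

```typescript
// Illustrative sketch: ten stages, fixed order, one shared context.
type StageName =
  | "sanitize" | "pre-ingest" | "enrich" | "static" | "ai-review"
  | "cross-check" | "deep-audit" | "auto-fix" | "merge-gate" | "explain";

interface Finding { stage: StageName; severity: string; message: string }

interface PipelineContext { findings: Finding[]; proceed: boolean }

type Stage = (ctx: PipelineContext) => PipelineContext;

// Hypothetical registry; real stage bodies are described below.
const stages: Array<[StageName, Stage]> = [
  ["sanitize", (ctx) => ctx],
  ["pre-ingest", (ctx) => ctx],
  // ...remaining stages elided in this sketch
];

function runPipeline(ctx: PipelineContext): PipelineContext {
  // Each stage sees the accumulated context; a stage may clear `proceed`
  // to short-circuit the rest (e.g. Pre-ingest under STRICT policy).
  return stages.reduce((acc, [, stage]) => (acc.proceed ? stage(acc) : acc), ctx);
}
```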
Stage 1 — Sanitize
First touch. Strips prompt-injection from anything we'll later send to an LLM: PR title, body, commit messages, file contents. Detects literal injection strings (ignore previous instructions, system: blocks), base64-encoded instruction payloads, zero-width characters, and unusual unicode direction marks.
Rule-based scrubbing first, then a Haiku call for ambiguous cases. Output is a sanitized payload plus a sanitizationFindings[] array, logged for audit but not user-blocking.
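The rule-based pass can be sketched as a pattern table applied before any LLM call. The detection categories come from this page; the specific regexes and the finding shape are illustrative, and the Haiku escalation for ambiguous cases is not shown.

```typescript
// Illustrative rule-based scrubber; base64 payload detection and the
// Haiku fallback for ambiguous cases are elided.
interface SanitizationFinding { kind: string; excerpt: string }

const INJECTION_PATTERNS: Array<[string, RegExp]> = [
  ["literal-injection", /ignore (all )?previous instructions/i],
  ["system-block", /^\s*system:/im],
  ["zero-width", /[\u200B-\u200D\uFEFF]/],
  ["direction-mark", /[\u202A-\u202E\u2066-\u2069]/],
];

function sanitize(text: string): { clean: string; findings: SanitizationFinding[] } {
  const findings: SanitizationFinding[] = [];
  let clean = text;
  for (const [kind, pattern] of INJECTION_PATTERNS) {
    const match = clean.match(pattern);
    if (match) {
      // Log an audit finding, then strip every occurrence.
      findings.push({ kind, excerpt: match[0].slice(0, 40) });
      clean = clean.replace(new RegExp(pattern.source, pattern.flags + "g"), "");
    }
  }
  return { clean, findings };
}
```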
Stage 2 — Pre-ingest
Hygiene checks before any model sees the diff. Default checks:
- PR description present (configurable minimum length)
- Tests touched if src/ was touched (paths configurable)
- No binary blobs over 1MB
- No committed .env files or credentials
Outputs preIngestFindings[] plus a proceed flag. Under STRICT or LOCKED policy, a proceed: false result fails the check-run early without spending LLM tokens.
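The default checks above can be sketched as one pure function. Checks, policy names, and the proceed semantics are from this page; thresholds, field names, and the test-file heuristic are illustrative assumptions.

```typescript
// Illustrative defaults; minimum lengths and paths are configurable per-repo.
interface PrMeta {
  description: string;
  changedFiles: string[];
  maxBlobBytes: number; // size of the largest binary blob in the diff
}

interface PreIngestResult { findings: string[]; proceed: boolean }

function preIngest(pr: PrMeta, policy: "RELAXED" | "STRICT" | "LOCKED"): PreIngestResult {
  const findings: string[] = [];
  if (pr.description.trim().length < 20) findings.push("missing-description");
  const touchedSrc = pr.changedFiles.some((f) => f.startsWith("src/"));
  const touchedTests = pr.changedFiles.some((f) => /\.(test|spec)\./.test(f));
  if (touchedSrc && !touchedTests) findings.push("no-tests-touched");
  if (pr.maxBlobBytes > 1_000_000) findings.push("binary-blob-over-1mb");
  if (pr.changedFiles.some((f) => /(^|\/)\.env$/.test(f))) findings.push("committed-env-file");
  // Under STRICT or LOCKED, any finding fails the check-run before LLM spend.
  const proceed = policy === "RELAXED" || findings.length === 0;
  return { findings, proceed };
}
```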
Stage 3 — Enrich
Assemble the context bundle every downstream LLM stage reads from. Cached in Redis keyed by ${prId}-${headSha} for one hour.
- Repo profile — language, frameworks, test conventions, fingerprinted from the last 200 commits (7-day TTL).
- Author profile — last 50 PRs by the author: merge rate, revert rate, avg findings per PR, primary languages.
- Similar PRs — file-overlap lookup against the last 200 PRs in the repo. Embedding-based similarity comes in v2.
- Dep graph — direct dep changes; new packages flagged for slopsquatting check.
- Agent identity — see Agents.
- Cross-PR detect — file overlap with other open PRs, revert detection, dup-diff signals.
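The cache-aside lookup around the bundle can be sketched as follows. The key shape and one-hour TTL are from this page; the bundle field names and the minimal Redis-client interface are assumptions, not the real schema.

```typescript
// Sketch of the enrich-bundle cache; `redis` stands in for any client
// exposing get/set-with-TTL. Field names are illustrative.
interface EnrichBundle {
  repoProfile: unknown;
  authorProfile: unknown;
  similarPrs: unknown[];
  depFlags: string[];
  agentIdentity: string | null;
}

const ONE_HOUR_SECONDS = 3600;

function enrichCacheKey(prId: string, headSha: string): string {
  return `${prId}-${headSha}`;
}

async function getEnrichBundle(
  redis: {
    get(k: string): Promise<string | null>;
    set(k: string, v: string, ttlSeconds: number): Promise<void>;
  },
  prId: string,
  headSha: string,
  build: () => Promise<EnrichBundle>,
): Promise<EnrichBundle> {
  const key = enrichCacheKey(prId, headSha);
  const hit = await redis.get(key);
  if (hit) return JSON.parse(hit); // repeat stage reads within the hour are free
  const bundle = await build();
  await redis.set(key, JSON.stringify(bundle), ONE_HOUR_SECONDS);
  return bundle;
}
```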
Stage 4 — Static
Deterministic rules, no LLM. Combines the security ruleset, AI-pattern detection, slopsquatting checks for new dependencies, and the dependency license check (disallowed licenses configurable per-org).
Findings here are cheap, fast, and have rule IDs you can reference in policy.
Stage 5 — AI Review
Sized routing. Tiny diffs go to Haiku, mid-size to Sonnet, large to Opus — see Models. Prompts are pulled from the PromptTemplate table at the active version, so admins can ship prompt changes without a deploy.
The system prompt is augmented with the enrich bundle: repo profile summary, author summary, dep flags, agent identity. The cache point marker before the enrich block lets us hit Bedrock prompt caching on repeat-PR-same-repo cases.
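The sized routing reduces to a threshold function. The tier order is from this page; the concrete cut-points below are placeholders, since the real ones live in the Models doc.

```typescript
// Illustrative thresholds only; real cut-points are defined in Models.
type Model = "haiku" | "sonnet" | "opus";

function routeBySize(changedLines: number): Model {
  if (changedLines <= 50) return "haiku";   // tiny diffs
  if (changedLines <= 600) return "sonnet"; // mid-size
  return "opus";                            // large
}
```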
Stage 6 — Cross-check
For every AI Review finding above HIGH severity, we run a second cheap Haiku call: "is this finding genuinely a problem? yes/no plus one-line reason." Race with a 5s timeout.
Disagreements get demoted to LOW and routed into the disputed list for Deep Audit. This is the yoloClassifier pattern — it cuts false positives without an Opus call on every finding.
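The race-with-timeout shape can be sketched like this. The 5s cap and the demote-on-disagreement rule are from this page; treating a timeout as agreement (fail-open) is an assumption, and `verify` stands in for the Haiku call.

```typescript
interface Verdict { agrees: boolean; reason: string }

// Only an explicit "no" demotes a finding; keeping it on timeout
// (fail-open) is an assumption, not documented behavior.
function decide(verdict: Verdict): "keep" | "demote-to-low" {
  return verdict.agrees ? "keep" : "demote-to-low";
}

function withTimeout<T>(p: Promise<T>, ms: number, fallback: T): Promise<T> {
  return Promise.race([
    p,
    new Promise<T>((resolve) => setTimeout(() => resolve(fallback), ms)),
  ]);
}

async function crossCheck(verify: () => Promise<Verdict>): Promise<"keep" | "demote-to-low"> {
  const verdict = await withTimeout(verify(), 5000, { agrees: true, reason: "timeout" });
  return decide(verdict);
}
```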
Stage 7 — Deep Audit
Conditional. Runs only when one of:
- Any CRITICAL finding from stages 4–6
- disputedFindings[] from Cross-check is non-empty
- Trust score below configured threshold
Opus reads the same enrich bundle plus the disputed findings and produces a final verdict. Most PRs skip this stage entirely.
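The trigger conditions reduce to a single predicate; a sketch with illustrative field names:

```typescript
// One true condition is enough to run Deep Audit; most PRs hit none.
interface AuditInput {
  findings: Array<{ severity: "LOW" | "MEDIUM" | "HIGH" | "CRITICAL" }>;
  disputedFindings: unknown[];
  trustScore: number;
  trustThreshold: number; // configured per-repo or per-org
}

function shouldDeepAudit(input: AuditInput): boolean {
  return (
    input.findings.some((f) => f.severity === "CRITICAL") ||
    input.disputedFindings.length > 0 ||
    input.trustScore < input.trustThreshold
  );
}
```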
Stage 8 — Auto-fix
For findings tagged fixable, generate patch suggestions the author can apply with one click. Plan-gated to Pro and above.
Stage 9 — Merge Gate
Read the merge policy for the repo (falling back to the org default), evaluate against accumulated findings plus agent identity plus changed files, and produce the check-run conclusion: success, failure, or neutral.
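A minimal sketch of the evaluation, assuming a policy shape that is not documented here: the three conclusion values are from this page, the `blockOn`/`warnOnly` fields are hypothetical.

```typescript
// Illustrative gate: blocking severities fail, unless the policy is
// advisory-only, in which case the check-run lands neutral.
type Conclusion = "success" | "failure" | "neutral";

interface MergePolicy { blockOn: string[]; warnOnly: boolean }

function mergeGate(
  policy: MergePolicy,
  findings: Array<{ severity: string }>,
): Conclusion {
  const blocking = findings.some((f) => policy.blockOn.includes(f.severity));
  if (!blocking) return "success";
  return policy.warnOnly ? "neutral" : "failure";
}
```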
Stage 10 — Explain
Every finding written to the database carries a provenance object: the stage that produced it, the rule ID (for static) or prompt version + AI call ID (for AI stages), and an evidence excerpt.
The PR comment renders findings as [stage:rule] message. Owners and admins can click through to the full AI call detail. See Audit Trail.
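The provenance fields and the comment format above can be sketched together; the rule-vs-prompt fallback in the renderer is an assumption.

```typescript
// Provenance fields per this page; static findings carry a ruleId,
// AI findings carry a prompt version and AI call ID instead.
interface Provenance {
  stage: string;
  ruleId?: string;        // static stages
  promptVersion?: string; // AI stages
  aiCallId?: string;      // AI stages
  evidence: string;       // excerpt backing the finding
}

function renderFinding(p: Provenance, message: string): string {
  const ref = p.ruleId ?? `prompt@${p.promptVersion}`;
  return `[${p.stage}:${ref}] ${message}`;
}
```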
Post-merge: Outcome tracking
The ten numbered stages finish when the check-run is written. An async outcome loop runs after merge — it's a feedback channel, not a pipeline stage, so it sits outside the numbered flow.
On pull_request.closed with merged: true, we enqueue a delayed BullMQ job that fires 14 days later to check for revert PRs and incident-tagged commits in the same files.
Writes a PRReviewOutcome row. Aggregated weekly into per-rule false-positive rates that surface in the admin feedback dashboard and feed the model-routing tuning loop.
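The enqueue can be sketched as follows. The event trigger and 14-day delay are from this page, and the `add(name, data, { delay })` shape mirrors BullMQ's API; the job name and payload fields are illustrative.

```typescript
// Sketch of scheduling the outcome check; `queue` stands in for a
// BullMQ Queue (queue.add(name, data, { delay }) is its real shape).
const FOURTEEN_DAYS_MS = 14 * 24 * 60 * 60 * 1000;

interface OutcomeJob { prId: string; repoId: string; mergedSha: string }

function scheduleOutcomeCheck(
  queue: { add(name: string, data: OutcomeJob, opts: { delay: number }): Promise<void> },
  job: OutcomeJob,
): Promise<void> {
  // The worker later scans the same files for reverts and
  // incident-tagged commits, then writes a PRReviewOutcome row.
  return queue.add("pr-outcome-check", job, { delay: FOURTEEN_DAYS_MS });
}
```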
Cost and latency budgets
When a PR's accumulated LLM spend exceeds its configured budget, a BUDGET_EXCEEDED finding is emitted. Latency target is 90s p95 from webhook to check-run.