AgentGuardian runs as a merge gate: the same agent-guardian scan you run locally exits non-zero when an agent regresses, and your CI provider turns that exit code into a failed check. The findings surface three ways — a SARIF / Code Quality report, inline annotations, and a single sticky PR comment — all from the one CLI.

Start here

GitHub Actions

Composite action + SARIF upload to the Security tab + sticky PR comment.

GitLab CI

SARIF security report + inline Code Quality widget + sticky MR note.

Bitbucket

Code Insights report + per-finding annotations + sticky PR comment.

The pieces

Page | What it covers
GitHub Actions | The pull_request workflow, the composite action, the permissions block.
Composite action | The reusable agentguardian-scan action and its inputs.
GitLab CI | The .gitlab-ci.yml job, SAST + Code Quality artifacts.
Bitbucket Pipelines | bitbucket-pipelines.yml, Code Insights, annotations.
Security gates | --fail-under plus the --max-critical / --max-high / --max-medium / --max-low ceilings, and the authoritativeness rules.
Upload SARIF | github/codeql-action/upload-sarif@v3 and the permissions it needs.
PR / MR comments | The single sticky comment, its hidden-marker contract, and a rendered sample.

The gate, in one line

agent-guardian scan \
  --framework langgraph --framework-ref my_app.graph:graph \
  --model gemini:gemini-2.5-flash --mode full --budget-usd 0.10 \
  --output sarif --output-path scan.sarif \
  --fail-under 70 --max-critical 0 --max-high 0
--fail-under is the AIVSS floor; the --max-* flags are per-severity finding ceilings. They are AND-combined: every threshold must hold for the gate to pass, so the gate fails if the score drops below the floor or if any severity count exceeds its ceiling. The full matrix is on the security gates page.
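The pass/fail rule described above can be sketched in a few lines of Python. This is an illustrative model only, not AgentGuardian's implementation: the function name `gate_passes`, the dictionary shapes, and the lowercase severity keys are assumptions made for the example. It reads "drops below the floor" and "exceeds its ceiling" literally, so a score equal to the floor and a count equal to its ceiling both pass.

```python
from typing import Mapping, Optional


def gate_passes(
    score: float,
    counts: Mapping[str, int],
    fail_under: Optional[float] = None,
    ceilings: Optional[Mapping[str, int]] = None,
) -> bool:
    """Return True only when every configured threshold holds (AND-combined).

    Hypothetical model of the gate: `score` stands in for the AIVSS score,
    `counts` maps a severity name to its number of findings, and `ceilings`
    maps a severity name to its --max-* value. Unset thresholds are skipped.
    """
    if fail_under is not None and score < fail_under:
        return False  # the score dropped below the floor
    for severity, ceiling in (ceilings or {}).items():
        if counts.get(severity, 0) > ceiling:
            return False  # this severity exceeded its ceiling
    return True


# Mirrors --fail-under 70 --max-critical 0 --max-high 0 (inputs are made up)
thresholds = {"critical": 0, "high": 0}
print(gate_passes(82, {"critical": 0, "high": 0, "medium": 3}, 70, thresholds))  # True
print(gate_passes(82, {"critical": 1}, 70, thresholds))  # False
print(gate_passes(65, {}, 70, thresholds))  # False
```

Note that the medium findings in the first call do not affect the verdict, because no ceiling was set for that severity.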
AgentGuardian’s open-source gate is stateless: there is no baseline. Every scan is judged on its own. The gate has no memory of the previous scan, so it cannot tell a newly introduced finding from one that was already there. Consequences:
  • Every finding counts. A --max-critical 0 gate fails the moment a single CRITICAL finding exists, even if that finding predates the PR under review.
  • A pre-existing finding can fail the gate. Turning the gate on against an agent that already has open findings will go red on the first run. That is by design — the gate reports the agent’s current posture, not the delta.
  • Tune the floor to where you are, then tighten. Start permissive (--fail-under 60, no --max-*), land mitigations, and tighten the thresholds as the agent improves: raise --fail-under and add or lower the --max-* ceilings. See security gates → picking the floor.
Baseline-diff — failing only on findings a PR introduces relative to a recorded baseline — is a hosted (SaaS) feature, not part of the OSS CLI. If you need delta-only gating, that lives in AgentGuardian Cloud.
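To make the difference concrete, the sketch below contrasts the two counting strategies. It is a conceptual model only: the finding IDs, the tuple shape, and both function names are hypothetical, and `delta_count` merely illustrates the idea of delta-only gating, not how AgentGuardian Cloud implements it.

```python
from typing import Iterable, Set, Tuple

Finding = Tuple[str, str]  # (finding_id, severity); hypothetical shape


def stateless_count(findings: Iterable[Finding], severity: str) -> int:
    """Count every finding of a severity, as a stateless gate does."""
    return sum(1 for _, sev in findings if sev == severity)


def delta_count(findings: Iterable[Finding], baseline_ids: Set[str], severity: str) -> int:
    """Count only findings that are absent from a recorded baseline."""
    return sum(
        1 for fid, sev in findings if sev == severity and fid not in baseline_ids
    )


current = [("F-001", "CRITICAL"), ("F-002", "HIGH")]  # made-up scan result
baseline = {"F-001"}  # F-001 predates the PR under review

print(stateless_count(current, "CRITICAL"))        # 1, so --max-critical 0 fails
print(delta_count(current, baseline, "CRITICAL"))  # 0, so a delta gate would pass
```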

Surfacing the result

Each platform exposes the same scan three ways:
  • A machine-readable report — SARIF (GitHub Code Scanning / GitLab SAST), GitLab Code Quality JSON, or a Bitbucket Code Insights report.
  • Inline annotations — per-finding, severity-mapped, shown on the diff.
  • A sticky PR/MR comment — one comment upserted in place on every push, keyed by a hidden marker so it never spams. Identical body across all three hosts; see PR / MR comments.
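The upsert behind a sticky comment can be sketched as follows. This is a minimal illustration, assuming comments arrive as dictionaries with `id` and `body` keys; the marker string and the function name `plan_upsert` are hypothetical, and the actual marker contract is the one described on the PR / MR comments page.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical marker; an HTML comment stays invisible in rendered Markdown.
MARKER = "<!-- agentguardian-sticky-comment -->"


def plan_upsert(
    existing: List[Dict[str, str]], report: str
) -> Tuple[str, Optional[str], str]:
    """Decide whether to update the previous comment or create a new one.

    Returns (action, comment_id, body). The body always carries the marker
    so that the next push can find this comment again.
    """
    body = f"{MARKER}\n{report}"
    for comment in existing:
        if MARKER in comment["body"]:
            return ("update", comment["id"], body)
    return ("create", None, body)


comments = [
    {"id": "101", "body": "LGTM"},
    {"id": "102", "body": f"{MARKER}\nscan report from the previous push"},
]
action, comment_id, _ = plan_upsert(comments, "scan report from this push")
print(action, comment_id)  # update 102
print(plan_upsert([], "scan report from this push")[0])  # create
```

Because the decision depends only on the marker, the same logic would apply on any host; only the API call that performs the create or update differs.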

Cost control

Cap the spend with --budget-usd so a runaway provider can never cost more than the budgeted amount per scan. The swarm soft-stops new attack turns at 80% of the cap and reserves the remainder for the report-emission step. A full-mode scan against the bundled vulnerable demo costs roughly $0.06 on gemini:gemini-2.5-flash.
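As a worked example of that split, the arithmetic for a given cap can be written out like this. The 80% figure comes from the text above; the function name `split_budget` and the use of `Decimal` are choices made for this sketch, not part of the CLI.

```python
from decimal import Decimal
from typing import Tuple

SOFT_STOP_FRACTION = Decimal("0.80")  # from the text: soft-stop at 80% of the cap


def split_budget(cap_usd: str) -> Tuple[Decimal, Decimal]:
    """Return (soft_stop_threshold, reserve) for a --budget-usd value.

    Decimal avoids the binary floating-point noise that 0.10 * 0.8 would show.
    """
    cap = Decimal(cap_usd)
    soft_stop = cap * SOFT_STOP_FRACTION
    return soft_stop, cap - soft_stop


soft_stop, reserve = split_budget("0.10")
print(f"soft-stop at ${soft_stop:.2f}, reserve ${reserve:.2f}")
# soft-stop at $0.08, reserve $0.02
```

With --budget-usd 0.10, new attack turns would stop once spend reaches $0.08, leaving $0.02 for report emission.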