The default --fail-under gate is a score gate: it blocks when the
aggregate AIVSS drops below your floor. This page covers the
complementary finding gate: block when any finding lands in the
critical (or, optionally, high) severity tier, regardless of the
aggregate score.
When to use this
- A release branch where one critical-band ASI05 (code execution) or ASI03 (privilege escalation) finding is a non-negotiable block, even if the rest of the run is clean and the aggregate AIVSS clears the floor.
- An agent that runs in a high-blast-radius environment (production database write access, customer-data egress, payment APIs).
- Any time the score gate alone has let a single high-severity finding slip through because the aggregate stayed above the floor.
The two gates, side by side
The CLI ships one built-in gate flag: --fail-under (score gate). A finding gate is a thin post-processing step on the scan.json the CLI always emits. Wire both together so they share the same scan and the build fails on whichever fires first.
| Gate | Flag / mechanism | Fires when | Exit code |
|---|---|---|---|
| Score gate | --fail-under N | AIVSS < N, or scan is non-authoritative | 1 (EXIT_FAIL_UNDER) |
| Finding gate | Post-step: parse scan.json, count band=critical findings | Any finding has band=critical | Whatever the post-step exits with (use 1 for parity) |
The band names come from models/severity.py: excellent, good, warning, poor, critical, plus the non-numeric not_evaluated for non-authoritative scans. The critical band covers AIVSS 0–39 (the lowest 40 points of the 100-point scale).
Step 1: produce a JSON report
The JSON report is structured, signed, and always emitted alongside any other format. Ask for it explicitly (for example, --output json --output-path scan.json) so the post-step has a deterministic path.
--output accepts json, sarif, junit, md, or pdf (see cli.py:2306–2308). If you need both a JSON for the finding gate and a SARIF for GitHub Code Scanning, ask the CLI for --output sarif --output-path scan.sarif and emit a sidecar scan.json from a second agent-guardian report SCAN_ID --output json call, or use --bundle DIR to write a checksummed SARIF+PoV bundle with the JSON included.
Step 2: count critical findings
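Each entry in the findings[] array carries a per-finding band field, populated from the same band_for_score enum. The shell post-step is two lines of jq: count the findings whose band is critical, then exit non-zero (with an ::error:: message) when the count is above zero.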
The ::error:: prefix is GitHub Actions log-command syntax: it surfaces the line as a red annotation on the PR. On GitLab or Jenkins, drop the prefix and rely on the non-zero exit.
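For runners where jq is not installed, the same post-step can be sketched in Python. This is a minimal illustration, not the project's own tooling: it assumes only what this page states about scan.json (a top-level findings[] array whose entries carry a band field). The function names count_findings_in_band and finding_gate are hypothetical helpers introduced here.

```python
import json
from pathlib import Path


def count_findings_in_band(report: dict, band: str = "critical") -> int:
    """Count entries in findings[] whose per-finding band equals `band`."""
    findings = report.get("findings", [])
    return sum(1 for finding in findings if finding.get("band") == band)


def finding_gate(report_path: str, band: str = "critical") -> int:
    """Return the gate's exit code: 1 if any finding is in `band`, else 0."""
    report = json.loads(Path(report_path).read_text(encoding="utf-8"))
    count = count_findings_in_band(report, band)
    if count > 0:
        # "::error::" is the GitHub Actions log-command prefix;
        # drop it on GitLab or Jenkins and rely on the exit code.
        print(f"::error::{count} {band}-band finding(s) in {report_path}")
        return 1
    print(f"No {band}-band findings in {report_path}")
    return 0
```

In a CI step, the script would end with sys.exit(finding_gate("scan.json")) so the step's exit status carries the verdict; returning 1 keeps parity with the score gate's exit code.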
Step 3: wire both gates into one job
The CLI’s score gate runs first (inside the scan step). If it passes, the post-step’s finding gate runs. The job fails if either fires.
In .github/workflows/agent-guardian.yml, setting if: always() on the finding gate step runs it even when the score gate exited 1, so a single run surfaces both failure modes instead of stopping at the first one. The job’s final status is the OR of the two steps.
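Outside GitHub Actions, the same "run both, fail on either" behaviour can be approximated with a small wrapper. The sketch below is an assumption-laden illustration: run_both_gates and critical_count are hypothetical names, the scan command is whatever your pipeline already runs (passed in as an argument list), and treating a missing report as a failure is a design choice made here, not documented CLI behaviour.

```python
import json
import subprocess
from pathlib import Path
from typing import Sequence


def critical_count(report_path: Path) -> int:
    """Count findings[] entries whose band is "critical"."""
    report = json.loads(report_path.read_text(encoding="utf-8"))
    return sum(1 for f in report.get("findings", []) if f.get("band") == "critical")


def run_both_gates(scan_cmd: Sequence[str], report_path: Path) -> int:
    """Run the scan, then always evaluate the finding gate; 1 if either fired."""
    # Score gate: the scan command's own exit status (non-zero means it fired).
    score_gate_exit = subprocess.run(list(scan_cmd), check=False).returncode

    # Equivalent of `if: always()`: inspect the report even after a failed scan.
    if report_path.exists():
        finding_gate_fired = critical_count(report_path) > 0
    else:
        # Assumption: with no report the finding gate cannot be evaluated,
        # so this sketch fails closed.
        finding_gate_fired = True

    # Final status is the OR of the two gates.
    return 1 if (score_gate_exit != 0 or finding_gate_fired) else 0
```

Because the report is checked regardless of the scan's exit status, one run reports both failure modes, mirroring the workflow's behaviour.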
Step 4: confirm the gate is real
Before trusting the gate in production, prove it fails on a known-bad target. The bundled vulnerable demo agent, agent_guardian.examples.vulnerable_demo, is calibrated to produce critical-band findings on a full-mode scan. Run it locally.
The run should exit 1 from the scan command (score gate fired) and report a non-zero count of critical-band findings (finding gate would fire). If both produce the expected signals, the same workflow on real targets is trustworthy.
How to interpret the two gates together
The two gates answer different questions, and reading both is what makes the verdict actionable:

| Score gate | Finding gate | What it means | What to do |
|---|---|---|---|
| Pass | Pass | Aggregate AIVSS clears the floor and no critical-band findings. | Merge. |
| Pass | Fail | Aggregate is healthy but one finding is in the critical band (likely a single ASI05/ASI03 exploit). | Block. Triage the named finding(s); the rest of the agent is fine, but one capability is severely broken. |
| Fail | Pass | AIVSS < floor (broad mediocrity), no single finding hit critical. | Block. The agent has too many warning/poor-band findings — mitigate the most-common category. |
| Fail | Fail | Both broad mediocrity and a critical-band finding. | Block. Fix the critical finding first; re-scan to see whether the aggregate clears once it’s removed. |
What this does not do
- It does not change the CLI’s exit codes. The CLI still uses the six codes from cli.py:83–89; the finding gate is a sibling shell step with its own exit.
- It does not produce a meaningful “critical” verdict in --mode fast or --mode smart. A fast scan that finds zero critical-band issues has only run 3 probes per agent; absence of evidence is not evidence of absence. The same authoritativeness rule applies to the finding gate as to --fail-under: only trust a “no critical findings” result from a --mode full authoritative scan (see AIVSS score → mode_authoritative).
- It does not de-duplicate findings across runs. If the same agent flaw is found twice, both findings land in the JSON and both count. Use --pov-gate to drop unreproducible triggers, and --critic to drop low-quality / high-false-positive-risk findings before they reach the gate.
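The authoritativeness caveat above can be encoded directly in the post-step so that a clean result from a non-authoritative scan is reported as inconclusive rather than as a pass. The sketch below is hypothetical: this page names the mode_authoritative rule but does not show where it appears in scan.json, so the top-level boolean key "mode_authoritative" is an assumption, as is the helper name finding_gate_verdict. Check the Reports page for the actual field before relying on it.

```python
from typing import Any, Dict


def finding_gate_verdict(report: Dict[str, Any]) -> str:
    """Return "fail", "pass", or "inconclusive" for the finding gate."""
    findings = report.get("findings", [])
    if any(f.get("band") == "critical" for f in findings):
        # A critical-band finding blocks regardless of scan mode.
        return "fail"

    # ASSUMPTION: a top-level boolean key named "mode_authoritative".
    # The real location and name of this flag may differ.
    if report.get("mode_authoritative") is True:
        return "pass"

    # Zero critical findings from a non-authoritative scan proves nothing.
    return "inconclusive"
```

A pipeline would typically treat "inconclusive" the same as "fail" on release branches and as a warning elsewhere; that policy is a local choice, not something the CLI enforces.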
Next step
- Upload SARIF to GitHub: Walk-through for github/codeql-action/upload-sarif@v3 and the permissions block.
- GitHub Actions: The full workflow YAML, including SARIF upload to the Security tab.
- Reports: The structure of scan.json, including the per-finding facets this gate parses.
- AIVSS score: The five-step formula + the mode_authoritative rule that gates --fail-under.