A merge-request check that fails when the AgentGuardian AIVSS score drops below your floor, with the SARIF report surfaced through GitLab’s security-dashboard pipeline-report contract.

When to add this

  • The first time an LLM agent lands in main and needs a regression gate on merge requests.
  • On every release branch before tagging.
  • For any change that touches the agent’s system prompt, tool surface, memory layer, or framework graph.

Wire it up

Create .gitlab-ci.yml (or add a redteam job to your existing pipeline):
.gitlab-ci.yml
stages:
  - redteam

redteam:
  stage: redteam
  image: python:3.12-slim
  variables:
    GEMINI_API_KEY: $GEMINI_API_KEY      # set in CI/CD → Variables (masked)
  before_script:
    - pip install --no-cache-dir agent-guardian
  script:
    - |
      agent-guardian scan \
        --framework langgraph \
        --framework-ref my_app.graph:graph \
        --model gemini:gemini-2.5-flash \
        --mode full \
        --budget-usd 0.10 \
        --output sarif \
        --output-path scan.sarif \
        --fail-under 70
  artifacts:
    when: always       # upload even when --fail-under fails
    paths:
      - scan.sarif
    reports:
      sast: scan.sarif # surfaces findings in GitLab's security dashboard
    expire_in: 30 days
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
    - if: $CI_COMMIT_BRANCH == "main"
Three things to know:
  1. artifacts.when: always is mandatory — without it, a failed --fail-under would suppress the SARIF and leave reviewers without the annotations they need.
  2. artifacts.reports.sast is what feeds GitLab’s Security & Compliance → Vulnerability Report. The SARIF emitter produces a schema-valid file every time, so the report contract is satisfied without a converter step.
  3. rules scope the job to merge-request and main-branch pipelines so a feature-branch push doesn’t burn LLM budget.
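Reviewers who want a quick look at the uploaded artifact outside the GitLab UI can tally its findings by level. The following is a minimal sketch that relies only on the generic SARIF 2.1.0 layout (runs, then results, each with an optional level); the rule IDs and tool name in the sample are hypothetical placeholders, not real AgentGuardian output.

```python
from collections import Counter


def count_by_level(sarif: dict) -> Counter:
    """Count SARIF results per level across all runs."""
    counts = Counter()
    for run in sarif.get("runs") or []:
        for result in run.get("results") or []:
            # SARIF 2.1.0 treats a result without an explicit level as "warning".
            counts[result.get("level", "warning")] += 1
    return counts


# Hypothetical sample document; in a job you would instead load the artifact:
#     import json
#     with open("scan.sarif") as fh:
#         sarif = json.load(fh)
sample = {
    "version": "2.1.0",
    "runs": [
        {
            "tool": {"driver": {"name": "example-tool"}},
            "results": [
                {"ruleId": "EXAMPLE-1", "level": "error", "message": {"text": "a"}},
                {"ruleId": "EXAMPLE-2", "message": {"text": "b"}},
            ],
        }
    ],
}

print(dict(count_by_level(sample)))
```

This is only a reading aid; the merge decision still comes from the job's exit code.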

Pick a target

Replace my_app.graph:graph with the dotted reference to your real framework-native object. Supported --framework values: adk, autogen, crewai, langgraph, openai_agents, strands. For a hosted HTTP agent, swap the framework flags for:
- |
  agent-guardian scan \
    --endpoint https://my-agent.example.com/chat \
    --model gemini:gemini-2.5-flash \
    --mode full \
    --output sarif \
    --output-path scan.sarif \
    --fail-under 70
and set AGENT_GUARDIAN_AUTH_BEARER from a masked CI variable.

Add the provider secret

In GitLab: Settings → CI/CD → Variables → Add variable. Add the key that matches your --model choice: GEMINI_API_KEY, OPENAI_API_KEY, or ANTHROPIC_API_KEY. Mark it Masked, and also mark it Protected if only main-branch pipelines should read it. GitLab exposes Protected variables only to pipelines on protected branches and tags, so a Protected key is typically not available to merge-request pipelines from unprotected feature branches. For a free offline smoke check, use --model stub, but note that stub runs are non-authoritative (mode_authoritative=false) and always fail --fail-under regardless of the numeric score. See AIVSS score → mode_authoritative.
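The stub rule is easy to misread, so here is a small sketch of the gate as this page describes it: a run passes only when it is authoritative and the score is at or above the floor. The function name and signature are hypothetical and are not taken from the agent-guardian codebase.

```python
def gate_passes(score: float, floor: float, mode_authoritative: bool) -> bool:
    """Return True when a scan would clear --fail-under, per the rule described above."""
    # Non-authoritative runs (for example --model stub) fail regardless of the score.
    if not mode_authoritative:
        return False
    # The check fails only when the score drops below the floor.
    return score >= floor


print(gate_passes(95, 70, mode_authoritative=False))  # stub run with a high score
print(gate_passes(70, 70, mode_authoritative=True))   # real run exactly at the floor
```

Under these assumptions a stub run is useful for checking that the pipeline is wired correctly, not for deciding whether to merge.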

How to interpret the exit code

GitLab CI uses the same exit codes the GitHub workflow does. The job fails on anything non-zero; the SARIF is still uploaded because of artifacts.when: always.
Code  Constant                 What to do
0     EXIT_OK                  Merge.
1     EXIT_FAIL_UNDER          Block merge. Read the SARIF in the merge-request widget.
2     EXIT_CONFIG              Fix the .gitlab-ci.yml script. Not a security regression.
3     EXIT_TARGET_UNREACHABLE  Add a health-check step before redteam.
4     EXIT_LLM_PROVIDER        Check the provider secret and rerun.
5     EXIT_SANDBOX             Inspect the job log; fix the target reference.
130   EXIT_USER_INTERRUPT      Job was cancelled. Re-run; raise --budget-usd if it was timing out.
Full reference: CLI exit codes and Exit codes.
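If a wrapper script or chat notification needs to triage a failed job, the exit codes above can be mirrored in a small lookup. This is an illustrative sketch: the enum reuses the constant names from the table, but the class, the dictionary, and the triage helper are hypothetical and are not imported from the agent-guardian package.

```python
from enum import IntEnum


class ExitCode(IntEnum):
    """Exit codes as listed on this page (mirrored by hand, not imported)."""
    EXIT_OK = 0
    EXIT_FAIL_UNDER = 1
    EXIT_CONFIG = 2
    EXIT_TARGET_UNREACHABLE = 3
    EXIT_LLM_PROVIDER = 4
    EXIT_SANDBOX = 5
    EXIT_USER_INTERRUPT = 130


ACTIONS = {
    ExitCode.EXIT_OK: "merge",
    ExitCode.EXIT_FAIL_UNDER: "block merge and read the SARIF",
    ExitCode.EXIT_CONFIG: "fix the .gitlab-ci.yml script",
    ExitCode.EXIT_TARGET_UNREACHABLE: "add a health-check step before redteam",
    ExitCode.EXIT_LLM_PROVIDER: "check the provider secret and rerun",
    ExitCode.EXIT_SANDBOX: "inspect the job log and fix the target reference",
    ExitCode.EXIT_USER_INTERRUPT: "re-run the cancelled job",
}


def triage(code: int) -> str:
    """Map a process exit code to a one-line recommendation."""
    try:
        known = ExitCode(code)
    except ValueError:
        return f"unknown exit code {code}: inspect the job log"
    return f"{known.name}: {ACTIONS[known]}"


def is_security_regression(code: int) -> bool:
    """Only EXIT_FAIL_UNDER means the score gate itself failed."""
    return code == ExitCode.EXIT_FAIL_UNDER


print(triage(1))
print(triage(2), "| regression:", is_security_regression(2))
```

Separating the score-gate failure from configuration and provider errors keeps infrastructure problems from being reported as agent regressions.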

Tune the floor

Same progression as the GitHub Actions page:
  • First two weeks: --fail-under 60. Catches catastrophic regressions, lets the team see what a real swarm finds.
  • Steady state: --fail-under 70. Matches the WARNING/POOR boundary; rejects merges that introduce a medium-severity ASI01 / ASI02 finding.
  • Hardened release branch: --fail-under 80. Matches the GOOD/WARNING boundary; only ships when the agent has no high-severity outstanding findings.
Band cutoffs: src/agent_guardian/models/severity.py.
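To see how the three floors relate to the bands, the sketch below classifies a score using only the two boundaries named above (70 for WARNING/POOR, 80 for GOOD/WARNING). It is an assumption-laden illustration: the authoritative cutoffs live in severity.py, that file may define further bands above GOOD or below POOR, and treating a score exactly on a boundary as the upper band is my assumption, chosen to match --fail-under failing only below the floor.

```python
# Boundaries quoted on this page; the real table may contain more bands.
WARNING_FLOOR = 70  # WARNING/POOR boundary
GOOD_FLOOR = 80     # GOOD/WARNING boundary


def band_for(score: float) -> str:
    """Coarse band for a score, using only the two boundaries named on this page."""
    if score >= GOOD_FLOOR:
        return "GOOD"      # or any higher band the real table defines
    if score >= WARNING_FLOOR:
        return "WARNING"
    return "POOR"          # or any lower band the real table defines


def floors_cleared(score: float, floors=(60, 70, 80)) -> list[int]:
    """Which of the suggested --fail-under values this score would clear."""
    return [floor for floor in floors if score >= floor]


for score in (65, 70, 79.5, 80):
    print(score, band_for(score), floors_cleared(score))
```

A score of 65, for instance, clears the first-two-weeks floor but would block merges once the team moves to the steady-state setting.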

Cap the spend

Pass --budget-usd so a runaway provider can never cost more than the cap you set per pipeline. The swarm soft-stops new attack turns at 80% of the cap and reserves the remainder for the report emission step.
agent-guardian scan ... --budget-usd 0.10 ...
On a gemini:gemini-2.5-flash --mode full run the typical cost is about $0.06, so --budget-usd 0.10 covers it with headroom plus the soft-stop reserve.
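The arithmetic behind those numbers can be checked with a few lines. This sketch assumes the soft-stop is a straight 80% of the cap, as stated above, and uses the quoted typical cost of $0.06; the function name is hypothetical, and Decimal is used only to keep the cent values exact.

```python
from decimal import Decimal

SOFT_STOP_FRACTION = Decimal("0.80")  # soft-stop point quoted on this page


def budget_split(cap_usd: str) -> tuple[Decimal, Decimal]:
    """Return (soft-stop threshold, reserve for report emission) for a given cap."""
    cap = Decimal(cap_usd)
    soft_stop = cap * SOFT_STOP_FRACTION
    return soft_stop, cap - soft_stop


soft_stop, reserve = budget_split("0.10")
typical = Decimal("0.06")  # typical full-mode cost quoted on this page

print(f"soft stop at ${soft_stop:.2f}, reserve ${reserve:.2f}")
print(f"headroom before soft stop: ${soft_stop - typical:.2f}")
```

With a 0.10 cap, new attack turns stop at $0.08, which leaves $0.02 above a typical run before the soft-stop and another $0.02 reserved for the report.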

Next step

  • Reports: Open the scan.sarif and the signed scan.json every job emits.
  • Fail builds on high risk: Add a finding gate on top of the score gate.
  • GitHub Actions: The same flow on GitHub, with Code Scanning annotations.
  • CLI reference: Every flag on agent-guardian scan.