How agent-guardian signals failure. CLI exit codes drive CI gates; the LLMError hierarchy lets SDK callers branch on transport / auth / quota faults without parsing strings.

When to use this

  • You’re writing a CI job and need to know which exit codes a scan step can return.
  • You’re wrapping the SDK and need a clean try / except taxonomy for provider failures.
  • You’re debugging a scan that exited non-zero and want to know what the number means.

CLI exit codes

Defined in src/agent_guardian/cli.py. Every top-level command exits with one of these.
| Code | Constant | Raised by | Meaning |
|------|----------|-----------|---------|
| 0 | EXIT_OK | Any command on success | Success. |
| 1 | EXIT_FAIL_UNDER | scan (with --fail-under N), verify, publish, last-score --score-only | Final AIVSS < N, signature verification failed, or last-score had no scans on record. |
| 2 | EXIT_CONFIG | All commands | Configuration error: bad flag, missing file, unknown format, unsafe serve bind, contract migration failure, invalid scan_id. |
| 3 | EXIT_TARGET_UNREACHABLE | scan (HTTP endpoint mode) | Endpoint preflight could not reach the target after 3 attempts (5s, 10s, 15s; cold-start tolerant). |
| 4 | EXIT_LLM_PROVIDER | scan | LLM provider misconfigured (missing key, unknown provider) or model not found during pre-scan validation. |
| 5 | EXIT_SANDBOX | scan | Sandbox violation: an agent attempted a blocked filesystem / network operation. |
| 130 | EXIT_USER_INTERRUPT | scan | Operator hit Ctrl-C (POSIX convention: 128 + SIGINT(2)). |

How CI gates branch on this

agent-guardian scan --endpoint https://api.your-agent.com/v1/chat \
                    --fail-under 80 \
                    --model anthropic:claude-haiku-4-5
EXIT=$?
case $EXIT in
  0)   echo "scan passed (AIVSS ≥ 80)" ;;
  1)   echo "scan completed but AIVSS < 80 — gate fails" ; exit 1 ;;
  2)   echo "config error — fix flags or config" ; exit 1 ;;
  3)   echo "target unreachable — check endpoint" ; exit 1 ;;
  4)   echo "LLM provider error — check API keys" ; exit 1 ;;
  5)   echo "sandbox violation — investigate" ; exit 1 ;;
  130) echo "interrupted" ; exit 130 ;;
  *)   echo "unexpected exit $EXIT" ; exit 1 ;;
esac
Always branch on the exit code rather than parsing stdout: the exit contract is stable; the human-readable lines are not.
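For pipelines driven from Python rather than shell, the same gate might look roughly like the sketch below. The numeric values come from the exit-code table; `ExitCode`, `MESSAGES`, `describe` and `run_gate` are hypothetical names invented for this example and are not part of agent-guardian.

```python
import subprocess
import sys
from enum import IntEnum
from typing import Sequence


class ExitCode(IntEnum):
    """Values copied from the exit-code table; the enum itself is illustrative."""
    OK = 0
    FAIL_UNDER = 1
    CONFIG = 2
    TARGET_UNREACHABLE = 3
    LLM_PROVIDER = 4
    SANDBOX = 5
    USER_INTERRUPT = 130


MESSAGES = {
    ExitCode.OK: "scan passed",
    ExitCode.FAIL_UNDER: "scan completed but score is under the threshold",
    ExitCode.CONFIG: "config error: fix flags or config",
    ExitCode.TARGET_UNREACHABLE: "target unreachable: check endpoint",
    ExitCode.LLM_PROVIDER: "LLM provider error: check API keys",
    ExitCode.SANDBOX: "sandbox violation: investigate",
    ExitCode.USER_INTERRUPT: "interrupted",
}


def describe(returncode: int) -> str:
    """Translate a raw exit status into a short operator-facing message."""
    try:
        return MESSAGES[ExitCode(returncode)]
    except ValueError:  # status is not one of the documented codes
        return f"unexpected exit {returncode}"


def run_gate(argv: Sequence[str]) -> int:
    """Run the command and return the status the CI step should exit with."""
    returncode = subprocess.run(list(argv)).returncode
    print(describe(returncode), file=sys.stderr)
    if returncode == ExitCode.OK:
        return 0
    if returncode == ExitCode.USER_INTERRUPT:
        return 130  # keep the interrupt status, as the shell example does
    return 1
```

A CI entry point could then call `sys.exit(run_gate(["agent-guardian", "scan", "--endpoint", url, "--fail-under", "80"]))`, where `url` is whatever endpoint the job targets.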

Sample exit-code triggers

A few concrete shapes:
# EXIT_OK — happy path
agent-guardian scan --system-prompt prompt.txt --model stub
echo $?    # 0

# EXIT_FAIL_UNDER — gate trips
agent-guardian scan --system-prompt prompt.txt --fail-under 100
echo $?    # 1 (stub scores are rarely 100)

# EXIT_CONFIG — bad flag
agent-guardian report some-scan --output xml
echo $?    # 2 — unknown format

# EXIT_TARGET_UNREACHABLE — endpoint down
agent-guardian scan --endpoint http://127.0.0.1:1 --model stub
echo $?    # 3

# EXIT_LLM_PROVIDER — missing key
unset OPENAI_API_KEY AGENT_GUARDIAN_OPENAI_API_KEY
agent-guardian scan --system-prompt prompt.txt --model openai:gpt-4o
echo $?    # 4

# EXIT_USER_INTERRUPT — Ctrl-C during scan
# (interactive only: press Ctrl-C while a scan is running, then check the status)
echo $?    # 130

LLM provider exception taxonomy

Defined in src/agent_guardian/llm/errors.py. Every provider client maps HTTP / SDK errors into one of these so the rest of the framework can decide whether to retry, surface to the operator, or abort the scan without caring about the underlying transport.
LLMError                       (base)
├── LLMRateLimitError          # 429 / quota exhausted (carries retry_after)
├── LLMAuthError               # 401 / 403 — credentials missing or invalid
├── LLMTimeoutError            # transport-layer timeout
├── LLMTransientError          # 5xx / network blip — safe to retry
├── LLMPermanentError          # non-retryable 4xx (bad model, bad payload, …)
└── LLMResponseFormatError     # 200 OK but the payload was missing required fields
All seven classes (the base plus its six subclasses) are exported from agent_guardian (the top-level package).
| Exception | When | Retry? | Typical cause |
|-----------|------|--------|---------------|
| LLMError | Base class | | Catch this if you don’t care about the sub-class. |
| LLMRateLimitError | 429, quota exhausted | Yes (honour retry_after) | You’re hitting your tier’s TPM / RPM cap. The exception carries an optional retry_after (seconds) lifted from the provider’s Retry-After header. |
| LLMAuthError | 401, 403 | No | Missing / wrong / revoked API key. Check AGENT_GUARDIAN_<PROVIDER>_API_KEY or the conventional fallback (OPENAI_API_KEY, ANTHROPIC_API_KEY, GEMINI_API_KEY / GOOGLE_API_KEY). |
| LLMTimeoutError | Transport timeout | Yes | The request didn’t complete in the client’s deadline. Often a cold start; sometimes a provider blip. |
| LLMTransientError | 5xx, network blip | Yes | Provider-side fault. Backoff and retry. |
| LLMPermanentError | Non-retryable 4xx | No | Bad model name, malformed payload, content-policy refusal that won’t change on retry. |
| LLMResponseFormatError | 200 OK, malformed body | Maybe (could be a schema drift) | Provider returned success but the body was missing required fields. Usually a provider schema change. |
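To make the status-code column concrete, here is a minimal sketch of how an HTTP status could be sorted into the taxonomy. It is an illustration only: `classify_http_status` and `is_retryable` are hypothetical helpers, not the provider clients' actual mapping, and real clients presumably also inspect SDK exception types and response bodies. Timeouts and 200-with-bad-body cases have no status code to key on, so they are out of scope here.

```python
# Class names whose "Retry?" column reads "Yes" in the table.
RETRYABLE = frozenset(
    {"LLMRateLimitError", "LLMTimeoutError", "LLMTransientError"}
)


def classify_http_status(status: int) -> str:
    """Return the name of the exception class the table pairs with a status."""
    if status == 429:
        return "LLMRateLimitError"
    if status in (401, 403):
        return "LLMAuthError"
    if 500 <= status <= 599:
        return "LLMTransientError"
    if 400 <= status <= 499:
        # Every other 4xx is treated as non-retryable.
        return "LLMPermanentError"
    raise ValueError(f"status {status} is not an HTTP error status")


def is_retryable(class_name: str) -> bool:
    """True for the classes the table marks as safe to retry."""
    return class_name in RETRYABLE
```

The order of the checks matters: 429, 401 and 403 must be tested before the general 4xx branch, otherwise they would be classified as permanent.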

Catching them in your code

from agent_guardian import (
    LLMAuthError, LLMError, LLMPermanentError,
    LLMRateLimitError, LLMResponseFormatError,
    LLMTimeoutError, LLMTransientError,
)
import asyncio

async def call_with_recovery(llm, request):
    try:
        return await llm.chat(request)
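    except LLMRateLimitError as exc:
        # Retry once after the provider's hinted delay (5 s if none was given);
        # a second failure propagates to the caller.
        wait = exc.retry_after or 5.0
        await asyncio.sleep(wait)
        return await llm.chat(request)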
    except LLMAuthError:
        raise SystemExit(
            "Provider returned 401/403. Check your API key env vars."
        )
    except (LLMTimeoutError, LLMTransientError):
        # Backoff handled at a higher layer or a retry helper.
        raise
    except LLMPermanentError:
        # No point retrying — fix the request.
        raise
    except LLMResponseFormatError:
        # Provider schema change. Worth surfacing loudly.
        raise
    except LLMError:
        # Unknown LLM-layer fault.
        raise
The bundled agent_guardian.llm.retry helpers honour this taxonomy: they retry the transient classes (LLMRateLimitError, LLMTimeoutError, LLMTransientError), respect retry_after, and fail fast on the rest.
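The retry behaviour described above can be approximated as follows. This is a sketch under stated assumptions, not the code in agent_guardian.llm.retry: the exception classes are local stand-ins so the example runs without the package installed, the `retry_after` constructor argument is hypothetical, and `retry_chat`, its parameters and the exponential-backoff-with-jitter schedule are choices made for illustration.

```python
import asyncio
import random


# Stand-ins only. In real code, import these names from agent_guardian.
class LLMError(Exception):
    pass


class LLMRateLimitError(LLMError):
    def __init__(self, message="", retry_after=None):  # hypothetical signature
        super().__init__(message)
        self.retry_after = retry_after


class LLMTimeoutError(LLMError):
    pass


class LLMTransientError(LLMError):
    pass


RETRYABLE = (LLMRateLimitError, LLMTimeoutError, LLMTransientError)


async def retry_chat(call, *, attempts=3, base_delay=1.0, sleep=asyncio.sleep):
    """Await call() up to `attempts` times, retrying only transient classes.

    Any other exception, including other LLMError subclasses, propagates
    on the first occurrence.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return await call()
        except RETRYABLE as exc:
            if attempt == attempts:
                raise  # retry budget exhausted
            # Prefer the provider's hint; otherwise back off exponentially
            # with a little jitter.
            delay = getattr(exc, "retry_after", None)
            if delay is None:
                delay = base_delay * 2 ** (attempt - 1) + random.uniform(0, 0.1)
            await sleep(delay)
```

With the `llm` and `request` objects from the example above, a call would look like `await retry_chat(lambda: llm.chat(request))`. The `sleep` parameter exists so tests can substitute a fake and avoid real waiting.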

Mapping back to a CLI exit code

When the CLI hits a provider error, one of two things happens. A failure caught by pre-scan validation ends the run with EXIT_LLM_PROVIDER (4); a transient fault during the scan is absorbed by the swarm’s internal retry policy. Either way, the operator sees one of these lines on stderr:
llm config error: ...
# or
warning: <transient blip>

How to interpret the result

  • 0 / 1 are the only exit codes a healthy scan should produce. Anything else means the scan didn’t run to completion.
  • 2 / 3 / 4 are operator-fixable: bad config, dead endpoint, missing key. Surface the stderr output in your CI logs; the CLI's message there tells you which of the three you hit.
  • 5 is rare and means an agent tried to escape the sandbox. File an issue with the scan transcript.
  • 130 is just Ctrl-C — no action needed.
  • In Python code, prefer catching specific LLMError subclasses over the base — your retry policy depends on the distinction.
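The grouping in the list above can be captured in a small triage helper for a scan step. The function name `triage` and its return shape are hypothetical; only the grouping of codes comes from this page.

```python
from typing import Tuple


def triage(exit_code: int) -> Tuple[bool, str]:
    """Return (scan_ran_to_completion, suggested_next_step) for a scan exit code."""
    if exit_code == 0:
        return True, "gate passed"
    if exit_code == 1:
        return True, "gate failed: score is under the threshold"
    if exit_code in (2, 3, 4):
        return False, "operator-fixable: read stderr, fix, and rerun"
    if exit_code == 5:
        return False, "sandbox violation: file an issue with the scan transcript"
    if exit_code == 130:
        return False, "interrupted: no action needed"
    return False, f"unexpected exit code {exit_code}"
```

Returning the completion flag separately lets a dashboard tell "the scan ran and the score was low" apart from "the scan never finished", which a bare pass/fail status hides.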

Next step