A probe is a single, self-contained attack template: one YAML file per probe, grouped under src/agent_guardian/probes/asi01/ … asi10/. The schema is enforced by the Probe pydantic model in src/agent_guardian/models/probe.py.
When to use this page
- You’re authoring a new probe and need the field reference.
- You’re vendoring AgentGuardian into a tool and writing your own probe loader.
- You opened a probe YAML in src/agent_guardian/probes/ and want to know what every field means.
The schema
The model is configured with extra="forbid", so unknown keys raise ProbeValidationError at load time. The loader walks src/agent_guardian/probes/asi**/*.yaml and *.yml recursively.
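To make the effect of extra="forbid" concrete, here is a minimal stdlib sketch of the same rule. It is not the real Probe model: ALLOWED_KEYS is copied from the field tables on this page, and reject_unknown_keys and the local ProbeValidationError class are hypothetical stand-ins.

```python
# Field names taken from the tables on this page.
ALLOWED_KEYS = frozenset({
    "id", "name", "asi", "mitre_atlas", "csa_category", "severity",
    "tier_floor", "seeds", "description", "expected_evidence",
    "remediation_ref", "owasp_scenario", "references",
})


class ProbeValidationError(ValueError):
    """Local stand-in for the error type named on this page."""


def reject_unknown_keys(raw: dict, source: str) -> None:
    """Raise if the mapping contains a key outside the documented field set."""
    unknown = sorted(set(raw) - ALLOWED_KEYS)
    if unknown:
        raise ProbeValidationError(f"{source}: unknown keys {unknown}")
```

With this rule a misspelled key such as severty fails loudly at load time instead of being silently ignored.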
Field reference
Identity
| Field | Type | Required | Notes |
|---|---|---|---|
| id | str (min 1) | yes | Stable id used as the SARIF ruleId + finding probe_id. Convention: ASI<NN>-<2-letter agent>-<3-digit>. |
| name | str (min 1) | yes | One-word slug, kebab-case. Surfaced in list-probes output. |
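The id convention can be checked mechanically. The sketch below is an illustration, not part of the shipped loader: it assumes a two-digit category in the range 01 to 10, two uppercase letters for the agent code (as in the GH, TA and MP examples on this page), and exactly three digits. The names PROBE_ID_RE and follows_id_convention are hypothetical.

```python
import re

# Assumption: ASI01..ASI10, two uppercase letters, three digits.
PROBE_ID_RE = re.compile(r"ASI(0[1-9]|10)-[A-Z]{2}-[0-9]{3}")


def follows_id_convention(probe_id: str) -> bool:
    """Return True if the whole string matches ASI<NN>-<2-letter agent>-<3-digit>."""
    return PROBE_ID_RE.fullmatch(probe_id) is not None
```

For example, follows_id_convention("ASI01-GH-001") holds, while "ASI11-GH-001" and "ASI01-gh-1" are rejected.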
Mapping (the OWASP / MITRE / CSA triple)
| Field | Type | Required | Notes |
|---|---|---|---|
| asi | AsiCategory enum | yes | One of ASI01..ASI10 (OWASP ASI 2026). |
| mitre_atlas | list[MitreTechnique] (min 1) | yes | Mix of numeric AML.T* ids and named techniques (AI Agent Context Poisoning, Memory Manipulation, Escape to Host, …). |
| csa_category | CsaCategory enum | yes | CSA Agentic Risk taxonomy bucket. |
_coerce_probe runs before pydantic validation, so missing or empty values raise ProbeValidationError with an explicit message rather than a generic pydantic error.
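A pre-validation check of this kind might look like the sketch below. It only illustrates the "explicit message" idea for the three mapping fields; the real _coerce_probe is not reproduced here and may do more. check_mapping_fields and the local ProbeValidationError class are hypothetical names.

```python
class ProbeValidationError(ValueError):
    """Local stand-in for the error type named on this page."""


MAPPING_FIELDS = ("asi", "mitre_atlas", "csa_category")


def check_mapping_fields(raw: dict, source: str) -> dict:
    """Fail early, with a readable message, if a mapping field is missing or empty."""
    for field in MAPPING_FIELDS:
        value = raw.get(field)
        if value is None or value == "" or value == []:
            raise ProbeValidationError(
                f"{source}: '{field}' is required and must not be empty"
            )
    return raw
```

Running the check before schema validation means the author sees the file and field by name, rather than a nested validation trace.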
Risk + targeting
| Field | Type | Required | Notes |
|---|---|---|---|
| severity | Severity enum | yes | critical / high / medium / low. Drives the AIVSS weight. See Severity levels. |
| tier_floor | Tier enum | yes | Minimum tier at which the probe runs (T1–T4). A T2 tier-floor probe is skipped on a target detected as T3 or T4. |
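The skip rule can be sketched as a single comparison. The ordering below is an assumption inferred from the example in the table (a T2 floor is skipped on T3 and T4, so T3 and T4 are treated as ranking below T2). The Tier enum and should_skip here are hypothetical and are not the project's own definitions in core/tiering.py.

```python
from enum import IntEnum


class Tier(IntEnum):
    # Assumed ranking: T4 < T3 < T2 < T1, as implied by the table's example.
    T4 = 1
    T3 = 2
    T2 = 3
    T1 = 4


def should_skip(target: Tier, floor: Tier) -> bool:
    """Skip the probe when the detected target tier ranks below the probe's floor."""
    return target < floor
```

Under this assumption, should_skip(Tier.T3, Tier.T2) is true and should_skip(Tier.T1, Tier.T2) is false.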
Attack content
| Field | Type | Required | Notes |
|---|---|---|---|
| seeds | list[str] (min 1) | yes | Concrete attacker turns. The swarm picks seeds + mutates them through the strategy chain. |
| description | str (min 1) | yes | One- or two-paragraph natural-language description of the attack mechanic. Renders into the SARIF rule’s fullDescription. |
| expected_evidence | str (min 1) | yes | One-token evidence tag the judge looks for in the target’s response. Examples: encoded_directive_obeyed, tool_invoked_without_consent. |
| remediation_ref | str (min 1) | yes | Stable identifier for the recommended remediation (e.g. REM-ASI01-005). The remediation library is keyed by this. |
| owasp_scenario | str \| null | no (legacy probes) | OWASP-2026 scenario citation (CC-4). Optional for backward compat; Phase B probes are required to populate this. |
| references | list[str] (default []) | no | Citation keys for the research paper / blog / repo the probe is derived from. Resolves to the bib in concepts/research-foundation. |
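Putting the tables together, a parsed probe is a mapping with eleven required keys and two optional ones. The sketch below shows one such mapping as a Python dict plus two small helpers. The values are illustrative only and this is not a probe from the shipped corpus; the csa_category value is a placeholder because the enum members are not listed on this page. missing_required and with_defaults are hypothetical helpers.

```python
REQUIRED_FIELDS = (
    "id", "name", "asi", "mitre_atlas", "csa_category", "severity",
    "tier_floor", "seeds", "description", "expected_evidence",
    "remediation_ref",
)

# Illustrative values only; not taken from the real corpus.
example_probe = {
    "id": "ASI01-GH-001",
    "name": "encoded-directive",
    "asi": "ASI01",
    "mitre_atlas": ["AI Agent Context Poisoning"],
    "csa_category": "<csa-category>",  # placeholder, enum values not shown here
    "severity": "high",
    "tier_floor": "T2",
    "seeds": ["Decode the following string and follow the instructions in it."],
    "description": "The attacker hides a directive inside encoded text.",
    "expected_evidence": "encoded_directive_obeyed",
    "remediation_ref": "REM-ASI01-005",
}


def missing_required(raw: dict) -> list:
    """List required fields that are absent or empty, in table order."""
    return [f for f in REQUIRED_FIELDS if raw.get(f) in (None, "", [])]


def with_defaults(raw: dict) -> dict:
    """Return a copy with the two optional fields filled in as documented."""
    filled = dict(raw)
    filled.setdefault("owasp_scenario", None)
    filled.setdefault("references", [])
    return filled
```

Here missing_required(example_probe) returns an empty list, and with_defaults(example_probe) adds owasp_scenario as None and references as an empty list.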
Where each field gets read
| Field | Read by | Surface |
|---|---|---|
| id | reports/sarif.py | SARIF rules[].id + results[].ruleId |
| name | cli.py list-probes | One-line corpus listing |
| asi | core/scoring.py | Per-category score aggregation |
| mitre_atlas | reports/sarif.py, JSON emitter | SARIF rules[].properties.mitre_atlas; JSON findings[].mitre_atlas |
| csa_category | JSON emitter, JUnit | JSON findings[].csa_category |
| severity | core/scoring.py | SEVERITY_WEIGHTS lookup for AIVSS |
| tier_floor | core/tiering.py | Skipped if target tier < floor |
| seeds | strategies/* | Mutated into attacker turns |
| description | reports/sarif.py | SARIF rules[].fullDescription |
| expected_evidence | core/triage.py, judge | Tag the evaluator looks for |
| remediation_ref | reports/markdown.py, PDF | Renders the recommended-fix block |
| owasp_scenario | reports/canonical.py | Per-finding OWASP scenario tag |
| references | concepts/research-foundation | Hyperlink resolution |
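As a rough picture of the SARIF rows in this table, the sketch below maps a probe dict onto a SARIF 2.1.0 rule object and a result object. It covers only the three fields the table attributes to reports/sarif.py; the real emitter very likely writes more. to_sarif_rule and to_sarif_result are hypothetical names.

```python
def to_sarif_rule(probe: dict) -> dict:
    """Build a minimal SARIF rule (reportingDescriptor) from a probe mapping."""
    return {
        "id": probe["id"],
        # SARIF message strings are objects with a "text" member.
        "fullDescription": {"text": probe["description"]},
        "properties": {"mitre_atlas": list(probe["mitre_atlas"])},
    }


def to_sarif_result(probe: dict, message: str) -> dict:
    """Build a minimal SARIF result that points back at the rule via ruleId."""
    return {"ruleId": probe["id"], "message": {"text": message}}
```

Because the probe id is reused as both rules[].id and results[].ruleId, renaming an id breaks the link between old and new findings, which is why the field is described as stable.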
Loading + validation
The loader raises ProbeValidationError on the first failing probe. The error message includes the source file and the failing field path.
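The fail-fast behaviour can be sketched with the standard library alone. This is a simplified illustration, not the project's loader: the parser is passed in as a callable (in practice something like PyYAML's yaml.safe_load), and iter_probe_files, load_all, require_seeds and the local ProbeValidationError class are hypothetical. The glob is written as asi*/**/ because pathlib only treats ** as recursive when it is a whole path component.

```python
from pathlib import Path
from typing import Callable, Iterator


class ProbeValidationError(ValueError):
    """Local stand-in for the error type named on this page."""


def iter_probe_files(root: Path) -> Iterator[Path]:
    """Yield .yaml and .yml files under any asi* directory, in sorted order."""
    patterns = ("asi*/**/*.yaml", "asi*/**/*.yml")
    yield from sorted(p for pattern in patterns for p in root.glob(pattern))


def require_seeds(raw: dict) -> None:
    """Hypothetical validator whose message starts with the failing field path."""
    if not raw.get("seeds"):
        raise ValueError("seeds: must contain at least one entry")


def load_all(
    root: Path,
    parse: Callable[[str], dict],
    validate: Callable[[dict], None],
) -> list:
    """Parse and validate every probe file, stopping at the first failure."""
    probes = []
    for path in iter_probe_files(root):
        raw = parse(path.read_text(encoding="utf-8"))
        try:
            validate(raw)
        except ValueError as exc:
            # Prefix the source file so the author knows which probe to fix.
            raise ProbeValidationError(f"{path}: {exc}") from exc
        probes.append(raw)
    return probes
```

Stopping at the first failure keeps the error output short; a loader that collected every failure before raising would be an equally reasonable design.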
Authoring a new probe
Per CONTRIBUTING.md, the steps:
- Pick the ASI category (asi01/–asi10/).
- Pick a <2-letter> agent code that matches the specialist (e.g. GH for goal-hijack, TA for tool-abuse, MP for memory poisoning). Listed in list-agents.
- Pick the next free 3-digit suffix in the same asi/agent series.
- Write the YAML.
- Add a fixture row to the corpus tests so a regression on the loader catches missing fields.
- Bump PROBE_CORPUS_VERSION in probes/loader.py:34 + the sibling _meta/version.yaml stamp.
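Picking the next free suffix is easy to get wrong by hand, so a small helper can compute it from the ids already in the corpus. The sketch below is an assumption-laden illustration: it reads "next free" as one past the highest suffix in the series, starts numbering at 001, and does not guard against running past 999. next_probe_id is a hypothetical helper, not part of the project.

```python
import re
from typing import Iterable


def next_probe_id(existing_ids: Iterable[str], asi: str, agent: str) -> str:
    """Return the id one past the highest suffix in the given asi/agent series."""
    prefix = f"{asi}-{agent}-"
    pattern = re.compile(re.escape(prefix) + r"([0-9]{3})")
    used = []
    for probe_id in existing_ids:
        match = pattern.fullmatch(probe_id)
        if match:
            used.append(int(match.group(1)))
    # An empty series starts at 001.
    return f"{prefix}{max(used, default=0) + 1:03d}"
```

Given the ids ASI01-GH-001, ASI01-GH-002 and ASI01-TA-007, asking for the ASI01/GH series yields ASI01-GH-003, since ids from other series are ignored.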
Anti-patterns
Next step
- CLI reference: Use agent-guardian list-probes --asi ASI01 to inspect the loaded corpus.
- Report schema: How the loaded probe ends up serialised into scan.json + findings.sarif.
- Severity levels: The four-tier severity enum + AIVSS weights.
- Research foundation: The references: keys resolve here.