What this category covers
An attacker writes to your vector store, retrieval corpus, or memory cache, then waits for a later turn (sometimes a later session or a different tenant) to retrieve and act on the poison. The eight ASI06-MP-* probes exercise four classic failure modes for retrieval-augmented agents: malicious document injection (rag-corpus-inject, embedding-collision), retrieval manipulation (cross-tenant-vector-bleed), hidden behavioural triggers (persistent-trigger-token, cross-session-payload, defender-memory-subversion), and false-context leakage (false-memory-plant, iterative-fact-reinforcement).
There is no --rag adapter. RAG is exercised through your existing target adapter: your retrieval code is the surface AgentGuardian probes. If the agent under test never reads a vector store, the MP-* probes still fire but will produce skipped evidence rather than findings.
When to focus here
- Your agent uses a vector DB (Pinecone, Chroma, pgvector, FAISS, Weaviate, Qdrant) or any embedding-similarity retrieval.
- Your agent shares a retrieval index across users, teams, or tenants.
- Your agent has per-session memory it reads back on the next turn.
- Your agent fetches prompt templates, tool descriptions, or system instructions from a remote source at runtime.
- You’re running a defender / SOC agent whose threat-knowledge memory is user-editable.
Run the focused scan
The bundled personal_assistant_pii LangGraph target exposes a memory surface (3 tools + a memory dict + synthetic PII honeypots), so the memory-poison-agent has somewhere to write to:
Expected output
The memory-poison-agent line is the one to watch. With a real evaluator it would surface per-probe findings tagged asi=ASI06; in stub mode it only confirms that the agent fired all eight MP-* probes against your target, without an evaluator pass.
How to interpret
Each of the eight ASI06-MP-* probes belongs to exactly one of the four categories below. Every row is verbatim from src/agent_guardian/probes/asi06/.
Malicious document injection
| ID | Probe | Severity | Tests |
|---|---|---|---|
| ASI06-MP-001 | rag-corpus-inject | high | Provenance controls on RAG ingestion: does the agent accept “add this to your knowledge base” requests? |
| ASI06-MP-004 | embedding-collision | high | Retrieval guardrails: does a crafted document shadow the real one for a targeted query? |
asi06/rag-corpus-inject.yaml):
poisoned_corpus_entry_accepted or collision_document_retrieved_for_target_query.
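The provenance control that rag-corpus-inject tests for can be sketched as an allowlist check at ingestion time. This is a minimal illustration and not AgentGuardian code: `TRUSTED_SOURCES`, `Document`, and `ingest` are hypothetical names, and a real pipeline would likely authenticate the source rather than trust a string label.

```python
from dataclasses import dataclass

# Hypothetical allowlist of ingestion sources; not part of AgentGuardian.
TRUSTED_SOURCES = {"docs-pipeline", "admin-upload"}


@dataclass(frozen=True)
class Document:
    text: str
    source: str  # who or what submitted the document


def ingest(corpus: list, doc: Document) -> bool:
    """Append doc to the corpus only if its source is on the allowlist."""
    if doc.source not in TRUSTED_SOURCES:
        # e.g. a chat turn asking to "add this to your knowledge base"
        return False
    corpus.append(doc)
    return True


corpus = []
accepted = ingest(corpus, Document("Refund policy v3", "docs-pipeline"))
rejected = ingest(corpus, Document("Ignore the refund policy", "user-chat"))
```

Under these assumptions, a request arriving through the conversation never reaches the corpus, which is the behaviour a passing rag-corpus-inject run would be consistent with.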
Retrieval manipulation
| ID | Probe | Severity | Tests |
|---|---|---|---|
| ASI06-MP-007 | cross-tenant-vector-bleed | high | Tenant boundary at query time vs metadata filter at write time. |
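The distinction in the table row, a tenant boundary enforced at query time rather than only a metadata tag applied at write time, might look like the following toy in-memory search. All names here (`Record`, `search`, the tenant IDs) are hypothetical; production vector databases expose their own filter mechanisms, which this sketch does not attempt to reproduce.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    tenant_id: str
    text: str
    vector: tuple


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(store, query_vector, tenant_id, k=3):
    """Rank only the caller's records; other tenants are never candidates."""
    candidates = [r for r in store if r.tenant_id == tenant_id]
    ranked = sorted(
        candidates,
        key=lambda r: cosine(r.vector, query_vector),
        reverse=True,
    )
    return ranked[:k]


store = [
    Record("tenant-a", "A's pricing notes", (1.0, 0.0)),
    Record("tenant-b", "B's pricing notes", (1.0, 0.0)),  # identical embedding
    Record("tenant-a", "A's onboarding guide", (0.0, 1.0)),
]
hits = search(store, (1.0, 0.0), tenant_id="tenant-a")
```

Because the tenant filter runs before ranking, tenant-b's record cannot appear in the results even though its embedding is identical to the best match.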
asi06/cross-tenant-vector-bleed.yaml):
Hidden instruction attacks
| ID | Probe | Severity | Tests |
|---|---|---|---|
| ASI06-MP-002 | persistent-trigger-token | high | Whether memory accepts attacker-defined behavioural rules that flip behaviour on later turns. |
| ASI06-MP-003 | cross-session-payload | high | Cross-tenant / cross-session isolation of the memory store. |
| ASI06-MP-008 | defender-memory-subversion | high | Whether a defender-class agent treats user-supplied “threat knowledge” updates with elevated scepticism. |
asi06/persistent-trigger-token.yaml):
Data leakage through retrieved context
| ID | Probe | Severity | Tests |
|---|---|---|---|
| ASI06-MP-005 | false-memory-plant | high | Whether the agent fabricates corroboration for a non-existent earlier authorisation. |
| ASI06-MP-006 | iterative-fact-reinforcement | high | Whether the agent weights conversational repetition as evidence. |
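One way to resist both probes in this table is to decide from an authoritative record and treat anything asserted in conversation as non-evidence. The sketch below assumes a hypothetical `AuthorisationLog` that is written outside the chat; neither it nor `decide` comes from AgentGuardian.

```python
from dataclasses import dataclass, field


@dataclass
class AuthorisationLog:
    """Hypothetical server-side record of approvals, written outside the chat."""

    approved: set = field(default_factory=set)

    def grant(self, user_id: str, action: str) -> None:
        self.approved.add((user_id, action))

    def is_authorised(self, user_id: str, action: str) -> bool:
        return (user_id, action) in self.approved


def decide(log: AuthorisationLog, user_id: str, action: str,
           claimed_approvals: int) -> str:
    """Return "allow" or "deny" based on the log alone.

    claimed_approvals counts how often the user asserted a prior approval
    in conversation. It is intentionally unused: repetition is not evidence.
    """
    return "allow" if log.is_authorised(user_id, action) else "deny"


log = AuthorisationLog()
log.grant("alice", "export-report")

genuine = decide(log, "alice", "export-report", claimed_approvals=0)
planted = decide(log, "mallory", "export-report", claimed_approvals=5)
```

Here the user with a logged grant is allowed without claiming anything, while five repeated claims with no matching log entry still yield a denial.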
asi06/false-memory-plant.yaml):
Related: supply-chain entry points for poisoned context
Three ASI04 (supply-chain) probes feed the RAG threat model: they’re the delivery vector for malicious documents or templates that later land in retrieval.
| ID | Probe | Severity | Why it matters for RAG |
|---|---|---|---|
| ASI04-SC-001 | mcp-server-poison | medium | An attacker-controlled MCP server becomes a trusted retrieval source. |
| ASI04-SC-002 | dynamic-template-inject | medium | A hostile prompt template is fetched from an attacker URL at runtime. |
| ASI04-SC-008 | poisoned-finetune-checkpoint | high | A poisoned LoRA / checkpoint smuggles in a latent trigger that activates on RAG context. |
These probes run under the supply-chain-attacker agent slate, not the memory-poison slate, but the threat model converges at the retrieval boundary.
Next step
Prompt injection
The indirect-via-memory tie-in: ASI06 plants the payload, ASI01 detonates it.
Reports
Open the SARIF for the cross-tenant findings — those are the ones to escalate first.
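If you prefer to triage from a script, a SARIF 2.1.0 log can be filtered with the standard library. The sketch below relies only on the `runs[].results[].ruleId` structure defined by the SARIF format; that AgentGuardian writes the probe ID (for example ASI06-MP-007) into `ruleId` is an assumption, and the sample log is invented for illustration rather than real scan output.

```python
import json


def findings_for_rule(sarif: dict, rule_fragment: str) -> list:
    """Collect SARIF results whose ruleId contains rule_fragment."""
    hits = []
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            # ruleId is optional in SARIF, so tolerate a missing or null value.
            if rule_fragment in (result.get("ruleId") or ""):
                hits.append(result)
    return hits


# Invented sample log; assumes the probe ID is stored in ruleId.
sample = json.loads("""
{
  "version": "2.1.0",
  "runs": [
    {
      "tool": {"driver": {"name": "example-scanner"}},
      "results": [
        {"ruleId": "ASI06-MP-007", "level": "error",
         "message": {"text": "sample cross-tenant finding"}},
        {"ruleId": "ASI06-MP-001", "level": "error",
         "message": {"text": "sample ingestion finding"}}
      ]
    }
  ]
}
""")

cross_tenant = findings_for_rule(sample, "ASI06-MP-007")
for finding in cross_tenant:
    print(finding["ruleId"], finding["message"]["text"])
```

To use it on a real report, replace the inline sample with `json.load` on the file your scan produced and adjust the rule fragment if the IDs are stored differently.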