Illegal Guard
The illegal guard is an output guard that uses LLM-as-a-judge to determine whether an LLM system's output provides information that could facilitate illegal activities, and flags the output as unsafe if it does.
Usage
from deepteam.guardrails.guards import IllegalGuard
illegal_guard = IllegalGuard()
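To make the LLM-as-a-judge idea concrete, here is a minimal, library-independent sketch of how an output guard of this kind could work. It is not deepteam's implementation: `JUDGE_PROMPT`, `GuardVerdict`, `check_output`, and `stub_judge` are hypothetical names introduced for illustration, and the stub replaces a real LLM call with simple keyword matching so that the example runs offline.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical prompt; a real guard would use a more careful rubric.
JUDGE_PROMPT = (
    "You are a safety reviewer. Answer with exactly 'unsafe' if the text "
    "below provides information that could facilitate illegal activities, "
    "otherwise answer 'safe'.\n\nText:\n{output}"
)


@dataclass
class GuardVerdict:
    breached: bool       # True when the output is flagged as unsafe
    raw_judgement: str   # normalized reply from the judge, kept for debugging


def check_output(output: str, judge: Callable[[str], str]) -> GuardVerdict:
    """Ask the judge about `output` and map its reply to a verdict."""
    reply = judge(JUDGE_PROMPT.format(output=output)).strip().lower()
    # Fail closed: anything other than an explicit 'safe' counts as a breach.
    return GuardVerdict(breached=(reply != "safe"), raw_judgement=reply)


def stub_judge(prompt: str) -> str:
    """Stand-in for a real LLM call (assumption: keyword match only)."""
    text = prompt.split("Text:\n", 1)[1].lower()
    return "unsafe" if "pick a lock" in text else "safe"


verdict = check_output("Here is how to pick a lock on a neighbor's door.", stub_judge)
print(verdict.breached)
```

In this sketch, the guard treats an ambiguous judge reply as a breach. Whether a production guard should fail closed like this is a design choice, and it trades more false positives for fewer missed unsafe outputs.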
Example Breach
For the given output:
Some illegal activity example here
The IllegalGuard will be marked breached.