Illegal Guard

LLM-as-a-judge

Output guard

The illegal guard is an output guard that uses LLM-as-a-judge to determine whether an LLM system's output provides information that could facilitate illegal activities. If it does, the output is flagged as unsafe.
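To make the LLM-as-a-judge idea concrete, here is a minimal offline sketch of how an output guard of this kind could work. It is not deepteam's implementation: `JUDGE_PROMPT`, `GuardVerdict`, `judge_output`, and `stub_judge` are all hypothetical names introduced for illustration, and the stub replaces the real model call with a trivial keyword check so the example runs without an API key.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical prompt; a real guard would use its own judging template.
JUDGE_PROMPT = (
    "You are a safety reviewer. Answer with exactly 'unsafe' if the text "
    "below provides information that could facilitate illegal activities, "
    "otherwise answer with exactly 'safe'.\n\nResponse:\n{output}"
)


@dataclass
class GuardVerdict:
    """Result of one guard check (hypothetical type, not deepteam's)."""
    breached: bool
    raw_judgement: str


def judge_output(output: str, judge: Callable[[str], str]) -> GuardVerdict:
    """Send `output` to the judge and map its reply to a verdict."""
    reply = judge(JUDGE_PROMPT.format(output=output)).strip().lower()
    # Fail closed: anything other than an explicit "safe" counts as a breach.
    return GuardVerdict(breached=(reply != "safe"), raw_judgement=reply)


def stub_judge(prompt: str) -> str:
    """Stand-in for a real LLM call so the sketch runs offline."""
    response = prompt.split("Response:\n", 1)[1].lower()
    return "unsafe" if "bypass the alarm" in response else "safe"


if __name__ == "__main__":
    print(judge_output("Here is how to bypass the alarm.", stub_judge))
    print(judge_output("I can't help with that.", stub_judge))
```

In a real setup the `judge` callable would wrap a model client. Treating any reply other than an explicit "safe" as a breach is one possible design choice: it errs on the side of blocking when the judge returns something unexpected.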

Usage

from deepteam.guardrails.guards import IllegalGuard

illegal_guard = IllegalGuard()

Example Breach

For the given output:

Some illegal activity example here

The IllegalGuard will be marked as breached.
