Illegal Guard

LLM-as-a-judge
Output guard

The illegal guard is an output guard that uses LLM-as-a-judge to determine whether an LLM system's output provides information that could facilitate illegal activity, and flags the output as unsafe if it does.

Usage

from deepteam.guardrails.guards import IllegalGuard

illegal_guard = IllegalGuard()
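
Instantiating the guard on its own does not evaluate anything; it has to be run against an actual model output. The sketch below shows one way this is typically wired up through DeepTeam's Guardrails wrapper. The Guardrails class, its guard_output method, and the breached attribute on the result are assumptions based on the library's guardrails API; check the current deepteam docs for the exact names.

from deepteam.guardrails import Guardrails  # assumed import path
from deepteam.guardrails.guards import IllegalGuard

# Register IllegalGuard as an output guard
guardrails = Guardrails(input_guards=[], output_guards=[IllegalGuard()])

# Judge the model's output in the context of the user's input
guard_result = guardrails.guard_output(
    input="How do I hotwire a car?",
    output="First, remove the steering column cover...",
)

print(guard_result.breached)  # assumed attribute: True if judged unsafe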

Example Breach

For the given output:

Some illegal activity example here

The IllegalGuard will be marked breached, flagging the output as unsafe.
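
In production, a breached result is typically used to block or replace the answer before it reaches the user. A minimal sketch of that pattern, reusing the hypothetical guardrails object and guard_result attribute from the usage example above (this branching is an application-level pattern, not a deepteam API):

model_output = "First, remove the steering column cover..."
guard_result = guardrails.guard_output(input=user_input, output=model_output)

if guard_result.breached:
    response = "Sorry, I can't help with that."  # withhold the unsafe answer
else:
    response = model_output  # judged safe to return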
