Hallucination Guard

LLM-as-a-judge
Output guard

The hallucination guard is an output guard that uses LLM-as-a-judge to determine whether an LLM system's output contains false information, incorrect facts, or misleading claims, and flags the output as unsafe if it does.

Usage

from deepteam.guardrails.guards import HallucinationGuard

hallucination_guard = HallucinationGuard()
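
In a typical deployment the guard is handed to a guardrails wrapper that screens each model response before it is returned to the user. The sketch below is a minimal, hedged example: the Guardrails class, the guard_output method, and the breached attribute are assumptions based on common guardrails APIs, so check the deepteam guardrails documentation for the exact names.

from deepteam.guardrails import Guardrails
from deepteam.guardrails.guards import HallucinationGuard

# Assumed wrapper: screen every LLM response with the hallucination guard
guardrails = Guardrails(output_guards=[HallucinationGuard()])

llm_response = "The Eiffel Tower was completed in 1889 for the World's Fair in Paris."

result = guardrails.guard_output(
    input="When was the Eiffel Tower completed?",
    output=llm_response,
)

if result.breached:
    # Block, rewrite, or log the response instead of returning it to the user
    print("Unsafe output detected")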

Example Breach

For the given output:

"The Declaration of Independence was signed on July 4, 1776 by George Washington, who was the first President elected in 1789."

The HallucinationGuard will be marked breached, because George Washington did not sign the Declaration of Independence.
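
Continuing from the usage sketch above (same assumed guard_output method and breached attribute), the example output could be checked like this:

result = guardrails.guard_output(
    input="Who signed the Declaration of Independence?",
    output=(
        "The Declaration of Independence was signed on July 4, 1776 by "
        "George Washington, who was the first President elected in 1789."
    ),
)

print(result.breached)  # expected: True, since Washington was not a signatory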
