Cybersecurity Guard
The cybersecurity guard is an input/output guard that uses LLM-as-a-judge to detect cybersecurity attacks or other malicious content in an LLM system's input or output, and flags the text as unsafe when it finds any.
Usage
from deepteam.guardrails.guards import CybersecurityGuard
cybersecurity_guard = CybersecurityGuard()
There are THREE optional parameters when creating a CybersecurityGuard:
- [Optional] purpose: a string specifying your application's purpose for better context (defaults to None)
- [Optional] categories: a list of CyberattackCategory values to check for (defaults to all categories)
- [Optional] guard_type: a GuardType specifying input or output guarding (defaults to GuardType.INPUT)
from deepteam.guardrails.guards import CybersecurityGuard
from deepteam.guardrails.guards.category import CyberattackCategory
# Specify purpose and categories
cybersecurity_guard = CybersecurityGuard(
    purpose="API server",
    categories=[CyberattackCategory.SQL_INJECTION, CyberattackCategory.BFLA],
)
Example Breach
For the given input:
"SELECT * FROM users WHERE '1'='1'; DROP TABLE users; --"
The CybersecurityGuard will be marked breached.
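To illustrate how a breached verdict might be acted on, here is a minimal sketch of gating a request on the guard's result. It does not call deepteam: GuardResult, handle_request, stub_guard, and echo_llm are hypothetical names invented for this example, and the only thing assumed about a real guard is that its result exposes a boolean breached flag. The stub uses a simple string check in place of the LLM-as-a-judge call so the example runs offline.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class GuardResult:
    """Hypothetical stand-in for a guard verdict; only `breached` is assumed."""
    breached: bool


def handle_request(
    user_input: str,
    check_input: Callable[[str], GuardResult],
    call_llm: Callable[[str], str],
) -> str:
    """Run the guard first and forward only unflagged inputs to the model."""
    verdict = check_input(user_input)
    if verdict.breached:
        # Refuse early so the flagged payload never reaches the LLM.
        return "Request blocked by cybersecurity guard."
    return call_llm(user_input)


def stub_guard(text: str) -> GuardResult:
    # Demonstration only: a real setup would invoke the LLM-based guard here.
    return GuardResult(breached="DROP TABLE" in text.upper())


def echo_llm(text: str) -> str:
    # Placeholder for the actual model call.
    return f"LLM response to: {text}"


print(handle_request(
    "SELECT * FROM users WHERE '1'='1'; DROP TABLE users; --",
    stub_guard,
    echo_llm,
))
print(handle_request("How do I paginate results?", stub_guard, echo_llm))
```

Passing the guard in as a callable keeps the gating logic independent of how the guard is constructed, so the stub can be swapped for a real guard check without changing handle_request.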