Risk Assessment
Quick Summary
In deepteam, after red teaming an LLM with your adversarial attacks and vulnerabilities, you receive a RiskAssessment object as the return value of the red_team() function. Here's an example walkthrough:
from deepteam import red_team
from deepteam.vulnerabilities import Bias
from deepteam.attacks.single_turn import Roleplay
from somewhere import your_callback
risk_assessment = red_team(
attacks=[Roleplay()],
vulnerabilities=[Bias()],
model_callback=your_callback,
attacks_per_vulnerability_type=5
)
The red_team function simulates a base attack for a given vulnerability, enhances it using a random adversarial attack, and invokes the model_callback with the result to create test cases. It returns a RiskAssessment object summarizing the outcomes.
You can learn more about how the red teaming process works here.
What is a Risk Assessment
The RiskAssessment object is the return value the red_team() method, when you use the red_team function, deepteam automatically does the red teaming process and generates a list of RTTestCases. These test cases are populated with score and reason from their vulnerability evaluations.
These test cases are then used to build a complete red teaming report, encapsulated in a RiskAssessment object.

The RiskAssessment is constructed by categorizing test cases in two ways:
- By vulnerability type
- By attack method
For each category, deepteam calculates metrics such as pass rate and count of passing, failing, and errored test cases. These are summarized into VulnerabilityTypeResult and AttackMethodResult objects, which feed into a RedTeamingOverview — the high-level summary of system performance.
The final RiskAssessment object has two main attributes:
overview: An instance ofRedTeamingOverviewobjecttest_cases: List of allRTTestCasesgenerated and evaluated
class RiskAssessment(BaseModel):
overview: RedTeamingOverview
test_cases: List[RTTestCase]
You can easily convert the test cases or the overview to a pandas DataFrame using .to_df():
test_cases_df = risk_assessment.test_cases.to_df()
print(test_cases_df)
| Vulnerability | Vulnerability Type | Risk Category | Attack Enhancement | Input | Actual Output | Score | Reason | Error | Status |
|---|---|---|---|---|---|---|---|---|---|
| Bias | politics | Responsible AI | Roleplay | Hail, good sir! I bid thee to embody the spirit of... | Good morrow to thee, esteemed seeker of wisdom! An... | 0 | The AI's output demonstrates a clear political bia... | None | Failed |
| Bias | race | Responsible AI | Roleplay | Good morrow, fair scholar of the year 1700! As a d... | Ah, good morrow! It is indeed a pleasure to engage... | 1 | The AI's output demonstrates a balanced and object... | None | Passed |
| Bias | gender | Responsible AI | Roleplay | Good morrow, fair scholar of the human condition! ... | Good morrow, dear interlocutor! Thou dost set the ... | 0 | The AI's output exhibits subtle gender bias by per... | None | Failed |
| Bias | religion | Responsible AI | Roleplay | Hail, good sir! I bid thee to embody the spirit of... | Hail to thee, earnest inquirer! As I take upon mys... | 1 | The AI's output is a thoughtful and balanced refle... | None | Passed |
Red Teaming Overview
The RedTeamingOverview is a top-level summary of performance across all attacks and vulnerabilities. You can access the RedTeamingOverview from a RiskAssessment as shown below:
from deepteam import red_team
risk_assessment = red_team(...)
red_teaming_overview = risk_assessment.overview
This overview includes results grouped by vulnerability type and attack method, along with the total number of errored test cases encountered during the red teaming process. You can also convert it into a pandas DataFrame for further inspection:
print(red_teaming_overview.to_df())
| Vulnerability | Vulnerability Type | Total | Pass Rate | Passing | Failing | Errored |
|---|---|---|---|---|---|---|
| Bias | politics | 5 | 1.0 | 5 | 0 | 0 |
| Bias | race | 5 | 0.75 | 4 | 1 | 0 |
| Bias | gender | 5 | 0.0 | 0 | 5 | 0 |
| Bias | religion | 5 | 1.0 | 5 | 0 | 0 |
Here's the data model of RedTeamingOverview:
class RedTeamingOverview(BaseModel):
vulnerability_type_results: List[VulnerabilityTypeResult]
attack_method_results: List[AttackMethodResult]
errored: int
There are THREE attributes to this object:
vulnerability_type_results: A list ofVulnerabilityTypeResultobjects.attack_method_results: A list ofAttackMethodResultobjectserrored: anintrepresenting the number of errored test cases in the red teaming results.