Red Teaming Study Guide

This guide summarizes core red teaming concepts, the vulnerabilities DeepTeam can probe for, and the attack strategies it uses to probe them.

Introduction to DeepTeam

DeepTeam is an open-source Python package from the team behind DeepEval, designed specifically for red teaming LLMs. It simulates how a malicious user might try to compromise your system, helping you identify and fix vulnerabilities before they are exploited. It integrates with the broader DeepEval ecosystem for evaluation.

Core Usage

The entry point is the red_team function, which takes a callback that wraps your LLM application, a list of vulnerabilities to test for, and a list of attack strategies.

from deepteam import red_team
from deepteam.vulnerabilities import Bias
from deepteam.attacks import PromptInjection

def model_callback(input: str) -> str:
    # Replace this with a call into your actual LLM application
    return f"I'm sorry but I can't answer this: {input}"

# Probe for race-related bias using a prompt injection attack
bias = Bias(types=["race"])
prompt_injection = PromptInjection()

risk_assessment = red_team(
    model_callback=model_callback,
    vulnerabilities=[bias],
    attacks=[prompt_injection],
)
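
In practice the callback is where your own application sits: DeepTeam hands it an attack prompt and expects your system's string response back. Below is a minimal sketch assuming an OpenAI-backed chat app; the openai client, model name, and environment-based key handling are illustrative assumptions, not part of DeepTeam, and any stack works as long as the callback maps a prompt string to a response string.

from openai import OpenAI  # illustrative client; any LLM stack works here

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def model_callback(input: str) -> str:
    # Forward the (possibly adversarial) prompt to the target application
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model name
        messages=[{"role": "user", "content": input}],
    )
    # DeepTeam only needs the final string output of your system
    return response.choices[0].message.content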

Vulnerabilities

DeepTeam probes for each of these vulnerabilities using targeted attacks. The key categories are listed below, with a combined configuration sketch after the list:

Bias

Competition

Excessive Agency

Graphic Content

Illegal Activity

Intellectual Property

Misinformation

PII Leakage

Prompt Leakage

Robustness

Personal Safety

Toxicity

Unauthorized Access
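
Each category above corresponds to a class in deepteam.vulnerabilities that you instantiate and pass to red_team. The sketch below combines a few of them; Bias(types=["race"]) is taken from the Core Usage example, while instantiating Toxicity and PIILeakage with no arguments (their default types) is an assumption made to keep the example short.

from deepteam import red_team
from deepteam.attacks import PromptInjection
from deepteam.vulnerabilities import Bias, Toxicity, PIILeakage

def model_callback(input: str) -> str:
    # Placeholder target; swap in your LLM application (see Core Usage)
    return f"I'm sorry but I can't answer this: {input}"

vulnerabilities = [
    Bias(types=["race"]),  # limit bias probing to race-related scenarios
    Toxicity(),            # default types (assumed)
    PIILeakage(),          # default types (assumed)
]

risk_assessment = red_team(
    model_callback=model_callback,
    vulnerabilities=vulnerabilities,
    attacks=[PromptInjection()],
)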

Adversarial Attacks
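
DeepTeam ships attacks in two flavors: single-turn attacks (such as prompt injection) that pack the adversarial payload into one prompt, and multi-turn attacks (such as linear jailbreaking) that work over a simulated conversation. Both kinds can be mixed in a single red_team call: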

from deepteam import red_team
from deepteam.attacks.single_turn import PromptInjection
from deepteam.attacks.multi_turn import LinearJailbreaking

# Single-turn attack: injects adversarial instructions into a single prompt
prompt_injection = PromptInjection()
# Multi-turn attack: iteratively refines the jailbreak over a simulated conversation
linear_jailbreaking = LinearJailbreaking()

risk_assessment = red_team(
    attacks=[prompt_injection, linear_jailbreaking],
    model_callback=...,   # your LLM application callback
    vulnerabilities=...,  # vulnerability instances to probe for
)
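
The call returns a risk assessment summarizing how the target held up against each vulnerability and attack pairing; the exact structure of this object is defined by DeepTeam.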