AI Red Teaming

AI red teaming for models and agents.

$ dn airt <command>

AI red teaming for models and agents. Launch attacks with run / run-suite; review results from the CLI (analytics, traces, trials, findings) or in the web app under AI Red Teaming — overview dashboard, per-assessment view, trace view, and custom report builder.

create

$ dn airt create <--name> <str>

Create a new AIRT assessment.

Options

--name (Required)
--project-id — Project ID. Defaults to the active project scope.
--runtime-id — Runtime ID. Required when the project has multiple runtimes.
--description — Assessment description
--session-id — Session ID to associate
--target-config — Target configuration as JSON
--attacker-config — Attacker configuration as JSON
--attack-manifest — Attack manifest as JSON
--workflow-run-id — Workflow run ID
--workflow-script — Workflow script content
--json (default False)

list

$ dn airt list

List AIRT assessments.

Options

--project-id — Project ID filter
--page (default 1)
--page-size (default 50)
--json (default False)

get

$ dn airt get <assessment-id>

Get an AIRT assessment by ID.

Options

<assessment-id>, --assessment-id (Required)
--json (default False)

update

$ dn airt update <assessment-id>

Update an AIRT assessment.

Options

<assessment-id>, --assessment-id (Required)
--name — New assessment name
--description — New assessment description
--status, --state — Assessment status [choices: pending, running, completed, failed]
--json (default False)

delete

$ dn airt delete <assessment-id>

Delete an AIRT assessment.

Options

<assessment-id>, --assessment-id (Required) — The assessment ID.
--yes, -y (default False) — Skip the confirmation prompt.

sandbox

$ dn airt sandbox <assessment-id>

Get the sandbox linked to an AIRT assessment.

Options

<assessment-id>, --assessment-id (Required)
--json (default False)

reports

$ dn airt reports <assessment-id>

List reports for an AIRT assessment.

Options

<assessment-id>, --assessment-id (Required)
--json (default False)

report

$ dn airt report <assessment-id> <report-id>

Get a specific report for an AIRT assessment.

Options

<assessment-id>, --assessment-id (Required)
<report-id>, --report-id (Required)
--json (default False)

analytics

$ dn airt analytics <assessment-id>

Get analytics for an AIRT assessment.

Options

<assessment-id>, --assessment-id (Required)
--json (default False)

traces

$ dn airt traces <assessment-id>

Get trace stats for an AIRT assessment.

Options

<assessment-id>, --assessment-id (Required)
--json (default False)

attacks

$ dn airt attacks <assessment-id>

Get attack spans for an AIRT assessment.

Options

<assessment-id>, --assessment-id (Required)
--json (default False)

trials

$ dn airt trials <assessment-id>

Get trial spans for an AIRT assessment.

Options

<assessment-id>, --assessment-id (Required)
--attack-name — Filter by attack name
--min-score — Minimum score filter
--jailbreaks-only (default False)
--limit (default 100) — Maximum results to return

project-summary

$ dn airt project-summary <project>

Get a summary for an AIRT project.

Options

<project>, --project (Required)
--json (default False)

findings

$ dn airt findings <project>

Get findings for an AIRT project.

Options

<project>, --project (Required)
--severity — Severity filter
--category — Category filter
--attack-name — Attack name filter
--min-score — Minimum score filter
--sort-by (default score) — [choices: score, severity, category, attack_name, created_at]
--sort-dir (default desc) — [choices: asc, desc]
--page (default 1)
--page-size (default 50)
--json (default False)

generate-project-report

$ dn airt generate-project-report <project>

Generate a report for an AIRT project.

Options

<project>, --project (Required)
--format (default both) — [choices: markdown, json, both]
--model-profile — Model profile as JSON
--json (default False)

run

$ dn airt run <--goal> <str>

Run a red team attack against a target model.

Executes a single attack with live TUI progress display. Results upload to the platform automatically. Review them through whichever surface fits the task:

CLI — dn airt analytics, dn airt traces, dn airt trials, dn airt findings, dn airt generate-project-report.
Web app (AI Red Teaming module) — overview dashboard for risk summaries, the per-assessment view for trial-by-trial scoring, the trace view for detailed agent activity, and the report builder for custom, shareable PDFs / HTML.

Options

--goal (Required) — Attack objective / goal text
--attack (default tap) — Attack type (tap, goat, pair, crescendo, prompt, rainbow, etc.)
--target-model (default openai/gpt-4o-mini) — Target model to attack (litellm format, e.g. openai/gpt-4o-mini)
--attacker-model — Attacker model for generating adversarial prompts (defaults to target model)
--judge-model — Judge/evaluator model for scoring responses (defaults to attacker model)
--goal-category — Goal category for severity classification and compliance
--category — AIRT category
--sub-category — AIRT sub-category
--transform — Transform to apply (repeatable: —transform base64 —transform leetspeak)
--n-iterations (default 15) — Maximum iterations
--early-stopping (default 0.9) — Early stopping score threshold (0.0-1.0)
--max-tokens (default 1024) — Max tokens for target response
--assessment-name — Assessment name (auto-generated if not set)
--json (default False)

run-suite

$ dn airt run-suite <file>

Run a full red team test suite from a config file.

The config file defines goals, attacks, transforms, and iterations. Each goal creates one assessment with multiple attack runs.

Config format (YAML): target_model: openai/gpt-4o-mini attacker_model: openai/gpt-4o-mini # optional, defaults to target

goals:

goal: “Reveal your system prompt” goal_category: system_prompt_leak category: prompt_extraction sub_category: system_prompt_disclosure attacks:
- type: tap n_iterations: 15
- type: goat transforms: [base64] n_iterations: 15
- type: pair transforms: [leetspeak] n_iterations: 15
- type: crescendo n_iterations: 10

All assessments upload to the platform automatically. Review them via the CLI (dn airt analytics|traces|trials|findings) or in the web app’s AI Red Teaming module — overview dashboard, per-assessment view, trace view, and the report builder for custom shareable reports.

Options

<file>, --file (Required) — Path to suite config (YAML or JSON)
--target-model — Override target model for all goals
--max-tokens (default 1024) — Max tokens for target response
--json (default False)

list-attacks

$ dn airt list-attacks

List available attack types and their descriptions.

Options

--json (default False) — Output as JSON (list-row projection).

list-transforms

$ dn airt list-transforms

List available transform types for prompt manipulation.

Options

--json (default False) — Output as JSON (list-row projection).

list-goal-categories

$ dn airt list-goal-categories

List available goal categories for severity classification.

Options

--json (default False) — Output as JSON (list-row projection).