Open source · runs locally · nothing phones home

Make sure your agent does its job —
and only its job.

Praxen compares an AI agent's declared policy against real evidence — its code, live deployment state, and behavioral logs — and reports exactly where observed behavior drifts from intent, before it becomes a risk.

[Image: Praxy, the Praxen fox, inspecting a small robot agent]

The loop

Define the job. Test against the job.

Agent Behavior Verification focuses on what's most critical — declared intent versus observed reality.

Step 1

Define the job

Write a Worker Remit — the agent's mission, authorized tools, approved channels, counterparties, and forbidden actions.
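As a rough illustration, the five parts of a remit could be modelled as a small record. This is only a sketch: the `WorkerRemit` class, its field names, and the example values are assumptions made here, not Praxen's actual remit format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkerRemit:
    """Hypothetical shape for a Worker Remit (not Praxen's real schema)."""
    mission: str
    authorized_tools: frozenset
    approved_channels: frozenset
    counterparties: frozenset
    forbidden_actions: frozenset


# Example values are invented for illustration.
remit = WorkerRemit(
    mission="Answer billing questions for existing customers",
    authorized_tools=frozenset({"lookup_invoice", "send_email"}),
    approved_channels=frozenset({"support-inbox"}),
    counterparties=frozenset({"existing customers"}),
    forbidden_actions=frozenset({"issue_refund", "delete_account"}),
)

print(remit.mission)
```

Freezing the record reflects the idea that the declared job is a fixed baseline to test against, rather than something that changes during an analysis.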

Step 2

Test reality

Point Praxen at evidence — source code, deployment files, or behavioral history — and it reads the workspace the way an auditor would.

Step 3

Find the gap

Every finding answers one question — does observed behavior match declared intent? — and cites the exact rule and evidence.
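One way to picture a finding that cites both the rule and the evidence is sketched below. The `Finding` record, the `check_forbidden` helper, and the sample paths are hypothetical names invented for this example; Praxen's own finding format may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """Hypothetical finding record (not Praxen's real output format)."""
    rule: str             # the remit rule the finding cites
    evidence: str         # where the behavior was observed
    matches_intent: bool  # does observed behavior match declared intent?


def check_forbidden(forbidden, observed):
    """Return one Finding per observed action that the remit forbids.

    `observed` is a list of (action, evidence_location) pairs.
    """
    return [
        Finding(rule=f"forbidden_actions: {action}",
                evidence=location,
                matches_intent=False)
        for action, location in observed
        if action in forbidden
    ]


# Sample data is invented for illustration.
findings = check_forbidden(
    forbidden={"issue_refund"},
    observed=[
        ("lookup_invoice", "agent/tools.py:12"),
        ("issue_refund", "logs/session-041.jsonl:88"),
    ],
)

for f in findings:
    print(f.rule, "at", f.evidence)
```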

Step 4

Report locally

A self-contained HTML report, machine-readable JSON, and a plain-text summary land in ./reports/. Nothing leaves your machine.

In 30 seconds

One sentence to your coding agent

That's the whole interface: "Run a Praxen behavior analysis on ./my-agent." Praxen does the rest.

Declare intent

Write a Worker Remit by hand, or have Praxen draft one from your description or docs. It's the only artifact you customize per agent.

Point at evidence

Source code, live deployment state, conversation logs, governance docs — any mix. Praxen finds the remit, reads the workspace, compares.

Read the gap

Findings are tagged against OWASP and RAISE, chained into compound attack paths, and traced to the exact remit rule they violate.

Verification patterns

What Praxen catches

Every analysis runs a battery of named detection patterns — not just prompt-injection screening or known-bad code signatures.

  • Policy-implementation divergence: the code or behavior doesn't do what the policy document says.
  • Credential exposure: secrets surfacing in unexpected locations across the workspace.
  • Configuration gaps: auto-approved exec, disabled loop detection, missing rate limits.
  • Capability drift: new tools or outbound destinations not in the authorized baseline.
  • Compound signal reasoning: individual findings chained when they combine into a high-severity path.
  • Secondary prompt discovery: session-loaded identity files like SOUL.md / AGENTS.md audited as system prompts.
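To make one of these patterns concrete, capability drift can be thought of as a set difference between what was observed and what was authorized. The sketch below is a simplified illustration under that assumption; the `capability_drift` function and the tool names are hypothetical and do not describe how Praxen implements the check.

```python
def capability_drift(baseline, observed):
    """Return observed capabilities that are missing from the authorized baseline."""
    return sorted(set(observed) - set(baseline))


# Tool names are invented for illustration.
baseline = {"lookup_invoice", "send_email"}
observed = {"lookup_invoice", "send_email", "run_shell"}

drift = capability_drift(baseline, observed)
print(drift)  # ['run_shell']
```

Anything left in the result is a tool or destination the agent uses but the remit never authorized, which is the kind of item this pattern is meant to surface.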

Get started in minutes

Install the Claude Code plugin

Praxen runs as a plugin in your coding agent. No pip install — the report renderer is Python-stdlib-only. Add the marketplace, install, and point it at an agent.


Full guide: Installation · Quickstart

Claude Code
# 1 · add the Praxen marketplace
> /plugin marketplace add open-agent-ai-security/praxen

# 2 · install the plugin
> /plugin install praxen@open-agent-ai-security

# 3 · run an analysis
> Run a Praxen behavior analysis on ./my-agent

Verify your agent before it ships.

Praxen runs pre-deployment and on every release — open source, built for the community.