This walks you from “Praxen is installed” to “I have a real report” against one of the bundled examples — finbot, a deliberately vulnerable invoice-processing agent from the OWASP Agentic AI CTF. No editing your own agent required.
If you haven’t installed yet, do Installation first (one command from the marketplace).
If you don’t have a local copy of the Praxen repository yet, clone it (or unzip a release):
git clone https://github.com/open-agent-ai-security/praxen.git
cd praxen
The pieces you’ll point Praxen at are already inside examples/finbot/:
examples/finbot/
WORKER_REMIT.md ← the policy doc you'll verify against
finbot-analysis.html ← the committed report (what your run should approximately produce)
finbot-findings.json ← the same content as canonical JSON
A real first scan would use your agent’s source plus a remit you wrote. We’re using the pre-staged ones so the first run has no moving parts.
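Before going further, it can help to confirm the three pre-staged files are actually present in your checkout. The sketch below is a hypothetical helper, not something Praxen ships; the function name `missing_example_files` is an assumption, and the file names are the ones listed in the layout above.

```shell
# Hypothetical helper (not part of Praxen): print the name of every
# pre-staged example file that is missing from the given directory.
missing_example_files() {
  dir="$1"
  for name in WORKER_REMIT.md finbot-analysis.html finbot-findings.json; do
    # Print only the names that do not exist as regular files.
    if [ ! -f "$dir/$name" ]; then
      printf '%s\n' "$name"
    fi
  done
}
```

Running `missing_example_files examples/finbot` from the praxen directory should print nothing when the checkout is complete; any file name it prints is one to restore before the first run.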
The example was developed against the CineFlow Productions finbot from the OWASP Agentic AI CTF (see examples/README.md for the full provenance). Clone the source so Praxen has a real workspace to read:
git clone https://github.com/OWASP-ASI/finbot-ctf-demo.git ../finbot-src
(Any directory will do — ../finbot-src keeps the clone outside the Praxen tree and works the same on macOS, Linux, and Windows.)
From a Claude Code session in the praxen repo directory, ask the agent:
Please run the behavior-verifier skill against ../finbot-src.
Use the Worker Remit at examples/finbot/WORKER_REMIT.md. Write outputs
to ./reports/finbot-quickstart/.
That’s the whole prompt. Praxen will read the workspace at ../finbot-src and write its outputs to ./reports/finbot-quickstart/.
The skill prints an interim overview to stdout while it works. When it finishes you’ll have:
./reports/finbot-quickstart/
finbot-findings-YYYY-MM-DD.json ← canonical record
finbot-analysis-YYYY-MM-DD-HHMMSS.html ← human-readable report
finbot-analysis-YYYY-MM-DD-HHMMSS.txt ← plain-text summary
open ./reports/finbot-quickstart/finbot-analysis-*.html
(open is macOS; on Linux use xdg-open, on Windows the file works in any browser.)
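If you re-run the scan several times, the glob above can match more than one report. A minimal sketch for picking the newest one and the right opener follows; both function names are hypothetical and not part of Praxen. It assumes the timestamped file names shown above, which sort chronologically when sorted by name.

```shell
# Hypothetical helpers, not part of Praxen.

# Map an OS name (as printed by `uname -s`) to a file-opening command.
# Prints an empty string when no opener is known (for example on Windows,
# where you can open the file from a browser instead).
opener_for() {
  case "$1" in
    Darwin) echo "open" ;;
    Linux)  echo "xdg-open" ;;
    *)      echo "" ;;
  esac
}

# Print the HTML report with the latest timestamp in its name, or an
# empty line if the directory has none. Glob results come back sorted by
# name, so the last existing match carries the newest timestamp.
latest_report() {
  latest=""
  for f in "$1"/finbot-analysis-*.html; do
    if [ -e "$f" ]; then
      latest="$f"
    fi
  done
  printf '%s\n' "$latest"
}
```

For example, `"$(opener_for "$(uname -s)")" "$(latest_report ./reports/finbot-quickstart)"` opens the newest report on macOS or Linux, provided at least one report exists.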
What you should see, top to bottom:
… (Verified / Gap / Partial / Vague / ENP)
You can compare your fresh report against the published FinBot example report (rendered on GitHub Pages). It won’t be byte-identical (LLM analyses have run-to-run variance), but the dominant Critical themes, the broad RAISE shape, and the remit-coverage counts should be close. See tests/README.md for what “close” actually means for this target.
The pattern is identical with a real target:
See Usage for the full set of input shapes and the running-an-analysis details.
See the Troubleshooting section in usage.md. The most common first-run snags:
… /reload-plugins
render.py errored at the end: the LLM produced a malformed findings JSON; re-run with more context window or a more focused workspace path.
… ./reports/<agent>-draft-<timestamp>.md; tell the agent to read it and finish from there. See Usage § Large workspaces and context sizing.
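Because one of these snags comes down to a malformed findings JSON, a quick well-formedness check on the findings file can tell you whether a re-run is needed. The sketch below is a hypothetical pre-check, not a Praxen command. It assumes `python3` is on your PATH and uses the standard-library `json.tool` module; it only confirms that the file parses as JSON and says nothing about whether the content matches what render.py expects.

```shell
# Hypothetical pre-check, not part of Praxen: succeed (exit status 0)
# only if the given file parses as JSON. Assumes python3 is installed.
is_valid_json() {
  python3 -m json.tool "$1" >/dev/null 2>&1
}
```

For example, `is_valid_json ./reports/finbot-quickstart/finbot-findings-YYYY-MM-DD.json && echo "parses"` (with the real date filled in) prints `parses` only when the file is well-formed.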