Every actionable rule in the Worker Remit, checked against the running code. Gap = declared but unenforced; Partial = enforced but incomplete or bypassable; Vague Policy = too imprecise to verify.
| Rule ID | Section | Rule (quoted) | Status | Finding |
|---|---|---|---|---|
| R-01 | Action Boundaries | "The CLI MUST bundle only from the project's declared sources — its configuration, system prompt, skills, declared MCP servers, and declared subagents — and MUST NOT pull in undeclared content." | Verified | — |
| R-02 | Approval Requirements | "A deploy that opens the deployed API without authentication MUST be confirmed by the operator before it runs; it MUST NOT proceed unattended." | Partial | PRAX-2026-05-29-001 |
| R-03 | Action Boundaries | "Remote MCP servers carried into the bundle MUST be reached over TLS, and their transport configuration MUST be validated before the bundle is shipped." | Partial | PRAX-2026-05-29-002 |
| R-04 | Action Boundaries | "Project configuration MUST be validated before bundling, and a bundle that fails validation MUST NOT be deployed." | Verified | — |
| R-05 | Forbidden Actions | "Credentials — deployment-platform keys, model-provider API keys, hub and tracing tokens, frontend auth secrets — MUST NOT be written into the project, committed to version control, logged, or printed." | Verified | — |
| R-06 | Forbidden Actions | "Secret material MUST NOT be embedded into bundle artifacts that are not meant to carry it — environment files travel as environment files, never folded into the seeded-memory or skills payload." | Verified | — |
| R-07 | Forbidden Actions | "The CLI MUST NOT silently mutate, generate, or deploy any agent surface the developer did not declare in the project." | Partial | PRAX-2026-05-29-001 |
| R-08 | Behavioral Expectations | "Before a deploy, the CLI MUST present the operator a clear summary of what the bundle contains and where it will be shipped — enough for the developer to recognise the surface being deployed." | Verified | — |
| R-09 | Behavioral Expectations | "A dry run MUST generate the deployment artifacts without shipping them or contacting the deployment platform's mutating endpoints." | Partial | PRAX-2026-05-29-006 |
| R-10 | Escalation and Limits | "The project SHOULD publish a threat model and a security-disclosure process, and SHOULD keep the threat model current with what the package actually ships — confirming that the bundler carries only declared sources, that secrets never land in the seeded payload, and that the unauthenticated-deploy confirmation gate genuinely fires." | Partial | PRAX-2026-05-29-005 |
| R-11 | Known Good Baseline | "Dependencies MUST be version-controlled with a committed, pinned lockfile, pinned to compatible ranges, and the dependency tree kept small and reviewable." | Verified | — |
| R-12 | Known Good Baseline | "The deployment bundle's own dependency set MUST be derived only from the project's declared model provider, declared MCP usage, declared sandbox provider, and declared auth provider — never hand-edited into the bundle out of band." | Partial | PRAX-2026-05-29-004 |
| R-13 | Known Good Baseline | "Remote MCP servers declared in the project MUST be pinned to a known-good, integrity-checked version, so the deployed agent does not auto-install an unpinned server afresh." | Gap | PRAX-2026-05-29-003 |
| R-14 | Behavioral Expectations | "The CLI operates under direct developer supervision as a one-shot command; it MUST NOT run as an unattended background service." | Verified | — |
Findings, ordered by severity — each linked to its remit rule, evidence, and a recommended action. Tag chips jump to the relevant entry in the RAISE framework, the OWASP LLM Top 10, or the OWASP Agentic Top 10.
HIGH PRAX-2026-05-29-001 The anonymous-open-API confirmation gate fires only when a frontend is configured, so an anonymous-auth deploy with no frontend ships an open API silently.
"A deploy that opens the deployed API without authentication MUST be confirmed by the operator before it runs; it MUST NOT proceed unattended. / The CLI MUST NOT silently mutate, generate, or deploy any agent surface the developer did not declare in the project."
- In `commands._deploy()` compute the anonymous-API condition from `config.auth.provider == "anonymous"` alone (drop the `frontend.enabled` requirement) so the warning and confirmation fire for every anonymous deploy.
- Add a unit test asserting that an anonymous-auth, no-frontend deploy raises the confirmation prompt and aborts on a non-`y` answer.
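The recommended change can be sketched as below. This is a minimal illustration, not the CLI's actual code: `AuthStub`, `FrontendStub`, `DeployConfigStub`, `is_anonymous_api`, and `confirm_anonymous_deploy` are hypothetical stand-ins, and only the `config.auth.provider == "anonymous"` condition comes from the finding itself.

```python
from dataclasses import dataclass, field
from typing import Callable


# Hypothetical, simplified stand-ins for the real config objects.
@dataclass
class AuthStub:
    provider: str = "anonymous"


@dataclass
class FrontendStub:
    enabled: bool = False


@dataclass
class DeployConfigStub:
    auth: AuthStub = field(default_factory=AuthStub)
    frontend: FrontendStub = field(default_factory=FrontendStub)


def is_anonymous_api(config: DeployConfigStub) -> bool:
    # Depends on the auth provider alone; frontend.enabled is deliberately ignored.
    return config.auth.provider == "anonymous"


def confirm_anonymous_deploy(
    config: DeployConfigStub, ask: Callable[[str], str] = input
) -> bool:
    """Return True if the deploy may proceed, False if it must abort."""
    if not is_anonymous_api(config):
        return True
    answer = ask("The deployed API will be open to anyone. Continue? [y/N] ")
    # Anything other than an explicit "y" aborts the deploy.
    return answer.strip().lower() == "y"
```

Passing the prompt function as `ask` keeps the gate testable without a terminal, which is what the suggested unit test would rely on.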
HIGH PRAX-2026-05-29-002 Deploy validates MCP transport type but never checks that http/sse MCP endpoints use TLS, so a plaintext http:// MCP server passes validation and ships.
"Remote MCP servers carried into the bundle MUST be reached over TLS, and their transport configuration MUST be validated before the bundle is shipped."
- In `_validate_mcp_for_deploy()`, after expanding `${VAR}` references, reject any http/sse server whose `url` does not begin with `https://` (allowing `http://127.0.0.1`/`localhost` only if a local-loopback exception is intended).
- Add a unit test that a plaintext `http://` MCP url produces a validation error and aborts the deploy.
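A TLS check along these lines might look like the sketch below. It assumes each server entry is a dict with a `type` key (`http`, `sse`, or something else) and a `url` key; those key names, along with `expand_vars`, `check_mcp_tls`, and the `allow_loopback` flag, are illustrative assumptions and not the real `_validate_mcp_for_deploy()` internals.

```python
import os
import re
from collections.abc import Mapping
from typing import Optional
from urllib.parse import urlsplit

_VAR = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")
LOOPBACK_HOSTS = {"127.0.0.1", "localhost", "::1"}


def expand_vars(value: str, env: Mapping[str, str]) -> str:
    # Replace ${VAR} with its value; unknown variables are left untouched.
    return _VAR.sub(lambda m: env.get(m.group(1), m.group(0)), value)


def check_mcp_tls(
    servers: dict[str, dict],
    env: Optional[Mapping[str, str]] = None,
    allow_loopback: bool = False,
) -> list[str]:
    """Return one error per http/sse server whose URL is not https://."""
    env = os.environ if env is None else env
    errors = []
    for name, server in servers.items():
        if server.get("type") not in ("http", "sse"):
            continue  # non-remote transports are out of scope here
        parts = urlsplit(expand_vars(server.get("url", ""), env))
        if parts.scheme == "https":
            continue
        if allow_loopback and parts.scheme == "http" and parts.hostname in LOOPBACK_HOSTS:
            continue
        scheme = parts.scheme or "missing"
        # Report only the scheme: the expanded URL may embed a secret.
        errors.append(f"MCP server '{name}': url must use https:// (scheme: {scheme})")
    return errors
```

The error message omits the expanded URL on purpose, since a `${VAR}` reference could resolve to a token that R-05 forbids printing.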
HIGH PRAX-2026-05-29-003 Remote MCP servers are copied into the bundle by URL only, with no version pin or integrity check, so the deployed agent resolves an unpinned server at runtime.
"Remote MCP servers declared in the project MUST be pinned to a known-good, integrity-checked version, so the deployed agent does not auto-install an unpinned server afresh."
- Require declared remote MCP servers to carry a pinned, integrity-checkable reference (e.g. a content digest or version field) and verify it in `_validate_mcp_for_deploy()` before copying into the bundle.
- Record the resolved MCP server identity/digest in the bundle summary so the operator sees exactly which MCP surface is being shipped.
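One possible shape for the pin check is sketched below. It assumes a hypothetical `digest` field of the form `sha256:<64 hex chars>` on each remote server entry; the field name, the `type` key, and the helpers `check_mcp_pins` and `digest_matches` are all assumptions, and what content gets hashed (for example a server manifest) is left open.

```python
import hashlib
import hmac
import re

_DIGEST = re.compile(r"sha256:[0-9a-f]{64}")


def check_mcp_pins(servers: dict[str, dict]) -> list[str]:
    """Return one error per remote server that lacks a well-formed digest pin."""
    errors = []
    for name, server in servers.items():
        if server.get("type") not in ("http", "sse"):
            continue  # only remote servers need a pin in this sketch
        digest = server.get("digest", "")
        if not isinstance(digest, str) or not _DIGEST.fullmatch(digest):
            errors.append(f"MCP server '{name}': missing or malformed sha256 digest pin")
    return errors


def digest_matches(content: bytes, pinned: str) -> bool:
    # Compare the pinned digest against the digest of the fetched content.
    actual = "sha256:" + hashlib.sha256(content).hexdigest()
    return hmac.compare_digest(actual, pinned)
```

The same digest string could be echoed in the bundle summary so the operator sees which pinned MCP surface is being shipped.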
MEDIUM PRAX-2026-05-29-004 The generated bundle's inferred dependencies are emitted as bare unpinned package names with no committed bundle lockfile.
"The deployment bundle's own dependency set MUST be derived only from the project's declared model provider, declared MCP usage, declared sandbox provider, and declared auth provider — never hand-edited into the bundle out of band."
MEDIUM PRAX-2026-05-29-005 The deploy tooling configures no logging and the bundled threat model is stale, leaving deploy actions recorded only as transient print output.
"The project SHOULD publish a threat model and a security-disclosure process, and SHOULD keep the threat model current with what the package actually ships — confirming that the bundler carries only declared sources, that secrets never land in the seeded payload, and that the unauthenticated-deploy confirmation gate genuinely fires."
- Configure a structured logger (JSON or key-value, file or stderr handler) for the deploy path that records each bundle and deploy action with timestamp, agent name, auth mode, and destination.
- Regenerate `THREAT_MODEL.md` against the current `deepagents_cli.deploy` surface and add a SECURITY.md disclosure process.
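A structured logger of the kind recommended above could be set up with the standard `logging` module, roughly as follows. The logger name `deploy_audit`, the `JsonFormatter` class, and the field names `agent`, `auth_mode`, and `destination` are illustrative choices, not existing parts of the deploy package.

```python
import json
import logging
from datetime import datetime, timezone

AUDIT_FIELDS = ("agent", "auth_mode", "destination")


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "ts": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "event": record.getMessage(),
        }
        # Fields passed via `extra=` become attributes on the record.
        for key in AUDIT_FIELDS:
            if hasattr(record, key):
                payload[key] = getattr(record, key)
        return json.dumps(payload)


def get_deploy_logger(stream=None) -> logging.Logger:
    """Return the audit logger, attaching a JSON handler on first use."""
    logger = logging.getLogger("deploy_audit")
    if not logger.handlers:
        handler = logging.StreamHandler(stream)  # None means sys.stderr
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
        logger.setLevel(logging.INFO)
        logger.propagate = False
    return logger
```

A deploy step would then call something like `get_deploy_logger().info("deploy", extra={"agent": name, "auth_mode": mode, "destination": dest})`, taking care never to pass credentials as field values.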
MEDIUM PRAX-2026-05-29-006 The local `dev` command seeds the remote LangSmith hub repo before starting the local server, contacting a mutating platform endpoint during a nominally local run.
"A dry run MUST generate the deployment artifacts without shipping them or contacting the deployment platform's mutating endpoints."
MEDIUM PRAX-2026-05-29-007 No adversarial-testing artifact or security-scanning CI exists for the deploy package; testing is functional only.
Controls and behaviors verified as correctly implemented during this scan. These represent areas where the agent's implementation aligns with its stated policy and security best practices.
Bundle assembled solely from declared project sources
`bundler._build_seed` and `bundle()` read only the canonical declared layout (AGENTS.md, skills/, mcp.json, user/, subagents/) and never pull in undeclared content, satisfying the CLI's core declared-sources-only guarantee.
Secrets excluded from the seeded-memory payload
The bundler builds `_seed.json` from memory and skills only and copies `.env` as a standalone file, so model/platform credentials never get folded into the seeded-memory or skills payload.
Config validated before any bundle is produced
`DeployConfig.validate` checks required files, MCP transport, sandbox/auth/memories settings, and credentials, and `_deploy`/`_dev` abort on any error before bundling — a failing config is never shipped.
Committed pinned lockfile and dependency-hygiene CI for the CLI
The CLI ships a committed `uv.lock` and the repo runs dependabot plus `check_lockfiles` and `check_sdk_pin` workflows, keeping the CLI's own dependency tree pinned and reviewable.
Pre-deploy bundle summary presented to the operator
`print_bundle_summary` shows the agent name, model, auth mode (explicitly flagging "anonymous (API open to anyone)"), seeded files, MCP presence, sandbox, and destination before a deploy proceeds.
Log files found in the agent's workspace during this scan. Reviewing these files provides runtime evidence to complement the static analysis above.
Each card represents one category and shows its top 3 findings. All items are listed in the Findings section.
Overall maturity assessment across the six categories of the RAISE framework. This is a maturity model, not a school grade: a score of 3 / 5 means Established, not 60 percent. Most production AI agents today score between Ad hoc (1) and Established (3). See the full RAISE framework reference for the complete scale and scoring.
Maturity Scoring Rubric
Every score above is based on this scale. A score is a snapshot of observable posture — not a verdict on the people or team behind the system.
| Score | Label | Meaning |
|---|---|---|
| 5 | Exemplary | Best-in-class; automated, continuously tested, reference quality. Rarely achieved in shipping systems. |
| 4 | Strong | Comprehensive controls, active management, minor gaps. Production-ready. |
| 3 | Established | Documented controls consistently applied; known gaps accepted. A respectable baseline. |
| 2 | Partial | Some controls exist but coverage is incomplete; key gaps remain. |
| 1 | Ad hoc | Informal or inconsistent measures; relies on individual judgment. |
| 0 | Absent | No evidence this category is addressed at all. |