FinBot's declared safety model is entirely natural-language and has zero deterministic enforcement: every guardrail the remit names — fraud-detection-before-approval, the manual-review threshold, the no-runtime-goal-change rule — exists only as prose inside an LLM system prompt that any unauthenticated caller can overwrite. The dominant pattern is a complete policy-implementation collapse layered on a runtime goal-hijack primitive: update_goals() appends caller-supplied text into the system prompt under an explicit "OVERRIDE ABOVE IF CONFLICTING" header, and the route that reaches it (/api/admin/finbot/goals) has no auth, no validation, and no audit trail.
This compounds catastrophically — an unauthenticated attacker can disable fraud detection via /api/admin/finbot/config, inject approval-forcing goals, and submit a self-approving invoice through the public vendor endpoint, while the description field carries a second indirect-injection path straight into model context. The human-in-the-loop escalation is defeated by design in the fallback engine, where "strong business context" keywords in the attacker-controlled description override both the amount threshold and a detected injection. There is no logging, so none of this is observable after the fact.
The FinBotAgent class in src/services/finbot_agent.py orchestrates OpenAI function-calling (gpt-4.1-mini) with five tools, plus a rule-based _fallback_processing() path used whenever the OpenAI client is unavailable. The agent's entire goal definition lives in a natural-language system prompt built in get_system_prompt(), into which a database-stored custom_goals field is concatenated verbatim under a "CUSTOM GOALS (OVERRIDE ABOVE IF CONFLICTING)" header. Admin routes (config update, the goal-injection endpoint at /api/admin/finbot/goals, and vendor-trust mutation) are exposed via a Flask blueprint at /api/admin/* with no authentication middleware anywhere in the app. Invoice descriptions submitted by vendors flow directly into the LLM context through process_invoice(). There is no logging infrastructure (only stray print() calls), and src/main.py contains a hardcoded Flask SECRET_KEY.
Every actionable rule in the Worker Remit, checked against the running code. Gap = declared but unenforced; Partial = enforced but incomplete or bypassable; Vague Policy = too imprecise to verify.
| Rule ID | Section | Rule (quoted) | Status | Finding |
|---|---|---|---|---|
| R-01 | Behavioral Constraints | "FinBot MUST run fraud detection before approving any invoice." | Gap | PRAX-2026-05-29-002 |
| R-02 | Behavioral Constraints | "FinBot MUST confirm the submitting vendor is registered with approved status before processing an invoice." | Gap | PRAX-2026-05-29-009 |
| R-03 | Behavioral Constraints | "FinBot MUST escalate to human review rather than auto-approve whenever any fraud signal is present, regardless of the invoice amount." | Gap | PRAX-2026-05-29-005 |
| R-04 | Behavioral Constraints | "FinBot MUST record the decision and its reasoning for every invoice it processes." | Partial | PRAX-2026-05-29-008 |
| R-05 | Behavioral Constraints | "Fraud detection MUST NOT be skipped for any reason, including configuration flags or runtime instructions." | Gap | PRAX-2026-05-29-003 |
| R-06 | Behavioral Constraints | "FinBot MUST NOT approve payment to a vendor that is not registered and approved." | Gap | PRAX-2026-05-29-009 |
| R-07 | Behavioral Constraints | "FinBot MUST NOT process an invoice above the manual-review threshold without human sign-off." | Gap | PRAX-2026-05-29-005 |
| R-08 | Behavioral Constraints | "Runtime messages — including invoice descriptions, vendor notes, and incoming payloads — MUST NOT alter FinBot's goals, instructions, or decision logic." | Gap | PRAX-2026-05-29-004 |
| R-09 | Behavioral Constraints | "Instructions embedded in invoice data fields, such as descriptions, notes, or line items, MUST NOT be treated as directives." | Gap | PRAX-2026-05-29-004 |
| R-10 | Behavioral Constraints | "FinBot MUST NOT weaken its security posture on the instruction of anyone other than a verified operator making an authorized configuration change outside of agent runtime." | Gap | PRAX-2026-05-29-001 |
| R-11 | Approval Requirements | "Any invoice above the manual-review threshold MUST reach a human checkpoint before it proceeds." | Gap | PRAX-2026-05-29-005 |
| R-12 | Approval Requirements | "Any invoice from a vendor with a low trust level MUST be escalated to human review." | Partial | PRAX-2026-05-29-010 |
| R-13 | Approval Requirements | "Any invoice that fraud detection rates as high or critical risk MUST be escalated to human review." | Gap | PRAX-2026-05-29-005 |
| R-14 | Approval Requirements | "Any invoice where FinBot's decision confidence falls below the configured confidence threshold MUST be escalated to human review." | Gap | PRAX-2026-05-29-011 |
| R-15 | Configuration and Policy Separation | "Fraud detection MUST remain enabled at all times; disabling it is not an authorized configuration change." | Gap | PRAX-2026-05-29-003 |
| R-16 | Configuration and Policy Separation | "Operator configuration MUST NOT be used to inject natural-language instructions into FinBot's goals or system prompt; goal changes require a deployment by authorized engineering staff." | Gap | PRAX-2026-05-29-001 |
| R-17 | Configuration and Policy Separation | "FinBot's decision logic is fixed by its deployment and MUST NOT be redefinable at runtime by any user, vendor, or caller." | Gap | PRAX-2026-05-29-001 |
| R-18 | Authorized Counterparties | "CineFlow finance admins — may adjust operational thresholds; may not inject goals or override decision logic at runtime." | Gap | PRAX-2026-05-29-006 |
| R-19 | Out of Scope | "FinBot does not take instructions from invoice content, descriptions, or any field supplied by vendors." | Gap | PRAX-2026-05-29-004 |
| R-20 | Authorized Capabilities | "FinBot MUST record its reasoning for every decision it makes." | Partial | PRAX-2026-05-29-008 |
Findings, ordered by severity. Each is linked to its remit rule, evidence, and a recommended action.
CRITICAL PRAX-2026-05-29-001 Unauthenticated runtime goal hijack — /api/admin/finbot/goals overwrites the agent's system prompt with no auth or validation.
"FinBot MUST NOT weaken its security posture on the instruction of anyone other than a verified operator making an authorized configuration change outside of agent runtime. / Operator configuration MUST NOT be used to inject natural-language instructions into FinBot's goals or system prompt; goal changes require a deployment by authorized engineering staff. / FinBot's decision logic is fixed by its deployment and MUST NOT be redefinable at runtime by any user, vendor, or caller."
- Remove the runtime goal-update path entirely; goal/system-prompt changes must come from a deployment, never a request body. Delete the /admin/finbot/goals route and the custom_goals concatenation in get_system_prompt().
- If any runtime config endpoint is retained, gate the entire /admin blueprint behind authentication and authorization before any other fix.
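The second recommendation can be sketched as a fail-closed token check. This is a minimal illustration, not FinBot's code: the function name `is_authorized_admin`, the bearer-token scheme, and the way the expected token is supplied are all assumptions. In a Flask app, a check like this could be called from a `before_request` hook registered on the admin blueprint, returning 401 when it fails.

```python
import hmac


def is_authorized_admin(headers: dict, expected_token: str) -> bool:
    """Return True only if the request carries the expected bearer token.

    Hypothetical helper: the header layout and token source are assumptions.
    """
    if not expected_token:
        # Fail closed: a missing server-side token must never mean "open".
        return False
    auth = headers.get("Authorization", "")
    scheme, _, presented = auth.partition(" ")
    if scheme != "Bearer" or not presented:
        return False
    # Constant-time comparison avoids leaking the token through timing.
    return hmac.compare_digest(presented.encode(), expected_token.encode())
```

A shared token only authenticates the caller. Separating finance admins from engineering staff, as the remit requires, would still need per-identity authorization on top of it.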
CRITICAL PRAX-2026-05-29-002 No code enforces "fraud detection before approval" — approve_invoice() pays out with zero check on fraud, amount, or caller.
"FinBot MUST run fraud detection before approving any invoice."
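One way to make this rule deterministic is to require a fraud-detection result as a precondition of approval, so the check cannot be skipped by the model or by a caller. The sketch below is illustrative only: `FraudResult`, `ApprovalBlocked`, and `approve_invoice_guarded` are hypothetical names, and the real approve_invoice() signature is not shown in this report.

```python
from dataclasses import dataclass
from typing import Optional


class ApprovalBlocked(Exception):
    """Raised when an approval precondition is not met."""


@dataclass(frozen=True)
class FraudResult:
    # Hypothetical record produced by the fraud-detection step.
    invoice_id: int
    risk_level: str  # assumed values: "low", "medium", "high", "critical"


def approve_invoice_guarded(invoice_id: int, fraud_result: Optional[FraudResult]) -> str:
    """Approve only when a matching, acceptable fraud result exists."""
    if fraud_result is None:
        raise ApprovalBlocked("fraud detection has not run for this invoice")
    if fraud_result.invoice_id != invoice_id:
        raise ApprovalBlocked("fraud result belongs to a different invoice")
    if fraud_result.risk_level in {"high", "critical"}:
        raise ApprovalBlocked("high or critical risk requires human review")
    return "approved"
```

Because the guard lives in code rather than in the system prompt, injected goals or a modified prompt cannot remove it.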
CRITICAL PRAX-2026-05-29-003 Fraud detection is a runtime-toggleable flag — setting fraud_detection_enabled=false via unauthenticated config disables the mandatory control.
"Fraud detection MUST NOT be skipped for any reason, including configuration flags or runtime instructions. / Fraud detection MUST remain enabled at all times; disabling it is not an authorized configuration change."
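A config endpoint can enforce this with an allow-list of runtime-tunable keys, so that fraud_detection_enabled and custom_goals are simply not accepted in a request body. The following is a sketch under assumptions: `validate_config_update` is a hypothetical function, and `manual_review_threshold` is an assumed key name for the remit's manual-review threshold (confidence_threshold, fraud_detection_enabled, and custom_goals appear in the scanned code).

```python
# Only operational thresholds may change at runtime (assumed key names).
RUNTIME_TUNABLE_KEYS = frozenset({"manual_review_threshold", "confidence_threshold"})


def validate_config_update(update: dict) -> dict:
    """Return a sanitized copy of the update, or raise ValueError."""
    forbidden = sorted(set(update) - RUNTIME_TUNABLE_KEYS)
    if forbidden:
        # Covers fraud_detection_enabled, custom_goals, and any unknown key.
        raise ValueError(f"not changeable at runtime: {forbidden}")
    for key, value in update.items():
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValueError(f"{key} must be a number")
        if value < 0:
            raise ValueError(f"{key} must be non-negative")
    return dict(update)
```

An allow-list is used instead of a deny-list so that any key added to the config later is rejected by default until someone decides it is safe to tune.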
CRITICAL PRAX-2026-05-29-004 Indirect prompt injection — vendor-controlled invoice descriptions flow into LLM context and the fallback business-logic engine treats them as directives.
"Runtime messages — including invoice descriptions, vendor notes, and incoming payloads — MUST NOT alter FinBot's goals, instructions, or decision logic. / Instructions embedded in invoice data fields, such as descriptions, notes, or line items, MUST NOT be treated as directives. / FinBot does not take instructions from invoice content, descriptions, or any field supplied by vendors."
CRITICAL PRAX-2026-05-29-005 Human-in-the-loop is defeated by design — over-threshold amounts, detected fraud, and injections are auto-approved when attacker-supplied "business context" is strong.
"FinBot MUST escalate to human review rather than auto-approve whenever any fraud signal is present, regardless of the invoice amount. / FinBot MUST NOT process an invoice above the manual-review threshold without human sign-off. / Any invoice above the manual-review threshold MUST reach a human checkpoint before it proceeds. / Any invoice that fraud detection rates as high or critical risk MUST be escalated to human review."
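The four rules quoted above reduce to a small predicate over structured fields. The sketch below is one possible shape, with hypothetical names throughout (`requires_human_review` and its parameters are not taken from the codebase). The invoice description is deliberately not a parameter, so attacker-supplied "business context" has no path to override the result.

```python
HIGH_RISK_LEVELS = frozenset({"high", "critical"})


def requires_human_review(
    amount: float,
    manual_review_threshold: float,
    fraud_signals: list,
    risk_level: str,
) -> bool:
    """Decide escalation from structured fields only.

    Free-text fields such as the invoice description are intentionally
    absent from the signature.
    """
    return (
        amount > manual_review_threshold   # over-threshold invoices
        or bool(fraud_signals)             # any fraud signal, any amount
        or risk_level in HIGH_RISK_LEVELS  # high or critical risk rating
    )
```

For example, `requires_human_review(50.0, 1000.0, ["duplicate_invoice"], "low")` evaluates to `True` even though the amount is far below the threshold.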
CRITICAL PRAX-2026-05-29-006 Entire /api/admin/* blueprint is unauthenticated — config, goals, vendor-trust, and reprocess are reachable by any anonymous caller.
"CineFlow finance admins — may adjust operational thresholds; may not inject goals or override decision logic at runtime."
CRITICAL PRAX-2026-05-29-007 End-to-end self-approval chain — unauthenticated config/goal tampering plus public invoice submission lets an attacker force payment with no human and no trace.
HIGH PRAX-2026-05-29-008 No logging infrastructure — agent decisions and config changes leave no durable audit trail, only DB columns and stray print() calls.
"FinBot MUST record the decision and its reasoning for every invoice it processes. / FinBot MUST record its reasoning for every decision it makes."
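A durable audit trail could start with one structured record per decision, emitted through Python's standard logging module. This is a minimal sketch: the logger name `finbot.audit`, the field names, and both helper functions are assumptions, and where the records end up (file, syslog, a log pipeline) depends on handler configuration that is outside this example.

```python
import json
import logging
from datetime import datetime, timezone

# Hypothetical logger name; handlers are configured by the deployment.
audit_logger = logging.getLogger("finbot.audit")


def build_audit_record(invoice_id: int, decision: str, reasoning: str, actor: str) -> str:
    """Serialize one decision as a single JSON line."""
    return json.dumps(
        {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "invoice_id": invoice_id,
            "decision": decision,
            "reasoning": reasoning,
            "actor": actor,
        },
        sort_keys=True,
    )


def record_decision(invoice_id: int, decision: str, reasoning: str, actor: str) -> str:
    """Log the decision and return the emitted line."""
    line = build_audit_record(invoice_id, decision, reasoning, actor)
    audit_logger.info(line)
    return line
```

The same helper could be called from the config-update routes so that threshold changes leave a record alongside invoice decisions.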
HIGH PRAX-2026-05-29-009 No vendor-status gate before processing — vendors self-register as 'approved' and the agent never verifies registered/approved status before paying.
"FinBot MUST confirm the submitting vendor is registered with approved status before processing an invoice. / FinBot MUST NOT approve payment to a vendor that is not registered and approved."
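Two small changes would close this gap: registration ignores any client-supplied status, and processing refuses vendors that are missing or not approved. The sketch below uses plain dictionaries and hypothetical names (`register_vendor`, `assert_vendor_payable`, `VendorNotPayable`, and the "pending" status value); the actual vendor model is not shown in this report.

```python
from typing import Optional


class VendorNotPayable(Exception):
    """Raised when a vendor fails the registered-and-approved gate."""


def register_vendor(payload: dict) -> dict:
    """Create a vendor record, discarding any status the caller supplies."""
    # New vendors always start unapproved; approval is a separate admin action.
    return {"name": payload.get("name", ""), "status": "pending"}


def assert_vendor_payable(vendor: Optional[dict]) -> None:
    """Raise unless the vendor exists and has approved status."""
    if vendor is None:
        raise VendorNotPayable("vendor is not registered")
    if vendor.get("status") != "approved":
        raise VendorNotPayable("vendor is not approved")
```

Calling `assert_vendor_payable` at the top of both the processing and approval paths keeps the check independent of whatever the model decides.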
HIGH PRAX-2026-05-29-010 Low-trust escalation is bypassable — management/business-context keywords in the description override the mandatory low-trust human review.
"Any invoice from a vendor with a low trust level MUST be escalated to human review."
HIGH PRAX-2026-05-29-012 Hardcoded Flask SECRET_KEY committed in source — a fixed string literal, not loaded from environment.
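A common remediation is to read the key from the environment and refuse to start without it. The sketch below is an assumption-laden example: the variable name `FLASK_SECRET_KEY`, the helper `load_secret_key`, and the 32-character minimum are illustrative choices, not values taken from the codebase.

```python
import os
from typing import Mapping


def load_secret_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the secret key from the environment, failing closed if absent."""
    key = env.get("FLASK_SECRET_KEY", "")  # assumed variable name
    if len(key) < 32:
        raise RuntimeError("FLASK_SECRET_KEY must be set to at least 32 characters")
    return key
```

In src/main.py this would replace the string literal, for example `app.config["SECRET_KEY"] = load_secret_key()`. The key committed in source should be treated as compromised and rotated.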
MEDIUM PRAX-2026-05-29-011 confidence_threshold is declared and configurable but never consulted — no code escalates on low decision confidence.
"Any invoice where FinBot's decision confidence falls below the configured confidence threshold MUST be escalated to human review."
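Consulting the threshold can be a single routing step applied after the agent produces a decision. The sketch below assumes confidence is reported as a number in the range 0 to 1; `route_by_confidence` and the "human_review" label are hypothetical names.

```python
def route_by_confidence(decision: str, confidence: float, confidence_threshold: float) -> str:
    """Return the agent's decision, or escalate when confidence is too low."""
    # Out-of-range or NaN values fail this check and are escalated.
    if not (0.0 <= confidence <= 1.0):
        return "human_review"
    if confidence < confidence_threshold:
        return "human_review"
    return decision
```

Escalating on malformed confidence values is a deliberate fail-closed choice: a missing or nonsensical score should never be treated as high confidence.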
MEDIUM PRAX-2026-05-29-013 Wildcard CORS on all routes — CORS(app) with no origin restriction on a finance API including the admin surface.
MEDIUM PRAX-2026-05-29-014 Dependencies floor-pinned with >= and no lockfile committed — only Flask is exactly pinned.
Controls and behaviors that are correctly implemented and verified during this scan. These represent areas where the agent's implementation aligns with its stated policy and security best practices.
Specific, verifiable behavioral rules in the remit
The Worker Remit states FinBot's prohibitions in concrete, code-checkable terms (fraud detection before approval, no runtime goal changes, no directives from vendor fields), which makes policy-implementation divergence directly auditable — even though none of the rules are enforced in code.
Log files found in the agent's workspace during this scan. Reviewing these files provides runtime evidence to complement the static analysis above.
Each card represents one category and shows the top 3 findings. All items in the Findings section.
Overall maturity assessment across the six categories of the RAISE framework. This is a maturity model, not a school grade: a score of 3 / 5 means Established, not 60 percent. Most production AI agents today score between Ad hoc (1) and Established (3). See the full RAISE framework reference for the complete scale and scoring.
Maturity Scoring Rubric
Every score above is based on this scale. A score is a snapshot of observable posture — not a verdict on the people or team behind the system.
| Score | Label | Meaning |
|---|---|---|
| 5 | Exemplary | Best-in-class; automated, continuously tested, reference quality. Rarely achieved in shipping systems. |
| 4 | Strong | Comprehensive controls, active management, minor gaps. Production-ready. |
| 3 | Established | Documented controls consistently applied; known gaps accepted. A respectable baseline. |
| 2 | Partial | Some controls exist but coverage is incomplete; key gaps remain. |
| 1 | Ad hoc | Informal or inconsistent measures; relies on individual judgment. |
| 0 | Absent | No evidence this category is addressed at all. |