Agent risk assessment framework
Executive-grade guidance for organisations that need to adopt agentic AI safely, calmly and at scale.
Provide clear, consistent rules and responsibilities that can be enforced and audited across the organisation
Prevent, detect, and contain unsafe actions and enable repeatable assurance at scale
Identify and manage residual risk through the lifecycle and in live operation
Ensure ownership, rapid decision-making, and defensible accountability when autonomous actions affect customers or operations
Maintain compliance and reduce audit, enforcement, and reputational risk
Make policy compliance consistent and scalable despite frequent changes to bundles, tools, access, and configuration
Prioritise safe, high-value work and prove benefits while controlling risk
Speed decisions and maintain executive oversight of agent fleets and associated risks
Risk management: Extend model risk assessment to include action risk, control dependency risk, systemic interaction risk (multi-agent/tool chains), and operational resilience risk
Policies: Extend acceptable use into explicit autonomy tiers, defined agent red lines, permitted action classes, tool boundaries, and approval thresholds (and link these to enforcement mechanisms in architecture/policy enforcement)
Policy enforcement: Enforce constraints on agent actions and tool calls at runtime (allowlists, thresholds, jurisdiction, data sensitivity) rather than through post-hoc reviews; implement via a policy decision point that intercepts tool invocations and configuration changes
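A minimal sketch of such a policy decision point, assuming an illustrative allowlist, spend cap, and jurisdiction check; all tool names, field names, and thresholds are invented for the example, not part of the framework.

```python
from dataclasses import dataclass

# Illustrative runtime policy decision point (PDP): every tool call is
# checked against an allowlist and per-action thresholds before execution,
# and denied by default on any rule breach.
ALLOWED_TOOLS = {"search_kb", "draft_email", "issue_refund"}  # assumed tool names
SPEND_CAP_GBP = 250.0  # assumed per-action spend threshold

@dataclass
class ToolCall:
    tool: str
    spend_gbp: float = 0.0
    jurisdiction: str = "UK"

def authorise(call: ToolCall) -> tuple[bool, str]:
    """Return (allowed, reason) for a proposed tool invocation."""
    if call.tool not in ALLOWED_TOOLS:
        return False, f"tool '{call.tool}' not on allowlist"
    if call.spend_gbp > SPEND_CAP_GBP:
        return False, f"spend {call.spend_gbp} exceeds cap {SPEND_CAP_GBP}"
    if call.jurisdiction not in {"UK", "EU"}:
        return False, f"jurisdiction '{call.jurisdiction}' not permitted"
    return True, "authorised"
```

The point of the pattern is that the check sits in the invocation path itself, so a breach is blocked before the action executes rather than surfaced in a later review.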
Controls: Move from prompt tests to agent evaluations including tool use, multi-step plans, trajectory correctness, and failure-mode behaviour
Operating model and committees: Clarify which committees approve what for autonomous actions and cross-domain tool access, beyond standard GenAI oversight
Accountability: Define who approves autonomy levels, tool integrations, policy exceptions, and go-live, not just who owns the model
Portfolio prioritisation and funding: Expand portfolio triage to include agentic risk indicators (autonomy, tool access, action impact, resilience dependencies) and value indicators (throughput, cycle time, quality uplift), and use it to drive the required mitigations and controls
Compliance: Map agent workflows and actions to applicable rules (recordkeeping, Consumer Duty, operational resilience, etc), not only model use
Accountability: Establish traceability linking each agent bundle and action class to accountable humans and committees
Portfolio prioritisation and funding: Track realised value against expected value, and the incremental value attributable to autonomy (cycle time, throughput, error reduction, avoided escalations), alongside safety KRIs (incidents, overrides, drift)
Operating model and committees: Expand the existing AI/Responsible AI committee remit to include agentic AI and review membership to reflect deeper IT and operational involvement
Compliance: Move from “how was this answer generated” to “why did the agent act”, with action-level rationales and evidence links (supported by technology instrumentation)
Risk management: Shift from static output bias checks to monitoring the fairness of autonomous decisions and their impacts (allocations, prioritisation, service levels) over time
Controls: Define where humans must approve or override autonomous actions (by action class and impact), not merely review generated content
Policies: Update the broader policy suite impacted by agents (eg responsible AI/data ethics, AI usage, privacy, cyber, resilience/operational risk, third-party risk, data quality, model risk/validation) and align definitions and requirements across them
Policy enforcement: Codify the most operational policies (autonomy tiers/red lines, privacy, security/tool access, resilience, validation/evals, recordkeeping), plus the SOPs that operationalise them, into versioned machine-readable rules
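One way such rules can become machine-readable is as versioned data records that a policy engine looks up at runtime. The schema, field names, and lookup logic below are assumptions for the sketch, not a standard.

```python
from typing import Optional

# Illustrative versioned, machine-readable policy rules: later versions of
# the same rule supersede earlier ones, so tightening a limit is a data
# change with an audit trail, not a code change.
RULES = [
    {
        "id": "payments-autonomy-limit",
        "version": "3",
        "applies_to": "action_class:payments",
        "max_autonomy_tier": 2,
        "human_approval_above_gbp": 100.0,
    },
    {
        "id": "payments-autonomy-limit",
        "version": "4",  # supersedes version 3 with a tighter limit
        "applies_to": "action_class:payments",
        "max_autonomy_tier": 1,
        "human_approval_above_gbp": 50.0,
    },
]

def current_rule(action_class: str) -> Optional[dict]:
    """Return the highest-versioned rule for an action class, if any."""
    matches = [r for r in RULES if r["applies_to"] == f"action_class:{action_class}"]
    return max(matches, key=lambda r: int(r["version"])) if matches else None
```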
Controls: Expand logs from prompts and outputs to full action traces (inputs, tools, decisions, state, approvals) with consistent identifiers across systems
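A sketch of what one such action-trace record might look like, assuming a JSON-lines format; the field names and decision values are illustrative, not a prescribed schema.

```python
import json
import uuid
from datetime import datetime, timezone

def trace_event(run_id, step, tool, inputs, decision, approved_by=None):
    """Emit one action-trace record as a JSON line. The run_id is the
    consistent identifier carried on every record across systems, so a
    full trajectory (inputs, tools, decisions, approvals) can be
    reassembled later."""
    return json.dumps({
        "run_id": run_id,      # same id on every record for this agent run
        "step": step,
        "ts": datetime.now(timezone.utc).isoformat(),
        "tool": tool,
        "inputs": inputs,
        "decision": decision,  # eg "executed", "blocked", "escalated"
        "approved_by": approved_by,
    })

run_id = str(uuid.uuid4())
line = trace_event(run_id, 1, "issue_refund", {"amount_gbp": 40}, "executed", "j.smith")
```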
Compliance: Maintain continuously updated evidence (policies, tests, logs, approvals, monitoring), enabled by automated evidence pipelines
Accountability: Update existing escalation paths to include agent boundary violations, kill-switch invocation, and tool misuse scenarios
Risk management: Upgrade incident playbooks to include agent disablement, tool credential rotation, rollback of bundles, and customer remediation
Policy enforcement: Create rapid update mechanisms, with testing and controlled rollout, for policy rules and constraints
Operating model and committees: Define decision flows for go-live approvals, exception approvals, and incident decisions, while preserving the ability to use ad hoc escalation when needed
Policies: Move from static principles to scenario-based rules for autonomous choices and trade-offs (eg prioritisation, customer impact, escalation)
Controls: Expand the enterprise control library to include agent-specific controls (ring-fencing, tool governance, eval thresholds, monitoring) and automate control selection and application by risk tier and agent classification
Operating model and committees: Shift ownership from “IT and data science innovation” to accountable business owners for agent outcomes and customer impact (with IT/data as delivery partners)
Policies: Expand compliance definitions from “AI use case” to explicit objects (agents, agent bundles, foundation models, connectors, tools, and AI platforms) and define the required records, controls, and responsibilities per object
Risk management: Define minimum monitoring for agent fleets (KPIs, KRIs, alert thresholds, evidencing, sampling regimes, response SLAs) and how it is operated
Policy enforcement: Define how exceptions are requested, approved, time-bound, monitored, and automatically revoked when expired, supported by workflow tooling and policy engines
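A minimal sketch of a time-bound exception register under these assumptions: an exception is simply inactive (implicitly revoked) once its expiry passes, so no separate revocation step can be forgotten. All identifiers are illustrative.

```python
from datetime import datetime, timedelta, timezone

class ExceptionRegister:
    """Illustrative register of policy exceptions, each time-bound at grant."""

    def __init__(self):
        self._grants = {}

    def grant(self, exception_id, approver, days):
        # Every exception carries an approver and a hard expiry from day one.
        self._grants[exception_id] = {
            "approver": approver,
            "expires": datetime.now(timezone.utc) + timedelta(days=days),
        }

    def is_active(self, exception_id, now=None):
        grant = self._grants.get(exception_id)
        if grant is None:
            return False
        now = now or datetime.now(timezone.utc)
        return now < grant["expires"]  # expired grants are implicitly revoked
```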
Accountability: Tie objectives to safe autonomy outcomes (quality, controls adherence, incident rates), not just delivery
Controls: Extend observability to include autonomy patterns, tool selection, boundary hits, override rates, and drift triggers, not only model accuracy
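Two of these indicators (boundary hits and override rate) can be computed directly from action-trace records; the event shape and decision labels below are assumptions for the sketch.

```python
# Illustrative fleet indicators beyond model accuracy, derived from
# action-trace records rather than from model outputs.
events = [
    {"decision": "executed"},
    {"decision": "blocked"},     # agent hit a policy boundary
    {"decision": "overridden"},  # a human overrode the agent's action
    {"decision": "executed"},
]

boundary_hits = sum(1 for e in events if e["decision"] == "blocked")
override_rate = sum(1 for e in events if e["decision"] == "overridden") / len(events)
```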
Policy enforcement: Stand up a governance platform to manage agent inventory and bundles, policies, approvals, monitoring views, and evidence automation (distinct from a “control hub” control library)
Operating model and committees: Add operational rhythms for agent fleets (observability, KRIs, incidents, drift, overrides, value metrics), not occasional reporting
Risk management: Move from prompt edge cases to simulation of real workflows, adversarial tool inputs, cascading failures, and boundary violations (ideally executed in a controlled sandbox)
Policies: Update supplier due diligence and contract clauses for agent connectors and tools (data use, logging, breach handling, residency, sub-processors, change notification, audit rights)
Policies: Update risk appetite to include measurable autonomy limits (impact thresholds, decision classes, spend caps, customer harm tolerance, override requirements)
Controls: Govern and test degrade modes (stop, revert to human, limited autonomy), not just infrastructure failover
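The degrade modes named above can be sketched as an ordered ladder that an agent steps down one mode per trigger, with a kill switch that forces a full stop; the mode names and step-down rule are assumptions for the example.

```python
from enum import Enum

class Mode(Enum):
    FULL_AUTONOMY = "full"
    LIMITED_AUTONOMY = "limited"   # low-impact actions only
    HUMAN_IN_LOOP = "human"        # every action requires approval
    STOPPED = "stopped"

# Degrade ladder: each trigger steps the agent down one mode; only the
# kill switch jumps straight to STOPPED from any mode.
LADDER = [Mode.FULL_AUTONOMY, Mode.LIMITED_AUTONOMY, Mode.HUMAN_IN_LOOP, Mode.STOPPED]

def degrade(current: Mode, kill_switch: bool = False) -> Mode:
    if kill_switch:
        return Mode.STOPPED
    i = LADDER.index(current)
    return LADDER[min(i + 1, len(LADDER) - 1)]
```

Testing this ladder, not just infrastructure failover, means exercising each transition (including the kill switch) in rehearsals.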
Policy enforcement: Log policy versions, decisions, enforcement outcomes, and overrides across multiple policy domains (privacy, cyber, resilience, responsible AI) with consistent identifiers
Risk management: Combine pre-mortems and unintended-consequence scanning into structured ceremonies (ethical purpose, explainability, and control-breakdown workshops) focused on autonomous action pathways
Policies: Extend from “model owner” to defined owners for foundation models, agent service/bundle, solution design, infrastructure design, connector use, and tool ownership, with clear obligations
Policies: Extend retention to agent memory, action traces, and tool outputs, aligned to privacy, evidencing, and dispute requirements