Model Risk Management 2.0: Translating MRM Principles to Generative AI
TL;DR: Traditional Model Risk Management (MRM) still applies—governance, validation, monitoring, change control—but generative AI (GenAI) adds new moving parts (prompts, retrieval, agents, tool-calls, content risks) and faster vendor updates. This playbook adapts core MRM to GenAI so you can ship value while staying audit-ready.
1) What “counts as a model” in GenAI (broaden your inventory)
In GenAI, the “model” is not just the LLM. Inventory all risk-bearing components:
- Base model (proprietary or open-weight) + version
- Fine-tunes / adapters (LoRA, instruction datasets)
- Retrieval layer (embeddings, vector index, ranking rules, data refresh cadence)
- Prompt system (system prompts, templated prompts, guardrails)
- Agents & tools (function-calling, RPA actions, external APIs)
- Safety filters (PII/toxicity detectors, policy blocks)
- Post-processors (compilers, format validators, redactors)
MRM principle: if a change to a component can change an output that affects a decision, that component belongs in scope.
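One concrete way to operationalize this scope rule is a single inventory record per use case that names every risk-bearing component. A minimal sketch in Python; the field names and example values are illustrative, not a standard schema:

```python
from dataclasses import dataclass, field

# Sketch of a per-use-case inventory record covering the components above.
# Field names are assumptions for illustration, not a standard schema.
@dataclass
class GenAIInventoryRecord:
    use_case_id: str
    base_model: str                                     # pinned version string
    fine_tunes: list = field(default_factory=list)      # adapter / LoRA IDs
    retrieval: dict = field(default_factory=dict)       # index, ranking rules, refresh SLA
    prompts: list = field(default_factory=list)         # versioned prompt IDs
    tools: list = field(default_factory=list)           # allowed function calls / APIs
    safety_filters: list = field(default_factory=list)  # PII / toxicity / policy blocks
    post_processors: list = field(default_factory=list) # validators, redactors
    materiality: str = "medium"                         # low / medium / high

record = GenAIInventoryRecord(
    use_case_id="claims-summary-01",
    base_model="vendor-model@2024-06-01",
    prompts=["system-prompt-v12"],
    tools=["lookup_policy"],
)
```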
2) What actually changed vs. classic predictive models
- Open-world inputs: Free-text prompts and attachments → higher unpredictability.
- Non-determinism: Temperature, context, and vendor drift introduce variance.
- Content risk: Hallucination, toxicity, private-data leakage, copyright misuse.
- Autonomy: Agents take actions (book refunds, schedule meetings) → operational risk.
- Vendor velocity: Models update silently; “same endpoint, new behavior.”
Implication: Controls must cover content quality and action safety, not just accuracy.
3) A GenAI MRM lifecycle (mapped to familiar controls)
A) Governance & Materiality
- Intended use: Describe decisions supported, users, and prohibited uses.
- Materiality score (Low/Med/High) across:
  - Impact (financial, customer, regulatory)
  - Exposure (volume, autonomy)
  - Data sensitivity (PII/PHI/PCI/IP)
- Owner & approver: Business owner accepts residual risk; Risk/Compliance signs off.
Evidence auditors expect: use-case brief, materiality worksheet, RACI.
B) Design & Data Controls
- Data minimization: Only inject what’s needed; mask/redact PII by default.
- Retrieval hygiene: Source-of-truth locations, freshness SLAs, dedup rules.
- Policy boundaries: Disallow high-risk actions or require human-in-the-loop (HITL) approval.
Evidence: data-flow diagram, field-level access matrix, redaction tests, retrieval freshness dashboard.
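Redaction-by-default can start small, even before a dedicated PII service is in place. A toy sketch of a masking pass run on any text before it enters a prompt; the two patterns below are illustrative only, and production systems should use a proper PII detector:

```python
import re

# Toy redaction pass applied before any text is injected into a prompt.
# These two patterns are illustrative; use a dedicated PII detector in production.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def redact(text: str) -> str:
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label} REDACTED]", text)
    return text

print(redact("Reach me at jane.doe@example.com, SSN 123-45-6789."))
# -> "Reach me at [EMAIL REDACTED], SSN [SSN REDACTED]."
```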
C) Development & Validation
Create a validation plan that mixes quantitative and qualitative checks:
- Golden set (100–500 real tasks) with expected outputs and allowed variance.
- Quality metrics: factual grounding rate, citation correctness, completeness, format validity.
- Risk metrics: hallucination rate, PII leakage detection, toxicity, bias/fairness screens.
- Robustness: prompt-injection tests, jailbreak attempts, out-of-domain inputs.
- Action safety: dry-run tools, idempotent APIs, pre-commit previews, rollback.
Evidence: test harness results, red-team report, sign-off memo.
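The golden set pays off when it is executable. A minimal sketch of a regression gate, assuming a `generate(prompt)` entry point into your pipeline and a per-task `check` predicate encoding the expected output and allowed variance (all names are placeholders):

```python
# Minimal golden-set regression gate. `generate` stands in for a call into
# your pipeline; each case's `check` encodes expected output and allowed
# variance. All names here are illustrative.
GOLDEN_SET = [
    {"prompt": "Summarize policy P-102 in under 150 words.",
     "check": lambda out: "P-102" in out and len(out.split()) <= 150},
    # ...100-500 real tasks in practice
]
PASS_THRESHOLD = 0.95

def run_regression(generate) -> bool:
    passed = sum(1 for case in GOLDEN_SET if case["check"](generate(case["prompt"])))
    rate = passed / len(GOLDEN_SET)
    print(f"golden-set pass rate: {rate:.1%}")
    return rate >= PASS_THRESHOLD  # gate releases and vendor updates on this
```

The same gate re-runs on every prompt, retrieval, or vendor change (see section E).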
D) Implementation & Monitoring
- Controls at runtime:
  - Routing: default to a small language model (SLM); escalate to a larger model (LLM) only when needed.
  - Guardrails: allow/deny tool lists, sandboxed actions, rate limits.
  - HITL: approvals for irreversible actions; diff view before commit.
- Observability: log prompt, context snapshot, retrieved sources, tool calls, outputs, human edits.
- Live metrics: p95 latency, cost/task, escalation rate, grounding %, incident count.
Evidence: runbook, dashboards, immutable traces, support/incident procedures.
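The routing and observability controls compose naturally. A sketch of the pattern, assuming placeholder `call_slm` / `call_llm` functions and a confidence heuristic supplied by your stack:

```python
import json
import time
import uuid

# Sketch of routing plus tracing: default to a small model, escalate when a
# confidence heuristic fails, and emit a trace for every task.
# `call_slm`, `call_llm`, and `confident` are placeholders.
def handle_task(prompt, call_slm, call_llm, confident):
    trace = {"trace_id": str(uuid.uuid4()), "ts": time.time(), "prompt": prompt}
    output = call_slm(prompt)
    trace["route"] = "slm"
    if not confident(output):
        output = call_llm(prompt)  # escalate only when needed
        trace["route"] = "slm->llm"
    trace["output"] = output
    print(json.dumps(trace))       # ship to an append-only store in practice
    return output
```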
E) Change Management (the GenAI hotspot)
Treat prompts, policies, retrieval configs, and model versions as code:
- Versioning: every change gets an ID, a diff, tests, an approver, and a rollback plan.
- Vendor updates: record upstream model hash/version; re-run golden set; pause/roll back if regressions.
- Material changes trigger revalidation (e.g., new data source, autonomy level, audience).
Evidence: changelog, approvals, regression results, rollback artifacts.
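Vendor drift detection reduces to the same gate. A sketch, reusing the `run_regression` gate from section C and assuming the vendor exposes a queryable version string:

```python
# Sketch of a vendor-update gate: compare the reported model version to the
# pinned one, re-run the golden set on change, and refuse promotion on
# regression. `get_vendor_version` is a placeholder; `run_regression` is the
# gate sketched in section C.
def gate_vendor_update(get_vendor_version, run_regression, pinned: str) -> str:
    current = get_vendor_version()
    if current == pinned:
        return pinned  # nothing changed
    print(f"vendor changed: {pinned} -> {current}; re-running golden set")
    if not run_regression():
        print("regression detected; pausing on pinned version")
        return pinned  # pause / roll back
    return current     # promote and re-pin
```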
F) Documentation (right-sized, not encyclopedic)
- System card (how the whole pipeline works)
- Model card (intended use, limits, known risks, eval results)
- User guidance (what to review, when to escalate)
- Data sheets (sources, retention, masking)
Evidence: PDFs or wiki pages linked to each deployment record.
4) Practical control checklist (copy/paste into your tracker)
Policy & Governance
- AI Acceptable Use Policy updated for GenAI
- Use-case intake with owner, materiality, prohibited uses
- TPRM (third-party risk management) completed for vendors/endpoints
Data & Privacy
- PII redaction and field-level access enforced
- Retrieval data cataloged; freshness and dedup rules defined
- No training on customer data without explicit approval
Validation & Safety
- Golden set + success criteria defined
- Hallucination, toxicity, bias, injection tests passed thresholds
- Tool actions simulated; HITL gates for irreversible operations
Monitoring & Incidents
- Tracing enabled end-to-end (prompt → tools → output)
- Alerts for grounding drop, cost spikes, error bursts
- Incident playbook for data leak or unsafe action
Change Control
- Prompts, retrieval, and models under version control
- Vendor/model updates logged; regression run required
- Rollback path tested quarterly
5) Scoring materiality & deciding review depth
Simple 3×3 grid (score 1–3 each):
- Impact: 1 = cosmetic text; 2 = customer communications; 3 = money/health/legal
- Exposure: 1 = low volume, suggestions; 2 = medium volume, HITL; 3 = high volume, autonomous actions
- Data: 1 = public; 2 = internal; 3 = regulated/PII/code/IP
- 3–4 (Low): lightweight validation, monthly monitoring
- 5–6 (Medium): golden set + red team + monthly drift checks
- 7–9 (High): full validation, HITL, weekly monitoring, quarterly revalidation
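The grid is small enough to encode directly, which keeps intake decisions consistent. A minimal transcription:

```python
# Direct transcription of the grid above: each dimension scores 1-3,
# and the sum picks the review depth.
def review_depth(impact: int, exposure: int, data: int) -> str:
    total = impact + exposure + data
    if total <= 4:
        return "Low: lightweight validation, monthly monitoring"
    if total <= 6:
        return "Medium: golden set + red team + monthly drift checks"
    return "High: full validation, HITL, weekly monitoring, quarterly revalidation"

# Customer communications, medium volume with HITL, internal data:
print(review_depth(impact=2, exposure=2, data=2))  # -> Medium: ...
```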
6) KPIs & KRIs to prove control (and value)
- Grounding rate (% claims with citations)
- Contradiction rate (model vs. retrieved evidence)
- Hallucination rate (unsupported statements)
- HITL adherence (% irreversible actions with approval)
- Escalation rate (to LLM/human) & trend
- Unit cost per completed task (inference + API + ops)
- Bias indicators (domain-appropriate tests)
- Incident count & time-to-contain
- Business value proxy (edit distance/time saved/first-contact resolution)
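Most of these metrics reduce to simple ratios over labeled traces once tracing is in place. A toy sketch for grounding rate (claim extraction itself is the hard part and is assumed done upstream):

```python
# Toy grounding-rate computation over labeled traces: the fraction of
# extracted claims that carry at least one citation.
def grounding_rate(claims: list) -> float:
    cited = sum(1 for c in claims if c.get("citations"))
    return cited / len(claims) if claims else 0.0

claims = [
    {"text": "Policy P-102 covers flood damage.", "citations": ["doc-7#p3"]},
    {"text": "Coverage starts immediately.", "citations": []},
]
print(f"grounding rate: {grounding_rate(claims):.0%}")  # -> 50%
```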
7) Third-party AI risk: five clauses you shouldn’t skip
- Data use & retention: no training on your data; default deletion after processing; verified purge on request.
- Inference isolation: VPC or tenant isolation; documented side-channel protections.
- Security posture: encryption, keys, breach windows, pen-test cadence, independent assurance (e.g., SOC/ISO).
- Model transparency: version pinning, change logs, incident reporting obligations.
- Exit & portability: export of prompts, logs, fine-tunes, embeddings; reasonable assistance to migrate.
8) 30/60/90-day rollout plan
Days 0–30 — Baseline & guardrails
- Stand up a use-case intake and inventory (include prompts/retrieval).
- Publish a one-page AI AUP and HITL standard.
- Build a golden set for your top two GenAI use cases.
- Enable tracing and cost/latency dashboards.
Days 31–60 — Validate & monitor
- Run full validation (quality + risk + robustness) for those use cases.
- Add PII redaction and allow/deny tool lists.
- Define regression thresholds; wire vendor version checks.
- Start monthly drift reviews with business and risk.
Days 61–90 — Industrialize
- Put prompts/retrieval/configs into version control with approvals + rollback.
- Integrate TPRM clauses with top vendors; pin versions where possible.
- Expand golden set and coverage to next 3–5 use cases.
- Publish system/model cards and the MRM operating procedure.
9) Common pitfalls (and simple fixes)
- Treating prompts as “not code.”
  Fix: version, test, review, and roll back prompts like code.
- No golden set = no regression alarm.
  Fix: build a small but real test suite; update it quarterly.
- HITL omitted for speed.
  Fix: require approvals only for irreversible actions; auto-approve safe paths.
- Over-centralizing approvals.
  Fix: delegate within guardrails; focus risk review on high-materiality changes.
- Opaque vendor updates.
  Fix: contract for version visibility, keep a shadow canary set, and pause on regressions.
10) Templates (steal this language)
Use-Case Statement (1 page)
- Goal & decision supported
- Users & prohibited uses
- Materiality score & rationale
- Data classes & masking rules
- Controls: HITL, guardrails, tools allowed
- KPIs/KRIs & thresholds
- Rollback & contingency
Release Note (per change)
- Change ID + description
- Affected components (prompt, retrieval, base model, tool)
- Test results vs. thresholds (pass/fail)
- Approver & date
- Rollback steps & owner
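The release note also works as a machine-readable record, so changelog tooling and the regression gate can check approvals and test results automatically. An illustrative sketch with the same fields (all values are made up):

```python
# The release-note template as a machine-readable record; keys mirror the
# template above and all values are illustrative.
release_note = {
    "change_id": "CHG-0117",
    "description": "Tighten citation instruction in system prompt",
    "components": ["prompt:system-prompt-v13"],
    "tests": {"golden_set_pass_rate": 0.97, "threshold": 0.95, "result": "pass"},
    "approver": "risk.officer@example.com",
    "approved_on": "2025-06-01",
    "rollback": {"steps": "revert to system-prompt-v12", "owner": "ml-platform"},
}
```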
Bottom line
MRM for GenAI is not a new bureaucracy—it’s a tighter loop: inventory → validate → monitor → change-control, with special attention to content risks and action safety. Do the basics well, keep evidence close, and you’ll move faster and safer—because you’ll know exactly what changed, why it’s acceptable, and how to reverse it when it isn’t.