Model Risk Management 2.0: Translating MRM Principles to Generative AI
TL;DR: Traditional Model Risk Management (MRM) still applies—governance, validation, monitoring, change control—but generative AI (GenAI) adds new moving parts (prompts, retrieval, agents, tool-calls, content risks) and faster vendor updates. This playbook adapts core MRM to GenAI so you can ship value while staying audit-ready.
1) What “counts as a model” in GenAI (broaden your inventory)
In GenAI, the “model” is not just the LLM. Inventory all risk-bearing components:
- Base model (proprietary or open-weight) + version
- Fine-tunes / adapters (LoRA, instruction datasets)
- Retrieval layer (embeddings, vector index, ranking rules, data refresh cadence)
- Prompt system (system prompts, templated prompts, guardrails)
- Agents & tools (function-calling, RPA actions, external APIs)
- Safety filters (PII/toxicity detectors, policy blocks)
- Post-processors (compilers, format validators, redactors)
MRM principle: if a change to a component can change an output that affects a decision, that component belongs in scope.
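One concrete way to operationalize this scope rule is a single inventory record per use case that names every risk-bearing component. A minimal sketch in Python; the field names and example values are illustrative, not a standard schema:

```python
from dataclasses import dataclass, field

# Sketch of a per-use-case inventory record covering the components above.
# Field names are assumptions for illustration, not a standard schema.
@dataclass
class GenAIInventoryRecord:
    use_case_id: str
    base_model: str                                     # pinned version string
    fine_tunes: list = field(default_factory=list)      # adapter / LoRA IDs
    retrieval: dict = field(default_factory=dict)       # index, ranking rules, refresh SLA
    prompts: list = field(default_factory=list)         # versioned prompt IDs
    tools: list = field(default_factory=list)           # allowed function calls / APIs
    safety_filters: list = field(default_factory=list)  # PII / toxicity / policy blocks
    post_processors: list = field(default_factory=list) # validators, redactors
    materiality: str = "medium"                         # low / medium / high

record = GenAIInventoryRecord(
    use_case_id="claims-summary-01",
    base_model="vendor-model@2024-06-01",
    prompts=["system-prompt-v12"],
    tools=["lookup_policy"],
)
```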
2) What actually changed vs. classic predictive models
- Open-world inputs: Free-text prompts and attachments → higher unpredictability.
- Non-determinism: Temperature, context, and vendor drift introduce variance.
- Content risk: Hallucination, toxicity, private-data leakage, copyright misuse.
- Autonomy: Agents take actions (book refunds, schedule meetings) → operational risk.
- Vendor velocity: Models update silently; “same endpoint, new behavior.”
Implication: Controls must cover content quality and action safety, not just accuracy.
3) A GenAI MRM lifecycle (mapped to familiar controls)
A) Governance & Materiality
- Intended use: Describe decisions supported, users, and prohibited uses.
- Materiality score (Low/Med/High) across:
  - Impact (financial, customer, regulatory)
  - Exposure (volume, autonomy)
  - Data sensitivity (PII/PHI/PCI/IP)
- Owner & approver: Business owner accepts residual risk; Risk/Compliance signs off.
Evidence auditors expect: use-case brief, materiality worksheet, RACI.
B) Design & Data Controls
- Data minimization: Only inject what’s needed; mask/redact PII by default.
- Retrieval hygiene: Source-of-truth locations, freshness SLAs, dedup rules.
- Policy boundaries: Disallow high-risk actions or require human-in-the-loop (HITL) approval.
Evidence: data-flow diagram, field-level access matrix, redaction tests, retrieval freshness dashboard.
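Redaction-by-default can start small, even before a dedicated PII service is in place. A toy sketch of a masking pass run on any text before it enters a prompt; the two patterns below are illustrative only, and production systems should use a proper PII detector:

```python
import re

# Toy redaction pass applied before any text is injected into a prompt.
# These two patterns are illustrative; use a dedicated PII detector in production.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def redact(text: str) -> str:
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label} REDACTED]", text)
    return text

print(redact("Reach me at jane.doe@example.com, SSN 123-45-6789."))
# -> "Reach me at [EMAIL REDACTED], SSN [SSN REDACTED]."
```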
C) Development & Validation
Create a validation plan that mixes quantitative and qualitative checks:
- Golden set (100–500 real tasks) with expected outputs and allowed variance.
- Quality metrics: factual grounding rate, citation correctness, completeness, format validity.
- Risk metrics: hallucination rate, PII leakage detection, toxicity, bias/fairness screens.
- Robustness: prompt-injection tests, jailbreak attempts, out-of-domain inputs.
- Action safety: dry-run tools, idempotent APIs, pre-commit previews, rollback.
Evidence: test harness results, red-team report, sign-off memo.
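The golden set pays off when it is executable. A minimal sketch of a regression gate, assuming a `generate(prompt)` entry point into your pipeline and a per-task `check` predicate encoding the expected output and allowed variance (all names are placeholders):

```python
# Minimal golden-set regression gate. `generate` stands in for a call into
# your pipeline; each case's `check` encodes expected output and allowed
# variance. All names here are illustrative.
GOLDEN_SET = [
    {"prompt": "Summarize policy P-102 in under 150 words.",
     "check": lambda out: "P-102" in out and len(out.split()) <= 150},
    # ...100-500 real tasks in practice
]
PASS_THRESHOLD = 0.95

def run_regression(generate) -> bool:
    passed = sum(1 for case in GOLDEN_SET if case["check"](generate(case["prompt"])))
    rate = passed / len(GOLDEN_SET)
    print(f"golden-set pass rate: {rate:.1%}")
    return rate >= PASS_THRESHOLD  # gate releases and vendor updates on this
```

The same gate re-runs on every prompt, retrieval, or vendor change (see section E).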
D) Implementation & Monitoring
- Controls at runtime:
  - Routing: default to a small language model (SLM); escalate to a larger model (LLM) only when needed.
  - Guardrails: allow/deny tool lists, sandboxed actions, rate limits.
  - HITL: approvals for irreversible actions; diff view before commit.
- Observability: log prompt, context snapshot, retrieved sources, tool calls, outputs, human edits.
- Live metrics: p95 latency, cost/task, escalation rate, grounding %, incident count.
Evidence: runbook, dashboards, immutable traces, support/incident procedures.
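The routing and observability controls compose naturally. A sketch of the pattern, assuming placeholder `call_slm` / `call_llm` functions and a confidence heuristic supplied by your stack:

```python
import json
import time
import uuid

# Sketch of routing plus tracing: default to a small model, escalate when a
# confidence heuristic fails, and emit a trace for every task.
# `call_slm`, `call_llm`, and `confident` are placeholders.
def handle_task(prompt, call_slm, call_llm, confident):
    trace = {"trace_id": str(uuid.uuid4()), "ts": time.time(), "prompt": prompt}
    output = call_slm(prompt)
    trace["route"] = "slm"
    if not confident(output):
        output = call_llm(prompt)  # escalate only when needed
        trace["route"] = "slm->llm"
    trace["output"] = output
    print(json.dumps(trace))       # ship to an append-only store in practice
    return output
```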
E) Change Management (the GenAI hotspot)
Treat prompts, policies, retrieval configs, and model versions as code:
- Versioning: every change gets an ID, a diff, tests, an approver, and a rollback plan.
- Vendor updates: record upstream model hash/version; re-run golden set; pause/roll back if regressions.
- Material changes trigger revalidation (e.g., new data source, autonomy level, audience).
Evidence: changelog, approvals, regression results, rollback artifacts.
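Vendor drift detection reduces to the same gate. A sketch, reusing the `run_regression` gate from section C and assuming the vendor exposes a queryable version string:

```python
# Sketch of a vendor-update gate: compare the reported model version to the
# pinned one, re-run the golden set on change, and refuse promotion on
# regression. `get_vendor_version` is a placeholder; `run_regression` is the
# gate sketched in section C.
def gate_vendor_update(get_vendor_version, run_regression, pinned: str) -> str:
    current = get_vendor_version()
    if current == pinned:
        return pinned  # nothing changed
    print(f"vendor changed: {pinned} -> {current}; re-running golden set")
    if not run_regression():
        print("regression detected; pausing on pinned version")
        return pinned  # pause / roll back
    return current     # promote and re-pin
```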
F) Documentation (right-sized, not encyclopedic)
- System card (how the whole pipeline works)
- Model card (intended use, limits, known risks, eval results)
- User guidance (what to review, when to escalate)
- Data sheets (sources, retention, masking)
Evidence: PDFs or wiki pages linked to each deployment record.
4) Practical control checklist (copy/paste into your tracker)
Policy & Governance
- AI Acceptable Use Policy updated for GenAI
- Use-case intake with owner, materiality, prohibited uses
- TPRM (third-party risk management) completed for vendors/endpoints
Data & Privacy
- PII redaction and field-level access enforced
- Retrieval data cataloged; freshness and dedup rules defined
- No training on customer data without explicit approval
Validation & Safety
- Golden set + success criteria defined
- Hallucination, toxicity, bias, injection tests passed thresholds
- Tool actions simulated; HITL gates for irreversible operations
Monitoring & Incidents
- Tracing enabled end-to-end (prompt → tools → output)
- Alerts for grounding drop, cost spikes, error bursts
- Incident playbook for data leak or unsafe action
Change Control
- Prompts, retrieval, and models under version control
- Vendor/model updates logged; regression run required
- Rollback path tested quarterly
5) Scoring materiality & deciding review depth
Simple 3×3 grid (score 1–3 each):
- Impact: 1 = cosmetic text; 2 = customer communications; 3 = money/health/legal
- Exposure: 1 = low volume, suggestions; 2 = medium volume, HITL; 3 = high volume, autonomous actions
- Data: 1 = public; 2 = internal; 3 = regulated/PII/code/IP
- 3–4 (Low): lightweight validation, monthly monitoring
- 5–6 (Medium): golden set + red team + monthly drift checks
- 7–9 (High): full validation, HITL, weekly monitoring, quarterly revalidation
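The grid is small enough to encode directly, which keeps intake decisions consistent. A minimal transcription:

```python
# Direct transcription of the grid above: each dimension scores 1-3,
# and the sum picks the review depth.
def review_depth(impact: int, exposure: int, data: int) -> str:
    total = impact + exposure + data
    if total <= 4:
        return "Low: lightweight validation, monthly monitoring"
    if total <= 6:
        return "Medium: golden set + red team + monthly drift checks"
    return "High: full validation, HITL, weekly monitoring, quarterly revalidation"

# Customer communications, medium volume with HITL, internal data:
print(review_depth(impact=2, exposure=2, data=2))  # -> Medium: ...
```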
6) KPIs & KRIs to prove control (and value)
- Grounding rate (% claims with citations)
- Contradiction rate (model vs. retrieved evidence)
- Hallucination rate (unsupported statements)
- HITL adherence (% irreversible actions with approval)
- Escalation rate (to LLM/human) & trend
- Unit cost per completed task (inference + API + ops)
- Bias indicators (domain-appropriate tests)
- Incident count & time-to-contain
- Business value proxy (edit distance/time saved/first-contact resolution)
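Most of these metrics reduce to simple ratios over labeled traces once tracing is in place. A toy sketch for grounding rate (claim extraction itself is the hard part and is assumed done upstream):

```python
# Toy grounding-rate computation over labeled traces: the fraction of
# extracted claims that carry at least one citation.
def grounding_rate(claims: list) -> float:
    cited = sum(1 for c in claims if c.get("citations"))
    return cited / len(claims) if claims else 0.0

claims = [
    {"text": "Policy P-102 covers flood damage.", "citations": ["doc-7#p3"]},
    {"text": "Coverage starts immediately.", "citations": []},
]
print(f"grounding rate: {grounding_rate(claims):.0%}")  # -> 50%
```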
7) Third-party AI risk: five clauses you shouldn’t skip
- Data use & retention: no training on your data; default deletion after processing; verified purge on request.
- Inference isolation: VPC or tenant isolation; documented side-channel protections.
- Security posture: encryption, keys, breach windows, pen-test cadence, independent assurance (e.g., SOC/ISO).
- Model transparency: version pinning, change logs, incident reporting obligations.
- Exit & portability: export of prompts, logs, fine-tunes, embeddings; reasonable assistance to migrate.
8) 30/60/90-day rollout plan
Days 0–30 — Baseline & guardrails
- Stand up a use-case intake and inventory (include prompts/retrieval).
- Publish a one-page AI AUP and HITL standard.
- Build a golden set for your top two GenAI use cases.
- Enable tracing and cost/latency dashboards.
Days 31–60 — Validate & monitor
- Run full validation (quality + risk + robustness) for those use cases.
- Add PII redaction and allow/deny tool lists.
- Define regression thresholds; wire vendor version checks.
- Start monthly drift reviews with business and risk.
Days 61–90 — Industrialize
- Put prompts/retrieval/configs into version control with approvals + rollback.
- Integrate TPRM clauses with top vendors; pin versions where possible.
- Expand golden set and coverage to next 3–5 use cases.
- Publish system/model cards and the MRM operating procedure.
9) Common pitfalls (and simple fixes)
- Treating prompts as “not code.”
  Fix: version, test, review, and roll back prompts like code.
- No golden set = no regression alarm.
  Fix: build a small but real test suite; update it quarterly.
- HITL omitted for speed.
  Fix: require approvals only for irreversible actions; auto-approve safe paths.
- Over-centralizing approvals.
  Fix: delegate within guardrails; focus risk review on high-materiality changes.
- Opaque vendor updates.
  Fix: contract for version visibility, keep a shadow canary set, and pause on regressions.
10) Templates (steal this language)
Use-Case Statement (1 page)
- Goal & decision supported
- Users & prohibited uses
- Materiality score & rationale
- Data classes & masking rules
- Controls: HITL, guardrails, tools allowed
- KPIs/KRIs & thresholds
- Rollback & contingency
Release Note (per change)
- Change ID + description
- Affected components (prompt, retrieval, base model, tool)
- Test results vs. thresholds (pass/fail)
- Approver & date
- Rollback steps & owner
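The release note also works as a machine-readable record, so changelog tooling and the regression gate can check approvals and test results automatically. An illustrative sketch with the same fields (all values are made up):

```python
# The release-note template as a machine-readable record; keys mirror the
# template above and all values are illustrative.
release_note = {
    "change_id": "CHG-0117",
    "description": "Tighten citation instruction in system prompt",
    "components": ["prompt:system-prompt-v13"],
    "tests": {"golden_set_pass_rate": 0.97, "threshold": 0.95, "result": "pass"},
    "approver": "risk.officer@example.com",
    "approved_on": "2025-06-01",
    "rollback": {"steps": "revert to system-prompt-v12", "owner": "ml-platform"},
}
```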
Bottom line
MRM for GenAI is not a new bureaucracy—it’s a tighter loop: inventory → validate → monitor → change-control, with special attention to content risks and action safety. Do the basics well, keep evidence close, and you’ll move faster and safer—because you’ll know exactly what changed, why it’s acceptable, and how to reverse it when it isn’t.