Trustworthiness Measurement Catalogue — Metrics for Safety, Fairness, Robustness, Explainability, and More
NIST AI RMF Implementation Metrics & Evaluation
Key takeaways
- Trustworthiness metrics provide quantitative evidence of AI system safety, fairness, and reliability.
- Metrics must be tracked continuously across the lifecycle, not only during model training.
- All measures feed into AIMS performance evaluation (§9.1) and Management Review (§9.3).
Overview & context
Trustworthiness metrics translate abstract ethical and safety expectations into measurable, auditable indicators.
They are used to monitor, validate, and improve AI systems continuously through dashboards and audit evidence packs.
Zen AI Governance adopts a hybrid approach combining NIST AI RMF “MEASURE” guidance with ISO/IEC 42001 performance evaluation and EU AI Act Article 15 requirements.
Measurement framework
- All metrics grouped under six trustworthiness pillars: Safety, Fairness, Robustness, Explainability, Privacy, and Security.
- Each metric record includes: Name, Definition, Formula, Threshold, Frequency, Owner, Evidence Reference (see the sketch after this list).
- Results tracked in the AI Trustworthiness Dashboard and reviewed quarterly.
- Metrics aligned with ISO 42001 §9.1 (performance monitoring) and §10 (improvement).
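A minimal sketch of how one catalogue entry could be represented in the monitoring codebase. The fields mirror the record structure listed above; `metric_id`, `direction`, and the `breached` helper are illustrative assumptions, not a prescribed schema.

```python
from dataclasses import dataclass

@dataclass
class TrustMetric:
    """One entry in the trustworthiness catalogue (field names illustrative)."""
    metric_id: str     # e.g. "FAIR-01"; used to link audit evidence
    name: str
    definition: str
    formula: str       # human-readable formula, e.g. "max TPR gap across groups"
    threshold: float   # breach boundary, interpreted per `direction`
    direction: str     # "max": value must stay below threshold; "min": above
    frequency: str     # "daily", "weekly", "quarterly", ...
    owner: str         # accountable role, e.g. "ML Platform Lead"
    evidence_ref: str  # pointer into the audit evidence pack

    def breached(self, value: float) -> bool:
        """True when the observed value violates the threshold."""
        if self.direction == "max":
            return value > self.threshold
        return value < self.threshold
```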
Safety & reliability metrics
| Metric | Definition | Threshold |
|---|---|---|
| Failure Rate | Ratio of incorrect AI outputs to total predictions/actions. | < 2% |
| Recovery Time | Average time to recover from AI outage or rollback. | < 10 min |
| Human Override Success | % of oversight interventions resolving anomalies without incident escalation. | > 95% |
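As a worked example of the Failure Rate definition, a minimal check in Python; the counts are hypothetical.

```python
def failure_rate(errors: int, total: int) -> float:
    """Failure Rate = incorrect AI outputs / total predictions or actions."""
    if total == 0:
        raise ValueError("no predictions recorded")
    return errors / total

# Hypothetical week: 37 incorrect outputs out of 2,400 predictions
rate = failure_rate(37, 2400)
assert rate < 0.02, f"Failure Rate {rate:.2%} breaches the 2% threshold"
```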
Fairness & bias metrics
| Metric | Definition | Threshold |
|---|---|---|
| Demographic Parity | Probability of positive outcome equal across protected groups. | ≤ 5% disparity |
| Equal Opportunity Difference | Difference in true positive rate across groups. | ≤ 5% |
| Bias Drift Index | Change in fairness metrics over time (drift detection). | ≤ 2% / quarter |
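A sketch of how the two group-fairness gaps above can be computed with NumPy. It assumes binary labels and predictions and that every protected group contains positive examples; both are assumptions, not requirements from the catalogue.

```python
import numpy as np

def demographic_parity_gap(y_pred: np.ndarray, group: np.ndarray) -> float:
    """Largest pairwise gap in positive-outcome rate across groups (threshold: <= 0.05)."""
    rates = [float(y_pred[group == g].mean()) for g in np.unique(group)]
    return max(rates) - min(rates)

def equal_opportunity_difference(y_true: np.ndarray, y_pred: np.ndarray,
                                 group: np.ndarray) -> float:
    """Largest pairwise gap in true positive rate across groups (threshold: <= 0.05)."""
    tprs = []
    for g in np.unique(group):
        positives = (group == g) & (y_true == 1)  # assumes each group has positives
        tprs.append(float(y_pred[positives].mean()))
    return max(tprs) - min(tprs)
```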
Robustness & resilience metrics
| Metric | Definition | Threshold |
|---|---|---|
| Model Drift | Deviation of live performance vs training baseline. | < 5% |
| Input Perturbation Robustness | Accuracy variance under minor input noise. | < 3% |
| Data Poisoning Resistance | Drop in performance when attacked with synthetic corruptions. | < 5% |
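One plausible way to operationalise Input Perturbation Robustness, assuming a scikit-learn-style model exposing `predict` and Gaussian noise as the "minor input noise"; both choices are assumptions for illustration.

```python
import numpy as np

def perturbation_robustness(model, X: np.ndarray, y: np.ndarray,
                            sigma: float = 0.01, seed: int = 0) -> float:
    """Absolute accuracy change under Gaussian input noise (threshold: < 3%)."""
    rng = np.random.default_rng(seed)
    clean_acc = float((model.predict(X) == y).mean())
    noisy_acc = float((model.predict(X + rng.normal(0.0, sigma, X.shape)) == y).mean())
    return abs(clean_acc - noisy_acc)
```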
Explainability & transparency metrics
- Feature Attribution Stability: Variance in SHAP/LIME scores across retrains < 10%.
- Explanation Coverage: % of predictions accompanied by user-facing rationale ≥ 90%.
- Documentation Completeness Index: % of required fields in Model Card completed ≥ 95%.
- User Clarity Rating: Average comprehension score in usability testing ≥ 4/5.
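The Feature Attribution Stability criterion can be formalised in more than one way; the sketch below expresses it as the worst per-feature coefficient of variation of mean absolute SHAP scores across retrains. This is an assumed formalisation, not a formula prescribed by the catalogue.

```python
import numpy as np

def attribution_stability(attribs: list[np.ndarray]) -> float:
    """Max per-feature relative deviation of mean |SHAP| across retrains (threshold: < 10%).

    Each element of `attribs` is a (samples x features) SHAP matrix from one retrain.
    """
    A = np.vstack([np.abs(a).mean(axis=0) for a in attribs])  # retrains x features
    mean = A.mean(axis=0)
    safe_mean = np.where(mean > 0, mean, 1.0)  # avoid division by zero
    return float(np.max(A.std(axis=0) / safe_mean))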
Privacy & data integrity metrics
- Re-identification Risk: Probability of re-linking anonymised data ≤ 0.05%.
- Data Completeness: Missing values in dataset ≤ 2%.
- Retention Compliance: Records within defined retention period ≥ 99%.
- Access Breach Count: Zero unauthorised access incidents per quarter.
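Two of the checks above are straightforward to compute with pandas; this sketch assumes a tabular dataset and a `created` timestamp column, both illustrative.

```python
import pandas as pd

def missing_value_rate(df: pd.DataFrame) -> float:
    """Fraction of missing cells across the dataset (threshold: <= 2%)."""
    return float(df.isna().to_numpy().mean())

def retention_compliance(created: pd.Series, max_age_days: int) -> float:
    """Share of records still inside the retention window (threshold: >= 99%)."""
    age_days = (pd.Timestamp.now() - pd.to_datetime(created)).dt.days
    return float((age_days <= max_age_days).mean())
```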
Cybersecurity & attack resistance
- Vulnerability Patch Lag: Time from patch release to deployment ≤ 14 days.
- Prompt Injection Defense Rate: % of malicious inputs detected and blocked ≥ 98%.
- Model Watermark Verification: Regular validation of model integrity hash (weekly).
- Incident Response SLA: Response to model threat alerts ≤ 4 hours.
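The catalogue does not fix the hashing scheme for watermark verification; the sketch below assumes a SHA-256 digest over the serialised model artefact, compared against the hash registered at release time.

```python
import hashlib
from pathlib import Path

def file_sha256(path: Path) -> str:
    """SHA-256 digest of a model artefact, streamed in 1 MiB chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_model(path: Path, expected_hash: str) -> bool:
    """True when the deployed artefact matches the registered integrity hash."""
    return file_sha256(path) == expected_hash
```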
Implementation & dashboards
- Metrics collected automatically via ML monitoring tools (Weights & Biases, EvidentlyAI, Arize, etc.).
- Results visualised in Power BI / Looker dashboards and reviewed monthly.
- Threshold breaches automatically create corrective and preventive action (CAPA) entries in the AIMS (see the sketch after this list).
- Evidence (plots, logs, reports) linked to each metric ID for audits.
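A sketch of the threshold-to-CAPA automation, reusing the illustrative `TrustMetric` record from the measurement-framework sketch above. The module name and the CAPA hand-off are assumptions; the actual integration with the AIMS workflow is not specified by the catalogue.

```python
from metrics_catalogue import TrustMetric  # hypothetical module holding the earlier sketch

def evaluate_metrics(observed: dict[str, float],
                     catalogue: dict[str, TrustMetric]) -> list[dict]:
    """Compare observed values to thresholds; return CAPA candidate entries."""
    capa_entries = []
    for metric_id, value in observed.items():
        m = catalogue[metric_id]
        if m.breached(value):
            capa_entries.append({
                "metric_id": metric_id,
                "observed": value,
                "threshold": m.threshold,
                "owner": m.owner,
                "evidence_ref": m.evidence_ref,
            })
    return capa_entries  # handed off to the AIMS CAPA workflow (integration assumed)
```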
Checklist & references
- All six trustworthiness pillars implemented and reviewed quarterly.
- Metrics integrated into automated monitoring pipelines.
- Dashboard live with thresholds and CAPA triggers configured.
- Quarterly trend reports submitted to AI Governance Board.
- Aligned references: ISO/IEC 42001 §9.1, NIST AI RMF “MEASURE”, EU AI Act Art.15.
© Zen AI Governance UK Ltd • Regulatory Knowledge • v1 11 Nov 2025 • This page is general guidance, not legal advice.
Related Articles
- Dashboards & Governance Reporting — Metrics, KPIs, Incident Trends & Waiver Dashboards
- Transparency & User Disclosure Policy — Communication, Explainability & User Rights
- Building an AIMS End-to-End (ISO/IEC 42001:2023)
- Master AI Policy — Purpose, Roles, Requirements & Enforcement
- Risk Management Framework & Treatment Plan (Clause 6.1 — EU/UK aligned)