Trustworthiness Measurement Catalogue — Metrics for Safety, Fairness, Robustness, Explainability, and More

NIST AI RMF Implementation Metrics & Evaluation

Key takeaways
  • Trustworthiness metrics provide quantitative evidence of AI system safety, fairness, and reliability.
  • Metrics must be tracked continuously across the AI lifecycle, not only at training time.
  • All measures feed into AIMS performance evaluation (§9.1) and Management Review (§9.3).

Overview & context

Trustworthiness metrics translate abstract ethical and safety expectations into measurable, auditable indicators. They are used to monitor, validate, and improve AI systems continuously through dashboards and audit evidence packs. Zen AI Governance adopts a hybrid approach combining NIST AI RMF “MEASURE” guidance with ISO/IEC 42001 performance evaluation and EU AI Act Article 15 requirements.

Measurement framework

  • All metrics grouped under six trustworthiness pillars: Safety, Fairness, Robustness, Explainability, Privacy, and Security.
  • Each metric record includes: Name, Definition, Formula, Threshold, Frequency, Owner, Evidence Reference (sketched in code after this list).
  • Results tracked in the AI Trustworthiness Dashboard and reviewed quarterly.
  • Metrics aligned with ISO 42001 §9.1 (performance monitoring) and §10 (improvement).
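
To illustrate how a catalogue entry could be captured in a monitoring codebase, the sketch below defines a minimal record schema in Python. The `MetricRecord` class and the example values (owner, evidence ID) are hypothetical; they only mirror the field list above.

```python
from dataclasses import dataclass

@dataclass
class MetricRecord:
    """Minimal schema for one catalogue entry (hypothetical structure)."""
    name: str                 # e.g. "Failure Rate"
    definition: str           # plain-language description of what is measured
    formula: str              # how the value is computed
    threshold: str            # acceptance criterion, e.g. "< 2%"
    frequency: str            # measurement cadence, e.g. "monthly"
    owner: str                # accountable role or team
    evidence_reference: str   # ID of the linked audit evidence pack

# Illustrative entry only; the owner and evidence ID below are made up.
failure_rate_entry = MetricRecord(
    name="Failure Rate",
    definition="Ratio of incorrect AI outputs to total predictions/actions",
    formula="incorrect_outputs / total_outputs",
    threshold="< 2%",
    frequency="monthly",
    owner="ML Operations Lead",
    evidence_reference="EV-SAF-001",
)
```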

Safety & reliability metrics

Metric | Definition | Threshold
Failure Rate | Ratio of incorrect AI outputs to total predictions/actions. | < 2%
Recovery Time | Average time to recover from AI outage or rollback. | < 10 min
Human Override Success | % of oversight interventions resolving anomalies without incident escalation. | > 95%
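
As a minimal sketch, the ratio-style metrics in this table could be computed as follows; the function names and example counts are illustrative, not prescribed by the catalogue.

```python
def failure_rate(incorrect_outputs: int, total_outputs: int) -> float:
    """Share of AI outputs judged incorrect (Failure Rate)."""
    return incorrect_outputs / total_outputs

def override_success_rate(resolved_without_escalation: int, total_overrides: int) -> float:
    """Share of oversight interventions resolved without escalation (Human Override Success)."""
    return resolved_without_escalation / total_overrides

# Example: 13 incorrect outputs in 1,000 gives 1.3%, within the < 2% threshold.
assert failure_rate(13, 1_000) < 0.02
```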

Fairness & bias metrics

Metric | Definition | Threshold
Demographic Parity | Probability of positive outcome equal across protected groups. | ≤ 5% disparity
Equal Opportunity Difference | Difference in true positive rate across groups. | ≤ 5%
Bias Drift Index | Change in fairness metrics over time (drift detection). | ≤ 2% / quarter
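
A minimal sketch of the first two fairness metrics, assuming binary labels and predictions as NumPy arrays plus a parallel array of protected-group membership; the exact grouping and disparity definition should follow the organisation's fairness policy.

```python
import numpy as np

def demographic_parity_disparity(y_pred: np.ndarray, group: np.ndarray) -> float:
    """Largest gap in positive-outcome rate between any two protected groups."""
    rates = [y_pred[group == g].mean() for g in np.unique(group)]
    return float(max(rates) - min(rates))

def equal_opportunity_difference(y_true: np.ndarray, y_pred: np.ndarray, group: np.ndarray) -> float:
    """Largest gap in true positive rate between groups (assumes each group has positive labels)."""
    tprs = [y_pred[(group == g) & (y_true == 1)].mean() for g in np.unique(group)]
    return float(max(tprs) - min(tprs))

# Example with two groups: positive rates are 0.5 (A) and 0.25 (B), so the disparity is 0.25.
y_pred = np.array([1, 1, 0, 0, 1, 0, 0, 0])
group = np.array(["A", "A", "A", "A", "B", "B", "B", "B"])
print(demographic_parity_disparity(y_pred, group))
```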

Robustness & resilience metrics

Metric | Definition | Threshold
Model Drift | Deviation of live performance vs training baseline. | < 5%
Input Perturbation Robustness | Accuracy variance under minor input noise. | < 3%
Data Poisoning Resistance | Drop in performance when attacked with synthetic corruptions. | < 5%
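
One way Input Perturbation Robustness might be estimated, assuming numeric feature vectors, a `predict` callable, and small Gaussian noise; the appropriate perturbation scheme depends on the data type and is not fixed by the catalogue.

```python
import numpy as np

def perturbation_robustness(predict, X: np.ndarray, y: np.ndarray,
                            noise_scale: float = 0.01, n_trials: int = 10,
                            seed: int = 0) -> float:
    """Absolute accuracy change between clean inputs and inputs with small Gaussian noise."""
    rng = np.random.default_rng(seed)
    clean_acc = (predict(X) == y).mean()
    noisy_accs = [
        (predict(X + rng.normal(0.0, noise_scale, size=X.shape)) == y).mean()
        for _ in range(n_trials)
    ]
    return float(abs(clean_acc - np.mean(noisy_accs)))
```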

Explainability & transparency metrics

  • Feature Attribution Stability: Variance in SHAP/LIME scores across retrains < 10% (see the sketch after this list).
  • Explanation Coverage: % of predictions accompanied by user-facing rationale ≥ 90%.
  • Documentation Completeness Index: % of required fields in Model Card completed ≥ 95%.
  • User Clarity Rating: Average comprehension score in usability testing ≥ 4/5.
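
Feature Attribution Stability can be quantified in several ways; the sketch below uses the coefficient of variation of mean absolute SHAP/LIME attributions across retrained model versions, which is one possible (not mandated) reading of "variance across retrains".

```python
import numpy as np

def attribution_stability(attributions: list[np.ndarray]) -> float:
    """
    Spread of mean absolute SHAP/LIME attributions across retrains, expressed
    as the average coefficient of variation per feature. `attributions` holds
    one (n_samples, n_features) array per retrained model version.
    """
    per_retrain = np.stack([np.abs(a).mean(axis=0) for a in attributions])
    spread = per_retrain.std(axis=0) / (per_retrain.mean(axis=0) + 1e-12)
    return float(spread.mean())
```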

Privacy & data integrity metrics

  • Re-identification Risk: Probability of re-linking anonymised data ≤ 0.05%.
  • Data Completeness: Missing values in dataset ≤ 2% (see the sketch after this list).
  • Retention Compliance: Records within defined retention period ≥ 99%.
  • Access Breach Count: Zero unauthorised access incidents per quarter.
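
A minimal sketch of how the Data Completeness and Retention Compliance checks might be scripted, assuming a tabular pandas DataFrame and a per-record age series; column names and retention limits are organisation-specific.

```python
import pandas as pd

def missing_value_rate(df: pd.DataFrame) -> float:
    """Share of missing cells in the dataset (Data Completeness)."""
    return float(df.isna().to_numpy().mean())

def retention_compliance(record_age_days: pd.Series, retention_limit_days: int) -> float:
    """Share of records still within the defined retention period (Retention Compliance)."""
    return float((record_age_days <= retention_limit_days).mean())
```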

Cybersecurity & attack resistance

  • Vulnerability Patch Lag: Time from patch release to deployment ≤ 14 days.
  • Prompt Injection Defense Rate: % of malicious inputs detected and blocked ≥ 98%.
  • Model Watermark Verification: Regular validation of the model integrity hash (weekly; see the sketch after this list).
  • Incident Response SLA: Response to model threat alerts ≤ 4 hours.
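
Model integrity verification could be implemented as a scheduled hash check along these lines; the use of SHA-256 over a file-based artefact is an assumption, not a requirement of the catalogue.

```python
import hashlib
from pathlib import Path

def model_integrity_hash(model_path: str) -> str:
    """SHA-256 digest of the serialised model artefact."""
    return hashlib.sha256(Path(model_path).read_bytes()).hexdigest()

def verify_model_integrity(model_path: str, expected_hash: str) -> bool:
    """Compare the current artefact hash against the recorded baseline."""
    return model_integrity_hash(model_path) == expected_hash
```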

Implementation & dashboards

  • Metrics collected automatically via ML monitoring tools (Weights & Biases, EvidentlyAI, Arize, etc.).
  • Results visualised in Power BI / Looker dashboards and reviewed monthly.
  • Breaches of thresholds automatically create CAPA entries in AIMS (see the sketch after this list).
  • Evidence (plots, logs, reports) linked to each metric ID for audits.
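
A sketch of how threshold breaches might be turned into CAPA candidates, assuming "lower is better" metrics and a simple dictionary of latest values; real integration would post these entries to the AIMS through its own interface.

```python
def check_thresholds(latest: dict[str, float], upper_limits: dict[str, float]) -> list[dict]:
    """Return one CAPA candidate per metric whose latest value exceeds its upper limit."""
    capa_entries = []
    for metric_id, value in latest.items():
        limit = upper_limits.get(metric_id)
        if limit is not None and value > limit:
            capa_entries.append({
                "metric_id": metric_id,
                "observed": value,
                "threshold": limit,
                "action": "open CAPA in AIMS",  # illustrative field; real systems use their own schema
            })
    return capa_entries

# Example: a failure rate of 2.4% against a 2% limit produces one CAPA candidate.
print(check_thresholds({"failure_rate": 0.024}, {"failure_rate": 0.02}))
```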

Checklist & references

  • All six trustworthiness pillars implemented and reviewed quarterly.
  • Metrics integrated into automated monitoring pipelines.
  • Dashboard live with thresholds and CAPA triggers configured.
  • Quarterly trend reports submitted to AI Governance Board.
  • Aligned references: ISO/IEC 42001 §9.1, NIST AI RMF “MEASURE”, EU AI Act Art.15.

© Zen AI Governance UK Ltd • Regulatory Knowledge • v1 11 Nov 2025 • This page is general guidance, not legal advice.