An AI system survives audit when an external assessor, a regulator, or an internal compliance officer can pick any single output produced by the system in production - a credit decision, a contract clause, a triage recommendation - and reconstruct the full evidence chain behind it. That chain has six links: the source documents or transactions that formed the input, the retrieval and pre-processing steps applied to them, the prompt and model version invoked, the post-processing and validation rules applied to the model output, the human-oversight gate that approved or rejected the output, and the timestamped log entry that records all of this [1].
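One way to make that chain concrete is to persist the six links as a single structured record keyed by the output they support. The sketch below is a minimal illustration under assumed names, not a prescribed schema: `EvidenceChain` and its fields (`source_refs`, `prompt_id`, `oversight_decision`, and so on) are hypothetical and would map onto whatever identifiers and stores a given deployment already uses.

```python
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone
import json

# Hypothetical record: one EvidenceChain per production output,
# mirroring the six links described above.
@dataclass
class EvidenceChain:
    output_id: str                   # the decision, clause, or recommendation
    source_refs: list[str]           # input documents / transactions
    preprocessing_steps: list[str]   # retrieval and pre-processing applied
    prompt_id: str                   # prompt template actually sent
    model_version: str               # model identifier and version invoked
    postprocessing_rules: list[str]  # validation rules applied to the output
    oversight_decision: str          # approval or rejection, with reviewer id
    logged_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def to_log_entry(self) -> str:
        """Serialise the chain as a timestamped, append-only log line."""
        return json.dumps(asdict(self), sort_keys=True)


# Example: recording the chain behind a single credit decision.
chain = EvidenceChain(
    output_id="credit-decision-000123",
    source_refs=["application-form-8841", "bureau-report-2209"],
    preprocessing_steps=["pii-redaction-v2", "retrieval-top5-bm25"],
    prompt_id="credit-assessment-prompt-v7",
    model_version="scoring-model-2025-03",
    postprocessing_rules=["score-range-check", "adverse-action-reason-check"],
    oversight_decision="approved by analyst-42",
)
print(chain.to_log_entry())
```

Writing the serialised record to append-only storage is what produces the sixth link, the timestamped log entry that lets the chain be reconstructed later.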
The EU AI Act, Regulation (EU) 2024/1689, formalises most of these requirements for systems classified as high-risk under Annex III, including credit scoring, recruitment, education, law enforcement, and access to essential services. Article 12 requires automatic logging across the system lifecycle. Article 13 requires the system to be transparent enough for the deployer to interpret its output. Article 14 requires effective human oversight. Article 15 requires accuracy, robustness, and cybersecurity to be designed in. Article 17 requires a quality management system. None of these are negotiable for high-risk deployments from August 2026 onward [1].
GDPR Article 22 has imposed a related requirement since 2018: where a decision is based solely on automated processing and produces a legal or similarly significant effect, the data subject has the right to obtain human intervention, to express their point of view, and to contest the decision. That right is meaningless if the system cannot reconstruct what it did. The Court of Justice of the European Union confirmed in SCHUFA (C-634/21) in December 2023 that automated credit scoring falls within Article 22, and that the controller must be able to explain the decision [2].
ISO/IEC 42001:2023, the management-system standard for AI, codifies the practice independently of any regulator. It expects the organisation to maintain documented information about the AI system's intended use, its data sources, its risk assessments, its operational controls, and the results of its monitoring. ENISA's 2023 multilayer framework for the cybersecurity of AI maps these expectations to specific technical controls at the data, model, and deployment layers [3][4].
The practical test we use with every prospect: pick one output the system produced last week and ask the team to reconstruct the evidence chain in twenty minutes. If they cannot, the system will not pass an external audit, regardless of how impressive the demo looks.
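If each output carries a record along the lines sketched earlier, the twenty-minute test reduces to one lookup plus a completeness check. The helper below is a hypothetical illustration, not a product feature; the record layout and the `audit_store` lookup are assumptions, standing in for whatever log or database actually holds the entries.

```python
# Hypothetical completeness check for the twenty-minute test. The record
# layout mirrors the evidence-chain sketch above; `audit_store` stands in
# for whatever append-only log or database actually holds the entries.

REQUIRED_LINKS = [
    "source_refs", "preprocessing_steps", "prompt_id", "model_version",
    "postprocessing_rules", "oversight_decision", "logged_at",
]

def reconstruct(output_id: str, audit_store: dict[str, dict]) -> bool:
    """Return True only if every link in the evidence chain can be produced."""
    record = audit_store.get(output_id)
    if record is None:
        print(f"{output_id}: no log entry found; the audit fails here")
        return False
    missing = [link for link in REQUIRED_LINKS if not record.get(link)]
    if missing:
        print(f"{output_id}: chain incomplete, missing {missing}")
        return False
    print(f"{output_id}: full chain reconstructed, logged {record['logged_at']}")
    return True


# Example: one output picked from last week's log.
audit_store = {
    "triage-rec-000987": {
        "source_refs": ["referral-note-5521"],
        "preprocessing_steps": ["deidentification-v3", "retrieval-top3"],
        "prompt_id": "triage-prompt-v4",
        "model_version": "triage-model-2025-06",
        "postprocessing_rules": ["urgency-band-check"],
        "oversight_decision": "approved by clinician-17",
        "logged_at": "2025-06-30T09:14:02+00:00",
    }
}
reconstruct("triage-rec-000987", audit_store)
```

The point of the exercise is not the tooling but the discipline: if any link comes back empty, the team is reconstructing from memory, and memory does not survive an external audit.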