AI incident response: when the model misbehaves

An AI incident is rarely a clean break. It is usually a slow, ambiguous suspicion that something is wrong, followed by a scramble to define what happened. A small amount of preparation collapses that scramble.

Michael McCarroll · 14 min read · Updated June 2026

What counts as an AI incident

Existing incident response programmes are tuned for confidentiality, integrity and availability events. AI broadens the catalogue. A working definition includes any event in which an AI system produces an output that materially harms the organisation, its customers or third parties — whether through error, leakage, bias, manipulation or unintended behaviour.

Useful subcategories include factual hallucination in customer-facing output; inadvertent disclosure of confidential or personal data; demonstrably biased decisions affecting protected groups; successful prompt injection from third-party content; and regressions arising from a supplier-side model change.

Detection without logs is fiction

Most AI incidents are first noticed externally — by a customer, a journalist, a partner. The internal capability to confirm, investigate and remediate depends almost entirely on whether the relevant interaction was logged. Without prompts, outputs, model versions and timestamps, the organisation is reduced to guessing.

Logging is therefore the single most underweighted AI control. The default for any AI system that handles material work should be that prompts, outputs and metadata are retained for a defined period, protected appropriately, and available to a defined incident response team.
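As an illustration of what "prompts, outputs and metadata" might look like as a stored record, here is a minimal sketch. The field names, the `make_record` helper and the 180-day default retention are assumptions made for the example, not a prescribed schema; the retention period in particular should come from the organisation's own policy.

```python
import hashlib
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class InteractionRecord:
    """One logged AI interaction (hypothetical schema)."""
    system_id: str          # which AI system produced the output
    model_version: str      # exact model/version identifier in use
    prompt: str
    output: str
    timestamp: datetime     # timezone-aware, UTC
    retain_until: datetime  # end of the defined retention period

    def to_json(self) -> str:
        body = asdict(self)
        body["timestamp"] = self.timestamp.isoformat()
        body["retain_until"] = self.retain_until.isoformat()
        # A content hash helps spot accidental corruption of a record.
        # It is not a substitute for access control on the log store.
        canonical = json.dumps(body, sort_keys=True).encode("utf-8")
        body["sha256"] = hashlib.sha256(canonical).hexdigest()
        return json.dumps(body, sort_keys=True)


def make_record(system_id, model_version, prompt, output,
                retention_days=180, now=None):
    """Build a record; retention_days=180 is an assumed default."""
    now = now or datetime.now(timezone.utc)
    return InteractionRecord(
        system_id, model_version, prompt, output,
        now, now + timedelta(days=retention_days),
    )
```

Capturing the model version on every record is what later allows a responder to separate a supplier-side regression from a prompt or data problem.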

A four-stage response that adapts existing playbooks

The classic prepare–detect–respond–recover lifecycle applies, with AI-specific adjustments (NIST 2024).

Prepare. Catalogue the AI systems that could trigger incidents, identify their owners, define detection thresholds, and decide in advance who has the authority to disable a feature or roll back a model version. Run a tabletop exercise with an AI-specific scenario.

Detect. Combine operational metrics (refusal rates, override rates, complaint rates), automated content checks, and channels for human reporting. Treat all three as legitimate detection sources.
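The operational-metrics part of detection can be sketched as a simple threshold check. The metric names follow the paragraph above; the threshold values and the `breached` helper are hypothetical, and real thresholds would be derived from each system's observed baseline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    metric: str
    max_rate: float  # alert when the observed rate exceeds this value


# Hypothetical values for illustration only.
THRESHOLDS = [
    Threshold("refusal_rate", 0.05),
    Threshold("override_rate", 0.10),
    Threshold("complaint_rate", 0.01),
]


def rate(events: int, total: int) -> float:
    """Fraction of interactions with the event; 0.0 when there is no traffic."""
    return events / total if total else 0.0


def breached(observed, thresholds=THRESHOLDS):
    """Return the names of metrics whose observed rate exceeds its threshold."""
    return [t.metric for t in thresholds
            if observed.get(t.metric, 0.0) > t.max_rate]


# Example: 400 interactions in the monitoring window.
observed = {
    "refusal_rate": rate(12, 400),    # 0.03
    "override_rate": rate(70, 400),   # 0.175
    "complaint_rate": rate(2, 400),   # 0.005
}
print(breached(observed))  # ['override_rate']
```

A check like this covers only one of the three detection sources; content checks and human reports would feed the same triage queue.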

Respond. Confirm the scope using logs. Contain by disabling the specific feature, throttling traffic, pinning to a known-good model version or inserting a human gate. Communicate internally with the affected business owners and externally where obligations require.
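Confirming scope from logs usually reduces to a filter over system, model version and time window. The sketch below assumes log records are dictionaries with hypothetical `system_id`, `model_version`, `user_id` and `timestamp` keys; a real log store would expose its own query interface.

```python
from datetime import datetime, timezone


def in_scope(records, system_id, model_version, start, end):
    """Return logged interactions that fall inside the suspected incident window."""
    return [
        r for r in records
        if r["system_id"] == system_id
        and r["model_version"] == model_version
        and start <= r["timestamp"] <= end
    ]


def affected_users(records):
    """Distinct users who received output from the in-scope interactions."""
    return sorted({r["user_id"] for r in records})


def utc(day, hour):
    return datetime(2026, 6, day, hour, tzinfo=timezone.utc)


# Illustrative log entries (invented for the example).
LOG = [
    {"system_id": "support-bot", "model_version": "v2", "user_id": "u1", "timestamp": utc(1, 9)},
    {"system_id": "support-bot", "model_version": "v2", "user_id": "u2", "timestamp": utc(1, 15)},
    {"system_id": "support-bot", "model_version": "v1", "user_id": "u3", "timestamp": utc(1, 10)},
    {"system_id": "support-bot", "model_version": "v2", "user_id": "u1", "timestamp": utc(3, 9)},
]

hits = in_scope(LOG, "support-bot", "v2", utc(1, 0), utc(2, 0))
print(affected_users(hits))  # ['u1', 'u2']
```

The same list of affected users is what the recovery stage needs when issuing corrections.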

Recover. Restore the service in a configuration whose risk has been re-assessed. Issue corrections to anyone who acted on the bad output. Conduct a post-incident review whose output is changes to controls, not blame.

Notification: an under-appreciated trigger

AI incidents involving personal data are still personal-data incidents under the UK and EU GDPR. The controller must notify the supervisory authority within 72 hours of becoming aware of a breach, unless the breach is unlikely to result in a risk to individuals (ICO 2024). The fact that the output stayed inside the organisation does not necessarily prevent the obligation arising.
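Because the 72-hour window runs from the moment of awareness, it helps to record that moment explicitly and compute the deadline from it. The following is a minimal sketch; the function names are hypothetical, and it assumes the window is counted in elapsed hours rather than working hours.

```python
from datetime import datetime, timedelta, timezone

NOTIFICATION_WINDOW = timedelta(hours=72)


def notification_deadline(aware_at: datetime) -> datetime:
    """Latest time to notify, given when the controller became aware."""
    if aware_at.tzinfo is None:
        # Naive timestamps make deadlines ambiguous across time zones.
        raise ValueError("aware_at must be timezone-aware")
    return aware_at + NOTIFICATION_WINDOW


def hours_remaining(aware_at: datetime, now: datetime) -> float:
    """Hours left before the deadline; negative once it has passed."""
    return (notification_deadline(aware_at) - now).total_seconds() / 3600


aware = datetime(2026, 6, 1, 9, 30, tzinfo=timezone.utc)
print(notification_deadline(aware).isoformat())  # 2026-06-04T09:30:00+00:00
print(hours_remaining(aware, datetime(2026, 6, 2, 9, 30, tzinfo=timezone.utc)))  # 48.0
```

Whether a given event is notifiable at all remains a legal judgement; the arithmetic only makes the deadline visible once that judgement is in play.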

The EU AI Act adds further obligations for providers and deployers of high-risk systems, including serious-incident reporting to the relevant market surveillance authority within defined timeframes (European Parliament 2024). Organisations subject to sectoral rules — financial services, healthcare, critical infrastructure — should also expect AI-specific incident reporting expectations from their regulators.

The post-incident review that actually changes things

A post-incident review whose only output is a memo does not change the system. A useful review produces concrete amendments: to the AI register, to the risk register, to the control set, to the supplier contract terms, to the training material, and to the detection thresholds. Each change has an owner and a date.
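The "owner and a date" rule is easy to enforce mechanically. As a sketch, a review's actions could be tracked in a structure like the one below; the `ReviewAction` type and the helper names are hypothetical, and the sample actions are invented for the example.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class ReviewAction:
    target: str                  # e.g. "AI register", "detection thresholds"
    change: str                  # the concrete amendment agreed in the review
    owner: Optional[str] = None
    due: Optional[date] = None


def incomplete(actions):
    """Actions that cannot be tracked because they lack an owner or a date."""
    return [a for a in actions if not a.owner or a.due is None]


def overdue(actions, today):
    """Actions whose due date has passed."""
    return [a for a in actions if a.due is not None and a.due < today]


actions = [
    ReviewAction("detection thresholds", "lower override-rate alert level",
                 owner="ml-ops", due=date(2026, 7, 1)),
    ReviewAction("supplier contract terms", "require notice of model changes",
                 owner="procurement"),  # no date yet
]

print([a.target for a in incomplete(actions)])                   # ['supplier contract terms']
print([a.target for a in overdue(actions, date(2026, 7, 15))])   # ['detection thresholds']
```

A review could be treated as closed only when `incomplete` returns an empty list, which keeps the memo-only outcome from passing as done.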

The signal that an incident response programme is mature is not the absence of incidents. It is the rate at which incidents produce durable change, and the falling severity of the incidents that follow.

References

  • European Parliament (2024) Regulation (EU) 2024/1689 (Artificial Intelligence Act). Official Journal of the European Union.
  • Information Commissioner's Office (2024) Personal data breaches: a guide. Wilmslow: ICO.
  • ISO/IEC (2023) ISO/IEC 42001:2023 Information technology — Artificial intelligence — Management system. Geneva: ISO/IEC.
  • National Institute of Standards and Technology (2024) Incident Response Recommendations and Considerations for Cybersecurity Risk Management: A CSF 2.0 Community Profile (SP 800-61 Rev. 3, initial public draft). Gaithersburg, MD: NIST.
