Data Protection Impact Assessments (DPIAs) are often treated as paperwork. For AI systems, they should be a design tool: a structured way to map data flows, identify risks, and select controls that are actually implementable.
Start with a data flow map you can verify
AI systems typically move data across more components than teams expect: UI, orchestration services, retrieval, tool APIs, logs, caches, and external model providers. A DPIA should document:
- What data enters prompts, retrieval, and tool calls.
- What data is stored, where, and for how long.
- Which vendors and regions are involved (see data residency).
If the documented flows cannot be validated via telemetry, the map is not reliable (see telemetry schema).
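One way to keep the map honest is to store it as data and diff it against what telemetry actually observes. A minimal sketch, assuming a hypothetical event stream in which each event names its source and destination component (the component names and event format are illustrative assumptions):

```python
# Hypothetical sketch: the DPIA's data flow map kept as data, so it can be
# diffed against observed telemetry. Component names are assumptions.
DOCUMENTED_FLOWS = {
    ("ui", "orchestrator"),
    ("orchestrator", "retrieval"),
    ("orchestrator", "model_provider"),
    ("orchestrator", "tool_api"),
    ("orchestrator", "log_store"),
}

def undocumented_flows(telemetry_events: list[dict]) -> set[tuple[str, str]]:
    """Return flows observed in telemetry that the DPIA map does not cover."""
    observed = {(e["source"], e["destination"]) for e in telemetry_events}
    return observed - DOCUMENTED_FLOWS

# Example: a cache write that never made it into the DPIA surfaces here.
events = [
    {"source": "ui", "destination": "orchestrator"},
    {"source": "orchestrator", "destination": "cache"},  # not in the map
]
print(undocumented_flows(events))  # {('orchestrator', 'cache')}
```

Running a check like this in CI or on a schedule turns the data flow map from a one-off diagram into an assertion that fails when the system drifts.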
Classify data and define boundaries
DPIAs move faster when classification is explicit. Define what data is allowed, what must be redacted, and what is blocked (see data classification and data minimisation).
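As a sketch of what "explicit" can look like, the snippet below maps each hypothetical data class to exactly one of the three outcomes; the class names and the email-only redaction are illustrative assumptions, not a complete taxonomy:

```python
import re

# Hypothetical policy: each data class maps to exactly one outcome.
# Class names are illustrative, not a fixed taxonomy.
POLICY = {
    "public": "allow",
    "internal": "allow",
    "personal_data": "redact",
    "special_category": "block",  # e.g. health data: never enters a prompt
}

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")  # redaction is email-only here

def apply_policy(field_class: str, value: str) -> str | None:
    action = POLICY.get(field_class, "block")  # unknown classes fail closed
    if action == "allow":
        return value
    if action == "redact":
        return EMAIL.sub("[REDACTED_EMAIL]", value)
    return None  # blocked: the value must not enter the prompt at all

print(apply_policy("personal_data", "Contact alice@example.com"))
# -> Contact [REDACTED_EMAIL]
```

Failing closed on unknown classes is the important design choice: data that has not been classified gets the most restrictive treatment by default.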
Select controls that match the risk
Common controls in AI DPIAs include:
- Retention and deletion. Different retention for content-bearing fields vs metadata (see retention and deletion; sketched after this list).
- Access control. Who can see logs, prompts, retrieved sources, and tool outputs.
- Output scanning. Detect sensitive disclosures (see policy layering).
- Vendor terms. Data usage, training opt-out, retention guarantees, and change notices (see procurement).
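To make the retention item above concrete, per-field retention can be expressed as configuration. The field names, tiers, and durations below are illustrative assumptions, not recommended values:

```python
from datetime import datetime, timedelta, timezone

# Hypothetical retention tiers: content-bearing fields age out faster than
# operational metadata. Field names and durations are illustrative only.
RETENTION = {
    "prompt_text": timedelta(days=30),       # content-bearing
    "retrieved_chunks": timedelta(days=30),  # content-bearing
    "tool_output": timedelta(days=30),       # content-bearing
    "request_id": timedelta(days=365),       # metadata
    "latency_ms": timedelta(days=365),       # metadata
}

def expired_fields(record: dict, created_at: datetime) -> list[str]:
    """Fields whose retention window has elapsed; unknown fields fail closed."""
    age = datetime.now(timezone.utc) - created_at
    return [f for f in record if age > RETENTION.get(f, timedelta(0))]
```

A scheduled deletion job can then run a check like this over stored records, so the retention promise in the DPIA is enforced by code rather than by convention.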
Make decisions auditable
A DPIA should not just list controls. It should record why choices were made and how they are enforced. Decision logging and reason codes support audits and investigations (see decision logging and compliance audits).
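As a sketch, a decision record might tie each choice to a reason code and an enforcement point. The schema below is an assumption about what such a record could contain, not a prescribed format:

```python
import json
from datetime import datetime, timezone

# Hypothetical decision record: field names and the reason code are
# illustrative assumptions, not a standard schema.
decision = {
    "decision_id": "dpia-2024-011",
    "timestamp": datetime.now(timezone.utc).isoformat(),
    "choice": "exclude retrieved documents from vendor-bound prompts",
    "reason_code": "DATA_RESIDENCY",  # machine-readable, groupable in audits
    "rationale": "vendor region does not meet the residency requirement",
    "enforced_by": "orchestrator egress filter",  # where it is enforced
    "approved_by": "privacy-review-board",
}
print(json.dumps(decision, indent=2))  # in practice, append to an audit log
```

The `enforced_by` field is what separates a defensible record from a wish list: every decision names the component that actually implements it.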
Connect DPIAs to delivery and change control
AI systems change quickly. DPIAs should be updated when data sources change, tool access expands, or new vendors/models are introduced. Use feature flags and approvals to keep changes controlled (see feature flags and approvals).
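One way to wire change control to the DPIA is to gate each new data source or tool behind a flag that only passes once the DPIA review is approved. A minimal sketch, with made-up flag names and an assumed review field:

```python
# Hypothetical flag store: a new data source ships disabled and is only
# usable once its DPIA review is approved. Names are illustrative.
FLAGS = {
    "retrieval.new_crm_source": {
        "enabled": False,
        "dpia_review": "pending",  # "pending" | "approved"
    },
}

def source_allowed(flag_name: str) -> bool:
    flag = FLAGS.get(flag_name)
    return bool(flag and flag["enabled"] and flag["dpia_review"] == "approved")

print(source_allowed("retrieval.new_crm_source"))  # False until approved
```

Flipping the flag and recording the approval in the same change keeps the code path and the DPIA record from drifting apart.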
A good DPIA is not a blocker. It is a map: it makes risk visible and turns it into design choices you can defend.