Definition
Differential privacy (DP) is a mathematical framework for protecting individual records when computing on a dataset. Introduced by Cynthia Dwork and colleagues in 2006, it formalizes the intuition that an analysis is private if the distribution of its outputs would be almost the same whether or not any single record were included. The "almost" is parameterized by epsilon, the privacy budget: a smaller epsilon gives stronger privacy but requires more noise, which typically lowers utility.
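The intuition above is usually stated in the following standard form. The notation is introduced here for illustration and does not come from this document: M is a randomized mechanism, D and D' are any two datasets that differ in a single record, and S is any set of possible outputs.

```latex
% Pure epsilon-differential privacy (standard textbook form).
% M: randomized mechanism; D, D': neighboring datasets; S: any output set.
\Pr[\,M(D) \in S\,] \;\le\; e^{\varepsilon} \cdot \Pr[\,M(D') \in S\,]
```

For small epsilon, e^epsilon is close to 1, so the two output distributions are nearly indistinguishable. As epsilon grows, the bound loosens and the output may reveal more about any one record.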
Mechanism
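DP is typically implemented by adding calibrated random noise (commonly Laplace or Gaussian) to outputs, queries, or transformations. The noise scale is determined by two quantities: the sensitivity of the function (the most its output can change when a single record is added or removed) and the privacy budget. Higher sensitivity or a smaller budget means more noise. The result is a quantitative bound on what an attacker could learn about any individual record from the output.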
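As a minimal sketch of the noise-addition step described above, the following applies the Laplace mechanism to a counting query, using noise scale = sensitivity / epsilon. The function names `laplace_noise` and `private_count` and the sample data are hypothetical and chosen only for illustration. The sketch uses Python's standard `random` module and ordinary floating point, so it is not hardened for production use.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def private_count(records, predicate, epsilon: float, rng=None) -> float:
    """Return a noisy count of records matching predicate (illustrative only)."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    true_count = sum(1 for r in records if predicate(r))
    # Adding or removing one record changes a count by at most 1.
    sensitivity = 1.0
    return true_count + laplace_noise(sensitivity / epsilon, rng)


# Hypothetical usage: count "error" events with a budget of epsilon = 0.5.
events = ["ok", "error", "ok", "error", "error"]
noisy = private_count(events, lambda e: e == "error", epsilon=0.5)
print(f"noisy error count: {noisy:.2f}")
```

Each call returns a different value. The true count (3 here) is never released directly, and repeated queries each consume additional budget.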
Application in the AI enablement data layer
In LLM Capsule, differential-privacy-based protection is applied during the structure-preserving encapsulation step. The capsule (AI-ready context) carries the differential-privacy guarantee on top of field-level tokenization. This addresses inference risks that field-level masking alone cannot bound — particularly for operational data where structure, sequence, and aggregate patterns themselves carry sensitive information.
What it is not
- Not a legal or compliance guarantee. It is a technical framework with a tunable parameter.
- Not a yes/no guarantee. Protection is a matter of degree: the privacy budget sets how much privacy is traded for utility.
- Not a substitute for governance, audit, or policy.
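To make the privacy/utility trade-off in the list above concrete, this sketch assumes the Laplace mechanism, for which the expected absolute error equals the noise scale (sensitivity / epsilon). The function name `expected_abs_error` and the epsilon values are illustrative assumptions, not recommended settings.

```python
def expected_abs_error(sensitivity: float, epsilon: float) -> float:
    """Mean absolute error of Laplace noise, which equals its scale."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return sensitivity / epsilon


# For a counting query (sensitivity 1), a smaller epsilon means a larger error.
for eps in (0.1, 0.5, 1.0, 5.0):
    print(f"epsilon={eps}: expected |error| = {expected_abs_error(1.0, eps):.1f}")
```

Under these assumptions, a count released at epsilon 0.1 is off by about 10 on average, while at epsilon 5 it is off by about 0.2. The stronger guarantee costs accuracy.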
Why it matters here
Operational data — network logs, configurations, OT manifests, clinical workflows — leaks through patterns, not just identifiers. Differential privacy is the framework that lets enterprise governance reason quantitatively about that leakage risk and enforce a budget per workflow.
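One way to picture a per-workflow budget is a small accountant that refuses releases once a workflow's epsilon is spent. The sketch below assumes basic sequential composition, under which the epsilons of successive releases on the same data add up. The names `WorkflowBudget` and `BudgetExceeded` and the workflow label are hypothetical. This is not a description of how LLM Capsule implements budgeting.

```python
from typing import Mapping


class BudgetExceeded(Exception):
    """Raised when a release would exceed a workflow's privacy budget."""


class WorkflowBudget:
    """Tracks cumulative epsilon per workflow (basic sequential composition)."""

    def __init__(self, limits: Mapping[str, float]):
        self._limits = dict(limits)
        self._spent = {name: 0.0 for name in self._limits}

    def charge(self, workflow: str, epsilon: float) -> float:
        """Record a release costing epsilon; return the remaining budget."""
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        if workflow not in self._limits:
            raise KeyError(f"no budget defined for workflow {workflow!r}")
        new_total = self._spent[workflow] + epsilon
        # Small tolerance so float rounding does not reject an exact spend.
        if new_total > self._limits[workflow] + 1e-12:
            raise BudgetExceeded(f"{workflow}: {new_total} > {self._limits[workflow]}")
        self._spent[workflow] = new_total
        return self._limits[workflow] - new_total


# Hypothetical policy: the "network-logs" workflow may spend epsilon = 1.0 in total.
budget = WorkflowBudget({"network-logs": 1.0})
budget.charge("network-logs", 0.4)
budget.charge("network-logs", 0.4)
try:
    budget.charge("network-logs", 0.4)  # would bring the total to 1.2
except BudgetExceeded as exc:
    print("release refused:", exc)
```

Tighter accounting methods (advanced composition, for example) can allow more releases under the same budget. Simple addition is used here because it is the most conservative and the easiest to audit.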
Acceptable claims
- "Privacy-preserving with a defined risk-reduction scope"
- "Bounded inference risk under the policy's privacy budget"
- "Differential-privacy-based encapsulation"
Claims to avoid
- "Mathematically impossible to reconstruct"
- "100% safe"
- "GDPR guaranteed"
- "Zero risk"