PatentBrief

IP Strategy

Data Privacy & IP

Privacy-enhancing technology patents; differential privacy; federated learning; homomorphic encryption; zero-knowledge proofs; GDPR and trade secret protection for proprietary datasets.

FAQ

What privacy-enhancing technology (PET) innovations are patentable, and who holds major differential privacy and federated learning patents?

Privacy-enhancing technologies are one of the most active areas of new patent filing, at the intersection of cryptography, machine learning, and regulatory compliance.

DIFFERENTIAL PRIVACY: DEFINITION: a mathematical privacy guarantee; calibrated noise is added to query results or model training to prevent identification of individuals while preserving statistical utility.

APPLE: has deployed differential privacy since iOS 10 (2016) to collect usage statistics such as QuickType keyboard and emoji usage; also uses differential privacy in iOS health data aggregation; has filed patents on specific differential privacy mechanisms (calibrated Laplace mechanism implementations; local differential privacy protocols).

GOOGLE: pioneered RAPPOR (Randomized Aggregatable Privacy-Preserving Ordinal Response) for Chrome telemetry; differential privacy research led by Úlfar Erlingsson (formerly of Google), building on foundational work by researchers such as Frank McSherry (formerly of Microsoft Research); TensorFlow Privacy library; has filed patents on specific DP mechanism implementations.

SPECIFIC vs. GENERIC DP CLAIMS: 'adding noise to protect privacy' is potentially an abstract idea; a specific technical implementation of the Laplace mechanism, calibrated to specific epsilon/delta privacy budgets and combined with a specific data schema and specific utility preservation algorithms, is more likely patentable.

FEDERATED LEARNING: DEFINITION: training ML models across multiple devices or parties without centralizing raw data; model updates such as gradients (not the data) are shared.

APPLE: uses FL for next-word prediction (QuickType keyboard), Siri improvement, and emoji suggestions, without Apple ever seeing the raw keystrokes. GOOGLE: Gboard keyboard prediction; first major deployment of FL at scale.

PATENTS: Google US10,572,822 (federated learning for mobile keyboard prediction); specific gradient aggregation methods (such as the FedAvg algorithm); specific client selection strategies; Apple patents on on-device training without server round-trips.

MICROSOFT AZURE ML: FL for enterprise (for example, several hospitals training a shared model without patient data leaving any hospital); specific FL aggregation algorithms; privacy budget tracking.

OPEN SOURCE: TensorFlow Federated; PySyft (OpenMined); Flower (federated learning framework); this large body of open source work creates a dense prior art landscape.

COMPANY PATENT STRATEGY: file provisionals on specific FL implementations before publishing research; claims that combine specific gradient compression with privacy budget handling are more defensible than generic FL claims.
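To make the "calibrated noise" idea concrete, here is a minimal Python sketch of the Laplace mechanism applied to a counting query. It is an illustration under stated assumptions, not any company's patented implementation: the function names `laplace_noise` and `private_count` are hypothetical, and the sketch assumes each individual contributes at most one record, so the query has sensitivity 1.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from Laplace(0, scale).

    The difference of two independent exponential variables with mean
    `scale` follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng=None):
    """Return a noisy count under epsilon-differential privacy.

    Assumption: each individual contributes at most one record, so adding
    or removing one person changes the true count by at most 1.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    true_count = sum(1 for record in records if predicate(record))
    sensitivity = 1.0
    # Noise scale = sensitivity / epsilon: a smaller epsilon (stronger
    # privacy) means more noise and less statistical utility.
    return true_count + laplace_noise(sensitivity / epsilon, rng)
```

A caller would pass a list of records and a predicate, for example `private_count(users, lambda u: u["uses_emoji"], epsilon=0.5)`. The calibration step (noise scale tied to sensitivity and epsilon) is the part that distinguishes a concrete mechanism from the generic idea of adding noise.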
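The federated learning loop described above (local training, then sharing only model updates) can be sketched in a few lines of Python. This is a simplified illustration of federated averaging on a toy linear model, not the production method of any vendor named here; `local_update`, `fed_avg`, and `run_round` are hypothetical names, and real deployments add client sampling, secure aggregation, and compression.

```python
def local_update(weights, data, lr=0.1, epochs=5):
    """Run a few gradient-descent epochs of y = w*x + b on one client's data."""
    w, b = weights
    for _ in range(epochs):
        grad_w = sum(((w * x + b) - y) * x for x, y in data) / len(data)
        grad_b = sum(((w * x + b) - y) for x, y in data) / len(data)
        w -= lr * grad_w
        b -= lr * grad_b
    return [w, b]


def fed_avg(client_results):
    """Average client weights, weighting each client by its example count."""
    total = sum(n for _, n in client_results)
    dim = len(client_results[0][0])
    return [
        sum(weights[i] * n for weights, n in client_results) / total
        for i in range(dim)
    ]


def run_round(global_weights, client_datasets):
    """One federated round: only weights leave each client, never raw data."""
    results = [
        (local_update(global_weights, data), len(data))
        for data in client_datasets
    ]
    return fed_avg(results)
```

Calling `run_round` repeatedly, feeding each result back in as the next global weights, trains a shared model while each client's `(x, y)` pairs stay local. The weighting by example count in `fed_avg` is the design choice that lets clients with more data influence the global model proportionally.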

How does GDPR and CCPA affect patent prosecution, trade secret protection, and AI model privacy?

Privacy regulations like GDPR and CCPA create complex interactions with IP law, particularly around trade secrets and AI models, that every company handling personal data must understand.

GDPR-PATENT PROSECUTION INTERACTION: PUBLICATION OF PATENT APPLICATIONS: patent applications publish about 18 months after the earliest filing (priority) date; for AI-related patents, the application must describe the technical invention sufficiently (enabling disclosure). TENSION WITH GDPR: if the AI model was trained on personal data, does disclosing the training methodology in the patent application violate any GDPR obligations? GENERALLY NO: patent applications describe technical methods; they do not need to include the actual personal data used for training; describing 'we trained on a dataset of X medical records' is technical description, not disclosure of the personal data itself.

GDPR RIGHT TO ERASURE (ARTICLE 17) vs. TRADE SECRETS: MACHINE UNLEARNING: GDPR Article 17 gives individuals the right to have their data deleted; for AI models, this potentially requires 'unlearning' the contribution of specific individuals' data. This creates tension with trade secrets: if model weights are trade secrets and a user requests erasure, can the company modify the trade secret model in response without defeating trade secret protection?

MACHINE UNLEARNING PATENTS: Google, Amazon, and Microsoft have filed patents on methods for efficiently removing the influence of specific training data points from trained ML models without full retraining; this is both a privacy compliance tool and an ML efficiency tool.

CCPA AND DATA TRADE SECRETS: CCPA (California Consumer Privacy Act) gives California residents the right to know what personal data is collected and the right to deletion; California courts recognize that trade secrets in data (the compiled customer database itself) can be protected even while consumers are given access to their own individual data. HOWEVER: a company that protects its customer database as a trade secret must still respond to individual opt-out and deletion requests; trade secret protection does not override individual data rights.

GDPR DATA TRANSFERS AND IP: STANDARD CONTRACTUAL CLAUSES (SCCs) for EU-US data transfers can affect ML model training workflows; SCHREMS II (CJEU 2020): Privacy Shield invalidated; SCCs became the primary transfer mechanism; implications: training on EU personal data on US servers requires SCCs or an adequacy decision; IP owned by US entities that results from training on EU-sourced data may require careful structuring.

GDPR AND THE EU AI ACT (2024): the EU AI Act applies to AI systems used in the EU; high-risk AI systems (medical; employment; biometrics; critical infrastructure) require conformity assessment, technical documentation, and incident reporting; the AI Act creates new IP strategy considerations: documentation required for compliance may need to be protected as a trade secret; risk management systems may themselves be patentable innovations.
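The idea of removing one individual's influence without full retraining can be illustrated with a deliberately simple case. The Python sketch below assumes a one-variable least-squares model whose fit depends only on additive sufficient statistics, so a record can be "unlearned" exactly by subtracting its contribution. The class name `UnlearnableRegression` is hypothetical, and this is not the method of any patent mentioned above; deep models do not decompose this way and need approximate or sharded techniques.

```python
from dataclasses import dataclass


@dataclass
class UnlearnableRegression:
    """Least-squares fit of y = slope*x + intercept from running sums."""
    n: int = 0
    sx: float = 0.0
    sy: float = 0.0
    sxx: float = 0.0
    sxy: float = 0.0

    def _apply(self, x, y, sign):
        # Every statistic is a plain sum over records, so one record's
        # contribution can be added (+1) or removed (-1) independently.
        self.n += sign
        self.sx += sign * x
        self.sy += sign * y
        self.sxx += sign * x * x
        self.sxy += sign * x * y

    def learn(self, x, y):
        self._apply(x, y, +1)

    def unlearn(self, x, y):
        """Remove a previously learned record, e.g. after an erasure request."""
        self._apply(x, y, -1)

    def fit(self):
        denom = self.n * self.sxx - self.sx ** 2
        if denom == 0:
            raise ValueError("need at least two distinct x values")
        slope = (self.n * self.sxy - self.sx * self.sy) / denom
        intercept = (self.sy - slope * self.sx) / self.n
        return slope, intercept
```

After `unlearn`, the fitted parameters match those of a model retrained from scratch on the remaining records, which is the property an erasure-driven workflow would want to demonstrate.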

What are homomorphic encryption and zero-knowledge proof patents, and who are the major players?

Homomorphic encryption (HE) and zero-knowledge proofs (ZKPs) are advanced cryptographic techniques that allow computation on, or verification of, private data; they create significant commercial value and a rapidly growing patent landscape.

HOMOMORPHIC ENCRYPTION, FUNDAMENTALS: DEFINITION: an encryption scheme allowing computations on ciphertext such that the result, when decrypted, equals the result of the same computation on the plaintext; 'compute without seeing'. TYPES: Partially Homomorphic Encryption (PHE): addition OR multiplication only (for example, Paillier supports addition); Fully Homomorphic Encryption (FHE): both addition AND multiplication, so any computation is possible, but computationally expensive; BGV, BFV, and CKKS are lattice-based schemes in the FHE family.

MAJOR HE PATENT HOLDERS: IBM: a leading contributor to practical HE research; IBM Research developed the HElib library (open source); IBM patents on specific FHE scheme implementations, hardware acceleration for HE bootstrapping, and specific error-correction in lattice-based cryptography. MICROSOFT: SEAL library (Simple Encrypted Arithmetic Library; open source); Microsoft Research develops HE for Azure cloud services; patents on specific optimizations for the CKKS (Cheon-Kim-Kim-Song) scheme for approximate arithmetic. GOOGLE: cryptography patents related to Chrome Certificate Transparency; Tink cryptographic library. INPHER: privacy-first data analytics using HE. ZAMA (France): open source FHE libraries (TFHE; Concrete).

COMMERCIAL HE USE CASES: privacy-preserving ML (compute on encrypted health data without seeing the data); private database queries (financial; healthcare); multi-party computation for collaborative analytics.

ZERO-KNOWLEDGE PROOFS (ZKPs): DEFINITION: a cryptographic protocol allowing one party to prove knowledge of a fact to another party without revealing the fact itself; 'prove you know without revealing what you know'.

ZKP PATENT LANDSCAPE: STARKWARE (STARK proofs: Scalable Transparent ARguments of Knowledge): Eli Ben-Sasson; computational integrity proofs; blockchain transaction validation without revealing transaction content. AZTEC PROTOCOL (PLONK proofs). ALEO: privacy-preserving blockchain using zk-SNARKs. COINBASE (zero-knowledge KYC verification). STRIPE (privacy-preserving fraud detection).

ZKP APPLICATIONS: blockchain: private transactions; identity: verification without revealing private data; supply chain: prove compliance without revealing commercial terms; AI: zero-knowledge ML (prove a model produced an output without revealing model weights).

PATENT ELIGIBILITY FOR CRYPTOGRAPHY: mathematical algorithms are abstract ideas under Alice; anchor claims in: specific hardware implementation (ASIC for FHE acceleration; specific Montgomery multiplication optimizations); specific cryptographic parameter sets; specific security reduction proofs; claim the system implementation (server + client + specific cryptographic protocol), not just the abstract mathematical operation.
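The 'compute without seeing' property is easiest to see in a partially homomorphic scheme. The following Python sketch implements a toy version of Paillier, a well-known additively homomorphic scheme chosen here purely for illustration. The primes are tiny and insecure, the function names are hypothetical, and `pow(x, -1, n)` assumes Python 3.8 or later.

```python
import math
import random


def paillier_keygen(p=1009, q=1013):
    """Build a toy key pair from two small primes (insecure sizes)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # valid because the generator is fixed to g = n + 1
    return n, (lam, mu)


def encrypt(n, message, rng=random):
    """Encrypt an integer 0 <= message < n under public key n."""
    n_sq = n * n
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, message, n_sq) * pow(r, n, n_sq)) % n_sq


def decrypt(n, private_key, ciphertext):
    lam, mu = private_key
    u = pow(ciphertext, lam, n * n)
    return ((u - 1) // n * mu) % n


def add_encrypted(n, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts (mod n)."""
    return (c1 * c2) % (n * n)
```

A party holding only the public value `n` can call `add_encrypted` on two ciphertexts; only the key holder can decrypt the sum. Fully homomorphic schemes extend this to multiplication as well, at much higher computational cost.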
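The 'prove you know without revealing what you know' idea can be sketched with the classic Schnorr identification protocol, in which a prover shows knowledge of a secret exponent without disclosing it. This is a toy Python illustration, not one of the proof systems named above (STARKs, PLONK, zk-SNARKs): the group parameters are assumed and far too small for real use, and the function names are hypothetical.

```python
import random

# Toy parameters (assumed): G = 2 generates the subgroup of prime order
# Q = 11 in the integers modulo P = 23.
P, Q, G = 23, 11, 2


def schnorr_keygen(rng):
    """Return (secret x, public y = G^x mod P)."""
    x = rng.randrange(1, Q)
    return x, pow(G, x, P)


def commit(rng):
    """Prover picks a random nonce r and sends the commitment G^r mod P."""
    r = rng.randrange(0, Q)
    return r, pow(G, r, P)


def respond(nonce, challenge, secret):
    """Prover answers the verifier's challenge; the nonce masks the secret."""
    return (nonce + challenge * secret) % Q


def verify(public, commitment, challenge, response):
    """Verifier checks G^s == t * y^c (mod P) using public values only."""
    left = pow(G, response, P)
    right = (commitment * pow(public, challenge, P)) % P
    return left == right
```

In one run, the prover calls `commit`, the verifier sends a random challenge in the range 0 to Q - 1, the prover calls `respond`, and the verifier calls `verify`. The verifier sees only the public key, commitment, challenge, and response, never the secret itself.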

How should companies protect proprietary datasets as trade secrets, and what constitutes a legally protected data trade secret?

Data is often more valuable than the algorithms that process it, and in a world where ML model performance scales with data, protecting proprietary datasets as trade secrets is increasingly critical.

TRADE SECRET PROTECTION FOR DATASETS: THE LEGAL STANDARD: DTSA and most UTSA states: trade secret = (1) information that derives independent economic value from not being generally known; AND (2) is subject to reasonable measures to maintain its secrecy.

DATASETS THAT QUALIFY: compiled customer databases (specific selection + arrangement + annotations = value from not being known; the raw data from public sources alone may not qualify); annotated training datasets (the annotation work, curation decisions, and quality filtering that produced the ML-ready dataset = trade secret even if the underlying raw content was public); proprietary benchmarks and evaluation datasets (specific test cases used to evaluate model performance; withheld from publication); de-identified clinical datasets with specific cohort selection criteria; financial transaction fraud labels (which transactions are confirmed fraud is extremely valuable and not publicly known).

WHAT IS NOT READILY PROTECTED: publicly available data downloaded from common sources (though the curation + cleaning pipeline applied to it may be protected); data that can be independently generated by competitors with reasonable effort.

REASONABLE MEASURES FOR DATA TRADE SECRETS: ACCESS CONTROLS: role-based access; minimum necessary principle; data access logs. DATA CLASSIFICATION POLICY: label proprietary datasets with a specific confidentiality classification. PHYSICAL + TECHNICAL SECURITY: encrypted at rest + in transit; air-gapped systems for the most sensitive data. VENDOR NDA: NDAs with all vendors who access the data. EMPLOYEE RESTRICTIONS: NDA + IP assignment agreement; training on data confidentiality; exit interview to confirm no data was taken. API ACCESS CONTROL: if providing API access to a model trained on proprietary data, structure the API to prevent model extraction attacks.

DATASET THEFT CASES: Epic Systems v. Tata Consultancy Services: $940M trade secret jury verdict (2016), later reduced by the courts; TCS improperly accessed confidential information about Epic's software; the case demonstrated that data access by vendors creates misappropriation risk. hiQ Labs v. LinkedIn: web scraping of public LinkedIn profiles; the dispute centered on the Computer Fraud and Abuse Act and contract claims rather than trade secrets, because the profile data was public.

PRIVACY-TRADE SECRET TENSION: proprietary medical datasets may contain HIPAA-protected PHI; HIPAA de-identification + trade secret protection can coexist; de-identification must be genuine, but the de-identified dataset can then be a trade secret.

THE GDPR ERASURE TENSION: if a training dataset contains a user who later requests erasure under GDPR Article 17, the company must remove that individual's data AND may need to retrain the model ('machine unlearning'); this process, if done incorrectly, can inadvertently reveal other trade secrets (model architecture details; other training data characteristics); handle carefully.
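The access-control measures listed above (role-based access, minimum necessary principle, data access logs) can be sketched as a small gatekeeper in Python. This is an illustrative sketch only: `DatasetGate`, the role names, and the dataset names are hypothetical, and a real system would rely on the organization's identity provider and tamper-evident log storage.

```python
from datetime import datetime, timezone


class DatasetGate:
    """Grant dataset access by role and record every attempt."""

    def __init__(self, grants):
        # grants maps a role name to the set of datasets that role may read.
        self.grants = grants
        self.access_log = []

    def request(self, user, role, dataset):
        allowed = dataset in self.grants.get(role, set())
        # Log both granted and denied attempts; the log itself is evidence
        # that reasonable secrecy measures were in place.
        self.access_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "role": role,
            "dataset": dataset,
            "allowed": allowed,
        })
        if not allowed:
            raise PermissionError(f"role {role!r} may not read {dataset!r}")
        return f"read-handle:{dataset}"
```

For example, `DatasetGate({"fraud_analyst": {"fraud_labels"}})` would let a fraud analyst read the fraud labels while denying and logging a contractor's attempt. Keeping denials in the same log as grants supports the later showing that access was actually restricted.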

Related Guides

AI IP Strategy; Trade Secret Strategy; Cybersecurity Patents; Open Source IP