How Cloud Systems Automatically Create and Train AI Data Models
A cloud-based system that generates fake, privacy-safe data to train AI models, ensuring they remain accurate while protecting sensitive personal information.
Patent Number
US 11615208
Status
Active
Filing Date
October 4, 2018
Grant Date
March 28, 2023
Expiration
~October 2038 (estimated)
Claims
23
Assignee
Capital One Services LLC
Inventors
Austin Walters, Kate Key, Mark Watson, Jeremy Goodsitt, Michael Walters, Vincent Pham, Noriaki TATSUMI, Anh Truong, Kenneth Taylor, Reza Farivar, Fardin Abdi Taghi Abad
Citations
4 forward · 121 backward
What it covers
This patent describes a cloud system that automates the creation of AI models by using synthetic data: fake data that mimics the statistical properties of real, sensitive information. The system takes a reference dataset, encodes categorical values as numbers, and uses a 'dataset generator' to create synthetic versions. During training, the system repeatedly compares the model's output to the original data, using a 'similarity metric' and a 'prediction metric' to check that the model is both accurate and statistically close to real-world patterns. If either metric falls short of its target, the system adds a penalty to the loss function, so that training keeps adjusting the model until it meets the specified quality criteria.
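As a rough illustration of the steps described above, the Python sketch below encodes categories as integers, scores how closely a synthetic column matches a reference column, and adds a penalty to a loss value when the similarity or prediction score falls below a target. None of it is taken from the patent text. The function names, the use of total variation distance as the similarity metric, the thresholds, and the penalty weight are all assumptions chosen for illustration.

```python
from collections import Counter


def encode_categories(values):
    """Map each distinct category to an integer code, in first-seen order."""
    codes = {}
    encoded = []
    for value in values:
        if value not in codes:
            codes[value] = len(codes)
        encoded.append(codes[value])
    return encoded, codes


def similarity_metric(reference, synthetic):
    """Return 1 minus the total variation distance between two code columns.

    1.0 means identical category frequencies, 0.0 means no overlap.
    (Assumed metric; the patent summary does not name a specific one.)
    """
    ref_counts = Counter(reference)
    syn_counts = Counter(synthetic)
    keys = set(ref_counts) | set(syn_counts)
    distance = 0.5 * sum(
        abs(ref_counts[k] / len(reference) - syn_counts[k] / len(synthetic))
        for k in keys
    )
    return 1.0 - distance


def penalized_loss(base_loss, similarity, prediction,
                   min_similarity=0.9, min_prediction=0.8, weight=10.0):
    """Add a penalty proportional to how far each metric falls below its target."""
    shortfall = (max(0.0, min_similarity - similarity)
                 + max(0.0, min_prediction - prediction))
    return base_loss + weight * shortfall


if __name__ == "__main__":
    reference, codes = encode_categories(["gold", "gold", "silver", "basic"])
    synthetic = [codes[c] for c in ["gold", "silver", "silver", "basic"]]
    score = similarity_metric(reference, synthetic)
    # prediction=0.85 stands in for a model accuracy measured elsewhere.
    print(penalized_loss(base_loss=1.0, similarity=score, prediction=0.85))
```

In this sketch the penalty is zero whenever both metrics meet their targets, so the loss reduces to the ordinary training loss once the quality criteria are satisfied.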
What it doesn't cover
- Does not cover the use of real, unmasked personal data for training purposes.
- Does not cover manual, non-automated methods of data labeling or model training.
- Does not cover hardware-specific AI acceleration (e.g., specific GPU architectures).
- Does not cover methods that do not involve a penalty-based loss function for synthetic data generation.
The clever bit
The system uses a feedback loop that treats the 'similarity' of the synthetic data as a constraint in the loss function, effectively forcing the AI to learn the structure of the data without ever seeing the actual sensitive values.
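The feedback loop can be pictured with a toy generator that has a single parameter `p`, the rate at which it emits a "1". The sketch is hypothetical and far simpler than anything in the patent. The names `generator_similarity`, `similarity_penalty`, and `tune_generator`, the step size, and the 0.95 target are assumptions. The point it shows is that the tuning loop only reads a similarity score and a penalty, not individual records.

```python
def generator_similarity(p, reference_rate):
    """Similarity between Bernoulli(p) and Bernoulli(reference_rate).

    Defined as 1 minus the total variation distance, which is |p - reference_rate|.
    """
    return 1.0 - abs(p - reference_rate)


def similarity_penalty(p, reference_rate, min_similarity, weight=10.0):
    """Penalty term: zero once the similarity target is met."""
    shortfall = max(0.0, min_similarity - generator_similarity(p, reference_rate))
    return weight * shortfall


def tune_generator(reference_rate, p=0.5, min_similarity=0.95,
                   step=0.01, max_iters=1000):
    """Nudge p in whichever direction lowers the penalty until the target is met.

    Returns the tuned parameter and the number of adjustment steps taken.
    """
    for iteration in range(max_iters):
        if generator_similarity(p, reference_rate) >= min_similarity:
            return p, iteration
        candidates = [min(1.0, p + step), max(0.0, p - step)]
        p = min(candidates,
                key=lambda c: similarity_penalty(c, reference_rate, min_similarity))
    return p, max_iters


if __name__ == "__main__":
    tuned_p, steps = tune_generator(reference_rate=0.2)
    print(tuned_p, steps, generator_similarity(tuned_p, 0.2))
```

A real generator would have many parameters and would be updated by gradient descent rather than this one-dimensional search, but the stopping rule is the same idea: keep adjusting while the penalty is nonzero.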
Why it matters
In industries like banking, privacy laws often restrict how companies can use real customer data (such as account numbers or social security numbers) to train AI models. This patent provides a technical framework for 'privacy-preserving' AI development, allowing companies to build machine learning tools with less risk of exposing customer data or violating regulations like GDPR or CCPA.
Real-world examples
1. Fraud detection systems in banking
2. Credit risk assessment models
3. Automated customer service chatbots
4. Privacy-compliant data analysis platforms
Generated by PatentBrief · Not legal advice · patentbrief.org