# How Cloud Systems Automatically Create and Train AI Data Models

> A cloud-based system that generates fake, privacy-safe data to train AI models, ensuring they remain accurate while protecting sensitive personal information.

- **Patent:** US 11615208
- **Original title:** Systems and methods for synthetic data generation
- **Owner:** Capital One Services LLC
- **Granted:** 2023
- **Status:** Active
- **Times cited:** 4
- **Field:** AI/ML, finance, software, telecommunications

## What it does

This patent describes a cloud system that automates the creation of AI models by using synthetic data: fake data that mimics the statistical properties of real, sensitive information. The system takes a reference dataset, converts categorical values into numbers, and uses a 'dataset generator' to create synthetic versions. During training, the system repeatedly compares the model's output to the reference data, using a 'similarity metric' to check that the output is statistically close to real-world patterns and a 'prediction metric' to check that it is accurate. If either metric falls short of its target, the system applies a penalty to the loss function, pushing the model to adjust until it meets the specified quality criteria.
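The steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patent's actual method: the function names, the toy data, the Gaussian sampler standing in for the 'dataset generator', and the particular similarity formula are all hypothetical choices made for this example. The prediction metric is left out to keep the sketch short.

```python
import numpy as np

def encode_categories(values):
    """Map each distinct category to an integer code (sorted for determinism)."""
    categories = sorted(set(values))
    lookup = {cat: code for code, cat in enumerate(categories)}
    return np.array([lookup[v] for v in values], dtype=float), lookup

def generate_synthetic(reference, n_rows, rng):
    """Hypothetical stand-in for a 'dataset generator':
    sample rows from a Gaussian fitted to the reference data."""
    mean = reference.mean(axis=0)
    cov = np.cov(reference, rowvar=False)
    return rng.multivariate_normal(mean, cov, size=n_rows)

def similarity_metric(reference, synthetic):
    """Assumed similarity score in (0, 1]; 1.0 means column means
    and covariances match exactly."""
    mean_gap = np.abs(reference.mean(axis=0) - synthetic.mean(axis=0)).max()
    cov_gap = np.abs(
        np.cov(reference, rowvar=False) - np.cov(synthetic, rowvar=False)
    ).max()
    return 1.0 / (1.0 + mean_gap + cov_gap)

# Toy reference dataset: account type (categorical) and balance in thousands.
account_type = ["checking", "savings", "checking", "credit", "savings", "checking"]
balance = [1.2, 5.3, 0.8, 0.15, 4.7, 0.95]

# Step 1: turn categories into numbers.
codes, lookup = encode_categories(account_type)
reference = np.column_stack([codes, balance])

# Step 2: create a synthetic version of the reference dataset.
rng = np.random.default_rng(seed=0)
synthetic = generate_synthetic(reference, n_rows=1000, rng=rng)

# Step 3: measure how closely the synthetic data tracks the reference.
score = similarity_metric(reference, synthetic)
print(lookup)
print(f"similarity: {score:.3f}")
```

A real generator would likely be a trained model rather than a fitted Gaussian, and integer codes are only one way to encode categories. The point of the sketch is the shape of the pipeline: encode, generate, then score the result against the reference.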

## What it does NOT cover

- Does not cover the use of real, unmasked personal data for training purposes.
- Does not cover manual, non-automated methods of data labeling or model training.
- Does not cover hardware-specific AI acceleration (e.g., specific GPU architectures).
- Does not cover methods that do not involve a penalty-based loss function for synthetic data generation.

## The clever bit

The system uses a feedback loop that treats the 'similarity' of the synthetic data as a constraint in the loss function. This pushes the generator to reproduce the structure of the reference data, so that models can then be trained on synthetic records rather than on the actual sensitive values.
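One way to picture a similarity constraint inside a loss function is a hinge-style penalty: the loss is unchanged while the metrics meet their targets and grows as they fall short. The sketch below assumes this form for illustration only. The thresholds, the penalty weight, and the function names are hypothetical and are not taken from the patent.

```python
def penalized_loss(base_loss, similarity, prediction_score,
                   min_similarity=0.9, min_prediction=0.8, weight=10.0):
    """Add a penalty proportional to how far each metric falls below its target.

    All thresholds and the weight are illustrative assumptions.
    """
    similarity_shortfall = max(0.0, min_similarity - similarity)
    prediction_shortfall = max(0.0, min_prediction - prediction_score)
    return base_loss + weight * (similarity_shortfall + prediction_shortfall)

def meets_criteria(similarity, prediction_score,
                   min_similarity=0.9, min_prediction=0.8):
    """Training can stop once both metrics reach their targets."""
    return similarity >= min_similarity and prediction_score >= min_prediction

# Both metrics on target: the loss is left unchanged.
on_target = penalized_loss(base_loss=0.5, similarity=0.95, prediction_score=0.85)

# Similarity below target: the loss is inflated, pushing the model to adjust.
off_target = penalized_loss(base_loss=0.5, similarity=0.70, prediction_score=0.85)

print(on_target, off_target)
```

Because the penalty is zero once the targets are met, it acts as a constraint rather than a second objective: the model is not rewarded for exceeding the similarity target, only penalized for missing it.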

## Real-world examples

1. Fraud detection systems in banking
2. Credit risk assessment models
3. Automated customer service chatbots
4. Privacy-compliant data analysis platforms

## Why it matters

In industries like banking, companies often cannot freely use real customer data (like account numbers or social security numbers) to train AI models because of strict privacy laws. This patent provides a technical framework for 'privacy-preserving' AI development, allowing companies to build powerful machine learning tools while reducing the risk of data breaches and of violating regulations like GDPR or CCPA.

## Frequently asked questions

### What does "How Cloud Systems Automatically Create and Train AI Data Models" cover?

It covers a cloud-based system that generates fake, privacy-safe data to train AI models, keeping them accurate while protecting sensitive personal information.

### Who owns patent US 11615208?

Capital One Services LLC owns this patent, granted in 2023.

### When does this patent expire?

This patent is expected to expire on March 28, 2043, when the invention enters the public domain.

### What is patent US 11615208 cited by?

This patent has been cited by 4 later patents that build on its ideas.

### What problem does this patent solve?

In industries like banking, companies often cannot freely use real customer data (like account numbers or social security numbers) to train AI models because of strict privacy laws. This patent provides a technical framework for 'privacy-preserving' AI development, allowing companies to build powerful machine learning tools while reducing the risk of data breaches and of violating regulations like GDPR or CCPA.

### What does this patent NOT cover?

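The patent does not cover the use of real, unmasked personal data for training purposes. It also excludes manual data labeling or model training, hardware-specific AI acceleration, and methods that do not use a penalty-based loss function for synthetic data generation.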

**Full plain-English explainer:** https://patentbrief.org/patent/us/11615208/dall-e-text-to-image-generation

**Original patent:** https://patents.google.com/patent/US11615208

---

_Source: PatentBrief — https://patentbrief.org. Patent facts are from public records; the plain-English explanation is PatentBrief's._


## Related patents

Semantically similar inventions in the PatentBrief corpus:

- [Training AI on Private Data Without Seeing It](https://patentbrief.org/patent/us/12518214/distributed-machine-learning-systems-including-generation-of-synthetic-data) — This patent describes a way to train artificial intelligence models using private data stored on many separate computers, by generating fake data that mimics the real data's patterns, so the private data itself never leaves its original location.
- [How to Automatically Detect and Fix Changes in AI Model Data](https://patentbrief.org/patent/us/10599957/systems-and-methods-for-detecting-data-drift-for-data-used-in-machine-learning-m) — This patent describes a system that automatically notices when the real-world data an AI model sees changes, causing its predictions to become less accurate, and then fixes the model.
- [Training Robot AI Models Faster Using Smart Simulations](https://patentbrief.org/patent/us/11836577/reinforcement-learning-model-training-through-simulation) — This patent describes a cloud service that helps train artificial intelligence models for robots by running simulations, even suggesting improvements to the AI's learning rules before starting.
- [How Devices Train Shared AI Models While Keeping Your Data Private](https://patentbrief.org/patent/us/12443890/partially-local-federated-learning) — This patent describes a method for training a machine learning model across many devices, where each device keeps some parts of the model and its data private, only sharing updates for the common, global parts of the model.
- [Training AI Models Across Different Computers](https://patentbrief.org/patent/us/12574477/distributed-deep-learning-using-a-distributed-deep-neural-network) — This 2026 patent describes a way to train AI models on one computer, send a version to another computer for further training with private data, and then update the original model with the improvements.