# How Cloud Systems Automatically Create and Train AI Data Models

> A cloud-based system that generates fake, privacy-safe data to train AI models, ensuring they remain accurate while protecting sensitive personal information.

- **Patent:** US 11615208
- **Original title:** Systems and methods for synthetic data generation
- **Owner:** Capital One Services LLC
- **Granted:** 2023
- **Status:** Active
- **Times cited:** 4
- **Field:** ai_ml, finance, software, telecommunications

## What it does

This patent describes a cloud system that automates the creation of AI models using synthetic data: artificial data that mimics the statistical properties of real, sensitive information. The system takes a reference dataset, converts categorical values into numbers, and uses a 'dataset generator' to create synthetic versions. During training, it repeatedly compares the model's output against the reference data, using a 'similarity metric' to check that the output is statistically similar to real-world patterns and a 'prediction metric' to check that the model is accurate. If either metric falls short of its target, the system adds a penalty to the loss function, so training keeps adjusting the model until it meets the specified quality criteria.
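The steps in this summary can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the function names (`encode_categories`, `generate_synthetic`, `similarity_metric`), the frequency-based generator, and the mean-and-spread similarity score are all assumptions made for the example.

```python
import random
import statistics


def encode_categories(values):
    """Map each distinct category to an integer code (sorted for determinism)."""
    codes = {cat: i for i, cat in enumerate(sorted(set(values)))}
    return [codes[v] for v in values], codes


def generate_synthetic(reference_codes, n, rng):
    """Toy 'dataset generator' (assumption): draw codes with the same
    frequencies as the reference column. A real generator would be a
    trained model, not a resampler."""
    return rng.choices(reference_codes, k=n)


def similarity_metric(reference, synthetic):
    """Hypothetical similarity score in (0, 1].

    Compares mean and population standard deviation; 1.0 means both match.
    """
    mean_gap = abs(statistics.mean(reference) - statistics.mean(synthetic))
    std_gap = abs(statistics.pstdev(reference) - statistics.pstdev(synthetic))
    return 1.0 / (1.0 + mean_gap + std_gap)


# Usage: encode a categorical column, generate a synthetic one, score it.
account_types = ["checking", "savings", "checking", "credit", "checking"]
reference_codes, mapping = encode_categories(account_types)
synthetic_codes = generate_synthetic(reference_codes, 100, random.Random(0))
score = similarity_metric(reference_codes, synthetic_codes)
print(mapping, round(score, 3))
```

A score near 1.0 suggests the synthetic column has roughly the same center and spread as the reference column. A production similarity metric would likely compare full distributions and cross-column correlations rather than two summary statistics.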

## What it does NOT cover

- Does not cover the use of real, unmasked personal data for training purposes.
- Does not cover manual, non-automated methods of data labeling or model training.
- Does not cover hardware-specific AI acceleration (e.g., specific GPU architectures).
- Does not cover methods that do not involve a penalty-based loss function for synthetic data generation.

## The clever bit

The system uses a feedback loop that treats the 'similarity' of the synthetic data as a constraint in the loss function, effectively forcing the AI to learn the structure of the data without ever seeing the actual sensitive values.
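One way to picture the similarity constraint is as a penalty term added to an ordinary training loss. The sketch below is an illustration under stated assumptions: the thresholds, the penalty weight, and the names `penalized_loss` and `meets_criteria` are hypothetical and do not come from the patent text.

```python
def penalized_loss(base_loss, similarity, prediction_score,
                   min_similarity=0.9, min_prediction=0.8, weight=10.0):
    """Add a penalty when either metric falls below its target.

    All thresholds and the weight are illustrative assumptions.
    """
    sim_shortfall = max(0.0, min_similarity - similarity)
    pred_shortfall = max(0.0, min_prediction - prediction_score)
    return base_loss + weight * (sim_shortfall + pred_shortfall)


def meets_criteria(similarity, prediction_score,
                   min_similarity=0.9, min_prediction=0.8):
    """Stopping check: both metrics must reach their targets."""
    return similarity >= min_similarity and prediction_score >= min_prediction


# Usage: a model that is accurate but not similar enough is penalized.
print(penalized_loss(1.0, similarity=0.95, prediction_score=0.85))
print(penalized_loss(1.0, similarity=0.80, prediction_score=0.85))
print(meets_criteria(0.80, 0.85))
```

Because the penalty is zero once both targets are met, it only influences training while the model is out of bounds, which is what makes it behave like a constraint rather than a second objective.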

## Real-world examples

1. Fraud detection systems in banking
2. Credit risk assessment models
3. Automated customer service chatbots
4. Privacy-compliant data analysis platforms

## Why it matters

In industries like banking, strict privacy laws often prevent companies from freely using real customer data (like account numbers or social security numbers) to train AI models. This patent provides a technical framework for 'privacy-preserving' AI development, allowing companies to build machine learning tools while reducing the risk of data breaches or of violating regulations like GDPR or CCPA.

## Frequently asked questions

### What does "How Cloud Systems Automatically Create and Train AI Data Models" cover?

A cloud-based system that generates fake, privacy-safe data to train AI models, ensuring they remain accurate while protecting sensitive personal information.

### Who owns patent US 11615208?

Capital One Services LLC owns this patent, granted in 2023.

### When does this patent expire?

This patent is expected to expire on March 28, 2043, when the invention enters the public domain.

### How many patents cite patent US 11615208?

This patent has been cited by 4 later patents that build on its ideas.

### What problem does this patent solve?

In industries like banking, strict privacy laws often prevent companies from freely using real customer data (like account numbers or social security numbers) to train AI models. This patent provides a technical framework for 'privacy-preserving' AI development, allowing companies to build machine learning tools while reducing the risk of data breaches or of violating regulations like GDPR or CCPA.

### What does this patent NOT cover?

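It does not cover training on real, unmasked personal data; manual, non-automated methods of data labeling or model training; hardware-specific AI acceleration (e.g., specific GPU architectures); or methods that do not involve a penalty-based loss function for synthetic data generation.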

**Full plain-English explainer:** https://patentbrief.org/patent/us/11615208/dall-e-text-to-image-generation

**Original patent:** https://patents.google.com/patent/US11615208

---

_Source: PatentBrief — https://patentbrief.org. Patent facts are from public records; the plain-English explanation is PatentBrief's._
