{
  "patent_number": "US 20230297446",
  "country": "US",
  "title": "How to Train AI Models with Fake Data Using Generative Networks",
  "original_title": "Data model generation using generative adversarial networks",
  "summary": "This patent describes a method for training artificial intelligence models on fake, or 'synthetic,' data created by a generative adversarial network, with quality checks applied to the synthetic data before it is used for training.",
  "what_it_does": "The patent outlines a system for generating data models, like those used in AI, by first creating synthetic data. A \"model optimizer\" receives a request to build a data model (Claim 21). It then sets up computing resources and uses a \"generative network\" to create a synthetic dataset. This generative network includes a \"decoder network\" that takes simplified \"decoder input data\" from a \"code space\" and transforms it into more complex \"decoder output data\" in a \"sample space\" (Claim 21). The generative network is trained to ensure the synthetic data's structure, or \"schema,\" matches that of real \"reference data\" (Claim 22). Before training, the model optimizer can check the synthetic data's quality by calculating scores for statistical correlation, data similarity, or overall data quality compared to the real data (Claim 23). If the synthetic data meets certain quality standards (Claim 24), the computing resources then use it to train the actual data model. Finally, this trained data model can be used to process real \"production data\" (Claim 21). For example, a bank could use this to generate fake customer transaction data that looks real but contains no actual customer information, then train a fraud detection AI on this fake data.",
  "what_it_does_not_cover": [
    "Does not cover generating synthetic data without using a generative network that specifically includes a decoder network transforming data from a code space to a sample space. (Claim 21)",
    "Does not cover training data models directly with real-world, non-synthetic datasets. (Claim 21)",
    "Does not cover synthetic data generation where the output data's schema does not match a reference dataset's schema. (Claim 22)",
    "Does not cover methods that omit evaluating the synthetic dataset using at least one of a statistical correlation score, a data similarity score, or a data quality score. (Claim 23)",
    "Does not cover scenarios where the 'code space' for the decoder input data has a dimensionality equal to or greater than the 'sample space' of the decoder output data. (Claim 25)",
    "Does not cover systems that lack a 'model optimizer' to manage the request, resource provisioning, and evaluation steps. (Claim 21)"
  ],
  "filed": "2023-05-22",
  "granted": null,
  "expires": "2043-05-22",
  "status": "active",
  "holder": "Capital One Services",
  "holder_url": "https://patentbrief.org/company/capital-one-services",
  "inventors": [
    {
      "name": "Austin Walters",
      "url": "https://patentbrief.org/inventor/austin-walters"
    },
    {
      "name": "Kate Key",
      "url": "https://patentbrief.org/inventor/kate-key"
    },
    {
      "name": "Mark Watson",
      "url": "https://patentbrief.org/inventor/mark-watson"
    },
    {
      "name": "Jeremy Goodsitt",
      "url": "https://patentbrief.org/inventor/jeremy-goodsitt"
    },
    {
      "name": "Vincent Pham",
      "url": "https://patentbrief.org/inventor/vincent-pham"
    },
    {
      "name": "Anh Truong",
      "url": "https://patentbrief.org/inventor/anh-truong"
    },
    {
      "name": "Kenneth Taylor",
      "url": "https://patentbrief.org/inventor/kenneth-taylor"
    },
    {
      "name": "Reza Farivar",
      "url": "https://patentbrief.org/inventor/reza-farivar"
    },
    {
      "name": "Fardin Abdi Taghi Abad",
      "url": "https://patentbrief.org/inventor/fardin-abdi-taghi-abad"
    }
  ],
  "times_cited": 0,
  "tags": [
    "ai_ml",
    "software",
    "finance",
    "telecommunications",
    "consumer_electronics"
  ],
  "abstract": "Methods for generating data models using a generative adversarial network can begin by receiving a data model generation request by a model optimizer from an interface. The model optimizer can provision computing resources with a data model. As a further step, a synthetic dataset for training the data model can be generated using a generative network of a generative adversarial network, the generative network trained to generate output data differing at least a predetermined amount from a reference dataset according to a similarity metric. The computing resources can train the data model using the synthetic dataset. The model optimizer can evaluate performance criteria of the data model and, based on the evaluation of the performance criteria of the data model, store the data model and metadata of the data model in a model storage. The data model can then be used to process production data.",
  "url": "https://patentbrief.org/patent/us/20230297446/data-model-generation-using-generative-adversarial-networks",
  "markdown_url": "https://patentbrief.org/patent/us/20230297446/data-model-generation-using-generative-adversarial-networks/md",
  "google_patents_url": "https://patents.google.com/patent/US20230297446",
  "relatedPatents": []
}