# How AI Systems Learn to Predict and Act Simultaneously

> A method for training AI models that combines supervised learning for prediction with reinforcement learning for decision-making in a single, coordinated system.

- **Patent:** US 11170293
- **Original title:** Multi-model controller
- **Owner:** Microsoft Technology Licensing LLC
- **Granted:** 2021
- **Status:** Active
- **Times cited:** 4
- **Field:** AI/ML, software, mechanical, telecommunications

## What it does

This patent describes a system in which two neural networks work in tandem to improve how an AI makes decisions. First, a Recurrent Neural Network (RNN) processes incoming data to produce a 'state' (a summary of what is happening) and a predicted result. Second, a Q-Network (QN) takes that state and suggests the best action. When the 'reference result' (the actual outcome) arrives, the system updates the RNN with supervised learning and, in the same cycle, updates the Q-Network with reinforcement learning based on the action taken and the new state. Over time, the system gets better at both understanding its environment and choosing the right actions.
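
The loop described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the patent's implementation: the class names `TinyRNN` and `TinyQNetwork`, the sine-wave environment, the reward rule (1 if the chosen action matches the sign of the reference result), the learning rates, and the one-step truncated gradient are all hypothetical choices made to keep the example small.

```python
import math
import random


class TinyRNN:
    """Hypothetical one-unit recurrent cell: h = tanh(w_in*x + w_rec*h_prev)."""

    def __init__(self, rng):
        self.w_in = rng.uniform(-0.5, 0.5)
        self.w_rec = rng.uniform(-0.5, 0.5)
        self.w_out = rng.uniform(-0.5, 0.5)
        self.h = self.h_prev = self.x = 0.0

    def step(self, x):
        """Consume one observation; return (state, predicted result)."""
        self.h_prev, self.x = self.h, x
        self.h = math.tanh(self.w_in * x + self.w_rec * self.h_prev)
        return self.h, self.w_out * self.h

    def supervised_update(self, reference, lr=0.05):
        """One gradient step on squared error (truncated to the last step)."""
        err = self.w_out * self.h - reference
        dh = err * self.w_out * (1.0 - self.h ** 2)
        self.w_out -= lr * err * self.h
        self.w_in -= lr * dh * self.x
        self.w_rec -= lr * dh * self.h_prev
        return err ** 2


class TinyQNetwork:
    """Hypothetical linear Q-function: Q(s, a) = w[a]*s + b[a]."""

    def __init__(self, n_actions):
        self.w = [0.0] * n_actions
        self.b = [0.0] * n_actions

    def q_values(self, state):
        return [w * state + b for w, b in zip(self.w, self.b)]

    def act(self, state, rng, epsilon=0.1):
        """Epsilon-greedy action selection from the RNN state."""
        if rng.random() < epsilon:
            return rng.randrange(len(self.w))
        q = self.q_values(state)
        return q.index(max(q))

    def rl_update(self, state, action, reward, next_state, lr=0.05, gamma=0.9):
        """Standard Q-learning (temporal-difference) update."""
        target = reward + gamma * max(self.q_values(next_state))
        td_error = target - self.q_values(state)[action]
        self.w[action] += lr * td_error * state
        self.b[action] += lr * td_error
        return td_error


def train(steps=500, seed=0):
    rng = random.Random(seed)
    rnn, qn = TinyRNN(rng), TinyQNetwork(n_actions=2)
    state, _ = rnn.step(math.sin(0.0))           # first observation
    losses = []
    for t in range(1, steps + 1):
        action = qn.act(state, rng)              # QN suggests an action
        reference = math.sin(0.3 * t)            # actual outcome arrives
        losses.append(rnn.supervised_update(reference))  # supervised step
        # Assumed reward: action 1 means "positive", action 0 "not positive".
        reward = 1.0 if (action == 1) == (reference > 0) else 0.0
        next_state, _ = rnn.step(reference)      # new state from new data
        qn.rl_update(state, action, reward, next_state)  # RL step
        state = next_state
    return rnn, qn, losses


if __name__ == "__main__":
    _, _, losses = train()
    print("mean loss, first 50 steps:", sum(losses[:50]) / 50)
    print("mean loss, last 50 steps: ", sum(losses[-50:]) / 50)
```

The point of the sketch is the ordering inside `train`: the same reference result drives the RNN's supervised step and, through the assumed reward, the Q-Network's reinforcement step, and the Q-Network always sees the state produced by the RNN.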

## What it does NOT cover

- Does not cover systems that rely solely on supervised learning without a reinforcement learning feedback loop.
- Does not cover static models that do not update their parameters (the 'second' model versions) based on new observation values.
- Does not cover non-recurrent neural networks that lack the ability to maintain state information over time.
- Does not cover systems that do not use a communications interface to receive external reference result values.

## The clever bit

The innovation lies in using the same reference result to update both the predictive model (RNN) and the decision-making model (QN) in a coordinated loop, ensuring the 'state' information used for decision-making is constantly being refined by the predictive model's accuracy.
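
One way to write this coordinated loop down, as a sketch rather than the patent's own notation: every symbol below is an assumption introduced for illustration, including the squared-error loss, the standard Q-learning target, and the idea that the reward $r_t$ is computed from the reference result $y_t$ and the action $a_t$.

```latex
\begin{aligned}
(s_t, \hat{y}_t) &= \mathrm{RNN}_\theta(x_t, s_{t-1})
  && \text{state and predicted result} \\
a_t &= \arg\max_{a} Q_\phi(s_t, a)
  && \text{action suggested by the Q-Network} \\
\mathcal{L}_{\mathrm{RNN}}(\theta) &= \lVert \hat{y}_t - y_t \rVert^2
  && \text{supervised loss against the reference result } y_t \\
\mathcal{L}_{\mathrm{QN}}(\phi) &= \bigl( r_t + \gamma \max_{a'} Q_\phi(s_{t+1}, a') - Q_\phi(s_t, a_t) \bigr)^2
  && \text{reinforcement (temporal-difference) loss}
\end{aligned}
```

Here $x_t$ is the incoming data, $\theta$ and $\phi$ are the parameters of the two networks, and $\gamma$ is a discount factor. Because $s_t$ and $s_{t+1}$ come from the RNN, improving $\theta$ changes the inputs on which the Q-Network is trained.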

## Real-world examples

1. Autonomous robotics navigation
2. Real-time industrial process control
3. Automated trading systems
4. Adaptive game AI agents

## Why it matters

This patent matters because it touches on the 'credit assignment' problem in AI: a system that receives both immediate feedback and delayed rewards must work out which of its earlier predictions and actions led to a good outcome. By training prediction (RNN) and action (QN) together, it helps bridge the gap between simple pattern recognition and complex decision-making. That makes it relevant to autonomous agents that must navigate dynamic environments where rewards are delayed or sparse.

## Frequently asked questions

### What does 'How AI Systems Learn to Predict and Act Simultaneously' cover?

A method for training AI models that combines supervised learning for prediction with reinforcement learning for decision-making in a single, coordinated system.

### Who owns patent US 11170293?

Microsoft Technology Licensing LLC owns this patent, granted in 2021.

### When does this patent expire?

This patent is expected to expire on November 9, 2041, when the invention enters the public domain.

### What is patent US 11170293 cited by?

This patent has been cited by 4 later patents that build on its ideas.

### What problem does this patent solve?

It touches on the 'credit assignment' problem in AI: a system that receives both immediate feedback and delayed rewards must work out which of its earlier predictions and actions led to a good outcome. By training prediction (RNN) and action (QN) together, the patent helps bridge the gap between simple pattern recognition and complex decision-making, which is relevant to autonomous agents in dynamic environments where rewards are delayed or sparse.

### What does this patent NOT cover?

It does not cover systems that rely solely on supervised learning without a reinforcement learning feedback loop.

**Full plain-English explainer:** https://patentbrief.org/patent/us/11170293/alphago-policy-and-value-networks

**Original patent:** https://patents.google.com/patent/US11170293

---

_Source: PatentBrief — https://patentbrief.org. Patent facts are from public records; the plain-English explanation is PatentBrief's._
