Action selection with a reward estimator applied to machine learning
Original patent title: “Action selection with a reward estimator applied to machine learning”
What this patent covers
The actual claim
There is provided an information processing apparatus including a reward estimator generator using action history data, including state data expressing a state, action data expressing an action taken by an agent, and a reward value expressing a reward obtained as a result of the action, as learning data to generate, through machine learning, a reward estimator estimating a reward value from inputted state data and action data. The reward estimator generator includes: a basis function generator generating a plurality of basis functions; a feature amount vector calculator calculating feature amount vectors by inputting state data and action data in the action history data into the basis functions; and an estimation function calculator calculating an estimation function estimating the reward value included in the action history data from the feature amount vectors according to regressive/discriminative learning. The reward estimator includes the plurality of basis functions and the estimation function.
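The pipeline described above (basis functions, feature amount vectors, then an estimation function fitted to the recorded rewards) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the basis functions are random Gaussian radial basis functions, the "regressive" learning step is plain ridge regression, and the names `make_basis_functions`, `feature_vector`, `fit_linear`, `build_reward_estimator`, and `select_action` are hypothetical.

```python
import math
import random


def make_basis_functions(n_basis, dim, rng):
    """Assumed basis generator: random Gaussian RBFs over the joint (state, action) vector."""
    functions = []
    for _ in range(n_basis):
        center = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        width = rng.uniform(0.5, 1.5)

        # Default arguments bind this iteration's center and width to the closure.
        def phi(x, c=center, w=width):
            sq = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
            return math.exp(-sq / (2.0 * w * w))

        functions.append(phi)
    return functions


def feature_vector(basis, state, action):
    """Feed state data and action data into every basis function (plus an assumed bias term)."""
    x = list(state) + list(action)
    return [1.0] + [phi(x) for phi in basis]


def fit_linear(features, rewards, ridge=1e-3):
    """Ridge regression: solve (X^T X + ridge * I) w = X^T y by Gaussian elimination."""
    n = len(features[0])
    a = [[sum(f[i] * f[j] for f in features) + (ridge if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    b = [sum(f[i] * r for f, r in zip(features, rewards)) for i in range(n)]
    for col in range(n):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    w = [0.0] * n
    for i in reversed(range(n)):
        s = sum(a[i][k] * w[k] for k in range(i + 1, n))
        w[i] = (b[i] - s) / a[i][i]
    return w


def build_reward_estimator(history, n_basis=20, seed=0):
    """history: list of (state, action, reward) tuples; returns estimate(state, action)."""
    rng = random.Random(seed)
    dim = len(history[0][0]) + len(history[0][1])
    basis = make_basis_functions(n_basis, dim, rng)
    feats = [feature_vector(basis, s, a) for s, a, _ in history]
    weights = fit_linear(feats, [r for _, _, r in history])

    def estimate(state, action):
        feats_sa = feature_vector(basis, state, action)
        return sum(w * f for w, f in zip(weights, feats_sa))

    return estimate


def select_action(estimate, state, candidate_actions):
    """Pick the candidate action with the highest estimated reward."""
    return max(candidate_actions, key=lambda act: estimate(state, act))


# Toy action history (assumed): reward is highest when the action matches the state.
data_rng = random.Random(1)
history = []
for _ in range(200):
    s, act = data_rng.uniform(-1, 1), data_rng.uniform(-1, 1)
    history.append(((s,), (act,), -(s - act) ** 2))

estimate = build_reward_estimator(history)
print(select_action(estimate, (0.8,), [(-1.0,), (0.0,), (1.0,)]))
```

In this sketch the returned `estimate` closure holds both the basis functions and the fitted weights, which mirrors the statement that the reward estimator includes the plurality of basis functions and the estimation function. The greedy `select_action` step is one plausible way to use such an estimator, suggested by the patent title rather than spelled out in the text above.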
Patent Journey
From filing to today
Patent Filed: 2015
Patent Granted: 2019 (4 years after filing)
Active Today: 2026
Expires: 2035
PatentBrief Score
Impact Score: Early stage
Citation count: 0/40 (No citations yet)
Claim breadth: 9/20 (Moderate scope)
Recency: 10/20 (Granted 5–10 years ago)
Assignee scale: 20/20 (Major technology company)
PatentBrief Impact Score — based on citation count, claim breadth, recency, and assignee scale. Not a legal assessment.
The original legal language
Original claims
14 claims as filed with the patent office.