PatentBrief

Action selection with a reward estimator applied to machine learning

Granted 2019 · Active · Expires 2035 · Owned by Sony Corp · Invented by Yoshiyuki Kobayashi

Original patent title: “Action selection with a reward estimator applied to machine learning”

What this patent covers

The actual claim

There is provided an information processing apparatus including a reward estimator generator using action history data, including state data expressing a state, action data expressing an action taken by an agent, and a reward value expressing a reward obtained as a result of the action, as learning data to generate, through machine learning, a reward estimator estimating a reward value from inputted state data and action data. The reward estimator generator includes: a basis function generator generating a plurality of basis functions; a feature amount vector calculator calculating feature amount vectors by inputting state data and action data in the action history data into the basis functions; and an estimation function calculator calculating an estimation function estimating the reward value included in the action history data from the feature amount vectors according to regressive/discriminative learning. The reward estimator includes the plurality of basis functions and the estimation function.
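The pipeline described above (basis functions, then feature amount vectors, then a fitted estimation function) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation. The text above does not say which basis functions are generated or which regression method is used, so the sketch substitutes degree-2 monomials and ridge-regularised least squares. All function names are hypothetical, and `select_action` is inferred only from the patent title.

```python
from itertools import combinations_with_replacement


def generate_basis_functions(input_dim):
    """Basis function generator (assumed family: constant, linear, quadratic monomials)."""
    basis = [lambda x: 1.0]
    for i in range(input_dim):
        basis.append(lambda x, i=i: x[i])
    for i, j in combinations_with_replacement(range(input_dim), 2):
        basis.append(lambda x, i=i, j=j: x[i] * x[j])
    return basis


def feature_vector(basis, state, action):
    """Feature amount vector: every basis function applied to the joined state and action."""
    x = list(state) + list(action)
    return [phi(x) for phi in basis]


def solve_linear_system(a, b):
    """Solve a @ w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    w = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (m[i][n] - tail) / m[i][i]
    return w


def fit_estimation_function(features, rewards, ridge=1e-9):
    """Estimation function calculator (assumed method: ridge least squares)."""
    d = len(features[0])
    gram = [[sum(f[i] * f[j] for f in features) + (ridge if i == j else 0.0)
             for j in range(d)] for i in range(d)]
    rhs = [sum(f[i] * r for f, r in zip(features, rewards)) for i in range(d)]
    return solve_linear_system(gram, rhs)


def build_reward_estimator(history):
    """Build estimate(state, action) from (state, action, reward) history tuples."""
    input_dim = len(history[0][0]) + len(history[0][1])
    basis = generate_basis_functions(input_dim)
    features = [feature_vector(basis, s, a) for s, a, _ in history]
    weights = fit_estimation_function(features, [r for _, _, r in history])

    def estimate(state, action):
        return sum(w * f for w, f in zip(weights, feature_vector(basis, state, action)))

    return estimate


def select_action(estimate, state, candidate_actions):
    """Pick the candidate action with the highest estimated reward."""
    return max(candidate_actions, key=lambda a: estimate(state, a))


# Toy action history: the reward is highest when the action matches the state.
history = [((s / 2,), (a / 2,), -((s / 2 - a / 2) ** 2))
           for s in range(-2, 3) for a in range(-2, 3)]
estimate = build_reward_estimator(history)
best = select_action(estimate, (0.5,), [(-1.0,), (0.0,), (0.5,), (1.0,)])
```

The returned `estimate` closure bundles the basis functions with the fitted weights, mirroring the statement that the reward estimator includes both the plurality of basis functions and the estimation function.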

    Patent Journey

    From filing to today

    Patent Filed: 2015
    Patent Granted: 2019 (4 years after filing)
    Active Today: 2026
    Expires: 2035

    PatentBrief Score

    Impact Score: 39/100 (Early stage)

    Citation count: 0/40 (No citations yet)
    Claim breadth: 9/20 (Moderate scope)
    Recency: 10/20 (Granted 5–10 years ago)
    Assignee scale: 20/20 (Major technology company)

    PatentBrief Impact Score — based on citation count, claim breadth, recency, and assignee scale. Not a legal assessment.
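The component figures shown for this patent add up to the headline score, which suggests the Impact Score is a plain sum of its four parts. The sketch below checks that arithmetic. The summing rule is an inference from the displayed numbers, not a documented PatentBrief formula, and the names `COMPONENTS` and `impact_score` are hypothetical.

```python
# Component scores as shown on this page: (points awarded, maximum points).
COMPONENTS = {
    "citation_count": (0, 40),
    "claim_breadth": (9, 20),
    "recency": (10, 20),
    "assignee_scale": (20, 20),
}


def impact_score(components):
    """Return (total, maximum), assuming the score is a plain sum of its parts."""
    total = sum(points for points, _ in components.values())
    maximum = sum(cap for _, cap in components.values())
    return total, maximum


total, maximum = impact_score(COMPONENTS)
print(f"{total}/{maximum}")  # prints "39/100"
```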

    The original legal language

    Original claims

    14 claims as filed with the patent office.

    Citations

    Patent lineage

    Cites 6 earlier patents as foundations.

    PatentBrief is not a law firm and this is not legal advice.