Convergence Analysis of Reinforcement Learning Agents


Provided by: sriniva4

Learn more at: http://web.mit.edu

1
Convergence Analysis of Reinforcement Learning
Agents

Srinivas Turaga
9.912
30th March, 2004

2
The Learning Algorithm
The Assumptions

Players use stochastic strategies.
Players only observe their reward.
Players attempt to estimate the value of choosing
a particular action.

The Algorithm

Play action i with probability Pr(i)
Observe reward r
Update value function v

3
The Learning Algorithm
The Algorithm
Value of action i

Play action i with probability Pr(i)
Proportional to value of action i
Observe reward r
Depends also on the other player's choice j
Update value function v
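Two simple schemes: Algorithm 1 and Algorithm 2
Each scheme has one update rule if action i is chosen and another if action i is not chosen
Scheme labels on the slide: forgetting, no forgetting
[Update equations not preserved in the transcript]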
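
The play, observe, update loop can be sketched in Python as follows. The slide's update equations are missing from this transcript, so both update rules here are assumptions chosen only to illustrate a "forgetting" and a "no forgetting" scheme: they are not the presenter's formulas. The names `choose_action`, `update_forgetting`, `update_no_forgetting`, `play_round` and the learning rate `rate` are hypothetical, and rewards are assumed non-negative so that values can serve as selection weights.

```python
import random


def choose_action(values, rng):
    """Sample an action index with probability proportional to its value."""
    total = sum(values)
    if total <= 0:
        # Degenerate case (all values zero): fall back to a uniform choice.
        return rng.randrange(len(values))
    return rng.choices(range(len(values)), weights=values, k=1)[0]


def update_forgetting(values, chosen, reward, rate):
    """ASSUMED 'forgetting' rule: every value decays by (1 - rate);
    the chosen action additionally absorbs rate * reward."""
    return [(1 - rate) * v + (rate * reward if i == chosen else 0.0)
            for i, v in enumerate(values)]


def update_no_forgetting(values, chosen, reward, rate):
    """ASSUMED 'no forgetting' rule: only the chosen action's value moves
    toward the observed reward; unchosen values are left untouched."""
    return [v + rate * (reward - v) if i == chosen else v
            for i, v in enumerate(values)]


def play_round(values, reward_fn, update, rate, rng):
    """One round: play action i, observe reward r, update the values."""
    i = choose_action(values, rng)
    r = reward_fn(i)  # in a game, r also depends on the other player's choice
    return update(values, i, r, rate), i, r
```

A round against a fixed environment would look like `play_round([1.0, 1.0], lambda i: 1.0 if i == 0 else 0.0, update_forgetting, 0.1, random.Random(0))`; in a two-player game each player would run its own copy of this loop, with `reward_fn` closing over the opponent's action.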
4
Analysis Techniques

Analysis of stochastic dynamics is hard!
So approximate
Consider average case (deterministic)
Consider continuous time (differential equation)

Stochastic algorithm: random, discrete time
Average case: deterministic, discrete time
Differential equation: deterministic, continuous time
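
As a rough illustration of the two approximation steps, the sketch below takes an assumed "forgetting" update (every value decays, the chosen action also absorbs its reward), replaces the random update by its expectation, and then treats the expected change as a differential equation. The update rule, the fixed mean rewards, and the names `mean_step`, `drift`, and `integrate` are all assumptions for illustration, not the equations from the slide.

```python
def action_probs(values):
    """Selection probabilities proportional to the current values."""
    total = sum(values)
    return [v / total for v in values]


def mean_step(values, mean_rewards, rate):
    """Deterministic, discrete time: the expected values after one update.
    Action i is chosen with probability p_i and then earns mean_rewards[i]."""
    p = action_probs(values)
    return [(1 - rate) * v + rate * p_i * r
            for v, p_i, r in zip(values, p, mean_rewards)]


def drift(values, mean_rewards):
    """Deterministic, continuous time: dv/dt with time in units of 1/rate."""
    p = action_probs(values)
    return [p_i * r - v for v, p_i, r in zip(values, p, mean_rewards)]


def integrate(values, mean_rewards, dt, steps):
    """Forward-Euler integration of the drift (a simple numerical scheme)."""
    v = list(values)
    for _ in range(steps):
        d = drift(v, mean_rewards)
        v = [v_i + dt * d_i for v_i, d_i in zip(v, d)]
    return v
```

Under these assumptions `mean_step` is exactly one Euler step of `drift` with step size `rate`, which is why the continuous-time equation is a natural small-learning-rate limit of the average-case map. Calling `integrate([0.5, 0.5], [1.0, 0.5], 0.01, 5000)` follows the flow for a single learner facing fixed mean rewards.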
5
Results - Matching Pennies Game

Analysis shows a stable fixed point corresponding
to matching behavior.
Simulations of stochastic algorithm and
deterministic dynamics converge as expected.

Analysis shows a fixed point corresponding to the
Nash equilibrium. Linear stability analysis shows
marginal stability.
Simulations of stochastic algorithm and
deterministic dynamics diverge to corners.
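
To show what a linear stability analysis yielding marginal stability looks like in practice, the sketch below linearizes a stand-in system at the mixed Nash equilibrium of matching pennies. It uses the standard two-population replicator dynamics, not the learning dynamics derived in the talk (those equations are not in this transcript). The payoff convention (player 1 gains 1 on a match and loses 1 otherwise, player 2 the reverse) and all function names are assumptions.

```python
import cmath


def replicator(x, y):
    """Replicator dynamics for matching pennies (stand-in system).
    x, y: probabilities that players 1 and 2 choose the first action."""
    dx = 2 * x * (1 - x) * (2 * y - 1)    # player 1 wants to match
    dy = -2 * y * (1 - y) * (2 * x - 1)   # player 2 wants to mismatch
    return dx, dy


def jacobian(f, x, y, h=1e-6):
    """2x2 Jacobian of f at (x, y) by central finite differences."""
    fxp, fxm = f(x + h, y), f(x - h, y)
    fyp, fym = f(x, y + h), f(x, y - h)
    return [[(fxp[0] - fxm[0]) / (2 * h), (fyp[0] - fym[0]) / (2 * h)],
            [(fxp[1] - fxm[1]) / (2 * h), (fyp[1] - fym[1]) / (2 * h)]]


def eigenvalues_2x2(m):
    """Eigenvalues of a 2x2 matrix from its trace and determinant."""
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    disc = cmath.sqrt(tr * tr - 4 * det)
    return (tr + disc) / 2, (tr - disc) / 2


# Linearize at the mixed Nash equilibrium, where both players randomize 50/50.
J = jacobian(replicator, 0.5, 0.5)
lam1, lam2 = eigenvalues_2x2(J)
```

For this stand-in system the eigenvalues at the equilibrium have zero real part, so the linearization predicts neither attraction nor repulsion; whether trajectories stay near the fixed point is then decided by terms the linear analysis ignores.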

6
Future Directions

Validate approximation technique.
Analyze properties of more general reinforcement
learners.
Consider situations with asymmetric learning
rates.
Study behavior of algorithms for arbitrary payoff
matrices.
