Vincent Thomas - PowerPoint PPT Presentation

About This Presentation

Title:

Vincent Thomas

Slides: 40

Provided by: mfiUniv

Transcript and Presenter's Notes

1
Introducing direct interactions into
Markov decision processes

Vincent Thomas
Christine Bourjot
Vincent Chevrier

2
Overview

Work in progress
Multi-agent systems
Reactive: stimulus-response rules
Memoryless
Automatic construction of behaviors
In a decentralized manner
To solve collective problems
In a cooperative setting

3
Outline

Markov models
MDP
Extensions
Our proposal
Interac-DEC-MDP
Formalism
Examples
Solving
Conclusion

4
MDP

MDP: Markov Decision Process ⟨S, A, T, R⟩
S: set of states
A: set of actions
T: transition matrix, the stochastic evolution of the system
T: S × A → P(S)
R: reward, the function to optimize
R: S × A → P(ℝ)
An MDP is a decision problem
Find a policy (reactive behavior) π: S → P(A)
That maximizes the sum of long-term rewards
Algorithms for constructing a policy
Planning (value iteration, ...)
Learning (Q-learning, ...)
They find an optimal policy

Single-agent
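The planning route mentioned on this slide can be illustrated with a minimal value-iteration sketch. The two-state MDP, the discount factor `gamma`, and the stopping tolerance are all assumptions made for illustration; the slides do not specify them. The sketch also returns a deterministic greedy policy, a special case of π: S → P(A).

```python
import numpy as np

def value_iteration(T, R, gamma=0.9, tol=1e-8):
    """T[s, a, s2] holds transition probabilities, R[s, a] expected rewards."""
    n_states = T.shape[0]
    V = np.zeros(n_states)
    while True:
        Q = R + gamma * (T @ V)            # Q[s, a], one Bellman backup
        V_new = Q.max(axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new, Q.argmax(axis=1)  # values and a greedy policy
        V = V_new

# Hypothetical MDP: state 1 is absorbing and pays 1 per step.
T = np.array([
    [[1.0, 0.0], [0.2, 0.8]],   # from state 0: action 0 stays, action 1 usually reaches state 1
    [[0.0, 1.0], [0.0, 1.0]],   # from state 1: both actions stay
])
R = np.array([[0.0, 0.0], [1.0, 1.0]])
V, policy = value_iteration(T, R)
```

In this toy case the greedy policy picks action 1 in state 0, since that is the only way to reach the rewarding state.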
5
Extensions of MDPs

DEC-MDP: Decentralized MDP
Formalism for decision problems
Represents reactive agents
Decentralized, simultaneous execution
Partial observability
Function from observations to actions: πi: Si → P(Ai)
Represents the problem as a process
Transition matrix
T: S × A1 × A2 × A3 × ... → P(S)
Reward function
R: S × A1 × A2 × A3 × ... → P(ℝ)
Agents' actions are seen as influences on the process
Objective: maximize the sum of rewards

Multi-agent
6
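Operation (Initial)
[Diagram: state S]
7
Operation (Observations)
[Diagram: state S]
8
Operation (Decision)
[Diagram: state S]
9
Operation (Action)
[Diagram: actions a1 and a2 applied to state S]
10
Operation (Evolution)
11
Operation (Rewards)
[Diagram: transition S → S' under the joint action (a1, a2), with rewards R]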
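The cycle walked through in slides 6 to 11 (observe, decide, act simultaneously, evolve, reward) can be sketched as a single step function. This is a minimal illustration, not the authors' implementation: the counter-based toy problem and every function name below are hypothetical.

```python
import random

def dec_mdp_step(state, policies, observe, transition, reward, rng):
    """One DEC-MDP cycle: observations, decisions, joint action, evolution, reward."""
    observations = [observe(i, state) for i in range(len(policies))]
    joint_action = tuple(pi(obs) for pi, obs in zip(policies, observations))
    next_state = transition(state, joint_action, rng)
    return next_state, joint_action, reward(state, joint_action)

# Hypothetical toy problem: the state is a counter and agents only see its parity.
def observe(i, state):
    return state % 2

def transition(state, joint_action, rng):
    return state + sum(joint_action)   # deterministic here; rng is kept for stochastic variants

def reward(state, joint_action):
    return 1.0 if sum(joint_action) == 1 else 0.0

policies = [lambda obs: 1, lambda obs: obs]   # one reactive policy per agent
rng = random.Random(0)
state, total = 0, 0.0
for _ in range(3):
    state, joint_action, r = dec_mdp_step(state, policies, observe, transition, reward, rng)
    total += r
```

Note that both the transition and the reward take the joint action as input, which is where the couplings between agents enter.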
12
Difficulties in DEC-MDPs

Difficulties
Implicit couplings
In the transitions T
The result of an action depends on the other agents
In the rewards R
The reward depends on the other agents
The evolution depends on the other agents' behaviors
Solving
Centralized → single-agent
Combinatorial explosion
Decentralized
Co-evolution problem
Tragedy of the commons
Credit assignment problem
Our proposal

Find a compromise
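The combinatorial explosion of the centralized approach can be made concrete by counting joint actions. The three-action set below is an assumption chosen for illustration.

```python
from itertools import product

def joint_actions(actions_per_agent):
    """Enumerate the joint action space A1 x A2 x ... of a centralized controller."""
    return list(product(*actions_per_agent))

actions = ["left", "right", "stay"]   # hypothetical individual action set
sizes = [len(joint_actions([actions] * n)) for n in (1, 2, 3, 4)]
```

With |A| individual actions and n agents the centralized problem has |A|^n joint actions, so `sizes` grows as 3, 9, 27, 81.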
13
Outline

Markov models
MDP
Extensions
Our proposal
Interac-DEC-MDP
Formalism
Examples
Solving
Conclusion

14
Proposal

Motivation
The need to reason at the collective level is limited
Exchange, resource sharing, ...
Reasoning individually is less costly
Managing allocated resources
New formal framework
Interac-DEC-MDP
Restrict the systems considered
Separate collective decisions from individual decisions
Less expressive
Restriction → factored systems

15
General framework
Selfish learning
Management of the collective

Agents can act individually
No influence from the others → independent transitions
Agents' actions are rewarded within their own space
No coupling in R → independent rewards
Each agent has partial perceptions
State, rewards, behaviors of the others

16
General framework
Selfish learning
Management of the collective

Agents can act individually
No influence from the others → independent transitions
Agents' actions are rewarded within their own space
No coupling in R → independent rewards
Each agent has partial perceptions
State, rewards, behaviors of the others
Interaction between agents
The only couplings
Semi-centralized among the agents involved

17
General framework
Selfish learning
Management of the collective

Agents can act individually
No influence from the others → independent transitions
Agents' actions are rewarded within their own space
No coupling in R → independent rewards
Each agent has partial perceptions
State, rewards, behaviors of the others
Interaction between agents
The only couplings
Semi-centralized among the agents involved
But not trivial
Individual behavior has to be reconsidered

19
Formalism: Agents

Each agent i is described by an MDP ⟨Si, Ai, Ti, Ri⟩
Si: individual state space
Ai: individual action space
Ti: individual transition
Ri: individual reward
Agents act simultaneously
Individual policy πi
Objective: maximize the sum of individual rewards
For now, without interaction

[Diagram: Agent 1, Agent 2, Agent 3]
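The factored setting described here, one independent MDP per agent with a summed objective, can be sketched as follows. The `AgentMDP` class, the deterministic transitions, and the line-world example are assumptions made for brevity, not part of the original formalism.

```python
from dataclasses import dataclass
from typing import Callable, Hashable

@dataclass
class AgentMDP:
    transition: Callable[[Hashable, Hashable], Hashable]  # Ti (deterministic in this sketch)
    reward: Callable[[Hashable, Hashable], float]         # Ri
    policy: Callable[[Hashable], Hashable]                # individual policy pi_i
    state: Hashable                                       # current element of Si

def step_all(agents):
    """All agents act simultaneously; returns the sum of individual rewards."""
    total = 0.0
    for agent in agents:
        action = agent.policy(agent.state)
        total += agent.reward(agent.state, action)
        agent.state = agent.transition(agent.state, action)
    return total

# Hypothetical line world: each agent walks right and is rewarded on reaching cell 3.
agents = [
    AgentMDP(
        transition=lambda s, a: min(s + a, 3),
        reward=lambda s, a: 1.0 if s + a == 3 else 0.0,
        policy=lambda s: 1,
        state=start,
    )
    for start in (0, 2)
]
collected = step_all(agents)
```

No agent's transition or reward reads another agent's state, which is exactly the independence the slide assumes before interactions are introduced.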
20
Direct interactions

Definition
One-off, reciprocal mutual influences
These are the only couplings in the system
Agent i can influence the state of agent j
The agents involved can reason
The policy depends on the agents involved
Negotiation process

[Diagram: Agent i and Agent j in three numbered steps: interaction, decision, result]
21
Representing interactions

Adding interaction instances
Ik: interaction k
I: set of interactions
An interaction has several possible results
Rik,l: result l
Rik: set of results of Ik
Each result has a transition matrix
TRik,l

[Diagram, team-sport illustration: interactions Ik, their results Rik,l, and the transition S → S' induced by each result]
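The structure above (a set of interactions, each with a set of results, each result carrying its own transition matrix) can be sketched as a small data model. The class names, the two-state joint space, and the example matrices are hypothetical.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class InteractionResult:
    name: str
    transition: np.ndarray   # TRik,l: row-stochastic matrix over the states of the agents involved

@dataclass
class Interaction:
    name: str
    results: list            # Rik: the possible results of this interaction

# Hypothetical joint states: index 0 = "i holds the item", index 1 = "j holds the item".
swap = InteractionResult("exchange", np.array([[0.0, 1.0], [1.0, 0.0]]))
refused = InteractionResult("refused", np.eye(2))
exchange_buckets = Interaction("Ik", [swap, refused])

def apply_result(result, state_index, rng):
    """Sample the next joint state from the row of the chosen result's matrix."""
    row = result.transition[state_index]
    return int(rng.choice(len(row), p=row))

rng = np.random.default_rng(0)
after_swap = apply_result(swap, 0, rng)
after_refusal = apply_result(refused, 0, rng)
```

Choosing a result is therefore equivalent to choosing which transition matrix drives the coupled agents for that step.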
22
Interaction policies

Individual
Triggering
Collective
Semi-centralization
Interaction resolution
For each pair

[Diagram: Agent i and Agent j, each with a decision step, linked by an interaction]
23
Formalism: Execution model

Action module
Decision
Execution
Interaction module
For every agent i
Triggering
Joint decision
Execution of the interaction

[Diagram: interaction Ik, results Rik,l, transitions S → S']
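The two modules listed above can be sketched as one time step: an action module run independently per agent, followed by an interaction module that triggers, decides jointly, and executes. The token-passing toy and all function names are assumptions for illustration only.

```python
def execution_step(agents, policies, trigger, resolve):
    """Run the action module, then the interaction module, for one time step."""
    # Action module: decision then execution, independently for each agent.
    for name, agent in agents.items():
        agent["state"] = policies[name](agent["state"])
    # Interaction module: for every agent i, triggering, joint decision, execution.
    outcomes = []
    for name in agents:
        partner = trigger(name, agents)            # triggering policy of agent i
        if partner is not None:
            outcomes.append(resolve(agents[name], agents[partner]))
    return outcomes

# Hypothetical toy: agent "i" hands its token to agent "j" as soon as it can.
agents = {"i": {"state": 0, "tokens": 1}, "j": {"state": 0, "tokens": 0}}
policies = {"i": lambda s: s + 1, "j": lambda s: s + 1}

def trigger(name, agents):
    return "j" if name == "i" and agents["i"]["tokens"] > 0 else None

def resolve(active, passive):
    active["tokens"] -= 1      # joint decision and execution collapsed into one call
    passive["tokens"] += 1
    return "given"

first = execution_step(agents, policies, trigger, resolve)
second = execution_step(agents, policies, trigger, resolve)
```

Here the interaction fires once and never again, because the triggering condition no longer holds after the token has moved.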
24
New problem

Agents can
Act
Interact
Objective: determine
An action policy
A triggering policy
A resolution policy
In a decentralized manner
To maximize a reward that the agents
perceive only partially

25
Outline

Markov models
MDP
Extensions
Our proposal
Interac-DEC-MDP
Formalism
Examples
Solving
Conclusion

26
Examples

Food sharing
Resource sharing
Firefighters
Each agent
Position
Holds a bucket, full or empty
Individual actions
Agents do not hinder one another
Independent T
An agent receives a reward
When it pours water on the fire
Independent R
Possibility of exchanging buckets
Interaction
Two results: exchange carried out / refused
Benefit of the interaction
Water moves faster through exchanges

[Diagram: fire, agents, water]
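The firefighter example can be sketched with one individual action function and one interaction with two results. The line-shaped world, the positions of water and fire, and the rule that agents must share a cell to exchange are assumptions not stated on the slide.

```python
from dataclasses import dataclass

WATER, FIRE = 0, 4   # hypothetical positions on a line

@dataclass
class Firefighter:
    position: int
    bucket_full: bool = False

def act(agent, move):
    """Individual action: move by -1, 0 or +1; fill at the water, pour at the fire."""
    agent.position = max(WATER, min(FIRE, agent.position + move))
    if agent.position == WATER:
        agent.bucket_full = True
    if agent.position == FIRE and agent.bucket_full:
        agent.bucket_full = False
        return 1.0            # individual reward for pouring water on the fire
    return 0.0

def exchange(a, b, accepted):
    """Interaction with two results: buckets swapped, or refused (nothing changes)."""
    if accepted and a.position == b.position:
        a.bucket_full, b.bucket_full = b.bucket_full, a.bucket_full
        return "exchanged"
    return "refused"

a = Firefighter(position=2)                     # works on the fire side
b = Firefighter(position=2, bucket_full=True)   # has just come back from the water
result = exchange(a, b, accepted=True)
rewards = [act(a, +1), act(a, +1)]
```

After the exchange, `a` reaches the fire in two moves instead of making the full round trip to the water, which is the speed-up the slide attributes to exchanges.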
27
Simple example

Two agents
Limited positions
Exchanges possible
Consequences
Agent A sees the fire and the rewards but not the water
Agent B sees the water but neither the fire nor the
rewards

[Diagram: agents A and B]
28
Outline

Markov models
MDP
Extensions
Our proposal
Interac-DEC-MDP
Formalism
Examples
Solving
Conclusion

29
Solving

In progress
Two objectives
Individual learning → collective
Collective learning → individual
Decentralized representation of policies
Individual learning → collective
Uses the results of individual learning
Maximize the sum of expected rewards
Decentralized representation of interaction
resolutions

30
Using the Qinterac

Each agent has
Description
S: state of the system
RIk,l: interaction result
A, P: active or passive agent
Interaction

[Diagram: agent a (active, A) and agent b (passive, P) in interaction Ik, with results Rik,l and transitions S → S'; caption: introducing the collective]
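One possible reading of the Qinterac is a per-agent table indexed by system state, interaction result, and role (active or passive), with the interaction resolved by maximizing the summed estimates. The sketch below follows that reading; the table layout, the selection rule, and the numeric values are assumptions rather than the authors' definition.

```python
from collections import defaultdict

ACTIVE, PASSIVE = "A", "P"

def make_q_interac():
    """Q[(state, result, role)] -> estimated value; unseen entries default to 0.0."""
    return defaultdict(float)

def resolve_interaction(q_active, q_passive, state, results):
    """Pick the result whose summed estimate over both agents is highest."""
    return max(
        results,
        key=lambda r: q_active[(state, r, ACTIVE)] + q_passive[(state, r, PASSIVE)],
    )

q_a, q_b = make_q_interac(), make_q_interac()
q_a[("s0", "exchange", ACTIVE)] = 0.8     # hypothetical learned values
q_b[("s0", "exchange", PASSIVE)] = -0.1
q_b[("s0", "refused", PASSIVE)] = 0.2
best = resolve_interaction(q_a, q_b, "s0", ["exchange", "refused"])
```

Each agent keeps only its own table, so the resolution stays decentralized in storage even though the choice combines both agents' estimates.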
31
Naive approach

3 dependent learning processes
Learning individual actions
Individual Q-learning

[Diagram: agents A and B]
32
Naive approach

3 dependent learning processes
Learning individual actions
Learning interactions

33
Naive approach

3 dependent learning processes
Learning individual actions
Learning interactions
Learning triggers
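The first of the three learning processes, individual Q-learning, uses the standard tabular update. A minimal sketch follows; the learning rate `alpha`, the discount `gamma`, and the two-action example are assumptions.

```python
from collections import defaultdict

def q_update(Q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Tabular Q-learning: move Q(s, a) toward r + gamma * max_a' Q(s', a')."""
    best_next = max(Q[(next_state, a)] for a in actions)
    Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
    return Q[(state, action)]

Q = defaultdict(float)            # unseen (state, action) pairs start at 0.0
actions = ["left", "right"]       # hypothetical individual action set
v1 = q_update(Q, 0, "right", 1.0, 1, actions)
v2 = q_update(Q, 0, "right", 1.0, 1, actions)
```

The same update shape could in principle be reused for the interaction and triggering tables, with the key extended accordingly.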

34
Problem to solve

The individual behavior still has to be
updated
B has learned nothing
Solution: reward transfer
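The idea of reward transfer can be sketched as moving part of one agent's reward to the agent that made it possible, so that the second agent also receives a learning signal. The slides do not give the transfer rule; the function below and the amount 0.5 are purely illustrative assumptions.

```python
def transfer_reward(rewards, giver, receiver, amount):
    """Move `amount` of the giver's reward to the receiver; the total is unchanged."""
    updated = dict(rewards)
    updated[giver] -= amount
    updated[receiver] += amount
    return updated

# A poured water on the fire; B only supplied the bucket and saw no reward.
rewards = {"A": 1.0, "B": 0.0}
shared = transfer_reward(rewards, "A", "B", 0.5)
```

Because the sum of rewards is preserved, the collective objective is unaffected while B's individual update is no longer always zero.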

35
Trials

Forcing the other agent's Q-value
Gives results
For now done by hand
Simultaneous learning
Often converges
This step remains to be analyzed in more detail.
References to weakly coupled MDPs

36
Outline

Markov models
MDP
Extensions
Our proposal
Interac-DEC-MDP
Formalism
Examples
Solving
Conclusion

37
Conclusion

A new model: Interac-DEC-MDP
Actions
Interactions
Collective problem perceived partially
Separating collective / individual decisions
Actions
Local consequences
Interactions
More global consequences
Decisions made jointly
Defines a new entity
Set of agents
Reward transfer

38
Future work

A very simple example
2 agents
Global perception
But algorithmically non-trivial
First step
Solve with two agents
Later
Scale up (more agents)
Partial perceptions
DEC-MDP (additional couplings)

Learning in real systems
39
Example
R1   R2   R3
5    5    10   Does not matter
8    1    10   Key and chest
8    3    10   Individual
