ETVC/LIX Colloquium, Paris: Information Geometry and its Applications -- alpha divergence. Shun-ichi Amari, RIKEN Brain Science Institute

Transcript and Presenter's Notes

1
ETVC/LIX Colloquium, Paris
Information Geometry and its Applications -- alpha divergence
Shun-ichi Amari
RIKEN Brain Science Institute
2
Information Geometry
Systems Theory
Information Theory
Statistics
Neural Networks
Combinatorics
Physics
Information Sciences
Math.
AI
Riemannian Manifold Dual Affine Connections
Manifold of Probability Distributions
3
Information Geometry ?
Riemannian metric Dual affine connections
4
Invariance
1. Invariant under reparameterization
2. Invariant under different representations
5
Riemannian Structure
6
Affine Connection
  • covariant derivative
  • parallel transport

straight line
7
Duality: two affine connections
[Figure labels: X, Y]
Riemannian geometry
8
Dual Affine Connections
e-geodesic
m-geodesic
9
Alpha affine connection: ±alpha duality
10
Dually flat manifold
11
Information Geometry -- Dually Flat Manifold
Convex Analysis Legendre transformation Divergence
Pythagorean theorem I-projection
12
Dually Flat Manifold
1. Potential functions --- convex (Bregman, Legendre transformation)
2. Divergence
3. Pythagorean theorem
4. Projection theorem
5. Dual foliation

13
Projection Theorem
14
Applications to Statistics
curved exponential family
estimation
testing
15
Other Applications
  • Systems theory
  • Information theory
  • Optimization
  • Belief propagation
  • Vision and signal processing
  • Machine Learning --- Boosting
  • Neuromanifold
  • Higher-order correlations
  • Mathematics --- Orlicz space (Pistone, Grasselli)
  • Physics ---

Amari and Nagaoka, Methods of Information Geometry,
AMS and Oxford University Press, 2000
16
Linear Programming (cone programming)
17
Information Geometry of Neuromanifolds
Multilayer Perceptron
  • Shun-ichi Amari
  • RIKEN Brain Science Institute

Collaborators: H. Park, T. Ozeki, K. Fukumizu
18
Multilayer Perceptron
neuromanifold
space of functions
19
singularities
20
Milnor attractor
21
Information Geometry of Belief Propagation
  • Shun-ichi Amari (RIKEN BSI)
  • Shiro Ikeda (Inst. Statist. Math.)
  • Toshiyuki Tanaka (Tokyo Metropolitan U.)

22
Information Geometry on Positive Arrays
invariance
Riemannian metric
divergence
dual affine connections
23
dually flat space
convex functions
divergence
invariance
Bregman divergence
f-divergence
24
space of positive measures: vectors, matrices, arrays
f-divergence
Bregman divergence
α-divergence
25
Csiszár f-divergence
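The slide text did not survive extraction, so the following is only a minimal sketch of an f-divergence, assuming the convention D_f[p:q] = sum_i p_i f(q_i / p_i) with f convex and f(1) = 0, and assuming strictly positive arrays. The function names are illustrative and do not come from the slides.

```python
import numpy as np

def f_divergence(p, q, f):
    """Csiszar f-divergence under the assumed convention sum_i p_i * f(q_i / p_i)."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.sum(p * f(q / p)))

def kl_generator(u):
    """Generator f(u) = -log(u); with it, f_divergence returns KL[p:q]."""
    return -np.log(u)

# Illustrative probability vectors (hypothetical data, not from the talk).
p_example = np.array([0.5, 0.3, 0.2])
q_example = np.array([0.4, 0.4, 0.2])
kl_pq = f_divergence(p_example, q_example, kl_generator)
```

Other convex generators with f(1) = 0 can be passed in the same way; the choice of generator is the only thing that distinguishes one member of the family from another.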
26
(No Transcript)
27
divergence
28
divergence
KL-divergence
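As a hedged illustration of how the alpha-divergence of the title relates to the KL-divergence, the sketch below uses one common form of the alpha-divergence on positive measures and treats alpha = -1 and alpha = +1 as the two KL limits. The formula's sign convention and the function name are assumptions, since the slide's own formula is not in the transcript.

```python
import numpy as np

def alpha_divergence(p, q, alpha):
    """Alpha-divergence between strictly positive arrays (assumed convention).

    alpha = -1 gives the generalised KL[p:q], alpha = +1 gives KL[q:p],
    and alpha = 0 gives twice the squared Hellinger distance.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    if abs(alpha + 1.0) < 1e-12:
        return float(np.sum(q - p + p * np.log(p / q)))
    if abs(alpha - 1.0) < 1e-12:
        return float(np.sum(p - q + q * np.log(q / p)))
    a = (1.0 - alpha) / 2.0
    b = (1.0 + alpha) / 2.0
    return float(4.0 / (1.0 - alpha ** 2) * np.sum(a * p + b * q - p ** a * q ** b))

# Illustrative positive measures (hypothetical data); they need not sum to one.
p_measure = np.array([0.5, 0.3, 0.2])
q_measure = np.array([0.2, 0.5, 0.6])
```

Under this convention, swapping the arguments and flipping the sign of alpha leaves the value unchanged, which is the divergence-level counterpart of the dual affine connections mentioned earlier.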
29
Invariance --- characterization of f-divergence
30
Invariance ⇒ f-divergence
31
Invariance
32
Bregman divergence
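A Bregman divergence is built from a convex potential; the sketch below uses the standard definition phi(x) - phi(y) - <grad phi(y), x - y>. The helper names and the choice of negative entropy as the example potential are assumptions made for illustration.

```python
import numpy as np

def bregman_divergence(phi, grad_phi, x, y):
    """Bregman divergence phi(x) - phi(y) - <grad phi(y), x - y> for a convex phi."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(phi(x) - phi(y) - np.dot(grad_phi(y), x - y))

def neg_entropy(x):
    """Convex potential sum_i (x_i log x_i - x_i) on strictly positive arrays."""
    return float(np.sum(x * np.log(x) - x))

def neg_entropy_grad(x):
    """Gradient of neg_entropy, which is log(x) elementwise."""
    return np.log(x)

# Illustrative positive arrays (hypothetical data).
x_example = np.array([0.5, 0.3, 0.2])
y_example = np.array([0.2, 0.5, 0.6])
d_xy = bregman_divergence(neg_entropy, neg_entropy_grad, x_example, y_example)
```

With this potential the result is the generalised KL-divergence sum_i (x_i log(x_i / y_i) - x_i + y_i); with phi(x) = 0.5 * ||x||^2 the same function returns half the squared Euclidean distance.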
33
Legendre duality
34
Example
exponential family
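To make Legendre duality concrete for an exponential family, here is a small sketch using the Bernoulli family as an assumed example (the slide's own example is not in the transcript). It takes psi as the log-partition function, phi as its Legendre dual (negative entropy), and checks numerically that psi(theta) + phi(eta) - theta * eta = 0 and that KL[p:q] = psi(theta_q) + phi(eta_p) - theta_q * eta_p.

```python
import numpy as np

def psi(theta):
    """Log-partition function of the Bernoulli family in the natural parameter."""
    return float(np.log1p(np.exp(theta)))

def eta_of_theta(theta):
    """Expectation parameter eta = d psi / d theta (the success probability)."""
    return float(1.0 / (1.0 + np.exp(-theta)))

def phi(eta):
    """Legendre dual of psi: the negative entropy of Bernoulli(eta)."""
    return float(eta * np.log(eta) + (1.0 - eta) * np.log(1.0 - eta))

def kl_bernoulli(eta_p, eta_q):
    """KL[p:q] between Bernoulli(eta_p) and Bernoulli(eta_q), computed directly."""
    return float(eta_p * np.log(eta_p / eta_q)
                 + (1.0 - eta_p) * np.log((1.0 - eta_p) / (1.0 - eta_q)))

def divergence_from_potentials(theta_p, theta_q):
    """The same KL[p:q], written with the dual potentials psi and phi."""
    eta_p = eta_of_theta(theta_p)
    return psi(theta_q) + phi(eta_p) - theta_q * eta_p

# Hypothetical natural parameters chosen only for illustration.
theta_p, theta_q = 0.7, -0.4
legendre_gap = psi(theta_p) + phi(eta_of_theta(theta_p)) - theta_p * eta_of_theta(theta_p)
```

The two coordinate systems theta and eta are the dual affine coordinates of the dually flat structure; the divergence mixes one potential from each side.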
35
divergence
36
Geometry
Straightness (affine connection)
37
Pythagorean Theorem
38
Projection Theorem
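One familiar instance of the Pythagorean and projection theorems can be checked numerically. The sketch below assumes the KL-divergence and the family of independent (product) distributions on a 2x2 table: Q, the product of the marginals of P, is the projection of P onto that family, and for any other product distribution R the relation KL[P:R] = KL[P:Q] + KL[Q:R] holds. The tables are hypothetical numbers chosen for illustration.

```python
import numpy as np

def kl(p, q):
    """KL[p:q] for strictly positive probability arrays of the same shape."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.sum(p * np.log(p / q)))

# A correlated joint distribution P(x, y) on a 2x2 table (hypothetical).
P = np.array([[0.30, 0.10],
              [0.15, 0.45]])

# Q: product of the marginals of P, the closest independent distribution to P.
Q = np.outer(P.sum(axis=1), P.sum(axis=0))

# R: an arbitrary other independent distribution.
R = np.outer([0.6, 0.4], [0.2, 0.8])

lhs = kl(P, R)
rhs = kl(P, Q) + kl(Q, R)
```

Because kl(Q, R) is non-negative, the same identity shows that Q minimises kl(P, R) over all product distributions R, which is the projection statement.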
39
U-divergence
40
β-divergence
41
Geometry
42
α-representation
typical case
43
Divergence over α-representation
44
(No Transcript)
45
Manifold of positive-definite matrices
S P potential function
Adequate representation of P
46
Integration of Stochastic Evidences -- Information Geometry Approach
  • Shun-ichi Amari
  • RIKEN Brain Science Institute

47
Various Means


geometric
arithmetic
harmonic

Any other mean?
48
Generalized mean: f-mean, f(u) monotone
f-representation of u
scale free
α-representation
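The f-mean idea can be sketched as follows: average the f-representations and map back with the inverse of f. The code assumes the alpha-representation f(u) = u^((1 - alpha) / 2) for alpha != 1 and f(u) = log u for alpha = 1, under which alpha = -1, 1 and 3 give the arithmetic, geometric and harmonic means. Function and parameter names are illustrative, not taken from the slides.

```python
import numpy as np

def alpha_representation(u, alpha):
    """Assumed alpha-representation: u**((1 - alpha) / 2), or log(u) when alpha == 1."""
    u = np.asarray(u, dtype=float)
    if alpha == 1:
        return np.log(u)
    return u ** ((1.0 - alpha) / 2.0)

def alpha_mean(values, alpha, weights=None):
    """Weighted f-mean of positive numbers, with f the alpha-representation."""
    values = np.asarray(values, dtype=float)
    if weights is None:
        weights = np.full(values.shape, 1.0 / values.size)
    averaged = float(np.sum(weights * alpha_representation(values, alpha)))
    if alpha == 1:
        return float(np.exp(averaged))
    return float(averaged ** (2.0 / (1.0 - alpha)))

arithmetic = alpha_mean([1.0, 4.0], alpha=-1)  # approximately 2.5
geometric = alpha_mean([1.0, 4.0], alpha=1)    # approximately 2.0
harmonic = alpha_mean([1.0, 4.0], alpha=3)     # approximately 1.6
```

Each of these means is scale free in the sense that multiplying every input by a constant c multiplies the mean by c, which is the property the slide singles out.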
49
α-mean
50
α-mean
51
divergence positive real numbers
52
Family of Distributions
mixture family
exponential family
53
Bayes Predictive Distribution
54
Optimal Property
given data
find
55
Theorem
The predictive distribution minimizes
the risk.
The Bayes predictive distribution
The product of experts
predictive distribution Hellinger
pessimistic optimistic
56
Conclusions
Alpha-divergence, alpha connection
f- and Bregman-divergence -- invariance and dually flat
invariant averaging -- alpha family
Renyi, Chernoff and Tsallis entropy