Title: ETVC/LIX Colloquium--Paris Information Geometry and its Applications -- alpha divergence Shun-ichi Amari RIKEN Brain Science Institute
1 ETVC/LIX Colloquium -- Paris
Information Geometry and its Applications -- alpha divergence
Shun-ichi Amari, RIKEN Brain Science Institute
2 Information Geometry
Systems Theory
Information Theory
Statistics
Neural Networks
Combinatorics
Physics
Information Sciences
Math.
AI
Riemannian Manifold Dual Affine Connections
Manifold of Probability Distributions
3 Information Geometry ?
Riemannian metric
Dual affine connections
4 Invariance
1. Invariant under reparameterization
2. Invariant under different representations
5 Riemannian Structure
6 Affine Connection
- covariant derivative, parallel transport
- straight line
7 Duality: two affine connections
[Figure: vectors X and Y]
Riemannian geometry
8 Dual Affine Connections
e-geodesic
m-geodesic
9 Alpha affine connection: α-duality
10 Dually flat manifold
11 Information Geometry -- Dually Flat Manifold
Convex Analysis Legendre transformation Divergence
Pythagorean theorem I-projection
12 Dually Flat Manifold
1. Potential Functions
---convex (Bregman, Legendre transformation)
2. Divergence
3. Pythagoras Theorem
4. Projection Theorem
5. Dual foliation
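For reference, the first three items above are usually written as follows in the standard dually flat setting. The slide's own formulas were not transcribed, so the symbols θ, η (dual affine coordinates), ψ and φ (dual convex potentials) are notation assumed here rather than taken from the slide.

```latex
\begin{align*}
\eta &= \nabla\psi(\theta), \qquad \theta = \nabla\varphi(\eta), \qquad
\psi(\theta) + \varphi(\eta) - \theta\cdot\eta = 0, \\
D[P:Q] &= \psi(\theta_P) + \varphi(\eta_Q) - \theta_P\cdot\eta_Q \;\ge\; 0, \\
D[P:Q] + D[Q:R] - D[P:R] &= (\theta_Q - \theta_P)\cdot(\eta_Q - \eta_R).
\end{align*}
```

The last identity follows by expanding the three divergences and using the Legendre relation at Q. Its right-hand side vanishes when the θ-straight line through P and Q is orthogonal to the η-straight line through Q and R, which gives the Pythagorean relation D[P:R] = D[P:Q] + D[Q:R].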
13 Projection Theorem
14 Applications to Statistics
curved exponential family
estimation
testing
15 Other Applications
- Systems theory
- Information theory
- Optimization
- Belief propagation
- Vision and signal processing
- Machine learning, boosting
- Neuromanifold
- Higher-order correlations
- Mathematics --- Orlicz space (Pistone, Grasselli)
- Physics ---
Amari-Nagaoka, Methods of Information Geometry,
AMS and Oxford University Press, 2000
16 Linear Programming (cone programming)
17 Information Geometry of Neuromanifolds
Multilayer Perceptron
- Shun-ichi Amari
- RIKEN Brain Science Institute
Collaborators: H. Park, T. Ozeki, K. Fukumizu
18 Multilayer Perceptron
neuromanifold
space of functions
19 Singularities
20 Milnor attractor
21 Information Geometry of Belief Propagation
- Shun-ichi Amari (RIKEN BSI)
- Shiro Ikeda (Inst. Statist. Math.)
- Toshiyuki Tanaka (Tokyo Metropolitan U.)
22 Information Geometry on Positive Arrays
invariance
Riemannian metric
divergence
dual affine connections
23 Dually flat space
convex functions
divergence
invariance
Bregman divergence
f-divergence
24 Space of positive measures: vectors, matrices, arrays
f-divergence
Bregman divergence
α-divergence
25 Csiszár f-divergence
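The formula on this slide did not survive transcription. As a minimal sketch, assuming the convention D_f[p:q] = Σ p_i f(q_i / p_i) with f convex and f(1) = 0 (other texts swap the roles of p and q), an f-divergence on positive arrays can be computed as below. The function names and the sample arrays are illustrative only.

```python
import math

def f_divergence(p, q, f):
    """D_f[p:q] = sum_i p_i * f(q_i / p_i) for positive arrays p and q."""
    return sum(pi * f(qi / pi) for pi, qi in zip(p, q))

def kl_generator(u):
    # f(u) = -log(u) is convex with f(1) = 0; under the convention
    # above it generates the KL divergence sum_i p_i log(p_i / q_i).
    return -math.log(u)

def kl(p, q):
    # Direct KL divergence, used only to cross-check the generator.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

p = [0.2, 0.5, 0.3]
q = [0.4, 0.4, 0.2]
d_f = f_divergence(p, q, kl_generator)
d_kl = kl(p, q)
```

With this generator, `d_f` and `d_kl` agree up to rounding, and the divergence of an array from itself is zero.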
27 α-divergence
28 α-divergence
KL-divergence
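The relation between the α-divergence and the KL-divergence can be checked numerically. The sketch below assumes the commonly used form of the α-divergence on positive arrays, D_α[p:q] = 4/(1 − α²) Σ [ (1 − α)/2 p_i + (1 + α)/2 q_i − p_i^((1−α)/2) q_i^((1+α)/2) ], with the KL-type limits at α = ±1. The slide's own formula was not transcribed, so this parameterization is an assumption.

```python
import math

def alpha_divergence(p, q, alpha):
    """Alpha-divergence between positive arrays (need not be normalized)."""
    if alpha == -1:
        # Limit alpha -> -1: generalized KL divergence from p to q.
        return sum(qi - pi + pi * math.log(pi / qi) for pi, qi in zip(p, q))
    if alpha == 1:
        # Limit alpha -> +1: generalized KL divergence from q to p.
        return sum(pi - qi + qi * math.log(qi / pi) for pi, qi in zip(p, q))
    a = (1 - alpha) / 2
    b = (1 + alpha) / 2
    total = sum(a * pi + b * qi - pi**a * qi**b for pi, qi in zip(p, q))
    return 4.0 / (1 - alpha**2) * total

p = [0.2, 0.5, 0.3]
q = [0.4, 0.4, 0.2]
kl_pq = alpha_divergence(p, q, -1)
near_kl = alpha_divergence(p, q, -1 + 1e-6)   # approaches kl_pq
hellinger_like = alpha_divergence(p, q, 0)    # 2 * sum (sqrt p - sqrt q)^2
```

Under this form the divergence is dual in α: swapping the arguments is the same as flipping the sign of α.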
29 Invariance --- characterization of f-divergence
30 Invariance => f-divergence
31 Invariance
32 Bregman divergence
33 Legendre duality
34 Example
exponential family
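As a concrete instance of the Bregman divergence and Legendre duality for an exponential family, the sketch below uses the Bernoulli family. The choice of family and all function names are assumptions made for illustration; the slide's own example was not transcribed. Here ψ is the log-partition function, η = ψ'(θ) is the expectation parameter, and φ is the Legendre dual (negative entropy).

```python
import math

def psi(theta):
    # Log-partition (cumulant) function of the Bernoulli family.
    return math.log1p(math.exp(theta))

def eta_of(theta):
    # Expectation parameter eta = psi'(theta).
    return 1.0 / (1.0 + math.exp(-theta))

def phi(eta):
    # Legendre dual of psi: the negative entropy.
    return eta * math.log(eta) + (1 - eta) * math.log(1 - eta)

def bregman(theta_p, theta_q):
    # D[P:Q] = psi(theta_P) + phi(eta_Q) - theta_P * eta_Q
    eta_q = eta_of(theta_q)
    return psi(theta_p) + phi(eta_q) - theta_p * eta_q

def kl_bernoulli(a, b):
    # KL divergence from Bernoulli(a) to Bernoulli(b).
    return a * math.log(a / b) + (1 - a) * math.log((1 - a) / (1 - b))

theta_p, theta_q = 0.3, -1.2
d = bregman(theta_p, theta_q)
kl_qp = kl_bernoulli(eta_of(theta_q), eta_of(theta_p))
legendre_gap = psi(theta_p) + phi(eta_of(theta_p)) - theta_p * eta_of(theta_p)
```

In this sketch the Bregman divergence D[P:Q] coincides with the KL divergence from the distribution at Q to the distribution at P, and `legendre_gap` is zero up to rounding.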
35 Divergence
36 Geometry
Straightness (affine connection)
37 Pythagorean Theorem
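The Pythagorean relation can be verified numerically on a small dually flat example. The sketch below assumes the potential ψ(θ) = Σ exp(θ_i) on positive measures, with dual coordinates η_i = exp(θ_i); this choice, the function names, and the sample points are illustrative and not taken from the slide. The triangle is built so that the θ-straight segment from Q to P is orthogonal to the η-straight segment from Q to R.

```python
import math

def psi(theta):
    # Convex potential on the theta coordinates.
    return sum(math.exp(t) for t in theta)

def to_eta(theta):
    # Dual coordinates eta = grad psi(theta).
    return [math.exp(t) for t in theta]

def to_theta(eta):
    return [math.log(e) for e in eta]

def divergence(theta_p, theta_q):
    # Bregman divergence: psi(P) - psi(Q) - eta_Q . (theta_P - theta_Q)
    eta_q = to_eta(theta_q)
    lin = sum(e * (tp - tq) for e, tp, tq in zip(eta_q, theta_p, theta_q))
    return psi(theta_p) - psi(theta_q) - lin

theta_q = [0.0, 0.5]
a = [1.0, 2.0]     # theta-direction from Q to P
b = [0.4, -0.2]    # eta-direction from Q to R, chosen so that a . b = 0
theta_p = [tq + ai for tq, ai in zip(theta_q, a)]
eta_r = [eq + bi for eq, bi in zip(to_eta(theta_q), b)]
theta_r = to_theta(eta_r)

lhs = divergence(theta_p, theta_r)
rhs = divergence(theta_p, theta_q) + divergence(theta_q, theta_r)
```

Here `lhs` and `rhs` agree up to rounding; changing `b` so that it is no longer orthogonal to `a` breaks the equality.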
38 Projection Theorem
39 U-divergence
40 β-divergence
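Several parameterizations of the β-divergence are in use, and the slide's formula was not transcribed. The sketch below assumes one common form, D_β[p:q] = Σ [ p_i (p_i^β − q_i^β)/β − (p_i^(β+1) − q_i^(β+1))/(β+1) ], which reduces to the generalized KL divergence as β → 0 and to half the squared Euclidean distance at β = 1. It assumes β ≠ −1.

```python
import math

def beta_divergence(p, q, beta):
    """Beta-divergence between positive arrays; assumes beta != -1."""
    if beta == 0:
        # Limit beta -> 0: generalized KL divergence.
        return sum(pi * math.log(pi / qi) - pi + qi for pi, qi in zip(p, q))
    return sum(
        pi * (pi**beta - qi**beta) / beta
        - (pi**(beta + 1) - qi**(beta + 1)) / (beta + 1)
        for pi, qi in zip(p, q)
    )

p = [0.2, 0.5, 0.3]
q = [0.4, 0.4, 0.2]
euclid = beta_divergence(p, q, 1)       # 0.5 * sum (p - q)^2
kl_limit = beta_divergence(p, q, 0)
near_zero = beta_divergence(p, q, 1e-7) # approaches kl_limit
```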
41 Geometry
42 α-representation
typical case
43 Divergence over α-representation
45 Manifold of positive-definite matrices
S = {P}: potential function
Adequate representation of P
46 Integration of Stochastic Evidences -- Information Geometry Approach
- Shun-ichi Amari
- RIKEN Brain Science Institute
47 Various Means
geometric
arithmetic
harmonic
Any other mean?
48 Generalized mean: f-mean, f(u) monotone
f-representation of u
scale free
α-representation
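The f-mean of positive numbers is obtained by averaging their f-representations and mapping back with the inverse of f. A minimal sketch, assuming the α-representation f_α(u) = u^((1−α)/2) for α ≠ 1 and f_α(u) = log u for α = 1 (constant scale factors are dropped because the mean does not depend on them); the function names are illustrative.

```python
import math

def alpha_representation(u, alpha):
    # f_alpha(u) = u^((1 - alpha)/2) for alpha != 1, log(u) for alpha = 1.
    if alpha == 1:
        return math.log(u)
    return u ** ((1 - alpha) / 2)

def alpha_inverse(v, alpha):
    # Inverse map of alpha_representation.
    if alpha == 1:
        return math.exp(v)
    return v ** (2 / (1 - alpha))

def alpha_mean(values, alpha, weights=None):
    # Weighted average in the alpha-representation, mapped back to u.
    n = len(values)
    if weights is None:
        weights = [1.0 / n] * n
    avg = sum(w * alpha_representation(u, alpha) for w, u in zip(weights, values))
    return alpha_inverse(avg, alpha)

arithmetic = alpha_mean([1.0, 4.0], -1)
geometric = alpha_mean([1.0, 4.0], 1)
harmonic = alpha_mean([1.0, 4.0], 3)
```

Under this assumed parameterization, α = −1, 1 and 3 reproduce the arithmetic, geometric and harmonic means of 1 and 4 (2.5, 2 and 1.6, up to rounding).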
49 α-mean
50 α-mean
51 α-divergence: positive real numbers
52 Family of Distributions
mixture family
exponential family
53 Bayes Predictive Distribution
54 Optimal Property
given data
find
55 Theorem
The predictive distribution minimizes
the risk.
The Bayes predictive distribution
The product of experts
predictive distribution Hellinger
pessimistic optimistic
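The cases listed on this slide can be illustrated with a weighted α-mean of several distributions, renormalized to sum to one. The sketch below assumes the usual correspondence for α-integration: α = −1 gives the ordinary mixture (the Bayes predictive form), α = 1 gives the normalized geometric mean (product of experts), and α = 0 gives the Hellinger-type case. The function name, the discrete distributions, and the weights are all illustrative.

```python
import math

def alpha_integration(dists, weights, alpha):
    """Normalized weighted alpha-mean of discrete distributions."""
    def f(u):
        return math.log(u) if alpha == 1 else u ** ((1 - alpha) / 2)

    def f_inv(v):
        return math.exp(v) if alpha == 1 else v ** (2 / (1 - alpha))

    size = len(dists[0])
    # Pointwise alpha-mean of the component probabilities.
    raw = [
        f_inv(sum(w * f(d[x]) for w, d in zip(weights, dists)))
        for x in range(size)
    ]
    total = sum(raw)
    return [r / total for r in raw]

p1 = [0.7, 0.2, 0.1]
p2 = [0.1, 0.3, 0.6]
w = [0.5, 0.5]
mixture = alpha_integration([p1, p2], w, -1)
product_of_experts = alpha_integration([p1, p2], w, 1)
hellinger_case = alpha_integration([p1, p2], w, 0)
```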
56 Conclusions
- Alpha-divergence, alpha connection
- f- and Bregman-divergence -- invariance and dually flat
- Invariant averaging -- alpha family
- Rényi, Chernoff and Tsallis entropy