Machine learning techniques for quantifying neural synchrony: application to the diagnosis of Alzheimer's disease from EEG

1
Machine learning techniques for quantifying
neural synchrony: application to the diagnosis
of Alzheimer's disease from EEG
  • Justin Dauwels
  • LIDS, MIT
  • Amari Research Unit, Brain Science Institute,
    RIKEN
  • June 9, 2008

2
RIKEN Brain Science Institute
  • RIKEN Wako Campus (near Tokyo)
  • about 400 researchers and staff (20 foreign)
  • 300 research fellows and visiting scientists
  • about 60 laboratories
  • research covers most aspects of brain science

Collaborators: François Vialatte, Theo Weber,
Shun-ichi Amari, Andrzej Cichocki (RIKEN, MIT)
Project: Early diagnosis of Alzheimer's disease based on EEG
Financial Support
3
Research Overview
Machine learning and signal processing for
applications in NEUROSCIENCE:
development of ALGORITHMS to analyze brain signals
  • EEG (RIKEN, MIT, MGH)
  • diagnosis of Alzheimer's disease
  • detection/prediction of epileptic seizures
  • analysis of EEG evoked by visual/auditory
    stimuli
  • EEG during meditation
  • projects related to brain-computer interface
    (BMI)
  • Calcium imaging (RIKEN, NAIST, MIT)
  • effect of calcium on neural growth
  • role of calcium propagation in glial cells and
    neurons

subject of this talk
4
Overview
  • Alzheimer's Disease (AD)
  • EEG of AD patients: decrease in synchrony
  • Synchrony measure in time-frequency domain
  • Pairs of EEG signals
  • Collections of EEG signals
  • Numerical Results
  • Outlook

5
Alzheimer's disease
Outside glimpse: clinical perspective
Evolution of the disease (stages)
One disease, many symptoms
EEG data
  • 2 to 5 years before
  • mild cognitive impairment (often unnoticed)
  • 6 to 25% progress to Alzheimer's per year

memory, language, executive functions, apraxia,
apathy, agnosia, etc
  • Mild (early stage)
  • becomes less energetic or spontaneous
  • noticeable cognitive deficits
  • still independent (able to compensate)

Memory (forgetting relatives)
  • Moderate (middle stage)
  • Mental abilities decline
  • personality changes
  • become dependent on caregivers

Apathy
  • Severe (late stage)
  • complete deterioration of the personality
  • loss of control over bodily functions
  • total dependence on caregivers

Loss of Self-control
  • 2 to 5% of people over 65 years old
  • up to 20% of people over 80
  • Jeong 2004 (Nature)

Video sources: Alzheimer Society
6
Alzheimer's disease
Inside glimpse: brain atrophy
amyloid plaques and neurofibrillary tangles
Video source: Alzheimer Society
Images: Jannis Productions (R. Fredenburg, S. Jannis)
Video source: P. Thompson, J. Neuroscience, 2003
7
Overview
  • Alzheimer's Disease (AD)
  • EEG of AD patients: decrease in synchrony
  • Synchrony measure in time-frequency domain
  • Pairs of EEG signals
  • Collections of EEG signals
  • Numerical Results
  • Outlook

8
Alzheimer's disease
Inside glimpse: abnormal EEG
EEG system: inexpensive, mobile, useful for
screening
Brain slow-down
slow rhythms (0.5-8 Hz) increase, fast rhythms (8-30 Hz) decrease
(Babiloni et al., 2004; Besthorn et al., 1997;
Jelic et al., 1996; Jeong, 2004; Dierks et al., 1993)
Decrease of synchrony (focus of this project)
  • AD vs. MCI (Hogan et al., 2003; Jiang et al.,
    2005)
  • AD vs. Control (Herrmann and Demiralp, 2005;
    Yagyu et al., 1997; Stam et al., 2002; Babiloni et
    al., 2006)
  • MCI vs. mild AD (Babiloni et al., 2006)

Images: www.cerebromente.org.br
9
Spontaneous (scalp) EEG
[Figure: EEG signal $x(t)$ (amplitude vs. t in sec), its Fourier power $|X(f)|^2$, and its time-frequency map $|X(t,f)|^2$ (wavelet transform; f in Hz vs. t in sec) showing time-frequency patterns (bumps)]
10
Fourier transform
[Figure: signal components 1, 2, 3 and their positions on the frequency axis, from low frequency to high frequency]
11
Windowed Fourier transform


[Figure: Fourier basis functions multiplied by a window function give windowed basis functions; the windowed Fourier transform is displayed over time t and frequency f]
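A minimal sketch of the windowed Fourier transform named on this slide, using only the Python standard library. The Hann window, the window length, the hop size, and the two-tone test signal are illustrative assumptions, not taken from the talk; `fs = 100` Hz matches the sampling rate quoted later in the talk.

```python
import cmath
import math


def hann(n):
    """Hann window of length n (an assumed choice of window function)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def windowed_fourier_power(x, win_len, hop):
    """Return one row per time frame; each row holds |X(t, f)|^2 for bins 0..win_len//2."""
    w = hann(win_len)
    frames = []
    for start in range(0, len(x) - win_len + 1, hop):
        seg = [x[start + i] * w[i] for i in range(win_len)]
        row = []
        for k in range(win_len // 2 + 1):
            # Inner product of the windowed segment with the k-th Fourier basis function.
            s = sum(seg[i] * cmath.exp(-2j * math.pi * k * i / win_len)
                    for i in range(win_len))
            row.append(abs(s) ** 2)
        frames.append(row)
    return frames


fs = 100.0  # sampling rate in Hz
# Hypothetical test signal: 10 Hz for the first 1.5 s, then 20 Hz.
x = [math.sin(2 * math.pi * (10 if n < 150 else 20) * n / fs) for n in range(300)]
power = windowed_fourier_power(x, win_len=50, hop=25)
# Frequency (Hz) of the strongest bin in each frame.
peak_hz = [max(range(len(row)), key=row.__getitem__) * fs / 50 for row in power]
```

Unlike the plain Fourier power, the frame-by-frame peaks show when each rhythm is present, which is the property the time-frequency maps in this talk rely on.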
12
Spontaneous EEG
[Figure: EEG signal $x(t)$ (amplitude vs. t in sec), its Fourier power $|X(f)|^2$, and its time-frequency map $|X(t,f)|^2$ (wavelet transform; f in Hz vs. t in sec) showing time-frequency patterns (bumps)]
13
Signatures of local synchrony
[Figure: time-frequency map (f in Hz vs. t in sec) with time-frequency patterns (bumps)]
EEG stems from thousands of neurons: a bump appears if
neurons are phase-locked (local synchrony)
14
Alzheimer's disease
Inside glimpse: abnormal EEG
EEG system: inexpensive, mobile, useful for
screening
Brain slow-down
slow rhythms (0.5-8 Hz) increase, fast rhythms (8-30 Hz) decrease
(Babiloni et al., 2004; Besthorn et al., 1997;
Jelic et al., 1996; Jeong, 2004; Dierks et al., 1993)
Decrease of synchrony (focus of this project)
  • AD vs. MCI (Hogan et al., 2003; Jiang et al.,
    2005)
  • AD vs. Control (Herrmann and Demiralp, 2005;
    Yagyu et al., 1997; Stam et al., 2002; Babiloni et
    al., 2006)
  • MCI vs. mild AD (Babiloni et al., 2006)

Images: www.cerebromente.org.br
15
Overview
  • Alzheimer's Disease (AD)
  • EEG of AD patients: decrease in synchrony
  • Synchrony measure in time-frequency domain
  • Pairs of EEG signals
  • Collections of EEG signals
  • Numerical Results
  • Outlook

16
Comparing EEG signal rhythms?

2 signals
PROBLEM I: Signals of 3 seconds sampled at
100 Hz (i.e., 300 samples). Time-frequency
representation of one signal: about 25 000
coefficients
17
Comparing EEG signal rhythms? (2)
PROBLEM II: Shifts in time-frequency!
18
Sparse representation: bump model
[Figure: time-frequency map (f in Hz vs. t in sec) with $10^4$ to $10^5$ coefficients; after normalization, bumps give a sparse representation with about $10^2$ parameters]
  • Assumptions
  • time-frequency map is suitable representation
  • oscillatory bursts (bumps) convey key
    information

F. Vialatte et al., "A machine learning approach
to the analysis of time-frequency maps and its
application to neural dynamics", Neural Networks
(2007).
19
Similarity of bump models...
How similar or synchronous are two bump
models? GLOBAL synchrony
Reminder: bumps are due to LOCAL synchrony
MULTI-SCALE approach
20
... by matching bumps
[Figure: bump models y1 and y2 with matched bumps]
  • Some bumps match
  • Offset between matched bumps
SIMILAR bump models if:
  • Many matches
  • Strongly overlapping matches
21
... by matching bumps (2)
  • Bumps in one model, but NOT in other
  • → fraction of spurious bumps $\rho_{\mathrm{spur}}$
  • Bumps in both models, but with offset
  • → Average time offset $\delta_t$ (delay)
  • → Timing jitter with variance $s_t$
  • → Average frequency offset $\delta_f$
  • → Frequency jitter with variance $s_f$
  • Synchrony: only $s_t$ and $\rho_{\mathrm{spur}}$ relevant

Stochastic Event Synchrony (SES) = ($\rho_{\mathrm{spur}}$, $\delta_t$, $s_t$, $\delta_f$, $s_f$)
PROBLEM: Given two bump models, compute ($\rho_{\mathrm{spur}}$, $\delta_t$, $s_t$, $\delta_f$, $s_f$)
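To make the five SES parameters concrete, here is a small sketch that computes them once a matching between the two bump models is already known. Finding the matching is the hard part and is not shown here. The function name, the `(t, f)` tuple layout, the example bumps, and the definition of the spurious fraction as "unmatched bumps divided by all bumps" are assumptions made for illustration.

```python
def ses_from_matching(y, y_prime, matches):
    """Compute (rho_spur, delta_t, s_t, delta_f, s_f) for a given matching.

    y, y_prime: lists of (t, f) bump centers (hypothetical layout).
    matches: list of (k, k_prime) index pairs, each bump used at most once.
    """
    n_total = len(y) + len(y_prime)
    m = len(matches)
    if m == 0:
        raise ValueError("need at least one matched pair to estimate offsets")
    # Assumed definition: fraction of bumps that have no partner in the other model.
    rho_spur = (n_total - 2 * m) / n_total
    dt = [y_prime[k2][0] - y[k1][0] for k1, k2 in matches]
    df = [y_prime[k2][1] - y[k1][1] for k1, k2 in matches]
    delta_t = sum(dt) / m                              # average time offset (delay)
    delta_f = sum(df) / m                              # average frequency offset
    s_t = sum((d - delta_t) ** 2 for d in dt) / m      # timing jitter (variance)
    s_f = sum((d - delta_f) ** 2 for d in df) / m      # frequency jitter (variance)
    return rho_spur, delta_t, s_t, delta_f, s_f


# Hypothetical bump centers (t in sec, f in Hz); the last bump of y has no partner.
y = [(0.10, 8.0), (0.50, 12.0), (0.90, 20.0), (1.40, 10.0)]
y_prime = [(0.12, 8.5), (0.54, 11.5), (0.93, 20.0)]
rho_spur, delta_t, s_t, delta_f, s_f = ses_from_matching(
    y, y_prime, [(0, 0), (1, 1), (2, 2)])
```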
22
Overview
  • Alzheimer's Disease (AD)
  • EEG of AD patients: decrease in synchrony
  • Synchrony measure in time-frequency domain
  • Pairs of EEG signals
  • Collections of EEG signals
  • Numerical Results
  • Outlook

23
Average synchrony
  1. Group electrodes in regions
  2. Bump model for each region
  3. SES for each pair of models
  4. Average the SES parameters

24
Beyond pairwise interactions...
Multi-variate similarity
Pairwise similarity
25
...by clustering
HARD combinatorial problem!
[Figure: bumps of five models y1, ..., y5 grouped into clusters]
  • Models similar if
  • few deletions/large clusters
  • little jitter

Constraint: in each cluster, at most one bump from
each signal
26
Overview
  • Alzheimer's Disease (AD)
  • EEG of AD patients: decrease in synchrony
  • Synchrony measure in time-frequency domain
  • Pairs of EEG signals
  • Collections of EEG signals
  • Numerical Results
  • Outlook

27
EEG Data
  • EEG of 22 Mild Cognitive Impairment (MCI)
    patients and 38 age-matched control subjects (CTR),
    recorded at rest with eyes closed
  • → spontaneous EEG
  • All 22 MCI patients later developed Alzheimer's
    disease (AD)
  • Electrodes located on 21 sites according to the
    international 10-20 system
  • Electrodes grouped into 5 zones (reduces number
    of pairs)
  • 1 bump model per zone
  • Used continuous artifact-free intervals of 20 s
  • Band-pass filtered between 4 and 30 Hz

EEG data provided by Prof. T. Musha
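The remark that grouping electrodes into zones "reduces number of pairs" can be checked with a one-line count of unordered pairs. This is only arithmetic on the numbers stated above (21 sites, 5 zones); the variable names are illustrative.

```python
from math import comb

n_electrodes = 21  # sites of the 10-20 system used in the recordings
n_zones = 5        # zones after grouping

pairs_between_electrodes = comb(n_electrodes, 2)  # unordered electrode pairs
pairs_between_zones = comb(n_zones, 2)            # unordered zone pairs
```

With one bump model per zone, a pairwise measure needs 10 comparisons per subject instead of 210.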
28
Similarity measures
  • Correlation and coherence
  • Granger causality (linear system): DTF, ffDTF,
    dDTF, PDC, PC, ...
  • Phase synchrony: compare instantaneous phases
    (wavelet/Hilbert transform)
  • State space based measures: sync likelihood,
    S-estimator, S-H-N-indices, ...
  • Information-theoretic measures

[Figure: FREQUENCY vs. TIME illustration contrasting no phase locking with phase locking]
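As one concrete instance of the phase synchrony family, the sketch below computes a phase locking value from two sequences of instantaneous phases. It assumes the phases have already been extracted (for example by a wavelet or Hilbert transform, which is not shown), and the function name and test phases are hypothetical.

```python
import cmath
import math


def phase_locking_value(phi_a, phi_b):
    """Magnitude of the mean unit phasor of the phase difference (0 = none, 1 = locked)."""
    if len(phi_a) != len(phi_b) or not phi_a:
        raise ValueError("phase sequences must be non-empty and of equal length")
    total = sum(cmath.exp(1j * (a - b)) for a, b in zip(phi_a, phi_b))
    return abs(total) / len(phi_a)


n = 200
phi_a = [2 * math.pi * 10 * k / 100 for k in range(n)]              # 10 Hz phase at 100 Hz sampling
phi_locked = [p + 0.7 for p in phi_a]                               # constant phase lag
phi_drift = [p + 2 * math.pi * k / n for k, p in enumerate(phi_a)]  # lag sweeps one full cycle

plv_locked = phase_locking_value(phi_a, phi_locked)
plv_drift = phase_locking_value(phi_a, phi_drift)
```

A constant lag gives a value of 1 regardless of the size of the lag, while a lag that drifts uniformly over a full cycle averages out to 0.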
30
Sensitivity (average synchrony)
[Figure: results per synchrony measure, grouped as Corr/Coh, Granger, Info. Theor., State Space, Phase, SES]
Mann-Whitney test: small p-value suggests large
difference in statistics of both groups
Significant differences for ffDTF and $\rho$!
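For readers unfamiliar with the Mann-Whitney test, here is a minimal standard-library sketch: it ranks the pooled values (mid-ranks for ties), forms the U statistic, and uses the normal approximation for a two-sided p-value. The approximation omits the tie correction of the variance and is rough for small groups; the sample values are invented for illustration and are not data from the study.

```python
import math


def mann_whitney_u(a, b):
    """Return (U, two-sided p) using the normal approximation without tie correction."""
    pooled = sorted([(v, 0) for v in a] + [(v, 1) for v in b])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        mid_rank = (i + j) / 2 + 1          # average rank of a run of tied values
        for k in range(i, j + 1):
            ranks[k] = mid_rank
        i = j + 1
    n_a, n_b = len(a), len(b)
    rank_sum_a = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u = min(u_a, n_a * n_b - u_a)
    mean_u = n_a * n_b / 2
    sd_u = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (u - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2))    # two-sided tail probability
    return u, p


# Hypothetical values of one synchrony measure in two groups.
ctr = [0.20, 0.25, 0.22, 0.30, 0.27, 0.24]
mci = [0.35, 0.40, 0.33, 0.45, 0.38, 0.31]
u_stat, p_value = mann_whitney_u(ctr, mci)
u_same, p_same = mann_whitney_u([1, 2, 3], [1, 2, 3])
```

Completely separated groups give U = 0 and a small p-value; identical groups give a p-value of 1.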
31
Classification
ffDTF
  • Clear separation, but not yet useful as
    diagnostic tool
  • Additional indicators needed (fMRI, MEG, DTI,
    ...)
  • Can be used for screening population
    (inexpensive, simple, fast)

32
Correlations
Strong (anti-)correlations: families of
sync measures
33
Overview
  • Alzheimer's Disease (AD)
  • EEG of AD patients: decrease in synchrony
  • Synchrony measure in time-frequency domain
  • Pairs of EEG signals
  • Collections of EEG signals
  • Numerical Results
  • Outlook

34
Ongoing work
  • Time-varying similarity parameters
  • $s_t$

[Figure: $s_t$ over time in two examples: high $s_t$ with no stimulus, low $s_t$ during the stimulus, high $s_t$ with no stimulus again]
35
Future work
  • Matching event patterns instead of single events
  • allows us to extract patterns in
    time-frequency map of EEG!
  • HYPOTHESIS
  • Perhaps specific patterns occur in time-frequency
    EEG maps
  • of AD patients
  • before onset of epileptic seizures
  • REMARK: coupling between frequency bands

[Figure: time-frequency map (f in Hz vs. t in sec)]
36
Conclusions
  • Measure for similarity of point processes
    (stochastic event synchrony)
  • Key idea: alignment of events
  • Solved by statistical inference
  • Application: EEG synchrony of MCI patients
  • About 85% correctly classified; perhaps useful
    for screening population
  • Ongoing/future work: time-varying SES, extracting
    patterns of bumps

37
References and software
  • References
  • Quantifying Statistical Interdependence by
    Message Passing on Graphs: Algorithms and
    Application to Neural Signals, Neural Computation
    (under revision)
  • A Comparative Study of Synchrony Measures for the
    Early Diagnosis of Alzheimer's Disease Based on
    EEG, NeuroImage (under revision)
  • Measuring Neural Synchrony by Message Passing,
    NIPS 2007
  • Quantifying the Similarity of Multiple
    Multi-Dimensional Point Processes by Integer
    Programming with Application to Early Diagnosis
    of Alzheimer's Disease from EEG, EMBC 2008
    (submitted)
  • Software
  • MATLAB implementation of the synchrony measures

38
Machine learning techniques for quantifying
neural synchrony: application to the diagnosis
of Alzheimer's disease from EEG
  • Justin Dauwels
  • LIDS, MIT
  • Amari Research Unit, Brain Science Institute,
    RIKEN
  • June 9, 2008

39
Machine learning for neuroscience
  • Multi-scale in time and space
  • Data fusion: EEG, fMRI, spike data, bio-imaging,
    ...
  • Large-scale inference
  • Visualization

Behavior → Brain → Brain Regions → Neural
Assemblies → Single neurons → Synapses → Ion
channels
40
Estimation
Simple closed-form expressions:
Deltas: average offset
Sigmas: variance of offset
artificial observations (conjugate prior)
41
Large-scale synchrony
Apparently, all brain regions affected...
42
Alzheimer's disease
Outside glimpse: the future (prevalence)
  • 2 to 5% of people over 65 years old
  • Up to 20% of people over 80
  • Jeong 2004 (Nature)

[Figures: millions of sufferers, USA (Hebert et al. 2003) and world (Wimo et al. 2003)]
43
Ongoing and future work
Applications
  • Fluctuations of EEG synchrony
  • Caused by auditory stimuli and music (T.
    Rutkowski)
  • Caused by visual stimuli (F. Vialatte)
  • Yoga professionals (F. Vialatte)
  • Professional shogi players (RIKEN Fujitsu)
  • Brain-Computer Interfaces (T. Rutkowski)
  • Spike data from interacting monkeys (N. Fujii)
  • Calcium propagation in glial cells (N. Nakata)
  • Neural growth (Y. Tsukada, Y. Sakumura)
  • ...

Algorithms
  • alternative inference techniques (e.g., MCMC,
    linear programming)
  • time dependent (Gaussian processes)
  • multivariate (T.Weber)

44
Fitting bump models
[Figure: a bump model is fitted to the signal with a gradient method]
F. Vialatte et al., "A machine learning approach
to the analysis of time-frequency maps and its
application to neural dynamics", Neural Networks
(2007).
45
Boxplots
  • SURPRISE!
  • No increase in jitter, but significantly less
    matched activity!
  • Physiological interpretation
  • neural assemblies more localized?
  • harder to establish large-scale synchrony?

46
Similarity of bump models...
How similar or synchronous are two bump
models?
47
Probabilistic inference
POINT ESTIMATION: $\theta^{(i+1)} = \operatorname{argmax}_{\theta} \log p(y, y', c^{(i+1)}, \theta)$
  • Uniform prior $p(\theta)$: $\delta_t$, $\delta_f$ = average offset; $s_t$, $s_f$ = variance of offset
  • Conjugate prior $p(\theta)$: still closed-form expression
  • Other kind of prior $p(\theta)$: numerical optimization
    (gradient method)
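The closed-form point estimates can be sketched for one dimension (timing) as follows. With a uniform prior the estimates are the sample mean and variance of the offsets between matched bumps. For the conjugate case the sketch uses the textbook joint MAP estimate for a Gaussian with a flat prior on the mean and a scaled inverse chi-squared prior on the variance; whether this matches the exact expression used in the talk is an assumption, as are the function names and the prior settings `nu0` and `s0`.

```python
def estimate_uniform_prior(offsets):
    """Uniform prior: delta = average offset, s = variance of offset."""
    n = len(offsets)
    delta = sum(offsets) / n
    s = sum((d - delta) ** 2 for d in offsets) / n
    return delta, s


def estimate_conjugate_prior(offsets, nu0, s0):
    """Joint MAP with a scaled inverse chi-squared prior (nu0, s0) on the variance.

    nu0 acts like a number of artificial observations with variance s0.
    """
    n = len(offsets)
    delta, v = estimate_uniform_prior(offsets)
    s = (nu0 * s0 + n * v) / (nu0 + n + 2)
    return delta, s


# Hypothetical time offsets (sec) between matched bumps.
offsets = [0.02, 0.04, 0.03, 0.05, 0.01]
delta_ml, s_ml = estimate_uniform_prior(offsets)
delta_map, s_map = estimate_conjugate_prior(offsets, nu0=5, s0=1e-4)
```

The prior pulls the variance estimate toward `s0`, which keeps it away from zero when only a few bumps are matched.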
48
Probabilistic inference
MATCHING: $c^{(i+1)} = \operatorname{argmax}_{c} \log p(y, y', c, \theta^{(i)})$
EQUIVALENT to (imperfect) bipartite max-weight
matching problem:
$$c^{(i+1)} = \operatorname{argmax}_{c} \log p(y, y', c, \theta^{(i)}) = \operatorname{argmax}_{c} \sum_{k,k'} w_{kk'}^{(i)} c_{kk'}$$
$$\text{s.t. } \sum_{k'} c_{kk'} \le 1, \quad \sum_{k} c_{kk'} \le 1, \quad c_{kk'} \in \{0, 1\}$$
find heaviest set of disjoint edges
(not necessarily perfect)
  • ALGORITHMS
  • Polynomial-time algorithms give optimal
    solution(s) (Edmonds-Karp and auction
    algorithm)
  • Linear programming relaxation: extreme points of
    LP polytope are integral
  • Max-product algorithm gives optimal solution if
    unique: Bayati et al. (2005), Sanghavi (2007)
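To illustrate what "heaviest set of disjoint edges, not necessarily perfect" means, the sketch below solves a tiny instance by exhaustive search. This is deliberately not one of the algorithms listed above (Edmonds-Karp, auction, LP relaxation, max-product); it is a brute-force stand-in that is only practical for a handful of bumps. The weight matrix is invented.

```python
def max_weight_matching(w):
    """Exhaustive max-weight bipartite matching; bumps may stay unmatched.

    w[k][k2] is the weight of matching bump k of one model with bump k2 of the other.
    Returns (total_weight, list of (k, k2) pairs).
    """
    n_rows = len(w)
    n_cols = len(w[0]) if w else 0

    def best(k, used):
        if k == n_rows:
            return 0.0, []
        # Option 1: leave bump k unmatched.
        best_val, best_pairs = best(k + 1, used)
        # Option 2: match bump k with any still-unused bump k2.
        for k2 in range(n_cols):
            if k2 not in used:
                val, pairs = best(k + 1, used | {k2})
                val += w[k][k2]
                if val > best_val:
                    best_val, best_pairs = val, [(k, k2)] + pairs
        return best_val, best_pairs

    return best(0, frozenset())


# Hypothetical weights: positive when two bumps are close, negative otherwise.
weights = [
    [2.0, -1.0, -5.0],
    [-1.0, 1.5, -4.0],
    [-6.0, -3.0, -0.5],
    [-7.0, -6.0, -2.0],
]
total, pairs = max_weight_matching(weights)
```

Rows 2 and 3 stay unmatched because every edge they could use has negative weight, which is exactly the "imperfect" case.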

49
Max-product algorithm
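MATCHING: $c^{(i+1)} = \operatorname{argmax}_{c} \log p(y, y', c, \theta^{(i)})$
Generative model:
$$p(y, y', c, \theta) \propto I(c)\, p_\theta(\theta) \prod_{k,k'} \left( N(t'_{k'} - t_k;\, \delta_t, s_{t,kk'})\; N(f'_{k'} - f_k;\, \delta_f, s_{f,kk'})\; \beta^{-2} \right)^{c_{kk'}}$$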
50
Max-product algorithm
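MATCHING: $c^{(i+1)} = \operatorname{argmax}_{c} \log p(y, y', c, \theta^{(i)})$
Conditioning on $\theta$
[Figure: factor graph with the max-product messages $\mu$ exchanged along its edges]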
51
Max-product algorithm (2)
  • Iteratively compute messages
  • At convergence, compute marginals: $p(c_{kk'}) \propto$ product of
    the three messages $\mu(c_{kk'})$ arriving at $c_{kk'}$

52
Algorithm
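PROBLEM: Given two bump models, compute ($\rho_{\mathrm{spur}}$, $\delta_t$, $s_t$, $\delta_f$, $s_f$)
APPROACH: $(\hat{c}, \hat{\theta}) = \operatorname{argmax}_{c,\theta} \log p(y, y', c, \theta)$
SOLUTION: Coordinate descent
MATCHING (max-product): $c^{(i+1)} = \operatorname{argmax}_{c} \log p(y, y', c, \theta^{(i)})$
ESTIMATION (closed-form): $\theta^{(i+1)} = \operatorname{argmax}_{\theta} \log p(y, y', c^{(i+1)}, \theta)$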
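The alternation between matching and estimation can be sketched end to end for the simplest case: events on a time axis only, a uniform prior, and a brute-force matcher in place of max-product. Everything here is an illustrative assumption: the function names, the value of `beta`, the initial parameters, the variance floor `s_min`, and the event times. The weights use the full Gaussian log-density so that both steps increase the same objective.

```python
import math


def log_gauss(d, mean, var):
    """Log-density of a 1-D Gaussian with the given mean and variance."""
    return -0.5 * math.log(2 * math.pi * var) - (d - mean) ** 2 / (2 * var)


def best_matching(w):
    """Exhaustive max-weight (not necessarily perfect) matching, for tiny inputs only."""
    def rec(k, used):
        if k == len(w):
            return 0.0, []
        best_val, best_pairs = rec(k + 1, used)          # leave event k unmatched
        for k2 in range(len(w[k])):
            if k2 not in used:
                val, pairs = rec(k + 1, used | {k2})
                if val + w[k][k2] > best_val:
                    best_val, best_pairs = val + w[k][k2], [(k, k2)] + pairs
        return best_val, best_pairs
    return rec(0, frozenset())[1]


def ses_1d(t, t_prime, beta=0.05, delta=0.0, s=0.01, s_min=1e-6, n_iter=20):
    """Alternate MATCHING and ESTIMATION until the matching stops changing."""
    pairs = []
    for _ in range(n_iter):
        # MATCHING step: weights under the current parameters.
        w = [[log_gauss(b - a, delta, s) - 2 * math.log(beta) for b in t_prime]
             for a in t]
        new_pairs = best_matching(w)
        if new_pairs == pairs:
            break
        pairs = new_pairs
        if not pairs:
            break
        # ESTIMATION step: closed-form mean and variance of the matched offsets.
        d = [t_prime[k2] - t[k] for k, k2 in pairs]
        delta = sum(d) / len(d)
        s = max(sum((x - delta) ** 2 for x in d) / len(d), s_min)
    n_total = len(t) + len(t_prime)
    rho_spur = (n_total - 2 * len(pairs)) / n_total
    return rho_spur, delta, s, pairs


# Hypothetical event times (sec); the last event of each train has no partner.
t = [0.5, 1.2, 2.0, 3.1]
t_prime = [0.56, 1.24, 2.05, 4.5]
rho_spur, delta_t, s_t, pairs = ses_1d(t, t_prime)
```

On this example the matching settles after the first pass: three pairs are aligned with a delay of about 0.05 s, and the two leftover events count as spurious.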
53
Generative model
  • Generate bump model $y_{hidden}$ (hidden)
  • geometric prior for number $n$ of bumps
  • $p(n) = (1 - \lambda S)(\lambda S)^{-n}$
  • bumps are uniformly distributed in rectangle
  • amplitude, width (in t and f) all i.i.d.
  • Generate two noisy observations $y$ and $y'$
  • offset between hidden and observed bump
  • Gaussian random vector with
  • mean $(-\delta_t/2, -\delta_f/2)$ for $y$ and $(\delta_t/2, \delta_f/2)$ for $y'$
  • covariance $\mathrm{diag}(s_t/2, s_f/2)$
  • amplitude, width (in t and f) all i.i.d.
  • deletion with probability $p_d$

Easily extendable to more than 2 observations
54
Generative model (2)
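[Figure: bump $i$ of $y$, offset by $(-\delta_t/2, -\delta_f/2)$, and bumps $i'$, $j'$ of $y'$, offset by $(\delta_t/2, \delta_f/2)$]
  • Binary variables $c_{kk'}$
  • $c_{kk'} = 1$ if $k$ and $k'$ are observations of the
    same hidden bump, else $c_{kk'} = 0$ (e.g., $c_{ii'} = 1$,
    $c_{ij'} = 0$)
  • Constraints: $b_k = \sum_{k'} c_{kk'}$ and $b'_{k'} = \sum_{k} c_{kk'}$
    are binary (matching constraints)
  • Generative model: $p(y, y', y_{hidden}, c, \delta_t, \delta_f, s_t, s_f)$
    (symmetric in $y$ and $y'$)
  • Eliminate $y_{hidden}$ → offset is Gaussian RV with
    mean $(\delta_t, \delta_f)$ and covariance $\mathrm{diag}(s_t, s_f)$
  • Probabilistic inference:

$$p(y, y', c, \theta) = \int p(y, y', y_{hidden}, c, \theta)\, dy_{hidden}$$
$$(\hat{c}, \hat{\theta}) = \operatorname{argmax}_{c,\theta} \log p(y, y', c, \theta)$$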
55
Summary
  • Bumps in one model, but NOT in other
  • → fraction of spurious bumps $\rho_{\mathrm{spur}}$
  • Bumps in both models, but with offset
  • → Average time offset $\delta_t$ (delay)
  • → Timing jitter with variance $s_t$
  • → Average frequency offset $\delta_f$
  • → Frequency jitter with variance $s_f$

PROBLEM: Given two bump models, compute ($\rho_{\mathrm{spur}}$, $\delta_t$, $s_t$, $\delta_f$, $s_f$)
APPROACH: $(\hat{c}, \hat{\theta}) = \operatorname{argmax}_{c,\theta} \log p(y, y', c, \theta)$
56
Objective function
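[Figure: bump $i$ of $y$, offset by $(-\delta_t/2, -\delta_f/2)$, and bumps $i'$, $j'$ of $y'$, offset by $(\delta_t/2, \delta_f/2)$]
  • Logarithm of model:

$$\log p(y, y', c, \theta) = \sum_{k,k'} w_{kk'} c_{kk'} + \log I(c) + \log p_\theta(\theta) + \text{const}$$
$$w_{kk'} = -\left( \frac{1}{s_t} (t'_{k'} - t_k - \delta_t)^2 + \frac{1}{s_f} (f'_{k'} - f_k - \delta_f)^2 \right) - 2 \log \beta, \qquad \beta = p_d (\lambda / V)^{1/2}$$
(bracketed term: Euclidean distance between bump centers)
  • Large $w_{kk'}$ if a) bumps are close, b)
    small $p_d$, c) few bumps per volume element
  • No need to specify $p_d$, $\lambda$, and $V$: they only
    appear through $\beta$, a knob to control matches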
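The role of the weights and of the single tuning knob can be made concrete with a short sketch. It evaluates the weight of every candidate pair from the squared time and frequency distances (each scaled by the corresponding jitter variance) minus twice the log of the knob, following the expression on this slide as far as it can be read from the transcript. The function name, the `(t, f)` tuple layout, and all numeric values are illustrative assumptions.

```python
import math


def match_weights(y, y_prime, delta_t, s_t, delta_f, s_f, beta):
    """Weight of matching each bump (t, f) in y with each bump (t', f') in y_prime."""
    return [
        [
            -((t2 - t1 - delta_t) ** 2 / s_t + (f2 - f1 - delta_f) ** 2 / s_f)
            - 2 * math.log(beta)
            for (t2, f2) in y_prime
        ]
        for (t1, f1) in y
    ]


# Hypothetical bump centers (t in sec, f in Hz).
y = [(0.10, 8.0), (0.90, 20.0)]
y_prime = [(0.13, 8.5), (0.95, 19.0)]

w = match_weights(y, y_prime, delta_t=0.03, s_t=0.001, delta_f=0.0, s_f=1.0, beta=0.1)
w_small_beta = match_weights(y, y_prime, delta_t=0.03, s_t=0.001, delta_f=0.0, s_f=1.0, beta=0.01)
```

Close bumps get a positive weight and distant ones a strongly negative weight; lowering the knob raises every weight by the same amount, so more pairs become worth matching.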

57
Distance measures
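Scaling:
$$w_{kk'} = -\left( \frac{1}{s_{t,kk'}} (t'_{k'} - t_k - \delta_t)^2 + \frac{1}{s_{f,kk'}} (f'_{k'} - f_k - \delta_f)^2 \right) - 2 \log \beta$$
$$s_{t,kk'} = (\Delta t_k + \Delta t'_{k'})\, s_t, \qquad s_{f,kk'} = (\Delta f_k + \Delta f'_{k'})\, s_f$$

Non-Euclidean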
58
Generative model
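$$p(y, y', c, \theta) \propto I(c)\, p_\theta(\theta) \prod_{k,k'} \left( N(t'_{k'} - t_k;\, \delta_t, s_{t,kk'})\; N(f'_{k'} - f_k;\, \delta_f, s_{f,kk'})\; \beta^{-2} \right)^{c_{kk'}}$$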
59
Prior for parameters
  • Expect bumps to appear at about same frequency,
    but delayed
  • Frequency shift requires non-linear
    transformation, less likely than delay
  • Conjugate priors for $s_t$ and $s_f$ (scaled inverse
    chi-squared)
  • Improper prior for $\delta_t$ and $\delta_f$: $p(\delta_t) = 1 = p(\delta_f)$

60
Preliminary results for multi-variate model
linear combination of $p_c$

[Figure: CTR vs. MCI]
61
Probabilistic inference
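PROBLEM: Given two bump models, compute ($\rho_{\mathrm{spur}}$, $\delta_t$, $s_t$, $\delta_f$, $s_f$)
APPROACH: $(\hat{c}, \hat{\theta}) = \operatorname{argmax}_{c,\theta} \log p(y, y', c, \theta)$
SOLUTION: Coordinate descent
MATCHING: $c^{(i+1)} = \operatorname{argmax}_{c} \log p(y, y', c, \theta^{(i)})$
POINT ESTIMATION: $\theta^{(i+1)} = \operatorname{argmax}_{\theta} \log p(y, y', c^{(i+1)}, \theta)$
$$\min_{x \in X,\, y \in Y} d(x, y)$$
[Figure: two sets $X$ and $Y$]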
62
Generative model
  • Generate bump model $y_{hidden}$ (hidden)
  • geometric prior for number $n$ of bumps
  • $p(n) = (1 - \lambda S)(\lambda S)^{-n}$
  • bumps are uniformly distributed in rectangle
  • amplitude, width (in t and f) all i.i.d.

[Figure: observed bump models y1, ..., y5]
  • Generate $M$ noisy observations
  • offset between hidden and observed bump
  • Gaussian random vector with
  • mean $(\delta_{t,m}/2, \delta_{f,m}/2)$
  • covariance $\mathrm{diag}(s_{t,m}/2, s_{f,m}/2)$
  • amplitude, width (in t and f) all i.i.d.
  • deletion with probability $p_d$
  • (other prior $p_{c0}$ for cluster size)

$p_c(i) = p(\text{cluster size} = i \mid y)$, $i = 1, 2, \ldots, M$
Parameters: $\theta = (\delta_{t,m}, \delta_{f,m}, s_{t,m}, s_{f,m}, p_c)$
63
Role of local synchrony
[Figure: assembly activation by stimuli (voice, face), Hebbian consolidation, and assembly recall by a single stimulus (voice)]
(Hebb 1949, Fuster 1997)
64
Probabilistic inference
PROBLEM: Given $M$ bump models, compute $\theta = (\delta_{t,m}, \delta_{f,m}, s_{t,m}, s_{f,m}, p_c)$
APPROACH: $(\hat{c}, \hat{\theta}) = \operatorname{argmax}_{c,\theta} \log p(y, y', c, \theta)$
SOLUTION: Coordinate descent
CLUSTERING (IP or MP): $c^{(i+1)} = \operatorname{argmax}_{c} \log p(y, y', c, \theta^{(i)})$
POINT ESTIMATION: $\theta^{(i+1)} = \operatorname{argmax}_{\theta} \log p(y, y', c^{(i+1)}, \theta)$
  • Integer program
  • Max-product algorithm (MP) on sparse graph
  • Integer programming methods (e.g., LP
    relaxation)