Title: Grouping Multivariate Time Series Variables: Applications to Chemical Process and Visual Field Data
1Grouping Multivariate Time Series Variables
Applications to Chemical Process and Visual Field Data
- Allan Tucker - Birkbeck College
- Stephen Swift - Brunel University
- Nigel Martin - Birkbeck College
- Xiaohui Liu - Brunel University
2Introduction
- Present a methodology to group Multivariate Time Series (MTS) variables
- An MTS is a series of observations of several variables recorded over time
- Test on two real-world applications
- Grouping: partitioning a set of objects into a number of mutually exclusive subsets
- Many, if not all, grouping problems are NP-hard
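To give a feel for why exhaustive search over groupings is impractical, the number of ways to partition n objects into mutually exclusive, non-empty subsets is the Bell number B(n). The sketch below computes it with the Bell triangle. The function name is illustrative, and the count only conveys the size of the search space; it is not a proof of NP-hardness.

```python
def bell_number(n):
    """Number of ways to partition n objects into non-empty, disjoint subsets."""
    row = [1]  # first row of the Bell triangle
    for _ in range(n):
        # Each new row starts with the last entry of the previous row.
        new_row = [row[-1]]
        for value in row:
            # Each further entry adds the neighbour from the previous row.
            new_row.append(new_row[-1] + value)
        row = new_row
    return row[0]


if __name__ == "__main__":
    for n in (5, 10, 50):
        print(n, bell_number(n))
```

Ten variables already admit 115975 distinct groupings, and the count grows faster than exponentially from there.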
3MTS Example
4Grouping MTS - Introduction
- Desirable to model an MTS as a group of several lower-dimensional MTS
- Decompose the MTS into several lower-dimensional MTS based on dependencies in the data
- Large number of possible dependencies, because one variable may affect another after a certain time lag
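Once a grouping is known, the decomposition itself is mechanical. The following minimal sketch assumes the MTS is held as a list of variables (each a list of observations) and a grouping as a list of lists of variable indices; both layouts and the name `split_mts` are assumptions made for illustration.

```python
def split_mts(mts, groups):
    """Return one lower-dimensional MTS per group.

    mts    : list of variables, each a list of observations over time
    groups : list of groups, each a list of variable indices
    """
    seen = [index for group in groups for index in group]
    # Groups must be mutually exclusive and cover every variable.
    if sorted(seen) != list(range(len(mts))):
        raise ValueError("groups must cover every variable exactly once")
    return [[mts[index] for index in group] for group in groups]


if __name__ == "__main__":
    mts = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    print(split_mts(mts, [[0, 2], [1]]))
```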
5Grouping MTS - Methodology
One High Dimensional MTS (X)
6Correlation Search
- Spearman's Rank Correlation used
- Entire search space is too large
- Invalid triples: autocorrelations, and duplicates irrespective of direction where lag = 0, e.g. (xi, xj, 0) and (xj, xi, 0)
- Evolutionary Programming approach found to be the most efficient
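The pieces of the correlation search can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: a triple (i, j, lag) is taken to mean that variable i at time t is compared with variable j at time t + lag, the function names are hypothetical, and the search shown is a plain exhaustive scan suitable only for toy data, standing in for the Evolutionary Programming search named on the slide.

```python
import math


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        average_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            result[order[k]] = average_rank
        pos = end + 1
    return result


def pearson(a, b):
    """Pearson correlation; returns 0.0 if either input is constant."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def lagged_spearman(x, y, lag):
    """Spearman correlation between x(t) and y(t + lag), for lag >= 0."""
    n = min(len(x), len(y)) - lag
    if lag < 0 or n < 2:
        raise ValueError("lag must be >= 0 and leave at least two points")
    return pearson(ranks(x[:n]), ranks(y[lag:lag + n]))


def is_valid_triple(i, j, lag):
    """Reject autocorrelations and zero-lag duplicates."""
    if i == j:
        return False  # autocorrelation
    if lag == 0 and i > j:
        return False  # (i, j, 0) duplicates (j, i, 0); keep the i < j form
    return True


def top_correlations(series, max_lag, q_len):
    """Exhaustive scan (toy data only): the q_len strongest valid triples."""
    scored = []
    for i in range(len(series)):
        for j in range(len(series)):
            for lag in range(max_lag + 1):
                if is_valid_triple(i, j, lag):
                    rho = lagged_spearman(series[i], series[j], lag)
                    scored.append((abs(rho), (i, j, lag)))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [triple for _strength, triple in scored[:q_len]]
```

Spearman's coefficient is computed here as the Pearson correlation of the ranks, which also handles ties. Ranking by absolute value treats strong negative correlations as dependencies too; that choice is an assumption.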
7Grouping Genetic Algorithm - Representation and Operators
- Previously compared and contrasted different GA representations and operators
- Falkenauer's crossover and mutation operators ensure that Schema Theory holds for grouping problems
Chromosome 0 1 1 0 0 2 1 2 0 1 2
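As a rough illustration of how a chromosome can encode a grouping, the sketch below assumes a simple group-number encoding in which gene k holds the group label of variable k. Whether the chromosome on the slide also carries a separate group part, as in Falkenauer's representation, cannot be recovered from the text, so the example borrows only its first eight genes as sample data. The function names are hypothetical.

```python
def decode(chromosome):
    """Turn a group-number chromosome into a list of groups of variable indices."""
    groups = {}
    for variable, label in enumerate(chromosome):
        groups.setdefault(label, []).append(variable)
    return [groups[label] for label in sorted(groups)]


def canonical(chromosome):
    """Relabel groups in order of first appearance.

    Different labelings of the same partition map to the same canonical form.
    """
    relabel = {}
    result = []
    for label in chromosome:
        if label not in relabel:
            relabel[label] = len(relabel)
        result.append(relabel[label])
    return result


if __name__ == "__main__":
    print(decode([0, 1, 1, 0, 0, 2, 1, 2]))
    print(canonical([2, 0, 0, 2, 1]))
```

The `canonical` helper makes the redundancy of this encoding visible: `[2, 0, 0, 2, 1]` and `[0, 1, 1, 0, 2]` describe the same grouping. That redundancy is one reason standard crossover behaves poorly on grouping problems.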
8Grouping - The Grouping Metric Properties
- If Q (the set of correlations found by the search) is empty, then fitness is maximised when each variable is in a separate group
- If Q contains all pairings of variables (the entire search space), then fitness is maximised when all variables are in the same group
- If the data is from a mixed set of MTS, fitness is maximised when variables in the same group have as many correlations as possible in Q and variables in different groups have as few correlations as possible in Q
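The slide states the properties of the metric but not its formula. The sketch below is one possible scoring function that satisfies all three properties; it is an assumption made for illustration, not the authors' metric. It assumes a chromosome in which gene k is the group label of variable k, and Q as a list of (i, j, lag) triples.

```python
from itertools import combinations


def grouping_fitness(chromosome, q):
    """Score a grouping against the correlation list Q.

    A pair of variables earns +1 when its grouping agrees with Q
    (same group and correlated, or different groups and uncorrelated)
    and -1 otherwise.
    """
    linked = {frozenset((i, j)) for i, j, _lag in q}
    score = 0
    for i, j in combinations(range(len(chromosome)), 2):
        same_group = chromosome[i] == chromosome[j]
        correlated = frozenset((i, j)) in linked
        score += 1 if same_group == correlated else -1
    return score


if __name__ == "__main__":
    q = [(0, 1, 3), (2, 3, 1)]
    print(grouping_fitness([0, 0, 1, 1], q))  # grouping matches Q exactly
    print(grouping_fitness([0, 1, 0, 1], q))  # grouping contradicts Q
```

With an empty Q every same-group pair is penalised, so the all-separate grouping scores highest. With Q holding every pairing, every split pair is penalised, so the single-group solution scores highest.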
9Oil Refinery Data
- Oil Refinery Process in Scotland
- Data recorded every minute
- Hundreds of variables
- Years of data available on repository
- Selected 50 interrelated variables over 10000 time points
- Large time lags (up to 120 minutes between some variables)
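For a sense of scale, the number of candidate (xi, xj, lag) triples for a data set of this size can be counted directly. The sketch assumes lags are whole minutes (matching the one-minute sampling) and that the 120-minute maximum applies to every pair of variables; the function name is illustrative.

```python
def count_triples(n_variables, max_lag):
    """Return (all triples, valid triples) for lags 0..max_lag.

    Valid triples exclude autocorrelations (i == j) and count each
    zero-lag pair once, since (i, j, 0) and (j, i, 0) are duplicates.
    """
    all_triples = n_variables * n_variables * (max_lag + 1)
    lagged = n_variables * (n_variables - 1) * max_lag   # ordered pairs, lag >= 1
    zero_lag = n_variables * (n_variables - 1) // 2      # unordered pairs, lag 0
    return all_triples, lagged + zero_lag


if __name__ == "__main__":
    print(count_triples(50, 120))  # (302500, 295225)
```

Each of those triples would require a rank correlation over up to 10000 time points, which is why a guided search is preferred over scanning all of them.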
10Visual Field Data
[Figure: visual field map of the right eye, labelled "Nerve Fibre Bundle (Right Eye)" and "Usual Position of Blind Spot (Right Eye)" (marked B)]
- The interval between tests is about 6 months
- Typically, 76 points are measured
- Values range between 60 (very good) and 0 (blind)
- The number of tests can range between 10 and 44
11Oil Refinery Data - Results (1)
- Very rapid generation of Groups (seconds)
- 3 major groups discovered, 2 relating to the upper and lower trays of the column
- Most of the single variables appear noisy
- Used as a method for pre-processing data before model building where time is short
12Oil Refinery Data - Results (2)
13Visual Field Data - Results (1) - Patient Group Comparison
- Patients are ordered by average sensitivity
- Patient 1 has the lowest and Patient 82 the highest
- Graph goes from light (bottom right-hand corner, BRHC) to dark (top left-hand corner, TLHC)
14Visual Field Data - Results (2)
- High Sensitivity implies similar groups
- Small groups in general
- Points in the eye will be associated with similar nerve fibre bundles
- Low Sensitivity implies dissimilar groups
- Large groups in general
- Different areas of the visual field may be deteriorating
15Conclusions
- Decomposing large, high-dimensional MTS is a challenging problem
- Proposed methodology is very encouraging
- Oil Refinery Data: 3 relatively independent sub-systems rapidly identified
- Visual Field Data: discovered groups offer an ideal starting point for modelling as a vector autoregressive (VAR) process
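To make the VAR idea concrete: in a first-order VAR model, each variable in a group is a linear combination of the previous values of all variables in that group. The sketch below shows the noise-free forecast step only. The coefficient matrix is hypothetical; in practice it would be estimated from the data of one discovered group, and a higher model order may be needed.

```python
def var1_step(coefficients, previous):
    """One noise-free VAR(1) step: x_t = A * x_(t-1)."""
    return [sum(a * x for a, x in zip(row, previous)) for row in coefficients]


def var1_forecast(coefficients, start, steps):
    """Iterate var1_step, returning the starting vector and each forecast."""
    path = [list(start)]
    for _ in range(steps):
        path.append(var1_step(coefficients, path[-1]))
    return path


if __name__ == "__main__":
    # Hypothetical coefficients for a two-variable group.
    a = [[0.5, 0.0],
         [0.25, 0.5]]
    print(var1_forecast(a, [4.0, 8.0], 2))
```

Fitting one such model per group keeps each coefficient matrix small, which is the practical benefit of grouping before modelling.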
16Future Work
- Experimenting with new datasets
- Gene Expression Data
- EEG Data
- Determining the ideal Parameters
- e.g. Qlen is very influential on final groupings
- Combining the two stages (correlation search and grouping) into one incremental process
17Acknowledgements
- Engineering and Physical Sciences Research Council, UK
- Moorfields Eye Hospital, UK
- Honeywell Technology Centre, USA
- Honeywell Hi-Spec Solutions, UK
- BP-Amoco, UK