Grouping Multivariate Time Series Variables: Applications to Chemical Process and Visual Field Data

1
Grouping Multivariate Time Series Variables: Applications to Chemical Process and Visual Field Data
  • Allan Tucker - Birkbeck College
  • Stephen Swift - Brunel University
  • Nigel Martin - Birkbeck College
  • Xiaohui Liu - Brunel University

2
Introduction
  • Present a methodology for grouping the variables
    of a Multivariate Time Series (MTS)
  • An MTS is a series of observations of several
    variables recorded over time
  • Tested on two real-world applications
  • Grouping: partitioning a set of objects into a
    number of mutually exclusive subsets
  • Many grouping problems, if not all, are NP-hard

3
MTS Example
4
Grouping MTS - Introduction
  • Desirable to model an MTS as a group of several
    lower-dimensional MTS
  • Decompose the MTS into several lower-dimensional
    MTS based on dependencies in the data
  • The number of possible dependencies is large
    because one variable may affect another after a
    certain time lag

5
Grouping MTS - Methodology
One High Dimensional MTS (X)
6
Correlation Search
  • Spearman's rank correlation used
  • Searches over triples (xi, xj, lag)
  • Entire search space is too large to explore
    exhaustively
  • Invalid triples are excluded:
      • autocorrelations
      • duplicates irrespective of direction where
        lag = 0, e.g. (xi, xj, 0) and (xj, xi, 0)
  • Evolutionary Programming approach found to be the
    most efficient
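
The ingredients of the correlation search can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names, the convention that a triple (i, j, lag) pairs x_i(t) with x_j(t + lag), and the choice to score a constant series as 0 are all assumptions made here. The exhaustive enumeration in valid_triples is shown only to make the validity rules concrete; it is the search space that the Evolutionary Programming approach is meant to avoid scanning in full.

```python
from itertools import product


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # assumption: a constant series is scored as uncorrelated
    return cov / (var_a * var_b) ** 0.5


def lagged_spearman(x, y, lag):
    """Spearman rank correlation between x(t) and y(t + lag), for lag >= 0."""
    n = len(x) - lag
    return pearson(average_ranks(x[:n]), average_ranks(y[lag:lag + n]))


def valid_triples(n_vars, max_lag):
    """Yield every (i, j, lag) that is neither an autocorrelation nor a lag-0 duplicate."""
    for i, j, lag in product(range(n_vars), range(n_vars), range(max_lag + 1)):
        if i == j:
            continue  # autocorrelation
        if lag == 0 and i > j:
            continue  # (i, j, 0) is the same pairing as (j, i, 0)
        yield i, j, lag


# Hypothetical data: y follows x two steps later through a monotone transform,
# so the rank correlation at lag 2 should be very close to 1.
x = [1, 4, 2, 8, 5, 7, 3, 9]
y = [0, 0] + [v * v for v in x[:-2]]
print(lagged_spearman(x, y, 2))
```

Because Spearman's coefficient works on ranks, it picks up any monotone relationship between the lagged series, not only a linear one.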

7
Grouping Genetic Algorithm - Representation and Operators
  • We previously compared and contrasted different
    GA representations and operators
  • Falkenauer's crossover and mutation operators
    ensure that Schema Theory holds for grouping
    problems

Chromosome 0 1 1 0 0 2 1 2 0 1 2
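
A small sketch of how such a chromosome might be decoded, assuming the simplest reading in which gene i holds the group label of variable i. That reading is an assumption: the transcript does not say how the chromosome is laid out, and the original slide may instead show an item part followed by a separate group part, as in Falkenauer's encoding. The helper names decode and canonical are hypothetical.

```python
from collections import defaultdict


def decode(chromosome):
    """Turn a list of group labels into a list of groups of variable indices."""
    groups = defaultdict(list)
    for variable, label in enumerate(chromosome):
        groups[label].append(variable)
    return [groups[label] for label in sorted(groups)]


def canonical(chromosome):
    """Label-independent form of the grouping, used to compare two chromosomes."""
    return sorted(decode(chromosome))


chromosome = [0, 1, 1, 0, 0, 2, 1, 2, 0, 1, 2]
print(decode(chromosome))

# Swapping labels 0 and 1 gives a different chromosome but the same grouping.
relabelled = [1, 0, 0, 1, 1, 2, 0, 2, 1, 0, 2]
print(canonical(chromosome) == canonical(relabelled))
```

The second check illustrates why a plain label encoding is redundant: many chromosomes describe one partition, which is the kind of problem that grouping-specific operators are designed to address.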
8
Grouping - The Grouping Metric Properties
  • If Q (the set of correlated variable pairings
    found by the correlation search) is empty, then
    fitness is maximised when each variable is in a
    separate group
  • If Q contains all pairings of variables (the
    entire search space), then fitness is maximised
    when all variables are in the same group
  • If the data is from a mixed set of MTS, fitness is
    maximised when variables in the same group have
    as many correlations as possible in Q and
    variables in different groups have as few
    correlations as possible in Q
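
One simple scoring rule that exhibits these three properties is sketched below. It is an illustrative assumption, not the metric defined by the authors (the transcript gives only the properties): every pair of variables that shares a group adds 1 if the pair appears in Q and subtracts 1 otherwise. Q is simplified here to a collection of unordered variable pairs with the lag dropped, and the name grouping_fitness is hypothetical.

```python
from itertools import combinations


def grouping_fitness(groups, Q):
    """Score a partition: +1 per same-group pair found in Q, -1 per pair not in Q."""
    correlated = {frozenset(pair) for pair in Q}
    score = 0
    for group in groups:
        for a, b in combinations(group, 2):
            score += 1 if frozenset((a, b)) in correlated else -1
    return score


variables = [0, 1, 2, 3]
singletons = [[v] for v in variables]
one_group = [variables]
two_groups = [[0, 1], [2, 3]]

# Property 1: with Q empty, no grouping beats all-singletons (score 0).
print(grouping_fitness(singletons, []), grouping_fitness(one_group, []))

# Property 2: with every pairing in Q, a single group scores highest.
all_pairs = list(combinations(variables, 2))
print(grouping_fitness(one_group, all_pairs), grouping_fitness(two_groups, all_pairs))

# Property 3: with two correlated pairs, grouping along them scores highest.
mixed = [(0, 1), (2, 3)]
print(grouping_fitness(two_groups, mixed), grouping_fitness(one_group, mixed))
```

Under this rule a correlation in Q that straddles two groups earns nothing, so the search is pushed towards partitions that keep correlated variables together and uncorrelated ones apart.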

9
Oil Refinery Data
  • Oil Refinery Process in Scotland
  • Data recorded every minute
  • Hundreds of variables
  • Years of data available on repository
  • Selected 50 interrelated variables over 10000
    time points
  • Large Time Lags (up to 120 minutes between some
    variables)

10
Visual Field Data
  • The interval between tests is about 6 months
  • Typically, 76 points are measured
  • Values range between 60 (very good) and 0 (blind)
  • The number of tests can range between 10 and 44
[Figure: grid of numbered visual field test points, labelled "Nerve Fibre Bundle (Right Eye)" and "Usual Position of Blind Spot (Right Eye)"]

11
Oil Refinery Data - Results (1)
  • Very rapid generation of Groups (seconds)
  • 3 major groups discovered, 2 relating to the
    upper and lower trays of the column
  • Most of the single variables appear noisy
  • Used as a method for pre-processing data before
    model building where time is short

12
Oil Refinery Data - Results (2)
13
Visual Field Data - Results (1) - Patient Group Comparison
  • Patients are ordered by average sensitivity
  • Patient 1 has the lowest and Patient 82 the
    highest
  • Graph goes from light (bottom right-hand corner)
    to dark (top left-hand corner)

14
Visual Field Data - Results (2)
  • High sensitivity implies similar groups
      • Small groups in general
      • Points in the eye will be associated with
        similar nerve fibre bundles
  • Low sensitivity implies dissimilar groups
      • Large groups in general
      • Different areas of the visual field may be
        deteriorating

15
Conclusions
  • Decomposing a large, high-dimensional MTS is a
    challenging problem
  • The proposed methodology is very encouraging
  • Oil refinery data: 3 relatively independent
    sub-systems rapidly identified
  • Visual field data: the discovered groups offer an
    ideal starting point for modelling as a vector
    autoregressive (VAR) process

16
Future Work
  • Experimenting with new datasets
      • Gene expression data
      • EEG data
  • Determining the ideal parameters
      • e.g. Qlen is very influential on the final
        groupings
  • Combining the two stages (correlation search
    and grouping) into one incremental process

17
Acknowledgements
  • Engineering and Physical Sciences Research
    Council, UK
  • Moorfields Eye Hospital, UK
  • Honeywell Technology Centre, USA
  • Honeywell Hi-Spec Solutions, UK
  • BP-Amoco, UK