STIFF: A Forecasting Framework for Spatio-Temporal Data

1
STIFF: A Forecasting Framework for
Spatio-Temporal Data
  • Zhigang Li, Margaret H. Dunham
  • Department of Computer Science and Engineering
  • Southern Methodist University
  • (Abbreviated Version from PAKDD '02)

2
Our goal
  • In this paper, we present a novel forecasting
    framework for spatio-temporal data, in which not
    only spatial but also temporal characteristics of
    the data are considered to obtain a more
    appropriate result.

3
Presentation Outline
  • Motivation
  • Our Approach: STIFF
  • Combining two approaches to achieve better
    results: Time Series Analysis and ANNs
  • Performance
  • Future Work

4
Why
  • Many application fields require
    spatio-temporal forecasting:
  • river hydrology, biological patterns, housing
    price research, rainfall distribution, waste
    monitoring, fishery, hotel pickup rate, etc.
  • In spatio-temporal forecasting, both spatial and
    temporal properties, as well as their mutual
    correlation, are taken into account.

5
Flood Forecasting (Our Motivating Application)
  • Catchment
  • Many different types of sensors
  • Predict at one sensor location
  • Water level or Flow rate
  • May not be interested in actual prediction of
    value

6
Our approach: Problem definition
  • Ψ = {a0, a1, a2, …, an} is the research field,
    composed of n + 1 spatially separated
    subcomponents, each named ai accordingly.
  • WLOG, a0 is assumed to be the target place where
    forecasting is to be carried out.
  • For each ai in Ψ, there are j observations with
    equal time intervals between consecutive ones,
    denoted by Θi = {ai1, ai2, ai3, …, aij}.

7
Problem definition (Cont.)
  • Given Ψ = {a0, a1, a2, …, an}, Θ = {Θ1, Θ2, …, Θn},
    the length of observations j and the number of
    look-ahead steps δ, we are expected to find the
    best possible forecasting relationship ƒ, which
    is defined as follows.
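
One way to read this definition is as a supervised-learning setup: the observations of all locations at one time step are paired with the target location's value a fixed number of steps later. The sketch below illustrates that reading only. The function name `make_samples`, the dictionary layout, and the toy numbers are assumptions for illustration and do not come from the slides.

```python
from typing import Dict, List, Tuple


def make_samples(
    obs: Dict[str, List[float]], target: str, steps: int
) -> List[Tuple[List[float], float]]:
    """Pair all locations' values at time t with the target's value at t + steps."""
    names = sorted(obs)               # fixed feature order, e.g. a0, a1, ...
    length = len(obs[target])         # j observations per location
    samples = []
    for t in range(length - steps):
        features = [obs[name][t] for name in names]
        samples.append((features, obs[target][t + steps]))
    return samples


# Toy data (hypothetical): two locations, four equally spaced observations each.
obs = {"a0": [1.0, 2.0, 3.0, 4.0], "a1": [10.0, 20.0, 30.0, 40.0]}
samples = make_samples(obs, target="a0", steps=1)
```

With one look-ahead step, four observations yield three usable samples, because the last observation has no future target value to pair with.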

8
Our approach Algorithm sketch
  • Describe the forecasting problem according to the
    problem definition.
  • Build a time series (ARIMA) model for each ai.
    Name the forecast from the time series model of
    a0 ƒT.
  • Construct and train an ANN to capture the spatial
    correlation and influence over the target
    subcomponent a0. Name the forecast from the
    neural network ƒS.
  • Combine ƒT and ƒS via a statistical regression
    mechanism.
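
The temporal step can be illustrated with a much simpler stand-in for ARIMA. The sketch below fits a first-order autoregressive model, AR(1), by ordinary least squares and iterates it forward. It is not the ARIMA procedure the slides refer to; the function names and the toy series are assumptions chosen so the result is easy to check by hand.

```python
from typing import List, Tuple


def fit_ar1(series: List[float]) -> Tuple[float, float]:
    """Least-squares fit of x[t] = c + phi * x[t-1]; returns (c, phi)."""
    x, y = series[:-1], series[1:]
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    phi = sxy / sxx                   # fails if the lagged series is constant
    c = mean_y - phi * mean_x
    return c, phi


def forecast_ar1(series: List[float], c: float, phi: float, steps: int) -> List[float]:
    """Iterate the fitted recursion forward from the last observation."""
    value = series[-1]
    out = []
    for _ in range(steps):
        value = c + phi * value
        out.append(value)
    return out


# Toy series (hypothetical) generated by x[t] = 1 + 0.5 * x[t-1].
series = [10.0, 6.0, 4.0, 3.0, 2.5]
c, phi = fit_ar1(series)
f_T = forecast_ar1(series, c, phi, steps=2)
```

A full ARIMA model adds differencing and moving-average terms, which a statistics library would normally handle.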

9
Find the spatial influence
  • Normally the spatial influence is much harder to
    find than its temporal counterpart.
  • There is no precise way to convert a spatial
    measurement into the change in value it may cause.
  • Time has only 1 dimension, while space has 3 (or 2)
    dimensions.
  • A simple distance measure is not enough; other
    factors are important.

10
Artificial Neural Network (ANN)
  • Why is an ANN used for finding spatial influence?
  • It is a black-box, non-linear technique for
    finding hidden patterns.
  • Like the human brain, it can self-adjust and learn
    automatically even if the problem is not defined
    very well.
  • Practice proves its usefulness.
  • See (1997) found that ANNs were especially useful
    in situations where the underlying physical
    relationships are not fully understood.

11
ANN Construction
  • Simple 3-layer back-propagation MLP
  • One input node for each sensor value except a0.
  • Actual input shifted by predicted time lag.
  • The hidden layer has a certain number of neurons
    that have to be decided by experiment.
  • The output layer has only one neuron that
    corresponds to the target subcomponent a0.
  • We also employ a pruning strategy to make the
    ANN structure as simple as possible without
    harming its efficacy much.
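
The lag-shifted input can be sketched as follows: to predict the target at time t, each other sensor contributes the value it recorded some number of steps earlier. The function name `lagged_inputs`, the sensor names, and the lag values are assumptions for illustration; the slides do not state how the lags were estimated.

```python
from typing import Dict, List


def lagged_inputs(
    upstream: Dict[str, List[float]], lags: Dict[str, int], t: int
) -> List[float]:
    """Build the ANN input vector for target time t from lag-shifted sensor values."""
    vector = []
    for name in sorted(upstream):     # one input node per non-target sensor
        index = t - lags[name]
        if index < 0:
            raise ValueError(f"not enough history for {name} at t={t}")
        vector.append(upstream[name][index])
    return vector


# Hypothetical sensors and lags (in time steps).
upstream = {
    "a1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "a2": [10.0, 20.0, 30.0, 40.0, 50.0],
}
lags = {"a1": 1, "a2": 3}
vec = lagged_inputs(upstream, lags, t=4)
```

Here the input for t = 4 uses the value of a1 at t = 3 and the value of a2 at t = 1.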

12
Integrate the two forecasts
  • We now have two forecasts at the target
    subcomponent a0. One is ƒT, from the time series
    model, and the other is ƒS, from the ANN. We may:
  • Either dynamically select one of the two as the
    current forecast
  • Or fuse them together, since they contribute to
    the overall forecast from two different
    aspects. (That's the approach we take in the paper.)
  • The two forecasts are integrated via a very
    simple linear regression mechanism. Of course,
    more advanced alternatives can be used
    instead for better results.
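
A minimal sketch of the fusion step, assuming the regression has the form x1 * ƒT + x2 * ƒS + C and is fitted by ordinary least squares on past forecasts. The helper names and the toy numbers are assumptions; the slides do not say how the coefficients were estimated.

```python
from typing import List, Tuple


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]   # fails if the system is singular
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_fusion(
    f_t: List[float], f_s: List[float], actual: List[float]
) -> Tuple[float, float, float]:
    """Least-squares fit of actual ~ x1 * f_t + x2 * f_s + c via normal equations."""
    rows = [[t, s, 1.0] for t, s in zip(f_t, f_s)]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atb = [sum(r[i] * y for r, y in zip(rows, actual)) for i in range(3)]
    x1, x2, c = solve(ata, atb)
    return x1, x2, c


# Toy forecasts (hypothetical); actual = 0.6 * f_t + 0.4 * f_s + 0.5.
f_t = [1.0, 2.0, 3.0, 4.0]
f_s = [2.0, 1.0, 4.0, 3.0]
actual = [1.9, 2.1, 3.9, 4.1]
x1, x2, c = fit_fusion(f_t, f_s, actual)
fused = [x1 * t + x2 * s + c for t, s in zip(f_t, f_s)]
```

Fitting the weights on held-out data rather than on the training period would give a fairer picture of how much each forecast contributes.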

13
A case study (National River Flow Archive, Great
Britain)
  • Here we present a practical case
    study to demonstrate how the framework works.
  • We conduct the spatio-temporal forecasting of
    the river water flow rate (m³/s) at the outlet
    gauging station 28010. The basin is shown
    as follows.
  • The target station is 28010, while the other
    stations lie upstream.
  • Derwent Catchment
  • Daily mean flow values

14
Data transformation
  • Checking the water flow rate data at station
    28010 shows that the data is not very stable.
    Abrupt changes are obvious and present roughly
    25% of the time.
  • We therefore transform the data first,
    according to the proposed approach discussed
    before.
  • We empirically vary the value of λ from -1.0 to
    1.0 in steps of 0.1. It turns out that λ = 0.0 is
    (relatively) the best. In other words, we
    log-transform the original water flow rate data.
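
A parameterised transformation that reduces to the logarithm when its parameter is 0 is commonly the Box-Cox power transformation. The sketch below assumes that family; the slides do not name it, and the parameter name `lam` and the helper functions are illustrative.

```python
import math


def box_cox(value: float, lam: float) -> float:
    """Box-Cox transform of a positive value; lam = 0 gives the natural log."""
    if value <= 0:
        raise ValueError("Box-Cox requires positive values")
    if abs(lam) < 1e-12:
        return math.log(value)
    return (value ** lam - 1.0) / lam


def inverse_box_cox(z: float, lam: float) -> float:
    """Map a transformed value back to the original scale."""
    if abs(lam) < 1e-12:
        return math.exp(z)
    return (lam * z + 1.0) ** (1.0 / lam)


# Candidate parameter grid: -1.0 to 1.0 in steps of 0.1 (21 values).
lambdas = [round(-1.0 + 0.1 * k, 1) for k in range(21)]

flow = 4.0                                   # hypothetical flow rate in m3/s
z_log = box_cox(flow, 0.0)                   # log transform
z_sqrt = box_cox(flow, 0.5)                  # (sqrt(4) - 1) / 0.5
restored = inverse_box_cox(z_sqrt, 0.5)
```

Forecasts made on the transformed scale need the inverse transform before they can be compared with observed flow rates.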

15
Actual Flow at Derwent
16
Case Study: ANN
  • 6 input nodes
  • 1 output node
  • 6 chosen as number of hidden nodes based on
    experimentation
  • Number of links pruned based on river topology
  • Lag time used for input based on expected flow
    lag time
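
Pruning links by topology can be pictured as a 0/1 mask over the input-to-hidden weights, so that a hidden node only sees the sensors it is allowed to see. The forward pass below is a sketch under that assumption. It omits bias terms and training, uses a tiny 2-2-1 network instead of the 6-6-1 network of the case study, and all weights and mask entries are hypothetical.

```python
import math
from typing import List


def forward(
    inputs: List[float],
    w_hidden: List[List[float]],
    mask: List[List[int]],
    w_out: List[float],
) -> float:
    """Forward pass of a 3-layer MLP whose pruned input links are masked out."""
    hidden = []
    for weights, keep in zip(w_hidden, mask):
        # Only links with a mask entry of 1 contribute to this hidden node.
        total = sum(w * x for w, x, k in zip(weights, inputs, keep) if k)
        hidden.append(math.tanh(total))
    # Single linear output neuron for the target subcomponent.
    return sum(w * h for w, h in zip(w_out, hidden))


inputs = [1.0, 2.0]
w_hidden = [[0.5, 0.5], [1.0, -1.0]]
w_out = [1.0, 1.0]
pruned_mask = [[1, 0], [1, 1]]        # first hidden node ignores the second input
full_mask = [[1, 1], [1, 1]]

out = forward(inputs, w_hidden, pruned_mask, w_out)
out_full = forward(inputs, w_hidden, full_mask, w_out)
```

During training, the same mask would also have to be applied to the weight updates so that pruned links stay removed.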

17
Building models
  • Following the framework specification, we then
    build a time series model based upon the dataset
    collected from each gauging station.
  • An ANN is constructed after that, with the
    spatially-induced pruning strategy applied to
    remove as many unnecessary links as possible while
    sacrificing little forecasting accuracy.
  • The final overall spatio-temporal forecast is
    then generated by the following simple regression.

18
STIFF Model
ƒ = x1 · ƒT + x2 · ƒS + C
19
Performance Analysis
  • Compared STIFF to pure time series (CTS) and pure
    ANN (CANN)
  • Data starting at 10/01/75
  • 30, 60, 120 days
  • Normalized Absolute Ratio Error (NARE)

20
Forecasting result
  • The forecasting comparison result, measured in
    NARE, is outlined in the following table. The
    other two models, built to the best of our
    knowledge, are used for comparison with STIFF.
  • Here "Over" means overestimation and "Under"
    means underestimation.
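
The slides do not give the NARE formula. The sketch below assumes it is the sum of absolute forecast errors divided by the sum of absolute observed values, and splits it into an overestimation part and an underestimation part by the sign of the error. Both the formula and the split are assumptions for illustration, as are the function name and the toy numbers.

```python
from typing import Dict, List


def nare(observed: List[float], predicted: List[float]) -> Dict[str, float]:
    """Assumed NARE: sum of absolute errors over sum of absolute observations."""
    total = sum(abs(o) for o in observed)
    # Error mass where the forecast is above the observation.
    over = sum(p - o for o, p in zip(observed, predicted) if p > o)
    # Error mass where the forecast is below the observation.
    under = sum(o - p for o, p in zip(observed, predicted) if p < o)
    return {
        "over": over / total,
        "under": under / total,
        "total": (over + under) / total,
    }


# Hypothetical observed and predicted flow rates.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 30.0, 45.0]
scores = nare(observed, predicted)
```

A model whose "over" and "under" parts are close to each other errs in both directions about equally.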

21
Result: 30 Days
22
Conclusion
  • STIFF has better forecast accuracy than a
    single time series model or a single ANN model,
    and its errors are more balanced (over- vs.
    underestimation).
  • Compared with other related work, it avoids
    oversimplification.
  • It does not have the large variation problem.
  • STIFF requires much human intervention and
    interpretation.
  • STIFF is promising for future research.

23
Future work
  • Extend to multivariate forecasting. So far, it
    can only deal with univariate forecasting.
  • Use more sophisticated fusing techniques
  • Test on more flood data
  • Compare to other techniques
  • Examine different ANN structures
  • Extend to other application domains
  • …

24
Thank you!