Title: Extremal cluster characteristics of a regime switching model, with hydrological applications
1Extremal cluster characteristics of a regime
switching model, with hydrological applications
- Péter Elek,
- Krisztina Vasas and András Zempléni
- Eötvös Loránd University, Budapest
- elekpeti_at_cs.elte.hu
- 4th Conference on Extreme Value Analysis
- Gothenburg, 2005
2Contents
- Outline of EVT for stationary series
- extremal index
- limiting cluster size distribution (e.g. distribution of flood length)
- distribution of aggregate excesses (e.g. distribution of flood volume)
- Two models
- a light-tailed conditionally heteroscedastic model
- a regime switching autoregressive model
- Extremal behaviour of the regime switching model
- Application to the study of flood dynamics
3Quantities of interest in the analysis of time
series extremes
- Some are determined by the marginal distribution
- probability of exceeding a high threshold
- distribution of exceedances of a high threshold
- Others are determined by the clustering dynamics of extreme values
- average length of an extremal event (e.g. length of a flood)
- distribution of the length of an extremal event
- distribution of aggregate excesses (e.g. distribution of the flood volume)
4Extremal index
- Conditions $D(u_n)$ or $\Delta(u_n)$ are always assumed.
- A stationary series has extremal index $\theta$ if there exists a real sequence $u_n$ for which
- $n(1 - F(u_n)) \to \tau$
- $P(M_{1,n} \le u_n) \to \exp(-\theta\tau)$
- where $M_{1,n} = \max(X_1, X_2, \dots, X_n)$
- Under $D(u_n)$ the extremal index can be estimated as
- $\theta = \lim P(M_{1,p(n)} \le u_n \mid X_0 > u_n)$
- where $p(n)$ is an appropriately increasing sequence
- $p(n)$ is regarded as the cluster size
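A minimal sketch of an empirical counterpart of the conditional probability above, assuming a fixed threshold `u` and a fixed run length standing in for p(n). The function name `runs_extremal_index` and its arguments are illustrative and not taken from the slides.

```python
def runs_extremal_index(x, u, run_length):
    """Runs-type estimate of the extremal index.

    Counts the exceedances of u that are followed by `run_length`
    consecutive non-exceedances, i.e. the empirical version of
    P(M_{1,p} <= u | X_0 > u) with p = run_length.
    """
    exceedances = 0
    cluster_ends = 0
    # Only positions with a full window of run_length later values are used.
    for i in range(len(x) - run_length):
        if x[i] > u:
            exceedances += 1
            if max(x[i + 1 : i + 1 + run_length]) <= u:
                cluster_ends += 1
    if exceedances == 0:
        return float("nan")  # no exceedance: the estimate is undefined
    return cluster_ends / exceedances
```

In practice both the threshold and the run length have to be chosen by the analyst, and the estimate can be sensitive to both choices.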
5Cluster size distribution and point process
convergence
- Distribution of the number of exceedances in $[1, p(n)]$
- $\pi_n(j) = P\bigl( 1\{X_1 > u_n\} + \dots + 1\{X_{p(n)} > u_n\} = j \mid M_{1,p(n)} > u_n \bigr)$
- The point process of exceedances
- $N_n(\cdot) = \sum_i \delta_{i/n}(\cdot)\, 1\{X_i > u_n\}$
- Under appropriate conditions
- $\pi_n$ converges to some limiting distribution $\pi$
- $N_n(\cdot)$ converges weakly to a compound Poisson process
- whose underlying Poisson process has intensity $\theta\tau$
- and whose i.i.d. clusters are distributed as $\pi$
- High-level exceedances occur in clusters, with cluster size distribution $\pi$. Moreover, $E(\pi) = 1/\theta$.
6Distribution of aggregate excess
- Aggregate excess above $u$ in time interval $[k, l]$
- $W_{k,l}(u) = (X_k - u)_+ + (X_{k+1} - u)_+ + \dots + (X_l - u)_+$
- This value (called flood volume in hydrology) is a good indicator of the severity of extreme events.
- Under appropriate conditions (Smith et al., 1997)
- $W_{1,n}(u_n) \to_d W_1 + W_2 + \dots + W_K$
- where $K \sim \mathrm{Poisson}(\theta\tau)$ and the variables $W_i$ are i.i.d., independent of $K$.
- The distribution of $W_i$ can be regarded as the limiting aggregate excess distribution during an extremal event.
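A small sketch of how the aggregate excess and the per-cluster quantities could be computed from an observed series. The declustering rule (a cluster ends after `run_length` consecutive non-exceedances) is an assumption made for illustration, as are the function names. Indices are 0-based here, whereas the slide uses 1-based time.

```python
def aggregate_excess(x, u, k, l):
    """Sum of the positive parts (x[t] - u)_+ for t = k, ..., l (0-based, inclusive)."""
    return sum(max(x[t] - u, 0.0) for t in range(k, l + 1))


def cluster_summaries(x, u, run_length):
    """Return a list of (cluster size, aggregate excess) pairs.

    A cluster is closed once `run_length` consecutive values at or
    below u have been seen after its last exceedance.
    """
    clusters = []
    size, volume, gap = 0, 0.0, 0
    for value in x:
        if value > u:
            size += 1
            volume += value - u
            gap = 0
        elif size > 0:
            gap += 1
            if gap >= run_length:
                clusters.append((size, volume))
                size, volume, gap = 0, 0.0, 0
    if size > 0:  # series ended while a cluster was still open
        clusters.append((size, volume))
    return clusters
```

The empirical distributions of the first and second components are then finite-sample stand-ins for the cluster size distribution and for the distribution of the per-event aggregate excess.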
7Problems
- Estimation of the limiting quantities ($\theta$, $\pi$, $W$) is difficult.
- Often the subasymptotic behaviour is of interest, too, since the convergence to the limit is very slow.
- To overcome these problems, one can restrict attention to certain families of models.
- A large class of Markov chains behaves like a random walk at extreme levels,
- which can be used to simulate extremal clusters in a Markov chain, see e.g. Smith et al. (1997).
8Water discharge series are non-Markovian even
above high thresholds
- If the series were Markovian,
- $(X_t - X_{t-1} \mid X_{t-1},\ X_{t-1} - X_{t-2} > 0) \overset{d}{=} (X_t - X_{t-1} \mid X_{t-1},\ X_{t-1} - X_{t-2} < 0)$ would hold
- The following plots show $X_t - X_{t-1}$ as a function of $X_{t-1}$ (if $X_{t-1}$ is above the 98% quantile), conditionally on the sign of $X_{t-1} - X_{t-2}$
- The two plots are not similar!
9A light-tailed conditionally heteroscedastic model
- $X_t - c_t = \sum_i a_i (X_{t-i} - c_{t-i}) + \varepsilon_t + \sum_j b_j \varepsilon_{t-j}$
- $\varepsilon_t = \sigma_t Z_t$
- $\sigma_t = [d_0 + d_1 (X_{t-1} - m)_+]^{1/2}$
- $Z_t$ is an i.i.d. sequence with zero mean and unit variance
- $c_t$ describes the deterministic seasonal behaviour in mean
- If all moments of $Z_t$ are finite, then all moments of $X_t$ are finite
- However, the exact tail behaviour is unknown (a special case of a similar model has Weibull-like tails, see Robert, 2000)
- The model approximates the extremal properties of water discharge series well (see Elek and Márkus, 2005)
10A regime switching (RS) autoregressive model
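- $X_t = X_{t-1} + \varepsilon_{1t}$ if $I_t = 1$ (rising regime)
- $X_t = a X_{t-1} + \varepsilon_{0t}$ if $I_t = 0$ (falling regime)
- $\varepsilon_{1t}$ is an i.i.d. noise, distributed as $\mathrm{Gamma}(\alpha, \lambda)$
- $\varepsilon_{0t}$ is an i.i.d. noise, distributed as $\mathrm{Normal}(0, \sigma)$
- $0 < a < 1$
- Successive regime durations are independent and distributed as
- $\mathrm{NegBinom}(\beta_1, p_1)$ in the rising regime
- $\mathrm{NegBinom}(\beta_0, p_0)$ in the falling regime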
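A rough simulation sketch of the regime switching model described on this slide, using only the Python standard library. Several choices are assumptions made for illustration and are not stated on the slides: the Gamma noise is parametrised by shape and rate, the second parameter of the Normal noise is treated as a standard deviation, the negative binomial duration is the number of trials needed for an integer number of successes (so each regime lasts at least that many steps), and the series starts in the rising regime with no burn-in. The names `negbinom_duration` and `simulate_rs` are hypothetical.

```python
import random


def negbinom_duration(shape, p, rng):
    """Number of Bernoulli(p) trials needed to reach `shape` successes (integer shape >= 1)."""
    trials, successes = 0, 0
    while successes < shape:
        trials += 1
        if rng.random() < p:
            successes += 1
    return trials


def simulate_rs(n, a, alpha, lam, sigma, beta1, p1, beta0, p0, x0=0.0, seed=0):
    """Simulate n steps; returns (values, regimes) with regime 1 = rising, 0 = falling."""
    rng = random.Random(seed)
    values, regimes = [], []
    prev, regime = x0, 1
    remaining = negbinom_duration(beta1, p1, rng)
    for _ in range(n):
        if remaining == 0:  # current regime is over: switch and draw a new duration
            regime = 1 - regime
            shape, p = (beta1, p1) if regime == 1 else (beta0, p0)
            remaining = negbinom_duration(shape, p, rng)
        if regime == 1:
            # rising regime: random walk with positive Gamma(shape=alpha, rate=lam) shocks
            prev = prev + rng.gammavariate(alpha, 1.0 / lam)
        else:
            # falling regime: mean-reverting autoregression with Gaussian noise
            prev = a * prev + rng.gauss(0.0, sigma)
        values.append(prev)
        regimes.append(regime)
        remaining -= 1
    return values, regimes
```

With shape parameters equal to 1 the durations are geometric, which corresponds to the Markov-switching special case. An initial stretch of the output would normally be discarded before treating the sample as approximately stationary.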
11Properties of the RS-model
- Heuristic explanation
- $X_t$ gets independent positive shocks in the rising regime
- it develops as a mean-reverting autoregression in the falling regime
- If $\beta_1 = \beta_0 = 1$, then $I_t$ is a Markov chain and $X_t$ is a Markov-switching autoregression
- Stationarity of the model follows from the result of Brandt (1986) for stochastic difference equations
- Regime switching models have deep roots in hydrology (see e.g. Bálint and Szilágyi, 2005)
12The model gives back the asymmetric shape of the
hydrograph
13Tail behaviour of the stationary distribution
- Theorem: The process has a Gamma-like upper tail
- $P(X_t > u \mid I_t = 1) \sim K_1 u^{\beta_1 - 1} \exp\{-\lambda u [1 - (1 - p_1)^{1/\alpha}]\}$
- $P(X_t > u \mid I_t = 0) \sim K_0 u^{\beta_1 - 1} \exp\{-\lambda u [1 - (1 - p_1)^{1/\alpha}] / a\}$
- thus $P(X_t > u) \sim K_1 u^{\beta_1 - 1} \exp\{-\lambda u [1 - (1 - p_1)^{1/\alpha}]\}$.
- The proof is based on the observations that
- the aggregate increment during a rising regime has a Gamma-like tail
- which becomes negligible during the falling regime.
- Corollary: Exceedances above high thresholds are asymptotically exponentially distributed
- $\lim_{u \to \infty} P(X_t > x + u \mid X_t > u) = \exp\{-\lambda x [1 - (1 - p_1)^{1/\alpha}]\}$
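A short worked sketch of where the exponential rate in the theorem comes from, under the simplifying assumption that the rising-regime duration is geometric on {1, 2, ...} with parameter p_1 (negative binomial with shape 1). The symbols S (aggregate increment of one rising regime), N (its duration) and g (moment generating function of one shock) are notation introduced here for illustration.

```latex
\begin{align*}
S &= \sum_{i=1}^{N} \varepsilon_{1i}, \qquad
  N \sim \mathrm{Geom}(p_1), \quad
  \varepsilon_{1i} \sim \mathrm{Gamma}(\alpha, \lambda) \ \text{i.i.d.}, \\
g(s) &= E\, e^{s \varepsilon_{1i}}
  = \Bigl( \frac{\lambda}{\lambda - s} \Bigr)^{\alpha}, \qquad s < \lambda, \\
E\, e^{sS} &= \sum_{n \ge 1} p_1 (1 - p_1)^{n-1} g(s)^n
  = \frac{p_1\, g(s)}{1 - (1 - p_1)\, g(s)}, \\
E\, e^{sS} < \infty
  &\iff (1 - p_1)\, g(s) < 1
  \iff s < \lambda \bigl[ 1 - (1 - p_1)^{1/\alpha} \bigr].
\end{align*}
```

The boundary of finiteness is exactly the exponential rate appearing in the tail formulas. For shape parameter 1 of the Gamma noise, the moment generating function reduces to λp_1/(λp_1 - s), so S is exactly exponential with rate λp_1.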
14Limiting cluster quantities in the model I.
- Even when the regime lengths are negative binomial,
- the extremal index is $p_1$,
- and the limiting cluster size distribution is geometric with parameter $p_1$.
15Limiting cluster quantities in the model II.
- If $\alpha = 1$, the limiting aggregate excess distribution is $W = E_1 + 2E_2 + \dots + N E_N$
- where $N$ is geometric with parameter $p_1$
- the variables $E_i$ are exponential with parameter $\lambda$, independent of each other and of $N$
- The exponential moments are infinite, but all polynomial moments are finite.
- Anderson and Dancy (1992) suggested modelling the aggregate excesses of a hydrological data set with a Weibull distribution.
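A small Monte Carlo sketch of the limiting aggregate excess W described above, assuming that N is geometric on {1, 2, ...} with success probability p_1 and that the E_i are exponential with rate λ. Under these assumptions E[N(N+1)/2] = 1/p_1^2, so the mean of W is 1/(λ p_1^2), which gives a simple sanity check for the simulation. The function name is hypothetical.

```python
import random


def simulate_limiting_excess(lam, p1, rng):
    """Draw one value of W = E_1 + 2*E_2 + ... + N*E_N.

    N is geometric on {1, 2, ...}: after each term the sum stops
    with probability p1. The E_i are exponential with rate lam.
    """
    w, i = 0.0, 0
    while True:
        i += 1
        w += i * rng.expovariate(lam)
        if rng.random() < p1:
            return w


if __name__ == "__main__":
    rng = random.Random(0)
    lam, p1 = 1.0, 0.5  # illustrative values, not the fitted ones
    draws = [simulate_limiting_excess(lam, p1, rng) for _ in range(50_000)]
    print("sample mean:", sum(draws) / len(draws))
    print("theoretical mean:", 1.0 / (lam * p1 ** 2))
```

The same sampler can be used to inspect the upper tail of W, for instance by comparing empirical quantiles with those of a fitted Weibull distribution.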
16Slow convergence to the limiting quantities
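- $\theta = \lim_{p \to \infty} \lim_{u \to \infty} \theta(u, p)$, where $\theta(u, p) = P(M_{1,p} \le u \mid X_0 > u)$
- The plot gives $\theta(u, p)$
- if $\lambda = p_1 = 0.5$, $p_0 = 0.1$, $a = 0.5$ and $\alpha = \beta_0 = \beta_1 = 1$
- for $p = 100$ and $200$ and
- for $u$ ranging from the 99% to the 99.99% quantile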
17Parameter estimation
- Estimation of the whole model with hidden regimes
- (reversible jump) MCMC
- maximum likelihood if $\beta_1 = \beta_0 = 1$ (i.e. in the Markov-switching case), but it is computationally infeasible
- However, if we focus only on extremal dynamics
- and assume that the regime durations (at least above a high level) are geometrically distributed
- we can write down the likelihood based solely on data during floods (i.e. above a high threshold)
- $\alpha = 1$ is also assumed (in accordance with the empirical data)
18Exponential QQ-plot for the positive increments
above the threshold 900 m³/s
19Likelihood computations
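- Likelihood can be determined recursively
- $q_t = P(I_t = 1 \mid X_t, X_{t-1}, \dots)$
- $q_1^{cond} = P(I_t = 1 \mid X_{t-1}, \dots) = (1 - p_1) q_{t-1} + p_0 (1 - q_{t-1})$
- $q_0^{cond} = P(I_t = 0 \mid X_{t-1}, \dots) = p_1 q_{t-1} + (1 - p_0)(1 - q_{t-1})$
- $f_1 = f(X_t, I_t = 1 \mid X_{t-1}, \dots) = q_1^{cond}\, f_{\mathrm{Exp}(\lambda)}(X_t - X_{t-1})$
- $f_0 = f(X_t, I_t = 0 \mid X_{t-1}, \dots) = q_0^{cond}\, f_{N(0,\sigma)}(X_t - a X_{t-1})$
- $f(X_t \mid X_{t-1}, \dots) = f_0 + f_1$
- $q_t = f_1 / (f_0 + f_1)$
- Some care is needed
- at the beginning of the floods $q_t$ is determined from the tail behaviour of the model
- at the end of the floods the observation is censored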
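The recursion above can be sketched as a simple filter over one flood. This is an illustration under stated assumptions, not the authors' implementation: the initial probability `q_init` is supplied by the caller rather than derived from the tail behaviour, the censoring at the end of the flood is ignored, and the second parameter of the Normal density is treated as a standard deviation. All function names are hypothetical.

```python
import math


def exp_pdf(x, lam):
    """Density of the exponential distribution with rate lam."""
    return lam * math.exp(-lam * x) if x >= 0 else 0.0


def normal_pdf(x, sigma):
    """Density of the zero-mean normal distribution with standard deviation sigma."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def log_likelihood(x, q_init, p1, p0, a, lam, sigma):
    """Return (log-likelihood of x[1:] given x[0], final filtered probability q_t)."""
    q = q_init  # P(rising regime at the first observation), assumed known here
    loglik = 0.0
    for t in range(1, len(x)):
        # one-step-ahead regime probabilities
        q1_cond = (1.0 - p1) * q + p0 * (1.0 - q)
        q0_cond = p1 * q + (1.0 - p0) * (1.0 - q)
        # joint densities of the new observation and each regime
        f1 = q1_cond * exp_pdf(x[t] - x[t - 1], lam)
        f0 = q0_cond * normal_pdf(x[t] - a * x[t - 1], sigma)
        f = f0 + f1
        loglik += math.log(f)
        q = f1 / f  # filtered probability of the rising regime
    return loglik, q
```

The returned log-likelihood could be summed over floods and passed to a generic numerical optimiser. Note that a decrease in the series forces the filtered probability of the rising regime to zero, because the exponential density vanishes for negative increments.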
20Advantages of using only the data over a
threshold
- Model dynamics may be different at lower levels
- For physical reasons, the rate of decay in the falling regime (characterised by $a$) varies over the decay
- Fast maximum likelihood estimation
- Smaller sample size
- Regimes separate very well at high levels
21Application to flood analysis
- Data: 50 years of daily water discharge series at Tivadar (river Tisza), about 18000 observations
- We assume $\alpha = \beta_0 = \beta_1 = 1$
- Threshold: 900 m³/s (about the 98% quantile)
- Parameter estimates and asymptotic standard errors
- $p_1 = 0.598$ (0.037): on average 1.7 days of further increase, in accordance with the empirical value
- $p_0 = 0.027$ (0.011): has a negligible effect on the dynamics over the threshold
- $a = 0.823$ (0.007): high persistence even in the falling regime
- $\lambda = 0.0044$ (0.0003)
- $\sigma = 137.1$ (8.0)
22Empirical and simulated flood dynamics
- The shapes of the empirical and simulated floods are very similar.
- Subasymptotic behaviour is important
- Simulated water discharge remains over the threshold for 1.4 days on average after the peak
23Exceedances over a threshold
- Maximal exceedance over a threshold is approximately exponential with parameter $\lambda p_1 = 1/392$ in the model,
- in good accordance with the empirical distribution.
- The plot shows the exceedance over the threshold 1250 m³/s.
24Aggregate excess (flood volume)
- Threshold: 1250 m³/s
- Operational definition: two floods are separated when the water discharge goes below a lower threshold (900 m³/s) between them
- There are only 48 such floods in 50 years
- Empirical mean: 72.1 million m³
- Simulated mean: 76.9 million m³
- The QQ-plot shows the fit of the distribution, too.
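The operational definition of a flood can be sketched in code as follows. The conversion from daily discharge (m³/s) to volume (million m³) assumes that each daily value holds for a full day of 86400 seconds. This conversion, the function name and the handling of a flood still in progress at the end of the series are assumptions made for illustration.

```python
SECONDS_PER_DAY = 86_400


def flood_volumes(discharge, upper, lower):
    """Aggregate excess above `upper` for each flood, in million cubic metres.

    `discharge` holds daily values in m^3/s. Two floods are treated as
    separate only if the discharge drops below `lower` between them.
    """
    volumes = []
    current = 0.0      # accumulated excess of the current flood, in (m^3/s) * days
    in_flood = False
    for q in discharge:
        if q > upper:
            current += q - upper
            in_flood = True
        elif q < lower and in_flood:
            volumes.append(current * SECONDS_PER_DAY / 1e6)
            current, in_flood = 0.0, False
    if in_flood:  # flood still in progress at the end of the series
        volumes.append(current * SECONDS_PER_DAY / 1e6)
    return volumes
```

For example, `flood_volumes(series, upper=1250.0, lower=900.0)` applies the two thresholds quoted on this slide to a hypothetical list `series` of daily discharges.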
25Dependence of p1 on the threshold
26Conclusions
- The limiting cluster quantities can be determined in our physically motivated regime switching model
- Simulations are still needed since the subasymptotic behaviour is important at the relevant thresholds
- To determine return levels of, e.g., flood volume, the occurrence of extreme events should also be modelled, by a Poisson process.
- Further work: what parametric multivariate extreme value distribution does a reasonable multivariate regime switching model suggest?
27References
- Anderson, C.W. and Dancy, G.P. (1992) The severity of extreme events, Research Report 409/92, University of Sheffield.
- Bálint, G. and Szilágyi, J. (2005) A hybrid, Markov-chain based model for daily streamflow generation, Journal of Hydrol. Engineering, in press.
- Brandt, A. (1986) The stochastic equation $Y_{n+1} = A_n Y_n + B_n$ with stationary coefficients, Adv. in Appl. Prob., 18, 211-220.
- Elek, P. and Márkus, L. (2004) A long range dependent model with nonlinear innovations for simulating daily river flows, Natural Hazards and Earth Systems Sciences, 4, 277-283.
- Elek, P. and Márkus, L. (2005) A light-tailed conditionally heteroscedastic model with applications to river flows, in preparation.
- Robert, C. (2000) Extremes of alpha-ARCH models, in Measuring Risk in Complex Stochastic Systems (ed. by Franke et al.), XploRe e-books.
- Segers, J. (2003) Functionals of clusters of extremes, Adv. in Appl. Prob., 35, 1028-1045.
- Smith, R.L., Tawn, J.A. and Coles, S.G. (1997) Markov chain models for threshold exceedances, Biometrika, 84, 249-268.
28- Thank you for your attention!