Noise sensitivity of portfolio selection under various risk measures

1
Noise sensitivity of portfolio selection under
various risk measures

Imre Kondor
Collegium Budapest and Eötvös
University, Budapest, Hungary
Risk Measurement and Management, Rome, June 9-17,
2005

2
Contents

I. Preliminaries
the problem of noise, risk measures, noisy
covariance matrices
II. Random matrices
Spectral properties of Wigner and Wishart
matrices
III. Filtering of normal portfolios
optimization vs. risk measurement,
model-simulation approach,
random-matrix-theory-based filtering
IV. Beyond the stationary, Gaussian world
non-stationary case, alternative risk measures
(mean absolute deviation, expected shortfall,
worst loss), their sensitivity to noise, the
feasibility problem

3
Coworkers

Szilárd Pafka and Gábor Nagy (CIB Bank, Budapest,
a member of the Intesa Group), Marc Potters
(Capital Fund Management, Paris)
Richárd Karádi (Institute of Physics, Budapest
University of Technology, now at Procter & Gamble)
Balázs Janecskó, András Szepessy, Tünde Ujvárosi
(Raiffeisen Bank, Budapest)
István Varga-Haszonits (Eötvös University,
Budapest)

4
I. PRELIMINARIES
5
Preliminary considerations

Portfolio selection vs. risk measurement of a
fixed portfolio
Portfolio selection: a tradeoff between risk and
reward
There is a more or less general agreement on what
we mean by reward in a finance context, but the
status of risk measures is controversial
For optimal portfolio selection we have to know
what we want to optimize
The chosen risk measure should respect some
obvious mathematical requirements, must be
stable, and easy to implement in practice

6
The problem of noise

Even if returns formed a clean, stationary
stochastic process, we could only observe finite
time segments, therefore we never have sufficient
information to completely reconstruct the
underlying process. Our estimates will always be
noisy.
Mean returns are particularly hard to measure on
the market with any precision.
Even if we disregard returns and go for the
minimal risk portfolio, lack of sufficient
information will introduce noise, i.e. error,
into correlations and covariances, hence into our
decision.
The problem of noise is more severe for large
portfolios (size N) and relatively short time
series (length T) of observations, and different
risk measures are sensitive to noise to a
different degree.
We have to know how the decision error depends on
N and T for a given risk measure

7
Some elementary criteria on risk measures

A risk measure is a quantitative characterization
of our intuitive risk concept (fear of
uncertainty and loss).
Risk is related to the stochastic nature of
returns. It is a functional of the pdf of
returns.
Any reasonable risk measure must satisfy
- convexity
- invariance under addition of risk free asset
- monotonicity and assigning zero risk to a zero
position
The appropriate choice may depend on the nature
of data (e.g. on their asymptotics) and on the
context (investment, risk management,
benchmarking, tracking, regulation, capital
allocation)

8
A more elaborate set of risk measure axioms

Coherent risk measures (P. Artzner, F. Delbaen,
J.-M. Eber, D. Heath, Risk 10, 33-49 (1997);
Mathematical Finance 9, 203-228 (1999)). Required
properties: monotonicity, subadditivity, positive
homogeneity, and translational invariance.
Subadditivity and homogeneity imply convexity.
(Homogeneity is questionable for very large
positions. Multiperiod risk measures?)
Spectral measures (C. Acerbi, in Risk Measures
for the 21st Century, ed. G. Szegö, Wiley, 2004):
a special subset of coherent measures, with an
explicit representation. They are parametrized by
a spectral function that reflects the risk
aversion of the investor.

9
Convexity

Convexity is extremely important.
A non-convex risk measure
- penalizes diversification (without convexity,
risk can be reduced by splitting the portfolio
in two or more parts)
- does not allow risk to be correctly aggregated
- cannot provide a basis for rational pricing of
risk (the efficient set may not be convex)
- cannot serve as a basis for a consistent limit
system
In short, a non-convex risk measure is really not
a risk measure at all.

10
A classical risk measure: the variance

When we use variance as a risk measure we assume
that the underlying statistics is essentially
multivariate normal or close to it.

11
Portfolios

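Consider a linear combination of returns $r_i$
with weights $w_i$: $r_P = \sum_i w_i r_i$. The
weights add up to unity: $\sum_i w_i = 1$. The
portfolio's expectation value is $\mu_P = \sum_i w_i \mu_i$
with variance $\sigma_P^2 = \sum_{i,j} w_i \sigma_{ij} w_j$,
where $\sigma_{ij}$ is the covariance matrix, and
$\sigma_i = \sqrt{\sigma_{ii}}$ the standard deviation of return $r_i$.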

12
Level surfaces of risk measured in variance

The covariance matrix is positive definite. It
follows that the level surfaces (iso-risk
surfaces) of variance are (hyper)ellipsoids in
the space of weights. The convex iso-risk
surfaces reflect the fact that the variance is a
convex measure.
The principal axes are inversely proportional to
the square root of the eigenvalues of the
covariance matrix.
Small eigenvalues thus correspond to long
axes.
The risk free asset would correspond to an
infinite axis, and the corresponding ellipsoid
would be deformed into an elliptical cylinder.

13
The Markowitz problem

According to Markowitz's classical theory the
tradeoff between risk and reward can be realized
by minimizing the variance
$\sigma_P^2 = \sum_{i,j} w_i \sigma_{ij} w_j$
over the weights, for a given expected return
$\sum_i w_i \mu_i = \mu$
and budget
$\sum_i w_i = 1$.

Geometrically, this means that we have to blow up
the risk ellipsoid until it touches the
intersection of the two planes corresponding to
the return and budget constraints, respectively.
The point of tangency is the solution to the
problem.
As the solution is the point of tangency of a
convex surface with a linear one, the solution is
unique.
There is a certain continuity or stability in the
solution: a small misspecification of the risk
ellipsoid leads to a small shift in the solution.

Covariance matrices corresponding to real markets
tend to have mostly positive elements.
A large, complicated matrix with nonzero average
elements will have a large (Frobenius-Perron)
eigenvalue, with the corresponding eigenvector
having all positive components. This will be the
direction of the shortest principal axis of the
risk ellipsoid.
Then the solution also will have all positive
components. Even large fluctuations in the small
eigenvalue sectors may have a relatively mild
effect on the solution.

16
The minimal risk portfolio

Expected returns are hardly possible (on
efficient markets, impossible) to determine with
any precision.
In order to get rid of the uncertainties in the
returns, we confine ourselves to considering the
minimal risk portfolio only, that is, for the
sake of simplicity, we drop the return
constraint.
Minimizing the variance of a portfolio without
considering return does not, in general, make
much sense. In some cases (index tracking,
benchmarking), however, this is precisely what
one has to do.

17
Benchmark tracking

The goal can be (e.g. in benchmark tracking or
index replication) to minimize the risk (e.g.
standard deviation) relative to a benchmark
Portfolio
Benchmark
Relative portfolio

Therefore the relevant problems are of similar
structure but with returns relative to the
benchmark
For example, to minimize risk relative to the
benchmark means minimizing the standard deviation
of the relative portfolio,
with the usual budget constraint (no condition on
expected returns!)

19
The weights of the minimal risk portfolio

Analytically, the minimal variance portfolio
corresponds to the weights for which
$\sigma_P^2 = \sum_{i,j} w_i \sigma_{ij} w_j$ is minimal, given $\sum_i w_i = 1$.
The solution is $w_i^* = \sum_j \sigma^{-1}_{ij} / \sum_{j,k} \sigma^{-1}_{jk}$.
Geometrically, the minimal risk portfolio is the
point of tangency between the risk ellipsoid and
the plane of the budget constraint.
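
The closed-form solution can be sketched in a few lines. This is a minimal illustration, assuming NumPy; the function name `min_variance_weights` and the 3-asset covariance matrix are made up for the example and do not come from the slides. The weights are proportional to the row sums of the inverse covariance matrix, normalized so that they add up to unity.

```python
import numpy as np

def min_variance_weights(cov):
    """Weights minimizing w' cov w subject to sum(w) = 1."""
    ones = np.ones(cov.shape[0])
    # Solve cov @ x = 1 rather than forming the inverse explicitly.
    x = np.linalg.solve(cov, ones)
    return x / x.sum()

# Illustrative 3-asset covariance matrix (made-up numbers).
cov = np.array([[0.04, 0.01, 0.00],
                [0.01, 0.09, 0.02],
                [0.00, 0.02, 0.16]])
w = min_variance_weights(cov)

# Risk of the optimum versus the naive equal-weight portfolio.
equal = np.full(3, 1.0 / 3.0)
var_opt = w @ cov @ w
var_equal = equal @ cov @ equal
```

At the optimum the gradient `cov @ w` is the same for every asset, which is the tangency condition between the risk ellipsoid and the budget plane.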

20
Empirical covariance matrices

The covariance matrix has to be determined from
measurements on the market. From the returns $r_{it}$
observed at time t we get the estimator
$\hat\sigma_{ij} = \frac{1}{T} \sum_{t=1}^{T} r_{it} r_{jt}$
(written here for returns with zero mean).
For a portfolio of N assets the covariance matrix
has O(N²) elements. The time series of length T
for N assets contain NT data. In order for the
measurement to be precise, we need N << T. Bank
portfolios may contain hundreds of assets, and it
is hardly meaningful to use time series longer
than 4 years (T ≈ 1000). Therefore, N/T << 1 rarely
holds in practice. As a result, there will be a
lot of noise in the estimate, and the error will
scale in N/T.
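
A small sketch of the estimator and of the role of N/T, assuming NumPy, zero-mean returns, and i.i.d. standard normal data (so the true covariance matrix is the unit matrix). The function name and the sample sizes are illustrative choices, not taken from the slides.

```python
import numpy as np

def sample_covariance(returns):
    """Estimator (1/T) * sum_t r_it r_jt for zero-mean returns of shape (N, T)."""
    n_obs = returns.shape[1]
    return returns @ returns.T / n_obs

rng = np.random.default_rng(0)
N = 50
short = sample_covariance(rng.standard_normal((N, 25)))     # T < N
long_ = sample_covariance(rng.standard_normal((N, 5000)))   # N/T = 0.01

# With T < N the estimate has rank at most T and cannot be inverted.
rank_short = np.linalg.matrix_rank(short)
# With N/T << 1 the estimate is close to the true (unit) matrix.
err_long = np.abs(long_ - np.eye(N)).max()
```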

21
Fighting the curse of dimensions

Economists have been struggling with this problem
for ages. Since the root of the problem is lack
of sufficient information, the remedy is to
inject external info into the estimate. This
means imposing some structure on σ. This
introduces bias, but the beneficial effect of noise
reduction may compensate for this.
Examples:
- single-index models (β's)
- multi-index models
- grouping by sectors
- principal component analysis
- Bayesian shrinkage estimators, etc.
All these help to various degrees. Most studies are
based on empirical data.

22
An intriguing observation

L. Laloux, P. Cizeau, J.-P. Bouchaud, M. Potters,
PRL 83, 1467 (1999) and Risk 12, No. 3, 69 (1999),
and
V. Plerou, P. Gopikrishnan, B. Rosenow, L.A.N.
Amaral, H.E. Stanley, PRL 83, 1471 (1999)
noted that there is such a huge amount of noise
in empirical covariance matrices that it may be
enough to make them useless.
A paradox: covariance matrices are in widespread
use and banks still survive?!

23
Laloux et al. 1999
The spectrum of the covariance matrix obtained
from the time series of the S&P 500 with N = 406,
T = 1308, i.e. N/T = 0.31, compared with that of a
completely random matrix (solid curve). Only
about 6% of the eigenvalues lie beyond the random
band.
24
Remarks on the paradox

The number of junk eigenvalues may not
necessarily be a proper measure of the effect of
noise: the small eigenvalues and their
eigenvectors fluctuate a lot, indeed, but perhaps
they have a relatively minor effect on the
optimal portfolio, whereas the large eigenvalues
and their eigenvectors are fairly stable.
The investigated portfolio was too large compared
with the length of the time series.
Working with real, empirical data, it is hard to
distinguish the effect of insufficient
information from other parasitic effects, like
nonstationarity.

25
A historical remark

Random matrices first appeared in a finance
context in G. Galluccio, J.-P. Bouchaud, M.
Potters, Physica A 259 449 (1998). In this paper
they show that the optimization of a margin
account (where, due to the obligatory deposit
proportional to the absolute value of the
positions, a nonlinear constraint replaces the
budget constraint) is equivalent to finding the
ground state configuration of what is called a
spin glass in statistical physics. This task is
known to be NP-complete, with an exponentially
large number of solutions.
Problems of a similar structure would appear if
one wanted to optimize the capital requirement of
a bond portfolio under the rules stipulated by
the Capital Adequacy Directive of the EU (see
below)

26
A filtering procedure suggested by RMT

The appearance of random matrices in the context
of portfolio selection triggered a lot of
activity, mainly among physicists. Laloux et al.
and Plerou et al. subsequently proposed a
filtering method based on random matrix theory (RMT).
This has been further developed and refined by
many workers.
The proposed filtering consists basically in
discarding as pure noise that part of the
spectrum that falls below the upper edge of the
random spectrum. Information is carried only by
the eigenvalues and their eigenvectors above this
edge. Optimization should be carried out by
projecting onto the subspace of large
eigenvalues, and replacing the small ones by a
constant chosen so as to preserve the trace. This
would then drastically reduce the effective
dimensionality of the problem.

Interpretation of the large eigenvalues: the
largest one is the market, the other big
eigenvalues correspond to the main industrial
sectors.
The method can be regarded as a systematic
version of principal component analysis, with an
objective criterion on the number of principal
components.
In order to better understand this novel
filtering method, we have to recall a few results
from Random Matrix Theory (RMT)

28
II. RANDOM MATRICES
29
Origins of random matrix theory (RMT)

Wigner, Dyson 1950s
Originally meant to describe (to a zeroth
approximation) the spectral properties of (heavy)
atomic nuclei
- on the grounds that something that is
sufficiently complex is almost random
- fits into the picture of a complex system, as
one with a large number of degrees of freedom,
without symmetries, hence irreducible, quasi
random.
- markets, by the way, are considered stochastic
for similar reasons
Later found applications in a wide range of
problems, from quantum gravity through quantum
chaos, mesoscopics, random systems, etc. etc.

30
RMT

Has developed into a rich field with a huge set
of results for the spectral properties of various
classes of random matrices
They can be thought of as a set of central limit
theorems for matrices

31
Wigner semi-circle law

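$M_{ij}$: symmetrical NxN matrix with i.i.d. elements
(the distribution has 0 mean and finite second
moment $\sigma^2$)
$\lambda_k$: eigenvalues of $M_{ij}$
The density of eigenvalues $\lambda_k$ (normed by $\sqrt{N}$) goes
to the Wigner semi-circle for $N \to \infty$ with prob. 1:
$\rho(\lambda) = \frac{1}{2\pi\sigma^2} \sqrt{4\sigma^2 - \lambda^2}$, $|\lambda| \le 2\sigma$,
$\rho(\lambda) = 0$, otherwise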
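
The semi-circle law is easy to check numerically. The sketch below assumes NumPy and Gaussian matrix elements with variance σ² = 1; the matrix size, the number of histogram bins, and the helper name `semicircle` are arbitrary illustrative choices. The eigenvalues are divided by √N before being compared with the limiting density on [-2σ, 2σ].

```python
import numpy as np

rng = np.random.default_rng(1)
N, sigma = 500, 1.0

# Symmetric matrix whose off-diagonal elements have variance sigma**2.
G = rng.normal(0.0, sigma, size=(N, N))
M = (G + G.T) / np.sqrt(2.0)

# Eigenvalues rescaled by sqrt(N) should fill [-2*sigma, 2*sigma].
lam = np.linalg.eigvalsh(M) / np.sqrt(N)

def semicircle(x, sigma=1.0):
    """Wigner semi-circle density, zero outside |x| <= 2*sigma."""
    inside = np.clip(4.0 * sigma**2 - x**2, 0.0, None)
    return np.sqrt(inside) / (2.0 * np.pi * sigma**2)

hist, edges = np.histogram(lam, bins=30, range=(-2.2, 2.2), density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
max_dev = np.abs(hist - semicircle(centers, sigma)).max()
```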

32
Remarks on the semi-circle law

Can be proved by the method of moments (as done
originally by Wigner) or by the resolvent method
(Marchenko and Pastur and countless others)
Holds also for slightly dependent or
non-homogeneous entries (e.g. for the association
matrix in network theory)
The convergence is fast (believed to be of order
1/N, but proved only at a lower rate), especially
as concerns the support

Convergence to the semi-circle as N increases

34
N = 20
Elements of M are distributed normally
35
N = 50
36
N = 100
37
N = 200
38
N = 500
39
N = 1000
40

If the matrix elements are not centered but have
a common mean, one large eigenvalue breaks away,
the rest stay in the semi-circle

41
If the matrix elements are not centered
N = 1000
42
N = 1000
43

For fat-tailed (but finite variance)
distributions the theorem still holds, but the
convergence is slow

44
Sample from Student t distribution (3 degrees of freedom)
N = 20
45
N = 50
46
N = 100
47
N = 200
48
N = 500
49
N = 1000
50

There is a lot of fluctuation, level crossing,
random rotation of eigenvectors taking place in
the bulk

51
Illustration of the instability of the
eigenvectors, although the distribution of the
eigenvalues is the same. Sample 1. Matrix elements
normally distributed, N = 1000
52
Sample 2
53
Sample k
54
Scalar product of the eigenvectors assigned to
the j-th eigenvalue of the matrix.
55

The eigenvector belonging to the large
eigenvalue (when there is one) is much more
stable. The larger the eigenvalue, the more so.

56
Illustration of the stability of the largest
eigenvector. Sample 1. Matrix elements are normally
distributed, but the sum of the elements in the
rows is not zero. N = 1000
57
Sample 2
58
Sample k
59
Scalar product of the eigenvectors belonging to
the largest eigenvalue of the matrix. The larger
the first eigenvalue, the closer the scalar
products to 1 or -1.
60
The eigenvector components

A lot less is known about the eigenvectors.
Those in the bulk have random components
The one belonging to the large eigenvalue (when
there is one) is completely delocalized

61
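Wishart matrices: random sample covariance
matrices

Let $A_{ij}$ be an NxT matrix with i.i.d. elements (0 mean
and finite second moment $\sigma_A^2$)
$\sigma = \frac{1}{T} A A'$, where $A'$ is the transpose
Wishart or Marchenko-Pastur spectrum (eigenvalue
distribution):
$\rho(\lambda) = \frac{T}{N} \frac{\sqrt{(\lambda_{max} - \lambda)(\lambda - \lambda_{min})}}{2\pi \sigma_A^2 \lambda}$,
where $\lambda_{max,min} = \sigma_A^2 (1 \pm \sqrt{N/T})^2$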
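
As a numerical illustration of the Marchenko-Pastur band, the sketch below builds one random sample covariance matrix and compares its extreme eigenvalues with the band edges (1 ± √(N/T))², assuming unit-variance Gaussian elements. NumPy is assumed; N = 400 and T = 800 are arbitrary choices giving T/N = 2.

```python
import numpy as np

rng = np.random.default_rng(2)
N, T = 400, 800                      # T/N = 2
A = rng.standard_normal((N, T))      # i.i.d. elements, zero mean, unit variance
cov = A @ A.T / T
lam = np.linalg.eigvalsh(cov)

r = N / T
lam_min = (1.0 - np.sqrt(r)) ** 2    # lower edge of the band, about 0.086
lam_max = (1.0 + np.sqrt(r)) ** 2    # upper edge of the band, about 2.914
```

For finite N the sample eigenvalues fill the interval between the two edges up to small fluctuations, and their average stays close to the element variance because the trace is preserved on average.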

62
Remarks

The theorem also holds when E(A) is of finite
rank
The assumption that the entries are identically
distributed is not necessary
If T < N the distribution is the same with an
extra point of mass 1 - T/N at the origin
If T = N the Marchenko-Pastur law is the squared
Wigner semi-circle
The proof extends to slightly dependent and
inhomogeneous entries
The convergence is fast, believed to be of order
1/N, but proved only at a lower rate

Convergence in N, with T/N = 2 fixed

64
N = 20, T/N = 2
The red curve is the limit Wishart distribution
65
N = 50, T/N = 2
66
N = 100, T/N = 2
67
N = 200, T/N = 2
68
N = 500, T/N = 2
69
N = 1000, T/N = 2
70

Evolution of the distribution with T/N, with
N = 1000 fixed

71
The quadratic limit
N = 1000, T/N = 1
72
N = 1000, T/N = 1.2
73
N = 1000, T/N = 2
74
N = 1000, T/N = 3
75
N = 1000, T/N = 5
76
N = 1000, T/N = 10
77
Scalar product of the eigenvectors belonging to
the j-th eigenvalue of the matrices for different
samples.
78
Eigenvector components

The same applies as in the Wigner case: the
eigenvectors in the bulk are random, the one
outside is delocalized

79
Distribution of the eigenvector components, if no
dominant eigenvalue exists.
80
Market model
Underlying distribution is Wishart
N = 100, T/N = 2, ρ = 0.1
81
N = 200, T/N = 2
82
N = 500, T/N = 2
83
N = 1000, T/N = 2
84
Scalar product of the eigenvectors belonging to
the largest eigenvalue of the matrix. The larger
the first eigenvalue, the closer the scalar
products to 1.
85
Distribution of the eigenvector components, if no
dominant eigenvalue exists.
N = 1000, T/N = 2, ρ = 0.1
86
Distribution of the eigenvector components, if
one of the eigenvalues is not typical for random
matrices.
N = 1000, T/N = 2, ρ = 0.1
87
N = 1000, T/N = 2, ρ = 0.1
Distribution of the eigenvector components, if
one of the eigenvalues is not typical for random
matrices.
88
N = 1000, T/N = 2, ρ = 0.5
89
N = 1000, T/N = 2, ρ = 0.9
The interval becomes narrower as correlation
increases.
90
III. FILTERING OF NORMAL PORTFOLIOS
91
Some key points

Laloux et al. and Plerou et al. demonstrate the
effect of noise on the spectrum of the
correlation matrix C. This is not directly
relevant for the risk in the portfolio. We wanted
to study the effect of noise on a measure of
risk.

92
Optimization vs. risk management

There is a fundamental difference between the two
kinds of uses of the covariance matrix σ: for
optimization and for risk measurement.
Where do people use σ for portfolio selection at
all?
- Goldman Sachs technical document
- tracking portfolios, benchmarking, shrinkage
- capital allocation (EWRM)
- hidden in software

93
Optimization

When σ is used for optimization, we need a lot
more information, because we are comparing
different portfolios.
To get the optimal portfolio, we need to invert σ,
and as it has small eigenvalues, the error gets
amplified.

94
Risk measurement management - regulatory
capital calculation

Assessing risk in a given portfolio: no need to
invert σ, so the problem of measurement error is
much less serious

95
A measure of the effect of noise

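Assume we know the true covariance matrix $\sigma^{(0)}$ and
the noisy one $\sigma^{(1)}$. Then a natural, though not
unique,
measure of the impact of noise is
$q_0 = \sqrt{ \sum_{i,j} w_i^{(1)} \sigma^{(0)}_{ij} w_j^{(1)} / \sum_{i,j} w_i^{(0)} \sigma^{(0)}_{ij} w_j^{(0)} }$,
where $w^{(0)}$ and $w^{(1)}$ are the optimal weights corresponding
to $\sigma^{(0)}$ and $\sigma^{(1)}$, respectively.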
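
A hedged sketch of this measure, assuming it is the ratio of the true risk of the portfolio optimized on the noisy matrix to the true risk of the portfolio optimized on the true matrix (both minimal-risk portfolios under the budget constraint only). NumPy is assumed; the function names and the 3x3 example matrices are illustrative.

```python
import numpy as np

def min_variance_weights(cov):
    """Minimal-risk weights under the budget constraint sum(w) = 1."""
    x = np.linalg.solve(cov, np.ones(cov.shape[0]))
    return x / x.sum()

def q0(true_cov, noisy_cov):
    """True risk of the noisy optimum relative to the true optimum."""
    w_true = min_variance_weights(true_cov)
    w_noisy = min_variance_weights(noisy_cov)
    num = w_noisy @ true_cov @ w_noisy
    den = w_true @ true_cov @ w_true
    return np.sqrt(num / den)

# Illustrative matrices: the "noisy" one misjudges two of the variances.
true_cov = np.eye(3)
noisy_cov = np.diag([1.0, 2.0, 0.5])
q_noisy = q0(true_cov, noisy_cov)
q_exact = q0(true_cov, true_cov)
```

By construction the ratio cannot fall below 1, since the weights derived from the true matrix minimize the true risk.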

96
We will mostly use simulated data

The rationale behind this is that in order to be
able to compare the efficiency of filtering
methods (and later also the sensitivity of risk
measures to noise) we better get rid of other
sources of uncertainty, like non-stationarity.
This can be achieved by using artificial data
where we have total control over the underlying
stochastic process

97
The model-simulation approach

Our strategy is to choose various model
covariance matrices $\sigma^{(0)}$ and generate N long
simulated time series by them. Then we cut
segments of length T from these time series, as
if observing them on the market, and try to
reconstruct the covariance matrices from them. We
optimize a portfolio both with the true and
with the observed covariance matrix and
determine the measure $q_0$.

The models are chosen to mimic at least some of
the characteristic features of real markets. Four
simple models of slightly increasing complexity
will be considered

99
Model 1: the unit matrix

Spectrum:
$\lambda = 1$, N-fold degenerate
Noise will split this
into a band
100
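Model 2: single-index

Correlation matrix: 1 on the diagonal, $\rho$ everywhere else
Singlet: $\lambda_1 = 1 + \rho(N-1) \sim O(N)$,
eigenvector (1,1,1,...)
$\lambda_2 = 1 - \rho \sim O(1)$,
(N-1)-fold degenerate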
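
The spectrum of the single-index matrix (1 on the diagonal, ρ everywhere else) can be checked directly. A minimal sketch assuming NumPy; N = 6 and ρ = 0.2 are arbitrary illustrative values.

```python
import numpy as np

N, rho = 6, 0.2
# 1 on the diagonal, rho elsewhere.
C = np.full((N, N), rho) + (1.0 - rho) * np.eye(N)

lam, vec = np.linalg.eigh(C)   # eigenvalues in ascending order
largest = lam[-1]              # singlet: 1 + rho * (N - 1) = 2.0
rest = lam[:-1]                # (N - 1)-fold degenerate: 1 - rho = 0.8
v1 = vec[:, -1]                # proportional to (1, 1, ..., 1)
```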
101
The economic content of the single-index model

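$r_i = \beta_i r_M + \varepsilon_i$, with $r_M$ the market return with
standard deviation $\sigma_M$ and $\varepsilon_i$ the asset-specific noise
The covariance matrix implied by the above:
$\sigma_{ij} = \beta_i \beta_j \sigma_M^2$ for $i \ne j$
The assumed structure reduces the number of parameters to
N.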
If nothing depends on i then this is just the
caricature Model 2.

102
Model 3: market + sectors

Spectrum: a singlet and two degenerate groups of
eigenvalues
This structure has also been studied by economists
103
Model 4: Semi-empirical

Suppose we have very long time series (T') for
many assets (N').
Choose N < N' time series randomly and derive Cº
from these data. Generate time series of length
T << T' from Cº.
The error due to T is much larger than that due
to T'.

104
How to generate time series?

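Given independent standard normal $x_{it}$
Given $\sigma^{(0)}$
Define L (real, lower triangular) matrix such
that $\sigma^{(0)} = L L^T$
(Cholesky)
Get $y_{it} = \sum_j L_{ij} x_{jt}$
Empirical covariance matrix $\sigma^{(1)}$ will be different
from $\sigma^{(0)}$. For fixed N, and $T \to \infty$, $\sigma^{(1)} \to \sigma^{(0)}$.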
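
The generation step can be sketched as follows, assuming NumPy. The true matrix used here (unit variances, common correlation 0.3) and the sizes are illustrative assumptions; any positive definite matrix would do.

```python
import numpy as np

rng = np.random.default_rng(3)
N, T, rho = 4, 200_000, 0.3

# Illustrative true matrix: unit variances, common correlation rho.
true_cov = np.full((N, N), rho) + (1.0 - rho) * np.eye(N)

L = np.linalg.cholesky(true_cov)   # lower triangular, L @ L.T equals true_cov
x = rng.standard_normal((N, T))    # independent standard normal variables
y = L @ x                          # correlated series with covariance true_cov

emp_cov = y @ y.T / T              # empirical matrix, close to true_cov for large T
max_err = np.abs(emp_cov - true_cov).max()
```

Cutting a short segment of length T out of `y` and recomputing `emp_cov` gives the noisy matrix that the filtering methods are tested on.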

105

We look for the minimal risk portfolio for both
the true and the empirical covariances and
determine the measure $q_0$

106
We get numerically for Model 1 the following
scaling result
107
This confirms the expected scaling in N/T. The
corresponding analytic result

$q_0 = \frac{1}{\sqrt{1 - N/T}}$

can easily be derived for Model 1. It is valid
within O(1/N) corrections also for more general
models.
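
The scaling can be reproduced with a short simulation for Model 1 (unit true covariance matrix). This sketch assumes NumPy, Gaussian data, the minimal-risk portfolio with the budget constraint only, and the analytic form q0 = 1/sqrt(1 - N/T); the sizes N = 50, T = 100 and the number of samples are arbitrary.

```python
import numpy as np

def min_variance_weights(cov):
    x = np.linalg.solve(cov, np.ones(cov.shape[0]))
    return x / x.sum()

rng = np.random.default_rng(4)
N, T, n_samples = 50, 100, 200
true_cov = np.eye(N)                       # Model 1
w_true = min_variance_weights(true_cov)
risk_true = w_true @ true_cov @ w_true

q0_sq = []
for _ in range(n_samples):
    x = rng.standard_normal((N, T))        # one observed segment of length T
    w = min_variance_weights(x @ x.T / T)  # optimum for the noisy matrix
    q0_sq.append((w @ true_cov @ w) / risk_true)

q0_sim = np.sqrt(np.mean(q0_sq))
q0_theory = 1.0 / np.sqrt(1.0 - N / T)     # sqrt(2) for N/T = 0.5
```

Repeating the run for several values of N and T with the same ratio N/T should give nearly the same `q0_sim`, which is the scaling statement above.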

108
The same in a risk measurement context

Given fixed $w_i$'s. Choose $\sigma^{(0)}$ to generate data.
Measure $\sigma^{(1)}$ from finite T time series.
Calculate
It can be shown that , for

109
Filtering

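Single-index filter:
Spectral decomposition of the correlation matrix;
the largest eigenvalue and its eigenvector are kept,
the remaining eigenvalues are replaced by a common
constant, to be chosen so as to
preserve the trace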
110
Random matrix filter

where the constant replacing the eigenvalues below
$\lambda_{max}$ is to be chosen to preserve the trace
again
and $\lambda_{max}$ is the upper edge of
the random band.
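
A possible implementation of this kind of filter, as a sketch only. It assumes NumPy, a correlation matrix as input, and the Marchenko-Pastur value (1 + sqrt(N/T))² for the upper edge of the random band (unit-variance case); the function name `rmt_filter` and the single-index test data are illustrative. Note that this simple version does not reset the diagonal of the filtered matrix to 1.

```python
import numpy as np

def rmt_filter(corr, n_obs):
    """Keep eigenvalues above the random band, flatten the rest."""
    n_assets = corr.shape[0]
    lam, vec = np.linalg.eigh(corr)
    lam_max = (1.0 + np.sqrt(n_assets / n_obs)) ** 2   # assumed band edge
    noise = lam <= lam_max
    filtered = lam.copy()
    if noise.any():
        # One common constant, chosen so that the trace is unchanged.
        filtered[noise] = lam[noise].mean()
    # Rebuild the matrix from the original eigenvectors.
    return (vec * filtered) @ vec.T

# Illustrative data from a single-index correlation matrix.
rng = np.random.default_rng(5)
N, T, rho = 100, 200, 0.3
true_corr = np.full((N, N), rho) + (1.0 - rho) * np.eye(N)
y = np.linalg.cholesky(true_corr) @ rng.standard_normal((N, T))
emp_corr = np.corrcoef(y)
filt_corr = rmt_filter(emp_corr, T)
```

In this example only the market eigenvalue lies above the band edge, so the filtered matrix has one large eigenvalue and an (N-1)-fold degenerate remainder, the same structure as the true matrix.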

111
Covariance estimates

after filtering we get
and
Similarly for the other models. We compare results
in the following figures

112
Results for the market + sectors model
113
Results for the semi-empirical model
114
Comments on the efficiency of filtering techniques

Results depend on the model used for Cº.
Market model: without filtering, still scales with
T/N, singular at T/N = 1; with filtering, much
improved (filtering technique matches structure),
can go even below T = N.
Market + sectors: strong dependence on parameters;
RMT filtering outperforms the other two.
Semi-empirical: data are scattered, RMT wins in
most cases.

115

Filtering is very powerful in suppressing noise,
particularly when it matches the underlying
structure.
Is there information buried in the random band?
With T increasing, more and more eigenvalues
crawl out from below the upper random band
edge.
How to dig out information buried in the random
band?
Promising steps by various groups (Z. Burda, A.
Görlich, A. Jarosz and J. Jurkiewicz,
cond-mat/0305627, and Z. Burda and J. Jurkiewicz,
cond-mat/0312496, Jagellonian University, Cracow;
Th. Guhr, Lund University; P. Repetowicz, P.
Richmond and S. Hutzler, Trinity College, Dublin;
G. Papp, Sz. Pafka, M.A. Nowak, and I.K.,
Budapest and Cracow, etc.)

116
IV. BEYOND THE STATIONARY GAUSSIAN WORLD
117

Real-life time series are neither stationary
(volatility clustering, changing economic or
legal environment, etc.), nor Gaussian (fat
tails)
For long-tailed distributions the variance is not
an appropriate risk measure (even when it
exists): minimizing the variance may actually
increase rather than decrease risk.

118
One step towards reality: Non-stationary case

Volatility clustering → ARCH, GARCH, integrated
GARCH → EWMA (Exponentially Weighted Moving
Averages) in RiskMetrics
t: actual time
T: window
α: attenuation factor ($T_{eff} \sim -1/\log \alpha$), the
rate of forgetting
119

RiskMetrics: $\alpha_{optimal} = 0.94$,
memory of a few months, total weight of data
preceding the last 75 days is < 1%.
Because of the short effective time cutoff,
filtering is even more important than before.
Carol Alexander applied standard principal
component analysis.
RMT helps choosing the number of principal
components in an objective manner.
For the application of RMT we need the upper edge
of the random band for exponentially weighted
random matrices

120
Exponentially weighted Wishart matrices
121
Sz. Pafka, M. Potters, and I.K. submitted to
Quantitative Finance, e-print cond-mat/0402573

Density of eigenvalues
where v is the solution to

122
Spectra of exponentially weighted and standard
Wishart matrices
123

The RMT filtering wins again: better than plain
EWMA and better than plain MA.
There is an optimal α (too long memory will
include nonstationary effects, too short memory
loses data).
The optimal α (for N = 100) is 0.996
>> RiskMetrics α.

124
Alternative risk measures
125
Risk measures in practice: VaR

VaR (Value at Risk) is a high (95% or 99%)
quantile, a threshold beyond which a given
fraction (5% or 1%) of the statistical weight
resides.
Its merits (relative to the Greeks, e.g.):
- universal: can be applied to any portfolio
- probabilistic content: associated to the
distribution
- expressed in money
Widespread across the whole industry and
regulation. Has been promoted from a diagnostic
tool to a decision tool.
Its lack of convexity prompted the search for coherence

126
Risk measures implied by regulation

Banks are required to set aside capital as a
cushion against risk
Minimal capital requirements are fixed by
international regulation (Basel I and II, Capital
Adequacy Directive of the EEC): the magic 8%
Standard model vs. internal models
Capital charges assigned to various positions in
the standard model purport to cover the risk in
those positions, therefore, they must be regarded
as some kind of implied risk measures
These measures are trying to mimic variance by
piecewise linear approximants. They are quite
arbitrary, sometimes concave and unstable

127
An example: Specific risk of bonds
CAD, Annex I, 14: The capital requirement of
the specific risk (due to issuer) of bonds is
Iso-risk surface of the specific risk of bonds
128
Another example: Foreign exchange
According to Annex III, 1 (CAD 1993, Official
Journal of the European Communities, L14, 1-26),
the capital requirement is given in terms of the
gross and the net position.
The iso-risk surface of the foreign exchange
portfolio
129
Mean absolute deviation (MAD)
Some methodologies (e.g. Algorithmics) use the
mean absolute deviation rather than the standard
deviation to characterize the fluctuation of
portfolios. The objective function to minimize is
then $\frac{1}{T} \sum_t \left| \sum_i w_i r_{it} \right|$
instead of $\sqrt{\sum_{i,j} w_i \sigma_{ij} w_j}$.
The iso-risk surfaces of MAD are polyhedra again.
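
The two objective functions can be compared on a toy sample. The sketch assumes NumPy and zero-mean returns stored as an array of shape (N, T); the function names and the numbers are illustrative.

```python
import numpy as np

def mean_absolute_deviation(weights, returns):
    """(1/T) * sum_t |sum_i w_i r_it| for returns of shape (N, T)."""
    return np.abs(weights @ returns).mean()

def standard_deviation(weights, returns):
    """sqrt(w' sigma w) with the zero-mean sample covariance matrix."""
    cov = returns @ returns.T / returns.shape[1]
    return np.sqrt(weights @ cov @ weights)

# Two assets, four observations (made-up numbers).
returns = np.array([[1.0, -1.0, 2.0, -2.0],
                    [0.0,  0.0, 0.0,  0.0]])
w = np.array([1.0, 0.0])
mad = mean_absolute_deviation(w, returns)   # (1 + 1 + 2 + 2) / 4 = 1.5
std = standard_deviation(w, returns)        # sqrt(10 / 4), about 1.58
```

The absolute value makes the MAD objective piecewise linear in the weights, which is why its level surfaces are polyhedra rather than ellipsoids.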
130
Effect of noise on absolute deviation-optimized
portfolios
We generate artificial time series (say iid
normal), determine the true abs. deviation and
compare it to the measured one
We get
131
Noise sensitivity of MAD

The result scales in T/N (same as with the
variance). The optimal portfolio, other things
being equal, is more risky than in the
variance-based optimization.
Geometrical interpretation: The level surfaces of
the variance are ellipsoids. The optimal portfolio
is found as the point where this risk-ellipsoid
first touches the plane corresponding to the
budget constraint. In the absolute deviation case
the ellipsoid is replaced by a polyhedron, and
the solution occurs at one of its corners. A
small error in the specification of the
polyhedron makes the solution jump to another
corner, thereby increasing the fluctuation in the
portfolio.

132
(No Transcript)
133
Filtering for MAD (??)

The absolute deviation-optimized portfolios can
be filtered, by associating a covariance matrix
with the time series, then filtering this matrix
(by RMT, say), and generating a new time series
via this reduced matrix. This (admittedly
fortuitous) procedure significantly reduces the
noise in the absolute deviation.
Note that this risk measure can be used in the
case of non-Gaussian portfolios as well.

134
Expected shortfall (ES) optimization

ES is the mean loss beyond a high threshold
defined in probability (not in money). For
continuous pdf's it is the same as the
conditional expectation beyond the VaR quantile.
ES is coherent (in the sense of Artzner et al.)
and as such it is strongly promoted by a group of
academics. In addition, Uryasev and Rockafellar
have shown that its optimization can be reduced to
linear programming for which extremely fast
algorithms exist.
ES-optimized portfolios tend to be much noisier
than either of the previous ones. One reason is
the instability related to the (piecewise) linear
risk measure, the other is that a high quantile
sacrifices most of the data.
In addition, ES optimization is not always
feasible!
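
To make the "loss of data" point concrete, the sketch below computes an empirical VaR and ES from a sample of losses. It assumes NumPy and the simplest historical estimators (a sample quantile and the mean of the losses at or beyond it); the function name and the toy sample are illustrative, and other estimators exist.

```python
import numpy as np

def var_es(losses, level=0.95):
    """Empirical VaR (quantile of the losses) and ES (mean loss beyond it)."""
    losses = np.asarray(losses, dtype=float)
    var = np.quantile(losses, level)
    tail = losses[losses >= var]
    return var, tail.mean(), tail.size

# Toy sample: losses 1, 2, ..., 100.
losses = np.arange(1, 101)
var95, es95, n_tail = var_es(losses, 0.95)
```

At the 95% level only the 5 largest of the 100 observations enter the ES estimate, so the effective sample is a small fraction of T.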

135
Before turning to the discussion of the
feasibility problem, let us compare the noise
sensitivity of the following risk measures:
standard deviation, absolute deviation and
expected shortfall (at 95%). For the sake of
comparison we use the same (Gaussian) input data
of length T for each, determine the minimal risk
portfolio under these risk measures and compare
the error due to noise.
136
The next slides show

plots of $w_i$ (portfolio weights) as a function of i
display of q0 (ratio of risk of optimal portfolio
determined from time series information vs full
information)
results show that the effect of estimation noise
can be significant and more advanced risk
measures are more demanding for information (in
the portfolio optimization context)

137-140
(No Transcript)
141

the suboptimality (q0) scales in T/N (for large N
and T)

142
Risk measures in risk measurement (as opposed to
portfolio optimization)

In the context of risk measurement of given
(fixed) portfolios, the estimation error is much
smaller; it usually scales as $1/\sqrt{T}$,
independently of N!
The next slides show the histogram of measured
risk/true risk for different risk measures
(T = 500, 1000); the mean is 1 and the estimation
error is usually within 5-10%, i.e. negligible
compared to the portfolio optimization context

143-144
(No Transcript)
145
The essence of the feasibility problem

For T < N, there is no solution to the portfolio
optimization problem under any of the risk
measures considered here.
For T > N, there always is a solution under
the variance and MAD, even if it is bad for T not
large enough. In contrast, under ES (and WL to be
considered later), there may or may not be a
solution for T > N, depending on the sample. The
probability of the existence of a solution goes
to 1 only for T/N going to infinity.
The problem does not appear if short selling is
banned

146
Feasibility of optimization under ES
Probability of the existence of an optimum under
CVaR. Φ is the standard normal distribution. Note
the scaling in $N/\sqrt{T}$.
147
A pessimistic risk measure: worst loss

In order to better understand the feasibility
problem, select the worst return in time and
minimize this over the weights:
$\min_w \max_t \left( -\sum_i w_i r_{it} \right)$
subject to $\sum_i w_i = 1$.
This risk measure is coherent, one of Acerbi's
spectral measures.
For T < N there is no solution.
The existence of a solution for T > N is a
probabilistic issue again, depending on the time
series sample.

148
Why is the existence of an optimum a random event?

To get a feeling, consider N = T = 2.
The two planes
intersect the plane of the budget constraint in
two straight lines. If one of these is
decreasing, the other is increasing with $w_1$,
then there is a solution, if both increase or
decrease, there is not. It is easy to see that
for elliptical distributions the probability of
there being a solution is ½.

149
Probability of the feasibility of the minimax
problem

For T > N the probability of a solution (for an
elliptical underlying pdf) is
$p = \frac{1}{2^{T-1}} \sum_{k=N-1}^{T-1} \binom{T-1}{k}$
(The problem is isomorphic to some problems in
operations research and random geometry.)
For N and T large, p goes over into the error
function and scales in $N/\sqrt{T}$.
For $T \to \infty$, $p \to 1$.
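
The probability can be tabulated with the standard library alone. The sketch assumes the binomial form p = 2^-(T-1) * sum_{k=N-1}^{T-1} C(T-1, k), which reproduces the value ½ quoted for N = T = 2; the formula itself is not in the transcript and the function name is hypothetical.

```python
from math import comb

def feasibility_probability(n_assets, n_obs):
    """p = 2**-(T-1) * sum_{k=N-1}^{T-1} C(T-1, k); zero when T < N."""
    if n_obs < n_assets:
        return 0.0
    total = sum(comb(n_obs - 1, k) for k in range(n_assets - 1, n_obs))
    return total / 2 ** (n_obs - 1)

p_22 = feasibility_probability(2, 2)          # 0.5
p_23 = feasibility_probability(2, 3)          # 0.75
p_short = feasibility_probability(5, 3)       # 0.0, since T < N
p_long = feasibility_probability(10, 1000)    # very close to 1
```

In this form p is the probability that a fair binomial variable with T - 1 trials reaches at least N - 1, which explains why it approaches 1 only when T is large compared to N.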

150
Probability of the existence of a solution under
maximum loss. Φ is the standard normal
distribution. Scaling is in $N/\sqrt{T}$ again.
151-164
(No Transcript)
165
Concluding remarks

Due to the large number of assets in typical bank
portfolios and the limited amount of data, noise
is an all pervasive problem in portfolio theory.
It can be efficiently filtered by a variety of
techniques from portfolios optimized under
variance.
RMT is (one of) the latest of these filtering or
dimensional reduction techniques. It is quite
competitive with existing alternatives already,
shows enhanced performance when applied in
conjunction with extra information about the
structure of the market, and holds great promise
for resolving the spectrum under the upper edge
of the random band.
Unfortunately, variance is not an adequate risk
measure for fat-tailed pdfs.
Piecewise linear risk measures show instability
(jumps) in a noisy environment.
Risk measures focusing on the far tails show
additional sensitivity to noise, due to loss of
data.
The two coherent measures we have studied display
large sample-to-sample fluctuations and
feasibility problems under noise. This may cast a
shade of doubt on their applications.