


THE INSTABILITY OF RISK MEASURES
The problem of estimation error in complex systems

- Imre Kondor
- Collegium Budapest and Eötvös University, Budapest
- European Conference on Complex Systems 2009 (ECCS 09)
- University of Warwick, Coventry, UK
- 21-25 September, 2009

Coworkers

- Szilárd Pafka (ELTE PhD student → CIB Bank → Paycom.net, California)
- Gábor Nagy (Debrecen University PhD student and CIB Bank, Budapest)
- Richárd Karádi (Technical University MSc student → Procter & Gamble)
- Nándor Gulyás (ELTE PhD student → Budapest Bank → Lombard Leasing → private entrepreneur)
- István Varga-Haszonits (ELTE PhD student → Morgan Stanley)

Contents

- The subject of the talk lies at the crossroads of finance, statistical physics and computer science
- I. The investment problem: portfolios, rational portfolio selection, risk measures, the problem of estimation error (noise), noise sensitivity of risk measures, instability of risk measures
- II. The wider context: ramifications in model building for complex systems, computational complexity, critical phenomena, estimation error as a critical phenomenon, machine learning, statistics in high dimensions


I. THE INVESTMENT PROBLEM

A portfolio

- is a combination of assets or investment instruments (shares, bonds, foreign exchange, derivatives, precious metals, commodities, artworks, property, etc.). More generally, the various business lines of a big firm, or even the economy as a whole, can also be regarded as a portfolio. The generic problem is how to allocate available resources.


Rational portfolio selection

- The value of assets fluctuates.
- It is dangerous to invest all our money into a single asset.
- Investment should be diversified, distributed among the various assets.
- More risky assets tend to yield higher return.
- Some assets tend to fluctuate together, some others in an opposite way.
- Rational portfolio selection seeks a tradeoff between risk and reward.


Risk and reward

- Financial reward can be measured in terms of the return (relative price change) or the log return.
- The characterization of risk is more controversial.
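As a quick numerical illustration (the price series below is made up, not from the talk), the two notions of reward can be computed directly; note that log returns, unlike simple returns, add up over time:

```python
import math

prices = [100.0, 104.0, 101.0, 105.0]  # hypothetical daily closing prices

# simple (relative) returns: r_t = (P_t - P_{t-1}) / P_{t-1}
simple = [(p1 - p0) / p0 for p0, p1 in zip(prices, prices[1:])]

# log returns: r_t = ln(P_t / P_{t-1})
logret = [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]

# log returns are additive: their sum is the log of the total price change
assert abs(sum(logret) - math.log(prices[-1] / prices[0])) < 1e-12
```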

Risk measures

- A risk measure is a quantitative characterization of our intuitive concept of risk (fear of uncertainty and loss).
- Risk is related to the stochastic nature of returns. Mathematically, it is (or should be) a convex functional of the pdf of returns.
- The appropriate choice may depend on the nature of the data (e.g. on their asymptotics) and on the context (investment, risk management, benchmarking, tracking, regulation, capital allocation).


The most obvious choice for a risk measure

Variance

- Variance is the average squared deviation from the mean, a time-honoured statistical tool.
- Its use assumes that the probability distribution of the returns is sufficiently concentrated around the average, that there are no large fluctuations.
- This is true in several instances, but we often encounter fat tails: huge deviations with a non-negligible probability (e.g. Black Monday).


Alternative risk measures

- There are several alternative risk measures in the academic literature, practice, and regulation:
- Value at Risk (VaR): the best among the p% worst losses (not convex, punishes diversification)
- Mean absolute deviation (MAD): used by Algorithmics
- Coherent risk measures (promoted by academics)
- Expected Shortfall (ES): average loss beyond a high threshold
- Maximal Loss (ML): the single worst case
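To make these definitions concrete, here is a minimal sketch of the historical estimators of VaR, ES, and ML; the Gaussian loss sample and the 95% confidence level are illustrative choices, not taken from the talk:

```python
import numpy as np

rng = np.random.default_rng(0)
losses = rng.normal(0.0, 1.0, size=1000)  # hypothetical portfolio losses (positive = loss)

def hist_var(losses, alpha=0.95):
    """Historical VaR: the loss level exceeded only in the worst (1 - alpha) fraction of cases."""
    return np.quantile(losses, alpha)

def hist_es(losses, alpha=0.95):
    """Historical Expected Shortfall: the average loss beyond the VaR threshold."""
    v = hist_var(losses, alpha)
    return losses[losses >= v].mean()

def max_loss(losses):
    """Maximal Loss: the single worst outcome in the sample."""
    return losses.max()

# sanity check: ES averages the tail beyond VaR, and ML is the tail's endpoint
assert hist_var(losses) <= hist_es(losses) <= max_loss(losses)
```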


Portfolios

- A portfolio is a linear combination (a weighted average) of assets, with a set of weights wi that add up to unity (the budget constraint).
- The weights are not necessarily positive: short selling.
- The fact that the weights can be negative means that the region over which we are trying to determine the optimal portfolio is not bounded.


The variance of a portfolio

- is a quadratic form of the weights. The coefficients of this form are the elements of the covariance matrix that measures the co-movements between the various assets.
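In symbols, the portfolio variance is the quadratic form w'Σw, with Σ the covariance matrix of the asset returns. A minimal check on simulated data (the return series and weights below are made up):

```python
import numpy as np

rng = np.random.default_rng(1)
T, N = 500, 3
x = rng.normal(size=(T, N))      # hypothetical return series: T observations, N assets
w = np.array([0.5, 0.3, 0.2])    # portfolio weights, adding up to unity

sigma = np.cov(x, rowvar=False)  # sample covariance matrix of the assets

# the quadratic form w' Sigma w ...
port_var = w @ sigma @ w

# ... equals the sample variance of the aggregated portfolio return
assert np.isclose(port_var, np.var(x @ w, ddof=1))
```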

Markowitz portfolio selection theory

- Rational portfolio selection realizes the tradeoff between risk and reward by minimizing the risk functional (e.g. the variance) over the weights, given the expected return, the budget constraint, and possibly other constraints. Here we will consider the global minimum risk portfolio, omitting the constraint on the expected return.
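For the variance, the global minimum risk portfolio has the well-known closed form w* = Σ⁻¹1 / (1'Σ⁻¹1), obtained by minimizing w'Σw subject only to the budget constraint. A sketch with a hypothetical covariance matrix:

```python
import numpy as np

sigma = np.array([[0.04, 0.01, 0.00],
                  [0.01, 0.09, 0.02],
                  [0.00, 0.02, 0.16]])  # hypothetical covariance matrix

ones = np.ones(len(sigma))
w = np.linalg.solve(sigma, ones)       # Sigma^{-1} 1
w /= w.sum()                           # normalize to satisfy the budget constraint 1'w = 1

# optimality check: any budget-preserving perturbation cannot lower the variance
rng = np.random.default_rng(2)
d = rng.normal(size=3)
d -= d.mean()                          # perturbation direction sums to zero, so w + eps*d stays on budget
base = w @ sigma @ w
assert all((w + eps * d) @ sigma @ (w + eps * d) >= base - 1e-12
           for eps in (-0.1, 0.1))
```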

Information parsimony

- If we do not have enough information, we cannot make a good decision.
- In the context of portfolio selection this embarrassing truism translates into the requirement that the sample size (the length of the time series) T must be much larger than the size of the portfolio (the number of assets) N, in order for us to be able to construct a good portfolio.
- For a large portfolio this condition is not easy to satisfy (the sampling frequency cannot be high, T cannot be very large).
- Therefore, in real life N and T may well be of the same order of magnitude, and it is appropriate to consider the limit where N/T is of the order of unity, while both N and T are large (go to infinity).
- In this limit we should expect large estimation errors: the optimal portfolio weights will be very unstable, there will be huge sample-to-sample fluctuations, and huge prediction errors.
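A small simulation (a toy experiment, not the authors' computation) illustrates the effect for the minimum-variance portfolio: with i.i.d. unit-variance returns the true covariance is the identity and the true optimum is the equal-weight portfolio, so the true risk carried by the sample-optimized weights can be compared with the true minimum, and the gap blows up as N/T approaches 1:

```python
import numpy as np

def risk_inflation(N, T, trials=30, seed=0):
    """Average true variance of the sample-optimal minimum-variance portfolio,
    relative to the true optimum, for i.i.d. unit-variance returns (true covariance = I)."""
    rng = np.random.default_rng(seed)
    ratios = []
    for _ in range(trials):
        x = rng.normal(size=(T, N))
        s = np.cov(x, rowvar=False)            # noisy covariance estimate
        w = np.linalg.solve(s, np.ones(N))
        w /= w.sum()                           # sample-optimal weights (budget constraint)
        # with true covariance I, the true risk of w is w'w;
        # the true optimum (equal weights) has risk 1/N
        ratios.append((w @ w) / (1.0 / N))
    return float(np.mean(ratios))

# the estimation error grows without bound as N/T approaches 1
assert risk_inflation(N=10, T=100) < risk_inflation(N=50, T=100) < risk_inflation(N=90, T=100)
```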


- In I. K., Sz. Pafka, G. Nagy: Noise sensitivity of portfolio selection under various risk measures, Journal of Banking and Finance, 31, 1545-1573 (2007) we found:
- If there are no constraints on the portfolio weights other than the budget constraint, the fluctuations actually diverge, that is, the estimation error becomes infinite, at a critical value of the ratio N/T.
- The critical value of N/T depends on the risk measure in question.
- For the variance and the mean absolute deviation the critical ratio is (N/T)crit = 1; for Maximal Loss (ML, the best combination of the worst losses) (N/T)crit = ½.
- If the risk measure in question depends on a parameter α, then the critical N/T value will also depend on that parameter, and we obtain a critical curve (a phase diagram) on the (α, N/T) plane.
- For example, Expected Shortfall is the average loss above a high threshold α. (ML is the α → 1 limit of ES.) The phase boundary for ES runs below ½ (the critical ratio N/T < ½ for any α).


- In addition, for finite N and T, the portfolio optimization problem for ES and ML does not always have a solution even below the critical N/T ratio! (These risk measures may become unbounded.)
- For finite N and T, the existence of the optimum is a probabilistic issue; it depends on the sample. The probability of the existence of the solution has been determined analytically for ML, and numerically for ES.
- As N and T → ∞ with N/T fixed, this probability goes to 1 or 0, respectively, according to whether N/T is below or above (N/T)crit.


Illustration: the case of Maximal Loss

- Definition of the problem (for simplicity, we are looking for the global minimum and allow unlimited short selling)
- where the ws are the portfolio weights and the xs the returns.
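In standard notation, the minimax problem referred to above can be written as follows (a reconstruction consistent with the surrounding text, with w_i the portfolio weights and x_it the return of asset i at time t):

```latex
\mathrm{ML} \;=\; \min_{\{w_i\}} \; \max_{1 \le t \le T} \Bigl( -\sum_{i=1}^{N} w_i \, x_{it} \Bigr)
\qquad \text{subject to} \qquad \sum_{i=1}^{N} w_i = 1 .
```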

Probability of finding a solution for the minimax problem (for elliptic underlying distributions)

In the limit N, T → ∞, with N/T fixed, the transition becomes sharp at N/T = ½.
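The sharp transition can be probed numerically. The sketch below (a toy experiment under assumed i.i.d. Gaussian returns, not the authors' computation) casts the minimax problem as a linear program with an auxiliary worst-loss variable and counts how often the program stays bounded:

```python
import numpy as np
from scipy.optimize import linprog

def solvable_fraction(N, T, trials=60, seed=0):
    """Fraction of i.i.d. Gaussian samples for which the Maximal Loss problem
    min_w max_t(-sum_i w_i x_it), subject to sum_i w_i = 1, has a finite optimum."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(trials):
        x = rng.normal(size=(T, N))
        # variables: (w_1, ..., w_N, u); minimize u subject to -x_t.w - u <= 0 for all t
        c = np.r_[np.zeros(N), 1.0]
        A_ub = np.c_[-x, -np.ones(T)]
        b_ub = np.zeros(T)
        A_eq = np.r_[np.ones(N), 0.0].reshape(1, -1)   # budget constraint on the weights
        res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                      bounds=[(None, None)] * (N + 1))  # weights unbounded: short selling allowed
        hits += (res.status == 0)   # status 3 would mean the LP is unbounded: no solution
    return hits / trials

# below N/T = 1/2 a solution almost always exists; above it, almost never
assert solvable_fraction(N=4, T=20) > 0.8 > 0.2 > solvable_fraction(N=16, T=20)
```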

- The phase boundary for ES has been obtained numerically in I. K., Sz. Pafka, G. Nagy: Noise sensitivity of portfolio selection under various risk measures, Journal of Banking and Finance, 31, 1545-1573 (2007), and calculated analytically in A. Ciliberti, I. K., and M. Mézard: On the Feasibility of Portfolio Optimization under Expected Shortfall, Quantitative Finance, 7, 389-396 (2007).

The estimation error diverges as one approaches the phase boundary from below

- The intuitive explanation for the instability of ES and ML is that for a given finite sample there may exist a dominant item (or a dominant combination of items) that produces a larger return at each time point than any of the others, even if no such dominance relationship exists between them on very large samples. This leads the investor to believe that if she goes extremely long in the dominant item and extremely short in the rest, she can produce an arbitrarily large return on the portfolio, at a risk that goes to minus infinity (i.e. no risk).
- The same consideration can be extended to any coherent risk measure.
- Evidently, the effect critically depends on the weights being unbounded. Constraints on short selling and other limits will be considered later.


Coherent measures on a given sample

- Such apparent arbitrage can show up for any coherent risk measure. (I. K. and I. Varga-Haszonits: Feasibility of portfolio optimization under coherent risk measures, submitted to Quantitative Finance)
- Assume that the finite sample estimator of our risk measure satisfies the coherence axioms (Ph. Artzner, F. Delbaen, J. M. Eber, and D. Heath: Coherent measures of risk, Mathematical Finance, 9, 203-228 (1999)).

The formal statements corresponding to the above intuition

- Proposition 1. If there exist two portfolios u and v such that one of them dominates the other on the sample, then the portfolio optimisation task has no solution under any coherent measure.
- Proposition 2. Optimisation under ML has no solution if and only if there exists a pair of portfolios such that one of them strictly dominates the other.
- Neither of these theorems assumes anything about the underlying distribution.


Further generalization

- As a matter of fact, this type of instability appears even beyond the set of coherent risk measures, and may appear in downside risk measures in general.
- By far the most widely used risk measure today is Value at Risk (VaR). It is a downside measure. It is not convex; therefore the stability problem of its historical estimator is ill-posed.
- Parametric VaR, however, is convex, and this allows us to study the stability problem. Along with VaR, we also look into the closely related parametric estimates for two other downside risk measures: ES and semivariance.
- Parametric estimates are expected to be more stable than historical ones. We will then be able to compare the phase diagrams for the historical and parametric ES.


Parametric estimation of VaR, ES, and semivariance

- For simplicity, we assume that the historical data are fitted to a Gaussian underlying process.
- For a Gaussian process all three risk measures can be written in a common closed form involving the error function.
- There is a condition for the existence of an optimum for VaR and ES. Note that there is no unconditional optimum even if we know the underlying process exactly.
- It can be shown that the meaning of the condition is similar to the previous one (think e.g. of a portfolio with one exceptionally high return item that has a variance comparable to the others).
- If we do not know the true process, but assume it is, say, a Gaussian, we may estimate its mean returns and covariances from the observed finite time series.
- Assume, for simplicity, that all the mean returns are zero. After a long and tedious application of the replica method imported from the theory of random systems, a solvability condition is obtained for all three risk measures. Note that this is stronger than the solvability condition for the exactly known process.
- For the semivariance the critical N/T ratio is 1/3, which means that for the parametrically estimated semivariance we need at least three times larger samples than the size of the portfolio.
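For reference, the textbook Gaussian expressions for VaR and ES, presumably the ones behind the slide's error-function notation, read:

```latex
\mathrm{VaR}_\alpha = \Phi^{-1}(\alpha)\,\sigma_w - \mu_w,
\qquad
\mathrm{ES}_\alpha = \frac{\varphi\!\left(\Phi^{-1}(\alpha)\right)}{1-\alpha}\,\sigma_w - \mu_w,
\qquad
\Phi(x) = \tfrac12\Bigl(1 + \operatorname{erf}\bigl(x/\sqrt{2}\bigr)\Bigr),
```

where Φ is the standard normal distribution function, φ its density, and μ_w, σ_w the mean and standard deviation of the portfolio return. Both are of the form ψ(α)σ_w − μ_w with a measure-specific factor ψ(α).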

For the parametric VaR and ES the result is shown in the figure.

- In the region above the respective phase boundaries the optimization problem does not have a solution.
- In the region below the phase boundary there is a solution, but for it to be a good approximation to the true risk we must go deep into the feasible region. If we go to the phase boundary from below, the estimation error diverges.
- The phase boundary for ES runs above that of VaR, so for a given confidence level α the critical ratio for ES is larger than for VaR (we need less data in order to have a solution). For practically important values of α (95-99%) the difference is not significant.


Parametric vs. historical estimates

- The parametric ES curve runs above the historical

one: we need less data to have a solution when

the risk is estimated parametrically than when we

use raw historical data. It seems as if we had

some additional information in the parametric

approach. - Where does this information come from?
- It is injected into the calculation by hand

when fitting the data to an independently chosen

probability distribution.

Adding linear constraints

- In practice, portfolio optimization is always

subject to some constraints on the allowed range

of the weights, such as a ban on short selling

and/or limits on various assets, industrial

sectors, regions, etc. These constraints restrict

the region over which the optimum is sought to a

finite volume where no infinite fluctuations can

appear. One might then think that under such

constraints the instability discussed above

disappears completely.

- This is not so. If we work in the vicinity of the

phase boundary, sample to sample fluctuations in

the weights will still be large, but the

constraints will prevent the solution from

running away to infinity. Instead, it will stick

to the walls of the allowed region. - For example, for a ban on short selling (w_i ≥ 0)

these walls will be the coordinate planes, and as

N/T increases, more and more of the weights will

become zero. This phenomenon is well known in

portfolio optimization. (B. Scherer, R. D.

Martin, - Introduction to Modern Portfolio Optimization

with NUOPT and S-PLUS, Springer, New York (2005))
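The weight-zeroing effect under a no-short-selling constraint can be sketched in a few lines. The snippet below is an illustrative toy, not a production optimizer: it assumes i.i.d. Gaussian returns and solves the constrained minimum-variance problem with a simple projected-gradient method written for the occasion. Near the critical N/T ratio, a sizeable fraction of the weights lands exactly on the wall w = 0, and the active set changes from sample to sample.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u * np.arange(1, v.size + 1) > css - 1.0)[0][-1]
    theta = (css[rho] - 1.0) / (rho + 1.0)
    return np.maximum(v - theta, 0.0)

def min_var_no_short(cov, iters=3000):
    """Projected gradient descent for: min w' cov w  s.t.  w >= 0, sum(w) = 1."""
    w = np.full(cov.shape[0], 1.0 / cov.shape[0])
    step = 1.0 / (2.0 * np.linalg.eigvalsh(cov)[-1])  # 1/L for the gradient 2*cov@w
    for _ in range(iters):
        w = project_simplex(w - step * 2.0 * cov @ w)
    return w

rng = np.random.default_rng(1)
N, T = 40, 50                        # N/T = 0.8, close to the critical region
for sample in (1, 2):                # two independent simulated histories
    X = rng.standard_normal((T, N))
    w = min_var_no_short(np.cov(X, rowvar=False))
    active = np.flatnonzero(w > 1e-8)
    print(f"sample {sample}: {active.size} of {N} weights are nonzero")
```

Since the true covariance here is the identity, the correct answer is full diversification (all weights 1/N); every zero weight in the output is pure estimation noise sticking to the constraint walls.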

- This spontaneous reduction of diversification is

entirely due to estimation error and does not

reflect any real structure of the objective

function. - In addition, for the next sample a completely

different set of weights will become zero: the

solution keeps jumping about on the walls of the

allowed region. - Clearly, in this situation the solution reflects

the structure of the limit system (i.e. the

portfolio manager's beliefs), rather than the

structure of the market. Therefore, whenever we

are working in or close to the unstable region

(which is almost always), the constraints only

mask rather than cure the instability.

Closing remarks on portfolio selection

- Given the nature of the portfolio optimization

task, one will typically work in that region of

parameter space where sample fluctuations are

large. Since the critical point where these

fluctuations diverge depends on the risk measure,

the confidence level, and on the method of

estimation, one must be aware of how close one's

working point is to the critical boundary,

otherwise one will be grossly misled by the

unstable algorithm.

- Downside risk measures have been introduced,

because they ignore positive fluctuations that

investors are not supposed to be afraid of.

Perhaps they should be: the downside risk

measures display the instability described here

which is basically due to a false arbitrage alert

and may induce an investor to take very large

positions on the basis of fragile information

stemming from finite samples. In a way, the

global disaster engulfing us is a macroscopic

example of such a folly.
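The "false arbitrage alert" has an elementary manifestation that can be checked in a few lines. Among many statistically identical assets observed over a short window, some asset will, purely by chance, outperform another in every single observation; a downside risk measure then rewards an unboundedly large long-short position in that pair. The sketch below assumes i.i.d. Gaussian returns and counts such dominant pairs (a crude sufficient condition for the instability, not the full ES feasibility criterion).

```python
import numpy as np

rng = np.random.default_rng(3)
N, T = 100, 10                       # many statistically identical assets, short window
X = rng.standard_normal((N, T))      # row i = return history of asset i

# ordered pairs (i, j) where asset i beat asset j in every single observation;
# each pair does so with probability 2**-T, so we expect about N*(N-1)/2**T hits
dominant = [(i, j) for i in range(N) for j in range(N)
            if i != j and np.all(X[i] >= X[j])]
print(f"{len(dominant)} apparently dominant pairs among {N} identical assets")
```

Going long asset i and short asset j in such a pair produces a position whose in-sample downside is zero, so minimizing a downside risk measure drives the position size to infinity on the basis of this spurious, sample-specific "arbitrage".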

II. THE PROBLEM OF ESTIMATION ERROR IN MODEL

BUILDING

Portfolio optimization is equivalent to Linear

Regression

- Linear regression is a standard framework in

which to attempt to construct a first statistical

model. - It is ubiquitous (microarrays, medical sciences,

epidemiology, sociology, macroeconomics, etc.) - It has a time-honored history and works fine

especially if the independent variables are few,

there are enough data, and they are drawn from a

tight distribution (such as a Gaussian) - Complications arise if we have a large number of

explanatory variables (their number grows at a

rate of 5 per decade), and a limited number of

data (as almost always). - Then we face a serious estimation error problem.

Assume we know the underlying process and

minimize the residual error for an infinitely

large sample

In practice we can only minimize the residual

error for a sample of length T

The relative error

- This is a measure of the estimation error.
- It is a random variable that depends on the sample
- Its distribution strongly depends on the ratio

N/T, where N is the number of dimensions and T

the sample size. - The average of qo diverges at a critical value of

N/T!

Critical behaviour for N, T large, with N/T fixed

- The average of q₀ diverges at the critical point

N/T = 1, just as in portfolio theory.

The regression coefficients fluctuate wildly

unless N/T ≪ 1. Geometric interpretation: one

cannot fit a plane to one point.
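The divergence at N/T = 1 is easy to reproduce numerically. The sketch below assumes a Gaussian design with unit noise variance; the printed quantity is a simple proxy for the relative error q₀, namely the expected out-of-sample squared prediction error of the fitted OLS model relative to the noise floor.

```python
import numpy as np

rng = np.random.default_rng(2)

def relative_error(N, T, trials=200):
    """Average out-of-sample error of OLS relative to the noise floor (a q0 proxy)."""
    total = 0.0
    for _ in range(trials):
        X = rng.standard_normal((T, N))
        beta = rng.standard_normal(N)
        y = X @ beta + rng.standard_normal(T)          # unit-variance noise
        beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
        # for isotropic Gaussian inputs the expected squared prediction error
        # is ||beta_hat - beta||^2 + 1; dividing by the noise floor (= 1) gives:
        total += 1.0 + np.sum((beta_hat - beta) ** 2)
    return total / trials

q_list = [relative_error(20, T) for T in (200, 40, 25, 22)]  # N/T from 0.1 to 0.91
for T, q in zip((200, 40, 25, 22), q_list):
    print(f"N/T = {20 / T:.2f}   q0 proxy = {q:.2f}")
```

For a Gaussian design the expectation of this proxy is 1 + N/(T - N - 1), which stays near 1 when N ≪ T and blows up as T approaches N, mirroring the portfolio result.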

CONCLUDING REMARKS ON MODELING COMPLEX SYSTEMS

- Normally, one is supposed to work in the N ≪ T

limit, i.e. with low dimensional problems and

plenty of data. - Complex systems are very high dimensional and

irreducible (incompressible), they require a

large number of explanatory variables for their

faithful representation. - Therefore, we have to face the unconventional

situation in the regression problem that N ~ T, or

even N > T, and then the error in the regression

coefficients will be large.

- If the number of explicatory variables is very

large and they are all of the same order of

magnitude, then there is no structure in the

system, it is just noise (like a completely

random string). So we have to assume that some of

the variables have a larger weight than others,

but we do not have a natural cutoff beyond which

it would be safe to forget about the higher order

variables. This leads us to the assumption that

the regression coefficients must have a scale

free, power law like distribution for complex

systems.

- How can we understand that, in the social

sciences, medical sciences, etc., we are getting

away with insufficient statistics, even with N > T? - We are projecting external information into our

statistical assessments. (I can draw a

well-determined straight line across even a

single point, if I know that it must be parallel

to another line.) - Humans do not optimize, but use quick and dirty

heuristics. This has an evolutionary meaning: if

something looks vaguely like a leopard, one

jumps, rather than trying to seek the optimal fit

of the observed fragments of the picture to a

leopard.

- Prior knowledge, the larger picture, values,

deliberate or unconscious bias, etc. are

essential features of model building. - When we have a chance to check this prior

knowledge millions of times in carefully designed

laboratory experiments, this is a well-justified

procedure. - In several applications (macroeconomics, medical

sciences, epidemiology, etc.) there is no way to

perform these laboratory checks, and errors may

build up as one uncertain piece of knowledge

serves as a prior for another uncertain

statistical model. This is how we construct

myths, ideologies and social theories.

- It is conceivable that theory building (in the

sense of constructing a low dimensional model)

for social phenomena will prove to be impossible,

and the best we will be able to do is to build a

life-size computer model of the system, a kind of

gigantic SimCity, or Borges' map. - By playing and experimenting with these models we

may develop an intuition about its complex

behaviour that we couldn't gain by observing the

single sample of a society or economy.