THE INSTABILITY OF RISK MEASURES
The problem of estimation error in complex systems

Transcript and Presenter's Notes
1
THE INSTABILITY OF RISK MEASURES
The problem of estimation error in complex systems
  • Imre Kondor
  • Collegium Budapest and Eötvös University,
    Budapest
  • European Conference on Complex Systems 2009 (ECCS
    09)
  • University of Warwick, Coventry, UK
  • 21-25 September, 2009


2
Coworkers
  • Szilárd Pafka (ELTE PhD student → CIB Bank → Paycom.net, California)
  • Gábor Nagy (Debrecen University PhD student and CIB Bank, Budapest)
  • Richárd Karádi (Technical University MSc student → Procter & Gamble)
  • Nándor Gulyás (ELTE PhD student → Budapest Bank → Lombard Leasing → private entrepreneur)
  • István Varga-Haszonits (ELTE PhD student → Morgan Stanley)

3
Contents
  • The subject of the talk lies at the crossroads of finance, statistical physics and computer science
  • I. The investment problem: portfolios, rational portfolio selection, risk measures, the problem of estimation error (noise), noise sensitivity of risk measures, instability of risk measures
  • II. The wider context: ramifications in model building for complex systems, computational complexity, critical phenomena, estimation error as a critical phenomenon, machine learning, statistics in high dimensions

6
  • I. THE INVESTMENT PROBLEM

7
A portfolio
  • is a combination of assets or investment
    instruments (shares, bonds, foreign exchange,
    derivatives, precious metals, commodities,
    artworks, property, etc.). More generally, the
    various business lines of a big firm, or even the
    economy as a whole, can also be regarded as a
    portfolio. The generic problem is how to allocate
    available resources.

9
Rational portfolio selection
  • The value of assets fluctuates.
  • It is dangerous to invest all our money in a single asset.
  • Investment should be diversified, i.e. distributed among the various assets.
  • Riskier assets tend to yield higher returns.
  • Some assets tend to fluctuate together, others in opposite directions.
  • Rational portfolio selection seeks a tradeoff between risk and reward.

15
Risk and reward
  • Financial reward can be measured in terms of the return (relative price change), r_t = (p_t − p_{t−1}) / p_{t−1},
  • or the log return, r_t = ln(p_t / p_{t−1}).
  • The characterization of risk is more controversial.

16
Risk measures
  • A risk measure is a quantitative characterization
    of our intuitive concept of risk (fear of
    uncertainty and loss).
  • Risk is related to the stochastic nature of
    returns. Mathematically, it is (or should be) a
    convex functional of the pdf of returns.
  • The appropriate choice may depend on the nature
    of data (e.g. on their asymptotics) and on the
    context (investment, risk management,
    benchmarking, tracking, regulation, capital
    allocation)

19
The most obvious choice for a risk measure
Variance
  • Variance, the average squared deviation from the mean, is a time-honoured statistical tool.
  • Its use assumes that the probability distribution of the returns is sufficiently concentrated around the average, i.e. that there are no large fluctuations.
  • This is true in several instances, but we often encounter fat tails: huge deviations with a non-negligible probability (e.g. Black Monday).

22
Alternative risk measures
  • There are several alternative risk measures in the academic literature, in practice, and in regulation:
  • Value at Risk (VaR): the best among the p% worst losses (not convex, punishes diversification)
  • Mean absolute deviation (MAD): used by Algorithmics
  • Coherent risk measures (promoted by academics):
  • Expected Shortfall (ES): the average loss beyond a high threshold
  • Maximal Loss (ML): the single worst case

26
Portfolios
  • A portfolio is a linear combination (a weighted average) of assets, with a set of weights w_i that add up to unity (the budget constraint).
  • The weights are not necessarily positive: short selling is allowed.
  • The fact that the weights can be negative means that the region over which we are trying to determine the optimal portfolio is not bounded.

29
The variance of a portfolio
  • σ²_P = Σ_{i,j} w_i σ_{ij} w_j, a quadratic form of the weights. The coefficients of this form are the elements of the covariance matrix σ_{ij}, which measures the co-movements between the various assets.

30
Markowitz portfolio selection theory
  • Rational portfolio selection realizes the tradeoff between risk and reward by minimizing the risk functional (e.g. the variance) over the weights, given the expected return, the budget constraint, and possibly other constraints. Here we will consider the global minimum risk portfolio, omitting the constraint on the expected return (a minimal numerical sketch follows below).
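As an added numerical illustration (not part of the original slides): under the budget constraint alone, the global minimum variance portfolio has the well-known closed form w = C⁻¹1 / (1ᵀC⁻¹1), where C is the covariance matrix. A minimal Python sketch, assuming simulated i.i.d. returns in place of real data:

import numpy as np

def min_variance_weights(cov):
    # Global minimum variance weights under the budget constraint sum(w) = 1.
    # Closed form: w = C^{-1} 1 / (1' C^{-1} 1).
    ones = np.ones(cov.shape[0])
    x = np.linalg.solve(cov, ones)          # C^{-1} 1
    return x / (ones @ x)

T, N = 500, 10                              # sample length and portfolio size
rng = np.random.default_rng(0)
returns = rng.standard_normal((T, N))       # stand-in for a T x N return history
w = min_variance_weights(np.cov(returns, rowvar=False))
print(w.sum(), w.min())                     # sums to 1; some weights may be negative

Nothing prevents negative weights here (short positions), which is exactly why the feasible region is unbounded.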

31
Information parsimony
  • If we do not have enough information, we cannot
    make a good decision.
  • In the context of portfolio selection this
    embarrassing truism translates into the
    requirement that the sample size (the length of
    the time series) T must be much larger than the
    size of the portfolio (the number of assets) N,
    in order for us to be able to construct a good
    portfolio.
  • For a large portfolio this condition is not easy to satisfy (the sampling frequency cannot be made arbitrarily high, so T cannot be very large).
  • Therefore, in real life N and T may well be of
    the same order of magnitude, and it is
    appropriate to consider the limit where N/T is of
    the order of unity, while both N and T are large
    (go to infinity).
  • In this limit we should expect large estimation errors: the optimal portfolio weights will be very unstable, with huge sample-to-sample fluctuations and huge prediction errors.

36
  • In I. K., Sz. Pafka, G. Nagy: Noise sensitivity of portfolio selection under various risk measures, Journal of Banking and Finance, 31, 1545-1573 (2007) we found:
  • If there are no constraints on the portfolio weights other than the budget constraint, the fluctuations actually diverge, that is, the estimation error becomes infinite, at a critical value of the ratio N/T.
  • The critical value of N/T depends on the risk measure in question.
  • For the variance and the mean absolute deviation the critical ratio is (N/T)_crit = 1; for Maximal Loss (ML, the best combination of the worst losses) (N/T)_crit = ½.
  • If the risk measure in question depends on a parameter α, then the critical N/T value will also depend on that parameter, and we obtain a critical curve (a phase diagram) on the α–N/T plane.
  • For example, Expected Shortfall is the average loss above a high threshold α. (ML is the α → 1 limit of ES.) The phase boundary for ES runs below ½ (the critical ratio N/T < ½ for any α); a small numerical illustration follows below.
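A minimal simulation of this divergence for the variance (an added illustration, assuming i.i.d. Gaussian returns with true covariance equal to the identity, so the true optimum is the equal-weight portfolio):

import numpy as np

def q0_variance(N, T, n_trials=20, seed=0):
    # q0: true risk of the sample-optimized weights relative to the true optimum.
    # For true covariance = I the true optimal weights are w* = 1/N, risk 1/N.
    # Theory predicts E[q0] ~ 1/sqrt(1 - N/T), diverging at N/T = 1.
    rng = np.random.default_rng(seed)
    ones = np.ones(N)
    vals = []
    for _ in range(n_trials):
        X = rng.standard_normal((T, N))       # T observed returns of N assets
        S = X.T @ X / T                       # sample covariance (zero means assumed)
        y = np.linalg.solve(S, ones)
        w = y / (ones @ y)                    # in-sample optimal weights
        vals.append(np.sqrt(N * (w @ w)))     # true risk ratio, since true cov = I
    return np.mean(vals)

T = 500
for ratio in (0.2, 0.5, 0.8, 0.95):
    print(f"N/T = {ratio:.2f}: q0 = {q0_variance(int(ratio * T), T):.2f}")

The printed values track 1/sqrt(1 − N/T): roughly 1.1, 1.4, 2.2 and 4.5.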

39
  • In addition, for finite N and T, the portfolio optimization problem for ES and ML does not always have a solution even below the critical N/T ratio! (These risk measures may become unbounded.)
  • For finite N and T, the existence of the optimum is a probabilistic issue: it depends on the sample. The probability of the existence of the solution has been determined analytically for ML, and numerically for ES.
  • As N and T → ∞ with N/T fixed, this probability goes to 1 or 0, respectively, according to whether N/T is below or above (N/T)_crit.

42
Illustration: the case of Maximal Loss
  • Definition of the problem (for simplicity, we are looking for the global minimum and allow unlimited short selling):
  • min over w of max over t of ( − Σ_i w_i x_{it} ), subject to Σ_i w_i = 1,
  • where the w's are the portfolio weights and the x's the returns.

43
Probability of finding a solution for the minimax problem (for elliptic underlying distributions)

In the limit N, T → ∞, with N/T fixed, the transition becomes sharp at N/T = ½.
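This sharp transition is easy to reproduce numerically (an added sketch, not from the talk): the ML problem above is a linear program, and "no solution" shows up as an unbounded LP. Sweeping N/T for i.i.d. Gaussian samples:

import numpy as np
from scipy.optimize import linprog

def ml_has_solution(X):
    # X: T x N return matrix. Solve min_w max_t (-x_t . w), sum(w) = 1, as the LP
    #   min t  s.t.  -x_t . w - t <= 0 for all t,  sum(w) = 1,  w and t free.
    # Returns True if the LP is bounded (a finite Maximal Loss optimum exists).
    T, N = X.shape
    c = np.zeros(N + 1); c[-1] = 1.0
    A_ub = np.hstack([-X, -np.ones((T, 1))])
    A_eq = np.append(np.ones(N), 0.0).reshape(1, -1)
    res = linprog(c, A_ub=A_ub, b_ub=np.zeros(T), A_eq=A_eq, b_eq=[1.0],
                  bounds=[(None, None)] * (N + 1), method="highs")
    return res.status == 0                    # status 3 would mean unbounded

rng = np.random.default_rng(1)
T = 200
for ratio in (0.3, 0.45, 0.5, 0.55, 0.7):
    N = int(ratio * T)
    ok = sum(ml_has_solution(rng.standard_normal((T, N))) for _ in range(20))
    print(f"N/T = {ratio:.2f}: solution exists in {ok}/20 samples")

The fraction of feasible samples drops from near 1 to near 0 around N/T = ½, and the crossover sharpens as T grows.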
44
  • The phase boundary for ES has been obtained numerically in I. K., Sz. Pafka, G. Nagy: Noise sensitivity of portfolio selection under various risk measures, Journal of Banking and Finance, 31, 1545-1573 (2007), and calculated analytically in A. Ciliberti, I. K., and M. Mézard: On the Feasibility of Portfolio Optimization under Expected Shortfall, Quantitative Finance, 7, 389-396 (2007).

The estimation error diverges as one approaches the phase boundary from below.
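For completeness, an added sketch (using the standard Rockafellar-Uryasev linearization, not anything specific to the cited papers): minimizing historical ES is itself a linear program, and in the infeasible phase the same LP becomes unbounded.

import numpy as np
from scipy.optimize import linprog

def es_optimum(X, alpha=0.95):
    # Historical Expected Shortfall at level alpha, minimized over w, sum(w) = 1:
    #   min  t + (1/((1-alpha) T)) * sum_s u_s
    #   s.t. u_s >= -x_s . w - t,  u_s >= 0      (Rockafellar-Uryasev LP)
    # Returns the optimal ES, or None if the LP is unbounded (no solution).
    T, N = X.shape
    c = np.concatenate([np.zeros(N), [1.0], np.full(T, 1.0 / ((1 - alpha) * T))])
    A_ub = np.hstack([-X, -np.ones((T, 1)), -np.eye(T)])
    A_eq = np.concatenate([np.ones(N), [0.0], np.zeros(T)]).reshape(1, -1)
    bounds = [(None, None)] * (N + 1) + [(0, None)] * T
    res = linprog(c, A_ub=A_ub, b_ub=np.zeros(T), A_eq=A_eq, b_eq=[1.0],
                  bounds=bounds, method="highs")
    return res.fun if res.status == 0 else None

X = np.random.default_rng(3).standard_normal((200, 40))
print(es_optimum(X, alpha=0.9))    # a finite value here; None in the infeasible phase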
45
  • The intuitive explanation for the instability of
    ES and ML is that for a given finite sample there
    may exist a dominant item (or a dominant
    combination of items) that produces a larger
    return at each time point than any of the others,
    even if no such dominance relationship exists
    between them on very large samples. This leads
    the investor to believe that if she goes
    extremely long in the dominant item and extremely
    short in the rest, she can produce an arbitrarily
    large return on the portfolio, at a risk that
    goes to minus infinity (i.e. no risk).
  • The same consideration can be extended to any
    coherent risk measure.
  • Evidently, the effect critically depends on the
    weights being unbounded. Constraints on short
    selling and other limits will be considered later.

48
Coherent measures on a given sample
  • Such apparent arbitrage can show up for any coherent risk measure. (I. K. and I. Varga-Haszonits: Feasibility of portfolio optimization under coherent risk measures, submitted to Quantitative Finance)
  • Assume that the finite sample estimator of our risk measure satisfies the coherence axioms (Ph. Artzner, F. Delbaen, J. M. Eber, and D. Heath: Coherent measures of risk, Mathematical Finance, 9, 203-228 (1999)), listed below for reference.
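For the reader's convenience (added; these are the standard axioms of Artzner et al.), the estimator ρ is coherent if for all portfolios X, Y:
  • Monotonicity: X ≤ Y ⇒ ρ(X) ≥ ρ(Y)
  • Subadditivity: ρ(X + Y) ≤ ρ(X) + ρ(Y)
  • Positive homogeneity: ρ(λX) = λ ρ(X) for λ ≥ 0
  • Translation invariance: ρ(X + a) = ρ(X) − a for any riskless amount a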




49
The formal statements corresponding to the above
intuition
  • Proposition 1. If there exist two portfolios u and v such that u dominates v on the given sample, i.e. the return of u is at least as high as that of v at every observed time point (and strictly higher at least once), then the portfolio optimisation task has no solution under any coherent measure.
  • Proposition 2. Optimisation under ML has no solution if and only if there exists a pair of portfolios such that one of them strictly dominates the other.
  • Neither of these theorems assumes anything about
    the underlying distribution.
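The dominance condition can be checked on a sample directly (an added sketch, not from the cited paper): a dominant pair exists iff there is a zero-cost long-short combination d with non-negative payoff at every time point and strictly positive total payoff, which is a small LP:

import numpy as np
from scipy.optimize import linprog

def dominant_pair_exists(X):
    # X: T x N return matrix. Look for d with sum(d) = 0, X d >= 0 and X d != 0
    # (then u = w + d dominates v = w for any portfolio w). Split d = p - n with
    # p, n >= 0, normalize sum(p + n) = 1 to keep the LP bounded, and maximize
    # the total payoff 1' X d; a dominant pair exists iff the optimum is > 0.
    T, N = X.shape
    s = X.sum(axis=0)
    c = -np.concatenate([s, -s])                       # minimize -(1' X d)
    A_ub = np.hstack([-X, X])                          # -(X d) <= 0 componentwise
    A_eq = np.vstack([np.concatenate([np.ones(N), -np.ones(N)]),  # sum(d) = 0
                      np.ones(2 * N)])                            # sum(p + n) = 1
    res = linprog(c, A_ub=A_ub, b_ub=np.zeros(T), A_eq=A_eq, b_eq=[0.0, 1.0],
                  bounds=[(0, None)] * (2 * N), method="highs")
    return res.status == 0 and -res.fun > 1e-10

X = np.vstack([np.ones(5), np.zeros(5)]).T   # asset 1 beats asset 2 at every t
print(dominant_pair_exists(X))               # True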

52
Further generalization
  • As a matter of fact, this type of instability
    appears even beyond the set of coherent risk
    measures, and may appear in downside risk
    measures in general.
  • By far the most widely used risk measure today is
    Value at Risk (VaR). It is a downside measure. It
    is not convex, therefore the stability problem of
    its historical estimator is ill-posed.
  • Parametric VaR, however, is convex, and this
    allows us to study the stability problem. Along
    with VaR, we also look into the closely related parametric estimates for two other downside risk measures: ES and semi-variance.
  • Parametric estimates are expected to be more
    stable than historical ones. We will then be able
    to compare the phase diagrams for the historical
    and parametric ES.

56
Parametric estimation of VaR, ES, and
semi-variance
  • For simplicity, we assume that the historical data are fitted to a Gaussian underlying process.
  • For a Gaussian process all three risk measures can be written as functions of the portfolio's expected return μ_P and standard deviation σ_P alone; for VaR and ES the form is
  • risk = A(α)·σ_P − μ_P ,
  • where the coefficient A(α) depends on the risk measure and on the confidence level α (a numerical sketch follows below).
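Numerically (an added sketch; the Gaussian formulas below are standard, with z the α-quantile of the standard normal):

import numpy as np
from scipy.stats import norm

def gaussian_var_es(mu, sigma, alpha=0.95):
    # Parametric VaR and ES of a Gaussian P&L with mean mu and std sigma:
    #   VaR = sigma * z - mu,   ES = sigma * pdf(z) / (1 - alpha) - mu,
    # with z = Phi^{-1}(alpha); both are of the form A(alpha) * sigma - mu.
    z = norm.ppf(alpha)
    return sigma * z - mu, sigma * norm.pdf(z) / (1.0 - alpha) - mu

print(gaussian_var_es(0.0, 1.0, 0.95))   # approx (1.645, 2.063)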

57
  • Here Φ denotes the error function, in terms of which the coefficient A(α) is expressed.
  • The condition for the existence of an optimum for VaR and ES is
  • A(α) > max over d with Σ_i d_i = 0 of μ_d / σ_d ,
  • where μ_d and σ_d are the expected return and the standard deviation of the zero-cost (long-short) portfolio d.

58
  • Note that there is no unconditional optimum even if we know the underlying process exactly.
  • It can be shown that the meaning of the condition is similar to the previous one (think e.g. of a portfolio with one exceptionally high return item that has a variance comparable to the others).
  • If we do not know the true process, but assume it is, say, a Gaussian, we may estimate its mean returns and covariances from the observed finite time series as
  • μ̂_i = (1/T) Σ_t x_{it}   and   σ̂_{ij} = (1/T) Σ_t (x_{it} − μ̂_i)(x_{jt} − μ̂_j).

59
  • Assume, for simplicity, that all the mean returns are zero. After a long and tedious application of the replica method imported from the theory of random systems, the solvability condition works out to be a strictly stronger inequality of the same form (the explicit formula is not preserved in this transcript) for all three risk measures. Note that this is stronger than the solvability condition for the exactly known process.
  • For the semi-variance the critical N/T ratio is 1/3, which means that for the parametrically estimated semi-variance we need samples at least three times larger than the size of the portfolio.

60
For the parametric VaR and ES the result is shown in the figure [phase boundaries on the α–N/T plane; figure not preserved in the transcript].
61
  • In the region above the respective phase
    boundaries the optimization problem does not have
    a solution.
  • In the region below the phase boundary there is a
    solution, but for it to be a good approximation
    to the true risk we must go deep into the
    feasible region. If we go to the phase boundary
    from below, the estimation error diverges.
  • The phase boundary for ES runs above that of VaR, so for a given confidence level α the critical ratio for ES is larger than for VaR (we need less data in order to have a solution). For practically important values of α (95-99%) the difference is not significant.

64
Parametric vs. historical estimates
  • The parametric ES curve runs above the historical one: we need less data to have a solution when the risk is estimated parametrically than when we use raw historical data. It seems as if we had some additional information in the parametric approach.
  • Where does this information come from?
  • It is injected into the calculation by hand
    when fitting the data to an independently chosen
    probability distribution.

66
Adding linear constraints
  • In practice, portfolio optimization is always
    subject to some constraints on the allowed range
    of the weights, such as a ban on short selling
    and/or limits on various assets, industrial
    sectors, regions, etc. These constraints restrict
    the region over which the optimum is sought to a
    finite volume where no infinite fluctuations can
    appear. One might then think that under such
    constraints the instability discussed above
    disappears completely.

67
  • This is not so. If we work in the vicinity of the
    phase boundary, sample to sample fluctuations in
    the weights will still be large, but the
    constraints will prevent the solution from
    running away to infinity. Instead, it will stick
    to the walls of the allowed region.
  • For example, for a ban on short selling (w_i > 0) these walls will be the coordinate planes, and as N/T increases, more and more of the weights will become zero (see the sketch below). This phenomenon is well known in portfolio optimization. (B. Scherer, R. D. Martin: Introduction to Modern Portfolio Optimization with NUOPT and S-PLUS, Springer, New York (2005))
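The weight-zeroing effect is easy to see in a toy experiment (an added sketch, assuming i.i.d. Gaussian returns with true covariance equal to the identity, so the true optimum is fully diversified with no zero weights at all):

import numpy as np
from scipy.optimize import minimize

def long_only_min_var(S):
    # Minimize w' S w subject to sum(w) = 1 and w >= 0 (no short selling).
    N = S.shape[0]
    res = minimize(lambda w: w @ S @ w, np.full(N, 1.0 / N),
                   jac=lambda w: 2.0 * S @ w,
                   constraints=[{"type": "eq", "fun": lambda w: w.sum() - 1.0}],
                   bounds=[(0.0, None)] * N, method="SLSQP",
                   options={"maxiter": 500})
    return res.x

rng = np.random.default_rng(2)
T = 100
for ratio in (0.2, 0.5, 0.9, 1.5):
    N = int(ratio * T)
    X = rng.standard_normal((T, N))
    S = X.T @ X / T                          # sample covariance estimate
    w = long_only_min_var(S)
    print(f"N/T = {ratio:.1f}: {np.mean(w < 1e-6):.0%} of weights pinned at zero")

All of these zeros are artifacts of estimation noise: the true optimum is w_i = 1/N for every i.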

69
  • This spontaneous reduction of diversification is
    entirely due to estimation error and does not
    reflect any real structure of the objective
    function.
  • In addition, for the next sample a completely different set of weights will become zero: the solution keeps jumping about on the walls of the allowed region.
  • Clearly, in this situation the solution reflects the structure of the limit system (i.e. the portfolio manager's beliefs), rather than the structure of the market. Therefore, whenever we are working in or close to the unstable region (which is almost always), the constraints only mask rather than cure the instability.

72
Closing remarks on portfolio selection
  • Given the nature of the portfolio optimization task, one will typically work in that region of parameter space where sample fluctuations are large. Since the critical point where these fluctuations diverge depends on the risk measure, the confidence level, and the method of estimation, one must be aware of how close one's working point is to the critical boundary, otherwise one will be grossly misled by the unstable algorithm.

73
  • Downside risk measures have been introduced because they ignore positive fluctuations, which investors are not supposed to be afraid of. Perhaps they should be: the downside risk measures display the instability described here, which is basically due to a false arbitrage alert, and may induce an investor to take very large positions on the basis of fragile information stemming from finite samples. In a way, the global disaster engulfing us is a macroscopic example of such a folly.

74
II. THE PROBLEM OF ESTIMATION ERROR IN MODEL
BUILDING
75
Portfolio optimization is equivalent to Linear
Regression
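The slide's equations are not preserved in the transcript; the standard correspondence can be stated as follows (added here). Using the budget constraint to eliminate one weight, w_N = 1 − Σ_{i<N} w_i, the portfolio return becomes

  r_P = x_N − Σ_{i=1}^{N−1} w_i (x_N − x_i),

so minimizing the portfolio variance over the N − 1 free weights is a least squares regression of x_N on the return differences x_N − x_i. Estimation error in regression coefficients and in portfolio weights is therefore one and the same problem.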
76
  • Linear regression is a standard framework in which to attempt to construct a first statistical model.
  • It is ubiquitous (microarrays, medical sciences, epidemiology, sociology, macroeconomics, etc.)
  • It has a time-honoured history and works fine, especially if the independent variables are few, there are enough data, and they are drawn from a tight distribution (such as a Gaussian).
  • Complications arise if we have a large number of explanatory variables (their number grows at a rate of about 5% per decade) and a limited number of data points (as almost always).
  • Then we face a serious estimation error problem.

80
Assume we know the underlying process and minimize the residual error for an infinitely large sample:
min_w E[ (y − Σ_i w_i x_i)² ]
81
In practice we can only minimize the residual error for a sample of length T:
min_w (1/T) Σ_{t=1}^{T} ( y_t − Σ_i w_i x_{it} )²
82
The relative error
  • q₀: the error of the sample-estimated coefficients measured on the true process, relative to the true minimal residual error.
  • This is a measure of the estimation error.
  • It is a random variable: it depends on the sample.
  • Its distribution strongly depends on the ratio N/T, where N is the number of dimensions and T the sample size.
  • The average of q₀ diverges at a critical value of N/T!

83
Critical behaviour for N, T large, with N/T fixed
  • The average of q₀ diverges at the critical point N/T = 1, just as in portfolio theory (a small simulation follows below).

The regression coefficients fluctuate wildly unless N/T << 1. Geometric interpretation: one cannot fit a plane to one point.
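An added simulation of this divergence (assuming i.i.d. Gaussian regressors and unit-variance noise, so the true minimal expected squared residual is 1):

import numpy as np

def regression_error_ratio(N, T, n_trials=20, seed=0):
    # True model: y = x . w* + noise. OLS on T samples gives w_hat, whose true
    # (out-of-sample) expected squared error is 1 + |w_hat - w*|^2 for x ~ N(0, I).
    # The average ratio behaves like 1/(1 - N/T) and diverges at N/T = 1.
    rng = np.random.default_rng(seed)
    w_star = np.zeros(N)                     # the value of w* is irrelevant here
    vals = []
    for _ in range(n_trials):
        X = rng.standard_normal((T, N))
        y = X @ w_star + rng.standard_normal(T)
        w_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
        vals.append(1.0 + np.sum((w_hat - w_star) ** 2))
    return np.mean(vals)

T = 400
for ratio in (0.2, 0.5, 0.8, 0.95):
    print(f"N/T = {ratio:.2f}: error ratio = {regression_error_ratio(int(ratio * T), T):.2f}")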
84
CONCLUDING REMARKS ON MODELING COMPLEX SYSTEMS
85
  • Normally, one is supposed to work in the N << T limit, i.e. with low dimensional problems and plenty of data.
  • Complex systems are very high dimensional and irreducible (incompressible); they require a large number of explanatory variables for their faithful representation.
  • Therefore, we have to face the unconventional situation in the regression problem that N ~ T, or even N > T, and then the error in the regression coefficients will be large.

88
  • If the number of explanatory variables is very large and they are all of the same order of magnitude, then there is no structure in the system, it is just noise (like a completely random string). So we have to assume that some of the variables have a larger weight than others, but we do not have a natural cutoff beyond which it would be safe to forget about the higher order variables. This leads us to the assumption that the regression coefficients must have a scale-free, power-law-like distribution for complex systems.

89
  • How can we understand that, in the social sciences, medical sciences, etc., we are getting away with insufficient statistics, even with N > T?
  • We are projecting external information into our statistical assessments. (I can draw a well-determined straight line across even a single point, if I know that it must be parallel to another line.)
  • Humans do not optimize, but use quick and dirty heuristics. This has an evolutionary meaning: if something looks vaguely like a leopard, one jumps, rather than seeking the optimal fit of the observed fragments of the picture to a leopard.

92
  • Prior knowledge, the larger picture, values,
    deliberate or unconscious bias, etc. are
    essential features of model building.
  • When we have a chance to check this prior
    knowledge millions of times in carefully designed
    laboratory experiments, this is a well-justified
    procedure.
  • In several applications (macroeconomics, medical sciences, epidemiology, etc.) there is no way to perform these laboratory checks, and errors may build up as one uncertain piece of knowledge serves as a prior for another uncertain statistical model. This is how we construct myths, ideologies and social theories.

95
  • It is conceivable that theory building (in the sense of constructing a low dimensional model) for social phenomena will prove to be impossible, and the best we will be able to do is to build a life-size computer model of the system, a kind of gigantic SimCity, or a Borges map.
  • By playing and experimenting with these models we may develop an intuition about their complex behaviour that we couldn't gain by observing the single sample of a society or economy.