Robust and Rich Roy Welsch Xinfeng Zhou Massachusetts Institute of Technology

1
Robust and Rich
Roy Welsch, Xinfeng Zhou
Massachusetts Institute of Technology

Email: rwelsch_at_mit.edu
International Conference on Robust Statistics
Technical University of Lisbon
17 July 2006

3
Notation

$N$ = number of risky assets
$R_i$ = return of the $i$th asset in the portfolio
$w_i$ = weight of the $i$th asset in the portfolio
$\mu_i$ = expected return of the $i$th asset
$\Sigma$ = covariance matrix of the returns of the $N$ assets
$T$ = number of time periods for estimating $\Sigma$
Note: $w_i$ could be negative for short sales.

4
Mean-Variance Portfolio Optimization

Portfolio return:
$R_p = w_1 R_1 + w_2 R_2 + \dots + w_N R_N$
Expected portfolio return:
$E(R_p) = w^T \mu$
Variance of the portfolio return:
$\mathrm{Var}(R_p) = w^T \Sigma w$
Mean-variance portfolio optimization minimizes
the variance of the portfolio return for a given
level of expected return $\mu_p$:
$\min_w \; w^T \Sigma w$
subject to $w^T \mu = \mu_p$, $w^T e = 1$,
where $e$ is the $N \times 1$ column vector with all
elements 1.
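
With only the two equality constraints, the problem above reduces to a linear system (the KKT conditions). The following is a minimal sketch of that reduction, not the authors' code; the function name and the three-asset inputs are made up for illustration.

```python
import numpy as np

def min_variance_weights(mu, sigma, mu_p):
    """Solve min w' Sigma w  s.t.  w' mu = mu_p, w' e = 1 via the KKT system."""
    n = len(mu)
    e = np.ones(n)
    # Stationarity: 2 Sigma w - l1 mu - l2 e = 0, stacked with the two constraints.
    kkt = np.zeros((n + 2, n + 2))
    kkt[:n, :n] = 2.0 * sigma
    kkt[:n, n] = -mu
    kkt[:n, n + 1] = -e
    kkt[n, :n] = mu
    kkt[n + 1, :n] = e
    rhs = np.concatenate([np.zeros(n), [mu_p, 1.0]])
    return np.linalg.solve(kkt, rhs)[:n]

# Hypothetical inputs: three assets with a positive definite covariance matrix.
mu = np.array([0.05, 0.08, 0.12])
sigma = np.array([[0.04, 0.01, 0.00],
                  [0.01, 0.09, 0.02],
                  [0.00, 0.02, 0.16]])
w = min_variance_weights(mu, sigma, mu_p=0.09)
```

Because the weights depend on the inverse of the covariance matrix, small estimation errors in `mu` or `sigma` can move `w` a great deal, which is the sensitivity discussed on the next slide.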

5
Problems with Mean-Variance

Static: just one period.
Sensitive to inputs which are, in turn, subject
to random errors in the estimation of expected
return and variance which are usually obtained
from historical return data.
This sensitivity often leads to extreme portfolio
weights and dramatic swings in weights with only
minor changes in expected returns or the
covariance matrix. This can lead to frequent
rebalancing and excessive transaction costs.
For stable covariance estimation, we prefer long
historical time series (the number of assets, N,
should be far smaller than the number of time periods, T).
However, old historical data may not reflect
current market dynamics.
The underlying multivariate normal assumption may not
hold.

6
Some Solutions

Factor models (CAPM, etc.), Bayesian shrinkage,
GARCH models.
Regularization (penalty) methods.
Robust estimation of the expected return and the
covariance matrix. We will focus on this.
Combinations of the above methods.

7
Fast-MCD

The minimum covariance determinant (MCD) estimator proposed
by Rousseeuw (1985) looks for the covariance
matrix of $h$ data points ($T/2 \le h < T$) with the
smallest determinant. The breakdown point is $(T - h)/T$.
The resulting covariance matrix is biased
(and can be rescaled to be unbiased), but this
multiplicative factor has no effect on portfolio
weight allocation. Exact MCD is not feasible for $N > 20$
in our situation. Fast-MCD, proposed by
Rousseeuw and Van Driessen (1999), makes large $N$
(51 in our data) feasible. MCD retains affine
equivariance.
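
The core of Fast-MCD is the concentration step (C-step): fit the mean and covariance on a subset of $h$ points, then keep the $h$ points closest to that fit in Mahalanobis distance, which cannot increase the determinant. Below is a rough sketch of C-steps from a single random start. The full algorithm also uses many starts and nested subsampling, which are omitted here, and the data and function names are invented for illustration.

```python
import numpy as np

def c_step(x, subset, h):
    """One concentration step: refit on `subset`, keep the h closest points."""
    mean = x[subset].mean(axis=0)
    cov = np.cov(x[subset], rowvar=False)
    diff = x - mean
    # Squared Mahalanobis distance of every observation to the current fit.
    d2 = np.einsum("ti,ij,tj->t", diff, np.linalg.inv(cov), diff)
    return np.sort(np.argsort(d2)[:h])

def mcd_from_start(x, start, h, max_iter=100):
    """Iterate C-steps from one starting subset until it stops changing."""
    subset = np.sort(np.asarray(start))
    for _ in range(max_iter):
        new_subset = c_step(x, subset, h)
        if np.array_equal(new_subset, subset):
            break
        subset = new_subset
    return subset, np.cov(x[subset], rowvar=False)

# Synthetic data (assumption): 100 observations of 3 series, first 10 rows shifted.
rng = np.random.default_rng(0)
x = rng.normal(size=(100, 3))
x[:10] += 8.0
h = 75
start = rng.choice(100, size=h, replace=False)
subset, cov_mcd = mcd_from_start(x, start, h)
```

The covariance of the final subset has a determinant no larger than that of the starting subset; a single start only finds a local solution, which is why the published algorithm repeats this from many starts.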

8
Pairwise Robust Covariance

If the affine equivariance requirement is
dropped, faster robust pairwise covariance
estimators are available.
Khan et al. (2005) compared several approaches
to robust pairwise covariance estimation while
investigating ways to make least-angle regression
(LARS) (Efron et al., 2004) robust. They found
a two-step, two-dimensional Winsorization method
to be effective and fast. We use a modified form
of their idea with an adjustment to ensure a
positive definite covariance matrix.

9
Huber Winsorization
For each (time) vector of returns $x_i$, $i = 1, \dots, N$,
a transformation is used to
shrink outliers towards the median with the
Huber function $H_c(x) = \min\{\max(-c, x), c\}$, $c > 0$.
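
The transformation itself is not in the transcript. One common way to apply the Huber function, assumed here, is to standardize each series by its median and MAD, clip at $\pm c$, and map back to the original scale. The function names and sample returns are illustrative only.

```python
import numpy as np

def huber(x, c):
    """Huber function H_c(x) = min(max(-c, x), c)."""
    return np.minimum(np.maximum(-c, x), c)

def huber_winsorize(x, c=2.0):
    """Clip one return series at median +/- c robust standard deviations."""
    med = np.median(x)
    # 1.4826 * MAD estimates the standard deviation under normality (assumed scale choice).
    scale = 1.4826 * np.median(np.abs(x - med))
    return med + scale * huber((x - med) / scale, c)

# Hypothetical series with one gross outlier in the last position.
returns = np.array([0.01, -0.02, 0.00, 0.015, -0.01, 0.30])
clipped = huber_winsorize(returns)
```

Only the last value is pulled in; the other five already lie within two robust standard deviations of the median and are left unchanged.
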
10
Bivariate Winsorization

Huber Winsorization fails to take the orientation
of the bivariate data into consideration.
Bivariate Winsorization (after centering) sets
[equation missing from transcript]
with $x_t = (x_{ti}, x_{tj})^T$ and $D(x_t)$ the Mahalanobis
distance based on some initial $\mu_0$, $\Sigma_0$ and
constant $c$.
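
The slide's equation did not survive extraction. A form used in the robust LARS literature, assumed here, shrinks each centered point radially by the factor $\min(\sqrt{c / D(x_t)}, 1)$, where $D$ is the squared Mahalanobis distance and $c = 5.99$ is roughly the 95% quantile of a chi-squared distribution with 2 degrees of freedom. The sketch below applies that form to made-up data.

```python
import numpy as np

def bivariate_winsorize(x, sigma0, c=5.99):
    """Shrink centered 2-D points radially onto the ellipse D(x) = c."""
    # D is taken to be the squared Mahalanobis distance x' sigma0^{-1} x (assumption).
    d = np.einsum("ti,ij,tj->t", x, np.linalg.inv(sigma0), x)
    # Points inside the ellipse keep a factor of 1.
    factor = np.ones_like(d)
    outside = d > c
    factor[outside] = np.sqrt(c / d[outside])
    return x * factor[:, None]

# Hypothetical initial scatter with correlation 0.8 and three centered points.
sigma0 = np.array([[1.0, 0.8], [0.8, 1.0]])
x = np.array([[0.5, 0.4], [3.0, -3.0], [4.0, 4.0]])
x_w = bivariate_winsorize(x, sigma0)
```

The point (3, -3) lies against the main axis of the ellipse and is shrunk much more than (4, 4), which lies along it. Coordinate-wise Huber clipping ignores exactly this orientation effect.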

11
Iterated Bivariate Winsorization

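For each pair of variables $x_i$, $x_j$ compute
[equation missing from transcript]
Let $x_t = (x_{ti}, x_{tj})^T$.
For each $\mu_k$, $\Sigma_k$, calculate the Mahalanobis
distance for each return pair
[equation missing from transcript]
and weight
[equation missing from transcript]
Update $\mu_{k+1}$, $\Sigma_{k+1}$
until convergence.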
All pairwise covariances are combined to form an
initial covariance matrix. This is converted to
positive semi-definite using a method due to
Maronna and Zamar (2002).
We call this I2D-Winsor.
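
A matrix assembled from separately estimated pairwise entries need not be positive semi-definite. The sketch below shows the problem and a simple repair by eigenvalue clipping. This is a plainly simpler stand-in, not the Maronna and Zamar (2002) method, and the example matrix is invented.

```python
import numpy as np

def nearest_psd_by_clipping(s, floor=0.0):
    """Symmetrize s and raise eigenvalues below `floor` up to `floor`."""
    s = (s + s.T) / 2.0
    vals, vecs = np.linalg.eigh(s)
    return (vecs * np.maximum(vals, floor)) @ vecs.T

# A symmetric "correlation" matrix built pair by pair that is not PSD.
s = np.array([[1.0, 0.9, -0.9],
              [0.9, 1.0, 0.9],
              [-0.9, 0.9, 1.0]])
s_psd = nearest_psd_by_clipping(s, floor=1e-8)
```

Clipping changes the diagonal as well as the off-diagonal entries, so a correlation matrix repaired this way would still need rescaling to a unit diagonal.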

13
Fast 2-D Winsorization

Iteration is expensive, so Khan et al. (2005)
proposed taking one bivariate Winsorization step
from an improved starting $\Sigma_0$. They start with
univariate Huber Winsorization but use two tuning
constants, $c_1$ and $c_2$. The constant $c_1$ (chosen to
be 2) is used in the two quadrants with the most
data ($n_1$ points), and the second constant $c_2$, set from the ratio $n_2 / n_1$
with $n_2 = T - n_1$, is used in the remaining two
quadrants. This pulls the Huber Winsorization
boundary in where there is less data and a higher
chance of data not following the ellipsoidal
pattern for the main part of the data.
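
The exact expression for $c_2$ is garbled in the transcript. The sketch below assumes $c_2 = \sqrt{n_2 / n_1}\, c_1$, the form I recall from the published version of Khan et al.'s work, and assumes the two series have already been standardized (for example by median and MAD). The names and data are illustrative.

```python
import numpy as np

def quadrant_winsorize(u, v, c1=2.0):
    """Clip standardized pairs (u, v) with a tighter bound in the sparser quadrants."""
    same_sign = u * v >= 0                  # quadrants 1 and 3
    n_same = int(same_sign.sum())
    n_diff = len(u) - n_same
    n1, n2 = max(n_same, n_diff), min(n_same, n_diff)
    c2 = np.sqrt(n2 / n1) * c1              # assumed form of the second constant
    major = same_sign if n_same >= n_diff else ~same_sign
    bound = np.where(major, c1, c2)         # per-observation clipping bound
    return np.clip(u, -bound, bound), np.clip(v, -bound, bound), c2

# Hypothetical standardized returns: six pairs agree in sign, two disagree.
u = np.array([1.0, 2.5, -1.5, -3.0, 0.5, 3.0, -0.5, 1.2])
v = np.array([0.8, 3.0, -1.0, -2.5, 0.4, -3.0, 0.6, 1.0])
u_w, v_w, c2 = quadrant_winsorize(u, v)
r0 = np.corrcoef(u_w, v_w)[0, 1]            # initial correlation for this pair
```

Here the pair (3, -3) falls in a sparse quadrant and is clipped at the smaller bound `c2`, while (2.5, 3) is clipped at `c1 = 2`.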

15
Fast 2-D

The classical correlation is computed on this
Winsorized data and all pairwise correlations
form the full initial correlation matrix which,
if necessary, is made positive definite. Then one
step of bivariate Winsorization is used and this
new matrix is again made positive definite. We
call this F2D-Winsor.

16
Historical Data

Daily returns on 51 MSCI US industry sector
indexes from 01/03/1995 to 02/07/2005 (2600 days
of data). Broader than the S&P 500. We need to
find the weights to use on each of the N = 51
indexes in our portfolio.
Rebalance as follows: estimate sector weights
using the most recent T = 100 daily returns, and
rebalance every 5 trading days. With 2600 days
there are 500 rebalances. Trading costs (when
used) are 5 cents for each 100 dollars bought or sold.
We use the following constraints:
$w^T \mu = \mu_p$, $w^T e = 1$, $-1 \le w_i \le 1$.
The market portfolio consists of all individual
stocks (about 700) in the 51 indexes weighted by
market capitalization.
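
The rebalance count follows from the window arithmetic: (2600 - 100) / 5 = 500. A small sketch of the schedule and of a proportional cost of 5 cents per 100 dollars traded (0.05%) follows. It reads the cost figure as dollars, and the names are hypothetical.

```python
T_WINDOW = 100      # estimation window in trading days
STEP = 5            # rebalance every 5 trading days
N_DAYS = 2600
COST_RATE = 0.05 / 100.0   # 5 cents per 100 dollars traded (assumed reading)

# Each window [start, end) is used to estimate weights held for the next STEP days.
windows = [(s, s + T_WINDOW) for s in range(0, N_DAYS - T_WINDOW, STEP)]

def trading_cost(w_old, w_new, rate=COST_RATE):
    """Cost as a fraction of portfolio value for moving from w_old to w_new."""
    return rate * sum(abs(b - a) for a, b in zip(w_old, w_new))
```

For example, `len(windows)` is 500, and moving from weights (0.5, 0.5) to (0.7, 0.3) trades 40% of the portfolio at a cost of about 0.02% of its value.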

17
Financial Performance Measures

mean: the sample mean of weekly ex-post returns.
STD: the sample standard deviation of weekly
ex-post returns.
IR: information ratio (annualized).
$\alpha$-VaR ($\alpha$ = 5%, 1%): the sample $\alpha$-quantile of the
weekly ex-post returns distribution.
$\alpha$-CVaR ($\alpha$ = 5%, 1%): the sample conditional mean
of the weekly ex-post returns distribution, given
the returns are below the $\alpha$th quantile.
Max DD: the maximum drawdown, which is the
maximum loss in a week.
CRet: cumulative return.
Turnover: weekly asset turnover, defined as the
mean of the absolute
weight changes for 500 updates.
CRet_cost: cumulative return with transaction
costs.
IRcost: information ratio with transaction costs.
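
Several of these measures are simple sample statistics. The sketch below computes VaR, CVaR and turnover on invented numbers. It treats "below the quantile" as "at or below", and it sums absolute weight changes across assets before averaging over updates; both are assumptions about details the slide leaves open.

```python
import numpy as np

def var_cvar(returns, alpha):
    """Sample alpha-quantile (VaR) and mean of the returns at or below it (CVaR)."""
    var = np.quantile(returns, alpha)
    return var, returns[returns <= var].mean()

def mean_turnover(weights):
    """Mean over updates of the summed absolute weight changes."""
    return np.abs(np.diff(weights, axis=0)).sum(axis=1).mean()

# Hypothetical weekly returns and a short weight history (rows are rebalance dates).
returns = np.array([-0.10, -0.04, -0.02, 0.00, 0.01, 0.01, 0.02, 0.03, 0.03, 0.05])
var_20, cvar_20 = var_cvar(returns, 0.20)
max_weekly_loss = returns.min()
weights = np.array([[0.5, 0.5], [0.6, 0.4], [0.6, 0.4]])
turnover = mean_turnover(weights)
```

With only ten observations a 20% level is used so that the tail is not empty; on the 500 weekly returns in the study the 5% and 1% levels would apply.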

18
Winsorization Results
19
Contamination Models

MCD:
Each row (time observation) comes either from $F_0$ or from $H$.
This implies either a bad day on the market (all
stocks) or a high correlation among stocks. In
fact, this is rarely true.
Pairwise:
Pairwise correlation permits a more flexible
error model. Unusual market returns only explain
a small part of observed outliers. Industrial
factors and idiosyncratic risk specific to
individual stocks or groups of stocks explain a
majority of the outlying data.

20
Too Much Turnover?

The mean-variance portfolio optimization problem
can be re-expressed as
[equation missing from transcript]
subject to $w^T \mu = \mu_p$ and $w^T e = 1$. One way to
possibly reduce turnover would be to penalize
deviations from the market weights, $m_j$, and, at
the same time, look for sparse solutions that do
not invest any funds in some securities. The
LASSO (Tibshirani, 1996) does exactly this. More
robust loss functions such as L1 and Huber may
also be used instead of least squares, but did
not change the results significantly.

21
Penalization

To implement this we solve (Laupréte, 2001)
[equation missing from transcript]
and use 5-fold cross-validation to find $\lambda$ based
on prediction error for the out-of-sample data.
The recently developed LARS (least-angle
regression) algorithm (Efron et al., 2004)
greatly speeds up computations for the Lasso
since solutions for all $\lambda$ can be found in about
the same time as one least-squares regression.
This removes the need for a (non-specific) grid
search on $\lambda$.
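
The objective solved on the slide is not in the transcript. As a generic illustration of an L1 penalty on deviations from reference weights, the sketch below minimizes $(1/2T)\lVert y - Xw \rVert^2 + \lambda \lVert w - m \rVert_1$ by coordinate descent for a single $\lambda$. It is not the LARS path algorithm, it ignores the two portfolio constraints, and `x`, `y`, `m` are hypothetical placeholders rather than the authors' regression setup.

```python
import numpy as np

def soft_threshold(z, t):
    """Shrink z towards zero by t, returning 0 when |z| <= t."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_deviation(x, y, m, lam, n_iter=200):
    """Minimize (1/2T)||y - x w||^2 + lam * ||w - m||_1 by coordinate descent."""
    t_obs, n = x.shape
    d = np.zeros(n)                       # d = w - m, start at the reference weights
    resid = y - x @ m
    col_sq = (x ** 2).sum(axis=0) / t_obs
    for _ in range(n_iter):
        for j in range(n):
            resid += x[:, j] * d[j]       # remove coordinate j from the fit
            rho = x[:, j] @ resid / t_obs
            d[j] = soft_threshold(rho, lam) / col_sq[j]
            resid -= x[:, j] * d[j]       # put the updated coordinate back
    return m + d

# Synthetic data (assumption): 60 periods, 4 assets, equal reference weights.
rng = np.random.default_rng(0)
x = rng.normal(size=(60, 4))
y = x @ np.array([0.4, 0.3, 0.2, 0.1]) + 0.01 * rng.normal(size=60)
m = np.full(4, 0.25)
w_pen = lasso_deviation(x, y, m, lam=0.05)
```

A large `lam` returns the reference weights unchanged and `lam = 0` recovers ordinary least squares; values in between leave some weights exactly at their reference values, which is what limits turnover.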

22
Penalty Results

We end up with slightly better performance and
dramatically lower turnover.

23
Run Times

Time for 500 rebalancings:
V: 40 seconds
F2D-Winsor: 35 minutes
I2D-Winsor: 3 hours
Fast-MCD: 10 hours
V1: 4 hours

24
Next Steps

Combine robust covariance and penalty approaches.
Use individual stocks (about 700) instead of 51
sector index funds. Then $T = 100 \ll N = 700$.
Fast algorithms.

25
References

Alqallaf, F.A., et al. (2002) Scalable robust
covariance and correlation estimates for data
mining, Proceedings of the Eighth ACM SIGKDD
International Conference on Knowledge Discovery
and Data Mining, Statistical methods, 14-23.
Efron, B., Hastie, T., Johnstone, I. and Tibshirani,
R. (2004) Least angle regression, Annals of
Statistics, 32, 407-499.
Khan, J., Van Aelst, S. and Zamar, R. (2005)
Robust linear model selection based on least
angle regression, Technical Report, Department of
Statistics, University of British Columbia.
Laupréte, G.J. (2001) Portfolio risk minimization under
departures from normality, MIT PhD Thesis.
Maronna, R.A. and Zamar, R.H. (2002) Robust
estimates of location and dispersion for
high-dimensional datasets, Technometrics, 44(4),
307-317.
Rousseeuw, P.J. and Van Driessen, K. (1999) A
fast algorithm for the minimum covariance
determinant estimator, Technometrics, 41(3),
212-223.
Tibshirani, R. (1996) Regression shrinkage and
selection via the Lasso, Journal of the Royal
Statistical Society, Series B, 58, 267-288.
Zhou, X. (2006) Application of Robust Statistics
to Asset Allocation Models, MIT M.S. Thesis.
