
Libera Università Internazionale degli Studi Sociali

FACOLTÀ DI ECONOMIA E FINANZA

Degree Programme in Finance

Degree Thesis

An Advanced Application of Black-Litterman

Candidate:

Luca Simone

Student ID 699841

Supervisor:

Prof. Saverio Massi Benedetti

Co-supervisor:


An Advanced Application of Black-Litterman

Luca Simone

October 6, 2019


Contents

1 Introduction
2 Background
  2.1 Portfolio Construction Process
  2.2 Markowitz and the Mean-Variance framework
  2.3 Utility functions
  2.4 Criticism of Markowitz and newer findings
  2.5 Bayesian Statistics
3 Black-Litterman
  3.1 Neutral Expected Returns
  3.2 Subjective Views
4 An Advanced Application of Black-Litterman
  4.1 ABL
  4.2 ARMA
  4.3 GARCH
  4.4 Skewed t-Distribution
  4.5 Copula
    4.5.1 ABL vs. BL
5 Implementation
  5.1 Data
  5.2 Procedure
6 Results
  6.1 ABL vs. BL (continued)
    6.1.1 Basic Scenario
    6.1.2 Advantageous Scenario
    6.1.3 Disadvantageous Scenario
  6.2 Performance
7 Conclusion
8 Acknowledgements
9 Code


1 Introduction

The investment process has changed dramatically over time, along with the development of new tools in Finance. We have observed a gradual switch from so-called naïve investing to fully automated quantitative strategies. Nowadays, most strategies one can think of can be coded in software and turned into reality. Investment strategies are many and differ based on time horizon, the risk profile of the investor, market conditions and the wealth available.

The starting point of modern portfolio construction is the basic mean-variance framework (or Markowitz model), dating back to the 1950's. It was a major breakthrough in Finance and introduced a new way of rethinking investments, but it still contained many limitations. Research was sparked by this innovation, and this clever yet rudimentary model has been enhanced, modified and fine-tuned. The products of these changes have in turn been subject to similar transformations, leading to a great multitude of models with different characteristics. One such model is the Black-Litterman model and, when it was developed in the 1990's, it constituted another major breakthrough in the field. In particular, it solved major problems of the Markowitz model and produced a quantitative way for an investor to express his own beliefs over future outcomes. This model came along with the mathematics and theory backing it up, making, on the one hand, huge progress in the investment scene but also making, on the other hand, the portfolio construction process more complex. However, this did not constitute a major applicability problem, because the model provides a closed-form solution. This means that it can be applied by anyone, without necessarily understanding the mathematics or economics behind it.

As with every other novelty, the Black-Litterman model was studied in depth and modified in multiple ways to fit the specific needs of investors and overcome some limitations. These modifications, however, give up the advantage of having a closed-form solution. This means that the various steps taken in the actual construction of the new models, often highly demanding in terms of mathematical or coding knowledge, have to be carried out by the individual investor without the help of a universal formula. For this reason, most of these new models may never be accessible to some investors.

Among the assumptions relaxed by the modified versions of Black-Litterman there is the normality of returns. A complex model produced by Attilio Meucci overcomes this assumption and some others, but its extreme technicality puts it out of reach of non-professional investors.

In this thesis we propose a model which overcomes the same assumptions in a more heuristic and accessible way. This model will be constructed step by step and compared to a benchmark in order to measure its performance and discover its strengths and weaknesses.


2 Background

2.1 Portfolio Construction Process

As a first step, we try to answer the following question: what is an investment? A simplistic way to define it is as a way to move resources across time, so that we can have more tomorrow by giving up something today (or vice versa). The ways in which we can invest are infinite, and the first problem imposed on us is how to allocate the wealth available for investment. This process is called Asset Allocation and concerns the choice of the asset classes to invest in. An asset class can be defined as a group of securities with a set of characteristics that are considered to be similar (e.g. all US equities, hedge funds, Italian corporate bonds). Notice that in asset allocation we are also deciding where we are investing in a geographical sense (in Italy, in the US, in Russia, and so on). Moreover, an asset class is broadly defined and can set its "boundaries" at different levels, leaving room for interpretation. An investor should make considerations over the outlook of the market and of the individual asset classes before constructing his investment portfolio.

In reality, some strategies do not require any qualitative analysis and can be applied in a purely "mathematical" way. These strategies usually belong to what is called technical analysis and select individual securities, regardless of their asset class, based on some indicator. Technical analysis is beyond the scope of this discussion but, to mention some common examples, it includes mean-reverting strategies and price momentum.

Another way in which we can allocate wealth is the so-called sector rotation strategy. This strategy looks at the business cycle and tries to predict its next moves. This allows an investor to choose a sector whose performance is positively correlated with the particular phase of the cycle. For example, at a market peak we could invest in IT and industrials, whereas in a market trough we look more at consumer staples.

These dynamic strategies can be considered short term, as they change over time to adapt to market movements. In our discussion, though, we will focus on more long-term strategies.

For an investor with a long-term horizon, it is the study of the overall market risk-return framework which leads to a first allocation of resources among asset classes; this process is called Strategic Asset Allocation (SAA). In this part of the process, portfolio managers decide the weights assigned to each asset class and then proceed to selecting the best securities within each one of them (in accordance with the asset class weights). The SAA is usually reviewed annually but, in normal times, it is not affected by recent market changes and has a horizon of roughly five years. Due to its long-term horizon, the portfolio is usually not adjusted to contemporaneous news and is held until the predefined horizon. Whenever the investor (or management) decides to take short- or medium-term bets, an alternative solution is given by Tactical Asset Allocation (TAA), in which the initial investment weights can be changed more frequently to adapt to temporary market changes. An SAA and a TAA can coexist in the following way: the initial strategic allocation is tilted in favour of or against the assets and/or asset classes on which we wish to bet, creating a tactical bet. These weights are reverted back to the SAA (or to a new TAA) as soon as the bet is realized. Which of the strategies we use depends on the conditions of the market. In a volatile market it may be more logical to use a TAA, whereas in a trending and predictable market it can be more efficient to rely on an SAA. A mix of the two can also be employed whenever it best suits the needs of the investor.

A portfolio can also be constructed in a naïve way, that is, just through common-sense diversification and intuition over the future. For instance, an investor can decide a priori that he wants to be invested in:

• 40% US equity
• 30% Global Bonds
• 30% Real Estate

These numbers are purely discretionary, but this could actually constitute a real-life example. We can reach a similar result, with percentages allocated across asset classes, in more advanced ways. The strategies we will consider involve quantitative optimization.

2.2 Markowitz and the Mean-Variance framework

The models used nowadays are many but, not so long ago, the naïve way was the only available solution. It is thanks to the mean-variance optimization process, first proposed by Markowitz (1952), that the world of optimization developed so widely.

As we know, the driver of return is risk: there is "no free lunch" and we cannot have the former without the latter. This first optimization created an unprecedented mathematical way to deal with the trade-off between risk and return by formalizing their measures for all individual assets:

• Risk is measured by standard deviation (for asset i, σ_i);

• Returns are proxied by their expectations (for asset i, μ_i).

The optimization accounts for the multitude of assets in the market and requires these two measures for all of them. Moreover, the way in which assets behave in relation to one another is measured by their covariance (for assets i and j, σ_ij).

With the use of this information, Markowitz provides the formulas for the expected return and risk of a portfolio, given the vector of weights (w_i) invested in each of the n assets at disposal:

E(R_p) = \sum_{i=1}^{n} w_i \mu_i

\sigma_p^2 = \sum_{i=1}^{n} \sum_{j=1}^{n} w_i w_j \rho_{ij} \sigma_i \sigma_j


The formula for portfolio variance relies on the fact that the covariance of two assets is the product of their standard deviations and their correlation:

\sigma_{ij} = \rho_{ij} \sigma_i \sigma_j

In the summation, when i = j, ρ_ij becomes ρ_ii = 1 (an asset has perfect correlation with itself). Therefore, σ_ii = σ_i² (the covariance of an asset with itself is its variance).

Imagine an investing universe made up of n assets; we wish to assign to each of them an expected return and quantify its risk. Moreover, we can estimate the correlation among the n assets to understand how they perform relative to each other. The concept of diversification in finance rests on the fact that assets do not covary perfectly and that there is an optimal way to take advantage of this. Notice from the formula for the variance of the portfolio that a correlation ρ_ij equal to 0 eliminates the cross-term contribution of the two assets to the risk of the portfolio. A negative correlation does even more, and reduces the overall risk of the portfolio. The secret of diversification, as we can see, lies in the correlation among the assets.

Markowitz provided, for the first time, a mathematical way to incorporate the findings above and produce efficient portfolios. Before we look at the way it actually works, let us consider more in depth the concepts we are dealing with: risk and expected returns.

An expected return is the forecast of the realized return of an asset, which we cannot see today but will observe tomorrow. We need an a priori measure to quantify the possibility that the realized return differs from our expectation. This possibility can be interpreted as "risk", and many ways to measure it are available. The most easily understood, and also the most widely used, is the standard deviation (also known as volatility). The standard deviation, as we have seen, is represented by σ and is the positive square root of the variance of an asset's return time series; it can be easily computed from historical data and it can proxy the standard deviation of future returns.

Among the issues with this measure there is the fact that the standard deviation is symmetric. This means that risk concerns downside events as well as upside events (in practical terms, an asset that performs much better than expected would still be considered risky).

Extending the concept of standard deviation to a multivariate framework, it is convenient to introduce vector notation. If we are dealing with n assets, the covariance matrix (notation: Σ) is the matrix whose diagonal contains the variances of the n assets (σ_i²) and whose off-diagonal elements contain their covariances (σ_ij).

The structure of a covariance matrix Σ is the following:

\Sigma = \begin{pmatrix} \sigma_1^2 & \sigma_{12} & \dots & \sigma_{1n} \\ \sigma_{21} & \sigma_2^2 & \dots & \sigma_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{n1} & \sigma_{n2} & \dots & \sigma_n^2 \end{pmatrix}


EWMA stands for "Exponentially Weighted Moving Average" and is a common and straightforward way to measure volatility. The box below provides its description.

EWMA volatility

The main advantage of this method is that it enables us to differentiate between recent market events and events of a more distant past. The formula of this method is the following:

\sigma_t^2 = \lambda \sigma_{t-1}^2 + (1 - \lambda) r_{t-1}^2

What the formula tells us is that today's variance (σ_t²) is a function of yesterday's variance (σ_{t-1}²) and yesterday's squared return (r_{t-1}²). If we look at the formula recursively, at any time t, σ_t² incorporates all the squared returns of the series up to that date, excluding the contemporaneous one. We understand now that the key parameter is the "memory" coefficient, λ, which weights the last squared return against all the previous ones.

For this reason, volatility will be more or less "reactive" based on the weight assigned to the last squared return (1 − λ): for λ close to 0 volatility is very responsive, and for λ close to 1 it is more "sticky". For our purposes we will follow RiskMetrics, which calibrates λ = 0.94.
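As an illustration, here is a minimal sketch of the EWMA recursion (written in Python for readability, although the thesis implementation uses MATLAB); seeding the recursion with the sample variance and using a simulated return series are assumptions made only for this example:

```python
import numpy as np

def ewma_variance(returns, lam=0.94):
    """Recursive EWMA variance: sigma2_t = lam*sigma2_{t-1} + (1-lam)*r_{t-1}^2."""
    returns = np.asarray(returns, dtype=float)
    sigma2 = np.empty_like(returns)
    sigma2[0] = returns.var()          # seed with the sample variance (an assumption)
    for t in range(1, len(returns)):
        sigma2[t] = lam * sigma2[t - 1] + (1 - lam) * returns[t - 1] ** 2
    return sigma2

# Example: EWMA volatility of simulated monthly returns with the RiskMetrics lambda
r = np.random.default_rng(0).normal(0.005, 0.04, size=250)
vol = np.sqrt(ewma_variance(r, lam=0.94))
```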

Switching to the other main ingredient of Markowitz's model, expected returns, we have a very basic approach to quantify them: the average return of the different assets. This average can serve as a proxy for future returns. Of course, there are ways to generate expected returns with more sophisticated procedures, but the arithmetic (or, alternatively, geometric) mean represents an immediate and simple approach.

Note that for both risk and expected returns we are using a historical approach: the quantities used today stem from past observations. The underlying assumption is that the past is representative of the future. This is not usually a desirable property in Finance, and it can lead to misleading decisions. To use a typical financial expression, it is like driving a car just by looking at the rear-view mirror.

The risk-return trade-off in creating a portfolio marks the birth of what is called "Modern Portfolio Theory", or simply MPT. Higher returns require more risk and, vice versa, if we do not want to face risk we cannot expect high returns.

Going back to the framework with n available assets, we can invest our budget across them in many possible ways (assigning all possible weights to all assets). Every portfolio we can construct is defined as achievable. If we plot all the achievable portfolios based on their risk and return, we obtain a graph similar to the one below. In the graph, the blue shaded area represents all achievable portfolios, but only the ones on the so-called "Efficient Frontier" are efficient. These portfolios are considered efficient because they are the ones that have, for their level of risk, the highest expected return.


Figure 1: An example of the Efficient Frontier

Efficient, in Markowitz's terms, means that there is no portfolio with a higher expected return for that level of risk (standard deviation) or, the other way around, no portfolio that achieves the same expected return while facing less risk. As a consequence, all rational investors are going to construct one of the portfolios that lie on the Efficient Frontier.

Note that this breakthrough concept introduced by Markowitz also finds applications elsewhere. As a matter of fact, later on we will use the same concept while looking at other models; that is, the Efficient Frontier will be constructed given new ways to specify risk and returns, producing similar results.

Each point on the frontier is a portfolio in which every asset has a weight. For these portfolios we can compute, as before, the expected return and variance. Using matrix notation:

\mu_p = \alpha_1 \mu_1 + \dots + \alpha_i \mu_i + \dots + \alpha_n \mu_n = \alpha' \mu

\sigma_p^2 = \alpha' \Sigma \alpha

This notation uses the following elements:

• α is the ordered vector of asset weights (α_i);

• μ is the ordered vector of asset expected returns (μ_i);

• Σ is the covariance matrix of the assets (containing the σ_i²'s and the σ_ij's).

The Efficient Frontier is a continuous curve in the graph provided but, if we are only allowed to invest in discrete weights (for example, changing weights by 1 basis point at a time), it becomes a step function. Later on, we will construct our Efficient Frontiers using 25 points (25 portfolios).
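As a small illustration of the matrix notation above, a minimal Python sketch computing the expected return and variance of a portfolio; all numbers are made-up placeholders, not data used in the thesis:

```python
import numpy as np

# Hypothetical inputs for three assets (illustrative numbers only)
mu = np.array([0.06, 0.04, 0.08])            # expected returns
sigma = np.array([0.15, 0.10, 0.20])         # standard deviations
rho = np.array([[1.0, 0.3, 0.2],
                [0.3, 1.0, 0.4],
                [0.2, 0.4, 1.0]])            # correlation matrix
Sigma = np.outer(sigma, sigma) * rho         # covariance: sigma_ij = rho_ij * sigma_i * sigma_j

alpha = np.array([0.5, 0.3, 0.2])            # portfolio weights

mu_p = alpha @ mu                            # mu_p = alpha' mu
var_p = alpha @ Sigma @ alpha                # sigma_p^2 = alpha' Sigma alpha
print(mu_p, np.sqrt(var_p))
```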


2.3 Utility functions

But how do we choose the best point on the frontier? To answer this question we need some additional information, specifically a utility function. Utility can be seen as the personal satisfaction of an individual. Because it is not easily quantifiable, utility functions have the ambitious goal of assigning a value to this degree of satisfaction. What we know is that satisfaction, in this framework, is a function of two inputs, risk and return. Inevitably, higher returns imply more satisfaction but, at the same time, more risk. Bearing risk, on the other hand, lowers our level of satisfaction, leading us once more to the risk-return trade-off. What a utility function does in practice is:

1. Provide a rule for the trade-off between risk and return;

2. Take the efficient frontier portfolios and, through their expected returns and variances, compute the utility of each of them using that rule;

3. Return the portfolio with the highest utility.

In the financial literature there are different families of utility functions, which differ significantly in their implications. They are not covered in this paper because we will use, in accordance with common practice in Finance, an exponential utility function, which is defined as follows:

U(W_{t+1}) = -\exp(-\lambda W_{t+1})

where W_{t+1} is next period's wealth and λ > 0 (different from the λ we encountered in EWMA) is the absolute risk aversion coefficient.

The aim of the optimization is now to maximize a priori the expected utility of tomorrow's wealth:

E[U(W_{t+1})]

To do this we must first quantify tomorrow's wealth. W_{t+1} depends on the outcome of our investments today. We have, as before, n assets to invest in, plus a risk-free asset which we now introduce. The assumption that there is a risk-free asset means that we can invest in it as much as we want (negative investment is also allowed and means borrowing) and obtain a return r_f which has, by definition, zero risk (in the case in which we borrow, we simply return after one period the amount borrowed times 1 + r_f).

Based on how much we allocate today across the risk-free asset and the risky assets, we generate a certain wealth W_{t+1} tomorrow.


We now look for the optimal weights assigned to the risk-free asset and all other assets (the vector α). The maximization problem, over α, can be written as:

\max_{\alpha} \; E[U(W_{t+1})]

It can be shown that, using exponential utility, the optimal solution to this problem is equivalent to the optimal solution of the following simpler problem:

\max_{\alpha} \; \mu_p - \frac{\lambda}{2} \sigma_p^2

As we know, we can rewrite μ_p and σ_p² as:

\mu_p = \alpha' \mu

\sigma_p^2 = \alpha' \Sigma \alpha

In this way we can express return and variance as functions of α. The coefficient λ is, therefore, the only unknown that prevents us from using first-order conditions and solving for α. This parameter λ is specific to each investor and quantifies how averse the investor is to taking risk. Assigning a value to this coefficient can be very hard and is a topic which falls under the scope of behavioural finance. Let us say that the value of this coefficient ranges between 1 and 8 and that, on average, it is approximately 2.5-3. A higher value of λ implies more risk aversion and, as a consequence, less risky portfolios. A lower λ means higher risk tolerance and riskier portfolios.

The solution α* to the problem above is given by the following closed-form expression:

\alpha^{*} = \frac{1}{\lambda} \Sigma^{-1} \mu

Substituting the individual risk aversion coefficient of an investor provides the optimal weights for that investor.

Given that the average coefficient λ in the overall market is approximately 2.5, one can plug in λ = 2.5 in order to find the weights α which constitute the current market capitalization. This procedure will find an application later on in our discussion.
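A minimal sketch of this closed-form solution, reusing the hypothetical three-asset inputs from the earlier example (placeholder numbers only):

```python
import numpy as np

# Hypothetical inputs (same illustrative covariance as the earlier example)
mu = np.array([0.06, 0.04, 0.08])
Sigma = np.array([[0.0225, 0.0045, 0.0060],
                  [0.0045, 0.0100, 0.0080],
                  [0.0060, 0.0080, 0.0400]])
lam = 2.5                                   # assumed average market risk aversion

# Closed-form optimum: alpha* = (1/lambda) * Sigma^{-1} mu
alpha_star = np.linalg.solve(Sigma, mu) / lam
print(alpha_star)                           # optimal risky-asset weights for this investor
```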

2.4 Criticism of Markowitz and newer findings

Although Markowitz's model remains a milestone in Finance, it has some limitations which have been studied and, in some cases, overcome by more modern research.

A major critique of this model is given by Michaud (1989), in which the author claims that mean-variance optimization is actually an "estimation-error maximizer", meaning that the inputs estimated historically generate portfolios which tend to overweight assets with high estimation error in returns and low estimation error in risk (and vice versa). Michaud claims that the optimization proposed by Markowitz is really optimal only when the true population parameters are known. As we have seen, though, in Markowitz these inputs are produced using a historical method. The consequences of this can be disastrous for an investor, as there is a high chance that the returns delivered by the market will be less advantageous than the ones expected for the portfolio.

Moreover, especially when short selling is allowed, we face a further problem known as "corner solutions". The issue is that, given some inputs for returns and volatility, we may obtain a set of optimal weights which is very unstable. By unstable we mean that the model relies heavily on the precision of the inputs and, if the latter are changed slightly, large changes happen to the portfolio composition. This translates into high model risk, on top of "normal" market risk: not only do we face the risk of market fluctuations when we use our investment model, we also face the risk that the model itself does not produce results that match our intentions.

Because the main issue is that in Markowitz the inputs are taken as 100% certain, there are two possibilities available:

1. A heuristic approach, in which we do not take just one set of inputs; rather, we take many different inputs and average them out.

2. A Bayesian approach, in which we take the inputs as given on the one hand and then make use of Bayesian statistics to mix them with our own views.

The first method is primarily represented by the Resampled Frontier™ proposed by Michaud in 1998, which takes a bootstrapping approach. Michaud, through his licensed methodology, resamples the dataset thousands of times and produces thousands of Efficient Frontiers, which are then averaged. This produces a final Efficient Frontier which does not present the mean-variance problems described above.

The second option, which makes use of Bayesian statistics, is covered by the famous Black-Litterman (BL) model, developed in 1990. Because we are going to deal with this latter approach, let us introduce Bayesian statistics before presenting Black-Litterman formally.

2.5 Bayesian Statistics

From a mathematical perspective, it is worthwhile briefly discussing the maths underlying the construction of the BL model. In the proposed model for portfolio construction we make use of the theory provided by Bayesian statistics. In general, Bayesian probability theory is based on the principle that observed data should somehow help in estimating the probabilities of future events. Using this approach, we will update the information given to us by the market with our private information.

From probability theory we know that the joint probability of two events can be decomposed as:

p(x, y) = p(y|x)p(x)


or alternatively:

p(x, y) = p(x|y)p(y)

By equating the two expressions above we obtain the most important result in this field of statistics, Bayes's Theorem. The latter is formulated as:

p(x|y) = \frac{p(y|x)\,p(x)}{p(y)} \propto p(y|x)\,p(x)

The ∝ symbol means "proportional to" and allows us to disregard the denominator of the expression.

In Bayesian statistics, x is usually an event and y some observed data; the probabilities then acquire the following interpretations:

• p(x) is the prior probability of event x.

• p(y|x) is the likelihood function: the probability of the "evidence" y given that x is true.

• p(x|y) is the posterior probability of x: the new probability we assign to event x given that we have observed y.

The posterior, p(x|y), is proportional to the likelihood, p(y|x), times the prior, p(x). To reconnect with the criticisms of Markowitz, this methodology can be used to mix the inputs we have with our views in order to construct more reliable results.
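As a small numerical illustration of the update rule, with hypothetical probabilities that are not taken from the thesis:

```python
# Bayes's theorem on a toy example: event x = "the market regime is bearish",
# evidence y = "a negative monthly return is observed". All numbers are assumptions.
p_x = 0.30            # prior probability of a bearish regime
p_y_given_x = 0.70    # likelihood of a negative month in a bearish regime
p_y_given_notx = 0.40 # likelihood of a negative month otherwise

p_y = p_y_given_x * p_x + p_y_given_notx * (1 - p_x)   # total probability of the evidence
p_x_given_y = p_y_given_x * p_x / p_y                  # posterior probability
print(round(p_x_given_y, 3))   # 0.429: the evidence raises the prior from 0.30
```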


3 Black-Litterman

The Bayesian approach can be used to incorporate beliefs about market outcomes into our estimation. In particular, we place ourselves in the shoes of an asset manager who wants to base his asset allocation on his beliefs. We could start by using the market portfolio as a starting point and then tilt the weights in each asset class according to our views, overweighting whenever we have positive views and underweighting in the opposite situation. This approach could in principle be used for investing across all securities, expressing views on each, but it could lead to dimensionality issues whenever we are dealing with many assets. The Black-Litterman model was developed in 1990 by Fischer Black and Robert Litterman at Goldman Sachs, and it can easily be applied by means of a closed-form solution. The result of the model, which we will present later on, is a vector of expected returns produced by the views expressed.

The reason why the mix between views and sample information will lead to better results than plain mean-variance is related to the concept of shrinkage. We describe this concept by taking a small step back and presenting a fundamental fact:

Stein's Paradox (1956)

Consider the estimation of the mean of n multivariate normal random variables X_i ∼ N(μ_i, 1).

The paradox tells us that:

- If n = 1 and we observe X_1, the best estimator for μ is \hat{μ} = X_1;

- If n = 2 and we observe (X_1, X_2), the best estimator for μ is \hat{μ} = (X_1, X_2)';

- If n = 3 and we observe (X_1, X_2, X_3), the best estimator for μ IS NOT \hat{μ} = (X_1, X_2, X_3)'.

Meaning that estimating the expected returns μ_i separately is not appropriate for n > 2.

A better estimator has instead the following form:

\hat{\mu}_i^{\text{shrinkage}} = \delta \cdot \mu_0 + (1 - \delta) \cdot \hat{\mu}_i

The intuition is that the simple averages \hat{μ}_i are inefficient estimates; therefore, they are "shrunk" towards a non-sample target value, μ_0, which can be determined in different ways. The paradox also states that any shrinkage target leads to a better estimation of the means. Commonly used targets μ_0 are:

• zero


The same concept applies when estimating a covariance matrix. Imagine we wish to estimate a covariance matrix; we can take the two following paths:

1. Use a single-factor model covariance, call it F (the CAPM could be an example of a single-factor model);

2. Use the sample covariance matrix, Σ.

The first one is a highly structured estimator, which assumes that a single factor explains all there is to know and has no estimation error as long as the model is true; the second one has no structure but is subject to a lot of estimation error. A shrinkage estimator tries to mix the two by taking the structure imposed by the model and combining it with the sample estimate to reduce its estimation error. The two estimators are combined by means of a third ingredient, the shrinkage constant, δ. This constant has the purpose of minimizing the distance between the true covariance matrix and the shrinkage estimator:

\hat{\Sigma}^{\text{shrinkage}} = \delta F + (1 - \delta) \Sigma

Because we cannot observe the true covariance matrix, we cannot compute the distance between the estimated and the true values. For this reason the shrinkage constant, δ, must also be estimated. There are many possible ways in which this can be done; in our framework, \hat{δ} is computed based on the uncertainty of the estimates (more uncertainty reduces the weight assigned to the shrinkage component).
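A minimal sketch of the linear shrinkage combination; the structured target used here (a simple diagonal matrix) and the fixed value of δ are illustrative assumptions, not the single-factor target or the δ estimator described in the text:

```python
import numpy as np

def shrink_covariance(sample_cov, target_cov, delta):
    """Linear shrinkage: Sigma_hat = delta * F + (1 - delta) * Sigma_sample."""
    return delta * target_cov + (1.0 - delta) * sample_cov

# Illustrative inputs: a noisy sample covariance and a structured target
rng = np.random.default_rng(1)
returns = rng.normal(0.0, 0.05, size=(60, 3))      # 60 months, 3 assets (simulated)
Sigma_sample = np.cov(returns, rowvar=False)
F = np.diag(np.diag(Sigma_sample))                 # stand-in structured target (diagonal);
                                                   # the text mentions a single-factor model instead

Sigma_shrunk = shrink_covariance(Sigma_sample, F, delta=0.3)   # delta chosen arbitrarily here
```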

To summarize, in the Black-Litterman model shrinkage is applied to both the vector of returns and the covariance matrix. The interesting aspect of the model is the choice of the shrinkage elements: the shrinkage target is derived from an equilibrium condition (which, as we will see, corresponds to market-neutral weights) and the shrinkage constant is linked to our confidence over the outcomes.

In the attempt to mitigate the estimation error and model risk of the traditional mean-variance optimization, BL provides a new vector of expected returns which is a linear combination of the prior (market information) and the views, with the confidence in the latter (measured by their standard deviation) acting as a scaling factor. As mentioned, the model leads to a closed-form solution. The latter is:

E[\mu|v] = [(\tau\Sigma)^{-1} + P'\Omega^{-1}P]^{-1}\,[(\tau\Sigma)^{-1}\mu_{eq} + P'\Omega^{-1}v]

The variables which enter the formula are the following; some may be familiar already, others will be discussed shortly:

• μ: the final expected returns over the available assets;

• v: the vector of views over portfolios of assets;

• τ: interpreted as the confidence over the estimated covariance matrix;

• Σ: the covariance matrix;

• P: the selection matrix, used to describe our views;

• μ_eq: the market-neutral expected returns;

• Ω: the confidence over the views.

In practice, the model combines the implied expected returns of the market and the personal views of the investor. We obtain a final result which can be used for an efficient mean-variance optimization, the same as Markowitz's but with different "ingredients".
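For concreteness, a minimal sketch of the closed-form posterior above; all inputs, including τ = 1/T with an assumed T of 120 observations, are illustrative placeholders:

```python
import numpy as np

def bl_posterior(mu_eq, Sigma, tau, P, Omega, v):
    """Black-Litterman posterior:
    E[mu|v] = [(tau*Sigma)^-1 + P' Omega^-1 P]^-1 [(tau*Sigma)^-1 mu_eq + P' Omega^-1 v]."""
    tS_inv = np.linalg.inv(tau * Sigma)
    Om_inv = np.linalg.inv(Omega)
    A = tS_inv + P.T @ Om_inv @ P
    b = tS_inv @ mu_eq + P.T @ Om_inv @ v
    return np.linalg.solve(A, b)

# Illustrative three-asset example (placeholder numbers, not thesis data)
Sigma = np.array([[0.0225, 0.0045, 0.0060],
                  [0.0045, 0.0100, 0.0080],
                  [0.0060, 0.0080, 0.0400]])
mu_eq = np.array([0.05, 0.03, 0.07])
P = np.array([[1.0, 0.0, -1.0]])        # one relative view: asset 1 outperforms asset 3 ...
v = np.array([0.02])                    # ... by 2%
Omega = np.array([[0.01**2]])           # confidence in that view
mu_post = bl_posterior(mu_eq, Sigma, tau=1/120, P=P, Omega=Omega, v=v)
```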

We will consider the information derived from the market as our prior, and we will then mix it with our subjective views to obtain our posterior. In particular, we extract from the current market capitalization the vector of implied returns (μ_eq) for a given risk aversion, using the historical covariance matrix.

The reason why this model is considered a breakthrough in Finance is that it provides a versatile yet rational way to apply what is known as shrinkage. We will now address the description of the missing elements mentioned above: v, μ_eq, τ, P and Ω.

3.1 Neutral Expected Returns

Let us start off with μ_eq. Black and Litterman do not take as input the vector of average returns; they take instead the equilibrium vector of expected returns (μ_eq). They do this to avoid the noise generated by sample means. Moreover, using a sample mean would favour past winners over past losers, disregarding that we are interested instead in the future. The way they obtain the vector μ_eq of equilibrium returns is by reverse optimization. We know from Markowitz that the equilibrium weights are computed using the formula:

w = \frac{1}{\lambda} \Sigma^{-1} \mu

From the above we reverse-engineer the vector of market equilibrium returns:

\mu_{eq} = \lambda_{mkt}\, \Sigma\, w_{mkt}

Because we observe the weights (market capitalizations), we can simply reverse-compute the equilibrium vector of expected returns. The only thing we are missing is the value of λ_mkt to use.

To calibrate the parameter we can take advantage of some formula manipulation: take the result we found above for μ_eq and premultiply it by w'_mkt. This leads to:

w'_{mkt}\mu_{eq} = \lambda\, w'_{mkt}\Sigma w_{mkt} \;\Longrightarrow\; \lambda = \frac{w'_{mkt}\mu_{eq}}{w'_{mkt}\Sigma w_{mkt}} = \frac{\mu_{mkt}}{\sigma^2_{mkt}}

Using this trick to calibrate λ, we reach the conclusion that the ratio of the market's historical average return to its variance (a Sharpe-ratio-like quantity) can be used as a proxy for λ. In practice, this value is close to 2.5, and it proxies the average risk aversion in the market.
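A minimal sketch of the reverse-optimization step; the market weights, covariance matrix and market moments are illustrative placeholders:

```python
import numpy as np

# Illustrative inputs (placeholders)
Sigma = np.array([[0.0225, 0.0045, 0.0060],
                  [0.0045, 0.0100, 0.0080],
                  [0.0060, 0.0080, 0.0400]])
w_mkt = np.array([0.5, 0.3, 0.2])        # observed market-capitalization weights
mu_mkt, var_mkt = 0.05, 0.02             # assumed market mean return and variance
lam_mkt = mu_mkt / var_mkt               # lambda = mu_mkt / sigma^2_mkt (= 2.5 here)

mu_eq = lam_mkt * Sigma @ w_mkt          # implied equilibrium returns: mu_eq = lambda * Sigma * w
print(mu_eq)
```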


As our prior, we assume that returns are distributed with mean equal to the implied vector of returns. As the variance of the prior we take the scaled covariance matrix of returns. The way we scale the covariance matrix is based on the confidence that we have over the accuracy of market expectations, and this is where τ comes in. A value commonly given to the scaling parameter τ is 1/T, one over the number of observations, and this will also be our approach. Because Black-Litterman also assumes normality of returns, we end up with the following result for the market prior:

\mu \sim \text{MVN}(\mu_{eq}, \tau\Sigma)

To briefly comment on the above: we use a reverse-engineering technique to obtain the prior input. This prior corresponds to the equilibrium condition extracted from current market conditions.

3.2 Subjective Views

Once the investor has the market-neutral vector of expected returns at his disposal, he can point out some differences between what the market believes and what he believes as an individual. More specifically, with n assets at disposal, the investor can express relative or absolute views over a subset of k ≤ n linear combinations of the returns of the assets. His views are expressed as follows:

v = P\mu

The selection matrix P is used to choose the assets our views refer to, while v defines the views themselves, i.e. the values the investor expects for the selected combinations of returns. The way in which the various components interact is best described through the use of an example.

Example

There are 3 assets, each with its own expected return. The investor has an absolute view on the first two, which states that the return of the first will be 15% and the return of the second will be 10%. This can be summarized in the following way:

P = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix} \quad \text{and} \quad v = \begin{pmatrix} 15\% \\ 10\% \end{pmatrix}

Imagine now that the investor has a relative view that the difference between the returns of the first and third asset will be 7%; the information will be summarized as follows:

P = \begin{pmatrix} 1 & 0 & -1 \end{pmatrix} \quad \text{and} \quad v = \begin{pmatrix} 7\% \end{pmatrix}

Because the views are not exact, we also need a precision matrix, Ω, which determines the level of uncertainty in our views. We provide two examples of such a precision matrix: the one on the left stands for high confidence in our views and the one on the right for lower confidence:

\Omega = \begin{pmatrix} (1\%)^2 & 0 \\ 0 & (1\%)^2 \end{pmatrix} \quad \text{and} \quad \Omega = \begin{pmatrix} (10\%)^2 & 0 \\ 0 & (10\%)^2 \end{pmatrix}

The diagonal elements tell us the uncertainty we have on each of the views expressed. The off-diagonal elements represent the covariance between forecast uncertainties and are usually set to 0, meaning that the certainty of one view is not correlated with that of another. Setting the investor's confidence over the views is a hard job and cannot be done in a fully scientific way; the numbers are usually discretionary.
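A minimal sketch of how the view matrices of the example above could be assembled; the confidence levels simply mirror the illustrative (1%)² and (10%)² entries shown before:

```python
import numpy as np

# Two absolute views on a three-asset universe: asset 1 returns 15%, asset 2 returns 10%
P_abs = np.array([[1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0]])
v_abs = np.array([0.15, 0.10])
Omega_high_conf = np.diag([0.01**2, 0.01**2])   # (1%)^2 on the diagonal: high confidence

# One relative view: asset 1 outperforms asset 3 by 7%
P_rel = np.array([[1.0, 0.0, -1.0]])
v_rel = np.array([0.07])
Omega_low_conf = np.diag([0.10**2])             # (10%)^2: lower confidence
```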

Now that we have defined our prior (the market-neutral condition) and our views, by use of Bayesian methods we can mix the two and obtain the closed-form solution shown before. To summarize, the Black-Litterman (BL) model is constructed as follows:

Figure 2: BL Procedure

From a methodological point of view, BL mitigates some of the problems that come with plain mean-variance optimization. As we have seen, the investor uses market equilibrium weights as a starting point and then applies his views, if any. As a consequence, portfolio concentration is avoided automatically, because we start off with a prior which is already diversified. The investor has, on top of this, the opportunity to incorporate his views in the investment strategy, making the method versatile and practical. Still, it seems there is margin for improvement.


Two of the issues with Black-Litterman that we will discuss in our model are:

1. The assumption of normality of returns, which is in conflict with some basic results in Finance.

2. The construction of the confidence matrix Ω, which is completely arbitrary and does not account for the interdependence of the confidence over different views.

There has been wide research on these two issues, and some authors have proposed complex models overcoming them. To cite the most relevant, see the following approaches:

• The Entropy Pooling approach from Meucci (2008)

• COP (Copula Opinion Pooling) from Meucci (2006)


4 An Advanced Application of Black-Litterman

Before we present the new model proposed by this thesis, we briefly describe the advances being made around Black-Litterman and Bayesian asset allocation. One very prolific researcher in the field is Attilio Meucci, who has produced different types of models incorporating macro factors, non-normality of the market and entropy pooling (see Meucci (2009), Meucci (2006), Meucci (2008)). The problem with some of these models is that they are difficult to implement and use in practice, due to the complexity of the theory and of the computations required for their implementation. Moreover, they cannot be applied directly because they do not present any closed-form solution.

Some of the things that have been considered by Meucci are the properties of the marginal distributions of returns (the individual assets) and their dependence structure. In Finance, it is common knowledge that asset returns have negative skewness and excess kurtosis. This implies that a time series should not be assumed to be unconditionally normal but should rather be given another type of distribution.

Starting from this point, we try to formulate a simpler approach to tackle the same issues as the ones considered by Meucci. In this paper we will assume that the marginal distribution is the Skewed t-Distribution; the reasons for this will be provided. We will model the individual assets after clearing any form of predictability, by imposing a model for both the mean and the volatility. This also allows us to consider the fact that volatility is not constant in time but, as we would intuitively observe empirically, time-varying. To model means and volatilities we will introduce, respectively, ARMA and GARCH dynamics in the return series.

Because of this, we will obtain marginal distributions for the assets which are non-normal (Skewed t-Distribution). Moreover, the dependence structure which links the returns of these assets is not going to be a multivariate normal, as assumed by BL; it will instead be derived from a copula. A copula allows us to generate an ad hoc joint distribution between assets, so as to reflect observed data. We will choose a Student t copula, which allows for high dependence in the tails and reflects market behaviour (we tend to see co-crashes and co-booms in the market).

In the following sections we will proceed by steps: initially we will consider the time series of the assets independently, and then we will consider the dependence structure that links them. This process, we will show, allows us to infer an empirical correlation matrix which will be key in the advanced model we are proposing. We start by imposing an ARMA(1,1) on the individual time series so as to remove any serial correlation. Then, we proceed by applying a GARCH(1,1) to remove additional dependence in the residuals. The choice of these processes is in line with common practice in the industry. The residuals obtained at the end of the process will then be the subject of the distribution fitting, in particular the Skewed t-Distribution.


We present a compact description of the general model which describes the link between the dynamics involved. In the notation, r_t is the process for returns (ARMA) and σ_t is the process for volatility (GARCH):

r_t = \mu_t(\theta) + \varepsilon_t

\varepsilon_t = \sigma_t(\theta)\, z_t

z_t \sim g(z_t \mid \eta)

where z_t = (r_t - \mu_t(\theta)) / \sigma_t(\theta) represents the residuals after imposing the mean and volatility models.

In this framework, θ is a vector containing all the parameters of the mean and variance processes. Moreover, the innovations have zero mean and unit variance and are distributed according to a conditional distribution g(·) with shape parameters η. As previously mentioned, we take the mean process to be an ARMA and the variance process to be a GARCH. The distribution of the innovations will be the Skewed t-Distribution.

To summarize:

• Thanks to the Skewed t-Distribution, we eliminate the assumption of normality in the individual return series.

• Using a copula produces a Student t dependence structure, eliminating the assumption of normal interdependence of the assets.

• The correlation found through the copula will be used to model the interdependence in the uncertainty over the views (the off-diagonal elements of Ω).

4.1 ABL

The combination of the elements we have seen above brings us to the creation of a new model, which we call the Advanced Black-Litterman approach (ABL). This model tries to solve the issues of BL and accounts for the non-normality of returns by combining the market prior of BL with a modified version of the views. In particular, we perform the following steps:

1. Model the mean process for the asset data with an ARMA(1,1);
2. Model the volatility process for the asset data with a GARCH(1,1);
3. Fit a Skewed t-Distribution on the residuals;
4. Create a dependence structure among the assets which has a Student t distribution;
5. Extract a new correlation matrix which enters the original BL framework.

The model proposed here has been developed in this thesis for the first time and can be seen as a modified version of BL. It still keeps the advantage of having a closed-form solution, but it introduces the non-normal features of the market and allows for interaction among the views expressed (in terms of their confidence).


4.2 ARMA

The ARMA(1,1) is the first step in "cleaning" the data. Cleaning refers to the fact that we wish to arrive at a point in which deviations from expectations are purely random and unpredictable. An ARMA process allows us to link the expected return of an asset to its value in the previous observation and to the previous error term. An ARMA is a combination of two simpler processes, namely an AR and an MA. The first, an AutoRegressive process, assumes that returns depend on their past values, up to a certain number of past observations, p. An AR(p) is expressed as:

r_t = \phi_0 + \phi_1 r_{t-1} + \dots + \phi_p r_{t-p} + \varepsilon_t

By contrast, a Moving Average imposes that returns depend on the q previous errors. An MA(q) is therefore written as:

r_t = \theta_0 + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \dots + \theta_q \varepsilon_{t-q}

An ARMA(p,q) combines the two processes; therefore, returns depend both on the previous p returns and on the previous q error terms. In practice, though, an ARMA(1,1) is considered enough to model mean returns and takes the following form:

r_t = c + \phi r_{t-1} + \varepsilon_t - \theta \varepsilon_{t-1}

The purpose of the ARMA is to remove any form of serial correlation in the returns. In a normal-distribution environment, dependence is fully captured by correlation. In a non-normal framework, on the other hand, this is not enough and one should also consider dependence in functions of the error terms. It is common practice to also consider correlation in the squares of the residuals. This consideration brings us to the next step.

4.3 GARCH

A GARCH is the instrument commonly used to account for the correlation in squared residuals mentioned above. The implication of considering the square function is that it accounts for a well-known phenomenon in time series, namely volatility clustering. Volatility clustering can be described as the tendency of errors to be similar in magnitude to the following errors: in other words, big shocks (either negative or positive) are followed by big shocks, and the same holds for small shocks. This phenomenon is particularly evident in financial returns, as in times of business expansion we observe low volatility and trending markets, whereas during market crashes we observe very high volatility and large market fluctuations driven by the excitation of traders. The first attempt to model this behaviour was the AutoRegressive Conditional Heteroskedasticity process, or ARCH, proposed in Engle (1982). The model predicts squared errors by autoregressing over a number of previous squared errors. The main issue with this process is the number of lags required. For this reason a more efficient approach was introduced, the Generalized ARCH, or GARCH. A GARCH(p,q) models asset variances based on lags of the previous variances and of the squared shocks. The expression for a GARCH(p,q) is:

\sigma_t^2 = \omega + \alpha_1 \sigma_{t-1}^2 + \dots + \alpha_p \sigma_{t-p}^2 + \beta_1 \varepsilon_{t-1}^2 + \dots + \beta_q \varepsilon_{t-q}^2

In practice, the common choice is to use a GARCH(1,1) to account for conditional volatility dynamics; therefore, one lag of the variance and one lagged squared shock are included in the expression. In this specific case, the unconditional variance for the GARCH dynamics is:

\sigma^2 = \frac{\omega}{1 - \alpha - \beta}

Having arrived at this point, we can construct the standardized innovations which we introduced at the beginning of this section, namely z_t. They will be the input for the estimation of our distribution.
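As an illustration, a minimal sketch of the ARMA(1,1)-GARCH(1,1) filtering step that produces the standardized innovations z_t; the parameter values are assumed to be already estimated (e.g. by maximum likelihood) and are placeholders here, and the recursion follows the equations exactly as written above:

```python
import numpy as np

def arma_garch_filter(r, c, phi, theta, omega, alpha, beta):
    """Filter returns with an ARMA(1,1) mean and GARCH(1,1) variance and
    return the standardized innovations z_t = (r_t - mu_t) / sigma_t."""
    T = len(r)
    eps = np.zeros(T)
    sigma2 = np.full(T, omega / (1 - alpha - beta))   # start at the unconditional variance
    z = np.zeros(T)
    for t in range(1, T):
        mu_t = c + phi * r[t - 1] - theta * eps[t - 1]                       # ARMA(1,1) mean
        sigma2[t] = omega + alpha * sigma2[t - 1] + beta * eps[t - 1] ** 2   # GARCH(1,1), with
        # alpha on the lagged variance and beta on the lagged squared shock, as in the text
        eps[t] = r[t] - mu_t
        z[t] = eps[t] / np.sqrt(sigma2[t])
    return z

# Placeholder parameters applied to simulated returns
r = np.random.default_rng(2).normal(0.005, 0.04, size=250)
z = arma_garch_filter(r, c=0.002, phi=0.1, theta=0.05, omega=1e-5, alpha=0.9, beta=0.05)
```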

4.4 Skewed t-Distribution

To decide how to model the individual return time series, we need to account for a few characteristics of financial returns. We know that a return series normally presents negative skewness and excess kurtosis, meaning that extreme events are more frequent than under the normal distribution (due to the higher kurtosis) and are more likely to be negative than positive (due to the negative skewness). To provide an example, we present a table of descriptive statistics for a broad equity index, the MSCI World, using both daily and monthly returns available from 1980 until 8th July 2019.

          Mean     St.Dev   Skewness   Kurtosis   Min        Max
Daily     0.0272   0.0087   -0.5328    14.1028    -10.3633   9.0967
Monthly   0.5704   0.0429   -0.8606    5.3382     -21.1279   10.9473

Table 1: MSCI World Descriptive Statistics

We focus on skewness and kurtosis (excess kurtosis, to be precise) and notice that, at both the monthly and daily frequencies, we observe negative skewness and excess kurtosis. Furthermore, considering the minima and maxima confirms that the worst returns are greater in absolute value than the best returns. To further confirm the non-normality of the time series, we perform a standard Jarque-Bera test, which rejects normality at the 1% significance level.

This becomes even more evident if we plot the empirical distribution of returns versus a normal distribution.


Figure 3: Empirical distribution, monthly returns

Figure 4: Empirical distribution, daily returns

Now that we have shown that non-normality, and in particular negative skewness and excess kurtosis, is a standard characteristic of financial series, we ask ourselves the following question: which distribution is most appropriate? If we were to focus only on fat tails, the most obvious solution would be a Student t distribution which, for low degrees of freedom, makes extreme events more common and produces higher kurtosis. The issue with this distribution is that it is symmetric (it has skewness of 0). To solve the issue, the Skewed t-Distribution was introduced by Hansen (1994) and is defined as:


g(z_t \mid \eta) = b\,\frac{\Gamma\!\left(\frac{\nu+1}{2}\right)}{\sqrt{\pi(\nu-2)}\,\Gamma\!\left(\frac{\nu}{2}\right)}\left(1 + \frac{\zeta_t^2}{\nu-2}\right)^{-\frac{\nu+1}{2}}

where:

\zeta_t = \begin{cases} (b z_t + a)/(1 - \lambda), & \text{if } z_t \le -a/b \\ (b z_t + a)/(1 + \lambda), & \text{if } z_t > -a/b \end{cases}

a = 4\lambda c\,\frac{\nu-2}{\nu-1}, \qquad b^2 = 1 + 3\lambda^2 - a^2, \qquad c = \frac{\Gamma\!\left(\frac{\nu+1}{2}\right)}{\sqrt{\pi(\nu-2)}\,\Gamma\!\left(\frac{\nu}{2}\right)}

The mathematical definition of the distribution is complicated, but the principle behind it is easy to understand. The distribution treats upside and downside shocks differently, and the shape parameters λ and ν regulate by how much: λ is the asymmetry parameter (−1 ≤ λ ≤ 1) and ν represents the degrees of freedom of the distribution (2 < ν ≤ ∞). The shape parameters of the distribution can be readily estimated by maximum likelihood given the innovations z_t found previously. MLE is performed for all the different assets, and each one will be fitted with its appropriate parameters η. Once the marginals have been estimated for all assets, we proceed to aggregate the individual results and consider how they co-behave. For this purpose, we introduce a copula.
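To make the definition concrete, a minimal sketch of Hansen's skewed-t density as written above (a direct transcription of the formula, not the thesis code):

```python
import numpy as np
from scipy.special import gamma

def hansen_skew_t_pdf(z, nu, lam):
    """Density of Hansen's (1994) skewed t-distribution for standardized innovations z."""
    c = gamma((nu + 1) / 2) / (np.sqrt(np.pi * (nu - 2)) * gamma(nu / 2))
    a = 4 * lam * c * (nu - 2) / (nu - 1)
    b = np.sqrt(1 + 3 * lam**2 - a**2)
    z = np.asarray(z, dtype=float)
    # zeta uses (1 - lam) below the threshold -a/b and (1 + lam) above it
    zeta = np.where(z <= -a / b, (b * z + a) / (1 - lam), (b * z + a) / (1 + lam))
    return b * c * (1 + zeta**2 / (nu - 2)) ** (-(nu + 1) / 2)

# Example: negative skewness (lam < 0) and fat tails (low nu) -- illustrative parameters
x = np.linspace(-5, 5, 11)
pdf = hansen_skew_t_pdf(x, nu=6.0, lam=-0.2)
```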

4.5 Copula

When switching from univariate to multivariate series, an important aspect is how the behaviour of each marginal influences, and is influenced by, the others. To account for this, one needs a structure that considers multiple individual time series simultaneously. As we know, the observation of frequent crashes and booms in the market suggests that assuming a multivariate normal distribution would be wrong. This assumption would generate too few extreme events and, as a consequence, the correlation among the so-called tail events (very high or very low returns) would be very low.

To account for this, one could think that a multivariate Student t distribution, which allows for thick tails, is enough. As a matter of fact, if we simulate a bivariate normal series imposing a fixed correlation and then simulate a bivariate Student t imposing the same restriction, we obtain very encouraging results for the latter.

To illustrate the discussion above, we provide the results of the simulation of both bivariate distributions with the correlation imposed at 0.5. For the normal, we compute the overall correlation between the two series and then compute the correlation only in the lower tail (taken to be the portion below -2%). We repeat the same experiment for a bivariate Student t. The output produced is the following:


Figure 5: Bivariate Normal dependence

Figure 6: Bivariate Student t dependence

For the bivariate normal, we see that joint extreme events are extremely rare and small in magnitude (all events fall within the axis range of ±5%). The overall correlation for the randomly generated bivariate series is 0.5003, but it is only 0.1398 in the tails.

For the Student t bivariate series, instead, we have many more simultaneous extreme events, and when we compute the correlation in the tails it is 0.4454, much closer to the imposed correlation of 0.5. The overall correlation in the series was found to be 0.5009.
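A minimal sketch of this kind of experiment; the sample size, degrees of freedom and tail threshold are illustrative choices, so the resulting numbers will differ from those reported above:

```python
import numpy as np

rng = np.random.default_rng(3)
n, rho, nu = 100_000, 0.5, 4
cov = np.array([[1.0, rho], [rho, 1.0]])

# Bivariate normal with correlation 0.5
xn = rng.multivariate_normal([0.0, 0.0], cov, size=n)

# Bivariate Student t with the same correlation: a normal draw divided by sqrt(chi2/nu)
chi = np.sqrt(rng.chisquare(nu, size=n) / nu)
xt = rng.multivariate_normal([0.0, 0.0], cov, size=n) / chi[:, None]

def tail_corr(x, q=-2.0):
    """Correlation restricted to observations where both components fall below q."""
    mask = (x[:, 0] < q) & (x[:, 1] < q)
    return np.corrcoef(x[mask, 0], x[mask, 1])[0, 1]

print(tail_corr(xn), tail_corr(xt))   # the t series keeps far more dependence in the tail
```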

This feature, showing correlation of co-crashes and co-booms, is what we are looking for and, for this reason, we will find a way to specify our "view" over the dependence structure among the assets.

However, the results for the Student t distribution, encouraging on the one hand, forgo another important aspect: imposing a multivariate Student t with ν degrees of freedom to obtain dependence among the assets implies that the marginals also have the same distribution, a Student t with ν degrees of freedom. We would therefore lose the characteristics unique to the marginals which we have sought so intensively. To overcome this problem, a widely used instrument in finance is the copula. Copulas allow us to impose any dependence structure on a set of assets, without imposing any restriction on the marginals.

A bivariate copula is a function which is able to measure how any two given marginals co-behave. We report a theorem which guarantees their existence and defines them:


Sklar's Theorem (1959)

Let H be a joint distribution function of X and Y with marginal distributions F and G, respectively. Then:

- There exists a copula C : [0, 1] × [0, 1] → [0, 1] such that, for all real numbers (x, y),

H(x, y) = C(F(x), G(y));

furthermore, if F and G are continuous, C is unique.

- Conversely, if C is a copula and F and G are univariate distribution functions, then H(x, y) = C(F(x), G(y)) is a joint distribution function with marginals F and G.

From the above we understand that we can, in principle, assign joint probabilities to any given marginal distributions. As we are dealing with Skewed t-Distribution marginals, we can impose a dependence structure belonging, for example, to a normal or a Student t distribution. This produces an ad hoc joint distribution, which has all of the desired properties (both at the univariate and at the multivariate level). Copulas can be of different types, namely empirical, elliptical or Archimedean. Empirical copulas are, as the name suggests, constructed from the empirical probabilities. Archimedean copulas are produced by specific functions called copula generators; this family is very attractive because it can produce asymmetric dependence structures at the bivariate level, but it is not easily extendable to multiple series. Because they are the most suited for multivariate analysis, we will focus on elliptical copulas. The most important elliptical copulas are based on the Gaussian, Student t, Cauchy and Laplace distributions. We will base our selection of the copula on the findings above and choose the one that best allows for the co-crash/co-boom feature. For this reason, the properties of the Student t copula suit our purposes best.

Switching back from their mathematical description to their practical use, we focus now on the concrete purpose of this process. It is possible to fit a Student t copula by adjusting its correlation and degrees-of-freedom parameters. Once the Student t copula has been fitted, it provides us with a new correlation matrix. This correlation matrix can then be turned back into a new covariance matrix by means of the following passage:

\Sigma = D^{1/2} \cdot R \cdot D^{1/2}

where R is the correlation matrix we obtain and D is a diagonal matrix containing the variances of the assets.

Once we have obtained this new covariance matrix, we can use it in the optimization process to measure the expected dependence across the assets. This dependence is the one we wish to transfer into the matrix Ω of confidence over our views.


4.5.1 ABL vs. BL

To sum up, let us point out some differences between the regular model and the advanced one. ABL differs from BL mainly in two respects. We express our views just as in BL but, this time, we eliminate the assumption of normality of returns and also specify a view over the dependence structure across the assets. This view is indirectly incorporated in the matrix Ω. In BL, such a matrix is the "handmade" diagonal matrix of the uncertainty over the views, purely subjective. In ABL, such a matrix, call it Ω_ABL, is found as follows:

\Omega_{ABL} = P \cdot D^{1/2} \cdot R_{Copula} \cdot D^{1/2} \cdot P'

Unlike in BL, Ω_ABL is not a diagonal matrix, and it is scaled using the selection matrix P.

We argue that using this new measure of confidence in the views brings more meaning to the views expressed and, at the same time, provides an easier alternative to the more complex models already available (see Meucci (2009)), which are harder to replicate. The formula tells us, and this is a key element of the thesis, not only that confidence should not be purely subjective, but also that the views' uncertainties are related to each other through their dependence structure (the off-diagonal elements should be different from 0).

Breaking down the formula, we first see that the central part,

D^{1/2} \cdot R_{Copula} \cdot D^{1/2},

is a new covariance matrix found by pre- and post-multiplying the correlation matrix R_Copula, found through the copula, by D^{1/2}, the diagonal matrix containing the standard deviations of the individual assets.

On top of this, we pre- and post-multiply this new matrix by the selection matrix P, in order to scale the matrix according to the views we have imposed and return an adequate Ω_ABL.
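A minimal sketch of this construction; the copula-implied correlation matrix, the standard deviations and the selection matrix are illustrative placeholders:

```python
import numpy as np

# Illustrative inputs for three assets
R_copula = np.array([[1.0, 0.4, 0.2],
                     [0.4, 1.0, 0.5],
                     [0.2, 0.5, 1.0]])        # correlation implied by the fitted t copula (placeholder)
sigma = np.array([0.15, 0.10, 0.20])          # asset standard deviations
D_half = np.diag(sigma)                       # D^{1/2}

P = np.array([[1.0, 0.0, 0.0],                # same absolute views as in the earlier example
              [0.0, 1.0, 0.0]])

Sigma_copula = D_half @ R_copula @ D_half     # D^{1/2} R D^{1/2}
Omega_ABL = P @ Sigma_copula @ P.T            # P D^{1/2} R D^{1/2} P'
print(Omega_ABL)                              # generally non-diagonal: view uncertainties interact
```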

A simplified example of the difference between the two is the following:

\Omega_{BL} = \begin{pmatrix} (5\%)^2 & 0 & 0 \\ 0 & (5\%)^2 & 0 \\ 0 & 0 & (5\%)^2 \end{pmatrix} \quad \text{and} \quad \Omega_{ABL} = \begin{pmatrix} 0.062 & -0.001 & 0.003 \\ -0.001 & 0.031 & -0.023 \\ 0.003 & -0.023 & 0.052 \end{pmatrix}

The numbers shown here are only an example, but they already highlight the key differences. The confidence of 5% in Ω_BL is an arbitrary number, provided as an example, quantifying our uncertainty over the views we have expressed.

The matrix Ω in the BL framework answers the questions: "What is the uncertainty over each of the views we have expressed? And how does such uncertainty interact across views?"

In BL, the answer is discretionary: the decision maker "feels" such uncertainty for the individual views and, usually, does not specify their interaction (off-diagonal elements equal to 0). For Ω_ABL, the process is mathematical instead of discretionary; notice how the off-diagonal elements of Ω_ABL are not equal to 0.

The positive aspect is that the introduction of these considerations over normality and the dependence among confidences still leaves us with a closed-form formula, ready for use:

E[\mu|v] = [(\tau\Sigma)^{-1} + P'\Omega_{ABL}^{-1}P]^{-1}\,[(\tau\Sigma)^{-1}\mu_{eq} + P'\Omega_{ABL}^{-1}v]


5 Implementation

In this section we focus on how to implement these findings in practice; that is, how do we use this model starting from raw data on some indices for the asset classes? First of all we need the raw data. Not so long ago, data sources were very scarce and costly; this is no longer the case, and all types of data are accessible through platforms such as Bloomberg or Thomson Reuters. In this study, data was taken from Datastream by Thomson Reuters. We choose six different asset classes, to provide a well-diversified SAA in which we include one alternative asset class to take advantage of its potentially high returns and diversification benefits. Once we have the raw data, we need to manage it and make it suitable for the application of the model. When the data is ready to be the input of the model, we apply the models described above by means of software. We make use of MATLAB to perform the required calculations and obtain the portfolio weights as output.

5.1 Data

In this study we start off with information on prices and market values of six asset classes, proxied by the following indices:

• US Equities. Index: S&P 500 Composite
• EU Equities. Index: MSCI Europe
• Emerging Market Equities. Index: MSCI EM
• US Bonds. Index: BoA Merrill Lynch US Total Bond Return
• EU Bonds. Index: Bloomberg Barclays Euro Aggregate
• Alternatives. Index: HFRI Fund Weighted Composite

The S&P 500 is among the main US indices; it contains the top 500 firms of the country and is therefore representative of the major share of US equity. For Europe and Emerging Markets, the choice has fallen on two indices provided by Morgan Stanley Capital International, one of the most trusted providers of indices of all types. As regards debt, we use the US Total Bond index provided by Bank of America Merrill Lynch for the US and the Bloomberg Barclays Euro Aggregate index for the EU. These indices are provided by trusted institutions and can be taken as trustworthy for the purpose of our study. The bond indices are composed only of investment grade debt securities.

The data for the first five "traditional" asset classes is easily accessible; for the last one, the "alternative" asset class, it is more complicated to find reliable data for prices and market values. As we use a hedge fund index for alternatives, the main problem is assigning a market value to assets which are traded only by sophisticated investors and are therefore not found on exchanges. For the first five indices, data is obtained through Datastream, which allows the download of all types of data directly into Excel. Regarding our sixth and more troublesome index, we choose to rely on the data provided by HFR (Hedge Fund Research) for the prices of the HFRI Fund Weighted Composite Index, and we take the market values of the entire hedge fund industry from Preqin, another hedge fund data provider. The prices and market values will match thanks to the selection of the HFRI Fund Weighted index, which contains all the strategies of the hedge fund world and is therefore representative of the full hedge fund market performance (hence we can consider the market value provided by Preqin).

The choice of the indices has been made in such a way that each index is the closest available product to proxy the asset class it represents. The indices chosen are both large enough to be considered as an asset class and liquid enough to be considered investable and reliable.

The frequency of the data used is monthly, matching the long-term horizon of the model (around five years). The data starts on July 31st 1998 and ends on June 19th 2019. The choice of the time window is very important in terms of the results obtained. In our case we have a sample containing more than 20 years of data, enough to allow implementation of the model and to perform additional inference and backtesting.

Looking at the events contained in the data, we must highlight the fact that our sample contains two major financial crises. As we know, 2008 marked one of the most severe financial crises of the last century, and it is part of our data. At the same time, the sovereign debt crisis of the 2010s is also included. Including a crisis in a dataset is probably a good thing and a bad thing at the same time. Including a major crisis in the time series is a way to test the functioning of the model under stress, but at the same time it could lead us to decisions which are biased by unique events which, in theory, might never happen again. In any case, it would be pointless to take a dataset which only contains trends and no market surprises.


5.2 Procedure

Starting from the clean data, we apply the model in the order previously described. MATLAB supports financial toolboxes which allow for easier coding (mainly the MFE Toolbox provided by Kevin Sheppard). The steps taken, after importing the data and preparing it, are the following:

1. Fit a model for the mean of each marginal distribution, ARMA(1,1);
2. Fit a model for the volatility of the marginals, GARCH(1,1);
3. Estimate a skew t distribution for the residuals of the marginals;
4. Implement a Student t copula across the various marginals to find a correlation matrix;
5. Proceed with standard methodologies (historical optimization, market priors and BL) for comparison purposes;
6. Use the new correlation matrix found via the copula in the new ABL framework.

In the first steps there are a few functions that can be used to simplify the procedure. In particular, the Kevin Sheppard toolbox provides armaxfilter and tarch to easily estimate ARMA and GARCH processes, while the copulafit function allows us to estimate copulas. The way things are put together is the usual way found in the Black-Litterman framework, but this time the inputs are found through the preliminary steps above (there is a dedicated piece of code for the construction of the selection matrix and the expression of views).
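The sketch below outlines steps 1 to 4 for a single return series y and the final copula step on the matrix of uniforms U. armaxfilter and tarch come from the MFE Toolbox and copulafit from MATLAB's Statistics Toolbox; the exact output arguments shown here are assumptions and should be checked against the respective documentation.

% Steps 1-4 of the procedure, sketched for one asset's return series y (T x 1).
% The MFE Toolbox calls are shown with assumed signatures; check the toolbox docs.
[armaPar, ~, resid] = armaxfilter(y, 1, 1, 1);   % 1. ARMA(1,1) with a constant
[garchPar, ~, h]    = tarch(resid, 1, 0, 1);     % 2. GARCH(1,1): no asymmetry term (o = 0)
z = resid ./ sqrt(h);                            %    standardized residuals
% 3. Fit the skew t to z for each asset and map each series to uniforms,
%    collecting them column-wise in U (T x 6). Then:
[R_copula, nu] = copulafit('t', U);              % 4. Student t copula correlation matrix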

To have a basis for comparison, we decide to optimize in four different ways and then analyze the results. Because the optimization produces a set of possible portfolios for many possible levels of risk, we set 25 different levels of risk, producing 25 possible allocations for each optimization methodology. In other words, each Efficient Frontier is made up of 25 points, from left to right along the x-axis, representing increasingly risky portfolio compositions.
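One simple way to generate such a 25-point frontier (not necessarily the routine used in the thesis) is to solve a long-only mean-variance problem for 25 decreasing levels of risk aversion, for example with quadprog:

% Sketch: 25 long-only portfolios with increasing risk propensity.
% Assumed inputs: mu (6x1 expected returns) and Sigma (6x6 covariance matrix).
nAssets = 6;
lambdas = logspace(2, -1, 25);                 % from very risk-averse to risk-seeking
W       = zeros(25, nAssets);                  % one optimal allocation per row
opts    = optimoptions('quadprog', 'Display', 'off');
for k = 1:25
    % minimize 0.5*lambda*w'*Sigma*w - mu'*w  s.t.  sum(w) = 1, 0 <= w <= 1
    W(k, :) = quadprog(lambdas(k) * Sigma, -mu, [], [], ...
                       ones(1, nAssets), 1, zeros(nAssets, 1), ones(nAssets, 1), ...
                       [], opts)';
end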

For the interested reader the step-by-step implementation of the model can be found in the section dedicated to the code.


6 Results

In this section we will show how the model described above behaves; in particular, we will analyze its output and its sensitivity to the inputs. We are interested in comparing some candidate models and checking their performance through backtesting. Further on, we will present three different scenarios which will give us an understanding of how the model behaves in different circumstances.

Now that we have discussed the data used in this thesis, the model is implemented through the code (presented at the end of this work) and provides us with the optimization outputs. We decide to optimize in four different ways and then compare the results. The first two methods, "mean-variance" and "market prior", are independent of the views we express. The third and fourth, instead, depend on them heavily.

The four optimization methodologies are the following:

1. Standard mean-variance;
2. Market prior with EWMA variance (Black-Litterman with no views and a modified covariance matrix);
3. Standard BL;
4. Advanced BL.

In the first case, we perform a simple mean-variance optimization to have an understanding of the starting point. We optimize using the simple mean of the asset returns as expected return and their historical covariance matrix for risk.

The second case is already much more reliable in terms of results. We revert the market equilibrium to find the expected returns and compute volatilities and covariances by EWMA. EWMA is a step forward from the historical covariance and allows us to produce a much more reliable result, yet it is not sophisticated enough to introduce views and more complicated processes.
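A minimal sketch of the EWMA covariance recursion is shown below; the smoothing parameter and the initialization are assumptions for illustration, not necessarily the exact choices made in the thesis.

% Exponentially weighted covariance, RiskMetrics-style recursion.
% Assumed input: R, a T x 6 matrix of (demeaned) monthly returns.
lambda = 0.97;                          % illustrative smoothing parameter
S      = cov(R);                        % initialize with the sample covariance
for t = 1:size(R, 1)
    r = R(t, :)';                       % returns of month t as a column vector
    S = lambda * S + (1 - lambda) * (r * r');
end
sigma_ewma = sqrt(diag(S));             % EWMA volatilities of the six asset classes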

For the last two methods we need to pay more attention, as this is where the purpose of this paper lies. In the normal BL framework, the element Ω_BL is found in a discretionary way, by simply assigning some degree of confidence to our views; in the new approach, the degree of confidence, Ω_ABL, is computed using the correlation matrix stemming from the copula estimation.

As we provide results for 25 different levels of risk propensity, we will not obtain a single optimal portfolio but rather a set of portfolios, each optimal for a given level of risk (for each optimization we produce an Efficient Frontier made up of 25 points). Sometimes we will take three levels of risk propensity as a proxy for the methodology's output as a whole.

These three levels are:

ˆ "Minimum Volatility", the portfolio with least risk propensity (Portfolio 1, on the left side of the x-axis)

(35)

ˆ "Middle Portfolio" (Portfolio 12, on the center of the x-axis)

ˆ "Maximum returns", the portfolio with highest risk propensity (Portfolio 25, on the right side of the x-axis)

As they are static in the model, we first present the results of the first two optimizations. The issues with mean-variance have been previously mentioned; we expect the output to be under-diversified or to possibly present corner solutions.

The graph which depicts the product of the optimization has Risk Propensity on the x-axis and weight on the y-axis. Each asset is highlighted in a different color and has a weight going from 0 to 1. The sum of the weights will obviously add up to 1 (100% of the investment). We plot the weights for 25 increasing risk levels (increasing risk propensity). The weights change as we move from left to right to produce increasingly risky portfolios.
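For reference, a plot of this kind can be produced with a stacked area chart; the sketch below assumes W is the 25 x 6 matrix of optimal weights and assetNames a cell array of labels, both illustrative names.

% Stacked area chart of the optimal weights across the 25 risk levels.
% Assumed inputs: W (25 x 6, rows sum to 1) and assetNames (1 x 6 cell array).
area(W);                                   % areas stack to 1 at every risk level
xlabel('Risk Propensity');
ylabel('Weight');
legend(assetNames, 'Location', 'eastoutside');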

The result for mean-variance is the following:

Figure 7: Mean-Variance optimized weights

The picture shows that we are fully invested in only two asset classes: US bonds and hedge funds. The huge weights in the individual assets imply that we are greatly exposed to shocks to individual assets and that we are not benefiting from diversification. This is precisely the under-diversification problem we anticipated for the mean-variance approach.

To highlight the disadvantages of mean-variance, let us compare the results found with the allocation (and therefore the beliefs) of the market. For this reason, we introduce neutral expected returns and EWMA volatility. We expect a great improvement in portfolio diversification and to be invested in more assets at the same time.

Figure 8: Market Prior optimized weights

The above represents, in our model, the average view of the future based on current market weights. The portfolio is diversified across the asset classes for most levels of risk propensity. The transition of the weights across the x-axis is smooth and does not present any corner solution (always keep in mind that the plot contains 25 points and is therefore discrete, not continuous). If we analyze the result, we see that we gradually shift from bonds to equity as we increase the risk of the portfolio. The least risky portfolio is almost fully invested in US bonds and hedge funds. Even though the latter asset class may sound risky to most, the fact that the two are almost uncorrelated (historical correlation of 0.2289) allows the total expected level of risk to be brought down.

As expected, whenever we are more inclined to risk we must switch to equities. Consistently with this, the portfolio is almost entirely invested across our three equity asset classes: US, EU and EM.

An investor with no particular views on the markets could be happy with the benefits of this method, but in the case in which we need to express some views, we need to switch to the other two optimization methodologies.

The views we impose are arbitrary and can stem from different considerations. We define a first scenario as follows:

Scenario 1 (Basic)

1. US equity underperforms EU equity by 4% (uncertainty of 5%).
2. US bonds outperform EU bonds by 2% (uncertainty of 5%).
3. Hedge funds outperform equities by 3% (uncertainty of 5%).

The scenario above, which we will refer to as "basic", is simply an opinion generated by a hypothetical investor based on his personal views; these numbers could be defined as discretionary views (both the views themselves and the confidence). Later on, we will consider more scenarios in order to compare the regular BL model with our proposed ABL. It is important to keep in mind that ABL is not impacted by the confidence over the views; therefore, confidence is expressed only to allow the construction of the BL allocation.
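One plausible encoding of these views, assuming the asset order [US Equities, EU Equities, EM Equities, US Bonds, EU Bonds, Alternatives] and an equal weighting of the three equity classes in the third, relative view, is the following sketch (the original code may encode them differently):

% Selection matrix P (3 views x 6 assets) and view vector v for the basic scenario.
P = [  1    -1     0    0   0   0;    % view 1: US equity vs. EU equity
       0     0     0    1  -1   0;    % view 2: US bonds vs. EU bonds
     -1/3  -1/3  -1/3   0   0   1 ];  % view 3: hedge funds vs. the three equity classes
v = [-0.04; 0.02; 0.03];              % -4%, +2% and +3% respectively
Omega_BL = 0.05^2 * eye(3);           % 5% uncertainty on each view, as stated above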

The introduction of the views should have an impact on the Market Prior allocation, in order to account for the personal views of the investor. We should observe the weights tilted in the direction of our "bets". The impact of the changes in weights depends on the magnitude of the view (how divergent it is from the market neutral views) and on the confidence we express over the view itself (views of which we are certain cause big changes, views over which we have no confidence cause no change at all).

Before analyzing the output, let us remark that the confidence we express over the views leads to the following Ω_BL matrix:

\Omega_{BL} = \begin{pmatrix} (5\%)^2 & 0 & 0 \\ 0 & (5\%)^2 & 0 \\ 0 & 0 & (5\%)^2 \end{pmatrix}

We notice a first impact of our decisions on the output produced. As a matter of fact, we must consider that the confidence imposed over the views has an impact on the output. As we mentioned, this impact regards the regular Black-Litterman model and not the ABL presented here. We provide evidence on this alongside the output produced by the basic scenario.


The optimization using BL produces the following results given the market prior and expressed views:

Figure 9: Black-Litterman optimized weights

The changes in the portfolio construction are visible to the naked eye: we are more invested in hedge funds and US equity, and less invested in EU bonds, in accordance with the views expressed. We provide tables of the weights at the end of this section to highlight this fact.

Furthermore, we check the impact of the certainty over the views by re-optimizing using 50% and 1% as the degree of uncertainty for all the views expressed. The results are the following:


Figure 10: Uncertainty of 50%
Figure 11: Uncertainty of 1%

On the left-hand side, we observe that we tilt back towards the market prior, with asset weights moving closer to the market neutral values. Even if we express some views, the fact that we are highly unconfident means the model does not make great changes to the market prior.

On the right-hand side, instead, we revert to a form of output which is close to the problematic mean-variance case. This time the message carried by the picture is that being too confident in our views eliminates the mixing of BL and brings us back to a mean-variance-like scenario in which we rely completely on our own inputs.

We conclude by remarking that the changes generated in the portfolio for the individual assets, even if apparently small in magnitude (±5%, especially in the left-hand graph), can be very significant for the overall performance of the portfolio.

Going back to the output produced by BL in the basic scenario, we can also notice that the "curves" generated by the portfolio weights are very rounded and do not present any form of spike or corner.

To sum up, the graphs above have shown that expressing views causes a shift from the prior state. On top of this, the confidence specified over our views is relevant in determining the magnitude of these changes.
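As a sketch of how this sensitivity check can be reproduced, the diagonal Ω_BL is rebuilt with a common uncertainty level and the posterior formula applied again; variable names follow the earlier illustrative sketches.

% Re-optimization inputs for the sensitivity check on view confidence.
for uncert = [0.50, 0.01]                          % 50% and 1% uncertainty
    Omega_BL = uncert^2 * eye(3);                  % same uncertainty on all three views
    mu_post  = (inv(tau * Sigma) + P' * (Omega_BL \ P)) \ ...
               ((tau * Sigma) \ mu_eq + P' * (Omega_BL \ v));
    % ... then re-run the optimization to obtain weights like those in Figures 10 and 11
end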

Before we move to the ABL, let us briefly recall the differences between BL and ABL. Recall that the ABL model makes use of the correlation matrix found through maximization of the likelihood of a copula.

Therefore, let us first present the new correlation matrix and compare it with the common historical correlations:

