An expectile-based approach to portfolio optimization


POLITECNICO DI MILANO

Scuola di Ingegneria Industriale e dell’Informazione

Corso di Laurea in Ingegneria Matematica


Relatore: Carlo Sgarra

Candidato:

Emanuele Moizi Matr. 854445


Abstract

This thesis presents the expectile, a statistical functional that has recently attracted the attention of researchers in risk management because it is the only risk measure that is both coherent and elicitable. The work focuses on its applications to portfolio management. In particular, expectile optimization is compared against the classic Markowitz method and against CVaR optimization, a recent and widely discussed approach to portfolio optimization. A case study on a portfolio of Euro Stoxx 50 stocks demonstrates how the different optimization techniques can be implemented and evaluates their performance. The outcomes of the methods are presented, compared and discussed in order to assess which model performs best and under what circumstances. The main contributions of this work are the study of portfolio expectile optimization with asset returns modelled by a Student’s t copula with asymmetric marginal distributions, and the extension of the linear programming formulations of the expectile optimization. The statistical modelling of the returns addresses the non-normality of asset returns, with the aim of protecting the investor from statistical uncertainty and from large losses in market downturns. The general conclusion is that portfolio optimization with the expectile is advantageous over Markowitz and CVaR optimization when returns are modelled with such an asymmetric distribution.

The remainder of the thesis is laid out as follows. Chapter 1 is an introduction to the project. It contains a brief overview of the academic research on the expectile and explains the ideas and motivations that inspired the work. Chapter 2 introduces the concept of expectile and shows that it is a coherent and elicitable risk measure. The chapter also proves that the expectile is the only coherent and elicitable risk measure, explains why elicitability is desirable for a risk measure and mentions some properties that make the expectile interesting for applications in asset management. Chapter 3 is an introduction to asset management. It presents the first model introduced to optimize a portfolio with regard to risk: the Markowitz model, which aims to reduce the portfolio variance. The chapter derives the Markowitz solution that serves as a benchmark for the expectile one and explains the weaknesses of the Markowitz approach, motivating the models considered next. Chapter 4 describes the Rockafellar and Uryasev model, a very popular portfolio optimization method which optimizes the portfolio CVaR and which, due to its similarities with the expectile approach, is investigated in the numerical analysis chapter. The chapter presents a linear programming formulation and ends by showing that the Markowitz model and the Rockafellar and Uryasev model lead to the same optimal portfolio under an ellipticity assumption on the returns distribution. Chapter 5 introduces the expectile optimization. It starts by proving that the expectile solution and the Markowitz one coincide if the returns are elliptically distributed. Then, three different linear programming formulations for the expectile optimization are derived, and it is explained how the different formulations may increase the computational efficiency of the method. Chapter 6 is devoted to the statistical modelling of the asset returns. It identifies the non-normality in the data and then develops methods to model it accurately. In particular, it starts by correcting the serial correlation of the returns. Then, to better model kurtosis and skewness, it presents a semi-parametric distribution fitted by dividing the entire distribution into three parts: left tail, centre and right tail. The chapter continues by deriving a closed formula for the resulting distribution, named the semi-parametric Generalized Pareto Distribution. Then, the chapter analyses the correlation breakdown in periods of market stress and models these events using copula theory. The chapter concludes by explaining the algorithm used to simulate the returns modelled in this way. Chapter 7 draws conclusions on the different methods by evaluating their performance. It compares the computation times of the three linear programming formulations for the expectile optimization, showing that the last formulation, the most difficult to derive, guarantees an enormous speed-up. Then, it studies the numerical convergence of the solution as a function of the sample size. Next, the expectile, Markowitz and CVaR methods are compared using portfolio performance measures. A stressed scenario analysis is also performed to see which method is the most robust and resilient during market crashes. Finally, Chapter 8 states the conclusions of the work and suggests topics for further research on the subject.

The original contributions of the thesis are the following. To the best of the author’s knowledge, this thesis is the first work that evaluates the performance of expectile optimization from an asset allocation point of view and that presents and compares the expectile, CVaR and Markowitz methods, pointing out the pros and cons of each. This work is also an extension of the paper by Jakobsons [32], as it modifies all the formulations stated by Jakobsons in order to guarantee a minimum return to the investor. Furthermore, the thesis shows how Jakobsons’ modelling of the returns with Gaussian and Student’s t distributions can be faulty, as it leads to underestimation of the downturn risk. Instead, the returns have been modelled through a semi-parametric Generalized Pareto Distribution with dependence induced by a Student’s t copula, whose expression was derived independently by the author. Finally, all the plots and tables were reproduced independently and all the numerical procedures were written by the author using Matlab.


Summary

This thesis presents the expectile, a statistical functional that has recently attracted the attention of researchers in risk management because it is the only risk measure that is both coherent and elicitable. The thesis analyses its applications to portfolio optimization. In particular, optimization with the expectile is compared with the Markowitz method and with CVaR optimization, another recent method that is widely discussed in the literature. A case study on the Euro Stoxx 50 is carried out to analyse the implementation and the performance of the different optimization techniques. The results of the analysis are presented, compared and discussed in order to understand which method is best and under what circumstances. The originality of the thesis lies in studying expectile optimization when asset returns are modelled with a particular asymmetric distribution with dependence induced by a Student’s t copula, and in the new linear formulations of the expectile optimization. From a statistical point of view, the new model for the returns is intended to resolve the problems of non-normality, in order to protect the investor from model uncertainty and from losses due to sharp market declines. The general conclusion is that, under all the assumptions made for the returns, portfolio optimization with the expectile is more advantageous than Markowitz and CVaR optimization.

The rest of the work is structured as follows. Chapter 1 is the introduction: it gives a quick summary of the academic debate on the expectile and presents the ideas that inspired the thesis. Chapter 2 defines the expectile and proves that it is a coherent and elicitable risk measure. It is also proved that the expectile is the only coherent and elicitable risk measure, and the importance of the concept of elicitability is explained. The chapter ends by listing some properties that make the expectile interesting from an asset management point of view. Chapter 3 is entirely devoted to an introduction to asset management. It presents the Markowitz model, a historical model that builds an optimized portfolio by minimizing its risk, understood as the variance of the portfolio return. The chapter shows how the Markowitz approach can be taken as a benchmark for a later comparison with the expectile, and also explains the weaknesses and overly simplistic assumptions of the Markowitz model, motivating the new models presented afterwards. Chapter 4 describes the Rockafellar and Uryasev model, a portfolio optimization method that is very well known in the literature, is based on CVaR and, given its similarities with the expectile method, is analysed in the penultimate chapter on numerical analysis. The chapter presents the linear formulation of the minimization problem and shows that the Markowitz and the Rockafellar and Uryasev solutions coincide under the assumption that the asset returns are elliptically distributed. Chapter 5 then introduces optimization with the expectile. The chapter begins by proving that, in this case too, the solution coincides with the Markowitz one under the assumption that the returns are elliptically distributed. Three linear problems for expectile optimization are then formulated, and it is explained how these three different methods can speed up the computation of the solution. Chapter 6 describes the statistical modelling of the returns. The problems and effects of non-normality in the data are identified, and methods for modelling them correctly are presented. In particular, the serial correlation of the returns is corrected; then, to model kurtosis and skewness correctly, a semi-parametric distribution is used, built by dividing the distribution into three parts (left tail, centre and right tail) and modelling each one separately. The chapter continues by deriving the mathematical expression of the resulting distribution, called the semi-parametric Generalized Pareto Distribution. Next, the chapter analyses the fact that correlation does not remain constant over the whole observation period, and a Student’s t copula is implemented to account for it. The chapter ends by explaining the algorithm used to simulate the returns modelled in this way. Chapter 7 draws conclusions on the methods described earlier by evaluating their respective performance. It compares the computation times of the three linear formulations of the expectile optimization, showing that the last formulation, the most difficult to obtain, guarantees a large increase in speed. The numerical convergence of the expectile solution is then studied. The three models (Markowitz, CVaR and expectile) are then compared through the introduction of portfolio performance measures. A stress test is also carried out to understand which method is more resilient and robust during market crashes. Finally, Chapter 8 summarizes the conclusions drawn and lists some starting points for further developments on the subject.

The original contributions of the thesis are the following. To the best of the author’s knowledge, this thesis is the first work that analyses the performance of expectile optimization from an asset management point of view and that presents and compares the expectile, CVaR and Markowitz methods in a unified way, setting out the pros and cons of each method. This work is also an extension of the paper by Jakobsons [32], in that it modifies the linear formulations he introduced in order to guarantee a minimum return to the investor. Furthermore, the thesis shows how Jakobsons’ modelling of the returns with Gaussian and Student’s t distributions can prove ineffective and lead to underestimation of risk. Instead, the returns have been modelled with a semi-parametric Generalized Pareto Distribution, whose expression was derived independently by the author. Finally, all plots and tables were generated independently, and all numerical procedures were implemented by the author using Matlab.


Acknowledgements

First of all I would like to thank my supervisor, Carlo Sgarra, for all the help he gave me during this period, for being a great source of inspiration, for always supporting me and for the trust he placed in me. Thanks also to all the professors of the Politecnico: you taught me a great deal during this master’s programme.

Special thanks go to my mother, my father and my brother. With their unfailing support they allowed me to reach this wonderful goal. I do not tell you often, but you are my greatest joy.

Thanks to my university friends: you made these five years wonderful, and there are no words to describe how happy I am to have shared this journey with you. Thank you Simo, Azzo, Lent, Lore and Gianmi.

Thanks to my lifelong friends: Mimmo, Botta, Cave, Cecco, Cima, Furlan, Mirco, Ripa and Umbe, always at my side to make the happy moments unforgettable and to ease the sad ones.

Finally, thanks to Marta, Gloria, Maria and Otta, the women in my life. Thank you for always listening to me and helping me.


7.3 Expectile vs Markowitz
7.4 Back-testing
7.5 Stress testing
7.6 Expectile vs CVaR
8 Concluding remarks
8.1 Conclusion
8.2 Further perspectives
A Reference portfolio
B Empirical distribution vs Normal distribution
C QQ-plot
D Expectile efficient frontier
E Optimal portfolios
Bibliography

List of Figures

2.1 VaRα and CVaRα of a random variable X ∼ N(µ, σ) representing a loss, with µ = 2.5, σ = 0.7, α = 95%.
6.1 Price development for the stocks considered in the analysis, January 1, 2007 - December 11, 2017.
6.2 Empirical distribution of the historical returns versus the normal distribution, for Enel returns.
6.3 Hypothetical dissection of return distribution.
6.4 Enel returns QQ plots (Normal and semi-parametric GPD).
6.5 Moving standard deviation for stock returns with time window of 30 days.
6.6 Likelihood values for the Student’s t copula versus the degrees of freedom ν.
6.7 Modelling assets returns using the marginal distributions.
6.8 Modelling joint dependence in asset returns using the t-copula.
7.1 Computation times plotted in logarithmic scale for Primal.
7.2 Computation times plotted in logarithmic scale for Dual.
7.3 Computation times plotted in logarithmic scale for the aggregation algorithm.
7.4 Convergence of the aggregation algorithm as a function of the sample size n, with α ∈ {0.9, 0.99}.
7.5 Convergence of the aggregation algorithm as a function of the sample size n, with α ∈ {0.999, 0.9999}.
7.6 Legend of figures 7.4 and 7.5.
7.7 Efficient frontiers computed solving Markowitz and the expectile optimization problem with α ∈ {0.9, 0.99, 0.999}.
7.8 Cumulative returns over the whole control period of the EUROSTOXX 50 index.
7.9 Cumulative returns over the whole control period of the Markowitz optimal portfolio and the EUROSTOXX 50 index.
7.10 Cumulative returns over the whole control period of the Markowitz portfolio, the expectile portfolio and of the market.
7.11 Cumulative returns over the test period of the Markowitz portfolio, the expectile portfolios with α ∈ {0.99, 0.999, 0.9999} and of the market.
7.12 Cumulative returns over the stress period of the Markowitz portfolio and the expectile portfolios with α ∈ {0.99, 0.999, 0.9999}.
7.13 Cumulative returns over the stress period of the expectile and CVaR portfolios with α = 0.99.
7.14 Cumulative returns over the stress period of the expectile and CVaR portfolios with α = 0.999.
7.15 Cumulative returns over the stress period of the expectile and CVaR portfolios with α = 0.9999.
B.1 Empirical distribution of the historical returns versus the normal distribution, for Inditex returns.
B.2 Empirical distribution of the historical returns versus the normal [...]
B.3 Empirical distribution of the historical returns versus the normal distribution, for Societe Generale returns.
B.4 Empirical distribution of the historical returns versus the normal distribution, for Adidas returns.
B.5 Empirical distribution of the historical returns versus the normal distribution, for Philips E. K. returns.
B.6 Empirical distribution of the historical returns versus the normal distribution, for CRH returns.
B.7 Empirical distribution of the historical returns versus the normal distribution, for Volkswagen returns.
C.1 Inditex returns QQ plots (Normal and semi-parametric GPD).
C.2 Airbus returns QQ plots (Normal and semi-parametric GPD).
C.3 Societe Generale returns QQ plots (Normal and semi-parametric GPD).
C.4 Adidas returns QQ plots (Normal and semi-parametric GPD).
C.5 Philips E. K. returns QQ plots (Normal and semi-parametric GPD).
C.6 CRH returns QQ plots (Normal and semi-parametric GPD).
C.7 Volkswagen returns QQ plots (Normal and semi-parametric GPD).
D.1 Efficient expectile frontier computed solving Markowitz and the expectile optimization problem with α = 0.9.
D.2 Efficient expectile frontier computed solving Markowitz and the expectile optimization problem with α = 0.99.
D.3 Efficient expectile frontier computed solving Markowitz and the expectile optimization problem with α = 0.999.

List of Tables

1 Solution to problem 3.3 applied to the reference portfolio, with m = 2.2 · 10⁻⁴.
2 European assets used in this chapter with their corresponding business area and region.
3 Statistics of the log-return of the asset prices considered.
4 Testing for serial correlation at lag one in daily asset returns.
5 Correlation coefficients at lag one and statistical significance.
6 Tests for serial correlation at lag one on corrected data.
7 Volatilities before and after unsmoothing for serial correlation.
8 Correlations of the returns based on the original data.
9 Correlations of the returns based on the unsmoothed data.
10 Jarque Bera test on the data.
11 Testing goodness of fit of Normal and semi-parametric GPD.
12 Correlations of the asset returns for the control period.
13 Correlations of the asset returns for the stress period.
14 Computation times in seconds for Primal.
15 Computation times in seconds for Dual.
16 Computation times in seconds for the aggregation algorithm.
17 Yearly returns over the test period of the different asset allocation techniques.
18 Daily volatility of the returns over the test period of the different asset allocation techniques.
19 Omega ratio and Sharpe ratio over the test period of the different asset allocation techniques.
20 Volatility, Omega ratio and Sharpe ratio over the stress period of the different asset allocation techniques.
21 Yearly return, daily volatility, Omega ratio and Sharpe ratio over the test period of the expectile and CVaR methods.
22 Volatility, Omega ratio and Sharpe ratio over the stress period of the expectile and CVaR methods.
23 Euro Stoxx 50 assets included in the reference portfolio with their corresponding business area and region.
24 Solution to Markowitz problem 3.3 applied to the train set, with m = 2.2 · 10⁻⁴.
25 Solution to the expectile problem 5.1 applied to the train set, with m = 2.2 · 10⁻⁴ and α = 0.99.


1 Introduction

The aim of this thesis is to introduce the expectile and to use it to formulate a portfolio optimization problem that is an alternative to the classical Markowitz procedure. The expectile was introduced in the statistical literature by Newey and Powell [42] in 1987, and only recently has it been found to have attractive properties as a risk measure.

A matter of primary importance to the financial services industry is the computation of risk measures, both for regulatory capital allocation and for short-term risk management. For a portfolio of assets, a risk measure is generally interpreted as a functional of the conditional distribution of the portfolio loss between today and a future time instant, given all the market information available today. The most widely used risk measures are the value at risk (VaR) and the conditional value at risk (CVaR), also known as expected shortfall. Recently, there has been renewed debate about the merits of VaR and CVaR. VaR is currently common practice in the finance sector, but it has been criticised on two main points. First, it does not take into account the magnitude of losses beyond VaR; second, it is not a coherent risk measure (see, for example, Artzner et al. [5]). As a result, the Basel Committee on Banking Supervision [6] recommends abandoning VaR in favour of CVaR. The Basel Committee may also have been influenced by the elegant and efficient method for computing CVaR in a specific model advanced by Rockafellar and Uryasev [50]. However, CVaR has been criticised for instability of computation (see Cont et al. [15]) and, interestingly, for lacking a fundamental property called elicitability. The expectile has recently attracted the attention of researchers in risk management because it is the only risk measure that is both coherent and elicitable. Until recently, it was known almost exclusively among statisticians and was applied in regression analysis (see Efron [18]). The expectile is now making its debut in risk management, thanks to functional properties that make it suitable as a risk measure for both regulatory and portfolio management purposes. The axioms of coherence, introduced by Artzner et al. [5], are often considered indispensable for any risk measure used in practice, thanks to their financial interpretation and mathematical tractability. Recently, Gneiting [27] brought another property, elicitability, to the foreground. A risk measure is said to be elicitable if it is a minimizer of the expectation of some scoring function that depends on the point forecast and the true observed loss. Gneiting [27] also showed that CVaR is not elicitable. Bellini and Bignozzi [7] show that the only risk measure that is both coherent and elicitable is the expectile (for a confidence level α ∈ [1/2, 1), when positive outcomes of the random variable represent losses). This makes the expectile an interesting object of research, with the aim of assessing its suitability for measuring risk in practice.
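The characterisation of the expectile as the minimizer of an expected score can be illustrated numerically. The following is a minimal Python sketch, not the Matlab code of the thesis. It assumes the usual asymmetric squared-error score S(x, y) = |α − 1{y ≤ x}| (y − x)², which is not written out in the text above, and all function names are illustrative. The sample expectile is obtained by iterating the first-order condition of the minimization, which takes the form of a weighted mean.

```python
def expectile(sample, alpha, tol=1e-12, max_iter=1000):
    """Sample alpha-expectile, computed by fixed-point iteration.

    The first-order condition of the asymmetric least-squares problem says
    that the expectile e is a weighted mean of the data, with weight alpha
    on observations above e and (1 - alpha) on the others.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    e = sum(sample) / len(sample)  # the mean is the 0.5-expectile
    for _ in range(max_iter):
        weights = [alpha if x > e else 1.0 - alpha for x in sample]
        new_e = sum(w * x for w, x in zip(weights, sample)) / sum(weights)
        if abs(new_e - e) <= tol:
            return new_e
        e = new_e
    return e


def asymmetric_squared_loss(e, sample, alpha):
    """Sample average of |alpha - 1{x <= e}| * (x - e)^2 (assumed score)."""
    total = sum((alpha if x > e else 1.0 - alpha) * (x - e) ** 2 for x in sample)
    return total / len(sample)
```

With positive values read as losses, `expectile(losses, 0.5)` returns the sample mean, and raising α towards 1 moves the expectile into the right tail of the loss distribution.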

Several authors have suggested that elicitability is a relevant requirement in connection with back-testing and with the comparison of the accuracy of different forecasts of a risk measure. We refer the interested reader to Gneiting [27], Ziegel [57] and Bellini and Bignozzi [7]. This is of great importance, as the Basel Committee on Banking Supervision [6] is stressing the importance of robust back-testing. Recall that back-testing is the activity of periodically comparing the forecast risk measure with the realized value of the variable of interest, in order to assess the accuracy of the forecasting methodology. In this sense, elicitability is a key property for a risk measure, as it provides a natural methodology for performing back-testing.
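The link between elicitability and back-testing can be sketched as follows: two competing forecast series are ranked by their average realized score, and the lower average indicates the more accurate forecasts. This is only an illustration in Python under the assumption that the score is the asymmetric squared error S(x, y) = |α − 1{y ≤ x}| (y − x)²; the function names are hypothetical and are not taken from the thesis.

```python
def expectile_score(forecast, realized, alpha):
    """Assumed scoring function S(x, y) = |alpha - 1{y <= x}| * (y - x)^2."""
    weight = alpha if realized > forecast else 1.0 - alpha
    return weight * (realized - forecast) ** 2


def mean_score(forecasts, realized_losses, alpha):
    """Average realized score of a forecast series; lower is better."""
    if len(forecasts) != len(realized_losses):
        raise ValueError("forecasts and realized losses must have equal length")
    scores = [
        expectile_score(f, y, alpha)
        for f, y in zip(forecasts, realized_losses)
    ]
    return sum(scores) / len(scores)
```

For α close to 1, a forecast that understates a loss is penalised far more heavily than one that overstates it by the same amount, which is the asymmetry a risk manager would typically want.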

Another property of the expectile is that it balances gains and losses, which is desirable for portfolio management. Because of this, the expectile is closely related to the Omega performance measure of Shadwick and Keating [53]. Bellini and Di Bernardino [8] also point out that the expectile has an intuitive interpretation in terms of its acceptance set: a risk is acceptable if its gain-loss ratio is sufficiently high.

Portfolio optimization has come a long way since the introduction of the return-variance risk management framework by Markowitz [40]. Developments in portfolio optimization are stimulated by two requirements: first, the adequate modelling of utility functions, risks and constraints; second, the ability to handle large numbers of instruments and scenarios. For portfolio optimization, VaR has many undesirable properties, such as the lack of sub-additivity for non-normal distributions, which can make a diversified portfolio riskier than an undiversified one. Moreover, VaR is difficult to control and optimize when it is calculated using scenarios, i.e. for discrete distributions. In this case VaR is non-convex and non-smooth as a function of the positions, and has multiple local extrema. CVaR optimization overcomes these issues, and Rockafellar and Uryasev [50] managed to state the CVaR portfolio optimization as a convex optimization problem. They also proved that linear programming techniques can be used for CVaR optimization. This is one of the most important contributions to CVaR optimization, as linear programming techniques are nowadays very efficient and make the optimization feasible for problem sizes used in practice (hundreds of instruments and thousands of scenarios). Jakobsons [32] derived three different linear programming formulations for the expectile optimization problem, which can be considered the counterpart of the linear programming formulations for CVaR of Rockafellar and Uryasev [50]. When the linear programs are based on a sample of the true, assumed continuous, asset return distribution, it is important to model that distribution correctly.

The turmoil of 2007/2008 is just one among many financial crises that have occurred in the past 30 years. Such rare and unpredictable events should challenge conventional ideas about portfolio construction. Of course, we can never know in advance when such events will occur and how bad they will be. We do know, however, that such events are observed empirically with a much greater frequency than current models allow for. A challenge of risk management is to better capture the long-term downside risk associated with these rare but dangerous anomalies. There are two specific weaknesses in conventional risk assessment that may contribute to a quantifiable underestimation of portfolio risk. The first lies in frameworks assuming “normality”: conventional asset allocation frameworks typically make various assumptions about the “normality” of asset returns, the most problematic being that returns are independent from period to period and normally distributed. The second concerns the risk measures themselves. However, using the latest statistical methods and the expectile, these shortcomings can be addressed. The result presented here is a modified risk framework that may help investors to improve portfolio efficiency and resilience. Obviously, no single statistical or risk management framework can render portfolios immune to such extreme events, but the hope is that this thesis will be a step toward addressing and clarifying some critical and unresolved issues in asset allocation.
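To make the scenario-based setting of Rockafellar and Uryasev more concrete, the sketch below computes the sample VaR and CVaR of a set of equally likely loss scenarios. It relies on their well-known representation CVaRα(L) = min over c of { c + E[(L − c)⁺] / (1 − α) }, whose minimum is attained at c = VaRα(L). The Python code is an illustration only: the function names are hypothetical, the portfolio weights are held fixed, and the full linear program that also optimizes over the weights is not reproduced here.

```python
import math


def ru_objective(c, losses, alpha):
    """Rockafellar-Uryasev function F(c) = c + E[(L - c)^+] / (1 - alpha)."""
    mean_excess = sum(max(loss - c, 0.0) for loss in losses) / len(losses)
    return c + mean_excess / (1.0 - alpha)


def var_cvar(losses, alpha):
    """Sample VaR and CVaR at level alpha for equally likely scenarios."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    ordered = sorted(losses)
    n = len(ordered)
    # Smallest order statistic whose empirical CDF value is at least alpha.
    k = max(1, math.ceil(alpha * n))
    var = ordered[k - 1]
    # F attains its minimum at the VaR, and the minimum value is the CVaR.
    return var, ru_objective(var, losses, alpha)
```

Because F is convex and piecewise linear in c for a finite set of scenarios, its minimization can be written with one auxiliary variable per scenario, which is what makes a linear programming formulation possible.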

The outline of the thesis is the following:

Chapter 2 introduces the concept of expectile as a risk measure for a univariate loss distribution. It starts by defining VaR, CVaR and the expectile. Then the notion of coherent risk measure is introduced and it is proven that the expectile is coherent; it is also noted that CVaR is coherent while VaR is not. The notion of elicitability is introduced next. In particular, it is shown that the expectile is elicitable by definition while CVaR is not. The chapter also proves that the expectile with a confidence level α ∈ [1/2, 1) is the only coherent and elicitable risk measure. The chapter finishes by showing further properties of the expectile that make it interesting for applications in asset management.

Chapter 3 moves to the asset management framework. It discusses the first model introduced to optimize a portfolio with regard to risk: the Markowitz model, which aims to reduce the portfolio variance. The chapter also describes the dataset used for the thesis and computes the Markowitz solution that serves as a benchmark for the expectile optimization method. The chapter concludes by showing the weaknesses of the Markowitz approach and gives the motivation for the models considered next.

Chapter 4 describes the Rockafellar and Uryasev model, a popular portfolio optimization method that optimizes the portfolio CVaR. The chapter gives a linear optimization formulation that minimizes the CVaR of a portfolio. It ends by showing that the Markowitz model and the Rockafellar and Uryasev model lead to the same optimal portfolio if the returns of the assets in the reference portfolio are elliptically distributed.

Chapter 5 introduces the expectile optimization. It starts by proving that the expectile and Markowitz optimization procedures coincide if the returns of all assets in the reference portfolio are elliptically distributed. Then, three different linear programming formulations for the expectile optimization are derived, and it is explained how the different formulations may increase the computational efficiency of the method.

Chapter 6 is devoted to the statistical modelling of the asset returns. It presents the statistical tools used to better model the behaviour observed in real market data. It starts by analysing and correcting the serial correlation of the returns. Then the focus shifts to the distribution of the returns, which exhibits excess kurtosis and skewness. To better model these features, a semi-parametric distribution is fitted by dividing the entire distribution into three parts: left tail, centre and right tail. The chapter continues by deriving a closed formula for the resulting distribution, named the semi-parametric Generalized Pareto Distribution. Then the chapter analyses the correlation breakdown in periods of market stress and explains how this issue can be addressed using copula theory. The chapter concludes by explaining the algorithm used to simulate the returns modelled in this way.
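The three-part construction described above can be sketched in a few lines. The Python code below is an assumption-laden illustration, not the closed formula derived in the thesis: it uses the standard peaks-over-threshold form with generalized Pareto tails outside two thresholds and the plain empirical distribution in the centre. The thresholds `u_left` and `u_right` and the (shape, scale) pairs `left_tail` and `right_tail` are hypothetical inputs that would have to be fitted separately.

```python
import math
from bisect import bisect_left, bisect_right


def gpd_survival(excess, xi, beta):
    """P(Y > excess) for a generalized Pareto Y with shape xi and scale beta."""
    if abs(xi) < 1e-12:  # exponential limit as xi tends to 0
        return math.exp(-excess / beta)
    base = 1.0 + xi * excess / beta
    return base ** (-1.0 / xi) if base > 0.0 else 0.0


def make_semiparametric_cdf(sample, u_left, u_right, left_tail, right_tail):
    """Return F(x): GPD tails outside [u_left, u_right], empirical CDF inside.

    left_tail and right_tail are (xi, beta) pairs, assumed fitted elsewhere.
    """
    ordered = sorted(sample)
    n = len(ordered)
    p_left = bisect_left(ordered, u_left) / n            # share below u_left
    p_right = 1.0 - bisect_right(ordered, u_right) / n   # share above u_right

    def cdf(x):
        if x < u_left:
            return p_left * gpd_survival(u_left - x, *left_tail)
        if x > u_right:
            return 1.0 - p_right * gpd_survival(x - u_right, *right_tail)
        return bisect_right(ordered, x) / n

    return cdf
```

Allowing a separate shape and scale for each tail is what lets such a model capture skewness as well as heavy tails. A smoothed estimate could replace the empirical step function in the centre if a continuous distribution is required.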

Next, Chapter 7 draws conclusions on the different methods by evaluating their performance. It starts by comparing the computation times of the three linear programming formulations for the expectile optimization. Then the numerical convergence of the solution as a function of the sample size is studied. Next, the expectile, Markowitz and CVaR methods are compared using portfolio performance measures, in two different scenarios: standard and stressed.
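The performance measures reported in the list of tables are the Omega ratio and the Sharpe ratio. As a rough illustration, they can be computed from a series of returns as in the Python sketch below. The definitions used are the common textbook ones; the threshold, the risk-free rate and the absence of annualisation are assumptions made here, since the conventions of the thesis are not stated in this part of the text.

```python
import math
import statistics


def sharpe_ratio(returns, risk_free=0.0):
    """Mean excess return divided by its sample standard deviation."""
    excess = [r - risk_free for r in returns]
    return statistics.mean(excess) / statistics.stdev(excess)


def omega_ratio(returns, threshold=0.0):
    """Total gains above the threshold divided by total losses below it."""
    gains = sum(max(r - threshold, 0.0) for r in returns)
    losses = sum(max(threshold - r, 0.0) for r in returns)
    if losses == 0.0:
        return math.inf  # no return fell below the threshold
    return gains / losses
```

The Omega ratio is a gain-loss ratio, which is why it connects naturally to the expectile: the expectile's acceptance set is described in the introduction in terms of exactly such a ratio.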

Finally, Chapter 8 states a brief conclusion of the work done and suggests topics for further research on the subject.

The original contributions of the thesis are the following. First of all, to the best of the author’s knowledge, this thesis is the first work that analyses the performance of the optimal expectile portfolio within an asset allocation framework. To the best of the author’s knowledge, it is also the first work that analyses the expectile, CVaR and Markowitz methods in a unified way and points out the strengths and weaknesses of each method. This thesis is also an extension of the expectile optimization problem of Jakobsons [32], as the returns are modelled through a semi-parametric Generalized Pareto Distribution, while Jakobsons uses multivariate Normal and Student’s t distributions. The thesis also shows why Jakobsons’ assumptions on the distribution of the returns are not backed up by real market data and underestimate the portfolio risk. Another original contribution is Proposition 4.1: although this is a standard result, the author proves it independently. Problems Primal, Dual and Aggregated, formulated by Jakobsons, have been extended to include a constraint on the minimum return for the investor. This additional constraint required a new version of the proof of Proposition 5.3 and made it possible to compare the expectile and Markowitz methods in terms of efficient frontiers. Finally, to the best of the author’s knowledge, the closed formula for the cumulative distribution function of the semi-parametric Generalized Pareto Distribution in equation 6.2 was never explicitly stated before. Some parts of this thesis present results from other papers, but even with established concepts the author aims to present them in a way that makes them easy to understand. In addition, all the plots and tables in this thesis were reproduced independently to confirm the results of other authors. Furthermore, all the algorithms and numerical procedures described in this work were implemented independently by the author using Matlab.


2 The expectile in the risk measures framework

This chapter is devoted to stating and proving the main properties of the expectile. Introduced by Newey and Powell [42] in 1987, for many years it was used exclusively by statisticians and applied in regression analysis; see e.g. Efron [18]. The expectile is now making its debut in the area of risk management, due to functional properties that make it suitable as a risk measure for regulatory and portfolio management purposes.

The basic definitions are given in Section 2.1, while Section 2.2 introduces the framework of risk measures and defines one of the most important properties that a risk measure should have: coherence. Section 2.3 proves that the expectile is a coherent risk measure and Section 2.4 states the definition of elicitability. Although this notion has its roots in decision theory (see for example Savage [52] in 1971), the term "elicitable functional" seems to have been introduced in 1985 by Osband [44]. Elicitability is an important property because it can be related to back-testing, and the section also proves that the expectile is elicitable. Section 2.5 shows why the expectile is so important in the risk measures framework: it is the only risk measure which is both coherent and elicitable. Finally, Section 2.6 describes further properties of the expectile that make it a good metric for portfolio management.

2.1 Basic definitions

Definition 2.1 (expected value). Let $X$ be a random variable. The expected value of $X$ is defined as

\[ E[X] := \int_{-\infty}^{\infty} x f(x)\,dx \]

if $X$ is continuous, or

\[ E[X] := \sum_{k=-\infty}^{\infty} k\,P(X = k) \]

if $X$ is discrete, with $f(x)$ and $P(X = k)$ being respectively the probability density function and the probability mass function of $X$.


The value E [X] is representative of the value around which the possible values of X distribute and will be denoted as µ.

Definition 2.2 (variance). Let X be a random variable, the variance of X is defined as

\[ \mathrm{Var}(X) := E\left[ (X - E[X])^2 \right] \]

The variance will often be denoted as $\sigma^2$. To better understand its meaning one can look at the standard deviation, defined as $\sigma = \sqrt{\mathrm{Var}(X)}$. The standard deviation is expressed in the same units as $\mu$ and measures how strongly $X$ is dispersed around $\mu$. Small values of $\sigma$ mean that $X$ is concentrated around $\mu$, while larger values of $\sigma$ mean that values of $X$ far away from $\mu$ are more likely.

Definition 2.3 (covariance). Let $X$ and $Y$ be two random variables. The covariance of $X$ and $Y$ is defined as

\[ \mathrm{Cov}(X, Y) := E\left[ (X - E[X])(Y - E[Y]) \right] \]

The covariance measures how strongly the two random variables vary together. If $X$ and $Y$ are independent, their covariance is 0 (the converse does not hold in general). Also, $\mathrm{Cov}(X, X) = \mathrm{Var}(X)$. As in the case of $\sigma^2$, the covariance is hard to interpret, as its units are the product of the units of $X$ and $Y$; therefore another measure will be used to express how strongly $X$ and $Y$ vary together.

Definition 2.4 (Pearson correlation coefficient). Let X and Y be two random variables, the correlation coefficient of X and Y is defined as

\[ \rho_{XY} := \frac{\mathrm{Cov}(X, Y)}{\sqrt{\mathrm{Var}(X)}\,\sqrt{\mathrm{Var}(Y)}} \]

$\rho_{XY}$ always takes values between $-1$ and $1$, and if $|\rho_{XY}|$ is close to 1 there is a strong linear dependence between $X$ and $Y$.

It is common practice in risk management to quantify risks using risk measures. In the rest of this section we define three of them: Value-at-Risk, Conditional Value-at-Risk and expectile.

Definition 2.5 (Value-at-Risk (VaR)). Let $X$ be a random variable representing a loss. Given a confidence level $\alpha \in (0, 1)$, the $\alpha$-VaR of $X$ is

\[ \mathrm{VaR}_\alpha(X) := \inf\{ l \in \mathbb{R} : P(X > l) \le 1 - \alpha \} \]

In probabilistic terms, VaR is simply the $\alpha$-quantile of the distribution of $X$. It can be interpreted as the loss level that is exceeded with probability at most $1 - \alpha$. VaR is a very well known risk measure, but its drawbacks (see Section 2.3) have led to an alternative measure, preferred by risk managers.


Definition 2.6 (Conditional Value-at-Risk (CVaR)). Let $X$ be a random variable representing a loss. Given a confidence level $\alpha \in (0, 1)$, the $\alpha$-CVaR of $X$ is

\[ \mathrm{CVaR}_\alpha(X) := \frac{1}{1 - \alpha} \int_{\alpha}^{1} \mathrm{VaR}_u(X)\,du \]

For a continuous random variable, CVaR is also related to VaR by the following relationship

\[ \mathrm{CVaR}_\alpha(X) = E\left[ X \mid X \ge \mathrm{VaR}_\alpha(X) \right] \]

It can be seen that the Conditional Value-at-Risk is an average of the Value-at-Risk over all levels u ≥ α, and thus looks further into the tail of the loss distribution. Figure 2.1 shows the VaR and CVaR for a continuous random variable X. Both VaR and CVaR focus on extreme losses, i.e. on the tails of the loss distribution. See Hull [29, chap. 16] for further details on VaR and CVaR.
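As an illustration of Definitions 2.5 and 2.6, the following is a minimal Python sketch (it is not the Matlab implementation used in this thesis) that evaluates both measures on the empirical distribution of a loss sample. The function name `empirical_var_cvar` and the toy sample are assumptions made only for this example.

```python
import numpy as np

def empirical_var_cvar(losses, alpha):
    """Empirical VaR and CVaR of a loss sample at confidence level alpha in (0, 1)."""
    x = np.sort(np.asarray(losses, dtype=float))
    n = x.size
    # Definition 2.5 on the empirical distribution: the smallest order statistic
    # x_(k) with P(X > x_(k)) <= 1 - alpha, i.e. k = ceil(alpha * n).
    k = int(np.ceil(alpha * n - 1e-12))   # small tolerance against round-off
    var = x[k - 1]
    # Definition 2.6: for a sample, VaR_u is a step function of u, so the integral
    # over (alpha, 1) is a finite sum of order statistics times interval lengths.
    integral = (k / n - alpha) * x[k - 1] + x[k:].sum() / n
    cvar = integral / (1.0 - alpha)
    return var, cvar

losses = np.arange(1, 101)                # toy sample: losses 1, 2, ..., 100
var95, cvar95 = empirical_var_cvar(losses, 0.95)
print(var95, cvar95)
```

On this toy sample the CVaR averages the five largest losses, so it lies above the VaR, in line with the remark that CVaR looks further into the tail.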

Newey and Powell [42] introduced the expectile in 1987 with the following definition:

Definition 2.7 (expectile). Let $Y$ be a random variable representing a loss, with finite second moment and cumulative distribution function $F(y)$, and let $\alpha \in (0, 1)$. The $\alpha$-expectile is defined as the minimizer of an asymmetric quadratic scoring function

\[ y^*_\alpha := \arg\min_{x \in \mathbb{R}} \int_{-\infty}^{\infty} \left[ \alpha \left( (y - x)^+ \right)^2 + (1 - \alpha) \left( (y - x)^- \right)^2 \right] dF(y) \tag{2.1} \]

with the notation $x^+ = \max(x, 0)$ and $x^- = \max(-x, 0)$.

Figure 2.1: 95% VaR and 95% CVaR of a random variable X ∼ N(µ, σ), marked on its probability density function f(x).

Since the defining minimization problem is convex, the expectile can be characterized by means of a first order condition, i.e. it is the unique solution $x = y^*_\alpha$ to the equation

\[ \alpha \int_{x}^{\infty} (y - x)\,dF(y) = (1 - \alpha) \int_{-\infty}^{x} (x - y)\,dF(y) \tag{2.2} \]

To have a better understanding of the financial meaning of this quantity one can consider the following rewriting of (2.2)

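\[ \alpha\,E\left[ (Y - y^*_\alpha)^+ \right] = (1 - \alpha)\,E\left[ (Y - y^*_\alpha)^- \right] \]

Now it can be seen that the expectile balances the expected amount by which the outcome exceeds the threshold $y^*_\alpha$, weighted by $\alpha$, against the expected amount by which it falls below it, weighted by $1 - \alpha$. In this sense it is a balance between gains and losses, and therefore the optimization of the expectile can be an interesting topic for portfolio managers.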
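To make the first order condition concrete, here is a minimal Python sketch that computes the expectile of a sample by bisection on the difference between the two sides of the balance equation. The function name `expectile`, the fixed number of bisection steps and the toy data are assumptions of this example; it is not the procedure used later in the thesis.

```python
import numpy as np

def expectile(sample, alpha, n_iter=200):
    """Sample alpha-expectile, found by bisection on the first order condition."""
    y = np.asarray(sample, dtype=float)

    def foc(x):
        # alpha * E[(Y - x)^+] - (1 - alpha) * E[(Y - x)^-], decreasing in x
        return (alpha * np.maximum(y - x, 0.0).mean()
                - (1.0 - alpha) * np.maximum(x - y, 0.0).mean())

    lo, hi = y.min(), y.max()      # foc(lo) >= 0 >= foc(hi), so the root is bracketed
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if foc(mid) > 0.0:
            lo = mid               # the root lies to the right of mid
        else:
            hi = mid               # the root lies at or to the left of mid
    return 0.5 * (lo + hi)

losses = [1.0, 2.0, 3.0, 4.0, 10.0]
print(expectile(losses, 0.5))      # alpha = 1/2 recovers the sample mean
print(expectile(losses, 0.9))      # alpha > 1/2 gives more weight to large losses
```

For a sample taking the values 0 and 1 with equal frequency the balance equation reads α(1 − x) = (1 − α)x, so the α-expectile is simply α, which gives a quick check of the routine.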

2.2 Risk measures

Let (Ω, F , P) be a probability space, and ∆ a time horizon. L∞(Ω, F , P) is the set of (essentially) bounded random variables on (Ω, F , P). Financial risks are represented by a set of random variables G ⊂ L∞(Ω, F , P), interpreted as portfolio losses over the time horizon ∆. We assume that G is a convex cone, i.e. that X ∈ G and Y ∈ G imply X + Y ∈ G and λX ∈ G for every λ > 0.

Risk measures are real-valued functions ρ : G → R. One can interpret them in the following way: ρ(L) is the amount of capital that should be added to a position with loss L, so that the position becomes acceptable to an external or internal risk controller. Artzner et al. [5] analysed risk measures and stated a set of properties that are desirable for any risk measure. In particular, a risk measure that satisfies the axioms stated below is said to be coherent. Consider the following setting: X and Y are random variables representing losses, l ∈ R is a scalar representing a deterministic loss, and ρ is a risk measure, mapping a random variable X (or Y ) to R according to the risk associated with X (or Y ).

Axiom 2.1 (translation invariance). For all X ∈ G and every l ∈ R, we have ρ(X + l) = ρ(X) + l.

Axiom 2.2 (positive homogeneity). For all X ∈ G and every λ > 0, we have ρ(λX) = λρ(X).

Axiom 2.3 (monotonicity). For X, Y ∈ G such that X ≥ Y almost surely we have ρ(X) ≥ ρ(Y ).

Axiom 2.4 (sub-additivity). For all X, Y ∈ G it holds ρ(X + Y ) ≤ ρ(X) + ρ(Y ).


• Translation invariance: Increasing (or decreasing) the loss by a deterministic quantity l increases (decreases) the risk by the same amount;
• Sub-additivity: Diversification does not increase risk, i.e. the risk of a combined position is at most the sum of the individual risks;
• Positive homogeneity: Doubling the risky position doubles the risk;
• Monotonicity: Higher losses mean higher risk.

The vast majority of risk measures are law-invariant, meaning that the outcome of the risk measure depends only on the distribution of the loss. This thesis focuses on this class of risk measures. The reader is referred to McNeil et al. [41, chap. 6] for more details and comments on risk measures and their properties.

2.3 Coherence

In spite of being one of the most widespread risk measures in the banking and insurance sector, the Value-at-Risk is not coherent. In particular, it does not satisfy the property of sub-additivity; see for example Acerbi [1]. As mentioned before, the CVaR overcomes this problem and is a coherent measure of risk; see Rockafellar and Uryasev [48] for further details.
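The failure of sub-additivity can be seen on a textbook-style example that is not taken from the thesis: two independent loans, each losing 100 with probability 4% and nothing otherwise. The sketch below applies Definition 2.5 to the exact discrete distributions; the helper `var_discrete` and all the numbers are assumptions chosen for illustration.

```python
import numpy as np

def var_discrete(values, probs, alpha):
    """alpha-VaR of a discrete loss distribution, following Definition 2.5."""
    values = np.asarray(values, dtype=float)
    probs = np.asarray(probs, dtype=float)
    order = np.argsort(values)
    v, p = values[order], probs[order]
    tail = 1.0 - np.cumsum(p)                  # tail[i] = P(X > v[i])
    acceptable = tail <= 1.0 - alpha + 1e-12   # tolerance against round-off
    return v[np.argmax(acceptable)]            # smallest atom meeting the condition

alpha, p_default, loss = 0.95, 0.04, 100.0
var_single = var_discrete([0.0, loss], [1 - p_default, p_default], alpha)
# Sum of the two independent loans: zero, one or two defaults
var_sum = var_discrete(
    [0.0, loss, 2 * loss],
    [(1 - p_default) ** 2, 2 * p_default * (1 - p_default), p_default ** 2],
    alpha,
)
print(var_single, var_sum)
```

Each loan alone defaults with probability below 5%, so its 95% VaR is zero, while the probability of at least one default in the pair is 7.84%, so the VaR of the combined position is strictly larger than the sum of the individual VaRs.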

In this section we will show that the expectile is a coherent measure of risk. In order to state results on the coherence of the expectile it is useful to move to a more general setting, used by Bellini et al. [10]: that of the generalized quantiles. Let $(\Omega, \mathcal{F}, P)$ be a probability space and let $L^0$ be the space of all random variables $X$ on $(\Omega, \mathcal{F}, P)$. Let $\Phi_1, \Phi_2 : [0, +\infty) \to [0, +\infty)$ be convex, strictly increasing functions satisfying

\[ \Phi_i(0) = 0 \quad \text{and} \quad \Phi_i(1) = 1 \tag{2.3} \]

Consider the following space and norm

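Definition 2.8 (Orlicz heart $M^\Phi$). Given a function $\Phi$ as above, the Orlicz heart $M^\Phi$ is the set

\[ M^\Phi := \left\{ X \in L^0 : E\left[ \Phi\left( \frac{|X|}{a} \right) \right] < +\infty \text{ for every } a > 0 \right\} \]

Definition 2.9 (Luxemburg norm $\|\cdot\|_\Phi$). Let $Y \in M^\Phi$. The Luxemburg norm is defined as

\[ \|Y\|_\Phi := \inf\left\{ a > 0 : E\left[ \Phi\left( \frac{|Y|}{a} \right) \right] \le 1 \right\} \]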

Then, we have that

Proposition 2.1. $(M^\Phi, \|\cdot\|_\Phi)$ is a Banach space.


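Now, consider the following minimization problem

\[ \pi_\alpha(X) := \inf_{x \in \mathbb{R}} \pi_\alpha(X, x) \]

where

\[ \pi_\alpha(X, x) := \alpha\,E\left[ \Phi_1\left( (X - x)^+ \right) \right] + (1 - \alpha)\,E\left[ \Phi_2\left( (X - x)^- \right) \right] \tag{2.4} \]

Following Breckling and Chambers [12], we call any minimizer $x^*_\alpha \in \arg\min \pi_\alpha(X, x)$ a generalized quantile.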

Notice that, for the choice $\Phi_1(x) = \Phi_2(x) = x^2$, the generalized quantile reduces to the expectile; therefore the expectiles are a subclass of the generalized quantiles. The next proposition gives a general statement on the generalized quantiles.

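Proposition 2.2. Let $\Phi_1, \Phi_2 : [0, +\infty) \to [0, +\infty)$ be convex, strictly increasing and satisfying (2.3). Let $X \in M^{\Phi_1} \cap M^{\Phi_2}$, $\alpha \in (0, 1)$ and $\pi_\alpha(X, x)$ as in (2.4), then

(a) $\pi_\alpha(X, x)$ is finite, non-negative, convex and satisfies

\[ \lim_{x \to -\infty} \pi_\alpha(X, x) = \lim_{x \to +\infty} \pi_\alpha(X, x) = +\infty; \]

(b) the set of minimizers is a closed interval: $\arg\min \pi_\alpha(X, x) = [x^{*-}_\alpha ; x^{*+}_\alpha]$;

(c) $x^*_\alpha \in \arg\min \pi_\alpha(X, x)$ if and only if

\[ \begin{cases} \alpha\,E\left[ \mathbf{1}_{\{X > x^*_\alpha\}} \Phi'_{1-}\left( (X - x^*_\alpha)^+ \right) \right] \le (1 - \alpha)\,E\left[ \mathbf{1}_{\{X \le x^*_\alpha\}} \Phi'_{2+}\left( (X - x^*_\alpha)^- \right) \right] \\ \alpha\,E\left[ \mathbf{1}_{\{X \ge x^*_\alpha\}} \Phi'_{1+}\left( (X - x^*_\alpha)^+ \right) \right] \ge (1 - \alpha)\,E\left[ \mathbf{1}_{\{X < x^*_\alpha\}} \Phi'_{2-}\left( (X - x^*_\alpha)^- \right) \right] \end{cases} \tag{2.5} \]

where $\Phi'_{i-}$ and $\Phi'_{i+}$ denote the left and right derivatives of $\Phi_i$;

(d) if $\Phi_1$ and $\Phi_2$ are strictly convex, then $x^{*-}_\alpha = x^{*+}_\alpha$.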

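Proof. (a) If $X \in M^{\Phi_1} \cap M^{\Phi_2}$, then $(X - x)^+ \in M^{\Phi_1}$ and $(X - x)^- \in M^{\Phi_2}$ for any $x \in \mathbb{R}$. Indeed, by convexity and monotonicity of $\Phi_1$, it follows that

\[ E\left[ \Phi_1\left( (X - x)^+ \right) \right] \le E\left[ \Phi_1(|X| + |x|) \right] \le \tfrac{1}{2} E\left[ \Phi_1(2|X|) \right] + \tfrac{1}{2} \Phi_1(2|x|) < +\infty. \]

Similarly, $E\left[ \Phi_2\left( (X - x)^- \right) \right] < +\infty$ for any $x \in \mathbb{R}$. So, $\pi_\alpha(X, x)$ is finite for any $x \in \mathbb{R}$. Non-negativity and convexity of $\pi_\alpha(X, x)$ are trivial. By the Monotone Convergence Theorem, it follows that

\[ \lim_{x \to -\infty} \pi_\alpha(X, x) = \lim_{x \to +\infty} \pi_\alpha(X, x) = +\infty; \]

(b) is a direct consequence of (a);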


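(c) Since $\pi_\alpha(X, \cdot)$ is convex, $x$ is a minimizer if and only if $0 \in \partial \pi_\alpha(X, x) := \left[ \frac{\partial^- \pi_\alpha}{\partial x}, \frac{\partial^+ \pi_\alpha}{\partial x} \right]$. Define $r(x) = E\left[ \Phi_1\left( (X - x)^+ \right) \right]$ and $l(x) = E\left[ \Phi_2\left( (X - x)^- \right) \right]$. Let $y > x$. We have that

\[ \frac{r(y) - r(x)}{y - x} = \frac{E\left[ \Phi_1\left( (X - y)^+ \right) \right] - E\left[ \Phi_1\left( (X - x)^+ \right) \right]}{y - x} =: E\left[ H_X(x, y) \right], \]

where the random variable $H_X(x, y)$ fulfills

\[ H_X(x, y) = \begin{cases} \dfrac{\Phi_1(X - y) - \Phi_1(X - x)}{y - x} & \text{if } X \ge y \\ -\dfrac{\Phi_1(X - x)}{y - x} & \text{if } x < X < y \\ 0 & \text{if } X \le x. \end{cases} \]

Since $\Phi_1$ is convex and $X \in M^{\Phi_1}$ we have that $y \mapsto H_X(x, y)$ is non-positive and increasing, and therefore for all $y$ with $|y - x| \le 1$ it holds

\[ E\left[ |H_X(x, y)| \right] \le E\left[ |H_X(x, x - 1)| \right] < \infty \]

From the dominated convergence theorem we get

\[ r'_+(x) = \lim_{y \to x^+} \frac{r(y) - r(x)}{y - x} = E\left[ \lim_{y \to x^+} H_X(x, y) \right] = -E\left[ \mathbf{1}_{\{X > x\}} \Phi'_{1-}\left( (X - x)^+ \right) \right]. \]

The same argument gives

\[ r'_-(x) = -E\left[ \mathbf{1}_{\{X \ge x\}} \Phi'_{1+}\left( (X - x)^+ \right) \right] \]

and similarly

\[ l'_+(x) = E\left[ \mathbf{1}_{\{X \le x\}} \Phi'_{2+}\left( (X - x)^- \right) \right], \qquad l'_-(x) = E\left[ \mathbf{1}_{\{X < x\}} \Phi'_{2-}\left( (X - x)^- \right) \right]. \]

Since $\pi_\alpha(X, x) = \alpha r(x) + (1 - \alpha) l(x)$, we get

\[ \frac{\partial^- \pi_\alpha}{\partial x}(X, x) = -\alpha\,E\left[ \mathbf{1}_{\{X \ge x\}} \Phi'_{1+}\left( (X - x)^+ \right) \right] + (1 - \alpha)\,E\left[ \mathbf{1}_{\{X < x\}} \Phi'_{2-}\left( (X - x)^- \right) \right] \]

\[ \frac{\partial^+ \pi_\alpha}{\partial x}(X, x) = -\alpha\,E\left[ \mathbf{1}_{\{X > x\}} \Phi'_{1-}\left( (X - x)^+ \right) \right] + (1 - \alpha)\,E\left[ \mathbf{1}_{\{X \le x\}} \Phi'_{2+}\left( (X - x)^- \right) \right] \]

from which (c) follows immediately.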

(d) If Φ1 and Φ2 are strictly convex, then for each t the function

g(t, x) := αΦ1((t − x)+) + (1 − α)Φ2((t − x)−)

is strictly convex in x. It follows that πα(X, x) = E [g(X, x)] is strictly convex

in x, hence its minimizer is unique.

It can be proven that, under additional smoothness assumptions on the functions Φ1 and Φ2 or on the distribution of X, the first order condition in (2.5) reduces to a single equation.


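Corollary 2.1. Under the assumptions of Proposition 2.2, let $\Phi_1$ and $\Phi_2$ be differentiable. If $\Phi'_{1+}(0) = \Phi'_{2+}(0) = 0$ or if the distribution of $X$ is continuous, then the first order conditions (2.5) reduce to

\[ \alpha\,E\left[ \Phi'_1\left( (X - x^*_\alpha)^+ \right) \right] = (1 - \alpha)\,E\left[ \Phi'_2\left( (X - x^*_\alpha)^- \right) \right]. \tag{2.6} \]

Proof. If $\Phi_1$ and $\Phi_2$ are differentiable, then $\Phi'_{i+} = \Phi'_{i-}$. Hence (2.5) becomes

\[ \begin{cases} \alpha\,E\left[ \mathbf{1}_{\{X > x^*_\alpha\}} \Phi'_1\left( (X - x^*_\alpha)^+ \right) \right] \le (1 - \alpha)\,E\left[ \mathbf{1}_{\{X \le x^*_\alpha\}} \Phi'_2\left( (X - x^*_\alpha)^- \right) \right] \\ \alpha\,E\left[ \mathbf{1}_{\{X \ge x^*_\alpha\}} \Phi'_1\left( (X - x^*_\alpha)^+ \right) \right] \ge (1 - \alpha)\,E\left[ \mathbf{1}_{\{X < x^*_\alpha\}} \Phi'_2\left( (X - x^*_\alpha)^- \right) \right] \end{cases} \]

If moreover $\Phi'_{1+}(0) = \Phi'_{2+}(0) = 0$ or if $X$ is a continuous random variable (hence $P(X = x^*_\alpha) = 0$), then the two inequalities above reduce to

\[ \alpha\,E\left[ \Phi'_1\left( (X - x^*_\alpha)^+ \right) \right] = (1 - \alpha)\,E\left[ \Phi'_2\left( (X - x^*_\alpha)^- \right) \right]. \]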

As mentioned before, in the case $\Phi_1(x) = \Phi_2(x) = x^2$ generalized quantiles reduce to expectiles. Since $\Phi_1$ and $\Phi_2$ are differentiable and $\Phi'_{1+}(0) = \Phi'_{2+}(0) = 0$, the first order condition of eq. (2.5) becomes

\[ \alpha\,E\left[ (X - x^*_\alpha)^+ \right] = (1 - \alpha)\,E\left[ (X - x^*_\alpha)^- \right] \]

Any solution of the equality above is called an α-expectile of X.
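The reduction can be made explicit with a short worked derivation, given here as a sketch that uses only equation 2.6 and the identity $t = t^+ - t^-$. With $\Phi_1(x) = \Phi_2(x) = x^2$ we have $\Phi'_i(x) = 2x$, so equation 2.6 reads

\[ 2\alpha\,E\left[ (X - x^*_\alpha)^+ \right] = 2(1 - \alpha)\,E\left[ (X - x^*_\alpha)^- \right], \]

which is the expectile equation after dividing by 2. Taking expectations in $X - x^*_\alpha = (X - x^*_\alpha)^+ - (X - x^*_\alpha)^-$ gives $E\left[ (X - x^*_\alpha)^- \right] = E\left[ (X - x^*_\alpha)^+ \right] - E[X] + x^*_\alpha$, and substituting this into the right-hand side yields

\[ x^*_\alpha = E[X] + \frac{2\alpha - 1}{1 - \alpha}\,E\left[ (X - x^*_\alpha)^+ \right]. \]

In particular, for $\alpha = \tfrac{1}{2}$ the expectile coincides with the mean, while for $\alpha > \tfrac{1}{2}$ it is never below the mean, the difference being proportional to the expected excess over the expectile itself.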

In all cases in which the hypotheses of Corollary 2.1 are satisfied, generalized quantiles are defined implicitly by means of equation (2.6). Letting

\[ \psi(t) := \begin{cases} -(1 - \alpha)\,\Phi'_2(-t) & t < 0 \\ \alpha\,\Phi'_1(t) & t \ge 0, \end{cases} \tag{2.7} \]

we have that $\psi : \mathbb{R} \to \mathbb{R}$ is strictly increasing, $\psi(0) = 0$ and $x^*_\alpha$ is the unique solution of the equation

\[ E\left[ \psi(X - x^*_\alpha) \right] = 0. \]

The following proposition collects some elementary properties of the generalized quantiles; in particular it shows that they satisfy Axioms 2.1-2.3.

Proposition 2.3. Let $\Phi_1, \Phi_2 : [0, +\infty) \to [0, +\infty)$ be convex, strictly increasing and satisfying (2.3). Let $X, Y \in M^{\Phi_1} \cap M^{\Phi_2}$, $\alpha \in (0, 1)$ and let $\pi_\alpha(X, x)$ be as in (2.4). Denote by $y^{*-}_\alpha$, $y^{*+}_\alpha$ the lower and upper generalized quantiles of $Y$. Then the following holds.

(a) translation invariance: if $Y = X + l$ with $l \in \mathbb{R}$ then $[y^{*-}_\alpha ; y^{*+}_\alpha] = [x^{*-}_\alpha + l ; x^{*+}_\alpha + l]$;

(b) positive homogeneity: if $\Phi_1(x) = \Phi_2(x) = x^\beta$, with $\beta \ge 1$, then $Y = \lambda X$ for $\lambda \ge 0 \implies [y^{*-}_\alpha ; y^{*+}_\alpha] = [\lambda x^{*-}_\alpha ; \lambda x^{*+}_\alpha]$;

(c) monotonicity: if $X \ge Y$ almost surely, then $x^{*-}_\alpha \ge y^{*-}_\alpha$ and $x^{*+}_\alpha \ge y^{*+}_\alpha$.


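Proof. (a) and (b) follow immediately from

\[ \pi_\alpha(X + l, x) = \pi_\alpha(X, x - l), \quad \text{for any } l \in \mathbb{R}, \]

and

\[ \pi_\alpha(\lambda X, \lambda x) = \lambda^\beta \pi_\alpha(X, x), \quad \text{for any } \lambda > 0, \]

respectively.

(c) From condition (2.5), the functions $g^-(X, x, \alpha) := \frac{\partial^- \pi_\alpha}{\partial x}$ and $g^+(X, x, \alpha) := \frac{\partial^+ \pi_\alpha}{\partial x}$ are non-increasing in $X$ and $\alpha$ and non-decreasing in $x$. Moreover it holds that

\[ x^{*-}_\alpha = \inf\left\{ x \in \mathbb{R} : g^+(X, x, \alpha) \ge 0 \right\}, \qquad x^{*+}_\alpha = \sup\left\{ x \in \mathbb{R} : g^-(X, x, \alpha) \le 0 \right\}. \]

Suppose now that $X \ge Y$ almost surely. Since $g^+(Y, x, \alpha) \ge g^+(X, x, \alpha)$ and $g^-(Y, x, \alpha) \ge g^-(X, x, \alpha)$, it follows that $x^{*-}_\alpha \ge y^{*-}_\alpha$ and $x^{*+}_\alpha \ge y^{*+}_\alpha$.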

Finally, the next proposition characterizes the generalized quantiles that also have reasonable properties in the sense of the axiomatic theory of risk measures. In particular it characterizes the generalized quantiles that are positively homogeneous and convex, and therefore guarantees that Axiom 2.4 is also satisfied by the expectile.

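Proposition 2.4. Let $\Phi_1, \Phi_2 : [0, +\infty) \to [0, +\infty)$ be strictly convex and differentiable with $\Phi_i(0) = 0$, $\Phi_i(1) = 1$ and $\Phi'_{1+}(0) = \Phi'_{2+}(0) = 0$. Let $\alpha \in (0, 1)$ and

\[ x^*_\alpha(X) := \arg\min_{x \in \mathbb{R}} \; \alpha\,E\left[ \Phi_1\left( (X - x)^+ \right) \right] + (1 - \alpha)\,E\left[ \Phi_2\left( (X - x)^- \right) \right]. \]

(a) $x^*_\alpha(X)$ is positively homogeneous if and only if $\Phi_1(x) = \Phi_2(x) = x^\beta$, with $\beta > 1$;

(b) $x^*_\alpha(X)$ is convex if and only if the function $\psi : \mathbb{R} \to \mathbb{R}$ defined in eq. (2.7) is convex; it is concave if and only if $\psi$ is concave;

(c) $x^*_\alpha(X)$ is coherent if and only if $\Phi_1(x) = \Phi_2(x) = x^2$ and $\alpha \ge \tfrac{1}{2}$.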

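Proof. (a) Due to the assumptions made on $\Phi_1$ and $\Phi_2$, the minimizer $x^*_\alpha(X)$ is the unique solution of

\[ E\left[ \psi(X - x^*_\alpha) \right] = 0. \tag{2.8} \]

Assume now that $x^*_\alpha$ is positively homogeneous. Let $\delta < 0 < \gamma$ and

\[ X = \begin{cases} \delta, & \text{with prob. } p \\ \gamma, & \text{with prob. } 1 - p, \end{cases} \qquad \text{with } p = \frac{\psi(\gamma)}{\psi(\gamma) - \psi(\delta)}. \]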


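Since

\[ E\left[ \psi(X) \right] = p\,\psi(\delta) + (1 - p)\,\psi(\gamma) = 0, \]

it follows by eq. (2.8) and by the uniqueness of $x^*_\alpha(X)$ that

\[ x^*_\alpha(X) = 0. \tag{2.9} \]

If $x^*_\alpha$ is positively homogeneous, then for every $\lambda > 0$ it must hold that

\[ x^*_\alpha(\lambda X) = 0, \tag{2.10} \]

that is

\[ p\,\psi(\lambda\delta) + (1 - p)\,\psi(\lambda\gamma) = 0. \]

Denoting by $\psi_1$ and $\psi_2$ the restrictions of $\psi$ to the domains $(-\infty, 0)$ and $(0, +\infty)$, respectively, the previous equation can be written as

\[ \frac{\psi_1(\lambda\delta)}{\psi_2(\lambda\gamma)} = \frac{\psi_1(\delta)}{\psi_2(\gamma)} \]

for every $\delta < 0 < \gamma$ and for every $\lambda > 0$. For $\gamma = 1$ we get

\[ \psi_1(\lambda\delta) = \frac{\psi_1(\delta)\,\psi_2(\lambda)}{\psi_2(1)}, \tag{2.11} \]

which is a Pexider functional equation. From Theorem 4 in Aczél [2] it follows that $\psi_1(-\lambda) = \psi_1(-1)\lambda^c$ and $\psi_2(\lambda) = \psi_2(1)\lambda^c$, for some $c > 0$. The condition $\Phi_1(x) = \Phi_2(x) = x^\beta$, with $\beta > 1$, follows by integrating equation (2.7). The reverse implication is due to Proposition 2.3(b).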

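(b) Due to the first order conditions, $x^*_\alpha(X)$ is the unique solution of

\[ E\left[ \psi(X - x^*_\alpha(X)) \right] = 0; \]

furthermore, the term on the left side of the equality above is a non-increasing function of $x^*_\alpha(X)$. If $\psi$ is convex, it follows that

\[ \begin{aligned} E\left[ \psi\left( \lambda X + (1 - \lambda) Y - \lambda x^*_\alpha(X) - (1 - \lambda) x^*_\alpha(Y) \right) \right] &= E\left[ \psi\left( \lambda (X - x^*_\alpha(X)) + (1 - \lambda)(Y - x^*_\alpha(Y)) \right) \right] \\ &\le \lambda\,E\left[ \psi(X - x^*_\alpha(X)) \right] + (1 - \lambda)\,E\left[ \psi(Y - x^*_\alpha(Y)) \right] = 0, \end{aligned} \]

from which

\[ x^*_\alpha(\lambda X + (1 - \lambda) Y) \le \lambda x^*_\alpha(X) + (1 - \lambda) x^*_\alpha(Y). \tag{2.12} \]

If $\psi$ is concave, then the opposite inequality holds, so that

\[ x^*_\alpha(\lambda X + (1 - \lambda) Y) \ge \lambda x^*_\alpha(X) + (1 - \lambda) x^*_\alpha(Y). \tag{2.13} \]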


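It remains to prove the reverse implication for eq. (2.12) (and for eq. (2.13) as well). The proof is done by contradiction. Suppose that $\psi$ is not convex; then there exist $x, y \in \mathbb{R}$ such that $\psi\left( \frac{x + y}{2} \right) > \frac{\psi(x)}{2} + \frac{\psi(y)}{2}$. Therefore there must exist $z \in \mathbb{R}$ and $\alpha \in (0, 1)$ such that

\[ \alpha\,\psi(z) + (1 - \alpha) \left( \frac{\psi(x)}{2} + \frac{\psi(y)}{2} \right) < 0 \le \alpha\,\psi(z) + (1 - \alpha)\,\psi\left( \frac{x + y}{2} \right). \]

Consider now two random variables $X, Y$ satisfying

\[ P(X = z, Y = z) = \alpha, \qquad P(X = x, Y = y) = P(X = y, Y = x) = \frac{1 - \alpha}{2}. \]

It is clear that $E[\psi(X)] = E[\psi(Y)] < 0$, while $E\left[ \psi\left( \frac{X + Y}{2} \right) \right] \ge 0$. Hence, $x^*_\alpha(X), x^*_\alpha(Y) < 0$ and $x^*_\alpha\left( \frac{X + Y}{2} \right) \ge 0$, so that $x^*_\alpha$ cannot be convex.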

(c) From (b), $\psi$ has to be convex. This implies that $\Phi'_2$ has to be concave and $\Phi'_1$ has to be convex. Moreover, from (a) it follows that $\Phi_1$ and $\Phi_2$ should necessarily satisfy $\Phi_1(x) = \Phi_2(x) = x^\beta$ for some $\beta > 1$. Putting together the two conditions above, it must hold that $\beta = 2$. To ensure convexity of $\psi$ one has to require additionally that $\alpha \ge \tfrac{1}{2}$.

We have finally proved that the α-expectile with α ≥ 1/2 satisfies Axioms 2.1-2.4, and is therefore a coherent risk measure. Proposition 2.4 gives a slightly stronger result: within the class of generalized quantiles it considers, the only coherent risk measures are the expectiles with α ≥ 1/2.
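The role of the condition α ≥ 1/2 can be checked numerically. The following Python sketch, written under the assumption of a toy probability space with four equally likely scenarios, compares the expectile of a sum with the sum of the expectiles; the bisection routine `expectile` is a helper introduced only for this illustration.

```python
import numpy as np

def expectile(sample, alpha, n_iter=200):
    """Sample alpha-expectile via bisection on alpha*E[(Y-x)^+] = (1-alpha)*E[(Y-x)^-]."""
    y = np.asarray(sample, dtype=float)

    def foc(x):
        return (alpha * np.maximum(y - x, 0.0).mean()
                - (1.0 - alpha) * np.maximum(x - y, 0.0).mean())

    lo, hi = y.min(), y.max()
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if foc(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Two loss vectors defined scenario by scenario on the same four scenarios
X = np.array([0.0, 0.0, 0.0, 10.0])
Y = np.array([10.0, 0.0, 0.0, 0.0])

for alpha in (0.9, 0.1):
    lhs = expectile(X + Y, alpha)
    rhs = expectile(X, alpha) + expectile(Y, alpha)
    relation = "<=" if lhs <= rhs else ">"
    print(f"alpha={alpha}: e(X+Y)={lhs:.4f} {relation} e(X)+e(Y)={rhs:.4f}")
```

For α = 0.9 the expectile of the sum stays below the sum of the expectiles, as sub-additivity requires, whereas for α = 0.1 the inequality is reversed, consistently with the concave case of Proposition 2.4(b).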

2.4 Elicitability

This section and the next one introduce and use a different notation, in order to be consistent with Bellini and Bignozzi [7], the paper that focuses on the elicitability and the uniqueness of the expectile. Informally, a statistical functional T on a set of probability measures M on the real line is elicitable if it can be defined as the minimizer of a suitable expected scoring function. This property has become more and more important in risk management thanks to its close relation with back-testing.

Law-invariant risk measures can be defined as functionals on the space $\mathcal{M}_{1,c}(\mathbb{R})$ of probability measures with compact support. Hence, the core of this section is the properties of statistical functionals in general, with risk measures as a special case. Denote by $\mathcal{M}_1(\mathbb{R})$ the set of probability measures on $\mathbb{R}$; each $\mu \in \mathcal{M}_1(\mathbb{R})$ is represented by its distribution function $F(x) = \mu((-\infty, x])$. A single-valued statistical functional is defined as a map $T : \mathcal{M} \to \mathbb{R}$, where $\mathcal{M} \subset \mathcal{M}_1(\mathbb{R})$. $\mathcal{M}$ is assumed to be m-convex, in the sense of the following definition.


Definition 2.10 (m-convex). A set M is said to be m-convex if for any λ ∈ [0, 1]

F, G ∈ M =⇒ λF + (1 − λ)G ∈ M.

The functional T can have different properties, one desirable is to have convex level sets (CxLS). To introduce this concept we need to introduce a few definitions.

Definition 2.11 (m-convex). A functional T is m-convex if for each F, G ∈ M and λ ∈ [0, 1]

T (λF + (1 − λ)G) ≤ λT (F ) + (1 − λ)T (G).

Definition 2.12 (m-quasiconvex). A functional T is m-quasiconvex if for each γ ∈ R its lower level sets

{T ≤ γ} := {F ∈ M s.t. T (F ) ≤ γ} are m-convex.

That is, for any λ ∈ [0, 1] and F, G ∈ M

F, G ∈ {T ≤ γ} =⇒ λF + (1 − λ)G ∈ {T ≤ γ} .

Definition 2.13 (m-quasiconcave). A functional T is m-quasiconcave if −T is m-quasiconvex.

Definition 2.14 (m-quasilinear). A functional T is m-quasilinear if it is both m-quasiconvex and m-quasiconcave.

Definition 2.15 (CxLS). A functional T has convex level sets (CxLS) if for each γ ∈ R the level sets

{T = γ} := {F ∈ M s.t. T (F ) = γ} are m-convex.

That is, for any λ ∈ [0, 1] and F, G ∈ M

T (F ) = T (G) = γ =⇒ T (λF + (1 − λ)G) = γ.

Since {T = γ} = {T ≤ γ} ∩ {T ≥ γ}, an m-quasilinear functional T has also CxLS; the converse is not necessarily true.
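As a numerical illustration of Definition 2.15, the Python sketch below mixes two discrete distributions that share the same 0.9-expectile and checks that every mixture keeps that value. The distributions, the helper `expectile_discrete` and the grid of mixing weights are assumptions chosen for this example.

```python
import numpy as np

def expectile_discrete(values, probs, alpha, n_iter=200):
    """alpha-expectile of a discrete distribution with the given atoms and probabilities."""
    v = np.asarray(values, dtype=float)
    p = np.asarray(probs, dtype=float)

    def foc(x):
        # alpha * E[(Y - x)^+] - (1 - alpha) * E[(Y - x)^-] under the weights p
        return (alpha * np.sum(p * np.maximum(v - x, 0.0))
                - (1.0 - alpha) * np.sum(p * np.maximum(x - v, 0.0)))

    lo, hi = v.min(), v.max()
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if foc(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

alpha = 0.9
# F puts mass 1/2 on 0 and 1/2 on 1; G is the point mass at 0.9.
# Both have 0.9-expectile equal to 0.9.
atoms = [0.0, 1.0, 0.9]
for lam in (0.0, 0.25, 0.5, 0.75, 1.0):
    mixture = [lam / 2, lam / 2, 1.0 - lam]   # weights of lam * F + (1 - lam) * G
    print(lam, expectile_discrete(atoms, mixture, alpha))
```

The printed expectile does not change with the mixing weight, which is exactly the implication T(F) = T(G) = γ ⟹ T(λF + (1 − λ)G) = γ for this pair of distributions.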

Definition 2.16 (mixture continuous). A functional T is mixture continuous if for each λ ∈ [0, 1] and F, G ∈ M the function

λ → T (λF + (1 − λ)G) is continuous in λ.


Before we give the most important definitions regarding elicitability we state a lemma on functionals with CxLS and mixture continuity that will be used in the next section.

Lemma 2.1. If T has CxLS and is mixture continuous, then the sets {T < γ}, {T ≤ γ}, {T > γ} and {T ≥ γ} are m-convex.

Proof. Consider $\lambda \in [0, 1]$ and $F, G \in \mathcal{M}$, and let $H_\lambda = \lambda F + (1 - \lambda) G \in \mathcal{M}$. Let $T(F) \le \gamma$ and $T(G) \le \gamma$. If $T(F) = T(G) = \gamma$, then from CxLS also $T(H_\lambda) = \gamma$, so we can assume without loss of generality that $T(F) < T(G) = \gamma$. Suppose now, by contradiction, that $T(H_\lambda) > \gamma$. From mixture continuity, since $T(H_\lambda) > T(G) > T(F)$, there exists $\lambda_0 \in [0, 1]$ such that

\[ T(\lambda_0 F + (1 - \lambda_0) H_\lambda) = T(G) = \gamma, \]

that is

\[ T\left( (\lambda_0 + (1 - \lambda_0)\lambda) F + (1 - \lambda_0)(1 - \lambda) G \right) = \gamma. \]

Since $\lambda_0 + (1 - \lambda_0)\lambda > \lambda$, the distribution $H_\lambda$ belongs to the segment joining the distributions $(\lambda_0 + (1 - \lambda_0)\lambda) F + (1 - \lambda_0)(1 - \lambda) G$ and $G$, and hence from CxLS we would have $T(H_\lambda) = \gamma$, a contradiction. The same argument shows the m-convexity of the sets $\{T \ge \gamma\}$, $\{T < \gamma\}$ and $\{T > \gamma\}$.

As anticipated before, elicitability is a concept deeply connected to that of a scoring function. In particular, a scoring function is defined as follows.

Definition 2.17 (scoring function). A scoring function $S : \mathbb{R}^2 \to [0, +\infty)$ satisfies, for any $x, y \in \mathbb{R}$,

(a) S(x, y) ≥ 0 and S(x, y) = 0 if and only if x = y;

(b) S(x, y) is increasing in x for x > y and decreasing in x for x < y;

(c) S(x, y) is continuous in x.

Finally, it is possible to define

Definition 2.18 (elicitability). A statistical functional $T$ is elicitable on $\mathcal{M}_T \subseteq \mathcal{M}$ if there exists a scoring function $S$ as in Def. 2.17 such that for each $F \in \mathcal{M}_T$

(a) \[ g_F(x) := \int S(x, y)\,dF(y) < +\infty \quad \forall x \in \mathbb{R}; \tag{2.14} \]

(b) \[ T(F) = \arg\min_{x \in \mathbb{R}} \int S(x, y)\,dF(y). \]

In this case the scoring function S is said to be strictly consistent with T. Now that we have the definition of elicitability, a natural question is: why is elicitability an important property for a risk measure? The reason lies in a recent policy of the Basel Committee on Banking Supervision [6], which asked for a deeper understanding of the consequences of changing the regulatory regime from VaRα to CVaRα, especially with regard to back-testing. Back-testing is the activity of periodically comparing the forecasts of the risk measure with the realized values of the variable of interest, in order to assess the accuracy of the forecasting methodology. Several authors (Gneiting [27], Emmer et al. [21], Embrechts and Hofert [20], Ziegel [57]) agree that elicitability is a key property for a risk measure since it provides a natural methodology to perform back-testing. In fact, if the functional T is elicitable and the forecaster minimizes the expected score, then the forecaster's optimal report is the true value of T. A natural statistic to perform back-testing is given by the average realized score

\[ \hat{S} = \frac{1}{n} \sum_{i=1}^{n} S(T_i, Y_i), \quad n \in \mathbb{N}, \]

where $T_1, \ldots, T_n$ are point forecasts of the functional $T$ and $Y_1, \ldots, Y_n$ are outcomes of a random variable $Y$ with distribution $F \in \mathcal{M}$.
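The back-testing statistic can be sketched in a few lines of Python. The example below uses the asymmetric quadratic score of Definition 2.7 as S and compares two constant forecasts on a toy series of realized losses; the function names, the data and the two forecasts are assumptions made for illustration only.

```python
import numpy as np

def expectile_score(x, y, alpha):
    """Asymmetric quadratic score S(x, y) for forecast x and realized outcome y."""
    d = np.asarray(y, dtype=float) - np.asarray(x, dtype=float)
    return alpha * np.maximum(d, 0.0) ** 2 + (1.0 - alpha) * np.maximum(-d, 0.0) ** 2

def average_score(forecasts, outcomes, alpha):
    """Average realized score: (1/n) * sum_i S(T_i, Y_i)."""
    return float(np.mean(expectile_score(forecasts, outcomes, alpha)))

alpha = 0.9
outcomes = np.array([0.0, 1.0] * 50)       # toy realized losses
good = np.full(outcomes.size, 0.9)         # the 0.9-expectile of this sample is 0.9
biased = np.full(outcomes.size, 0.5)       # a forecast that understates the risk
print(average_score(good, outcomes, alpha), average_score(biased, outcomes, alpha))
```

The forecast equal to the sample expectile obtains the lower average score, which is the behaviour a strictly consistent scoring function is meant to reward.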

It is important to recall that elicitability alone does not guarantee a correct ranking of different point forecasts. In fact, it ensures that the expected score is minimized only by exact reporting of the functional; it does not guarantee that, given two point forecasts, the more accurate one (i.e. the one closer to the true value of the functional) receives a lower expected score. This is guaranteed if, in addition to elicitability, one also requires that the scoring function is accuracy rewarding, in the terminology of Lambert et al. [37].

Definition 2.19 (accuracy rewarding). A scoring function $S$ as defined in Def. 2.17 is accuracy rewarding if for any $x_1, x_2 \in \mathbb{R}$ and $F \in \mathcal{M}$

\[ T(F) < x_1 < x_2 \ \text{ or } \ x_2 < x_1 < T(F) \implies \int S(x_1, y)\,dF(y) \le \int S(x_2, y)\,dF(y) \]

If that is the case, then it is possible to compare different approaches for computing T (historical simulation, parametric approaches, Monte Carlo, etc.) in a natural and consistent way, by simply comparing the ex post realized average scores.

All the concepts introduced above will be used in this section and in the next one to prove that the expectile is accuracy rewarding and that it is unique in the set of coherent and elicitable risk measures. Recalling Def. 2.7, the expectile is defined as

\[ T(F) = \arg\min_{x \in \mathbb{R}} \int \left[ \alpha \left( (y - x)^+ \right)^2 + (1 - \alpha) \left( (y - x)^- \right)^2 \right] dF(y); \]

it is elicitable by definition, hence it satisfies the following


Proposition 2.5. If T is elicitable on M1,c(R) then:

(a) T (F ) ∈ [ess inf(F ), ess sup(F)]. In particular T (δx) = x, for any x ∈ R;

(b) T has CxLS;

(c) T is mixture continuous;

(d) any scoring function S that elicits T is accuracy rewarding.

Proof. (a) From Definition 2.17, we have that $g_F(x) := \int S(x, y)\,dF(y)$ is decreasing for $x \le \operatorname{ess\,inf}(F)$ and increasing for $x \ge \operatorname{ess\,sup}(F)$; it follows that $T(F) \in [\operatorname{ess\,inf}(F), \operatorname{ess\,sup}(F)]$. The second part is a direct consequence of the definition of $\delta_x$, the Dirac measure at the point $x \in \mathbb{R}$.

(b) Let $F, G \in \mathcal{M}_{1,c}(\mathbb{R})$ and $\lambda \in [0, 1]$. For $T(F) = T(G) = \gamma$, it holds that

\[ T(\lambda F + (1 - \lambda) G) = \arg\min_{x \in \mathbb{R}} \left[ \lambda \int S(x, y)\,dF(y) + (1 - \lambda) \int S(x, y)\,dG(y) \right] = \gamma. \]

(c) Consider $F$, $G$ and $\lambda$ as before and define

\[ h_{F,G}(x, \lambda) := \lambda \int S(x, y)\,dF(y) + (1 - \lambda) \int S(x, y)\,dG(y) < +\infty. \]

The claim is that $h_{F,G}(x, \lambda)$ is jointly continuous in $x$ and $\lambda$. Indeed, for $(x', \lambda) \in [x - \varepsilon, x + \varepsilon] \times [0, 1]$, we have that

\[ \lambda S(x', y) \le S(x', y) \le \max\left\{ S(x - \varepsilon, y), S(x + \varepsilon, y) \right\}, \]

so from condition (2.14) and the dominated convergence theorem we have that

\[ \lambda_n \int S(x_n, y)\,dF(y) \to \lambda \int S(x, y)\,dF(y) \quad \text{whenever } (x_n, \lambda_n) \to (x, \lambda). \]

A similar argument applies to the second term $(1 - \lambda) \int S(x, y)\,dG(y)$, so that joint continuity of $h_{F,G}(x, \lambda)$ is established. It follows that the minimization problem

\[ \min_{x \in \mathbb{R}} h_{F,G}(x, \lambda) \]

has a jointly continuous objective function and, from (a), can be equivalently defined on a compact domain. Applying the Berge maximum theorem (see e.g. Aliprantis and Border [3]) it is possible to conclude that

\[ \arg\min_{x \in \mathbb{R}} h_{F,G}(x, \lambda) \ \text{ is continuous in } \lambda, \]

which corresponds to mixture continuity of $T$.

(d) The result follows from mixture continuity and Proposition 2 of Lambert


It is shown in Gneiting [27] that the expected shortfall (CVaR) does not have the CxLS property and therefore is not elicitable. So, up to now, the only measure of risk encountered which is both coherent and elicitable is the expectile. At this point another natural question is: are there other coherent and elicitable risk measures?

2.5 Uniqueness

The answer is no: the expectile is the only risk measure which is both coherent and elicitable, and this section is devoted to proving it formally. The proof exploits the properties of monetary and shortfall risk measures, which therefore need to be introduced.

Definition 2.20 (monetary risk measure). A risk measure $\rho : \mathcal{M}_{1,c}(\mathbb{R}) \to \mathbb{R}$ is monetary if $\rho(F) = T(F)$, with $T$ monotone and translation invariant.

We also say that a monetary risk measure is elicitable if it is an elicitable functional of the loss. A monetary risk measure induces two sets.

Definition 2.21 (acceptance/rejection set). Given a monetary risk measure $\rho : \mathcal{M}_{1,c}(\mathbb{R}) \to \mathbb{R}$, its acceptance set at the level of distributions is

\[ \mathcal{N} := \left\{ F \in \mathcal{M}_{1,c}(\mathbb{R}) \text{ s.t. } \rho(F) \le 0 \right\}, \]

and the corresponding rejection set is

\[ \mathcal{N}^c := \left\{ F \in \mathcal{M}_{1,c}(\mathbb{R}) \text{ s.t. } \rho(F) > 0 \right\}. \]

We will show that the set of monetary elicitable risk measures is a subclass of the shortfall risk measures introduced by Föllmer and Schied [25], defined as follows.

Definition 2.22 (shortfall risk measure). Let $l : \mathbb{R} \to \mathbb{R}$ be increasing and not constant, with $x_0$ in the interior of the convex hull of the range of $l$. The shortfall risk measure $\rho_l$ is defined as

\[ \rho_l(F) := \inf\left\{ x \in \mathbb{R} \text{ s.t. } \int l(y - x)\,dF(y) \le x_0 \right\}. \]

From Föllmer and Schied [25], if in addition $l$ is continuous, then by the dominated convergence theorem $\rho_l(F)$ is a solution of

\[ \int l(y - \rho_l(F))\,dF(y) = x_0, \tag{2.15} \]

while, if $l$ is strictly increasing, the solution of (2.15) is unique.
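Definition 2.22 translates directly into a root search. The Python sketch below evaluates the shortfall risk of a sample by bisection for a user-supplied loss function l; the search bracket, the function names and the choice x0 = 0 are assumptions of this example. With the loss l(y) = αy⁺ − (1 − α)y⁻ the result coincides with the α-expectile, and with l(y) = y it coincides with the mean.

```python
import numpy as np

def shortfall_risk(sample, loss_fn, x0=0.0, n_iter=200):
    """Shortfall risk inf{x : E[l(Y - x)] <= x0} of a sample, found by bisection."""
    y = np.asarray(sample, dtype=float)
    spread = y.max() - y.min() + 1.0
    lo, hi = y.min() - spread, y.max() + spread   # assumed to bracket the solution
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if np.mean(loss_fn(y - mid)) > x0:
            lo = mid      # mid is not acceptable yet: more capital is needed
        else:
            hi = mid      # mid is acceptable: try a smaller amount
    return hi

alpha = 0.9

def expectile_loss(z):
    # l(z) = alpha * z^+ - (1 - alpha) * z^-, increasing in z
    return alpha * np.maximum(z, 0.0) - (1.0 - alpha) * np.maximum(-z, 0.0)

print(shortfall_risk([0.0, 1.0], expectile_loss))                     # 0.9-expectile of {0, 1}
print(shortfall_risk([1.0, 2.0, 3.0, 4.0, 10.0], lambda z: z))        # sample mean
```

Because both loss functions are strictly increasing and continuous, the infimum in the definition is attained at the unique solution of (2.15), which is what the bisection converges to.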


$l((-\infty, 0)) \subseteq (-\infty, 0)$ and $l((0, +\infty)) \subseteq (0, +\infty)$. This guarantees that $\rho_l$ has the constancy property $\rho_l(\delta_y) = y$ that must be satisfied by any elicitable risk measure, by Proposition 2.5(a).

The following theorem is a characterization of the shortfall risk measures given in Weber [56].

Theorem 2.1 (Weber). Let $\rho$ be a monetary risk measure on $\mathcal{M}_{1,c}(\mathbb{R})$ with acceptance set $\mathcal{N}$. If there exists $x \in \mathbb{R}$ with $\delta_x \in \mathcal{N}$ such that, for each $y \in \mathbb{R}$ with $\delta_y \in \mathcal{N}^c$, it holds

\[ (1 - \alpha)\delta_x + \alpha\delta_y \in \mathcal{N} \quad \text{for sufficiently small } \alpha > 0, \tag{2.16} \]

then the following statements are equivalent:

(a) the acceptance and rejection sets $\mathcal{N}$ and $\mathcal{N}^c$ are m-convex and $\mathcal{N}$ is ψ-weakly closed for some gauge function ψ;

(b) $\rho$ is a shortfall risk measure with a left continuous $l$.

Before proving the relation between elicitable monetary risk measures and shortfall risk measures we have to introduce the general definition of ψ-weak topology. Let ψ : R → [1, +∞] be a continuous function which serves as gauge function, and let Cψ(R) be the linear space of all real continuous functions f for which there exists a constant c such that |f | ≤ cψ.

Definition 2.23 (ψ-weak topology). The ψ-weak topology is the topology generated by the family of functionals

\[ F \mapsto \int f(y)\,dF(y), \quad \text{with } f \in C_\psi, \ F \in \mathcal{M}_1(\mathbb{R}). \]

Moreover, we have that

\[ F_n \to F \ \ \psi\text{-weakly} \iff \int f\,dF_n \to \int f\,dF \ \text{ for every continuous function } f \in C_\psi. \]

Now we are able to prove that an elicitable monetary risk measure is a shortfall risk measure by simply adding a weak hypothesis on S.

Theorem 2.2. Let $\rho : M_{1,c}(\mathbb{R}) \to \mathbb{R}$ be a monetary and elicitable risk measure with a scoring function $S(x, y)$ that is continuous in $y$ and, for any $x \in [-\varepsilon, \varepsilon]$ with $\varepsilon > 0$, satisfies $S(x, y) \le \psi(y)$ for some gauge function $\psi$. Then it is a shortfall risk measure.

Proof. If $\rho$ is elicitable, then from 2.5 it has CxLS and it is mixture continuous. From Lemma 2.1 it follows that the sets $N = \{F : \rho(F) \le 0\}$ and $N^c = \{F : \rho(F) > 0\}$ are m-convex. Let $\delta_x \in N$ and $\delta_y \in N^c$; from mixture


conclude that ρ is a shortfall, it remains to show that N is ψ-weakly closed for some gauge function ψ. Recall that

$$N = \{F \in M_{1,c}(\mathbb{R}) \ \text{s.t.} \ \rho(F) \le 0\}, \quad \text{where } \rho(F) = T(F) = \arg\min_{x \in \mathbb{R}} \int S(x, y)\,dF(y).$$

Since under our hypotheses the scoring function $S(x, y)$ is accuracy rewarding, we have that

$$\rho(F) \le 0 \iff \int S(0, y)\,dF(y) \le \int S(\varepsilon, y)\,dF(y)$$

for each $\varepsilon > 0$. Let now $F_1, F_2, \ldots$ be a sequence of measures in $N$, with $F_n \to F$ ψ-weakly. We have that

$$\int S(0, y)\,dF_n(y) \le \int S(\varepsilon, y)\,dF_n(y).$$

Since $S(x, y)$ is bounded by $\psi(y)$ for $x \in [-\varepsilon, \varepsilon]$ and continuous in the second argument, it follows that

$$\int S(0, y)\,dF_n(y) \to \int S(0, y)\,dF(y), \qquad \int S(\varepsilon, y)\,dF_n(y) \to \int S(\varepsilon, y)\,dF(y).$$

Thus

$$\int S(0, y)\,dF(y) \le \int S(\varepsilon, y)\,dF(y)$$

and $\rho(F) \le 0$, so that $N$ is ψ-weakly closed and we can apply Theorem 2.1 to conclude that

$$\rho(F) = \inf\left\{ x \in \mathbb{R} \ \text{s.t.} \ \int l(y - x)\,dF(y) \le 0 \right\},$$

with a left-continuous $l$.

In general, the converse of Theorem 2.2 does not hold: there are shortfall risk measures which are not elicitable; for an example see Bellini and Bignozzi [7].

Finally, we are able to state the uniqueness result.

Theorem 2.3. Let $\rho : M_{1,c}(\mathbb{R}) \to \mathbb{R}$ be an elicitable monetary risk measure satisfying the hypotheses of Theorem 2.2. If $\rho$ is coherent, then it is an expectile with $\alpha \ge \frac{1}{2}$.

Proof. From Weber [56], we know that any coherent shortfall risk measure can be represented as the unique solution of

$$E\left[a(Y - T(F))^+ - b(Y - T(F))^-\right] = 0,$$

where $a \ge b > 0$. By defining $\alpha = \frac{a}{a+b}$ we can rewrite the loss function as $l(y) = \alpha y^+ - (1 - \alpha) y^-$, that is, the loss function of an expectile with $\alpha \ge \frac{1}{2}$.
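The normalisation step in the proof can be written out explicitly. The following worked derivation is a sketch that uses only the symbols of the proof; dividing by $a + b > 0$ does not change the solution set of the equation.

```latex
\begin{align*}
E\left[a(Y - T(F))^+ - b(Y - T(F))^-\right] = 0
  &\iff \frac{a}{a+b}\,E\left[(Y - T(F))^+\right]
      - \frac{b}{a+b}\,E\left[(Y - T(F))^-\right] = 0 \\
  &\iff \alpha\,E\left[(Y - T(F))^+\right]
      = (1 - \alpha)\,E\left[(Y - T(F))^-\right],
  \qquad \alpha = \frac{a}{a+b}, \\[4pt]
\alpha = \frac{a}{a+b} \ge \frac{1}{2}
  &\iff 2a \ge a + b \iff a \ge b .
\end{align*}
```

The last line shows that the constraint $a \ge b > 0$ on the coefficients is exactly the constraint $\alpha \ge \frac{1}{2}$ on the expectile level, with $\alpha < 1$ because $b > 0$.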


2.6 Further properties

We proved that the expectile is a coherent and elicitable risk measure. This section collects further properties of expectiles.

Proposition 2.6. Let $X \in L^1(\Omega, \mathcal{F}, P)$ be a random variable representing a loss, and let $x^*_\alpha$ be the expectile of $X$, the unique solution of

$$\alpha E\left[(X - x^*_\alpha)^+\right] = (1 - \alpha) E\left[(X - x^*_\alpha)^-\right]; \tag{2.17}$$

then

(a) monotonicity: $x^*_\alpha$ is monotone in $\alpha$, with $\alpha \in (0, 1)$;

(b) constancy: if $X = c$ almost surely, then $x^*_\alpha = c$;

(c) internality: if $X \in L^\infty$, then $x^*_\alpha \in [\operatorname{ess\,inf}(X), \operatorname{ess\,sup}(X)]$.

Proof. (a) Can be proved similarly to prop. 2.3(c).

(b) If $X = c$ almost surely, then it is easy to check that the equality in (2.17) holds if and only if $x^*_\alpha = c$.

(c) Follows immediately from constancy and monotonicity.

The formal definition of the expectile has no clear financial meaning; to grasp it, it is necessary to slightly change the point of view. Let us move to the setting in which random variables represent losses and recall that the acceptance set of a translation-invariant risk measure $\rho$ is

$$N := \{X \ \text{s.t.} \ \rho(X) \le 0\};$$

in the case of the α-expectile, the acceptance set can be written as

$$N^{\mathrm{exp}}_\alpha = \left\{ X \ \text{s.t.} \ \frac{E[X^-]}{E[X^+]} \ge \frac{\alpha}{1 - \alpha} \right\},$$

see for example Delbaen [17] for further details.

Since $X$ is a loss, $X^-$ is a gain; thus the expectile accepts financial positions according to their gain-loss ratio. This has a clear financial meaning: the gain-loss ratio, or Ω-ratio, is a popular performance measure in portfolio management (see Shadwick and Keating [53]). Given its economic relevance and its connection with the expectile, the Ω-ratio will be used in the concluding chapter to analyse the performance of portfolios based on expectile optimization.
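The properties of this section can be checked numerically on simulated data. The sketch below is an illustration and not the procedure used in the case study: the helper names `expectile` and `gain_loss_ratio` are hypothetical, the losses are drawn from a Gaussian distribution chosen only for convenience, and the expectile is computed by bisection on the first-order condition α E[(X − x)+] = (1 − α) E[(X − x)−] for the empirical distribution.

```python
import random

def expectile(sample, alpha, iters=200):
    """alpha-expectile of the empirical distribution of `sample`, found by
    bisection on alpha*E[(X-x)^+] - (1-alpha)*E[(X-x)^-] = 0 (hypothetical helper)."""
    def first_order_condition(x):
        pos = sum(max(v - x, 0.0) for v in sample)
        neg = sum(max(x - v, 0.0) for v in sample)
        return alpha * pos - (1.0 - alpha) * neg  # decreasing in x

    # Internality: the root lies between the smallest and largest observation.
    lo, hi = min(sample), max(sample)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if first_order_condition(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def gain_loss_ratio(sample):
    """E[X^-] / E[X^+] for a sample of losses (X^- is the gain part)."""
    gain = sum(max(-v, 0.0) for v in sample)
    loss = sum(max(v, 0.0) for v in sample)
    return gain / loss

random.seed(0)
losses = [random.gauss(-0.5, 1.0) for _ in range(2000)]  # simulated losses
alphas = [0.1, 0.3, 0.5, 0.7, 0.9]
values = [expectile(losses, a) for a in alphas]

# Acceptance: x*_alpha <= 0 should coincide with gain/loss >= alpha/(1-alpha).
ratio = gain_loss_ratio(losses)
accepted = [(v <= 0.0, ratio >= a / (1.0 - a)) for a, v in zip(alphas, values)]
```

In this sketch `values` should be non-decreasing in α, the 0.5-expectile should coincide with the sample mean, and each pair in `accepted` should contain two equal booleans, which is the empirical counterpart of the gain-loss characterisation of the acceptance set.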