Department of Mathematics
Master of Science in Mathematical Engineering

Bayesian Functional Emulation and Prediction of CO2 Emission on Future Scenarios

Master Degree Thesis

Candidate: Luca Aiello
Supervisor: Prof. Alessandra Guglielmi
Co-supervisor: Dr. Matteo Fontana

Academic Year 2021


...our economic system and our planetary system are now at war. Or, more accurately, our economy is at war with many forms of life on earth, including human life. What the climate needs to avoid collapse is a contraction in humanity’s use of resources; what our economic model demands to avoid collapse is unfettered expansion. Only one of these sets of rules can be changed, and it’s not the laws of nature.


Abstract

The goal of this thesis is the proposal of a statistical method for a relevant current issue: emulation and uncertainty quantification of climate model simulations.

The climate science community makes large use of huge deterministic computational models in order to obtain projections of key variables related to climate change. With researchers working in parallel all over the world, summarising their work is important to help policy decision makers implement the right actions. In this thesis we analyze the CO2 time-series output of different deterministic models.

We propose a statistical emulator for these data, that is, a functional regression model. Inference on the unknown parameters is carried out in a fully Bayesian framework, with a prior distribution on the vector of all parameters. In particular, we build a model in which we describe the time-dependent outcome CO2 as a functional mean plus a time-dependent error, that is, a "function-on-scalar" regression model with functional response and scalar covariates.

We also propose two different parameterizations for the covariance matrix of the error, with their corresponding marginal priors. In the second case, we model heterogeneous variances within an autoregressive covariance structure for our data. In this way, we are able to capture the time dependence of our data and obtain more precise posterior predictive distributions of CO2 emissions at time points not included in the dataset, i.e. at time points where data were not available.

We have successfully applied the Bayesian approach to simulator data by modeling the variability induced by the different deterministic models, giving a continuous framework to the discretized output of the simulators and thus obtaining the predictive distribution at unobserved points.


Sommario

The main goal of this thesis is the proposal of a statistical method for a current and relevant problem: the emulation and uncertainty quantification of climate simulation models.

The scientific community working on climate makes large use of enormous computational models in order to obtain projections of some key variables related to climate change. Since many researchers work in parallel all over the world, summarising their work is of fundamental importance to encourage and support policy decision makers in promoting adequate actions. In this thesis we analyze time series of CO2 emissions produced by different deterministic computational models.

We propose a statistical emulator for these data, namely a functional regression model. Inference on the unknown parameters is carried out following the Bayesian approach, through a prior distribution on the vector of all unknown parameters. In particular, we build a model in which we describe CO2, as a function of time, as a functional mean plus a time-dependent error, i.e. a "function-on-scalar" regression model with functional response and scalar covariates.

We also propose two different parameterizations for the covariance matrix of the error, with the corresponding prior distributions. In the second case, for our data, we model heterogeneous variances within an autoregressive covariance structure. In this way, we specify the type of time dependence of our data, thus obtaining more precise posterior predictive distributions of CO2 emissions at time points not included in the dataset, i.e. where data were not available.

In conclusion, in this thesis we have successfully applied the Bayesian approach to data coming from simulators, modeling the variability induced by the different deterministic models. Moreover, we have provided a continuous framework for the discretized CO2 profiles and thus derived the predictive distributions at unobserved time points.


Acknowledgments

To begin with, I would like to thank Professor Guglielmi for having supported this work and for never letting me lower my guard. Her constant pursuit of rigour and precision has been a life lesson even before an academic one. I would also like to thank Dr. Matteo Fontana, in whom I found, besides a valuable guide, a person with whom to exchange ideas and opinions. My thanks also go to all the professors I have met over these years, who made me appreciate how complex the world is. Finally, within the university, I would like to thank Professor Verri; without his psychological terrorism on the first day of class, I would probably have been among "the half of you who will not be sitting here next semester".

A special thanks goes to my mathematics teacher in the last year of high school, Professor Giugni, who had warned me about choosing this path ("mind you, you're entering a monastery"). Without his passion for the subject, his light-heartedness in teaching and the emotional charge he managed to give me, I am not sure I would have taken this road.

During my childhood and adolescence I was lucky enough to be surrounded by many friends, who have contributed to the person I am now and to the one I will become. Among them I would like to thank Enri, the person I have known the longest outside my family, and one of the kindest friends I have. I would then like to thank the friends with whom I spent my middle and high school years and who will accompany me throughout my life: Geky, who, despite being a gobbo, has still managed to stay in my heart, Andre and Gian. The ease of relating to you has always been precious; I need it. How could I not thank the guys of Goodison Park: Fausto, Pasta, Marco, Gian, Mora, Piube, Johnny, Teo, Mura and Dono. Companions in adventure who taught me that keeping one's feet on the ground is at least as important as achieving important results.

An important tribute goes to Santo, a very great friend. I will never forget the knowing glances between the desks of high school and the philosophical chats that accompany our afternoons.


The whole GLS group has been my home during these years in Milan. Our constant seeking of one another has allowed us to weave friendships and relationships that will go on forever. In particular I would like to thank the people I have spent the most time with: Jimmy, in whom I found a special, sincere and genuine friend, you are a reference point for me; Simo, an original and never banal person, we will surely keep acting like idiots together; Giodaz, an ordinary person on the surface, but an unleashed madman in fact; Giopogg, who taught me that between thinking something and doing it there lies willpower; Apotz, one of the most generous and kind people I have ever met; Checco, an affectionate friend; Eug, a cosplayer met by chance on a regional train; Bea, legendary companion of the Chilean adventures; Leti, a fantastic friend with a huge heart, always ready to welcome you.

No less important, the rest of the group: Teo, Albi, Dipa, Andre, Angelino, Cami, Nderep, Lollo, Lux, Tower, Buch, Ste and Tina. All of you have passed on and taught me something, and the credit for this degree is also yours.

Ruffo, what can I say, it would be banal to define you first as a friend and then as a flatmate. The two things together made me grow particularly fond of you. You have been a very important travel companion. Our differences have been fundamental for my growth. I have lost count of the dinners in which we lost control of our laughter and could no longer speak. You are in my heart.

Nene, by chance, or perhaps not, we found ourselves following the same course of study. It was not a given that we would manage to keep our bond strong with so much in common at such a young age. We did it in the most natural and spontaneous way possible, and it was precisely your naturalness and sincerity that made some moments of these years less heavy. You are a mum and a life companion at the same time.

Thanks go to the whole extended family. In particular, I would like to pay tribute to my own family. My parents have given me a great deal over these years, in terms of opportunities and of teachings. If I ever happen to have children, I would like to be like you. Thank you dad, for passing on to me humility, the simplicity of life and the awareness that hard work is necessary to reach one's goals. Thank you mum, for teaching me that in life you may even be superman, but the values of kindness, gentleness and generosity should never be forgotten. Thank you Michele, for bringing into our home the urge to break out of the mould and try to see a little beyond the farthest horizon each of us possesses. Thank you Giulia, for passing on to me the determination you have in life, and for having been the Big Sister, an ever-watchful eye on a brother who a little too often believed himself "invincible". Thank you Antonio, because you too taught me that sacrifices are necessary for any goal one wants to reach.

The greatest dedication of this thesis goes to grandma Rosanna, who has always followed my growth with great attention and great affection, passing on to me those values that today may seem old-fashioned but are more necessary than ever. The extraordinary strength with which you have always managed to face life's adversities has made me a stronger person, contributing significantly to the achievement of this goal.

This achievement belongs as much to me as it does to all of you. Thank you.


Table of contents

Abstract
Sommario
Acknowledgments
Contents
List of figures
List of tables
Introduction
1 Climate Change Modeling and Simulation
2 Review of Statistical models
   2.1 Functional Data Analysis
      2.1.1 Representing Data by Basis Functions
      2.1.2 Function on Scalar Regression
   2.2 Bayesian Inference
      2.2.1 Computation of Posterior Inference
   2.3 Bayesian Functional Data Analysis
3 Exploratory Data Analysis of CO2 Emissions
   3.1 Dataset
   3.2 Data Visualization
4 Bayesian Functional Regression
   4.1 Fixed Effects Model
      4.1.1 Full Conditionals
      4.1.2 Hyper-parameters Choice
      4.1.3 Posterior Inference
   4.2 Mixed Effects Model
      4.2.1 Full Conditionals
      4.2.2 Hyper-parameters Choice
      4.2.3 Posterior Inference
      4.2.4 Sensitivity Analysis
   4.3 Autoregressive Covariance Structure
      4.3.1 Prediction
      4.3.2 Comparison with the IW Covariance Structure
5 Conclusions
Bibliography
Appendices
A Further Material on the Posterior Inference for CO2 Emissions
   A.1 IW Prior for Time Covariance Dependence
   A.2 ARH(1) Time Covariance Dependence
      A.2.1 AR Order Investigation
B Stan Codes
   B.1 Fixed Effects Model
   B.2 Mixed Effects Model
      B.2.1 IW Prior on Time Dependence Covariance Matrix
      B.2.2 ARH(1) Structure on Time Dependence Covariance Matrix
   B.3 Prediction with ARH(1) Covariance Structure

List of figures

1.1 SSPs plane on adaptation and mitigation
2.1 The nine cubic B-spline basis functions that are used in this thesis.
2.2 Update of the prior distribution through the likelihood.
3.1 Explanatory frame of the dataset.
3.2 Future projections described by the SSP variables.
3.3 CO2 emissions profiles grouped by IAM; within each panel the 23 SSPs combinations.
3.4 CO2 emissions profiles grouped by SSP combination; within each panel the 5 IAM projections of such combination.
3.5 Boxplots of CO2 emissions for each year; in each panel a different IAM.
3.6 Boxplots of CO2 emissions for each IAM; in each panel a different year.
3.7 Histograms of CO2 emissions, considering the 5 IAMs; in each panel a different year.
4.1 Traceplots of two representative coefficients (i.e. [B_{W,1}]_1 and σ²_{W,1}).
4.2 Diagnostic of the same chains in Fig. 4.1. Left: running mean. Right: autocorrelation.
4.3 nESS histogram of all the parameters of the fixed effects model.
4.4 Posterior mean of β_k(t) (solid red line); 95% credibility bands (shaded red); the region in which CI include zero (shaded grey).
4.5 Emission estimates as in (4.8) for the given 23 combinations of variables (curves); 95% credibility bands associated (grey shading); observed outputs given by the IAMs (points).
4.6 Mean and 95% credibility bands for the regression coefficient function with different hyper-parameters.
4.7 Graphical representation of model (4.11)-(4.12) (taken from Goldsmith and Kitago, 2016).
4.8 Diagnostic of some representative parameters (i.e. [B_{W,1}]_1 and σ²_{W,1}). Left: running mean. Right: autocorrelation.
4.9 Posterior mean of β_k(t) (solid red line); 95% credibility bands (shaded red); the region in which CI include zero (shaded grey).
4.10 Visualization of π_{0,k}(t) = P(|β_k(t)| > δ_k(t)) (in solid red) for k = 0, 1, ..., p and the 95% threshold (in dotted black).
4.11 Emission estimates as in (4.15) for the given 23 combinations of variables (curves); 95% credibility bands associated (grey shading); observed outputs given by the IAMs (points).
4.12 Data range compared with the estimates for the reference SSPs (computed as in (4.15)) together with the 95% credibility bands.
4.13 Individual effect of each variable when deviating from SSP2.
4.14 Total effect of each variable deviating from SSP2.
4.15 Time correlation matrix of the residuals of the mixed effects model.
4.16 Posterior predictive distributions, computed as in (4.21), of the three base scenarios for unobserved data points.
A.1 nESS histogram of all the parameters of the mixed effects model, with the IW covariance prior.
A.2 Individual effect of each variable when deviating from SSP2 to SSP1, for the mixed effects model with the IW covariance prior.
A.3 Individual effect of each variable when deviating from SSP2 to SSP3, for the mixed effects model with the IW covariance prior.
A.4 Total effect of each variable when deviating from SSP2 to SSP1, for the mixed effects model with the IW covariance prior.
A.5 Total effect of each variable when deviating from SSP2 to SSP3, for the mixed effects model with the IW covariance prior.
A.6 nESS histogram of all the parameters of the mixed effects model, with the ARH(1) covariance structure.
A.7 Posterior mean of β_k(t) (solid red line); 95% credibility bands (shaded red); the region in which CI include zero (shaded grey) of the mixed effects model, with the ARH(1) covariance structure.
A.8 Visualization of π_{0,k}(t) = P(|β_k(t)| > δ_k(t)) (in solid red) for k = 0, 1, ..., p and the 95% threshold (in dotted black) of the mixed effects model, with the ARH(1) covariance structure.
A.9 Emission estimates as in (4.15) of the mixed effects model, with the ARH(1) covariance structure, for the given 23 combinations of variables (curves); 95% credibility bands associated (grey shading); observed outputs given by the IAMs (points).
A.10 Data range compared with the estimates for the reference SSPs (computed as in (4.15)) together with the 95% credibility bands of the mixed effects model, with the ARH(1) covariance structure.
A.11 Individual effects deviating from the reference scenario (SSP2) for the mixed effects model, with the ARH(1) covariance structure.
A.12 Total effects deviating from the reference scenario (SSP2) for the mixed effects model, with the ARH(1) covariance structure.
A.13 Individual effect of each variable when deviating from SSP2 to SSP1 for the mixed effects model, with the ARH(1) covariance structure.
A.14 Individual effect of each variable when deviating from SSP2 to SSP3 for the mixed effects model, with the ARH(1) covariance structure.
A.15 Total effect of each variable when deviating from SSP2 to SSP1 for the mixed effects model, with the ARH(1) covariance structure.
A.16 Total effect of each variable when deviating from SSP2 to SSP3 for the mixed effects model, with the ARH(1) covariance structure.
A.17 Time correlation matrix of the residuals of the mixed effects model, with the ARH(1) covariance structure.
A.18 ACF and PACF for the residuals of the mixed effects model, with the ARH(1) covariance structure.


List of tables

4.1 LPML and WAIC for different choices of hyper-parameters in the fixed effects model; in bold the best choice for each index.
4.2 Computation times for each choice of hyper-parameters in the fixed effects model.
4.3 Set of hyper-parameters used in the sensitivity analysis.
4.4 LPML and WAIC for different choices of hyper-parameters in the mixed effects model; in bold the best choice for each index.
4.5 Computation times for each choice of hyper-parameters in the mixed effects model.
4.6 LPML and WAIC for different choices of hyper-parameters in the mixed effects model, with the ARH(1) covariance structure; in bold the best choices for each index.
4.7 Computation times for each choice of hyper-parameters in the mixed effects model, with the ARH(1) covariance structure.

Introduction

The climate science community makes large use of huge deterministic computational models in order to obtain projections of key variables related to climate change. These variables include greenhouse gas emissions, temperatures, precipitation, winds, sea level and many others. Usually these models require massive amounts of inputs and parameters, but only some of them turn out to be significant. Eliminating non-essential parameters is a key task to ease the computational burden of the models.

Projections of climate variables are a critical ingredient in determining environmental policies, with researchers working in parallel all over the world. Summarising their work is necessary to help policy decision makers implement the right actions. Statistical methods can consolidate these efforts by providing uncertainty quantification, since probability distributions help us quantify such uncertainty, giving support in analyzing future scenarios. Uncertainty quantification is a scientific discipline that provides information with the aim of reducing uncertainty both in the real and in the computational world.

Large scale computer simulations based on deterministic mathematical models have been widely used since the middle of the last century, but such simulations are, without any doubt, simplifications and therefore inexact. As a discipline, uncertainty quantification was born to develop methods able to understand bounds on the errors produced by the simulation, and limitations on the solutions that we get from such computations.

In this thesis we analyze the output of different Integrated Assessment Models (IAMs), deterministic simulators of future projections concerning important variables in the study of climate change. More precisely, since each IAM relies on different assumptions and methods, we aim to model the variability induced by this experimental setting, so as to emulate the simulators. The variable we are interested in is carbon dioxide (CO2), since it is one of the main factors to act on in order to reduce the effects of climate change on the environment. Statistical emulators of large scale climate models are fundamental because they can "fast" approximate complex simulation models, requiring only a limited number of training runs.

This work deals with the emulation of dynamic computer models that simulate phenomena evolving with time. For this reason we propose an emulator based on a functional regression model, with inference on the parameters carried out in a fully Bayesian framework. Previous work related to this type of emulation includes Stommel (1961), where a model of ocean density is considered; Birrell et al. (2011), who emulate the process describing an influenza outbreak in London; and Williamson and Blaker (2014), who create an emulator with purposes similar to ours, emulating large scale computational models whose output is a time series. In other words, the emulator is a statistical model for the data, where the parameters are unknown, and we assume a prior distribution on them, since we adopt the Bayesian approach. In our context, the data are time-varying vectors of CO2 emissions produced by different IAMs under different scenarios.

We present a model in which we describe the time-dependent outcome as a functional mean plus a time-dependent error, the "function-on-scalar" regression with functional response and scalar covariates. This method, under the Bayesian approach, has been successfully adopted to model and estimate the response in different applications (Goldsmith and Kitago, 2016; Kowal, 2018; Montagna et al., 2012; Morris, 2015, and many others). Among the aforementioned works, we built on the model in Goldsmith and Kitago (2016). We first adopt their model, then we adjust it for our purposes, and ultimately extend the method in order to include different covariance structures useful for predictive inference. We proceed by making use of a B-spline basis expansion of the coefficient functions of the model (i.e. the mean), in a Bayesian way, that is, assuming a prior for the parameters of the expansion. The covariance structure on which we make inference, which models the time dependence within the time-series output, has first been modeled without imposing a structure, assuming an Inverse-Wishart prior, a standard prior for covariance matrices. This choice leads to good performance of the model in terms of estimation and interpretation of both the regression coefficients and the emission projections. However, this parametric assumption for the covariance structure has a shortcoming for the posterior predictive distributions, since we have not been able to predict CO2 emissions at unobserved times. In fact, the predictive distributions were too imprecise and diffuse to be considered "useful".

For this reason, we propose an alternative model with heterogeneous variances within an autoregressive covariance structure (i.e. ARH(1)), aiming to further characterize the temporal dependence of the data. This is an attempt to add a modeling feature to the functional outcome able to take into account a reasonable time-dependence design. In fact, under this model, it is possible to obtain more precise posterior predictive distributions of CO2 emissions at time points in which the computer outputs were not observed (i.e. between the decades).
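To make the ARH(1) structure concrete, the following is a minimal sketch in R (the language used for the analyses in this thesis) of one common parameterization of a heterogeneous AR(1) covariance matrix; the standard deviations and the autoregressive parameter below are purely illustrative, not the quantities estimated later.

```r
# One common ARH(1) parameterization (an assumption here, for illustration):
# Sigma[s, t] = sigma_s * sigma_t * rho^|s - t|, i.e. AR(1) correlations
# combined with a different marginal standard deviation at each time point.
build_arh1 <- function(sigmas, rho) {
  n_times <- length(sigmas)
  lag <- abs(outer(seq_len(n_times), seq_len(n_times), "-"))  # |s - t| lags
  outer(sigmas, sigmas) * rho^lag
}

# Illustrative values for the nine decadal time points (not posterior estimates)
sigmas <- seq(1, 3, length.out = 9)
Sigma  <- build_arh1(sigmas, rho = 0.8)
round(cov2cor(Sigma)[1:4, 1:4], 2)  # induced correlations decay geometrically with lag
```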

Under the Bayesian approach we treat the unknown parameters as random, with a prior distribution; hence inference is based on the posterior distribution, i.e. the conditional distribution of the random parameters given the data, obtained via Bayes' theorem. However, in general this posterior can be computed only via simulation methods. In particular, as is standard in Bayesian applications, we need to use an MCMC method to sample from the posterior distribution and obtain approximations of summaries of the posterior distribution. In our application, the MCMC sample was automatically built by Stan (Stan Development Team, 2019), an open source software, written in C++ (ISO, 2012), which performs MCMC sampling. Posterior inference allows us to model the uncertainty continuously and provides policy-making bodies with a great tool, capable of extracting, and consequently interpreting, time-dependent information indicating what is driving CO2 emissions. In addition, we are able to statistically assess the "significance" of the variables throughout the century, understanding in greater detail what characterizes the emission projections in the simulators.
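The actual Stan programs used for the models of this thesis are reported in Appendix B; as a purely illustrative sketch of the workflow, a toy Stan model (unrelated to the CO2 data) can be compiled and sampled from R through the rstan interface as follows.

```r
library(rstan)

# Toy Stan program (illustrative only): a normal likelihood with unknown mean and sd
model_code <- "
data { int<lower=1> N; vector[N] y; }
parameters { real mu; real<lower=0> sigma; }
model {
  mu ~ normal(0, 10);        // weakly informative priors
  sigma ~ cauchy(0, 5);
  y ~ normal(mu, sigma);     // likelihood
}
"

set.seed(1)
fit <- stan(model_code = model_code,
            data = list(N = 50, y = rnorm(50, mean = 2, sd = 1)),
            chains = 2, iter = 2000, seed = 1)
print(fit, pars = c("mu", "sigma"))  # posterior summaries from the MCMC sample
```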

The efforts previously mentioned consisted in the implementation of a statistical model able to emulate the IAM simulators. It is capable of modeling the uncertainty contained within each model and the variability induced by the multi-model framework. As a result, a single model is produced, and, most importantly, it gives a plausible range inside which the outputs of a new simulator could lie. My original contributions in this work amount to: (i) the proposal of two Bayesian models to make inference on CO2 emissions, an approach that is relatively new for this type of simulator data; (ii) understanding the data and the inference we have obtained. The models we have used have already appeared in the statistical literature, but seem quite new in the application field of climate sciences. Of course, in (i) and (ii) I have been supported and guided by my thesis supervisor and co-supervisor. Moreover, I have written the Stan and R codes to run the MCMC simulations for computing the posterior inference. Other important original contributions of my work amount to: (iii) modeling the variability of the single SSP combinations induced by the different IAMs; (iv) giving a continuous framework to the discretized output of the simulations; (v) obtaining the predictive distribution at unobserved points, making use of the autoregressive covariance structure.


Chapter 1

Climate Change Modeling and Simulation

Global warming is one of the biggest challenges of our age, since it consists in an extremely complex and life-threatening phenomenon whose study is more topical than ever. It is a multidisciplinary problem involving many fields such as physics, chemistry, biology, engineering, economics, sociology and many others. Statistics, in general, is a discipline that is transverse to the aforementioned ones and a powerful instrument able to give relevant information to the experts of a specific field. A statistical approach to the problem consists in a deep data analysis, so as to discover unknown connections and relations between such different disciplines and to detect useful information for making well-informed decisions in a complex context.

Nowadays, many factors are considered as main causes of the recent speed-up of global warming, first and foremost the emission of greenhouse gases into the atmosphere. Reducing them is one of the most common, and most effective, ways to limit the effect of an individual, a community, an industry, or a country on the environment. In turn, greenhouse gas emissions are related to many causes belonging to many fields, and this analysis will focus on determining how crucial factors affect the emissions.

Integrated Assessment Models

The data we base our analysis on are greenhouse gas discrete time profiles taken from Marangoni et al. (2017), who set the background application framework on which this work is based. These profiles are obtained, taking as input various Shared Socioeconomic Pathways (SSPs), which will be introduced later, through Integrated Assessment Models (IAMs), which are deterministic computer models (i.e. simulators that produce the same outputs every time, if they are given the same inputs). These models are "integrated" since they span multiple academic disciplines, including economics and climate science and, for more comprehensive models, also energy systems and land-use change. The word "assessment" comes from the use of these models to provide information for answering policy questions. To carry out these integrated assessment studies quantitatively, deterministic numerical models are used. Integrated assessment modelling does not provide predictions for the future but rather estimates of what possible scenarios look like. The differences among the IAMs put uncertainty over the estimates, and this thesis puts part of its effort into determining what is actually driving the emission behaviour.

Emission profiles generated by several IAMs allow one to quantify both parametric and model uncertainty, which have been identified as a major source of uncertainty. Moreover, diagnostics of IAMs is a relatively nascent field that is growing in importance to help validate models. Hence, it is useful to disentangle the key drivers of the uncertainty in emission projections, because that understanding can help design better hedging strategies.

Shared Socioeconomic Pathways

The narratives used as input for the IAMs, each indicating a future scenario, are described by the previously mentioned Shared Socio-Economic Pathways. Scenarios showing future greenhouse gas emissions are needed to estimate climate impacts and the mitigation efforts required for climate stabilization. SSPs are part of a new framework that the climate change research community has adopted to facilitate the integrated analysis of future climate impacts, vulnerabilities, adaptation, and mitigation. Information about the scenario process and the SSP framework can be found in Moss et al. (2010), Van Vuuren et al. (2014), O'Neill et al. (2014) and Kriegler et al. (2014). An SSP consists in the discretization of a continuous plane of mitigation and adaptation to climate change (Riahi et al., 2017), as in Fig. 1.1. Thus, there are 5 SSP scenarios. Three of them belong to the main diagonal, describing futures where the challenges for both adaptation and mitigation are low (SSP1), intermediate (SSP2) or high (SSP3). In addition there are two asymmetric scenarios that do not belong to the main diagonal: SSP4 has high challenges for adaptation combined with low challenges for mitigation, while the opposite is true for SSP5. A more precise explanation of the aforementioned future scenarios is the following:

• SSP1 - Sustainability, taking the Green Road (low challenges to mitigation and adaptation): this is the best-case scenario. The habits of the world population become more sustainable and respectful of the environment. There is a general transition towards better educational and health systems, and inequality is gradually reduced within and across countries.

Figure 1.1: SSPs plane on adaptation and mitigation

• SSP2 - Middle of the Road (medium challenges to mitigation and adaptation): the world does not deviate from its historical pattern. Some countries reach good results in sustainability and equality while others are still too far from these objectives. The global population remains, in general, vulnerable to economic and environmental crises.

• SSP3 - Regional Rivalry, a Rocky Road (high challenges to mitigation and adaptation): competition among countries is worsened by nationalism and regional conflicts. There is low interest in global challenges and high concern for local issues. Inequality spreads across countries due to a decline in education and technological investments.

• SSP4 - Inequality – A Road Divided (low challenges to mitigation, high challenges to adaptation): unequal investments bring huge inequalities within and across countries, with one part of the population benefiting from them and the other not reached by the improvements. The development of low-carbon technologies goes on alongside the persistent usage of fossil fuels, and the focus is on regional issues.

• SSP5 - Fossil-fueled Development – Taking the Highway (high challenges to mitigation, low challenges to adaptation): high faith in competitive markets and innovation as a path to reach sustainability. Strong investments in welfare and in technology to face environmental problems such as air pollution. The global economy keeps growing, bringing advantages for the whole population.

The levels we are going to use are the ones represented on the main diagonal of the Cartesian plane shown above, whose two axes are the challenges to mitigation and to adaptation to climate change. These pathways were developed over the last years as a joint community effort and describe plausible major global developments that together would lead in the future to different challenges for mitigation and adaptation to climate change.

Over the past years, many projects have been designed to analyze the future development of greenhouse gas emissions (Lashof and Ahuja, 1990; Cole et al., 1997, and many others). These works were carried out in order to evaluate the global change in temperatures and to allow the assessment of emission reduction policies. There are two main sources of uncertainty in this type of study. The first one relates to the key drivers of CO2 emissions: how will they evolve in the next years? The second one concerns the deterministic models used to obtain the CO2 emissions and their sensitivity.

This thesis aims to address these uncertainties through a careful analysis of the individual and combined influence of each driver on greenhouse gas emissions, in a multi-model (i.e. IAMs) and multi-scenario perspective. Beyond this, a technical aim is to understand how certain factors contribute to the emissions. One of the main goals has been clear since the beginning: obtaining an estimated profile on a mesh much denser than the one given by the IAMs. Furthermore, efforts were made so that this single estimate would contain all the information carried by the different IAMs, summarising and modeling their differences. It is important to have proper instruments for treating the problem in a continuous framework, since a 10-year frequency could be too coarse in a world where climate change effects seem to be accelerating ever more.

Emulation

Large scale computer simulations (such as the IAMs) are widely used in modern scientific research to investigate, among other things, physical phenomena that are too expensive or impossible to replicate directly (Fan et al., 2009; Textor et al., 2005). Often, the interest focuses on quantifying how uncertainty in the input arguments propagates through the simulator and produces a distribution function over one or many outputs of interest.

Learning from O'Hagan (2006), an important feature of this context that is worth stressing is that the outputs of an IAM, which are a computer prediction of the real phenomena (simulated by the model), will inevitably be imperfect. Statistical analysis should be able to incorporate the model inadequacy, since the model is often a biased representation of the real process. It is of great interest to have one single fast statistical emulator able to capture all the variability induced by the simulators (Kennedy and O'Hagan, 2001; Santner et al., 2003). In particular, there are certain aspects of the CO2 simulation scenarios that can never be known with certainty, since we cannot run the model for an "infinite" length of time or with an "infinite" number of possible starting values. Part of the uncertainty is also due to the inability to run the model for every possible choice of the input parameters. Such aspects are the quantities driving uncertainty about the runs we cannot perform. Quantifying uncertainty in this context by performing many runs, each with different inputs spread over the entire input space, in order to learn the input-output map, could be too expensive. Conversely, a sparse input space could bring insufficient information. The imperfection is principally due to the simplifications present in the model and, because of this, there will be uncertainty about how close the true real quantities will be to the outputs. This uncertainty arises also from many other sources, in particular from the correctness of the values given as inputs and the correctness of the model.

For the aforementioned issues concerning huge computer simulations, this thesis proposes an emulator, treating the IAM models as black boxes in order to model the uncertainty in a non-intrusive way. An effective emulator is one that provides good approximations to the computer code output for wide ranges of input values, and accurate quantification of the emulation uncertainty (see Francom et al., 2018). In fact, one of the biggest advantages of this approach is the possibility, after having emulated the process, of evaluating outputs at inputs different from the ones used to fit the model (Busby, 2009). Furthermore, it allows one to have error intervals at "unseen" inputs, thereby meeting the basic requirements desired for uncertainty quantification. In simple words, an emulator is a stochastic representation of a computer model that generates a prediction for the output of the computer model at any setting of the model parameters and reports a measure of uncertainty for that prediction (Williamson and Blaker, 2014).

Mathematically speaking, a computer model is a function of a possibly large number of parameters. In fact, from a mathematical perspective IAMs are simulators that can be considered as functions. More precisely, given an input x ∈ X ⊆ ℝ^p, a simulator is a function f : X → ℝ^q such that y = f(x), with the output represented by y ∈ ℝ^q (Conti and O'Hagan, 2010). Because of the high complexity of the computer model, f is taken as a black box; hence proper statistical modelling assumptions will be needed in order to estimate it. A reasonable assumption in most cases is smoothness in the variation of the outputs with input changes. By giving a prior on the structure of f, its estimate can be obtained through Bayesian inference.
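As a deliberately simple illustration of the emulation idea (not the functional model developed in this thesis), the R sketch below treats a cheap analytic function as a stand-in black-box simulator, fits a smooth statistical surrogate to a handful of "runs", and predicts, with uncertainty, at inputs the simulator was never run on.

```r
library(splines)

# Stand-in "black box" simulator (purely illustrative, not an IAM)
simulator <- function(x) sin(2 * pi * x) + 0.5 * x^2

# The few runs we can afford: a small design over the input space
design <- data.frame(x = seq(0, 1, length.out = 10))
design$y <- simulator(design$x)

# Smooth surrogate: cubic B-spline regression fitted to the design runs
surrogate <- lm(y ~ bs(x, df = 5), data = design)

# Prediction, with uncertainty bands, at inputs never given to the simulator
grid <- data.frame(x = seq(0, 1, length.out = 100))
pred <- predict(surrogate, newdata = grid, interval = "confidence")
head(round(pred, 3))  # columns: fit, lwr, upr
```

A Bayesian emulator plays the same role, but the uncertainty at unseen inputs comes from the posterior and posterior predictive distributions rather than from classical confidence intervals.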

Bayesian Functional Regression

To emulate these deterministic simulators, the framework in which we decide to pose the problem is Function-on-Scalar Regression (FOSR), a very adaptable method from Functional Data Analysis (FDA) (for more details see the textbook by Ramsay et al., 2005), used here to obtain the desired in-sample estimates. This method is the extension of classic linear regression to functional responses. We chose to rely on a functional method because our single observations consist in CO2 time series of length 9, and one of our principal aims was to estimate the functional form of such time series. The estimation of the functional form of the time series gives us a tool able to evaluate the CO2 profile continuously, at any time instant inside the observation window.

We relied on the functional representation of the data because we are dealing with a phenomenon that is continuous in time, that we can reasonably suppose to be smooth, and whose temporal downscaling is of great interest. Applications of FOSR are wide and interesting: examples include blood pressure profiles during pregnancy (Montagna et al., 2012), analysis of the time-varying impact of macroeconomic variables on the U.S. yield curve (Kowal, 2018) and longitudinal genome-wide association studies (Barber et al., 2017; Fan and Reimherr, 2017). FOSR has many features in common with multiple regression - estimation and inference for the regression coefficients, and prediction of new responses - with additional modeling challenges. Within-function dependence of functional data requires careful modeling of the covariance structure, which may be complex or require further assumptions (i.e. a parametric structure), with implications for computational complexity and scalability. Moreover, the regression coefficients in FOSR are functions themselves, which complicates estimation, inference and interpretation.

We have decided to assume a Bayesian approach, adopting a Bayesian multilevel hierarchical model for the estimation of the coefficients of the basis expansions (briefly explained in Chapter 2), a method from FDA that allows a function to be represented through a set of basis functions. The model we assume is the one proposed by Goldsmith and Kitago (2016), where the authors were interested in giving a functional representation to the arm trajectories of patients affected by stroke. The Bayesian approach to statistical inference has several benefits, such as the ability to incorporate multiple sources of information and uncertainty, and a greater flexibility to build complex model structures. Concerning hierarchical modeling, it is a natural framework in which many kinds of grouped data can be described, such as longitudinal data on many subjects, as in our application. In the Bayesian context the usual prior distribution for all the group-specific parameters is such that it allows sharing information across groups, which helps the estimation of parameters corresponding to small groups. This type of Bayesian model is particularly appropriate in our application, since it will allow us to discover how the scenarios affect gas emissions during this century and, more deeply, what the contribution of the various IAMs and SSPs is. Moreover, the entire development and application of the method is fully Bayesian, meaning that a prior is put on the models' parameters, with the subsequent posterior inference describing the updated uncertainty about the parameters "after having seen the data".

Thus, this work provides a unified framework combining Functional Regression and Bayesian inference to emulate a computer simulation, mixing advantages coming from Functional Data Analysis (Ramsay et al., 2005) and Bayesian Analysis (for a textbook on Bayesian statistics see Jackman, 2009), as deeply highlighted by the extensive review on Functional Regression in Morris (2015). One of the main advantages of this approach consists in providing joint inference on parameters and predicted variables. This is quite a novelty in the climatology framework, since past works in the literature mainly focused on frequentist Functional Data Analysis.

The remainder of this manuscript is organized as follows. In Chapter 2 we give a brief summary of the methods we are going to use for our problem. Chapter 3 contains an exploratory analysis to allow the reader to understand what kind of data we are working with. Chapter 4 is the main part of the work, containing the development of the methods used to emulate the IAM simulators. Finally, in Chapter 5 we discuss the results of the work and possible future developments.


Chapter 2

Review of Statistical models

This chapter aims at making the more technical aspects of this text easier to follow for readers not accustomed to statistics. In particular, we give a brief introduction to the key fields of statistics that contribute to the model we will fit to our data: Functional Data Analysis in Section 2.1 and Bayesian Inference in Section 2.2. In Section 2.3 we review and summarize the benefits of combining these two advanced statistical approaches.

2.1 Functional Data Analysis

The main feature of Functional Data Analysis (FDA), which differentiates it from classical statistics, is that the observations are random functions. Such functions consist in realizations of univariate outcomes, coming from an underlying function, over a continuous domain (e.g. time or space). See Ramsay et al. (2005) for more details on the functional representation of data.

Nowadays, high frequency data are being collected thanks to the technological improvements of the last decades. Such data find a natural framework in FDA, since they can usually be represented as curves or surfaces for each subject. For example, these kinds of data arise in time series analysis (Ullah and Finch, 2013), geo-referenced data (Delicado et al., 2010), genetic studies (Wang et al., 2007) and many others. Actually the measurements are not truly continuous, since that would mean storing an infinite number of values; rather, they are assumed to be pointwise values of an underlying function. In fact, they are observed on a discrete grid, whose density varies depending on the application, while still reflecting a smooth variation. Moreover, the grid on which data are collected could differ a lot between subjects (uneven grid data), increasing the complexity of the study.

FDA has several goals. The main ones are:

• studying connections and dependencies among the subjects and within them;
• extending all the methods of classic statistics to the functional framework.

FDA can then be characterized as exploratory, confirmatory or predictive. The exploratory part tends to find hidden features in the data and to make known aspects clearer. The confirmatory analysis is, on the contrary, aimed at giving statistical evidence, through the data, to some hypotheses or assumptions. Finally, the predictive studies focus on unobserved states, always using the data.

2.1.1 Representing Data by Basis Functions

Assuming that the data have observational error, discrete data are to be converted to functions by smoothing them. Since the functional form of the data is, in most cases, unknown, it is common practice to represent the curves through a fixed or flexible basis expansion. By doing so, functional data are projected onto a finite dimensional space, and ultimately represented by the expansion coefficients. Among all the possible choices for the basis, the ones that are particularly suitable for most applications, because of their ability to manage complex shapes, are the Fourier basis for periodic data and the B-spline basis for non-periodic data, represented in Fig. 2.1.

Figure 2.1: The nine cubic B-spline basis functions that are used in this thesis.

B-splines are constructed following the algorithm proposed by De Boor (1972), designed above all to ease the computational burden. Specifically, consider a spline defined on an interval of the domain, divided into L sub-intervals by L − 1 breakpoints τ_l, l = 1, ..., L − 1. Counting also the end points of the interval, the total number of breakpoints is L + 1. On each sub-interval the spline is a polynomial of order m (i.e. it requires m parameters to be defined). The most important property of B-splines is that each basis function is positive over no more than m sub-intervals.

Representing functional data through linear combinations of basis functions is one of the most used methods in functional data analysis. It has several benefits, such as reduced computational demands, a flexibility that can account for any kind of variation in the data, and the possibility to carry out all the required computations with well-known matrix algebra. For instance, using a B-spline basis, a function x(t) can be represented by a linear combination of the expansion scores and the basis functions:

$$x(t) = \sum_{k=1}^{K} c_k\,\phi_k(t) \;\;\Rightarrow\;\; x(t) = \mathbf{c}^{T}\boldsymbol{\phi}(t) \qquad (2.1)$$

for K known basis functions φ_k(t) and unknown coefficients c_k. This notation highlights once more the reduction from a potentially infinite-dimensional space to the finite-dimensional space of the expansion scores. Fig. 2.1 displays the 9 basis functions of the B-spline expansion we are going to adopt in our analysis.

We adopt representation (2.1) so as to assume smoothness over time for observations from the same subject. Simply put, we want consecutive data from the same function to be linked by a particular covariance structure. This is done by evaluating the functional model on the observed grid, obtaining a multivariate model with one substantial difference: the smoothness assumption.
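A minimal R sketch of representation (2.1) with nine cubic B-splines, as in Fig. 2.1; the knot placement here is the default of splines::bs and the scores are illustrative, both assumptions of this example.

```r
library(splines)
set.seed(1)

# Nine cubic B-spline basis functions evaluated on the decadal grid 2010-2090
t_obs <- seq(2010, 2090, by = 10)
Phi <- bs(t_obs, df = 9, degree = 3, intercept = TRUE)  # 9 x 9 collocation matrix

# A curve x(t) = sum_k c_k phi_k(t) is identified by its K = 9 expansion scores c;
# on the observed grid this is just the matrix product Phi %*% c.
c_scores <- c(35, 38, 44, 47, 45, 40, 33, 25, 20)       # illustrative scores
x_grid <- as.vector(Phi %*% c_scores)

# Conversely, the scores can be recovered from (possibly noisy) observations by
# solving the linear system (a least-squares fit when there are more times than basis functions).
x_noisy <- x_grid + rnorm(length(t_obs), sd = 0.5)
c_hat <- qr.solve(Phi, x_noisy)
round(cbind(true = c_scores, recovered = c_hat), 2)
```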

2.1.2 Function on Scalar Regression

All the linear models of classic statistics can be extended to the functional framework very easily. We resort to a model within the framework of Function-on-Scalar Regression (FOSR), which aims to fit a functional response given scalar inputs used as covariates. Given a set of n random functions over T = [a, b] and p variables, for i = 1, ..., n we assume that:

$$y_i(t) = \beta_0(t) + \sum_{k=1}^{p} w_{i,k}\,\beta_k(t) + \epsilon_i(t), \qquad t \in [a, b]$$

where w_{i,k} are the known scalar covariates, β_k(t) are the unknown fixed functional regression parameters, and ε_i(t) are iid zero-mean random functions. The goal of functional response regression is the estimation of {β_k(t), k = 0, 1, ..., p} and the testing of their significance across t ∈ [a, b]. The way we deal with inference on the continuous regression parameters, whose form is unknown, is to take advantage of the basis expansion and make inference on its scores.

Within the FOSR framework there are some added challenges compared to classic regression. Such challenges are, above all, replication and regularization, which are key features of functional response regression modeling. Replication is involved in the regression, and also in accounting for potential between-function correlation. Moreover, it involves combining information across functions to draw upon their commonalities and make inferences about the populations from which they were sampled. Regularization involves the borrowing of strength within a function, exploiting the expected underlying structural relationships within it to gain efficiency and interpretability. Furthermore, it potentially increases estimation accuracy and precision, and allows interpolation to values between observed grid points. As will become clear in the main text, these two challenges are addressed by specifying certain model characteristics.

In Chapter 4 we will introduce the B-spline basis in the formulation of the FOSR model.
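The following R sketch illustrates the FOSR structure above on simulated data, expanding the coefficient functions in the B-spline basis and estimating the scores by ordinary least squares; this is a deliberately simplified, non-Bayesian stand-in for the hierarchical model actually developed in Chapter 4, and all names and values are illustrative.

```r
library(splines)
set.seed(2)

# Simulated function-on-scalar data: n curves on the decadal grid, one scalar covariate
n <- 50
t_obs <- seq(2010, 2090, by = 10)
T_len <- length(t_obs)
Phi <- bs(t_obs, df = 9, degree = 3, intercept = TRUE)       # T x K basis matrix

beta0 <- function(t) 40 - 0.10 * (t - 2010)                  # "true" coefficient functions
beta1 <- function(t) 0.05 * (t - 2010)
w <- rnorm(n)                                                # scalar covariate
Y <- t(sapply(w, function(wi)
  beta0(t_obs) + wi * beta1(t_obs) + rnorm(T_len, sd = 1)))  # n x T response curves

# Expanding beta_k(t) = Phi %*% b_k makes the model linear in the scores b_k:
# stacking time points within curves, the score design is the Kronecker product W (x) Phi.
W <- cbind(1, w)                                             # n x 2 scalar design
X <- kronecker(W, Phi)                                       # (n*T) x (2*K) design matrix
fit <- lm(as.vector(t(Y)) ~ X - 1)                           # least-squares scores

B_hat <- matrix(coef(fit), ncol = 2)                         # K x 2 score matrix (beta_0, beta_1)
beta_hat <- Phi %*% B_hat                                    # estimated coefficient functions on the grid
round(cbind(time = t_obs, beta0_hat = beta_hat[, 1], beta1_hat = beta_hat[, 2]), 2)
```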

2.2 Bayesian Inference

Bayesian thinking reflects the way we all think. Starting from prior knowledge about a quantity we are interested in, we gradually update our beliefs about it by collecting information coming from events. Everyone expresses their degree of uncertainty about beliefs or facts with ratios or percentages. What is more, the prior knowledge one might have can be wrong, or can greatly differ from person to person, but the occurrence of events can hugely reverse one's convictions. Thus, the observation of reality confirms or changes our ideas of it in light of what really happens.

The Bayesian paradigm consists in the following rules:

• unknown quantities are modeled by probability distributions;
• posterior distributions encode the learning process;
• inference is fully based on the posterior distribution.

From a statistical perspective, for example, one is interested in making inference on an unknown parameter θ of a certain distribution that summarizes a phenomenon of interest. Under a frequentist approach θ is considered unknown but fixed: the estimate of θ is the point of maximum of the likelihood function (i.e. a function of the data), hence the only source of variation comes from the data. The Bayesian approach changes the perspective on the parameter under investigation. The parameter θ is no longer considered fixed but random, meaning that its estimate will no longer be a point estimate but rather a probability distribution. In the Bayesian way of doing statistics, distributions have an additional interpretation: instead of just representing the values of a parameter (or data) and how likely each one is to be the true value, a Bayesian thinks of a distribution as describing our beliefs about a parameter (or data).

More specifically, assume we want to make inference on a parameter θ ∈ Θ, and that we have the following elements:

• Y: the random variable representing the observed data;
• π(θ|α): a prior distribution representing the information one has before any evidence from the data, with α a hyper-parameter of the law;
• p(Y|θ) = L(θ|Y): the likelihood function, which reflects our belief about Y when the true parameter is θ.

Now we have all the ingredients needed to compute the posterior distribution of the parameter, so as to update the prior beliefs. This is done by making use of Bayes' theorem:

$$\pi(\theta \mid Y) = \frac{p(Y \mid \theta)\,\pi(\theta)}{\int_{\Theta} p(Y \mid t)\,\pi(t)\,dt} \;\propto\; p(Y \mid \theta)\,\pi(\theta) \qquad (2.2)$$

In (2.2), π(θ|Y) is the probability distribution of the parameter after having observed the data Y. Since the denominator of (2.2) does not depend on the parameter, which is integrated out, the Bayesian mantra is "the posterior is proportional to the likelihood times the prior". Fig. 2.2 shows graphically what happens in (2.2).

As is clear, the prior captures our beliefs before seeing any data. The likelihood summarizes what the observed data are telling us, by representing the range of parameter values supported by the data. Combining these distributions we get the posterior distribution, which tells us which parameter values maximize the chance of observing the particular data that we observed, taking into account our prior beliefs.

Figure 2.2: Update of the prior distribution through the likelihood.

When the prior is weak, meaning that there is not much prior information, the posterior is driven by the likelihood (i.e. the data), and in some cases this leads to an equivalence between posterior and likelihood (e.g. when the prior is uniform). On the contrary, when the likelihood is weak (i.e. the data do not bring enough information), the posterior is dominated by the prior.
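A tiny numerical illustration of (2.2), with a toy prior and likelihood chosen here purely for illustration and unrelated to the thesis data: discretizing the parameter on a grid, the posterior is obtained by multiplying likelihood and prior pointwise and normalizing.

```r
# Grid approximation of Bayes' theorem: posterior ∝ likelihood × prior.
# Toy setting: Beta(2, 2) prior on a probability theta, Binomial data with 7 successes out of 10.
theta <- seq(0.001, 0.999, length.out = 1000)
prior <- dbeta(theta, 2, 2)
lik   <- dbinom(7, size = 10, prob = theta)

unnorm    <- lik * prior                       # numerator of (2.2)
step      <- diff(theta)[1]
posterior <- unnorm / sum(unnorm * step)       # normalize over the grid

# Check against the known conjugate answer, Beta(2 + 7, 2 + 3)
max(abs(posterior - dbeta(theta, 9, 5)))       # small discretization error
```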

2.2.1 Computation of Posterior Inference

All inference in the Bayesian context is obtained from the posterior distribution π(θ|Y). However, its computation is not always an easy task. When one deals with a prior such that the posterior belongs to the same distribution family, the prior is called conjugate. This is the best case for obtaining the posterior, since it can be found analytically in closed form. In many other cases the machinery of Monte Carlo methods and Markov chains is greatly exploited.

Conjugate Prior

As an example, suppose we are interested in estimating the mean µ of a Gaussian distribution representing the likelihood. Specifically, assume that we have observed y_1, ..., y_n such that:

$$y_i \mid \mu, \sigma^2 \;\overset{\text{iid}}{\sim}\; \mathcal{N}(\mu, \sigma^2) \qquad (2.3)$$

where the variance σ² is known. The conjugate prior for the mean, i.e. such that the posterior belongs to the same family, has to be normal:

$$\mu \sim \mathcal{N}(\mu_0, \sigma_0^2) \qquad (2.4)$$

In this case, it is straightforward to obtain that the posterior of µ is:

$$\pi(\mu \mid y) \propto \mathcal{N}(y;\, \mu, \sigma^2)\,\mathcal{N}(\mu;\, \mu_0, \sigma_0^2), \qquad \mu \mid y \sim \mathcal{N}(\mu_N, \sigma_N^2)$$

where

$$\mu_N = \sigma_N^2 \left( \frac{\mu_0}{\sigma_0^2} + \frac{\sum_i y_i}{\sigma^2} \right), \qquad \sigma_N^2 = \left( \frac{1}{\sigma_0^2} + \frac{n}{\sigma^2} \right)^{-1}$$

Having access to a closed-form distribution has a lot of advantages, but this is not the case for models more complex than (2.3)-(2.4). In general, a closed-form expression of the posterior does not exist. For this reason Monte Carlo methods are largely used in Bayesian statistics.
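A short R check of the closed-form update (2.3)-(2.4), with illustrative values for the data and the hyper-parameters:

```r
# Normal-normal conjugate update: posterior mean and variance in closed form
set.seed(3)
sigma2   <- 4                       # known data variance
mu0      <- 0; sigma2_0 <- 100      # weak prior on mu
y        <- rnorm(30, mean = 2.5, sd = sqrt(sigma2))
n        <- length(y)

sigma2_N <- 1 / (1 / sigma2_0 + n / sigma2)
mu_N     <- sigma2_N * (mu0 / sigma2_0 + sum(y) / sigma2)
c(posterior_mean = mu_N, posterior_sd = sqrt(sigma2_N))   # concentrates near the true mean 2.5
```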

Monte Carlo Methods

Monte Carlo methods are a class of algorithms that take advantage of random sampling to obtain numerical results. For example, assume we can sample from a distribution and that we are interested in estimating its mean. This is possible by taking as estimate the empirical mean of the sampled values, provided that the number of sampled values is large and the samples are independent. The soundness of this procedure is ensured by the Law of Large Numbers (LLN). This law states, in its weak form, that given a set of i.i.d. samples x_1, ..., x_N from a random variable with expected value µ, the empirical mean $\bar{X}_N = \frac{1}{N}\sum_{i=1}^{N} x_i$ converges to the true mean:

$$\bar{X}_N \xrightarrow{\;P\;} \mu \quad \text{for } N \to \infty$$

that is, for any positive number ε:

$$\lim_{N \to \infty} \mathbb{P}\left( |\bar{X}_N - \mu| > \varepsilon \right) = 0$$

As an example, assume we are interested in the mean of the posterior distribution π(µ|y). Formally it is computed as the following integral:

$$\mathbb{E}[\mu \mid y] = \int_{\Theta} \mu\,\pi(\mu \mid y)\,d\mu \qquad (2.5)$$

One of the main reasons behind the usage of Monte Carlo methods is that the integral in (2.5) could be very expensive to determine. Monte Carlo algorithms provide an easy way to estimate integrals like (2.5). Quantities like these can be computed making use of the following algorithm, given that we can often draw iid samples from the posterior:

Algorithm 1: Monte Carlo algorithm
1. for n = 1 to N do
2.     draw an i.i.d. sample $\theta^{(n)}$ from $\pi(\theta \mid y)$;
3. end
4. estimate $\mathbb{E}[\theta \mid y]$ with $\frac{1}{N} \sum_{n=1}^{N} \theta^{(n)}$.
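A minimal sketch of Algorithm 1, assuming for illustration that the posterior is a normal distribution we can sample from directly; the numbers are placeholders, not results from the thesis.

```python
import numpy as np

# Monte Carlo estimate of a posterior expectation (Algorithm 1).
rng = np.random.default_rng(seed=0)
N = 100_000
theta = rng.normal(loc=1.0, scale=np.sqrt(0.75), size=N)   # iid draws standing in for pi(theta | y)

# By the Law of Large Numbers the empirical mean approximates E[theta | y].
print("Monte Carlo estimate of E[theta | y]:", theta.mean())
```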

Markov Chain Monte Carlo

In modern Bayesian analysis, Markov Chain Monte Carlo (MCMC) has made it possible to fit large hierarchical models that include a huge number of unknown parameters. Essentially, a Markov chain is a sequence of random variables in which each element depends probabilistically on the previous one. The Markov property is fundamental because everything needed to predict the next state is contained in the current state, and no additional information comes from knowing the history of the chain.

MCMC is a powerful tool in Bayesian inference since it combines the advantages of Markov chain theory and Monte Carlo methods. The main idea is to construct a Markov chain on the parameter space Θ whose invariant (equilibrium) distribution is the posterior density π(θ|y): by recording the states visited by the chain after it has reached equilibrium, one obtains a sample from the desired distribution. The history of the chain is then treated as a sample of the parameter from its posterior distribution, and any subsequent inference is carried out by applying Monte Carlo methods to this sample.


If the Markov chain is ergodic (for more details see Jackman, 2009), then there are some key consequences that ensure the convergence of the method. The following theorem formalizes this statement:

Theorem (Ergodic Theorem). Let $\{\theta^{(n)}\}$ be an ergodic Markov chain on the parameter space $\Theta$ with invariant distribution $\pi$, and let $h : \Theta \longrightarrow \mathbb{R}$ be a measurable function such that $\int_{\Theta} |h(\theta)| \, \pi(\theta) \, d\theta < \infty$. Then:
\[
\lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} h(\theta^{(n)}) = \int_{\Theta} h(\theta) \, \pi(\theta) \, d\theta.
\]

In the MCMC framework the previous theorem is fundamental once we replace $\pi(\theta)$ with the posterior distribution $\pi(\theta \mid y)$: this result allows the estimation of the posterior expectation of any scalar function $h(\theta)$, provided that the chain is ergodic. Its consequences are the following:

• the Markov chain has a unique limiting (posterior) distribution $\pi = \pi(\theta \mid y)$;

• the starting point of the chain is not important if we let the chain run for enough iterations, since it will visit any subset $X \subseteq \Theta$ with frequency proportional to $\int_X \pi(\theta \mid y) \, d\theta$;

• the estimate of $\mathbb{E}[h(\theta) \mid y] = \int_{\Theta} h(\theta) \, \pi(\theta \mid y) \, d\theta$ is computed as $\frac{1}{N} \sum_{n=1}^{N} h(\theta^{(n)})$.

One of the most famous MCMC methods is the Metropolis-Hastings algorithm

(Hastings, 1970), used to obtain a sequence of random samples from a probability distribution from which direct sampling is difficult. This algorithm (shown in Algorithm 2) generates a sequence of sample values in such a way that, as more and more values are produced, their distribution approximates the target distribution ever more closely. Specifically, at each iteration the algorithm picks a candidate for the next sample based on the current sample value; then, with some probability, the candidate is either accepted (in which case it is used as the next state) or rejected (in which case it is discarded and the current value is reused as the next state). At each step the algorithm evaluates an acceptance ratio $r$, and the value $\alpha = \min(r, 1)$ is the probability of accepting the new sample (i.e. of moving the chain): if $r \geq 1$ the new sample is always accepted, while if $r < 1$ it is accepted with probability $r$. The complete procedure is summarized in Algorithm 2.

Another algorithm, more suitable when the posterior density is high-dimensional, is Gibbs sampling (Geman and Geman, 1984).


Algorithm 2: Metropolis-Hastings algorithm
1. sample the candidate $\theta^*$ from a proposal distribution $q(\,\cdot \mid \theta^{(n-1)})$;
2. $r \leftarrow \dfrac{q(\theta^{(n-1)} \mid \theta^*)\, p(\theta^* \mid y)}{q(\theta^* \mid \theta^{(n-1)})\, p(\theta^{(n-1)} \mid y)}$;
3. $\alpha \leftarrow \min(r, 1)$;
4. sample $U \sim \mathrm{Unif}(0, 1)$;
5. if $U < \alpha$ then
6.     $\theta^{(n)} \leftarrow \theta^*$;
7. else
8.     $\theta^{(n)} \leftarrow \theta^{(n-1)}$;
9. end
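To make Algorithm 2 concrete, the following sketch (not taken from the thesis) implements a Metropolis-Hastings sampler with a symmetric Gaussian random-walk proposal; the standard normal target, the step size and the chain length are illustrative assumptions standing in for an unnormalized posterior $p(\theta \mid y)$.

```python
import numpy as np

# Minimal Metropolis-Hastings sketch (Algorithm 2) with a symmetric random-walk proposal.
rng = np.random.default_rng(seed=1)

def log_target(theta):
    return -0.5 * theta**2          # log of an unnormalized N(0, 1) density (illustrative target)

N, step = 10_000, 1.0
chain = np.empty(N)
chain[0] = 0.0
for n in range(1, N):
    proposal = chain[n - 1] + step * rng.normal()              # candidate theta*
    log_r = log_target(proposal) - log_target(chain[n - 1])    # symmetric proposal: q terms cancel
    if np.log(rng.uniform()) < min(log_r, 0.0):                # accept with probability min(r, 1)
        chain[n] = proposal
    else:
        chain[n] = chain[n - 1]

print("posterior mean estimate:", chain.mean())
```

With a symmetric proposal the two $q$ terms in the acceptance ratio cancel, which is why only the target densities appear in the acceptance step.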

The Gibbs sampler (shown in Algorithm 3) draws each variable (or group of variables) from its distribution conditional on the current values of the other variables. It can be shown that the resulting sequence of samples constitutes a Markov chain. The key point of this algorithm is that, for multivariate parameters, it is much easier to sample from the full conditional distribution of one component at a time ($g_n$ in Algorithm 3) than to marginalize by integrating over the joint distribution. Assume we want to obtain $N$ samples of $\theta = [\theta_1, \ldots, \theta_d]$, partitioned into $d$ sub-vectors (possibly scalars). The procedure followed by the Gibbs sampler is the following:

Algorithm 3: Gibbs Sampling algorithm
1. for n = 1 to N do
2.     sample $\theta_1^{(n+1)}$ from $g_1\big(\theta_1^{(n+1)} \mid \theta_2^{(n)}, \theta_3^{(n)}, \ldots, \theta_d^{(n)}, y\big)$;
3.     sample $\theta_2^{(n+1)}$ from $g_2\big(\theta_2^{(n+1)} \mid \theta_1^{(n+1)}, \theta_3^{(n)}, \ldots, \theta_d^{(n)}, y\big)$;
4.     ...
5.     sample $\theta_d^{(n+1)}$ from $g_d\big(\theta_d^{(n+1)} \mid \theta_1^{(n+1)}, \theta_2^{(n+1)}, \ldots, \theta_{d-1}^{(n+1)}, y\big)$;
6.     $\theta^{(n+1)} \leftarrow \big(\theta_1^{(n+1)}, \ldots, \theta_d^{(n+1)}\big)$;
7. end
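As a minimal sketch (again not from the thesis), the following code runs a Gibbs sampler on a bivariate normal target with correlation $\rho = 0.8$, chosen only because its full conditionals $g_1$ and $g_2$ are available in closed form.

```python
import numpy as np

# Minimal Gibbs sampler sketch (Algorithm 3) for an illustrative bivariate normal target.
rng = np.random.default_rng(seed=2)
rho, N = 0.8, 10_000
theta = np.zeros((N, 2))
for n in range(1, N):
    # Full conditional of theta_1 given the current theta_2: N(rho * theta_2, 1 - rho^2)
    theta[n, 0] = rng.normal(rho * theta[n - 1, 1], np.sqrt(1 - rho**2))
    # Full conditional of theta_2 given the freshly updated theta_1
    theta[n, 1] = rng.normal(rho * theta[n, 0], np.sqrt(1 - rho**2))

print("sample correlation:", np.corrcoef(theta[:, 0], theta[:, 1])[0, 1])
```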

These algorithms are the foundations of advanced Bayesian analysis, but they require some kind of diagnostic, depending on the application, to make sure the MCMC chain has reached its invariant distribution. To this purpose, convergence can be checked graphically through the traceplots, the autocorrelation function and the cumulative means of the parameters of interest.
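A minimal sketch of these graphical checks, assuming `chain` is a one-dimensional NumPy array of posterior draws (for instance the output of the Metropolis-Hastings sketch above); the figure layout and the maximum lag are arbitrary choices.

```python
import numpy as np
import matplotlib.pyplot as plt

def diagnostics(chain, max_lag=50):
    """Traceplot, autocorrelation and cumulative mean of an MCMC chain."""
    fig, axes = plt.subplots(1, 3, figsize=(12, 3))

    axes[0].plot(chain, lw=0.5)                       # traceplot of the draws
    axes[0].set_title("Traceplot")

    centred = chain - chain.mean()                    # empirical autocorrelation function
    acf = np.correlate(centred, centred, mode="full")[centred.size - 1:]
    acf /= acf[0]
    axes[1].bar(range(max_lag + 1), acf[:max_lag + 1], width=0.6)
    axes[1].set_title("Autocorrelation")

    cum_mean = np.cumsum(chain) / np.arange(1, chain.size + 1)
    axes[2].plot(cum_mean)                            # running estimate of the posterior mean
    axes[2].set_title("Cumulative mean")

    plt.tight_layout()
    plt.show()

# Example usage on a synthetic, illustrative chain:
# diagnostics(np.random.default_rng(0).normal(size=5000))
```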


2.3 Bayesian Functional Data Analysis

In the last decades, with the gradual improvement of computational power, Bayesian analysis of complex problems has become accessible to many, not only to large research centers. Functional Data Analysis has particularly benefited from this technological development: with more accessible computing power, complex models with many unknown parameters can take advantage of the joint inference that the Bayesian framework provides, something that was not feasible before.

For example, Goldsmith and Kitago (2016) proposed a Bayesian method to estimate the coefficients of a B-spline basis expansion of the regression coefficients in function-on-scalar regression. Kowal (2018) extended their method by allowing the basis type, together with its dimension, to be unknown and estimated within the model. Kaufman et al. (2010) developed functional ANOVA from a Bayesian viewpoint, reaching results analogous to those in Ramsay et al. (2005).

Thompson and Rosen (2008) formulated a model for functional data collected on an irregular grid. The papers introduced in this section are those proposing models most useful for our application. However, many other works have recast functional methods, historically belonging to the frequentist literature, in a Bayesian framework.

The main characteristic of this work is the combination of most of the ideas and methods contained in the cited articles in order to develop the desired emulator.


Chapter 3

Exploratory Data Analysis of CO2 Emissions

This chapter focuses on an exploratory analysis of the data that we will later analyze with the Bayesian model described in Chapter 4. The aim is to look at the data from many angles, describing and summarizing them without making any assumptions about their content. We first present the dataset along with its main features, and then carry out an exploratory analysis to properly visualize its principal characteristics.

3.1 Dataset

Data were taken from the analysis of Marangoni et al. (2017); further information and details on the SSPs are available on the International Institute for Applied Systems Analysis website. An explanatory frame of the dataset is shown in Fig. 3.1: the IAM is specified in the first column, followed by the factors describing the SSPs and, lastly, the associated CO2 profiles at ten-year frequency. The emission projections are expressed in gigatons of carbon dioxide (GtCO2).

Figure 3.1: Explanatory frame of the dataset.

As previously mentioned, Shared Socio-economic Pathways (SSPs) have been introduced to describe alternative social, economic and technological narratives, spanning a wide range of plausible futures in terms of challenges to mitigation and adaptation. Following Marangoni et al. (2017), this thesis considers


three of these SSPs, which represent low, intermediate and high challenges to both mitigation and adaptation. For the sake of the analysis, five main global variables are considered:

• population (POP);

• gross domestic product per capita (GDPPC);
• energy intensity improvements (END);
• fossil fuel availability (FF);

• low-carbon energy technology development (LC);

Each of them takes a value indicating which SSP scenario it describes, namely SSP1 (Green Road), SSP2 (Middle of the Road) or SSP3 (Rocky Road); each variable, in other words, selects one of three possible future curves, as shown in Fig. 3.2. For each variable the three SSPs describe very different future scenarios, especially for gross domestic product per capita and population: the difference in population between SSP1 and SSP3 is about 5 billion people, and a comparably large gap is present between these scenarios for GDPPC as well.

We consider five Integrated Assessment Models (IAMs). IAMs are deterministic models with different structural characteristics that are used to obtain the predictions. These models, taking as input different combinations of the SSP variables, supply one predicted value per decade for the emissions caused by fossil fuel combustion.

The analysis was performed by applying an algorithm that evaluates the sensitivity of each IAM to a given SSP variable, decomposing it into the individual effect and the effect due to interactions among the variables. Each of the five deterministic models was run on the most informative combinations of SSP values. These combinations are 23, and their design was chosen so as to obtain the aforementioned sensitivities while ensuring that the reduced number of inputs is a good representation of the whole input space (a space-filling criterion). Each combination specifies from which SSP the value of each variable should be taken. To sum up, this setting consists of 23 CO2 emission profiles for each of the 5 IAMs, each at ten-year frequency, giving a total of 115 observations, that is
\[
\{ y_{ij}(t), \; i = 1, \ldots, 23, \; j = 1, \ldots, 5, \; t \in \{2010, 2020, \ldots, 2090\} \},
\]
where $i$ corresponds to the $i$-th combination of the SSP variables, $j$ to the $j$-th IAM and $t$ to the decades.
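As a purely illustrative sketch of this data layout (the column names and the numeric values below are hypothetical placeholders, not reproduced from the dataset), the following snippet builds a small table with one row per IAM-scenario pair and one column per decade, and reshapes it into the long format $\{y_{ij}(t)\}$ used in the rest of the thesis.

```python
import pandas as pd

# Illustrative reshaping of a table like Fig. 3.1: one row per (IAM, scenario) pair,
# one column per decade. Names and values are placeholders for illustration only.
wide = pd.DataFrame({
    "IAM": ["ModelA", "ModelA", "ModelB"],
    "scenario": [1, 2, 1],
    "2010": [0.0, 0.0, 0.0],
    "2020": [0.0, 0.0, 0.0],
    "2090": [0.0, 0.0, 0.0],
})

# Long format: one row per observation y_ij(t).
long = wide.melt(id_vars=["IAM", "scenario"], var_name="year", value_name="CO2")
long["year"] = long["year"].astype(int)
print(long.head())
```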


[Figure 3.2, five panels: POP (population in billions), GDPPC (thousands US$2005), END (energy intensity, EJ/billion US$2005), FF (share of fossil fuels over total primary energy), LC (carbon intensity, MtCO2/EJ), each showing the SSP1, SSP2 and SSP3 trajectories.]

Figure 3.2: Future projections described by the SSP variables.

3.2 Data Visualization

Each panel of Fig. 3.3 corresponds to one of the 5 IAMs and shows the 23 emission profiles corresponding to the different scenarios. The plot also helps in understanding how the different IAMs produce different projections, probably due to different assumptions and implementations. As expected, the range of variation of the curves is very wide, particularly in the far future, with the most sustainable scenarios having low emissions and the least sustainable ones almost exploding. Different scenarios within the same IAM seem to behave independently, suggesting a possibly uncorrelated structure between them; thus an independence assumption between different scenarios of the same IAM could be reasonable, subject to deeper inspection.

Figure 3.3: CO2 emission profiles grouped by IAM; within each panel, the 23 SSP combinations.
Figure 3.4: CO2 emission profiles grouped by SSP combination; within each panel, the 5 IAM projections for that combination.
