Predictive capacity of the high-resolution Weather Research and Forecasting model over Italy: the case of the year 2018

Academic year: 2021



MASTER'S DEGREE COURSE IN ENVIRONMENTAL SCIENCES

"PREDICTIVE CAPACITY OF THE HIGH RESOLUTION WEATHER RESEARCH AND FORECASTING (WRF) MODEL OVER ITALY: THE CASE OF THE YEAR 2018"

Supervisor: Prof. Antonello Provenzale

Co-supervisor: Dott. Ing. Antonio Parodi

Examiner: Prof. Giandomenico Mastroeni

Candidate: LORENZA APICELLA


There you'll find me, with my head among the clouds.

Index

I. INTRODUCTION
II. METHODS AND MATERIALS
2.1. INTRODUCTION TO "NUMERICAL WEATHER PREDICTION"
2.2. WRF MODEL
2.2.1. FEATURES
2.2.2. WRF FORECAST DATA
2.2.2.1. PRECIPITATION
2.2.2.2. TEMPERATURE
2.3. OBSERVED DATA
III. DATA ANALYSIS
3.1. PRECIPITATION
3.2. TEMPERATURE
IV. RESULTS
4.1. PRECIPITATION
4.2. TEMPERATURE
V. CONCLUSIONS
VI. BIBLIOGRAPHY
VII. ACKNOWLEDGEMENTS

I. INTRODUCTION

Models are representations of natural phenomena and expressions, by different means, of the human comprehension of the complex Earth system: almost every field of science uses models to reproduce natural phenomena, with the aim of discovering, along the time and space variables, how a particular process works, going beyond the current level of knowledge.

Meteorological models are mainly, but not exclusively, employed for the production of weather forecasts, attempting to simplify complex processes and reproduce their evolution forward in time: forecasts can support decision processes at different levels, from everyday planning up to prevention and risk reduction for dangerous natural phenomena.

The atmospheric model WRF (Weather Research and Forecasting Model) is an open-source code conceived and developed since the mid-1990s by the National Center for Atmospheric Research (NCAR), the National Oceanic and Atmospheric Administration (NOAA), the U.S. Air Force, the Naval Research Laboratory, the University of Oklahoma and the Federal Aviation Administration. WRF is a mesoscale forecasting system capable of operating at spatial resolutions from hundreds of meters to hundreds of kilometers. CIMA (International Center for Environmental Monitoring) operates the WRF model within the framework of its institutional cooperation with the Regional Environmental Protection Agency of Liguria (ARPAL) and the Italian Civil Protection Department (ICPD). CIMA Foundation applies WRF both in hydro-meteorological research and in operational domains.

The aim of this thesis is to evaluate the predictive capability of the high-resolution WRF (Weather Research and Forecasting) model, operated by CIMA Research Foundation at 1.5 km resolution over the Italian peninsula, during the year 2018.

The assessment of WRF model performance is carried out by comparing selected forecast products (temperature and rainfall) with the corresponding observational data, highlighting both the critical points where the predictive capability is weak and the cases of good performance.


The model's skill was tested on different time and space ranges. A month-by-month check over the whole Italian territory was first performed for both variables, taking into account the average value over that time interval; in addition, a seasonal trend analysis was made, gathering the three months belonging to each season and evaluating the performance over a larger time span. All results are displayed with a set of maps showing, for each month, the observed and forecast data, side by side with a map showing the BIAS at every grid point, expressed as a percentage for rain and as an absolute value for temperature.
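As an illustration, the per-grid-point BIAS described above can be sketched as follows; the function and array names are hypothetical, assuming forecast and observed monthly means already regridded onto a common grid:

```python
import numpy as np

def bias_map(fcst, obs, variable):
    """Per-grid-point BIAS between forecast and observed monthly means.

    fcst, obs : 2-D arrays on the same grid (hypothetical inputs).
    variable  : 'rain' -> percentage BIAS, 'temperature' -> absolute BIAS.
    """
    if variable == "rain":
        # percentage departure of the forecast from the observation;
        # grid points with zero observed rain are masked as NaN
        return 100.0 * (fcst - obs) / np.where(obs == 0, np.nan, obs)
    elif variable == "temperature":
        # absolute departure in the variable's own units (degrees C)
        return fcst - obs
    raise ValueError(f"unknown variable: {variable}")
```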

A diurnal-cycle analysis was applied to temperature data in order to evaluate the model performance in nine different climatic zones of the Italian peninsula, combining the North, Center and South macro-areas with altitude bands (0 m - 200 m, 200 m - 500 m, 500 m - 1000 m and above 1000 m): the mean monthly value was calculated for every single hour of the day, and the displacement between the two matrices containing forecast and observed data was then measured.
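A minimal sketch of the hourly averaging step, assuming hypothetical hourly temperature series (one zone, one month, length a multiple of 24):

```python
import numpy as np

def diurnal_cycle(hourly_values):
    """Mean value for each hour of the day (0-23) over a month.

    hourly_values : 1-D array of hourly data, length a multiple of 24
    (e.g. 30 days * 24 hours), for one climatic zone and one month.
    """
    days = hourly_values.reshape(-1, 24)   # one row per day
    return days.mean(axis=0)               # 24 hourly means

def cycle_displacement(fcst_hourly, obs_hourly):
    """Forecast-minus-observed displacement of the mean diurnal cycle."""
    return diurnal_cycle(fcst_hourly) - diurnal_cycle(obs_hourly)
```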

The accuracy of rain forecasts is of primary importance when meteorological sciences are applied to civil protection purposes: we therefore assessed WRF performance for this variable on time ranges from a whole season down to a single day.

Thus, with the support and guidance of ARPA Piedmont (Dr. Naima Vela, Dr. Massimo Milelli and Dr. Elena Oberto), a Fuzzy analysis (Ebert et al., 2008) of precipitation data was performed. Fuzzy analysis is a neighborhood-based technique which compares observed and forecast data within a relaxed "neighborhood" of closeness, producing a set of statistical indices describing the model's rain forecast performance, computed here with three-hourly data for the 365 days of the year 2018 and the whole domain.
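As one concrete example of a fuzzy index, the Fractions Skill Score (FSS) from Ebert's framework can be sketched as below; this is an illustrative implementation, not necessarily the exact index set used in the thesis:

```python
import numpy as np

def neighborhood_fraction(binary, n):
    """Fraction of threshold exceedances in a (2n+1)x(2n+1) square window."""
    padded = np.pad(binary.astype(float), n, mode="constant")
    out = np.zeros(binary.shape, dtype=float)
    for i in range(-n, n + 1):
        for j in range(-n, n + 1):
            out += padded[n + i : n + i + binary.shape[0],
                          n + j : n + j + binary.shape[1]]
    return out / (2 * n + 1) ** 2

def fss(fcst, obs, threshold, n):
    """Fractions Skill Score: 1 = perfect match, 0 = no skill."""
    pf = neighborhood_fraction(fcst >= threshold, n)
    po = neighborhood_fraction(obs >= threshold, n)
    mse = np.mean((pf - po) ** 2)
    ref = np.mean(pf ** 2) + np.mean(po ** 2)
    return 1.0 - mse / ref if ref > 0 else np.nan
```

A perfect forecast gives FSS = 1 at any neighborhood size; spatially displaced but otherwise correct rain gives intermediate scores that improve as the neighborhood widens, which is the point of the "relaxed" comparison.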

Finally, again concerning precipitation data, some relevant events among those which caused ARPAL to issue weather alerts in Liguria during 2018 were selected and studied with the "Method for Object-Based Diagnostic Evaluation" (MODE) (Davis et al., 2006a and 2006b); this method provides an objective comparison of the overall structure of the forecast precipitation with what was observed, identifying precipitation objects and comparing attributes of matched forecast and observed objects, rather than evaluating hits and misses at a point or neighborhood as is done with the contingency table. The attributes considered include area coverage, centroid displacement, intensity, overlap and percentile intensity (in this work, above the 90th percentile threshold).
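The object-identification step of such a method can be illustrated with a simplified sketch (thresholding plus connected-component labelling); MODE itself also applies convolution and fuzzy-logic matching, which are omitted here:

```python
import numpy as np
from collections import deque

def label_objects(binary):
    """4-connected component labelling of a boolean field (flood fill)."""
    labels = np.zeros(binary.shape, dtype=int)
    current = 0
    for start in zip(*np.nonzero(binary)):
        if labels[start]:
            continue
        current += 1
        labels[start] = current
        queue = deque([start])
        while queue:
            i, j = queue.popleft()
            for ni, nj in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)):
                if (0 <= ni < binary.shape[0] and 0 <= nj < binary.shape[1]
                        and binary[ni, nj] and not labels[ni, nj]):
                    labels[ni, nj] = current
                    queue.append((ni, nj))
    return labels, current

def object_attributes(field, threshold):
    """Area, centroid and 90th-percentile intensity of each rain object."""
    labels, n = label_objects(field >= threshold)
    attrs = []
    for k in range(1, n + 1):
        ii, jj = np.nonzero(labels == k)
        attrs.append({
            "area": int(ii.size),
            "centroid": (float(ii.mean()), float(jj.mean())),
            "p90": float(np.percentile(field[ii, jj], 90)),
        })
    return attrs
```

Centroid displacement then follows as the distance between the centroids of a matched forecast/observed object pair.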

It is worth mentioning that, for all the aforementioned analyses, the WRF forecast products were divided into "RUN 1" (first 24 hours of forecast) and "RUN 2" (second 24 hours of forecast), since the model produces a 48-hour forecast every day; each day and, likewise, each month therefore has two different forecasts, both evaluated with the same procedures.

II. METHODS AND MATERIALS

2.1. INTRODUCTION TO "NUMERICAL WEATHER PREDICTION"

By the end of the 19th century, thanks to new discoveries in thermodynamics and hydrodynamics, a mathematical approach to meteorology was proposed (Willis, 2006), setting the starting point of what we now call "Numerical Weather Prediction" (NWP). The basic idea was to use a set of equations, the so-called "primitive equations", to describe atmospheric conditions and their evolution in time and space (Kalnay, 2002); this idea was carried out with a "numerical model", which contains all the physical rules describing, with a certain degree of approximation, the atmospheric processes. The Earth and the surrounding atmosphere are described as a multi-dimensional system divided into cells, each standing for a portion of real space, whose dimensions define the model's resolution. The basic scheme is represented in Figure 1.1.

Figure 1.1 - Numerical model's basic functioning scheme

Nowadays this idea is still the same, but the whole discipline has since grown enormously, following developments in technology and computer science.

From a mathematical point of view, NWP is a typical problem determined by its initial and, in the case of limited-area modelling, boundary conditions. Uncertainty in these conditions is the primary source of error in weather modelling. Thus, a better comprehension of atmospheric dynamics and more reliable input data from observations (obtained from weather balloons, oceanic buoys, aircraft, satellites and other types of survey) have allowed steps forward, together with the possibility of using more powerful computers as a practical means to carry out the calculations required to predict weather changes.

At every "scale" on which models operate, from global to local, the consistency of a forecast is strictly bound to good-quality input data for the "initial conditions", which consist in the initial state of all the model variables and, together with the "boundary conditions", set the starting point for NWP model calculations over a limited area. Global Circulation Models usually provide the initial and boundary conditions, because of their capability to simulate worldwide atmospheric dynamics at a coarser scale than Limited Area Models (LAMs). This procedure is called "nesting": a finer-resolution local model gets its input data for boundary and initial conditions by embedding its grid into that of a coarser-resolution model, the parent model, whose domain contains the area covered by the LAM.

The concept of spatial and temporal scale in weather science, and in the general description of natural phenomena, is of crucial importance: small-scale phenomena may be deeply influenced by the global circulation and, vice versa, a very localized phenomenon, e.g. a convective cell producing a thunderstorm, can provide an important contribution to the large-scale energy balance, being a significant source of dissipation despite its contained dimensions with respect to bigger circulation patterns. Thus, the capability to describe both kinds of phenomena correctly is a fundamental task; duration in time, spatial range of action and the distribution of kinetic-energy peaks over frequency are some of the criteria by which the following scale division is carried out:

 MICROSCALE: inherent to phenomena acting on a small portion of space, in the order of tens of kilometers, including all the convective and turbulence-linked phenomena, which are also of relevant interest.


 MESOSCALE: covering distances in the range of hundreds of kilometers and times from one to three days, focusing on the passage of atmospheric fronts and the consequent perturbations, hurricanes and storms of the middle latitudes.

 SYNOPTIC SCALE: concerning phenomena lasting about 12-24 hours, with dimensions of thousands of kilometres, covering, e.g., portions of continents and oceans.

 PLANETARY SCALE: concerning the whole globe in space and a month or more in time; this scale is prevalent in climatology and in the forecasting of long-lived features, such as Rossby waves.

The main components of numerical models for weather forecasting are described hereafter.

Dynamical core

The first is the "dynamical core", which contains all the governing equations (primitive equations) describing the behavior of the atmosphere, sided by the "physical parameterizations" and the "numerical procedures" used to solve the equations. The dynamical core's equations follow the Eulerian method: fixed locations on Earth are considered, and the time derivatives of each equation, each describing a specific variable, are evaluated for all the cells covering the spatial domain (x, y). The third coordinate is expressed on vertical levels depending on the atmospheric pressure value (x, y, P), because this method offers computational advantages and simplifies some processes, such as friction and adiabatic heating, while also accounting for the Earth's curvature. The fundamental equations are derived from basic principles: conservation of motion (momentum), conservation of mass, conservation of heat (thermodynamic energy), conservation of water (mixing ratio/specific humidity) in its different forms, and conservation of other gaseous and aerosol material.


These equations describe the variability in time and space of variables such as temperature, humidity, surface pressure, wind and turbulent kinetic energy, which are considered "prognostic" variables (as are the respective equations), each value being predicted for some time in the future on the basis of the values at the current or previous times. Conversely, the values of the so-called "diagnostic" variables, such as clouds and many others, are obtained by gathering and evaluating, for a specific time, the results of the prognostic equations.

Numerical Procedures

Numerical procedures, by means of complex calculations, integrate the model's differential equations forward in time, providing a solution (the variables' values) which is an approximation of every considered term. There are several ways to describe, in mathematical language, the space in which the thermodynamic and atmospheric-dynamics equations operate; the one considered in this work is the "grid point" or "finite difference" method. This is the oldest and most used method: the forecast is solved on a gridded system of geographical coordinates (geodesic grids, or latitude and longitude grids) systematically projected over the portion of atmosphere covered by the model's domain, providing a 3D-organized scheme of points representing the target zone (it can be viewed as a three-dimensional array of cubes). Calculations are held for each point, creating an area average over a grid box: the derivatives are calculated inside or around the grid cube, depending on the type of physical quantity considered. The size (grid spacing) of each box varies with the resolution of the model, and the number of points strictly depends on that parameter, differing remarkably between global and local models.
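A minimal illustration of the finite-difference idea on a hypothetical 1-D grid: advection of a field at constant speed with the first-order upwind scheme (an illustrative toy, not WRF's actual solver):

```python
import numpy as np

def upwind_advection(u0, c, dx, dt, nsteps):
    """Advect a 1-D field with constant speed c > 0 on a periodic grid,
    using the first-order upwind finite-difference scheme:

        u[i]^{n+1} = u[i]^n - c*dt/dx * (u[i]^n - u[i-1]^n)
    """
    u = u0.copy()
    courant = c * dt / dx        # must be <= 1 for stability (CFL condition)
    for _ in range(nsteps):
        # np.roll(u, 1) supplies u[i-1] with periodic boundaries
        u = u - courant * (u - np.roll(u, 1))
    return u
```

With a Courant number of exactly 1 the scheme shifts the field one grid cell per time step, which makes the grid-point bookkeeping easy to verify.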


Figure 1.2 - Comparison of different NWP resolutions

Forecast values of the meteorological variables in each cube are derived from the current values in the surrounding cubes; for the boundaries of the domain and for the initial conditions, data from other sources are required, since these values cannot be provided by the model itself: data from other models or observations are used for this task.

Hydrostatic and non-hydrostatic models

Model types are divided into two categories. "Hydrostatic" models assume hydrostatic equilibrium, in which the downward weight of the atmosphere is balanced by the upward-directed pressure-gradient force; they are used for synoptic and global-scale systems and sometimes for mesoscale domains. "Non-hydrostatic" models are used when the horizontal wavelength of atmospheric phenomena is approximately equal to their vertical extent: in this regime non-hydrostatic processes and their effects gain importance, mainly for features approximately 10 km and less in size (owing to the height of the troposphere and to the primary importance this part of the atmosphere has in meteorological phenomena). In this way all the processes driven by vertical motion (due to buoyancy changes and other vertical accelerations) are described, allowing the prediction of small-scale processes such as convection and other cloud-linked phenomena, while large-scale motions and their effects on atmospheric conditions are partially neglected. A non-hydrostatic model takes longer to run than a hydrostatic model with the same resolution and domain size: the choice between the two is strictly linked to the target, i.e. whether the main interest is focused on small- or large-scale phenomena.

Physical processes parameterization

Whenever a model cannot describe, for lack of adequate spatial-temporal resolution, a certain kind of phenomenon which actually has a significant role in the thermodynamic behavior of the whole system, it becomes necessary to take into account the effects these processes have on the balances of the system.

Convective and microphysical processes are sub-grid-scale phenomena that are usually parameterized in NWP models. The former concern the redistribution of temperature and moisture via water phase changes, primarily dealing with atmospheric instability: only in non-hydrostatic models operating at scales of 5 km or finer can convection be made explicit without producing unrealistic features. The subject will be dealt with specifically for the study case in the following chapter.


2.2. WRF MODEL

The atmospheric model WRF (Weather Research and Forecasting) is an open-source code created in 1995 and developed by the National Center for Atmospheric Research (NCAR), the National Oceanic and Atmospheric Administration (NOAA), the U.S. Air Force, the Naval Research Laboratory, the University of Oklahoma and the Federal Aviation Administration.

Specific features of this modelling system are presented below, focusing on the particular settings used by CIMA Foundation.

2.2.1. FEATURES

WRF is a next-generation mesoscale non-hydrostatic forecasting system able to operate over a wide range of spatial resolutions, from finer scales of hundreds of meters to coarser resolutions of hundreds of kilometers. Designed for both operational and research applications, WRF comes in two variants: the Advanced Research WRF (WRF-ARW) and NCEP's Non-hydrostatic Mesoscale Model (WRF-NMM); the model has undergone continuous development since its first release at the end of December 2000. WRF is used in several fields, both for short-term weather forecasts and for long-term climate scenario projections, reproducing real or idealized conditions for research activity.

The model operates in two phases: first the domain(s) is configured, inserting input data and setting the initial conditions; then the model runs, with the dynamical solver and the physics schemes (microphysics, radiative processes, planetary boundary layer) for atmospheric processes actually producing the forecast.


Figure 2.1 - WRF Modelling System Flow Chart

WRF includes data assimilation processors (3DVAR, 4DVAR): in numerical weather prediction, data assimilation is a method that combines observations of meteorological variables, such as temperature and atmospheric pressure, with prior forecasts in order to initialize numerical forecast models.

WRF is used in many areas of the atmospheric sciences: in our specific case, at CIMA Research Foundation, the main focus is on hydro-meteorological research applications. Forecasts are carried out on behalf of ARPAL together with ICPD, as described hereafter.

CIMA Foundation runs WRF with two different configurations:

WRF - 1.5 km OL: an open-loop configuration, without data assimilation, with three domains at spatial resolutions of 13.5, 4.5 and 1.5 km (the last of which is the one considered in this thesis) and 50 vertical levels. Data for analysis and boundary conditions are provided hourly by the Global Forecasting System (GFS) at 0.25-degree resolution. The model runs a 48-hour forecast every day, starting from 00 UTC, thus providing two different forecasts for every single day.


Figure 2.2 - WRF-1.5km OL: open-loop configuration (without data assimilation) with 3 domains at 13.5, 4.5 and 1.5 km spatial resolution and 50 vertical levels.

WRF - 2.5 km 3DVAR: a configuration equipped with cyclic data assimilation, with three domains at 22.5, 7.5 and 2.5 km resolution and 50 vertical levels. Data for analysis and boundary conditions are again provided by GFS, with a 3-hour frequency and 0.25-degree resolution (not analyzed in this thesis).

Global Forecasting System

Several global models produce data used as initial and boundary conditions for higher-resolution forecasts: in this thesis, WRF is initialized with data from GFS (Global Forecasting System). This global model is the flagship of the National Centers for Environmental Prediction (NCEP) and provides both deterministic and probabilistic forecasts up to 16 days forward in time; for longer lead times (beyond one or two weeks) the grid resolution drops to 70 km. The Global Data Assimilation System (GDAS) initializes GFS with satellite data and conventional observations from global sources, providing the model with its initial conditions.

Figure 2.3 - Global Model, scheme of three-dimensional grid cube (courtesy of the COMET program)

GFS produces a dataset of both atmospheric and land-soil variables, including wind, temperature, precipitation, soil moisture, ozone concentration, etc., on a regular grid of data points with a spacing of 25 km.

GFS's structure is composed of four separate cores, making it a "coupled" model: an atmosphere model, an ocean model, a land/soil model and a sea-ice model work together to provide an effective representation of global weather conditions. A "coupled model" is indeed the result of gathering different models representing the different phenomena involved in the Earth system, thereby considering the deep interactions among them and providing a more realistic picture of reality.

WRF functioning

WRF solves the governing equations using an explicit high-order Runge-Kutta time-split integration scheme (Wicker et al., 1998) in the two horizontal dimensions, with an implicit solver in the vertical direction.
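The three-stage Runge-Kutta scheme can be illustrated on a scalar ODE du/dt = f(u); this is a sketch of the generic RK3 stages, not of WRF's full time-split solver:

```python
import math

def rk3_step(f, u, dt):
    """One step of the three-stage Runge-Kutta scheme used in WRF-ARW:

        u*   = u + dt/3 * f(u)
        u**  = u + dt/2 * f(u*)
        u_new = u + dt  * f(u**)
    """
    u_star = u + dt / 3.0 * f(u)
    u_star2 = u + dt / 2.0 * f(u_star)
    return u + dt * f(u_star2)

# usage: integrate du/dt = -u from u(0) = 1 up to t = 1
u, dt = 1.0, 0.01
for _ in range(100):
    u = rk3_step(lambda x: -x, u, dt)
```

For this linear test problem the scheme is third-order accurate, so the result lands very close to the exact solution exp(-1).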


Like all NWP models, WRF includes a series of parameterizations through which all the sub-grid processes are included in the model structure and thus represented in the model's reality. The description starts from the planetary boundary layer (PBL), the lowest part of the atmosphere, directly influenced by short-term changes in surface radiative forcing and other physical quantities such as flow velocity, temperature and moisture; it features rapid, turbulent fluctuations with strong vertical mixing. The "geostrophic" approximation is not valid in the PBL, because the wind is mainly affected by surface drag and turns across the isobars (high-turbulence conditions).

The lowest part of the PBL is the atmospheric surface layer; in this zone the mechanical generation of turbulence (shear) exceeds buoyant generation or consumption, and turbulent fluxes and stress are nearly constant with height.

The earth/land surface layer is affected by infiltration, internal soil moisture fluxes, internal soil heat fluxes and gravitational flow, all of which are significant processes for the free atmosphere and the PBL.

These three adjacent layers are connected through some key processes (Figure 2.4): the atmospheric surface layer provides exchange coefficients for heat and moisture to the land surface layer, which in turn provides land-surface heat and moisture fluxes to the PBL; the atmospheric surface layer also supplies friction stress and water-surface fluxes of heat and moisture to the planetary boundary layer.

Figure 2.4 - Planetary boundary layer and atmospheric surface layer daily cycle (courtesy of the COMET Program)


The WRF model includes all these processes through parameterizations, which allow the calculation of surface heat and moisture fluxes by the land-surface models, and of surface stress. For water surfaces, the fluxes and surface diagnostic fields are computed in the surface layer scheme itself. These schemes provide the model only with stability-dependent information about the surface layer for the land-surface and PBL schemes; some of them require a surface layer thickness representative of the actual surface layer (e.g. 50-100 m).

Providing heat and moisture fluxes over land and sea-ice points is the task of the land-surface models (LSMs), which use atmospheric information from the surface layer scheme, radiative forcing from the radiation scheme, precipitation forcing from the microphysics and convective schemes, and surface temperature, water vapor and wind from the PBL scheme, together with internal information on the land's state variables and land-surface properties.

Figure 2.5 - Main interactions between the planetary boundary layer, the atmospheric surface layer and the land surface layer (courtesy of WRF-ARW tutorials)

These fluxes set the boundary conditions for the vertical transport. Land-surface models have various degrees of sophistication in dealing with thermal and moisture fluxes in multiple soil layers, and may also handle vegetation, root and canopy effects and surface snow-cover prediction. The land-surface model provides no tendencies, but updates the land's state variables, which include the ground (skin) temperature, soil temperature profile, soil moisture profile, snow cover and possibly canopy properties. There is no horizontal interaction between neighbouring points in the LSM, so it can be regarded as a one-dimensional column model for each WRF land grid point, and many LSMs can be run in stand-alone mode.

Figure 2.6 - Direct interactions of parameterizations, with special focus on surface related ones (courtesy of WRF-ARW tutorials)

Concerning the planetary boundary layer model: in the set of equations for turbulent flow, the number of unknowns is larger than the number of equations; the unknown turbulence terms must therefore be parameterized as functions of known quantities and parameters (Sommeria, 1976). This is known as the closure problem, and much of the numerical modelling of the turbulent atmosphere is related to the numerical representation, or parameterization, of these fluxes (Wyngaard, 2004). Closure can be local or non-local: in local closure, an unknown quantity at a point in space is parameterized by values and/or gradients of known quantities at the same point; in non-local closure, an unknown quantity at one point is parameterized by values and/or gradients of known quantities at many points in space. The use of first-order closure schemes for evaluating turbulent fluxes is common in many boundary layer, mesoscale and general circulation models of the atmosphere.

The planetary boundary layer (PBL) parameterization is responsible for the vertical sub-grid-scale fluxes due to eddy transports in the whole atmospheric column, not in the boundary layer exclusively. Thus, when a PBL scheme is activated, explicit vertical diffusion is de-activated, on the assumption that the PBL scheme will handle this process. The surface fluxes are provided by the surface layer and land-surface schemes. The PBL schemes determine the flux profiles within the well-mixed boundary layer and the stable layer, and thus provide atmospheric tendencies of temperature, moisture (including clouds) and horizontal momentum in the entire atmospheric column. Most PBL schemes consider dry mixing, but can also include saturation effects in the vertical stability that determines the mixing. The schemes are one-dimensional and assume a clear scale separation between sub-grid eddies and resolved eddies. This assumption becomes less valid at grid sizes below a few hundred meters (LES mode), where boundary layer eddies may start to be resolved; in these situations the scheme should be replaced by a fully three-dimensional local sub-grid turbulence scheme.

WRF PBL schemes can be:

● based on turbulent kinetic energy prediction
● diagnostic non-local

Microphysics parameterizations are also widely used in atmospheric schemes for NWP; they include the representation of vapour, cloud and precipitation processes. The WRF model is quite versatile, accommodating any number of mass mixing-ratio variables and other quantities; these scalars are represented through four-dimensional arrays (three spatial indices and one species index).

WRF offers microphysics parameterization options with different levels of sophistication:

● Warm rain (i.e. no ice) – Kessler (idealized)
● Simple ice (3 arrays) – WSM3
● Mesoscale (5 arrays, no graupel) – WSM5
● Cloud-scale single-moment (6 arrays, graupel) – WSM6, Lin, Goddard, Eta-Ferrier
● Double-moment (8-13 arrays) – Thompson, Morrison, WDM5, WDM6, NSSL

Single-moment schemes have one prediction equation for mass (kg/kg) per species with particle size distribution being derived from fixed parameters. Double-moment (DM) schemes add a prediction equation for number concentration (#/kg) per DM species.
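The difference can be illustrated with a hypothetical exponential size distribution N(D) = N0*exp(-lambda*D): a single-moment scheme fixes N0 and diagnoses the slope lambda from the predicted mass alone, while a double-moment scheme uses both mass and number concentration. This is a textbook-style sketch, not the formulation of any specific WRF scheme:

```python
import math

RHO_W = 1000.0  # density of liquid water, kg/m^3

def slope_single_moment(mass_content, n0=8.0e6):
    """Slope lambda of N(D) = N0*exp(-lambda*D) from mass content alone
    (kg/m^3), with N0 fixed a priori (Marshall-Palmer value by default).
    Integrating (pi/6)*rho_w*D^3 over the distribution gives
    M = pi*rho_w*N0/lambda^4, hence lambda = (pi*rho_w*N0/M)^(1/4)."""
    return (math.pi * RHO_W * n0 / mass_content) ** 0.25

def slope_double_moment(mass_content, number_conc):
    """Slope from mass content (kg/m^3) AND number concentration (1/m^3):
    with Nt = N0/lambda, M = pi*rho_w*Nt/lambda^3 gives
    lambda = (pi*rho_w*Nt/M)^(1/3)."""
    return (math.pi * RHO_W * number_conc / mass_content) ** (1.0 / 3.0)
```

The two diagnostics agree when the mass and number pair actually comes from a distribution with the assumed fixed N0; the double-moment form lets the distribution narrow or broaden independently of mass.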

Figure 2.7 - Key microphysics and radiation processes

The last section concerning parameterizations is the one about radiation: it accounts for both shortwave and longwave radiation in the ground heat budget. The longwave part involves the infrared and thermal radiation absorbed and emitted by gases and by the surface; the shortwave part includes the visible and surrounding wavelengths of the solar spectrum and the albedo (upward flux). Radiation in the model is a function of the presence of clouds, carbon dioxide, ozone and (not always) trace-gas concentrations, and is represented as a set of one-dimensional columns, each treated independently.

Longwave parameterization schemes (computing clear-sky and cloudy upward and downward radiation fluxes):

● consider IR emission from layers
● surface emissivity based on land type
● flux divergence leads to cooling in a layer
● the downward flux at the surface is important in the land energy budget
● IR radiation generally leads to cooling in clear air (~2 K/day), with stronger cooling at cloud tops and warming at cloud base

The shortwave schemes compute clear-sky solar fluxes, including the annual and diurnal solar cycles; most schemes consider both downward and upward fluxes. Shortwave radiation has a primarily warming effect in clear sky, which is important for the surface energy balance.

In NWP, owing to the scale at which models operate, some phenomena can be directly represented while others have to be parameterized: mesoscale modelling, nowadays including cloud-permitting and cloud-resolving applications, and large-eddy simulation (LES) are two classes of models operating at different resolutions (Lilly, 1967; Deardorff, 1970).

The skill of NWP is deeply influenced by unresolved processes: recent developments in computer technology have allowed models to operate at very high resolutions (of the order of kilometres), both for research purposes and for operational use. Additional advantages of applying such high-resolution, convection-resolving models (CRMs) are a better representation of topography, surface fields and boundary layer processes: regions where orography is a dominant factor are deeply influenced by this, with effects on the fields of the variables produced by the model (Foresti et al., 2013; Foresti and Seed, 2015).

WRF Preprocessing System (WPS)

The GFS initial and boundary conditions are provided to the WRF model through a separate package called the WRF Preprocessing System (WPS), which consists of a set of programs that take terrestrial and meteorological data (usually in GRIB format, a concise data format commonly used in meteorology to store historical and forecast weather data) and transform them into input for the ARW real-data pre-processor.


Figure 2.8 - WPS rationale (courtesy of WRF-ARW tutorials)

Geogrid defines the simulation's domains and interpolates various terrestrial datasets with the model grids. The simulation domains are defined using information specified by the user in the “geogrid” namelist record of the WPS namelist file, namelist.wps. In addition to computing the latitude, longitude, and map scale factors at every grid point, geogrid will interpolate soil categories, land use category, terrain height, annual mean deep soil temperature, monthly vegetation fraction, monthly albedo, maximum snow albedo, and slope category to the model grids by default.

The ungrib program reads GRIB files, "degribs" the data, and writes the data in the intermediate format. The GRIB files contain time-varying meteorological fields and are typically from another regional or global model, such as IFS-ECMWF, or GFS models.

The metgrid program horizontally interpolates the intermediate-format meteorological data that are extracted by the ungrib program onto the simulation domains defined by the geogrid program. The interpolated metgrid output can then be ingested by the WRF real program. The range of dates that will be interpolated by metgrid are defined in the “share” namelist record of the WPS namelist file, and date ranges must be specified individually in the namelist for each simulation domain. Since the work of the metgrid program, like that of the ungrib program, is time-dependent, metgrid is run every time a new simulation is initialized.
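As an illustration, the domain and date settings handled by geogrid and the share record live in the namelist.wps file described above; the following is a minimal, hypothetical sketch (all values are illustrative and are not the configuration actually used in this thesis):

```fortran
&share
 wrf_core = 'ARW',              ! dynamical core
 max_dom  = 1,                  ! single domain
 start_date = '2018-01-07_00:00:00',
 end_date   = '2018-01-09_00:00:00',
 interval_seconds = 10800,      ! 3-hourly GFS boundary data
/

&geogrid
 e_we = 1185,                   ! west-east grid points (illustrative)
 e_sn = 889,                    ! south-north grid points (illustrative)
 dx = 1500,                     ! grid spacing in metres
 dy = 1500,
 map_proj = 'lambert',
 geog_data_path = '/path/to/WPS_GEOG'
/

&metgrid
 fg_name = 'FILE',              ! prefix of ungrib intermediate files
/
```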

The input to the ARW real-data processor (real.exe) from WPS contains 3-dimensional fields (including the surface) of temperature (K), relative humidity (%) and the horizontal components of momentum (m/s, already rotated to the model projection). The 2-dimensional static terrestrial fields include: albedo, Coriolis parameters, terrain elevation, vegetation/land-use type, land/water mask, map scale factors, map rotation angle, soil texture category, vegetation greenness fraction, annual mean temperature, and latitude/longitude. The 2-dimensional time-dependent fields from the external model, after processing by WPS, include: surface pressure and sea-level pressure (Pa), layers of soil temperature (K) and soil moisture (kg/kg, either total moisture, or binned into total and liquid content), snow depth (m), skin temperature (K), sea surface temperature (K), and a sea-ice flag.

2.2.2. WRF FORECAST DATA

The output consists of a set of meteorological variables for the considered domain and period of time. In this work we consider the model output of temperature and precipitation for the year 2018 over the Italian territory.

Both the datasets used for this thesis are provided by CIMA Foundation; the format used is the Network Common Data Form, or NetCDF.

The datasets for both variables evaluated in this thesis contain, day by day, the hourly values of 2 m temperature (T2m hereafter) and surface rain over the considered domain: for every hour, a matrix of 888 x 1184 pixels contains the values forecast by WRF. Model forecasts are 49 instants long, covering two days: for every single day, "RUN1" denotes the first 24 hours of the forecast and "RUN2" the 24-hour slot of the forecast issued the previous day for the same day, from the 25th to the 49th instant. The matrices containing the values of these two meteorological variables are the starting point for the model validation treated in this thesis.

2.2.2.1. PRECIPITATION

The WRF output for precipitation (RAINNC) consists of the cumulated rain over every pixel (1.5x1.5 km2) of the grid (888 x 1184) covering the considered domain; hour by hour, a matrix containing the Quantitative Precipitation Forecast (QPF) is created by WRF. A map showing the QPF for a specific month, day and hour of 2018 is obtained by plotting this matrix with reference to geographical coordinates (a grid with the same matrix dimensions).
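Since RAINNC is accumulated from the start of the model run, hourly rain amounts are obtained by differencing consecutive instants. A minimal Python sketch (the actual processing in this thesis used Matlab; the array names are hypothetical):

```python
import numpy as np

def hourly_from_accumulated(rainnc):
    """Convert WRF accumulated precipitation (RAINNC, mm since model start,
    shape (n_times, ny, nx)) into hourly increments (mm/h)."""
    rainnc = np.asarray(rainnc, dtype=float)
    hourly = np.diff(rainnc, axis=0)
    # Guard against tiny negative values from numerical round-off.
    return np.clip(hourly, 0.0, None)

# Toy stand-in for the real (49, 888, 1184) forecast stack:
acc = np.array([[[0.0]], [[1.2]], [[1.2]], [[4.0]]])
print(hourly_from_accumulated(acc).ravel())  # [1.2 0.  2.8]
```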

For this thesis we will consider daily, monthly and seasonal precipitation over the Italian territory: in the following chapters all the processes and operations carried out on the QPF matrices are described.

2.2.2.2. TEMPERATURE

The temperature data considered in this work are the values calculated by the model at a height of 2 metres above the ground (T2m), the quantity commonly used when referring to surface-level atmospheric temperature; the model outputs temperature values in kelvin.

As for the QPF, the temperature dataset is composed of hourly matrices for the whole of 2018, with 888 x 1184 pixels of 1.5x1.5 km2 each composing the grid covering the domain.


2.3. OBSERVED DATA

To perform an NWP validation, a term of comparison for the forecast data is required: for this reason, datasets of observed values of temperature and precipitation are necessary.

For the management of risk prevention linked to natural phenomena, an efficient network for hydro-meteorological data collection is a fundamental tool. Real-time monitoring is a well-established method for evaluating the state of environmental systems, enhancing the comprehension of those factors that could turn into risky situations: this holds for the present but also for the a posteriori evaluation of past critical situations, and a better understanding of the functioning of the natural system helps in assessing possible future critical scenarios.

In this study two different ground-based observing systems are considered, namely: hydro-meteorological stations and radar.

The Italian hydro-meteorological station network

A hydro-meteorological station provides a continuous measurement of weather variables, storing and transmitting data with the aim of creating a database for a certain place. Usually the time interval for data collection spans from 1 minute to one hour, depending on the type of sensors installed; such a station can provide users with different information:

 Liquid precipitation with rain gauges

 Temperature with thermometers

 Wind strength and direction with anemometers

 Snow with nivometers

 River level with hydrometers

Two independent networks for real-time data collection cover the whole Italian territory; conventionally they are named the "Functional Centers Network" (NPCD) and the PAUN network, for a total of 5222 weather stations.


The current database consists of 5222 stations, counting 3551 stations from PAUN and 4881 from the Functional Centers (some of which belong to both networks), collecting data since 2004 and 2006 respectively.

Unfortunately there is no single network with a uniform registry for the whole Country: the registry for weather stations contains, station by station, all the relevant information, such as the types of sensors composing the station, location, altitude, geographical coordinates and so on. A non-uniform registry creates problems with updates and with the consistency of the database, concerning in particular the frequency with which data are collected and reach the national database: every station sends data with its own procedure and frequency.

Each Region manages the data measurements in its local area, collecting from each sensor by means of a software (ActiveDVD) and following several steps; passing through the central server located in Rome, the national data reach CIMA Foundation and are then published on Dewetra 2.0. Almost the same steps are followed for the PAUN network data: both data streams are then published on the official myDewetra Platform, an integrated system used for real-time monitoring and for the forecasting and prevention of risks concerning natural phenomena on the Italian territory; the myDewetra Platform is property of ICPD and is developed by CIMA Foundation.

Thermometers

Among the weather stations covering the considered spatial domain, the number of thermometers is 3534, spread homogeneously over the Italian territory and providing the variable "2 m temperature" with hourly time resolution.


Figure 2.10 - Thermometers distribution

Rain Gauges

The rain gauges belonging to the observational network covering the national territory collect data representative of the liquid precipitation at a point; the total number of these sensors on the Italian territory for 2018 reaches 4366 data-collection points.


Figure 2.11 - Rain Gauges distribution

National Radar Network

The National Radar Network (NRN) is a mosaic formed by data from all radars on the Italian territory, some owned by ICPD, others by ENAV and the Military Aeronautics.

A radar device detects the position of an object by comparing a reference signal emitted by a transmitter with the return signal (radar echo) re-transmitted by the object and analyzed by a receiver. A meteorological radar detects, within a volume of atmosphere, the presence of hydrometeors, i.e. raindrops, which influence the propagation of the radar beam of electromagnetic waves; it therefore consists of a high-power transmitter of electromagnetic waves. The emitted radiation is concentrated in a beam, generally of width between 1° and 2°, through an antenna that also receives the energy re-radiated by the hydrometeors. The pointing direction of the antenna can vary both in azimuth and in elevation, in order to perform volumetric scans. The radiation emitted by a meteorological radar consists of a train of pulses, generated by a pulse modulator, with a particular repetition frequency called PRF (Pulse Repetition Frequency). A radar usually transmits pulses lasting about one millionth of a second, separated by millisecond intervals, during which the re-radiated signal can return to the receiver before the next pulse is issued. Based on the intensity of the radar echo, a processing system assigns an intensity to precipitation in the atmosphere, allowing the expert user to identify the importance and location of the phenomenon.

Figure 2.12 - National radar mosaic, example

The choice of a specific operating frequency results from a compromise between the propagation conditions of the electromagnetic wave in the atmosphere and the technical construction requirements of the instrument: the wavelength, and therefore the frequency, affects both the attenuation of the signal during propagation and the antenna sizing. At our latitudes the C band is mainly used, which guarantees good performance in terms of observation capacity and allows a beam opening of about one degree with an antenna diameter of the order of four metres. Meteorological radars working in C band can provide qualitative information up to about 200 km, although precipitation estimates are possible only in narrower ranges of around 100 km. With a resolution of 1 km, a weather radar produces information equivalent to that provided by about ten thousand rain gauges. Complete radar coverage allows precipitation amounts to be calculated more accurately over the entire surface, helping to improve water management. Besides the operational aspects, radar information is crucial for the study of the precipitation regime, for research on the regional climate in view of a better protection of the environment, for research in cloud physics, and for short-term weather forecasting, or nowcasting. The NRN mosaic product of interest for this study is the SRT (Surface Rainfall Total) [mm] - 60 min: from a series of SRI maps it is possible to derive a map, called SRT, representing the total rainfall fallen to the ground over the integration period. The generated accumulations are relative to 1 h, 3 h, 6 h, 12 h and 24 h.

III. DATA ANALYSIS

In the following lines the processing applied to the datasets described in the previous chapter is presented. The matrices containing rain and temperature data for the whole Italian territory for the year 2018 have quite large dimensions, so the data had to be organized in a way suited to the purposes of this thesis. The tool used for the management of these datasets is Matlab, a programming platform designed for scientific and engineering tasks that allows the processing of large datasets with a matrix-based language, analyzing data, developing algorithms and creating models and other kinds of applications (© 1994-2019 The MathWorks, Inc.). Specifically, "mat-files" are the format in which Matlab stores data, "scripts" are series of commands by which the variables in a dataset can be processed, and "functions" are algorithms that, starting from an input variable, produce an output variable after a series of operations.

Depending on the type of analysis performed, data were organized, processed and displayed according to specific time steps and other criteria apt to satisfy the purposes of this work: in the following paragraphs these methods are presented one by one.

3.1. PRECIPITATION

The hourly dataset of rain-gauge values is processed with the aim of making it more representative of the spatial patterns of the real precipitation field, which of course cannot be captured by point information alone; the final Quantitative Precipitation Estimation (QPE) product used in this study is based on the merging of ground-based rainfall data provided by both aforementioned rain-gauge networks for the Italian territory (stations belonging to both networks are counted only once) together with the radar observations.


The 60-minute accumulation data from the radar source and from the rain gauges are combined with a merging operation, obtaining a product with better accuracy than either dataset alone: in the gridded map covering the whole domain, pixels containing a rain gauge are identified and the gauge-radar merged data are calculated for each square individually; the interpolated value at an un-gauged pixel is a weighted sum of the nearest-neighbour gauge accumulations.

"Rainfusion" is the method used by CIMA Foundation for combining radar and rain gauge data (Pignone et al, 2013):

Considering that every station gives information for a single point over the land surface, an interpolation of the observational data is needed: by default, precipitation maps after interpolation show cumulative observations over the last 24 hours interpolated by GRISO (Rainfall Generator of Spatial Interpolation from Observation). The estimation of rainfall fields, especially their spatial distribution and position, is a crucial task both for rainfall nowcasting and for modelling catchment response to rainfall. In the past, several studies on the spatialization of rainfall from rain gauges were made and many methods were developed; the best known, belonging to the geostatistical family, is Kriging (Matheron, 1967). The GRISO algorithm is similar to Kriging, so the output map maintains the observed "real" rainfall value at the rain-gauge positions but is conditioned to reach the mean of the field far from the gauges. The main innovations are the improved computational time, the associated variance map and, above all, the possibility of using more than one semivariogram to spatialize the information. The semivariance represents how closely one gauge corresponds to another gauge in the neighbourhood, and a variogram (semivariance vs. distance) is a chart describing this effect for a certain area.
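As a rough illustration of gauge spatialization, the sketch below uses simple inverse-distance weighting; this is NOT the GRISO algorithm itself (which is kriging-like and semivariogram-based), but it shares the property of returning the observed value exactly at gauge locations:

```python
import numpy as np

def idw(gauge_xy, gauge_vals, grid_x, grid_y, power=2.0):
    """Inverse-distance-weighted spatialization of point rain-gauge values.
    A generic stand-in for illustration, not GRISO: at gauge locations the
    observed value is returned exactly, elsewhere a distance-weighted mean."""
    out = np.empty(grid_x.shape)
    for idx in np.ndindex(grid_x.shape):
        d = np.hypot(gauge_xy[:, 0] - grid_x[idx],
                     gauge_xy[:, 1] - grid_y[idx])
        if d.min() < 1e-12:                 # pixel exactly on a gauge
            out[idx] = gauge_vals[np.argmin(d)]
        else:
            w = 1.0 / d ** power
            out[idx] = np.sum(w * gauge_vals) / np.sum(w)
    return out

# Two gauges at (0,0) and (2,0); the midpoint gets the average value.
g_xy = np.array([[0.0, 0.0], [2.0, 0.0]])
g_v = np.array([10.0, 20.0])
print(idw(g_xy, g_v, np.array([[1.0]]), np.array([[0.0]])))  # [[15.]]
```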

The data used in this thesis are the product of these processes. The QPE output for precipitation consists of the cumulated rain over every pixel of the grid (550 x 600) covering the considered domain, hour by hour, at a spatial resolution of 1.5 km.

For this thesis we will consider daily, monthly and seasonal precipitation over the Italian territory. In the following chapters all the processes and operations carried out on the QPE matrices are described.


Monthly cumulative precipitation

The first step of this work was the month-by-month evaluation of the model performances, calculating, all year long and for the whole territory, the monthly cumulative precipitation (mm/month) from both the observed dataset and the forecast ones (RUN1 and RUN2); the dissimilarity between QPF and QPE for every month was measured and evaluated employing a normalized BIAS index:
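The formula of the normalized BIAS did not survive the document conversion; a plausible reconstruction, consistent with the relative-error description that follows, is:

```latex
\mathrm{BIAS\,\%} = \frac{\mathrm{QPF} - \mathrm{QPE}}{\mathrm{QPE}} \times 100
```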

This provides a proportional measure of how much the model forecasts mismatch the merged cumulative precipitation values, quantified as a relative error (%), positive or negative. The results are displayed with relative-error maps showing, for every month, QPF, QPE and the BIAS%; recalling that for every single day the model produces two different forecasts (RUN1 and RUN2), this comparison is made with both of them.

The purpose of this monthly analysis is to evaluate the performances over the whole territory within a relatively short time span: considering the monthly cumulative values, it is possible to evaluate the short-time variability and the possibly recurring errors within a relatively short fraction of the whole considered period, while keeping a full-perspective view of the entire domain.

The QPF dataset consists of daily mat-files containing the hourly values of cumulative rain over the 1.5x1.5 km2 pixels of the 888x1184 grid covering the Italian territory for all the months of 2018; every file contains 49 instant values of forecast rain. A script was created that allowed, day by day, the creation of two different mat-files, splitting the single 48-hour forecast into two runs of 24 hours, RUN1 and RUN2, referring respectively to the day in question and to the following one. Once the two QPF files (RUN1 and RUN2) were available for every day, a monthly sum was performed on each list, obtaining a monthly cumulated rain value.
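The splitting of each 49-instant forecast into RUN1 and RUN2 can be sketched as follows (in Python rather than the Matlab actually used; the exact handling of the 49th instant is an assumption, since the two runs are stated to be 24 hours each):

```python
import numpy as np

def split_runs(forecast_49h):
    """Split one 49-instant WRF forecast stack (49, ny, nx) into
    RUN1 (first 24 instants, valid on the issue day) and RUN2
    (instants 25-48, valid on the following day)."""
    f = np.asarray(forecast_49h)
    run1 = f[0:24]    # instants 1-24
    run2 = f[24:48]   # instants 25-48 (49th instant convention assumed)
    return run1, run2

stack = np.arange(49).reshape(49, 1, 1)  # toy stand-in for 888x1184 maps
r1, r2 = split_runs(stack)
print(r1.shape[0], r2.shape[0])  # 24 24
```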


The QPE dataset is hourly as well and was processed by cumulating, hour by hour and day by day, the values of the cumulated rain with a sum (considering a 24-hour dataset, therefore avoiding the splitting between RUN1 and RUN2).

Since QPF and QPE are represented on two different grids (888x1184 and 550x600 respectively), it becomes necessary to interpolate the monthly cumulated QPE grid onto the other, allowing a perfect superimposition of the two; the method used is "nearest neighbour", meaning that an empty pixel is assigned the value of the nearest one, reshaping the monthly observed matrix onto the grid compatible with the geographical-coordinate representation suitable for Dewetra and the QPF data.
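A minimal sketch of nearest-neighbour regridding (Python, brute force for clarity; the actual processing used Matlab and, on the full 888x1184 grid, a tree-based search would be preferable):

```python
import numpy as np

def regrid_nearest(src_lat, src_lon, src_vals, dst_lat, dst_lon):
    """Assign to every destination pixel the value of the nearest source
    pixel (nearest-neighbour regridding of the QPE grid onto the QPF grid)."""
    src_pts = np.column_stack([src_lat.ravel(), src_lon.ravel()])
    vals = src_vals.ravel()
    out = np.empty(dst_lat.size)
    for i, (la, lo) in enumerate(zip(dst_lat.ravel(), dst_lon.ravel())):
        d2 = (src_pts[:, 0] - la) ** 2 + (src_pts[:, 1] - lo) ** 2
        out[i] = vals[np.argmin(d2)]
    return out.reshape(dst_lat.shape)

# Toy check: 2 source points, 1 destination pixel nearer to the second.
slat = np.array([0.0, 1.0]); slon = np.array([0.0, 1.0])
vals = np.array([5.0, 9.0])
print(regrid_nearest(slat, slon, vals,
                     np.array([[0.9]]), np.array([[0.9]])))  # [[9.]]
```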

Once the three matrices were obtained for every month (QPE, QPF RUN1, QPF RUN2), these were plotted as shown in Figure 4.1, together with the relative error. In the next chapter all maps are displayed.

Seasonal cumulative precipitation

Starting from the matrices produced with the previous procedure, in this section a seasonal analysis of the cumulative precipitation for the whole Italian territory is performed; from the monthly matrices, maps showing the seasonal error were produced, cumulating January and February for Winter (data from December 2017, which would have completed the season, were not available), March, April and May for Spring, June, July and August for Summer and, finally, September, October and November for Autumn. The aim of this analysis is to establish in which season the WRF model works best and, together, when and where the errors occur most.

Fuzzy-logic analysis

In collaboration with ARPA Piedmont, a fuzzy-logic analysis was performed on the meteorological data produced by WRF and collected by the observation network, allowing an innovative evaluation of the simulating performances of WRF (Ebert et al., 2008). This kind of analysis compares observed and forecast data, with the advantage that the final error is not calculated as the rough difference between the forecast and the observed data: a window of approximate agreement between the two parts is allowed, without applying too binding constraints. The method compares QPE and QPF considering 10 thresholds of rain intensity, namely 0.1, 0.2, 0.5, 1, 2.5, 5, 7.5, 10, 12 and 15 mm/3h (these are the ones used for WRF), and checks the agreement between WRF data and the observed ones starting from a single pixel (1.5x1.5 km2) up to 65 pixels (97.5 km): the results are displayed with three different statistical indices (Fractions Skill Score, False Alarm Rate and Probability of Detection), described hereafter.

This analysis was carried out with hourly QPE and QPF for all of 2018 over the whole Italian territory, producing charts that summarize, month by month, the performances of both RUN1 and RUN2 (respectively called d0 and d1 in the comparison); in the pre-processing step, both QPE and QPF (RUN1 and RUN2) data were interpolated onto a 1.5 km grid spacing, considering for the QPE the radar and rain-gauge merged data for the whole territory, cutting out the sea-surface zones and evaluating the performances over the land surface only. The analysis was applied to three-hourly cumulated datasets (predicted and observational).

Furthermore, a comparison with the performances of COSMO I2 was carried out for the months in which most forecasts missed the hits, in particular for the summer period, considering the time window from May to September (included); forecasts covering this period are the most critical ones owing to convective phenomena, so the idea emerged of comparing WRF performances with those of a model specifically tailored to the Italian territory: COSMO I2 (Steppeler et al., 2003) is an operational weather model that covers the entire national territory with a 2.2 km grid spacing. An up-scaling procedure was applied in order to compare two models operating at two different scales. A brief description of the fuzzy-logic method follows.

High-resolution meteorological models provide more realistic weather forecasts, with increasing reliability in intensity and spatial distribution; the problem linked with high-resolution forecasts is the "double penalty" error: a forecast may be very precise about when and where a particular event will happen, but if in reality the event is slightly displaced from the predicted time and place, an error occurs both where the event was expected and where it was not. Even a small displacement produces this error, weighing on the model's performance even though the event was in fact forecast: the event is captured at large scale, but the error captured at the small scale dominates the total error. Model performances are evaluated by comparing forecasts with observed events: it is worth remembering that data from observational networks suffer from up-scaling and interpolation errors, due to the high variability of precipitation fields, so a perfect match between forecast and observed gridded maps is not practically possible.

Fuzzy verification techniques require that the forecasts be in approximate agreement with the observations, meaning that forecasts are close in time, space, intensity or some other important aspect. The strength of agreement is measured as the closeness requirement is varied, assuming that it is acceptable for the forecast to be slightly displaced while remaining useful. The degree of neighbourhood measures how wide the displacement between observations and forecasts is allowed to be. Depending on the grid spacing, the time resolution and the meteorological situation, the window size can have a certain dimension: these techniques allow the neighbourhood size to vary, thereby providing information on forecast quality as a function of scale.

In order to describe the fuzzy techniques within a common verification framework, it is first necessary to establish some notation. Let X be the observed value in a grid box, and Y the corresponding forecast value for that same (single) grid box. < >s denotes any value representing the neighbourhood surrounding the grid box of interest, where s is the scale; an overbar indicates the average value. Many verification metrics are based on the agreement between forecast and observed events. A weather event might be the occurrence of a tornado or a thunderstorm; for the verification of rainfall or wind forecasts, an event is typically defined by the occurrence of a value exceeding a certain threshold, which is the convention used here. Let Ix be the indicator (1 = yes, 0 = no) for the observed event in a grid box, and Iy the indicator for the forecast event in the grid box. <Ix>s and <Iy>s are the indicators for the observed and forecast events in the neighbourhood; how the neighbourhood event <I>s is defined depends on the particular fuzzy verification method being used, as we shall see. Some methods require that <I>s be discrete, i.e. 0 or 1, while other methods allow intermediate values.

The assumption of equal likelihood of forecasts within the neighbourhood of a grid box lends itself well to a probabilistic interpretation. If we estimate the probability of an event at a grid box by the fraction of events within the window, we can then employ probabilistic verification metrics and methods. We denote by <Px>s the fraction of grid boxes in a neighbourhood with observed events and by <Py>s the fraction of the neighbourhood with forecast events:

$$\langle P_x \rangle_s = \frac{1}{n}\sum_{i=1}^{n} I_x(i), \qquad \langle P_y \rangle_s = \frac{1}{n}\sum_{i=1}^{n} I_y(i)$$

where n is the number of grid boxes in the neighbourhood. Finally, rather than comparing forecast and observed events, one could compute a scale-dependent error <E>s based on a comparison of the grid-box values <X>s and <Y>s in the window. The framework for fuzzy verification is quite simple.

1. Select a set of scales with indices s = 1, 2, ..., S and event intensity thresholds indexed by k = 1, 2, ..., K over which to compute the fuzzy verification results.

2. For each scale s:

a. For each observation (where we use this term loosely to mean either a point observation or an analysis grid box) collect the gridded forecasts within the window of scale s surrounding the observation. If the method is a ‘neighbourhood observation-neighbourhood forecast’ method, also collect the corresponding observations within the same window.

b. For each intensity threshold k, compute the scale-dependent quantities (<Ix>s, <Iy>s, <Px>s, <Py>s, <E>s)k according to various decision models, then compute the desired verification scores over the domain. This last step is the method used in this work.

The result is a K × S matrix of scores for each fuzzy verification method, with the scores varying in both intensity and scale. By examining the forecast performance in this way, one can determine the scale-intensity combinations at which the high-resolution forecast is useful. Fuzzy verification results can be aggregated over several forecasts, for example by month or season or across similar weather events, to look for systematic behaviour. Care should be taken to combine the verification results in an appropriate manner to represent the performance of the pooled data sample (WWRP/WGNE Joint Working Group on Verification, 2007).

All fuzzy techniques allow a multi-scale evaluation, so as to find the resolution at which the model is most reliable. The charts presented show on their axes the spatial scale (km) and the precipitation cumulated over the time interval (mm/3h).

The statistical indices used are the Fractions Skill Score (FSS), the Probability of Detection (POD) and the False Alarm Rate (FAR), presented in the following lines.

Fractions Skill Score

The main index summarizing the potential of a fuzzy verification is the Fractions Skill Score (FSS). The FSS directly compares forecast and observation (radar) over the portion of an area affected by an event (i.e. precipitation exceeding a certain threshold in the unit of time), gradually increasing the spatial dimension over which the verification is carried out.

$$\mathrm{FSS} = 1 - \frac{\frac{1}{N}\sum_{i=1}^{N}\left(P_{fcst,i} - P_{obs,i}\right)^2}{\frac{1}{N}\left(\sum_{i=1}^{N} P_{fcst,i}^2 + \sum_{i=1}^{N} P_{obs,i}^2\right)}$$

where N is the number of boxes present and P is the portion of each single box in which the event occurs (fcst = forecast, obs = observed).

• The FSS ranges from 0 (complete disagreement) to 1 (perfect agreement).

• The FSS equals 0 if events are forecast but none are observed or, vice versa, if observed events were not forecast at all.

• The FSS value above which the forecast is considered useful (better than random) is given by FSSuseful = 0.5 + fo/2, where fo is the average portion of the domain covered by the observed event. The smallest spatial window for which FSS ≥ FSSuseful is considered the "useful scale".

• As the spatial window grows, the index tends asymptotically to a value between 0 and 1: the closer this value is to 1, the less the forecast is affected by bias.

• The FSS is sensitive to rare events (i.e. intense precipitation peaks over limited areas).
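The FSS computation described above can be sketched as follows (a Python sketch; a plain moving-window loop with clipped edges stands in for an optimized implementation):

```python
import numpy as np

def fss(fcst, obs, threshold, window):
    """Fractions Skill Score over a square neighbourhood of `window`
    grid boxes per side, for a given rain threshold (mm)."""
    def fractions(field):
        ind = (field >= threshold).astype(float)
        ny, nx = ind.shape
        out = np.zeros_like(ind)
        h = window // 2
        # Plain moving-window mean of event indicators (edges clipped).
        for j in range(ny):
            for i in range(nx):
                sl = ind[max(0, j - h):j + h + 1, max(0, i - h):i + h + 1]
                out[j, i] = sl.mean()
        return out
    pf, po = fractions(fcst), fractions(obs)
    mse = np.mean((pf - po) ** 2)
    mse_ref = np.mean(pf ** 2) + np.mean(po ** 2)
    return 1.0 - mse / mse_ref if mse_ref > 0 else np.nan

# Identical fields -> perfect agreement, FSS = 1.
f = np.array([[0., 3.], [0., 5.]])
print(fss(f, f.copy(), threshold=1.0, window=3))  # 1.0
```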

Probability Of Detection

The Probability of Detection (POD) is the fraction of observed events that were correctly forecast (range: 0-1, perfect value = 1); it is also known as the hit rate (H).

False Alarm Rate

The False Alarm Ratio (FAR) is the proportion of forecast events that did not actually occur (range: 0-1, perfect value = 0).

These three indices have different meanings, indicating respectively the simple matching between QPE and QPF precipitation patterns, the probability of detecting a certain kind of event and, finally, the rate at which the model forecasts events that do not actually occur (Ebert, 2008).

Method for Object-based Diagnostic Evaluation (MODE)

Evaluating model performances on single events is another step towards understanding where and when the forecasts were less accurate with respect to the observed events: evaluating model performance over a wide time span is useful, but the analysis of single events is fundamental as well.

This was done using MODE, an algorithm that compares QPF and QPE by identifying precipitation structures in both forecast and observed fields and performing a spatial evaluation of the model's capability of reproducing the identified observed objects; considering some shape parameters, MODE produces indices such as centroid distance, angle difference, area ratio, symmetric difference and percentile intensity, together with some classical statistical scores:

 Frequency BIAS (FBIAS), the ratio of the frequency of forecast events to the frequency of observed events, indicates whether the forecast system tends to underforecast (FBIAS < 1) or overforecast (FBIAS > 1) events. FBIAS does not measure how well the forecast corresponds to the observations; it only measures relative frequencies;

 Probability of Detection Yes (PODY), is the fraction of events that were correctly forecasted to occur (range:0-1, perfect value=1);

 False Alarm Ratio (FAR), is the proportion of forecasts of the event occurring for which the event did not occur (range:0-1, perfect value=0);

 Critical Success Index (CSI), is the ratio of the number of times the event was correctly forecasted to occur to the number of times it was either forecasted or occurred (range:0-1, perfect value=1);

 Hanssen and Kuipers discriminant (HK), which measures the ability of the forecast to discriminate between (or correctly classify) events and non-events (range: -1 to 1, perfect value = 1);


• Heidke Skill Score (HSS) is a skill score based on Accuracy, where the Accuracy is corrected by the number of correct forecasts that would be expected by chance (range: -∞ to 1, perfect value = 1).

These statistical parameters were derived from a contingency table showing the frequency of "yes" and "no" rain forecasts and occurrences. The four combinations of forecasts (yes or no) and observations (yes or no) generate four different outcomes:

1) hit: rain forecast and occurred;
2) miss: rain not forecast but occurred;
3) false alarm: rain forecast but not occurred;
4) correct negative: rain not forecast and not occurred.

From these outcomes it is possible to compute the statistical scores with the following formulations, where a = hits, b = false alarms, c = misses and d = correct negatives:

FBIAS = (a + b) / (a + c)
PODY = a / (a + c)
FAR = b / (a + b)
CSI = a / (a + b + c)
HK = a / (a + c) - b / (b + d)
HSS = 2(ad - bc) / [(a + c)(c + d) + (a + b)(b + d)]
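These contingency-table scores can be sketched in a few lines of Python; the function name and the example counts below are illustrative, not taken from this work:

```python
def contingency_scores(hits, misses, false_alarms, correct_negatives):
    """Categorical verification scores from the 2x2 contingency table.

    hits: rain forecast and occurred; misses: rain occurred but not forecast;
    false_alarms: rain forecast but not occurred; correct_negatives: neither.
    """
    a, b, c, d = hits, false_alarms, misses, correct_negatives
    return {
        "FBIAS": (a + b) / (a + c),          # frequency bias
        "PODY": a / (a + c),                 # probability of detection (yes)
        "FAR": b / (a + b),                  # false alarm ratio
        "CSI": a / (a + b + c),              # critical success index
        "HK": a / (a + c) - b / (b + d),     # Hanssen-Kuipers discriminant
        "HSS": 2 * (a * d - b * c)
               / ((a + c) * (c + d) + (a + b) * (b + d)),  # Heidke skill score
    }

# Hypothetical counts: 80 hits, 20 misses, 30 false alarms, 870 correct negatives
scores = contingency_scores(80, 20, 30, 870)
# -> FBIAS = 1.1 (slight overforecasting), PODY = 0.8
```

Note that PODY, FAR and CSI ignore correct negatives entirely, while HK and HSS reward the correct rejection of non-events as well.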

Table 1. MODE indices description.

CEN DIST    Centroid Distance: provides a quantitative sense of the spatial displacement of the forecast (best score: 0).
ANG DIFF    Angle Difference: for non-circular objects, gives a measure of orientation errors (best score: 0).
AREA RATIO  Provides an objective measure of whether the areal extent of the forecast is over- or under-predicted (best score: 1).
INT AREA    Area of intersection between corresponding objects (best value: equal to the observed area).
UNION AREA  Total area of the two corresponding objects summed together (best value: equal to the observed area).
SYMM DIFF   Symmetric Difference: provides a good summary statistic of how well forecast and observed objects match (best value: small).
P90 INT     Provides an objective measure of the near-peak (90th percentile) intensities found in the objects (best score: 1).
TOTAL INTR. Summary of the derived statistics; values close to 1 are positive.
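A minimal sketch of how some of these geometric attributes can be computed for a single matched pair of objects, represented as boolean masks on the model grid (the function name and the toy fields are illustrative; MODE itself also applies convolution, thresholding and fuzzy-logic matching not shown here):

```python
import numpy as np

def object_attributes(fcst_mask, obs_mask):
    """Geometric comparison of one forecast and one observed object.

    Both arguments are boolean 2-D arrays on the same grid.
    """
    def centroid(mask):
        ii, jj = np.nonzero(mask)
        return ii.mean(), jj.mean()

    fi, fj = centroid(fcst_mask)
    oi, oj = centroid(obs_mask)
    int_area = np.logical_and(fcst_mask, obs_mask).sum()
    union_area = np.logical_or(fcst_mask, obs_mask).sum()
    return {
        "CEN_DIST": np.hypot(fi - oi, fj - oj),          # displacement in grid points
        "AREA_RATIO": fcst_mask.sum() / obs_mask.sum(),  # >1: areal over-prediction
        "INT_AREA": int_area,
        "UNION_AREA": union_area,
        "SYMM_DIFF": union_area - int_area,              # non-overlapping area
    }

# Toy 10x10 fields: same square object, forecast shifted two grid points south
obs = np.zeros((10, 10), bool); obs[2:6, 2:6] = True
fc = np.zeros((10, 10), bool); fc[4:8, 2:6] = True
attrs = object_attributes(fc, obs)
# -> CEN_DIST = 2.0, AREA_RATIO = 1.0: right size, wrong place
```

This pure-displacement case illustrates why object-based scores are informative at high resolution: a gridpoint-by-gridpoint comparison would count the same shifted object twice, as both misses and false alarms.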

This was done by choosing a region on the western side of Northern Italy, Liguria, and focusing on specific events within the considered time domain that caused the issuing of second-risk-level weather alerts ("orange" alerts), in particular:

• January 7th, threshold 72 mm

• August 24th, threshold 48 mm

• October 7th, threshold 72 mm

• October 28th, threshold 72 mm

For these events we made a comparison between the forecast data (WRF, 1.5 km Open Loop) and the rain gauge-radar merging for 48-hour-long time series of the maximum cumulative rain per hour, the areal average over every alert area on
