
Dipartimento di Ingegneria dell’Energia, dei Sistemi, del

Territorio e delle Costruzioni

Corso di Laurea in Ingegneria Elettrica

Master's Degree Thesis

Hydrocontroller: a monitoring system of

the hydro resource based on statistical

approaches and artificial intelligence

ADVISOR

Prof. Mauro TUCCI

CANDIDATE

Riccardo PICHI

COMPANY TUTOR

i-EM S.r.l. - PhD Fabrizio RUFFINI

Academic Year 2019-2020


Abstract

In this thesis a new basin modelling approach is proposed and analysed. Traditional physical model applications often struggle with parameter evaluation, as they require a large measurement campaign in the geographic area of study. Black Box modelling can in general overcome this problem. Different kinds of data sources have been used (weather stations, satellite observations and reanalysis datasets) with the aim of predicting Soil Moisture Concentration (SMC) in the first place and the basin level in the second place. The large distance between weather stations and the basin inhibits direct data use; for this reason spatial interpolators have been applied to improve the geographical resolution of the weather station data. Soil Moisture Concentration maps of the area are available via satellite data, but with low availability due to satellite orbit movement. To improve this aspect, a neural network ensemble has been trained to predict the mean SMC value over the area of interest using the previously improved weather data as input (air temperature, air humidity, precipitation and wind speed). In the final part of the thesis a Non-linear Auto Regressive eXogenous input (NARX) model has been trained to simulate the basin level. Weather data, the predicted SMC, and basin data (turbine flow rate and discharge flow rate) have been considered as inputs. Simulations with different delays have been made, obtaining satisfactory results.


Contents

1 Introduction 8

2 Hydrology 9

2.1 Hydroelectric Basin . . . 10

2.2 Garfagnana hydroelectric system . . . 14

2.3 Garfagnana hydrological system . . . 15

3 Weather Data 19

3.1 Reanalysis Dataset . . . 19

3.2 Satellite Observations . . . 20

3.3 Ground Data . . . 24

4 Statistical Modelling and Machine Learning Algorithms 27

4.1 Spatial Interpolators . . . 27

4.1.1 Kriging . . . 27

4.1.2 Radial Basis Function . . . 31

4.1.3 Natural Neighbour Interpolator . . . 34

4.2 Machine Learning Algorithms - Neural Networks . . . 35

4.2.1 Nonlinear Autoregressive Exogenous Model . . . 41

4.3 Model Selection . . . 43

4.3.1 K-Fold Cross Validation . . . 43

4.3.2 Leave One Out Cross Validation . . . 44

5 Case Study - Hydrocontroller Project 46

5.1 Hydrocontroller Project - Description . . . 46

5.2 Project Dataset . . . 48

5.2.1 Weather Stations Data . . . 48

5.2.2 Satellite Data . . . 50

5.2.3 Reanalysis Dataset - Era Interim . . . 51

5.2.4 Pontecosi Basin Data . . . 52

5.3 Interpolators implementation . . . 52

5.4 Soil Moisture modelling . . . 58

5.5 Basin modelling . . . 59

5.5.1 Basin Model - 1 hour ahead . . . 59

5.5.2 Basin Model - Extended Forecast . . . 60

5.6 Future developments . . . 63


7 Appendix A - Interpolators 65

7.1 Description . . . 65

7.2 Kriging Temperature . . . 67

7.3 Kriging Air Humidity . . . 68

7.4 Kriging Precipitation . . . 69

7.5 Kriging Wind Speed . . . 70

8 Appendix B - SMC Modelling 71

8.1 Description . . . 71

8.2 Input Data . . . 71

8.2.1 Temperature . . . 71

8.2.2 Air Humidity . . . 73

8.2.3 Precipitation . . . 74

8.2.4 Wind Speed . . . 75

8.3 Target Data - Soil Moisture Concentration . . . 76

8.4 Correlation Matrix . . . 77

8.5 Model Selection . . . 77

8.6 Results - Ensemble NN . . . 79

9 Appendix C - Pontecosi Plant 80


List of Figures

2.1 General Description of a hydromodel . . . 10

2.2 Example of a Seasonal tank: Vagli Lake . . . 11

2.3 Example of Run-of-River plant: Borgo a Mozzano . . . 12

2.4 Example of chronological flow discharge diagram of a basin . . . . 12

2.5 Example of a duration diagram of a basin . . . 12

2.6 Scheme of a High Head plant . . . 13

2.7 Scheme of a Low Head plant . . . 13

2.8 Hydro system in the area of Garfagnana . . . 14

2.9 Serchio flooding in 2009 . . . 15

2.10 Examples of hydrodynamic curve . . . 16

2.11 Serchio basin: overview . . . 17

2.12 Pontecosi Basin . . . 17

2.13 Pontecosi location in Garfagnana . . . 18

3.1 Observation systems used by ECMWF . . . 20

3.2 Sentinel-1 . . . 20

3.3 Landsat-8 . . . 21

3.4 Satellite Image - Soil Moisture Measurement . . . 22

3.5 Acquisition Modes of the Sentinel-1 . . . 23

3.6 Example of a rain sensor plus anemometer and transmitting system 24

3.7 Example of Pt100 (RTD) . . . 24

3.8 Example of NTC . . . 25

3.9 Comparison between RTD and NTC curves . . . 25

3.10 Example of pressure sensor . . . 25

3.11 Example of air humidity sensor . . . 26

3.12 Example of anemometer with speed and direction measurements . . 26

4.1 Example of an empirical variogram . . . 28

4.2 Main characteristics of a fitted variogram . . . 29

4.3 Example of a covariogram . . . 29

4.4 An example of an RBF Mesh . . . 32

4.5 Example of RBF Contour Plot . . . 33

4.6 Different Gaussian Function with different shape parameter . . . 33

4.7 Sibson weights method . . . 35

4.8 Laplace weights method . . . 35

4.9 Human neuron structure . . . 36

4.10 Neural Net neuron . . . 36

4.11 A multi-layer feedforward network . . . 37


4.13 Three different behaviour . . . 39

4.14 Validation set method . . . 39

4.15 An example of sigmoid - Logistic curve . . . 40

4.16 Hyperbolic tangent curve . . . 41

4.17 ReLu Curve . . . 41

4.18 An example of a NARX network . . . 42

4.19 Open Loop Setup . . . 42

4.20 Closed Loop Setup . . . 43

4.21 K-Fold Cross validation . . . 44

4.22 Leave One Out Cross Validation . . . 45

5.1 Hydrocontroller Project . . . 47

5.2 Weather Stations Location . . . 49

5.3 An example of SMC Satellite Image (greyscale) - IFAC . . . 50

5.4 UTM Zones . . . 51

5.5 Pontecosi Area - SMC . . . 52

5.6 Basin Level . . . 53

5.7 Turbine Flow Rate . . . 53

5.8 Discharge Flow Rate . . . 54

5.9 Level histogram . . . 54

5.10 Turbine Flow Rate histogram . . . 55

5.11 Discharge Flow Rate event . . . 55

5.12 Level during discharge flow rate . . . 56

5.13 Example of interpolator surface - Air Humidity . . . 57

5.14 Example of interpolator contour plot - Air Humidity . . . 57

5.15 Scheme of Cooperative Ensemble . . . 59

5.16 NARX - 1 Hour Ahead . . . 60

5.17 1 Hour Ahead - Level Test Result . . . 61

5.18 MAE obtained with various delay parameter values . . . 61

5.19 6 Hour Ahead - Level Test Result . . . 62


List of Tables

2.1 Main characteristics of Serchio basin - Autorità di Bacino del Fiume Serchio . . . 16

4.1 Main fitting function . . . 28

5.1 Example of data quality table . . . 49


Acronyms

AM Aeronautica Militare. 50
CV Cross Validation. 43
ECMWF European Centre for Medium-Range Weather Forecasts. 51
EF Environmental Flow. 48
EW Extra-Wide swath. 23
i-EM Intelligence in Energy Management. 46
IFAC Istituto di Fisica Applicata Nello Carrara. 50
IFS Integrated Forecast System. 51
IR InfraRed. 20
IW Interferometric Wide swath. 23
LOOCV Leave-One-Out Cross Validation. 43, 44
MAE Mean Absolute Error. 43, 44
ML Machine Learning. 9, 19
MSE Mean Square Error. 37
MSPE Mean Square Prediction Error. 30
MW MicroWave. 20
NARX Non-linear Auto Regressive eXogenous input. 1, 8, 41, 64
NN Neural Network. 35
NTC Negative Temperature Coefficient. 24
NWP Numerical Weather Prediction. 19
OLI Operational Land Imager. 21


PWS Personal Weather Stations. 49
RBF Radial Basis Function. 31
ReLu Rectified Linear units. 40
RTD Resistance Thermal Detector. 24
SAR-C Synthetic Aperture Radar (C-band). 22
SM Stripmap. 23
SMC Soil Moisture Concentration. 46
TIRS Thermal Infra-Red Sensor. 21
UPS Universal Polar Stereographic. 50
UTM Universal Transverse Mercator. 50
WM Wave Mode. 23


Chapter 1

Introduction

Water is one of the most important resources in the world, and monitoring rivers, basins and seas is a very important challenge. Water uses differ widely from one another, from agricultural utilization to electric energy production. Wrong management can lead to natural calamities such as floods and dry rivers. From an energy production point of view, basin level and turbine water flow are deeply linked. In addition, weather conditions can influence not only the basin level but also the plant operations. The Garfagnana valley hydroelectric system is composed of several basins and power plants connected to each other. Managing the water resource all over the valley is a complex task: knowing the basin level in advance would be of great help. This is the principle of the Hydrocontroller project: to create a decision support platform for the Pontecosi basin in Garfagnana. In this thesis an archive version of the platform is discussed. The archive version starts on 3/6/2017 and ends on 27/02/2019. Weather variable data are provided by several weather stations located in the northwestern area of Tuscany. Spatial interpolators help to obtain more detailed information about the weather variables in the Pontecosi area. Satellite data have also been used in this project: soil moisture maps of the region of interest have been studied and managed, and data imputation has been performed due to the lack of observations. Weather variables together with basin operation data contribute to create a NARX model which can successfully forecast the basin level from 1 to 6 hours ahead.

In chapter 2 hydrological modelling together with the Garfagnana hydroelectric system is briefly described.

In chapter 3 the different data sources used in this thesis are discussed. In chapter 4 the statistical and Machine Learning tools used are explained. In chapter 5 the Hydrocontroller project implementation is described in detail.


Chapter 2

Hydrology

This chapter discusses how the water resource can be studied and how it can be used to produce electrical energy; a short review of hydroelectric power plants follows. The Garfagnana regional hydrological system is briefly described.

Hydrology is the study of the hydro resource in many environments, such as seas, lakes, rivers and groundwaters. The aim of this part of Environmental Science is to describe the circulation of water in a specific area: to achieve this goal, Hydrology considers natural events such as rainfall, runoff, drought, and flood [1]. Hydrological modelling provides the basic laws and resolution algorithms for each of the following phenomena:

- Rainfall
- Evapotranspiration
- Infiltration
- Runoff
- Subsurface flow

One possible general classification of hydrological models is [2]:

- Physical Models
- Grey Box Models
- Black Box Models

Physical models are the oldest ones: they aim to be an ideal representation of the real phenomenon. They use physical laws for each aspect of the hydrology, and the defined state variables are measurable and physically defined. They require the evaluation of many parameters, typically using empirical laws.

Black Box models use Machine Learning (ML) tools to describe and predict the flows at specified points of the hydro-area (river, basin, lake, etc.). Black Box models are statistical models which require a certain amount of data to be generated [4].


Figure 2.1: General Description of a hydromodel

Grey Box models can be considered as a mix of the previous two: physical equations are used together with statistical models. They require the evaluation of fewer parameters than Physical models [3].

The need for hydrological models is justified by the environmental challenges that are nowadays constantly in the spotlight. Knowing the status of the water resource can be crucial to avoid natural calamities.

The hydro resource can be used in many ways:

- Management of flood and dry periods
- Drinkable uses
- Farming watering
- Industrial processes
- Power plant cooling uses
- Fishing and tourism
- Fire extinguisher reservoir
- Electrical Energy Production

2.1 Hydroelectric Basin

The water collected in a lake can be seen as an opportunity to create a hydroelectric power plant. Hydroelectric power plants use gravitational potential energy to induce the mechanical movement of the turbine. Depending on the orography of the area, hydroelectric plants can be divided into:


Figure 2.2: Example of a Seasonal tank: Vagli Lake

- Modulation Basin
- Run-of-River

The kind of plant realized depends not only on the orography of the site but also on energetic considerations: various producibility diagrams can be analyzed. The most common one is the chronological flow discharge diagram (Figure 2.4): it displays the flow discharge of the basin over a certain year. It clearly represents not only the amount of water but also how the seasonality of the water resource occurs in the area.

Another useful diagram is the duration diagram (Figure 2.5): it represents the amount of time for which a certain flow discharge has occurred. This tool is very useful for forecasting the producibility of a hypothetical hydro site.
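A duration diagram is obtained by simply sorting the flow series in descending order. A minimal sketch follows; the flow series is synthetic (randomly generated), since no gauge data is given here:

```python
import numpy as np

# Hypothetical hourly flow-discharge series for one year (m3/s);
# in practice this would come from gauge measurements.
rng = np.random.default_rng(0)
flows = rng.gamma(shape=2.0, scale=20.0, size=8760)

# Duration diagram: sort flows in descending order; the x-axis then reads
# "number of hours for which this discharge was equalled or exceeded".
duration_curve = np.sort(flows)[::-1]
hours_exceeded = np.arange(1, len(duration_curve) + 1)

# e.g. the discharge available for at least 8000 hours of the year:
q_8000 = duration_curve[7999]
```

Plotting `duration_curve` against `hours_exceeded` reproduces the shape of Figure 2.5.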

Based on these diagrams and more detailed analyses, the hydroelectric power plant can assume various forms. If the height difference between the basin and the plant is greater than 200 m, a High Head plant is the best solution; for heights between 20 m and 200 m a Medium Head plant is realized, and for lower heights a Low Head plant.

The difference in height between the three setups leads to a long or short penstock: this strongly influences the powerhouse construction and turbine selection.

The main law which rules the producibility of a hydroelectric basin comes from the Bernoulli equation applied to the inlet section and to the outlet of the penstock.

p + ρu²/2 + ρgh = constant    (2.1)

with


Figure 2.3: Example of Run-of-River plant: Borgo a Mozzano

Figure 2.4: Example of chronological flow discharge diagram of a basin


Figure 2.6: Scheme of a High Head plant

Figure 2.7: Scheme of a Low Head plant

- p is the pressure on the considered area
- ρ is the density of the water
- u is the speed of the water
- g is the gravitational constant
- h is the distance between the considered area and a common reference

If the kinetic term can be neglected, the equation assumes this compact form

p1 + ρgh1 = p2 + ρgh2    (2.2)

ρg(h1 − h2) = p2 − p1    (2.3)

So the higher the head ∆H = h1 − h2, the higher the pressure at the turbine inlet and the higher the specific energy that can be transformed into mechanical work. Multiplying each side of the equation by the flow rate Q,

P = ρgQ∆H (2.4)

with P as power production.

This fundamental equation expresses an important aspect: a low head plant can produce as much as a high head plant if it exploits a bigger flow rate.
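Equation (2.4) can be checked numerically; the flow and head values below are illustrative, not taken from a real plant:

```python
# Power from the fundamental equation P = rho * g * Q * dH.
RHO = 1000.0   # water density, kg/m^3
G = 9.81       # gravitational acceleration, m/s^2

def hydro_power(flow_rate_m3s: float, head_m: float, efficiency: float = 1.0) -> float:
    """Ideal (or efficiency-scaled) hydraulic power in watts."""
    return efficiency * RHO * G * flow_rate_m3s * head_m

# A low-head plant with a large flow matches a high-head plant with a small one:
p_high_head = hydro_power(flow_rate_m3s=5.0, head_m=400.0)   # 19.62 MW
p_low_head = hydro_power(flow_rate_m3s=100.0, head_m=20.0)   # 19.62 MW
```

Both plants exploit the same product Q·∆H = 2000 m⁴/s, hence the same ideal power.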

A flow rate classification can be expressed as:

- Small flow → Q < 10 m3/s
- Medium flow → 10 m3/s < Q < 100 m3/s


Figure 2.8: Hydro system in the area of Garfagnana

2.2 Garfagnana hydroelectric system

The presence of a large amount of hydro resource, and its varied distribution over the valley area, has influenced the realization of a complex system of rivers, basins and power plants.

The production of a specific power plant depends on the production of all the other power plants upstream. For example, if an upstream plant does not use its turbine for a certain period, the downstream basin will receive less water than usual. On the other hand, the upstream basin will increase its water level.

Optimizing the hydro resource all over the Garfagnana valley is one of the goals of the owner of these plants. This kind of optimization is very complex because it requires a high level of coordination between all the power plants of the area. The other main goal is to prevent the risk of flooding: the management of the hydro basins and plants must be handled in a way that is safe for the surrounding area. Knowing in advance the increase of water levels can help in taking the right decisions and avoiding flooding.

From a production point of view, having different basins at different altitudes requires a deep study focused on optimizing the global hydro resource. Higher basins can influence lower altitude basins: this depends on the water flow processed by each power plant. Knowing the basin status and predicting it can strongly help the optimization process.

Another useful diagram is the hydrodynamic curve. On the horizontal axis of the diagram the surface of the basin is represented; on the vertical axis, the corresponding altitude. The red line in Figure 2.10 represents the hydrodynamic curve. The surface of the basin can be converted into a hypothetical flow discharge if the geomorphology of the basin is known.


Figure 2.9: Serchio flooding in 2009

The diagram in Figure 2.10 gives a preliminary result on a possible production setup: a first inlet penstock section is created at 2000 m and its corresponding power plant is located at 1000 m. The grey area (1) between the two altitudes is called the hydrodynamic value: it is representative of the potential production of the power plant. This procedure can be repeated with a new inlet penstock section and a new power plant placed at a lower altitude than the first ones. The white areas between the red curve and the hydrodynamic value represent an amount of energy not used. The final goal of this analysis is to maximize the quantity of energy available in the hydrological system considering economical and environmental constraints.

2.3 Garfagnana hydrological system

The Garfagnana hydrological system is, together with the Arno basin, the most important in the Tuscany region: the main river is the Serchio (111 km), whose source is located on Monte Sillano (1864 m). The Serchio flows through the Garfagnana valley until it reaches the Tyrrhenian Sea. The most important tributaries are:


Figure 2.10: Examples of hydrodynamic curve

- Serchio di Gramolazzo
- Lima
- Turrite Secca
- Edron

Characteristic                   Value
Total Surface                    1565 km2
Total Catchment Surface          1408 km2
Total height of rain (1 year)    1946 mm
Maximum flow                     2200 m3/s
Average flow                     46 m3/s
Minimum flow                     4 m3/s

Table 2.1: Main characteristics of Serchio basin - Autorità di Bacino del Fiume Serchio

This thesis focuses on the Pontecosi basin, which is located in the upper area of Garfagnana. It is an artificial lake with a 30-meter dam which serves as a hydro tank for the Castelnuovo hydroelectric power plant, located close to the lake. The Castelnuovo power plant is the third biggest in the area, which justifies the importance of an accurate study of the hydro resource. Detailed basin data are in Appendix C.


Figure 2.11: Serchio basin: overview


Chapter 3

Weather Data

In this chapter three different kinds of weather data are described: Reanalysis datasets, Satellite observations and Weather Station measurements.

Studying the weather using different sources can help to gain a deep comprehension of the environmental variables.

3.1 Reanalysis Dataset

Reanalysis datasets were initially created to improve the performance of Numerical Weather Prediction (NWP) models. NWP models need data about the initial state of weather variables: reanalysis data can provide them.

Reanalysis datasets are generated by several data assimilation schemes and models. Data assimilation is a class of techniques which mixes different data sources in order to obtain an output with less uncertainty than the original data. In this case data assimilation uses ML algorithms instead of physical laws: this is mainly pushed by the complexity of the laws which rule all the weather variables. In addition, it would be necessary to know several parameters that are difficult to estimate. Reanalysis is applied to a historical dataset of weather observations (e.g. weather stations, satellite observations, aerostatic balloons, . . . ): so the output is influenced by various data sources.

Using a historical dataset means that the reanalysis dataset is related to past periods.

Reanalysis is usually made by an ensemble of ML algorithms, which is periodically retrained. Using an ensemble improves the performance of the reanalysis compared to using only one ML algorithm.


Figure 3.1: Observation systems used by ECMWF

Figure 3.2: Sentinel-1

3.2 Satellite Observations

Nowadays, satellite observations are widely used in many fields: from environmental studies to military use. Satellite observations can provide data all over the world without using a local measurement station, thanks to their orbit around the Earth. In this thesis, the main focus is on weather observations. Each satellite has on-board instruments which are able to measure weather variables in a specific way. The main physical principles used are InfraRed (IR) and MicroWave (MW), in addition to the traditional visible imager instruments. Below, the on-board instruments of the Landsat-8 and Sentinel-1 satellites and the main variables measured are listed. [10]


Figure 3.3: Landsat-8

Landsat-8

Operational Land Imager (OLI)

- Snow cover
- Soil moisture at surface
- Soil type
- Biomass
- Fraction of Absorbed PAR (FAPAR)
- Fraction of vegetated land
- Land cover
- Leaf Area Index (LAI)
- Normalised Difference Vegetation Index (NDVI)
- Vegetation type

Thermal Infra-Red Sensor (TIRS)

- Integrated Water Vapour (IWV)
- Fire fractional cover
- Land surface temperature
- Sea surface temperature
- Cloud cover
- Cloud top height
- Cloud top temperature
- Fire radiative power


Figure 3.4: Satellite Image - Soil Moisture Measurement

- Fire temperature

Sentinel-1

Synthetic Aperture Radar (C-band) (SAR-C)

- Dominant wave direction
- Dominant wave period
- Fraction of vegetated land
- Land cover
- Oil spill cover
- Snow cover
- Snow status (wet/dry)
- Snow water equivalent
- Soil moisture at surface
- Wave directional energy frequency spectrum

Two main aspects of satellite observations are spatial and time resolution. Spatial (horizontal) and vertical resolutions define the 3D grid where the variables are acquired. So, each point of the grid has a unique value for that variable. Time


resolution refers to how wide the period is between one observation and the following one over the same area. This parameter depends mainly on the orbit of the satellite. In addition, satellite acquisition can be performed in various ways [11]:

Sentinel-1 Acquisition Modes

- Stripmap (SM)
- Interferometric Wide swath (IW)
- Wave Mode (WM)
- Extra-Wide swath (EW)

Figure 3.5: Acquisition Modes of the Sentinel-1

The chosen acquisition mode determines the output data resolutions (space and time). The satellite acquisition data are usually stored in .tif or .geotiff format. This kind of file contains all the information regarding:

- Name of the Satellite
- Datetime of the observations
- Number and type of layers (variables)
- Coordinate system type
- Coordinate limits

This information is necessary to manage the file and to filter the data over a specific region: each pixel in the image has a corresponding value of the layer (variable).


Figure 3.6: Example of a rain sensor plus anemometer and transmitting system

Figure 3.7: Example of Pt100 (RTD)

3.3 Ground Data

Weather stations are the most direct way to measure a weather variable: usually they are very complex sensor systems with not only the sensing element but also a data acquisition and a transmitting system. High quality weather stations are expensive products, so they are usually installed at key points of the area to be observed.

Weather stations are equipped with different sensors, depending on the variable acquired.

Temperature sensors Commercial temperature sensors belong to two main categories: Resistance Thermal Detector (RTD) and Negative Temperature Coefficient (NTC). RTDs consist of a platinum resistance which varies linearly with temperature. Platinum is the selected material because it has better response linearity than other, cheaper materials like copper or aluminium. Commercial RTDs are the Pt100 and Pt1000, with the number standing for the resistance value at 0°C. NTCs are semiconductors whose resistance decreases when temperature increases. They have a hyperbolic-like R-T curve and lower accuracy than RTDs; on the other hand, NTCs are cheaper than RTDs.
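The linear RTD characteristic described above can be inverted to turn a resistance reading into a temperature. A minimal sketch follows; the linear model R(T) = R0(1 + αT) with the standard mean coefficient α = 0.00385/°C is an approximation (high-accuracy work uses the quadratic Callendar-Van Dusen equation), and the sample reading is hypothetical:

```python
# Linear approximation of the Pt100 characteristic: R(T) = R0 * (1 + alpha * T).
R0 = 100.0       # Pt100 resistance at 0 degC, ohms
ALPHA = 0.00385  # standard mean temperature coefficient, 1/degC

def pt100_temperature(resistance_ohm: float) -> float:
    """Convert a measured Pt100 resistance to temperature (degC)."""
    return (resistance_ohm / R0 - 1.0) / ALPHA

t = pt100_temperature(107.7)  # a hypothetical reading of 107.7 ohm -> 20.0 degC
```

For a Pt1000, only `R0` changes; the coefficient stays the same.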

Barometric pressure sensors Several different pressure sensors can be found on the market, based on different physical principles. Commercial sensors often belong to the following categories:


Figure 3.8: Example of NTC

Figure 3.9: Comparison between RTD and NTC curves

- Piezoresistive strain gauge
- Capacitive elements
- Hall effect devices
- Piezoelectric devices

All of these directly measure the force applied to the sensing element: by dividing it by the contact surface area, the pressure is obtained. Depending on the category, they may need an amplifier system.


Figure 3.11: Example of air humidity sensor

Figure 3.12: Example of anemometer with speed and direction measurements

Air humidity sensors These sensors are often capacitive elements whose capacitance varies linearly with the amount of H2O dissolved in the air. Usually they provide a relative air humidity percentage, and more accurate models use a temperature sensor to calibrate the reading.

Wind Sensor - Anemometer Another classical weather instrument is the anemometer, which is composed of a rotating system (often cup-shaped) and a rotation speed sensor in its base. It can be equipped with another rotating system which measures the direction of the wind via a position sensor.


Chapter 4

Statistical Modelling and Machine Learning Algorithms

In this chapter the statistical tools used in this thesis are described. Spatial interpolators have been used to describe weather variables over a wide area. Machine Learning algorithms are generally described: a more specific focus will be given in the next chapter.

4.1 Spatial Interpolators

Pointwise data like those Weather Stations (WS) provide cannot efficiently describe a weather variable over a wide area. Determining a mean value for a specific area by simply averaging the values measured by the WS can lead to big estimation errors. One of the recommended ways is to use a Spatial Interpolator: a mathematical tool which can express a relationship between the measured variable and the points between the measurement sites. There is a large variety of spatial interpolators; three of them are described in this thesis.

- Kriging [16]
- Radial Basis Function [17]
- Natural Neighbour [18]

In the following, the measurement sites are defined as “sample points”.

4.1.1 Kriging

The term “Kriging” comes from Danie Krige, a South African engineer who first developed this particular method. Kriging is used in different versions and methods: in this thesis Ordinary Kriging is discussed.

The main tool of Kriging is the variogram, a graph where the spatial variance between sample points is represented. Spatial variance is defined [15] as follows:

γ(h) = (1 / n(h)) Σ_{i=1}^{n(h)} (z(x_i + h) − z(x_i))²    (4.1)
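Equation (4.1) can be sketched directly in code: station pairs are grouped into distance classes, and the squared value differences are averaged per class. The station coordinates and values below are hypothetical:

```python
import numpy as np

def empirical_variogram(coords, values, bin_edges):
    """Empirical variogram per Eq. (4.1): for each distance class h,
    average the squared value differences of the station pairs in that class."""
    n = len(values)
    dists, sq_diffs = [], []
    for i in range(n):
        for j in range(i + 1, n):
            dists.append(np.linalg.norm(coords[i] - coords[j]))
            sq_diffs.append((values[i] - values[j]) ** 2)
    dists, sq_diffs = np.array(dists), np.array(sq_diffs)
    gamma = []
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        mask = (dists >= lo) & (dists < hi)
        gamma.append(sq_diffs[mask].mean() if mask.any() else np.nan)
    return np.array(gamma)

coords = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])  # three hypothetical stations
values = np.array([0.0, 1.0, 3.0])
gamma = empirical_variogram(coords, values, np.array([0.5, 1.5, 2.5]))  # -> [2.5, 9.0]
```

Here n(h) is taken as the number of pairs falling in each distance class, matching the averaging in (4.1).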


Figure 4.1: Example of an empirical variogram

where

- h is a class of distance
- n(h) is the number of sample point pairs in a certain class of distance
- x_i is a sample point in a certain class of distance
- z(x_i) is the value of the variable at the sample point x_i
- γ is the spatial variance

After building the empirical variogram, the next step is to choose the function that will fit it best: this is the key point of this method. There is no specific way to know in advance whether a function will fit well, because it depends not only on the variable selected but also on its spatial distribution. [15]

Name         Function
Exponential  γ(h) = C0 + C1 (1 − e^(−h/θ))
Gaussian     γ(h) = C0 + C1 (1 − e^(−h²/θ²))
Spherical    γ(h) = C0 + C1 ((3/2)(h/θ) − (1/2)(h/θ)³)

Table 4.1: Main fitting functions

As can be seen in the table, the typical function expressions contain two unknown parameters C0 and C1, the distance h and a shape parameter θ. The shape parameter θ (taken as a positive number) is intended as a free choice of the user: different θ values will lead to different C0 and C1 parameters. C0 and C1 are determined, after selecting θ, by solving a least squares problem.
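Since the model is linear in C0 and C1 once θ is fixed, the least squares step reduces to an ordinary linear problem. A sketch for the exponential model, checked on synthetic variogram values generated from the model itself:

```python
import numpy as np

def fit_exponential_variogram(h, gamma_emp, theta):
    """Given a user-chosen shape parameter theta, find C0 and C1 of
    gamma(h) = C0 + C1 * (1 - exp(-h / theta)) by linear least squares
    (the model is linear in C0 and C1)."""
    basis = 1.0 - np.exp(-h / theta)
    A = np.column_stack([np.ones_like(h), basis])  # design matrix [1, basis]
    (c0, c1), *_ = np.linalg.lstsq(A, gamma_emp, rcond=None)
    return c0, c1

# Synthetic check: data generated from the model is recovered exactly.
h = np.linspace(0.1, 10.0, 50)
gamma = 0.5 + 2.0 * (1.0 - np.exp(-h / 3.0))
c0, c1 = fit_exponential_variogram(h, gamma, theta=3.0)  # ~ (0.5, 2.0)
```

In practice `gamma` would be the empirical variogram values and several θ choices would be compared.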

Whatever function and parameters are chosen, a fitted variogram should appear similar to the one in Figure 4.2. Three principal descriptive characteristics can be identified: the sill, which is the maximum value (or close to it) of γ(h); the range, which is the distance at which the sill occurs; and the nugget, which can be seen as a bias. Theoretically γ(0) = 0 at h = 0, but at an infinitely small distance we can often observe a nugget effect: this is caused by measurement errors. Instead of


Figure 4.2: Main characteristics of a fitted variogram

Figure 4.3: Example of a covariogram

spatial variance, it is better to express it in the form of spatial covariance. According to [13] the relationship is

C(x_i, x_j) = sill − γ(x_i, x_j)    (4.2)

Kriging predictor The basic form of the Kriging predictor is a linear combination of the values vector with a weights vector to be determined:

ẑ(x_0) = Σ_{i=1}^{n} λ_i z(x_i)    (4.3)

with

- x_0 is the point where the prediction is made
- ẑ(x_0) is the estimated value of the variable at x_0
- x_i is a sample point


- λ_i is a coefficient (weight)
- n is the number of sample points

When the weights vector is known, the value at an unknown point can be determined with the equation above.

Determining the weights vector Ordinary Kriging is intended as an exact interpolator: this means that the estimated value at a sample point location must be the input value. This is also called an unbiased predictor. The weights vector is determined by minimizing the Mean Square Prediction Error (MSPE) [12], which can be expressed as:

σ_e² = Var(z_0 − ẑ_0)    (4.4)

Var(z_0 − ẑ_0) = Var(z_0) + Var(ẑ_0) − 2 Cov(ẑ_0, z_0)    (4.5)

σ_e² = σ² + Σ_{i=1}^{n} Σ_{j=1}^{n} w_i w_j C_ij − 2 Σ_{i=1}^{n} w_i C_i0    (4.6)

One of the most used methods to minimize this function is the method of Lagrange multipliers. The constraint is expressed by the unbiased predictor condition, which can be written as \sum_{i=1}^{n} w_i = 1. So the Lagrangian equation is:

L = \sigma^2 + \sum_{i=1}^{n}\sum_{j=1}^{n} w_i w_j C_{ij} - 2\sum_{i=1}^{n} w_i C_{i0} + 2\lambda\left(\sum_{i=1}^{n} w_i - 1\right) \quad (4.7)

Calculating the partial derivatives with respect to the weights, the resulting system is:

2\sum_{j=1}^{n} w_j C_{1j} - 2C_{10} + 2\lambda = 0
\vdots
2\sum_{j=1}^{n} w_j C_{nj} - 2C_{n0} + 2\lambda = 0
2\left(\sum_{j=1}^{n} w_j - 1\right) = 0

This system can be expressed in matrix notation as:

[C_{ij}] = \begin{bmatrix} C_{11} & \cdots & C_{1n} & 1 \\ \vdots & & \vdots & \vdots \\ C_{1n} & \cdots & C_{nn} & 1 \\ 1 & \cdots & 1 & 0 \end{bmatrix}
\quad
[w] = \begin{bmatrix} w_1 \\ \vdots \\ w_n \\ \lambda \end{bmatrix}
\quad
[C_{j0}] = \begin{bmatrix} C_{10} \\ \vdots \\ C_{n0} \\ 1 \end{bmatrix}

[C_{ij}][w] = [C_{j0}] \quad (4.8)

The weights can be found by simply solving the system above, so the final expression is:

[w] = [C_{ij}]^{-1}[C_{j0}] \quad (4.9)
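As a concrete illustration, the system (4.8)-(4.9) can be solved numerically. The sketch below is a minimal Ordinary Kriging implementation in NumPy, assuming a hypothetical exponential covariance model C(h) = sill · exp(-h/range); any covariance fitted from the variogram step could be plugged in instead, and the station coordinates and values are made up.

```python
import numpy as np

def exp_cov(h, sill=1.0, rang=3.0):
    """Hypothetical exponential covariance model C(h) = sill * exp(-h / range)."""
    return sill * np.exp(-h / rang)

def ordinary_kriging(xs, zs, x0, cov=exp_cov):
    """Solve [C_ij][w] = [C_j0] (eqs. 4.8-4.9); return (z_hat(x0), weights)."""
    n = len(xs)
    # pairwise distances between sample points
    d = np.linalg.norm(xs[:, None, :] - xs[None, :, :], axis=-1)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = cov(d)
    A[n, n] = 0.0                      # Lagrange-multiplier corner of the matrix
    b = np.ones(n + 1)
    b[:n] = cov(np.linalg.norm(xs - x0, axis=-1))
    sol = np.linalg.solve(A, b)
    w = sol[:n]                        # kriging weights; sol[n] is lambda
    return float(w @ zs), w

xs = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])   # made-up station locations
zs = np.array([10.0, 12.0, 11.0])                     # made-up measured values
z_hat, w = ordinary_kriging(xs, zs, np.array([0.2, 0.2]))
```

Note that the weights sum to 1 (the unbiasedness constraint) and that predicting at a sample location returns the sample value itself, as expected from an exact interpolator.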

4.1.2

Radial Basis Function

Another spatial interpolation method uses Radial Basis Functions (RBF) as the interpolator. RBFs are often used as a variant of neural networks: instead of a sigmoid function, they use a Gaussian function, and the number of neurons corresponds to the number of sample points. A function g : \mathbb{R}^d \to \mathbb{R} depending only on the magnitude of its input is called radial [17]. In spatial interpolation the argument of the function usually relates to the physical distance between sample points: defining φ as an RBF, φ takes the same value for equal distances. As with Ordinary Kriging, RBFs are intended as exact interpolators, so the output values at the observation points equal the sample input values. This property can be expressed as:

z(x_i) = f_i, \quad i = 1, 2, \ldots, n \quad (4.10)

with n the number of sample point locations, x_i a sample point location and f_i the value of the variable at that location. The RBF interpolator is usually defined as a linear combination of distance functions evaluated at an unknown location x:

\hat{z}(x) = \sum_{i=1}^{n} \lambda_i \phi(||x - x_i||), \quad x \in \mathbb{R}^d \quad (4.11)

Based on (4.11), the exact interpolation condition results in

z(x_j) = \sum_{i=1}^{n} \lambda_i \phi(||x_j - x_i||) = f_j, \quad j = 1, 2, \ldots, n \quad (4.12)

In matrix form

[\phi] = \begin{bmatrix} \phi(||x_1 - x_1||) & \phi(||x_2 - x_1||) & \cdots & \phi(||x_n - x_1||) \\ \phi(||x_1 - x_2||) & \phi(||x_2 - x_2||) & \cdots & \phi(||x_n - x_2||) \\ \vdots & \vdots & \ddots & \vdots \\ \phi(||x_1 - x_n||) & \phi(||x_2 - x_n||) & \cdots & \phi(||x_n - x_n||) \end{bmatrix}
\quad
[\lambda] = \begin{bmatrix} \lambda_1 \\ \lambda_2 \\ \vdots \\ \lambda_n \end{bmatrix}
\quad
[f] = \begin{bmatrix} f_1 \\ f_2 \\ \vdots \\ f_n \end{bmatrix}

This can be summarized as:

[\phi][\lambda] = [f] \quad (4.13)

The distance matrix [\phi] is symmetric and can be inverted, so the weights vector [\lambda] can be easily determined as:

[\lambda] = [\phi]^{-1}[f] \quad (4.14)

Figure 4.4: An example of an RBF Mesh

Distance Function The most frequently used distance function is the Gaussian. In RBF applications it can be expressed compactly as:

\phi(r) = e^{-(\varepsilon r)^2} \quad (4.15)

Another common choice is the Inverse Multiquadric:

\phi(r) = \frac{1}{\sqrt{r^2 + \varepsilon^2}} \quad (4.16)
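A minimal sketch of the RBF machinery above, using the Gaussian basis of (4.15): the weights [λ] are obtained by solving (4.13), and the resulting interpolator reproduces the sample values exactly. The sample points and values below are made up for illustration.

```python
import numpy as np

def rbf_interpolator(xs, fs, eps=1.0):
    """Exact RBF interpolator with Gaussian basis phi(r) = exp(-(eps*r)^2).

    Solves [phi][lambda] = [f] (eqs. 4.13-4.14) and returns a callable
    that evaluates z_hat(x) = sum_i lambda_i * phi(||x - x_i||).
    """
    phi = lambda r: np.exp(-(eps * r) ** 2)
    d = np.linalg.norm(xs[:, None, :] - xs[None, :, :], axis=-1)
    lam = np.linalg.solve(phi(d), fs)          # weights vector [lambda]

    def predict(x):
        r = np.linalg.norm(xs - x, axis=-1)    # distances to the sample points
        return float(phi(r) @ lam)

    return predict

xs = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])  # made-up points
fs = np.array([1.0, 2.0, 3.0, 4.0])                              # made-up values
z_hat = rbf_interpolator(xs, fs)
```

Because the Gaussian kernel matrix is symmetric positive definite for distinct points, the solve always succeeds and the interpolation is exact at the samples.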


Figure 4.5: Example of RBF Contour Plot


4.1.3

Natural Neighbour Interpolator

The Natural Neighbour interpolation approach works on a similar principle to Kriging and RBF; the form of the interpolator is:

\hat{z}(x) = \sum_{i=1}^{n} w_i z(x_i) \quad (4.17)

with:
• \hat{z}(x) the predicted value at location x
• w_i a component of the weights vector
• z(x_i) a sample point value
• n the number of sample points

Determining weights There are two main methods for determining the weights: the Sibson and Laplace methods. Each of them can be applied both to the Delaunay triangulation and to the Voronoi tessellation of the sample points. To predict values, the algorithm first calculates the Delaunay triangulation (or the Voronoi diagram) of the sample points [18]; after that, a new tessellation is built which includes the predicted point location. Adding a new point generates a new map, so the areas and the contours are modified. Sibson weights focus on the areas, and the weight calculation can be expressed as:

w_i(x) = \frac{A(x_i)}{A(x)} \quad (4.18)

with:
• A(x) the new area
• A(x_i) the intersection between the new area and the old area of x_i

Laplace weights, on the other hand, focus on the contour:

w_i(x) = \frac{l(x_i)/d(x_i)}{\sum_{k=1}^{n} l(x_k)/d(x_k)} \quad (4.19)

with:
• l(x_i) the length of the contour of the i-th sample point in the old map included inside the new point's area
• d(x_i) the distance between the new point x and the sample point x_i


Figure 4.7: Sibson weights method

Figure 4.8: Laplace weights method

4.2

Machine Learning Algorithms - Neural Networks

A Neural Network (NN) is a computational tool whose purpose is to statistically model a process [20]. NNs are used when the laws that rule the process are unknown or highly non-linear. The adjective "neural" recalls the structure of the human brain: indeed, the way the network learns is similar to the learning process that happens in the human brain. As in the human brain, the basic unit of the NN is an individual component called the neuron.

In Figure 4.10 the neuron is schematically represented: on the left side there is the input layer, where all the inputs are collected. The input layer goes through the synaptic weights: here each input value is multiplied by the corresponding weight. All the connections go into a summing junction, where all the values are added together with an additional bias term. The result of the addition is the argument of a specific activation function, which produces the final output.

According to [20] the NN neuron can be described by these two equations:

u_k = \sum_{j=1}^{m} w_{kj} x_j \quad (4.20)

y_k = \phi(u_k + b_k) \quad (4.21)

where


Figure 4.9: Human neuron structure


Figure 4.11: A multi-layer feedforward network

• x_j is the j-th input
• m is the number of inputs
• w_{kj} is the corresponding weight
• b_k is the bias
• \phi is the activation function
• y_k is the output

The single neuron can produce a numerical output once the values of the weights and bias are known. Generally, NNs are composed of more than one neuron, so the descriptive diagram increases in complexity, as shown in Figure 4.11. Each input is connected to each neuron in the hidden layer, and the final output is the sum of the single neuron outputs.

Increasing the number of neurons can help in describing more non-linear processes. Nowadays several kinds of NN can be created; in this thesis the Multi-Layer Feed Forward Neural Network is described.

Training Process Determining the weights and bias values is the task accomplished in the training process. The data used in the training process is usually a partition of the entire dataset called the training set. The principle that rules the training process of a NN is the comparison between predicted and real outputs: changing the weight values changes the output value, and the training process aims to minimize the error between the real output and the output predicted by the NN. The main algorithm used to achieve this goal is the backpropagation algorithm. The term "backpropagation" expresses a relationship between the obtained error and how the weights change: this is somewhat similar to a feedback. With an iterative process, this algorithm reduces the prediction error. The statistical index mainly used in the backpropagation algorithm is the Mean Square Error (MSE). The algorithm works in this way:


Figure 4.12: Gradient Descent

1. Randomly initialize the weights
2. Calculate the predicted output and error for each observation in the training set
3. Calculate the MSE over the entire training set
4. Modify the weights
5. Re-calculate the MSE over the entire training set
6. Reiterate until the MSE has reached its minimum

The way the weights are modified is based on Gradient Descent, whose principle is illustrated in Figure 4.12. The backpropagation algorithm therefore considers not the MSE value itself but its gradient: at each iteration, the weights are moved a small step in the direction opposite to the gradient of the MSE with respect to the weights, so that the MSE decreases.
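The six steps above can be sketched for a single sigmoid neuron (eqs. 4.20-4.21). This is a toy illustration, not the thesis' actual training code: the targets are generated from a known neuron, and plain batch gradient descent on the MSE recovers it.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))                       # toy training inputs
# targets produced by a known neuron with weights (0.5, -1.0) and bias 0.2
t = 1.0 / (1.0 + np.exp(-(X @ np.array([0.5, -1.0]) + 0.2)))

w = rng.normal(size=2)                              # step 1: random weight init
b = 0.0
lr = 0.5                                            # learning rate (step size)
for _ in range(2000):
    y = 1.0 / (1.0 + np.exp(-(X @ w + b)))          # step 2: predicted outputs
    err = y - t
    mse = np.mean(err ** 2)                         # steps 3/5: MSE on the set
    grad_u = 2.0 * err * y * (1.0 - y) / len(X)     # chain rule through the sigmoid
    w -= lr * (X.T @ grad_u)                        # step 4: step against gradient
    b -= lr * grad_u.sum()
```

After enough iterations the MSE becomes very small, i.e. the neuron has learned the weights that generated the targets.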

Overfitting The training process aims to minimize a statistical metric, for example the MSE between the predictions and the targets. This process can often lead to an overfitting problem: the NN models the training inputs very well but, on slightly different inputs, shows a huge difference between predicted and target output.

To avoid this problem, different techniques can be used. One of the most common is the validation set technique: the training set is split into two parts. The bigger part is still the training set, used to reduce the MSE and update the weights. The smaller one is called the validation set: the NN is also evaluated on this partition of the data. The error made on the validation data is constantly monitored by the algorithm, and when it does not decrease for a defined number of iterations, the training is stopped. A graphical explanation of this method is in Figure 4.14.

Activation Function The activation function is the only element that introduces a non-linearity in a feedforward NN, and modelling non-linear problems is one of the main reasons why NNs are used. The most common activation functions are:


Figure 4.13: Three different behaviour


Figure 4.15: An example of sigmoid - Logistic curve

• Sigmoid
• Hyperbolic Tangent
• Rectified Linear Unit (ReLU)

The sigmoid is a class of functions characterized by a typical S-shape. The sigmoid typically used is the logistic curve:

y(x) = \frac{1}{1 + e^{-x}} \quad (4.22)

The logistic curve is flat in the outer regions and steeper around zero: there, a small change of the input generates a large change of the output. It is important to notice that the logistic function is limited to [0, 1]: the generated output is always bounded.

The hyperbolic tangent is another common activation function:

f(x) = \tanh(x) = \frac{2}{1 + e^{-2x}} - 1 \quad (4.23)

The function can be rewritten in terms of the logistic curve:

\tanh(x) = 2\,\mathrm{logistic}(2x) - 1 \quad (4.24)

This means that the hyperbolic tangent is a shifted and scaled logistic curve.

In this case the function can assume negative output values; on the other hand, the curve is steeper than the logistic curve.

The ReLU curve is another common activation function. It can be described by the expression:

f(x) = \max(0, x) \quad (4.25)

Differently from the previous two, it has no upper bound, so the output is virtually unlimited. On the other side, if the input is negative, the neuron is turned off.
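The three activation functions can be written down directly; the snippet below also checks the identity (4.24) numerically.

```python
import numpy as np

def logistic(x):
    """Logistic curve, eq. 4.22: bounded to (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    """Hyperbolic tangent written as in eq. 4.23."""
    return 2.0 / (1.0 + np.exp(-2.0 * x)) - 1.0

def relu(x):
    """Rectified linear unit, eq. 4.25: unbounded above, zero for x < 0."""
    return np.maximum(0.0, x)

x = np.linspace(-5.0, 5.0, 101)
# eq. 4.24: tanh is a shifted and scaled logistic curve
assert np.allclose(tanh(x), 2.0 * logistic(2.0 * x) - 1.0)
assert np.allclose(tanh(x), np.tanh(x))
```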


Figure 4.16: Hyperbolic tangent curve

Figure 4.17: ReLu Curve

4.2.1

Nonlinear Autoregressive Exogenous Model

The NARX network is one of the most common architectures used to model non-linear processes, and it is particularly suitable for time-dependent processes. The basic principle is that the future state of the process depends on its past states; this can be expressed mathematically as:

y(t) = F(y(t-1), y(t-2), \ldots, u(t), u(t-1), \ldots) \quad (4.26)

with
• y(t) the output of the process
• F(\cdot) the non-linear function which describes the process
• y(t - d_1) the past outputs of the process, with d_1 \geq 1
• u(t) the present value of the input
• u(t - d_2) the past values of the input, with d_2 \geq 1

The F function is usually modelled with a Neural Network. The d parameters are defined as the input delay and the feedback delay; they are usually different from one another. Figure 4.18 represents a NARX network with input delays set from 0 to 2 and feedback delays set from 1 to 3.


Figure 4.18: An example of a NARX network


Figure 4.20: Closed Loop Setup

As with common Neural Networks, NARX networks have a training phase; in this case the training phase has a particular behaviour. The standard training setup is the open loop setup, shown in Figure 4.19: the feedback inputs y(t) used are the real measured outputs, and the model output ŷ(t) can be considered a one-step-ahead prediction. If the task is to go beyond a one-step-ahead prediction, the open loop setup can be turned into a closed loop setup, shown in Figure 4.20. In the closed loop setup, the feedback inputs are the outputs of the NARX net itself, previously trained in the open loop setup. Using a closed loop setup extends the range of the prediction, going as far as the exogenous input u(t) is known.
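The open-loop/closed-loop distinction can be sketched as follows. To keep the mechanics visible, F here is linear and fitted by least squares rather than a neural network, and the process is a made-up noiseless ARX system: the open-loop fit uses the measured past outputs, while the closed-loop forecast feeds the model's own predictions back, as far as u(t) is known.

```python
import numpy as np

rng = np.random.default_rng(1)
u = rng.normal(size=300)                 # exogenous input (known in advance)
y = np.zeros(300)
for t in range(1, 300):                  # true process: y(t) = 0.8 y(t-1) + 0.5 u(t)
    y[t] = 0.8 * y[t - 1] + 0.5 * u[t]

# Open loop: regressors use the *measured* past output y(t-1).
X = np.column_stack([y[:-1], u[1:]])
theta, *_ = np.linalg.lstsq(X, y[1:], rcond=None)

# Closed loop: feed the model's own predictions back to forecast ahead.
y_hat = [y[199]]                         # start from a measured value
for t in range(200, 300):
    y_hat.append(theta[0] * y_hat[-1] + theta[1] * u[t])
```

Since the toy process is noiseless, the least squares fit recovers the true coefficients and the closed-loop rollout reproduces the process exactly; with a real basin, the closed-loop error grows with the horizon instead.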

4.3

Model Selection

The statistical task of Model Selection is a recurrent topic in the ML world. Doing model selection means selecting one model among all those that have been generated. In particular, focusing on a single algorithm, this means finding the hyperparameter that generalizes the process best. A well-generalized model is a model that can predict never-seen inputs well: it can be considered the opposite of an overfitted model, which predicts the training data very well but makes big errors on test data. Model selection is usually done with Cross Validation techniques; in this thesis two kinds of Cross Validation (CV) are described:

• K-Fold CV
• Leave-One-Out Cross Validation (LOOCV)

4.3.1

K-Fold Cross Validation

K-Fold CV is the most general method applied in model selection. The training set is randomly subdivided into K subsets of approximately the same size. The procedure consists of training the model on K-1 subsets and then testing it on the held-out subset; this is repeated so that every subset has been the test set once [20], all with a fixed value of the hyperparameter. The error made on each fold is usually summarized by an index such as the Mean Absolute Error (MAE), and the final index is the average of the indexes previously calculated:

MAE_{cv} = \frac{1}{K} \sum_{i=1}^{K} MAE_i \quad (4.27)


Figure 4.21: K-Fold Cross validation

The entire procedure is repeated for the other hyperparameter values; the hyperparameter which reaches the lowest MAE is the one that generalizes the process best.

In Figure 4.21 the procedure is schematically represented. The number of folds into which the training set has to be divided is not unique: standard choices are 3, 5 and 10-fold cross validation.
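The K-fold procedure can be sketched as follows. The model and hyperparameter are stand-ins (ridge regression with penalty alpha, on synthetic data), not the interpolators used in the thesis; the structure — K random folds, train on K-1, test on the held-out fold, average the MAEs as in (4.27) — is the point.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(90, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=90)

def ridge_fit(X, y, alpha):
    """Closed-form ridge regression (alpha is the stand-in hyperparameter)."""
    return np.linalg.solve(X.T @ X + alpha * np.eye(X.shape[1]), X.T @ y)

K = 3
folds = np.array_split(rng.permutation(len(X)), K)   # random, near-equal subsets
cv_mae = {}
for alpha in [0.01, 1.0, 100.0]:
    maes = []
    for k in range(K):
        test = folds[k]                              # hold out fold k
        train = np.concatenate([folds[j] for j in range(K) if j != k])
        w = ridge_fit(X[train], y[train], alpha)
        maes.append(np.mean(np.abs(X[test] @ w - y[test])))
    cv_mae[alpha] = np.mean(maes)                    # eq. 4.27
best_alpha = min(cv_mae, key=cv_mae.get)             # lowest cross-validated MAE
```

Here a heavily over-regularized alpha produces a clearly worse cross-validated MAE, so the procedure rejects it.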

4.3.2

Leave One Out Cross Validation

LOOCV is a particular K-Fold Cross Validation with K = N, where N is the number of observations. For the spatial interpolators described before, the number of observations takes a different meaning than usual: the basic dataset is composed of one observation per station, and there is no relationship between two subsequent observations from the same station. Each set of values coming from the sample points is a different dataset, and on each of them a LOOCV has to be performed.

So, taking N points at locations X = \{x_1, \ldots, x_N\} and X_j = X \setminus \{x_j\}, j = 1, 2, \ldots, N, the subset of locations that excludes the j-th one, we can define

s_j(x_j) = \sum_{k \neq j}^{N-1} a_k^{(j)} g_\lambda(||x_j - x_k||) \quad (4.28)

with
• s_j(x_j) the prediction at the left-out point
• a_k^{(j)} the weights calculated without the j-th point
• g_\lambda(||x - x_k||) the Gaussian distance function calculated with a selected shape factor

The error at the left-out point is defined as

e_j = y_j - s_j(x_j) \quad (4.29)

with y_j the real value at the left-out point. This procedure is repeated for each point of the dataset. At the end, the error vector results as e_\lambda = \{e_1, \ldots, e_j, \ldots, e_N\}, which is usually summarized by its MAE.

The entire procedure is repeated for different λ values; the λ value with the lowest MAE is selected as the best shape factor.


Figure 4.22: Leave One Out Cross Validation

This is valid for one dataset of observations. The entire procedure has to be completely repeated for every other set of observations (e.g. measurements of the same variable on a different day).
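The LOOCV shape-factor search can be sketched by combining it with the Gaussian RBF interpolator of Section 4.1.2. The station locations and the smooth field sampled below are made up; for each candidate λ, each station is left out in turn, the weights a^{(j)} are computed from the others, and the errors (4.29) are averaged.

```python
import numpy as np

def loocv_mae(xs, ys, lam):
    """MAE of leave-one-out Gaussian RBF predictions for shape factor lam."""
    errs = []
    for j in range(len(xs)):
        keep = np.arange(len(xs)) != j             # X_j = X \ {x_j}
        g = lambda r: np.exp(-(lam * r) ** 2)      # Gaussian distance function
        d = np.linalg.norm(xs[keep][:, None] - xs[keep][None, :], axis=-1)
        a = np.linalg.solve(g(d), ys[keep])        # weights without point j
        r0 = np.linalg.norm(xs[keep] - xs[j], axis=-1)
        errs.append(ys[j] - g(r0) @ a)             # e_j = y_j - s_j(x_j)
    return np.mean(np.abs(errs))

rng = np.random.default_rng(3)
xs = rng.uniform(0.0, 10.0, size=(15, 2))          # hypothetical station locations
ys = np.sin(xs[:, 0] / 3) + np.cos(xs[:, 1] / 3)   # smooth field sampled at stations
lams = [0.1, 0.5, 2.0]
best = min(lams, key=lambda l: loocv_mae(xs, ys, l))
```

A shape factor that is far too large makes the kernels so narrow that the left-out station is effectively extrapolated, which shows up as a much larger LOOCV MAE.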


Chapter 5

Case Study - Hydrocontroller Project

In this chapter the Hydrocontroller project is illustrated. All phases are described and discussed, starting from the input layer, going through the use of the interpolators, and ending with the basin status model.

5.1

Hydrocontroller Project - Description

The Hydrocontroller project belongs to the Programma Operativo Regionale (POR), a Tuscany economic development platform. The main goal of the project is to develop a monitoring platform for a hydro basin. The candidate has produced, during an internship with the i-EM (Intelligence in Energy Management) company, an archive version of the monitoring platform: the system has been tested on a dataset going from 3/6/2017 to 28/02/2019. The project has been focused on the Pontecosi basin, in the Garfagnana Valley, close to Lucca.

The Hydrocontroller project aims to exploit different data sources, like ground sensors and satellite data, to describe the basin status; to achieve this goal, the data has to be made comparable. Spatial interpolators have been used to generate the spatial distribution of each weather variable. After that, standard ML tools are used to predict the Soil Moisture Concentration (SMC) over the basin area. The result is finally used to predict the basin status.

In Figure 5.1 the flow chart of the Hydrocontroller project is represented. On the top side there is the input layer, where all the data sources are acquired: the data undergoes various preprocessing phases before being deeply processed. After that, spatial interpolators are used to transform the pointwise data into a spatial distribution surface. The first analytics tool step represents the ML algorithms used to model the SMC spatial distribution over the Pontecosi area. The second analytics tool step is the final ML model, which aims to determine the basin status.

Basin Status The final goal of the Hydrocontroller project is to determine the present and future basin status. Basin status is not uniquely defined: it depends on the point of view of the user. From an energetic point of view, it is the actual and future producibility of the hydroelectric power plant: this information can be relevant for the production schedule. From an environmental safety point of view, it is the water level of the basin: upper and lower bounds are set by environmental laws. Going over the upper bound can cause floods; on the other side, a minimum level of water must be preserved to ensure the Environmental Flow (EF).

In this case, the basin status is considered as the height of the water with respect to the bottom of the Pontecosi lake. This indicator is often called the net head. The net head of a hydroelectric basin depends on other variables: the flow used by the turbine, for example, is a human activity that influences the net head value. The higher the turbine generation, the higher the amount of water withdrawn from the basin. Another aspect that influences the net head value is the EF, a minimum water outflow imposed by environmental laws to ensure a proper life quality of the river ecosystem [21]. The EF must be guaranteed in all conditions by the owner of the plant.

5.2

Project Dataset

5.2.1

Weather Stations Data

The weather stations used in this project are located close to the Garfagnana Valley: due to the valley's localization, the position of the measurement sites has been considered relevant. To reach an adequate number of weather stations, different providers have been chosen:

• Wunderground
• Plant Owner Weather Stations
• AM

The weather variables considered in this project are:

• Mean daily temperature (°C)
• Mean daily air humidity (%)
• Total daily precipitation (mm)
• Mean daily wind speed (km/h)

Data preprocessing An important issue in Data Science is preprocessing: the class of operations made to prepare the data to be managed in the next phases. Preprocessing has to be applied to raw data, i.e. data imported into the computation tool as they are, with no treatment. In this case, each provider delivers data in a different way: for example, the AM Syrep bulletin directly gives the four variables in the correct form; on the opposite side, Wunderground and Enel have more observations per day, so averages and sums have been computed to obtain the correct daily values. Another important aspect of preprocessing is determining the data quality: in this project, each variable of every station has been described by a table similar to the one shown in Table 5.1. These tables quickly summarize the most important characteristics of the data source studied. Complete tables are in the Appendix.

Figure 5.2: Weather Stations Location

Name of the Table: itoscana290temp
Report generated on: 17-Sep-2019 12:24:09
First Datetime: 20170603T00:00:00+0000
Last Datetime: 20190228T00:00:00+0000
Time Zone: UTC
Datetime Format: yyyyMMdd'T'HH:mm:ssZ
Max Value: 29.80
Min Value: -7.25
Mean Value: 13.90
Median Value: 14.93
Mode Value: 14.30
Std Value: 7.60
Var Value: 57.76
Lon Value: 10.32
Lat Value: 44.12
Availability: 84.59

Table 5.1: Example of data quality table
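A data-quality table of this kind can be computed with a few lines of code. The sketch below uses a made-up daily series with NaN marking a missing observation, and reproduces a subset of the fields of Table 5.1:

```python
import numpy as np

def quality_summary(values):
    """Compute a few data-quality fields in the style of Table 5.1."""
    v = np.asarray(values, dtype=float)
    ok = v[~np.isnan(v)]                       # keep only actual observations
    return {
        "Max Value": float(ok.max()),
        "Min Value": float(ok.min()),
        "Mean Value": float(ok.mean()),
        "Median Value": float(np.median(ok)),
        "Std Value": float(ok.std()),
        "Availability": 100.0 * len(ok) / len(v),   # % of non-missing days
    }

# made-up daily temperature series; NaN = day without a valid observation
series = [13.0, np.nan, 15.5, 14.0, np.nan, 12.5, 16.0, 14.5]
summary = quality_summary(series)
```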

Wunderground Data Wunderground is a Personal Weather Stations (PWS) platform which offers weather observation datasets. The PWS used in this thesis are located close to the Garfagnana area. The Wunderground dataset is the core of the input data: 6 weather stations (marked with a yellow circle in Figure 5.2) provide all 4 variables with a worst-case availability of 90%.


Figure 5.3: An example of SMC Satellite Image (greyscale) - IFAC

Plant Owner Weather Stations Data The plant owner has carried out a weather measurement campaign in the Garfagnana valley. This dataset contains temperature and air humidity observations, available only in Spring 2018 and Summer 2018. These stations are located very close to the Pontecosi area, so they have high relevance with respect to the other providers.

AM Data Aeronautica Militare (AM) provides different kinds of weather bulletins: the one used in this thesis is the Syrep bulletin. The AM Syrep stations are located far from the other stations; however, they have been selected to compensate for the lack of availability of the other providers. In addition, AM is the only authority which can certify weather observations in Italy, so the data quality is generally high.

5.2.2

Satellite Data

One of the partners of the project is the Istituto di Fisica Applicata Nello Carrara (IFAC), which provides the satellite data used in this platform. More specifically, SMC values are mapped over a wide area of Tuscany.

Satellite images are gridded data: values are stored in a matrix, which is georeferenced through a raster object. A raster object is usually not based on a classical coordinate system such as latitude and longitude: instead, it uses the Universal Transverse Mercator (UTM) system. The UTM system is based on the Mercator Earth projection, which is the standard in satellite observation, and it is used from 80°N to 80°S; for the polar zones, the Universal Polar Stereographic (UPS) projection is used.

The UTM system divides the Earth into 1200 zones, each with a size of 6° in longitude and 8° in latitude; the Pontecosi zone is named 32N. Inside a zone, each point is defined within a grid. The images used in this project have an 8074x9885 grid. To select a smaller area than the original, latitude and longitude can be converted into UTM coordinates, so that a smaller matrix can be extracted. Figure 5.5 shows the Pontecosi area extracted from the original image.

Figure 5.4: UTM Zones
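The area-extraction step can be sketched as follows. The geotransform numbers (corner coordinates, pixel size) and the SMC grid are invented for illustration; a real raster carries its own georeferencing.

```python
import numpy as np

# Hypothetical affine geotransform of the raster: UTM easting/northing of the
# top-left corner plus a pixel size in metres (values are made up).
x0, y0 = 580000.0, 4900000.0
px = 100.0

def to_rowcol(e, n):
    """Map UTM easting/northing to (row, col) indices of the grid."""
    col = int((e - x0) / px)
    row = int((y0 - n) / px)      # northing decreases as the row index grows
    return row, col

# fake SMC map standing in for the 8074x9885 satellite matrix
smc = np.random.default_rng(4).uniform(0.0, 0.6, size=(500, 500))
r0, c0 = to_rowcol(585000.0, 4898000.0)   # top-left of the area of interest
r1, c1 = to_rowcol(590000.0, 4895000.0)   # bottom-right of the area of interest
area = smc[r0:r1, c0:c1]                  # extracted submatrix
```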

5.2.3

Reanalysis Dataset - Era Interim

A reanalysis dataset contains a large set of weather variables, extended over a large datetime range. The decision to also use this kind of data source is related to having more inputs for the basin status model.

The dataset used in this thesis is ERA-Interim [23], released by the European Centre for Medium-Range Weather Forecasts (ECMWF). This dataset is based on the 2006 Integrated Forecast System (IFS). Its main properties are:

• 4-D variational analysis
• 80 km horizontal resolution
• 60 vertical levels (from the surface to 0.1 hPa)

As for the satellite-based observations, the reanalysis dataset is provided in a gridded way: a matrix is associated with a specified variable and time, and the latitude and longitude references are included in the dataset files. In comparison with the satellite data, the reanalysis has a coarser resolution, so its accuracy on a small area around Pontecosi cannot be too high. Considering this, the variables selected from the reanalysis dataset are:

• Net solar radiance
• Snow depth

These variables influence the basin status less than rain and SMC do, so a lower accuracy can be accepted.


Figure 5.5: Pontecosi Area - SMC

5.2.4

Pontecosi Basin Data

Pontecosi basin data have been necessary to generate a basin model. The variables included in this dataset are:

• Basin Level
• Turbine Flow Rate
• Discharge Flow Rate

Each of them is given with an hourly timestep for all days included in the dataset. It is important to note that discharge flow events are heavy events: the discharge flow outlet is activated to prevent flooding. The amount of water released by the basin during this kind of event is huge, and it deeply influences the basin level. An example of such an event is shown in Figures 5.11 and 5.8.

5.3

Interpolators implementation

Three kinds of interpolators have been tested on the selected dataset: Kriging, RBF and Natural Neighbour. They have been tested on each variable separately, with different setups: this was necessary because different variables can have different behaviours. RBF and Kriging require at least one hyperparameter that has to be known in advance, while Natural Neighbour does not. The hyperparameter chosen should be the one that generalizes the spatial distribution of the variable best. The procedure used in this project is described in the following steps:

1. Divide the dataset into daily observations.


Figure 5.6: Basin Level


Figure 5.8: Discharge Flow Rate


Figure 5.10: Turbine Flow Rate histogram


Figure 5.12: Level during discharge flow rate

2. Hold out one station as the test station.

3. Perform a LOOCV on the remaining stations and determine the hyperparameter.

4. Test the interpolator on the test station, taking as input all the remaining stations. Calculate the error.

5. Repeat the procedure until every station has been the test station once. Calculate the daily MAE.

6. Repeat all these steps for each day.

This procedure has been applied for each interpolator and variable. The interpolator with the lowest average MAE has been selected as the winning interpolator.

This procedure has been selected thanks to its robustness: the interpolator has been tested on every station and validated on the remaining ones, so the resulting model performs well on each station of the dataset. According to this, the values found in the Pontecosi area are close to the real ones.

The final results are collected in Table 5.2. Kriging has been selected as the winning interpolator for each variable. Detailed results are in the Appendix.

Averaged MAE

Interpolator        Temperature   Air Humidity   Precipitation   Wind Speed
Kriging             2.17          8.76           0.23            1.68
RBF                 2.81          18.99          0.25            3.51
Natural Neighbour   2.25          11.12          0.28            2.09

Table 5.2: Average MAE on the entire dataset


Figure 5.13: Example of interpolator surface - Air Humidity


5.4

Soil Moisture modelling

An important step of the project is the SMC modelling. SMC represents the volumetric water concentration: it is expressed as the ratio between the volume of water and the total volume of the considered soil.

SMC = \frac{V_{water}}{V_{soil}} \quad (5.1)

Typical values go from 0.0 (completely dry) to 0.6 (fully wet). Values above 0.6 are usually associated with rivers, seas, etc., so they are not related to the soil and have to be discarded.

Satellite SMC images like the ones IFAC provides are available only twice a week: this mainly depends on the satellite orbit around the Earth. Days without observations therefore have to be "imputed": data imputation is a procedure which consists in assigning a reasonable value to a missing or untrustworthy value; in this case, it means assigning SMC values to the days without observations. To accomplish this goal, an ML model has been realized, based on 80 SMC observations in the datetime range of the archive version (from 3/6/2017 to 28/02/2019). The inputs of the model are the output values of the interpolator step:

• Mean daily temperature on the area
• Mean daily air humidity on the area
• Total daily precipitation on the area
• Mean daily wind speed on the area

This choice has been made thanks to the high availability of these data and their intimate relationship with SMC. The SMC value considered is the mean value over the Pontecosi area. A detailed description of the input and target data is in the Appendix.

Neural Net Ensemble The algorithm selected to model SMC is an ensemble of neural networks, in particular a cooperative ensemble. A cooperative ensemble has more than one neural network, and their outputs are averaged. All the neural networks have the same inputs but can produce different outputs: this mainly depends on the number of neurons chosen for each net. The hyperparameter has been chosen through a model selection: a 3-Fold Cross Validation has been performed.

The dataset has been randomly divided into 4 folds: one has been held out and considered as the test fold. On the other three folds, 3-Fold CV has been applied three times:

• Training on 1+2 and test on 3
• Training on 1+3 and test on 2
• Training on 2+3 and test on 1


Figure 5.15: Scheme of Cooperative Ensemble

Each cross validation gives the hyperparameter with the lowest error on that specific cross validation, so in total 3 hyperparameter values have been considered. In general they can differ from one another: this depends on the different training set used by each net. This result is applied in the ensemble training: three new neural networks are trained, with the training set corresponding to folds 1+2+3. The ensemble is tested on fold 4: a comparison with the target data has been made and the error has been taken into account. The average Mean Absolute Error limit has been set to 0.05. In Figure 5.15 the ensemble scheme is represented; detailed results can be found in Appendix B. A final training has been performed with all the folds as the training set; after this, the SMC data imputation has been done.
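The averaging mechanics of a cooperative ensemble can be sketched independently of the specific members. Below, polynomial regressors of different degrees stand in for the neural nets with different neuron counts, and the data is a made-up smooth target standing in for the mean SMC.

```python
import numpy as np

rng = np.random.default_rng(5)
x = rng.uniform(-1.0, 1.0, size=60)
y = np.sin(2 * x) + 0.05 * rng.normal(size=60)   # made-up noisy target

def fit_poly(deg):
    """One ensemble member: a polynomial regressor trained on the full set."""
    coef = np.polyfit(x, y, deg)
    return lambda xq: np.polyval(coef, xq)

# members differ (degree plays the role of the neuron count)
members = [fit_poly(d) for d in (2, 3, 5)]

def ensemble(xq):
    """Cooperative ensemble output: the average of the member outputs."""
    return np.mean([m(xq) for m in members], axis=0)
```

Averaging tends to damp the individual members' errors: even when one member fits poorly, the ensemble output stays close to the underlying target.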

5.5

Basin modelling

Modelling a hydroelectric basin is the core of the Hydrocontroller project. To model a hydroelectric basin like the Pontecosi basin means to know the future basin level. The actual basin level strongly depends not only on the recent past levels but also on past turbine production and past weather events [22].

According to this, in this thesis a NARX model has been created and tested: the possibility of having past level values as inputs makes NARX the most suitable choice for this application.

5.5.1

Basin Model - 1 hour ahead

The NARX model to be realized can be expressed by the equation below:

Ĥ(t) = F(H(t − 1), Q(t − 1), W(t − 24))    (5.2)

with

• Ĥ(t) the level value, i.e. the output of the NARX model at instant t
• Q(t − 1) the turbine and discharge flow rates at instant t − 1
• W(t − 24) the weather inputs at instant t − 24

Figure 5.16: NARX - 1 Hour Ahead

A clarification must be made: the turbine and discharge flows are mean values computed over the previous hour, so Q(t − 1) are the mean flow rates realized between (t − 2) and (t − 1).

In addition, the weather inputs are daily values, so the values of the day before the prediction must be used. This model is realized in an open-loop setup, so the H(t − 1) values are real measured values.
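The open-loop input construction for equation (5.2) can be sketched as follows; the array names and the helper are hypothetical, and the series are placeholders standing in for the real hourly measurements:

```python
import numpy as np

def build_narx_rows(H, Q, W, weather_lag=24):
    """Pair each target H(t) with the open-loop inputs [H(t-1), Q(t-1), W(t-24)].

    H: measured level, Q: hourly-mean turbine+discharge flow,
    W: daily weather input resampled to the hourly grid.
    """
    X, y = [], []
    for t in range(weather_lag, len(H)):
        X.append([H[t - 1], Q[t - 1], W[t - weather_lag]])
        y.append(H[t])
    return np.array(X), np.array(y)

H = np.arange(30.0)          # placeholder hourly level measurements
Q = np.arange(30.0) * 0.1    # placeholder hourly-mean flow rates
W = np.arange(30.0) * 0.01   # placeholder weather input
X, y = build_narx_rows(H, Q, W)
# The first usable target is H(24): earlier instants lack a 24-hour weather lag.
```

Because the setup is open-loop, the lagged level H(t − 1) comes from the measured series rather than from the model's own previous output.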

This NARX model is schematically shown in Figure 5.16. The training dataset consists of the first 500 observations (training a NARX net requires consecutive observations); observations 501 to 550 have been taken as the test set. The training method is a moving-window method: to predict the next step, the most recent past observations are inserted into the training set.
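The moving-window evaluation can be sketched as below. `fit_predict_next` is a hypothetical stand-in for retraining the NARX net on the current window and predicting one step ahead; here a persistence predictor ("repeat the last seen value") is used purely to make the sketch runnable.

```python
def moving_window_forecast(series, train_size, test_size, fit_predict_next):
    """Refit on all observations seen so far, predict the next one, repeat."""
    preds = []
    for k in range(test_size):
        window = series[: train_size + k]        # training set grows by one step
        preds.append(fit_predict_next(window))   # forecast observation train_size + k
    return preds

# Persistence stand-in for the retrained NARX net.
naive = lambda window: window[-1]

series = list(range(550))                        # placeholder observation series
preds = moving_window_forecast(series, train_size=500, test_size=50,
                               fit_predict_next=naive)
```

Each of the 50 test predictions is thus made with a model that has seen every observation up to, but not including, the predicted instant, matching the open-loop setup described above.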

Results are represented in Figure 5.17. Each point of the orange line is computed from past observations. This model cannot be used for a longer-horizon prediction without knowing the turbine production programme in advance. The following model was conceived to provide this capability.

5.5.2

Basin Model - Extended Forecast

The model can be used for an extended forecast beyond the one-hour horizon. The model equation can be generally expressed as

Ĥ(t) = F(H(t − d), Q(t − d), W(t − 24 − d))    (5.3)

which is identical to equation (5.2) except for d, the delay parameter. Various d values have been tested as for the 1 hour ahead model of Section 5.5.1. Figure 5.18 reports the MAE obtained for each delay parameter, expressed in hours. MAE values lower than 0.10 have been considered acceptable. As shown in Figure 5.18, the model provides acceptable results up to a maximum delay parameter of 6: the basin level can therefore be known 6 hours ahead with acceptable accuracy. In Figure 5.19 the result with the delay parameter set to 6 is represented. The model produced a large error in two events: these events are


Figure 5.17: 1 Hour Ahead - Level Test Result, MAE = 0.02 m


Figure 5.19: 6 Hour Ahead - Level Test Result

discharge events, which contribute to abruptly varying the basin level. This cannot be well modelled with a 6 hour delay parameter.
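The delay sweep and acceptance rule can be sketched as below. The per-delay MAE values here are placeholders chosen only to illustrate the selection logic; the real values are those plotted in Figure 5.18.

```python
def max_acceptable_delay(mae_by_delay, threshold=0.10):
    """Return the largest delay d (hours) whose test MAE is below the threshold,
    or None if no tested delay is acceptable."""
    acceptable = [d for d, mae in sorted(mae_by_delay.items()) if mae < threshold]
    return max(acceptable) if acceptable else None

# Placeholder MAE values in metres, one per tested delay (illustrative only).
mae_by_delay = {1: 0.02, 2: 0.03, 3: 0.05, 4: 0.06, 5: 0.08, 6: 0.09, 7: 0.13}
best_d = max_acceptable_delay(mae_by_delay)
# With these placeholder values, delay 6 is the largest acceptable one.
```

A sweep like this trains and tests one model per delay value; only the scalar MAE per delay is needed to apply the 0.10 m acceptance threshold.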


Figure 5.20: Dashboard Prototype

5.6

Future developments

The Hydrocontroller project is expected to be implemented on a real-time platform. For this purpose, three new weather stations have been installed around the basin area. A web application is being developed to show the weather and basin status over the area. The prototype dashboard is shown in Figure 5.20. The weather variables measured by the stations are:

• Soil Moisture
• Air Temperature
• Wind Speed
• Liquid Snow
• Solar Radiance
• Precipitation
• Snow Coverage


Chapter 6

Conclusion

In this thesis, the candidate has developed a prototype of the Hydrocontroller system, whose goal is a decision support system for hydrological basins. The candidate has realized an archive version of the system: it is composed of several modules, of which the final one is the basin model. Following the project specifications, different data sources have been used. Weather station data have been handled through spatial interpolators; this was necessary because of the large distance between the weather stations and the basin location. Different kinds of spatial interpolators have been tested and compared.

The satellite observation dataset has low data availability, which mainly depends on the satellite orbits over the area. Satellite data quality has been improved with a data imputation method: a neural network ensemble has been trained and tested. A NARX model has been created to predict the basin level using weather and basin data, where the weather data come from the previous modules. Simulations with different delays have been carried out: acceptable results are achieved from 1 to 6 hours ahead. A real-time web platform will be developed in the short term.
