
Approximate dynamic programming techniques for microgrid energy management


Academic year: 2021


POLITECNICO DI MILANO

Scuola di Ingegneria dell’Informazione

Master of Science (Laurea Magistrale) in Automation Engineering

Approximate dynamic programming techniques

for microgrid energy management

Supervisor: Prof. Maria Prandini
Co-supervisor: Ing. Riccardo Vignali
ASP co-supervisor: Prof. Carlo Novara

Master's thesis (Tesi di Laurea Specialistica) by: Francesco Borghesan, student ID (Matr.) 762702


There is nothing more exciting than thinking of a new idea. There is nothing more rewarding than seeing a new idea work. There is nothing more useful than a new idea that helps you meet a goal.

Edward De Bono

To my mother,


SUMMARY

The increasingly widespread use of renewable energy sources and the consequent decentralization of power generation systems call for a redesign of the management system of the electric grid. From centralized and unidirectional (energy transported from a few large power plants to consumers), the grid must become distributed and support bidirectional energy flows from users, who now play the two-fold role of consumers and producers. The emergence of small autonomous local grids (“microgrids”) connected to the global distribution grid partly simplifies the problem, since it allows subparts of the system to be grouped together and the grid management problem to be decomposed into the management of a single microgrid and the management of a grid of microgrids.

This thesis addresses a specific management problem that arises in a microgrid, namely the minimization of the costs related to energy consumption. This goal can be pursued by suitably operating the units in the microgrid: loads (buildings with their lighting, heating and air conditioning systems, and possibly production equipment), generators (exploiting conventional and non-conventional energy sources), and energy storage systems (such as batteries, which can compensate for the intermittency of energy production from renewable sources, or thermal energy storage devices). Cost reduction must of course be subordinate to maintaining an adequate level of system performance, assessed for example in terms of the comfort of the ambient conditions inside a building.

Two rather simple microgrid configurations are analyzed in this work. In the first configuration there are no generation or storage units, only a load represented by the air conditioning plant of a building that must be kept at a given temperature and is subject to a stochastic disturbance given by the profile of its occupants over the day. In the second configuration, a thermal energy storage unit for cooling (cold thermal storage) is integrated. Despite the limited number of components, both configurations exhibit the salient features found in a microgrid, namely continuous and discrete dynamics, stochastic disturbances, and state constraints, which make the optimal energy management problem difficult but at the same time interesting to address.

The optimal energy management problem is formulated as a constrained stochastic optimal control problem in which one seeks the state feedback law that minimizes energy consumption. The problem can be decomposed into two phases. The first phase consists in determining the optimal split of the cooling power requested from the air conditioning plant among its component units, so as to reduce the total electric power drawn from the external distribution grid. This first phase reduces to solving a static nonlinear optimization problem. The second phase consists in optimizing the cooling power requested from the air conditioning plant, with the aim of operating it at high efficiency and with reduced electric power consumption. This cooling power can be modulated by varying the temperature set point inside the building (though only by a limited amount, to maintain an adequate level of comfort) and, where present, by using the cold thermal storage, which is charged and discharged to keep the air conditioning plant operating at maximum efficiency. The second phase can be cast as a dynamic programming problem for a discrete-time stochastic hybrid system, whose solution requires approximate dynamic programming techniques because of the continuous state variables and the stochastic disturbance. Two techniques are proposed: the first is based on the certainty equivalence principle and on gridding of the state space, while the second is based on abstracting the stochastic hybrid system to a controlled Markov chain. In both cases, the computation of the optimal policy relies on simulation of the system modeling the microgrid. Numerical examples are reported to show the results obtainable with the two approximate dynamic programming techniques in both microgrid configurations.

As for the contribution of this thesis, the techniques developed can be considered innovative not only from an application standpoint but also from a methodological one, given the limited number of results in the literature on optimal control and on abstraction to discrete models of stochastic hybrid systems.


ABSTRACT

Lately, the exploitation of renewable resources has been growing significantly. The consequent increase in the decentralization of power generation systems calls for a redesign of the current electric grid, which should move from a centralized architecture, where energy is produced in a few large power plants and then delivered to consumers, to a distributed one, where users play the two-fold role of consumers and producers and energy flows are bidirectional. The introduction of small autonomous local grids (the “microgrids”) connected to the main distribution grid partly simplifies the problem, decomposing it into two sub-problems: the management of a single microgrid and that of a grid of microgrids.

This thesis focuses on the energy management problem for a microgrid, which consists in minimizing its energy consumption costs. This goal can be pursued by suitably operating the microgrid components, that is: loads (buildings comprising lighting, heating and air conditioning systems and, possibly, also manufacturing systems), generators (based on renewable or conventional energy sources), and energy storage units (e.g. thermal energy storage or high capacitance batteries). Cost reduction strategies should in any case guarantee an expected level of performance for the system such as, for example, adequate ambient conditions in terms of temperature and humidity for the occupants of a building.

Two simple microgrid configurations are analyzed in this work. In the first one, neither power generation nor energy storage units are present, and the model includes as main components a chiller plant with two chillers and a cooling load representing a building, affected by a stochastic disturbance given by the number of its occupants during the day. In the second one, a cold thermal storage is integrated in the system. Despite the limited number of components, both configurations show the key features that characterize a microgrid model: continuous and discrete dynamics, stochastic uncertainty, and the presence of constraints on the state. These features make the energy management problem challenging to solve. The problem is formulated as a stochastic optimal control problem with constraints, where the objective is to determine an optimal feedback policy that minimizes the energy costs incurred over some look-ahead time horizon. The problem is then decomposed into two phases. The first phase involves solving the optimal chiller plant scheduling problem, which consists in splitting the cooling power requested from the chiller plant between the chillers so as to reduce the electric power drawn from the main grid. This reduces to a static nonlinear optimization problem. The second phase involves optimizing the cooling power requested from the chiller plant, which should be kept operating in high efficiency conditions, thus reducing the electric energy consumption. This is achieved by modulating the building temperature set point by a small amount, so as not to cause discomfort, and by using the thermal storage, if present, to release or accumulate the difference between the cooling load demand and the cooling power provided by the chiller plant. The second phase can be reduced to a dynamic programming problem for a discrete time stochastic hybrid system, which requires suitable approximate dynamic programming (ADP) techniques to be solved, due to the presence of continuous state variables and of stochastic uncertainty. Two ADP solutions are proposed: the first one is based on the certainty equivalence approach and on state space gridding, whereas the second one is based on a controlled Markov chain abstraction of the system. In both cases, the computation of the optimal policy requires simulating the system modeling the microgrid. Numerical examples referring to both microgrid configurations illustrate the results achieved with both ADP solutions.

As for the contribution of this thesis, the developed techniques appear innovative, not only from the perspective of the application, but also from a methodological viewpoint, given that only a few results are available in the literature on the optimal control and the finite abstraction of stochastic hybrid systems.


CONTENTS

1 Introduction to the concept of microgrid
  1.1 Current power system set up and outlook
  1.2 Microgrids
  1.3 Thesis content and organization
2 Optimal energy management of a microgrid
  2.1 Microgrid energy management problem
  2.2 Problem setup and formulation
  2.3 Controlled system equations
    2.3.1 Model of the Local Power Network
    2.3.2 Distribution Grid
    2.3.3 Chillers
    2.3.4 Zone
    2.3.5 Occupants
    2.3.6 The outside temperature
    2.3.7 Chilled water circuit
  2.4 Stochastic optimal control problem with constraints
  2.5 Chiller plant optimization
  2.6 Temperature set point optimization
3 Approximate dynamic programming solutions based on abstraction and simulation
  3.1 ADP solution based on the certainty equivalence approach
  3.2 ADP solution based on a controlled Markov chain abstraction
    3.2.1 Definition of the controlled Markov chain abstraction
      State and control sets
      Transition probability function
    3.2.2 Definition of the transition costs
    3.2.3 Assessment of the quality of the controlled Markov chain abstraction
    3.2.4 DP equations for the Markov chain abstraction
4 Numerical results
  4.1 Chiller plant optimization
  4.2 Temperature set point optimization
    4.2.1 ADP solution based on the certainty equivalence approach
    4.2.2 ADP solution based on a controlled Markov chain abstraction
      Definition of the ε-coverage occupancy tube
      Evaluation of the best approximating occupancy profile
      Assessment of the quality of the controlled Markov chain abstraction
    4.2.3 Comparative analysis of the two ADP approaches
5 Extension to a medium size microgrid including the thermal storage
  5.1 Thermal storage
    5.1.1 Functionality
    5.1.2 Stratified thermal water storage
  5.2 Controlled system equations
    5.2.1 Zone
    5.2.2 Chillers
    5.2.3 Thermal storage
    5.2.4 CHWC
      Storage branch
      Pipe branch
      Chiller branch
  5.3 The resulting hybrid system
  5.4 Stochastic optimal control problem with constraints
  5.5 Temperature set point and chiller power request optimization
  5.6 Numerical results
    5.6.1 Chiller plant optimization
    5.6.2 Temperature set point and chillers power request modulation
  5.7 Conclusions
6 Conclusions and future work
7 Appendix
  7.1 The scenario approach


LIST OF FIGURES

Figure 1: Principal scheme of a microgrid
Figure 2: An example of a microgrid configuration
Figure 3: Configuration of the small-scale microgrid analyzed
Figure 4: Energy saving obtained by incrementing the zone set point $\bar{T}_{ZA}$ = 20 °C by a fixed amount throughout the day, as a function of the set point increment
Figure 5: Control scheme of the microgrid configuration in Figure 3
Figure 6: A conceptual view of the control scheme in Figure 5
Figure 7: Local Power Network
Figure 8: Representation of the LPN as a hybrid automaton
Figure 9: Heat produced by a single person, as a function of the environment temperature
Figure 10: Occupancy profile
Figure 11: Number of occupants needed to saturate the chillers as a function of $T_{ZA}^{SP}$ when $T_{OA}$ = 32 °C
Figure 12: An example of ε-coverage occupancy tube computed through the scenario approach
Figure 13: Different possible approximating profiles
Figure 14: Outside ambient temperature
Figure 15: Nominal occupants profile
Figure 16: The efficiency curves of the two chillers
Figure 17: Results of the chiller scheduling optimization
Figure 18: Chiller plant optimization test
Figure 19: Certainty equivalence-based policy: zone temperature set point behavior
Figure 20: Policy for the optimal modulation of the zone temperature set point: “Green”: +0 °C; “Orange”: +0.5 °C; “Red”: +1 °C
Figure 21: Two realizations over time of the temperature set point with the optimal policy
Figure 22: Plot of the ε-coverage tube with ε = 0.01
Figure 23: Different strategies for the use of the storage
Figure 24: The three different areas in a stratified thermal storage
Figure 25: Temperature profile of Sharp's model. Effect of the number of layers considered on the temperature profile at a given instant
Figure 26: Configuration of the medium-scale microgrid with storage
Figure 27: Scheme of the Chilled Water Circuit
Figure 28: Effect of the condenser temperature on the chillers' COP
Figure 29: Outside ambient temperature


LIST OF TABLES

Table 1: List of parameters used for the first configuration
Table 2: The grid used in the certainty equivalence-based ADP computations
Table 3: List of parameters used for the second configuration


1

INTRODUCTION TO THE CONCEPT OF MICROGRID

1.1 Current power system set up and outlook

Current electricity networks are mainly based on large bulk power units, certain transmission and distribution line capacities and mostly passive loads. Generation is primarily based on thermal and hydro power plants. The demand side is only partly predictable and subject to uncertain deviations. Energy delivery is based on long (weeks, months) to mid-term (days) scheduling of power plants. The system adapts to uncertainty in demand and supply (wind power generation) by rescheduling of available units and using ancillary services.

Increasing ambitions for an extensive deployment of small- and large-scale renewable energy production systems lead to new challenges, from unit dispatch and transmission grid planning to long-term planning such as generation portfolio and system design. Europe is one of the main supporters of renewable energy, both because of public concern for environmental issues and for reasons of geopolitical and energy independence. At the end of 2008, the European Parliament adopted the “20-20-20” targets, a set of binding legislation that committed Europe, by 2020:

• To reduce emissions of greenhouse gases by 20% with respect to 1990 levels

• To increase energy efficiency so as to save 20% of EU energy consumption

• To reach a 20% share of renewable energy in the total energy consumption in the EU

After the Fukushima disaster, renewable energy received a new boost, as Germany and Japan came to regard renewable sources as a valid substitute for nuclear plants. Germany committed itself to a more ambitious goal than the European “20-20-20” targets. With respect to 1990 values, the Energiewende project (“energy transition” in English) aims:

• To reduce greenhouse emissions by 40% by 2020 and by 80% by 2050

• To reach 35% of renewable energy in the total energy consumption by 2020


In 2012 Germany already produced 20% of its electric energy from renewable sources. After the closure of some nuclear plants and the start of the Energiewende project, this came with an increase in energy prices and a higher risk of instability of its electrical network, due to the stochastic nature of renewable energy production and to difficulties in transmitting electricity from the north of the country, where most wind farms are located, to the south, the most industrialized area of the country. Energiewende is considered the most ambitious project worldwide for the integration of renewable sources into current power networks, and it is attracting significant investment.

Indeed, one of the main issues with renewable energy generation, such as wind power generation, is its dependence on environmental conditions, which makes detailed weather forecasts critical for the electricity system. The main problems related to weather-based energy production are:

• Ground-based forecasts still lack spatio-temporal accuracy

• Non-linear conversion of wind speed to electrical power

Furthermore, since the production of electricity from renewable sources cannot be controlled, loads such as buildings equipped with solar panels not only absorb energy from the network but, when they produce more electricity than they need, also supply energy to it. This creates the need for transmission grids that support a bidirectional energy flow to and from the consumers.

Due to the stochasticity of renewable energy production, a dynamically adaptable system is crucial for voltage and frequency stability. Nowadays many services still work on a relatively coarse time basis (in general, day-ahead scheduling). Thanks to advanced control mechanisms, it should be possible to operate on a much shorter time basis.

A further limitation of the current power system with respect to the introduction of renewable energy sources is that, at the moment, the demand side can in general participate in power markets only with large demand units. This happens through special contracts or on the forward to day-ahead markets. Stochastic generation and grid restrictions make it necessary to enlarge the possibilities for reliable system services, e.g., by aggregating small demand units and treating them as a single load.

Due to the long planning terms (years to decades), capital intensity and long life span of existing assets, the “future” energy network will therefore be a combination of:

• Stochastic regenerative generation units of large scale

• A wide area transmission grid able to support a bidirectional flow of energy, in addition to large-scale conventional power plants and hydro storage


• A conglomerate of microgrids with different kinds of controllable demand units, small-scale generation and small-scale storage units

1.2 Microgrids

Microgrids are local energy grids that can be connected to a large distribution grid or can operate autonomously. Microgrids are primarily considered as electricity networks but, in essence, they can include any type of local energy generation, distribution, consumption and storage elements. Frequently, generators produce not only electricity but also heating and cooling energy. Microgrid elements can be categorized into three groups:

• Local generation represents the various generation sources that feed the local grid with electricity. These sources can be split into two major groups: conventional energy sources (e.g. diesel generators) and renewable generation sources (e.g. wind turbines). The electricity distribution grid operated by the utility company is always considered the major source of electricity supply, except when the microgrid is operating in islanded mode.

• Consumption constitutes the elements that consume electricity from the local grid and each element corresponds either to a specific device or to an aggregated load. An aggregated load typically represents various building systems such as lighting or heating, ventilation, air conditioning or equipment that is used e.g. for manufacturing purposes.

• Energy storage elements can bring significant advantages, particularly when the microgrid includes intermittent energy sources such as wind turbines or photovoltaics, whose electricity production varies over time. Energy storage is an important element that adds flexibility, but it also increases the operational complexity of the microgrid.

These elements are interconnected by a fixed network whose physical properties (e.g. transfer capacity, transport delay) have to be considered in the optimization. From the utility company's point of view, the microgrid acts as a single large customer who can buy or sell electricity, and the microgrid as a whole is usually able to perform demand response actions. Trading conditions are set by a contract between the microgrid owner and the utility company. A demand response action is invoked by the grid operator, which sends a request to the energy management system; the latter adjusts the microgrid demand in an optimal way while satisfying the grid operator’s request. Figure 1 provides a schematic view of a microgrid, illustrating the energy flows and the relationship between the utility company and the microgrid.


Figure 1: Principal scheme of a microgrid

1.3 Thesis content and organization

This thesis focuses on the energy optimization of a microgrid, which is formulated as a finite-horizon constrained optimal control problem for a stochastic hybrid system (SHS), characterized by discrete and continuous dynamics and subject to stochastic uncertainty.

The control problem is given a dynamic programming (DP) reformulation, and two approximate dynamic programming techniques based on abstraction and simulation of the SHS are devised. The first technique rests on the certainty equivalence principle, in that the stochastic uncertainty is neglected when solving the DP equations. The second technique rests on the abstraction of the SHS to a controlled Markov chain and on the solution of the DP equations for the corresponding optimal control problem on the Markov chain.
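To make the second technique more concrete, the following is a minimal sketch of the backward DP recursion on a finite controlled Markov chain. It is not the thesis implementation: the function name `backward_dp`, the data layout (`P[u][s][j]` for transition probabilities, `stage_cost[s][u]` for transition costs) and the toy numbers are all assumptions made for illustration.

```python
def backward_dp(P, stage_cost, terminal_cost, horizon):
    """Finite-horizon DP on a controlled Markov chain (illustrative sketch).

    P[u][s][j]       probability of moving from state s to state j under control u
    stage_cost[s][u] expected cost of applying control u in state s
    Returns the cost-to-go V[k][s] and the greedy policy policy[k][s].
    """
    n_states = len(stage_cost)
    n_controls = len(stage_cost[0])
    V = [[0.0] * n_states for _ in range(horizon + 1)]
    policy = [[0] * n_states for _ in range(horizon)]
    V[horizon] = list(terminal_cost)
    # Sweep backward in time from the end of the horizon.
    for k in range(horizon - 1, -1, -1):
        for s in range(n_states):
            # Stage cost plus expected cost-to-go, for every admissible control.
            q = [
                stage_cost[s][u]
                + sum(P[u][s][j] * V[k + 1][j] for j in range(n_states))
                for u in range(n_controls)
            ]
            best_u = min(range(n_controls), key=lambda u: q[u])
            policy[k][s] = best_u
            V[k][s] = q[best_u]
    return V, policy


# Toy two-state, two-control instance; all numbers are made up.
P = [
    [[0.9, 0.1], [0.2, 0.8]],  # transition matrix under control 0
    [[0.5, 0.5], [0.6, 0.4]],  # transition matrix under control 1
]
stage_cost = [[1.0, 2.0], [4.0, 3.0]]
V, policy = backward_dp(P, stage_cost, terminal_cost=[0.0, 0.0], horizon=3)
```

Under the certainty equivalence principle the same recursion would apply with the expectation replaced by a single deterministic successor state, obtained by fixing the disturbance at a nominal value.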

Two microgrid configurations have been analyzed: the first is a small-scale microgrid with neither power generation nor energy storage systems; the second introduces an energy storage element.

The thesis is structured as follows. Chapter 2 describes in some detail the microgrid energy management problem and then focuses on the small-scale microgrid configuration, the SHS model of the system and the DP formulation of the constrained optimal control problem.

Two ADP solutions are introduced in Chapter 3 and compared in Chapter 4 on a numerical instance of the small-scale microgrid.

Chapter 5 treats the modeling and the control of the second microgrid configuration, which includes the energy storage. The DP formulation and the ADP solution based on the controlled Markov chain abstraction are extended to this configuration.


2

OPTIMAL ENERGY MANAGEMENT OF A MICROGRID

2.1 Microgrid energy management problem

Microgrid energy management is a complex optimization task in which technical and economic aspects have to be taken into account. The microgrid energy optimization problem consists of two subtasks: power stabilization and economic optimization. As for the first subtask, the energy supply should be stable, safe and reliable: imbalances between supplied and consumed power tend to destabilize the power flows in the microgrid, which can lead to blackouts in extreme cases. As for the second subtask, the operating costs of a microgrid can be minimized by coordinating and dispatching the multiple generation, consumption and storage elements connected to it. The major operating costs are fuel costs, energy storage costs and the cost of electricity bought from the utility. If the microgrid is connected to the distribution grid, the surplus electricity produced in the microgrid can be sold to the utility company.

The energy management system in charge of these optimization subtasks can be divided into two layers:

• Low level control regulates the voltage and frequency of each generation and storage element in the grid. Response times are typically in the range of milliseconds to tens of milliseconds. Low level controllers work autonomously but can receive control signals from upper control layers that ensure system-level coordination. At this control layer, problems are closely related to the control of physical devices such as gas turbines, wind generators, etc. These devices can be modeled by continuous dynamics, possibly including nonlinearities and uncertainty. Only the power stabilization subtask is addressed at this layer.

• Supervisory control is responsible for the optimal operation of the microgrid, which is achieved by coordinating and dispatching the multiple generation, consumption and storage elements connected to the grid. At this control level, a microgrid is considered as a grid of interconnected devices, and changes in the status of these devices are determined. These changes can be described through discrete dynamics, which may lead to a complex combinatorial problem. Additional complexity comes from the continuous dynamics of the physical devices and of the variables of interest such as, for instance, the temperatures of the buildings in the microgrid. Some devices connected to a microgrid can be modeled by complex continuous dynamics including nonlinearities; gas turbines are a typical example. Furthermore, the network status is optimized under time-varying conditions subject to uncertainty associated, e.g., with forecasts of renewable generation, future energy demands and energy prices. Both power stabilization and cost can be addressed at this level.

Therefore, the problem formulation leads to a large-scale system that can be modeled as a stochastic hybrid system, since it involves:

• Discrete variables: microgrid configuration (device modes), intrinsic discrete variables (e.g. number of occupants of the building, see later)

• Continuous variables: power of conventional generators, variables related to energy flows in the microgrid, temperatures of thermal loads

• Uncertainty: renewable generation, power demands, dynamic energy prices, weather conditions
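As a rough illustration of how these three ingredients coexist in one model, the sketch below simulates a single step of a toy discrete-time stochastic hybrid system. Everything in it is hypothetical: the `State` fields, the `step` function, the coefficients and the random-walk occupancy model are assumptions for illustration only and are not the system equations used in this thesis.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    chiller_on: bool   # discrete mode of a device
    occupants: int     # discrete variable driven by a stochastic disturbance
    zone_temp: float   # continuous variable, in degrees Celsius


def step(state, switch_on, outside_temp, rng, dt=0.25):
    """Advance the toy hybrid system by one time step of dt hours."""
    # Hypothetical coefficients, chosen only for illustration.
    leak, heat_per_person, cooling = 0.1, 0.05, 3.0
    # Continuous dynamics: exchange with the outside plus occupants' heat gain.
    d_temp = leak * (outside_temp - state.zone_temp) + heat_per_person * state.occupants
    if state.chiller_on:
        # The active discrete mode changes the continuous dynamics.
        d_temp -= cooling
    # Stochastic discrete dynamics: one person may leave, stay or arrive.
    occupants = max(0, state.occupants + rng.choice([-1, 0, 1]))
    # The control input selects the discrete mode for the next step.
    return State(switch_on, occupants, state.zone_temp + dt * d_temp)


s0 = State(chiller_on=False, occupants=10, zone_temp=24.0)
s1 = step(s0, switch_on=True, outside_temp=32.0, rng=random.Random(0))
```

Repeated calls to such a step function with different random seeds generate the sample trajectories on which simulation-based techniques operate.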

To be more concrete, we depict in Figure 2 a schematic view of a microgrid, which includes the following main elements:

• The Grid Power represents the main distribution grid, assumed capable of supplying an unlimited amount of power. In critical situations, however, the main distribution grid can fail (blackout), in which case no grid power is supplied. A property of the distribution grid that is relevant to the energy management problem is the price per energy unit (kWh), which is typically time-varying.

• The Local Power Network (LPN) is the local electricity network interconnecting the main distribution grid and the power loads. Stability of the LPN depends on the imbalance between power supply and demand; it is not an issue here, since we assume that any power imbalance is instantaneously compensated by the power grid.

• The Electrical Load stands for the power demand of equipment, lights, lifts, etc. These loads feature a stochastic behavior and therefore cannot be controlled by the energy management system.

• The Chillers are electrical devices that convert electrical energy into cooling energy. The performance of each chiller depends on the outside ambient temperature, the temperature of the cooling medium and the requested cooling power [10].


• The Cooling Load represents a zone (which can be a room, several rooms or a partitioned space in a room) whose temperature should track a given reference profile. Cooling power is provided for this purpose by the chillers through the chilled water circuit (CHWC). The cooling power request is determined by two sources of heating power, namely the outside temperature and the internal heat gains due to the presence of people, office equipment, lighting, etc. These sources of heat are typically uncertain, and a stochastic model is adopted to describe their behavior.

• The Chilled Water Circuit (CHWC) represents the cooling energy distribution system, which uses water as the distribution medium.

• The Thermal Storage represents a dynamic element that enables energy storage. Incorporating this element opens up new strategies for improving energy optimization.

• The Thermal Load or Heating Load represents the heating that should be provided to the zone so that its temperature can track a given reference profile. The heating power is mainly affected by the outside temperature.

• The Boiler represents a device that combusts gas and produces heating power, while the CHP (Combined Heat and Power) unit is a microturbine able to produce electric power in addition to heating power. Issues related to the CHP are the times needed to start it up and shut it down. The heat energy distribution system is represented by the Hot Water Circuit (HWC), where water is used as the distribution medium.

• The Renewable Power Source can represent a wind turbine. The power that the turbine can provide is dependent on the wind speed, which has a stochastic behavior.

Note that if the microgrid is connected to the distribution grid, the optimization can focus only on the cost minimization problem, because the ancillary services of the distribution grid automatically eliminate the microgrid imbalances. Only under specific conditions is the stability of the microgrid more important than economic aspects, so that the optimization has to focus on power stabilization. This is typically the case for military bases, which are usually operated in islanded mode.
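For the grid-connected case, the quantity being minimized can be illustrated with a short sketch that prices the electricity bought from the grid under a time-varying tariff. The function name `energy_cost`, its arguments and the numbers are hypothetical, and the sketch assumes piecewise-constant power and price over intervals of equal length.

```python
def energy_cost(power_kw, price_per_kwh, dt_hours):
    """Cost of the electricity bought from the grid over a horizon.

    power_kw[k]      average electric power drawn in interval k (kW)
    price_per_kwh[k] price valid in interval k (currency per kWh)
    dt_hours         length of each interval (hours)
    """
    if len(power_kw) != len(price_per_kwh):
        raise ValueError("power and price profiles must have the same length")
    # Energy bought in each interval (kWh) times the price in that interval.
    return sum(p * dt_hours * c for p, c in zip(power_kw, price_per_kwh))


# Made-up example: two half-hour intervals with different prices.
cost = energy_cost([10.0, 20.0], [0.10, 0.20], dt_hours=0.5)
```

With a time-varying price, shifting consumption from expensive to cheap intervals lowers this cost even when the total energy drawn is unchanged.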

2.2 Problem setup and formulation

In this thesis, we study the problem of optimal energy management of a microgrid that can be modeled as a stochastic hybrid system (SHS). Control of SHS is a quite challenging area of research that has recently attracted the interest of both the control and the computer science communities [14]. Here we focus our study on the small-scale microgrid sketched in Figure 3 and then extend our approach to the case where the microgrid also includes the thermal storage (see Chapter 5).

Figure 2: An example of a microgrid configuration

Figure 3: Configuration of the small-scale microgrid analyzed

The considered microgrid has no local power source and fully depends on the main distribution grid for the electrical energy supply. Its main components are the chiller plant, composed of two chillers, the chilled water circuit and the cooling load, while no storage element is present. Note that, despite the fact that we do not include the renewable power source in the microgrid, we still preserve the main ingredients of the problem (discrete, continuous and stochastic components, the latter due to the cooling load), which make it challenging to solve.

The objective is to operate the microgrid so as to best satisfy the cooling energy demand while minimizing the electrical energy costs. This goal is pursued through two joint actions: on the one hand, the cooling power request is appropriately split between the two chillers so as to optimize the performance of the chiller plant; on the other hand, the zone temperature set point is modified to some extent with respect to some reference profile so as to decrease the cooling power request and, hence, save energy. The maximal allowed variation of the set point represents a compromise between saving and discomfort, and is the result of an agreement between the grid operator and the users.

Figure 4: Energy saving obtained by incrementing the zone set point $\bar T_{ZA}$ = 20 °C by a fixed amount all day long, as a function of the set point increment.

The control scheme for the microgrid shown in Figure 5 introduces additional elements, that is:

• The low level controllers, namely the chilled water temperature controller and the thermostat, that regulate the temperature of the chilled water circuit and the temperature of the zone

• The energy management system to be designed.

The energy management system is composed of two blocks:

• The chiller plant optimizer that decides how the requested cooling power should be split between the chillers

• The temperature set point modulator that decides how to modify the zone temperature set point with respect to some given reference profile so as to decrease the cooling power request

As shown in Figure 4, which refers to this microgrid configuration, significant energy savings can be obtained with a limited variation of the set point, well within the comfort bounds set by the ISO norm on thermal comfort [12].

Let $[t_0, t_f]$ denote the look-ahead control horizon. The zone temperature set point $T_{ZA\,SP}$ is obtained by modifying a reference profile $\bar T_{ZA}$ by at most some (small) amount $\Delta_{max}$ and only a few times during $[t_0, t_f]$, for a maximum total discomfort level $d_{max}$ representing the integral over $[t_0, t_f]$ of the set point variations. This translates into the following equation:

\[ T_{ZA\,SP}(t) = \bar T_{ZA}(t) + \Delta_{ZA}(t), \]

where the control variable $\Delta_{ZA}$ of the set point modulator is subject to the following instantaneous and integral constraints:

\[ |\Delta_{ZA}(t)| \le \Delta_{max} \;\wedge\; \int_{t_0}^{t} |\Delta_{ZA}(\eta)|\, d\eta \le d_{max}, \quad t \in [t_0, t_f]. \]

Figure 5: Control scheme of the microgrid configuration in Figure 3

An additional state variable $d$ can then be introduced to account for the integral constraint on the discomfort:

\[ \dot d(t) = |\Delta_{ZA}(t)|, \quad d(t_0) = 0, \quad \text{subject to } d(t) \le d_{max}, \; t \in [t_0, t_f]. \]

The underlying implicit assumption here is that $T_{ZA\,SP}$ is representative of the actual behavior of $T_{ZA}$ (i.e., the lower-level controllers have been appropriately designed so as to guarantee a satisfactory tracking performance), and if $d_{max} = 0$, then no discomfort is introduced.

The chiller plant optimizer is assumed to satisfy the cooling power request $Q_{C\,SP}$, compatibly with the maximum cooling power $Q_{C,max}$ that the two chillers can jointly provide, i.e.:

\[ Q_C(t) = \Phi_{[0, Q_{C,max}]}(Q_{C\,SP}(t)), \]

where $\Phi_{[a,b]}(\cdot)$ is the saturation function

\[ \Phi_{[a,b]}(x) = \begin{cases} a, & x < a \\ x, & x \in [a, b] \\ b, & x > b. \end{cases} \]

Figure 6: A conceptual view of the control scheme in Figure 5

$Q_C$ is split between the two available chillers according to

\[ Q_{C,Ch1}(t) = (1 - \alpha(t))\, Q_C(t), \qquad Q_{C,Ch2}(t) = \alpha(t)\, Q_C(t), \]

where $\alpha \in [0, 1]$ is a scheduling parameter that denotes the fraction of cooling power $Q_C$ assigned to chiller 2, the remaining $1 - \alpha$ fraction being assigned to chiller 1. $Q_{C,Chi}$ denotes the cooling power requested from chiller $i$, satisfying $0 \le Q_{C,Chi} \le Q^{max}_{C,Chi}$. The maximum cooling power that the chiller plant can provide is thus given by:

\[ Q_{C,max} = Q^{max}_{C,Ch1} + Q^{max}_{C,Ch2}. \]
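As a minimal sketch of the saturation and splitting rules above, the two steps could be coded as follows. The function names and the numerical values in the usage note are illustrative assumptions, not taken from the thesis.

```python
def saturate(x, lo, hi):
    """Saturation function Phi_[lo,hi](x)."""
    return min(max(x, lo), hi)


def split_cooling_power(q_c_sp, alpha, q_max_ch1, q_max_ch2):
    """Saturate the request at Q_C,max and split it between the chillers.

    alpha is the fraction assigned to chiller 2; 1 - alpha goes to chiller 1.
    Returns (Q_C, Q_C,Ch1, Q_C,Ch2).
    """
    q_c = saturate(q_c_sp, 0.0, q_max_ch1 + q_max_ch2)
    return q_c, (1.0 - alpha) * q_c, alpha * q_c


def is_feasible_split(q_ch1, q_ch2, q_max_ch1, q_max_ch2):
    """Check the per-chiller bounds 0 <= Q_C,Chi <= Q^max_C,Chi."""
    return 0.0 <= q_ch1 <= q_max_ch1 and 0.0 <= q_ch2 <= q_max_ch2
```

For instance, with hypothetical capacities of 60 and 40 units, a request of 120 units is saturated at 100 units, and the choice alpha = 0.25 would ask 75 units of chiller 1, which violates its bound; the scheduler must therefore restrict alpha to values that keep both chillers within capacity.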

Note that the scheduling strategy adopted by the chiller plant optimizer affects only the power demand, which is given by the sum of the electric powers $P_{L,Ch1}$ and $P_{L,Ch2}$ required by the chillers to provide the cooling powers $Q_{C,Ch1}$ and $Q_{C,Ch2}$, and has no influence on the dynamics of the zone and chilled water circuit temperatures. These dynamics in fact depend on the overall cooling power $Q_C$ provided by the chiller plant and not on how it is split between the two chillers composing the plant. This is shown in the conceptual view of the control system in Figure 6, where the control variables $\Delta_{ZA}$ and $\alpha$ of the energy management system are also represented.

Before formulating the energy management problem as a constrained optimal control problem, we first describe the equations of the microgrid model.

2.3 Controlled system equations

2.3.1 Model of the Local Power Network

The local power network constitutes a junction point where power supply meets power demand, as shown in Figure 7, where the main distribution grid is considered as a special kind of generator.

Figure 7: Local Power Network

Based on the aggregated value of demand and supply power, power unbalance (and frequency deviation) is computed and the operational mode (grid status) of the local power network is then determined via a stability criterion.

We model the power network through two algebraic equations: a power balance equation and an equation relating frequency deviation to the computed power unbalance. The power balance equation is given by

\[ \Delta P(t) = \sum_{j=0}^{N} c_{L,j}(t) \cdot P_{L,j}(t) - \sum_{i=0}^{M} c_{G,i}(t) \cdot P_{G,i}(t), \tag{1} \]

where $P$ denotes power and $c$ is a flag that is 1 when a given device is connected to the network and 0 otherwise. The topology of a power grid is thus captured by the set of all flags. Subscript $G$ refers to supply side elements (generators) and subscript $L$ to demand side ones (loads). The constants $M$ and $N$ denote, respectively, the number of supply side elements (including the power grid) and of demand side elements.

Figure 8: Representation of the LPN as a hybrid automaton. (a) Static $\Delta P$-$\Delta f$ characteristic. (b) Discrete states of a local power network.

As for the algebraic equation relating frequency deviation to power unbalance, Figure 8a plots an example of a static characteristic. Based on the frequency deviation, the grid operation status is assessed in the stability criterion block. Figure 8b shows an example of a stability criterion block, given by a timed automaton with three discrete modes:

• Operational: frequency deviations are small and the power network is stable. In other words, the generators connected to the power network are able to satisfy the demand.

• Emergency: frequency deviations of medium magnitude arise. The generators are not able to satisfy the loads in the long term. More generators have to be connected or some loads have to be disconnected within some defined time, otherwise a transition to the failure mode will occur.

• Failure: the frequency deviation has exceeded a critical value and the power network is shut down. Generators and loads are not interconnected and some restoration time is required before the power network is able to become operative again.

2.3.2 Distribution Grid

This element represents the power supplied by the main distribution grid to the local power network of a microgrid. Whenever the microgrid is connected to the main distribution grid, a feedback control mechanism is active which makes the grid power $P_G$ compensate the power unbalance $\Delta P$ in (1) through the equation:

\[ \dot P_G = k_G \cdot \Delta P. \]

If the dynamical behavior of the main grid is neglected, then the microgrid power unbalance $\Delta P(t)$ can be set equal to zero at each time $t$, which results in the following equation:

\[ \sum_{i=0}^{M} c_{G,i}(t) \cdot P_{G,i}(t) = \sum_{j=0}^{N} c_{L,j}(t) \cdot P_{L,j}(t). \tag{2} \]

In our microgrid case study there is no electric generator in the microgrid and equation (2) becomes

\[ P_G(t) = P_{L,Ch1}(t) + P_{L,Ch2}(t), \tag{3} \]

where $P_{L,Chi}$ denotes the electric power requested by chiller $i$.

2.3.3 Chillers

Chillers are electrical devices that remove heat from a liquid via a vapor compression or absorption cycle. According to the static nonlinear Gordon-Ng model [10], the basic chiller equation has the following form:

\[ \frac{T_{CW}}{T_{OA}} \left(1 + \frac{1}{COP}\right) - a_4 = a_1 \frac{T_{CW}}{Q_C} + a_2 \frac{T_{OA} - T_{CW}}{T_{OA}\, Q_C} + a_3 \frac{Q_C}{T_{OA}} \left(1 + \frac{1}{COP}\right). \]

As for the parameters $a_j$, $j = 1, 2, 3, 4$, $a_1$ is the total internal entropy $\Delta S_T$, $a_2$ is the equivalent heat leak $Q_{leak,eqv}$ and $a_3$ is the heat exchanger thermal resistance $R$, whereas $a_4$ represents a bias term which is equal to 1 in the original Gordon-Ng model.

From the previous equation, using $COP = Q_{C,Ch} / P_{L,Ch}$, it follows that if the chiller is on, then $P_{L,Ch}$ depends on the outside ambient temperature $T_{OA}$, the CHWC temperature $T_{CW}$, and $Q_{C,Ch}$ according to:

\[ P_{L,Ch} = \frac{a_{i,1} T_{OA} T_{CW} + a_{i,2} (T_{OA} - T_{CW})}{T_{CW} - a_{i,3} Q_{C,Ch}} + \frac{a_{i,4} T_{OA} Q_{C,Ch}}{T_{CW} - a_{i,3} Q_{C,Ch}} - Q_{C,Ch}. \tag{4} \]

Obviously, if the chiller is off, then $P_{L,Ch} = 0$.
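The chiller power map (4) is a plain static function, so it can be sketched in a few lines. The function name, the choice of treating a non-positive cooling request as "chiller off", the use of absolute temperatures, and the parameter values in the usage note are assumptions made for illustration only.

```python
def chiller_power(q_c, t_oa, t_cw, a):
    """Electric power of one chiller according to equation (4).

    q_c  : cooling power requested from the chiller
    t_oa : outside ambient temperature (assumed absolute, in kelvin)
    t_cw : chilled water temperature (assumed absolute, in kelvin)
    a    : tuple (a1, a2, a3, a4) of Gordon-Ng parameters
    """
    if q_c <= 0.0:
        return 0.0  # assumption: no cooling request means the chiller is off
    a1, a2, a3, a4 = a
    denom = t_cw - a3 * q_c
    if denom <= 0.0:
        raise ValueError("model not valid: T_CW - a3 * Q_C must be positive")
    return (a1 * t_oa * t_cw + a2 * (t_oa - t_cw)) / denom \
        + a4 * t_oa * q_c / denom - q_c
```

With hypothetical parameters such as `a = (0.05, 5.0, 0.002, 1.0)`, the value returned for a given operating point can be plugged back into the Gordon-Ng equation through `COP = q_c / P` to check that both sides agree.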

2.3.4 Zone

The zone, which represents one or more buildings, is modeled as a single large room whose temperature is affected by three heating contributions:

1. The heat exchange by convection with the water in the pipelines of the chilled water circuit

2. The heat produced by its occupants

3. The heat exchange with the outside environment by conduction through the walls

Figure 9: Heat produced by a single person, as a function of the environment temperature.

The temperature $T_{ZA}$ of the zone evolves according to the following equation:

\[ C_{ZA} \frac{dT_{ZA}}{dt} = \underbrace{X(t)\, k_{cw} (T_{CW}(t) - T_{ZA}(t))}_{1} + \underbrace{Q_{People}(t)}_{2} + \underbrace{k_{out} (T_{OA}(t) - T_{ZA}(t))}_{3} \tag{5} \]

where:

• X stands for the opening of the valve which controls the inflow of air in the heat exchangers. It is limited between 0 and 1. A PI controller (the thermostat) regulates TZA and guarantees that it follows the desired temperature set point

• TCW stands for the chilled water temperature

• $Q_{People}$ is the heat produced by the occupants of the zone

• $k_{cw}$ and $k_{out}$ respectively represent the heat transfer coefficients of the heat exchangers and of the walls

• $T_{OA}$ stands for the outside temperature.

As for $Q_{People}$, it is given by the following expression:

\[ Q_{People}(t) = \left(a_1 T_{ZA}^2(t) + a_2 T_{ZA}(t) + a_3\right) n_P(t), \]

where $n_P$ is the number of occupants of the zone and the other factor represents the heat generated in the zone by a single person when the temperature is $T_{ZA}$. The quadratic expression is obtained by fitting data measured for different values of the environment temperature [1].

2.3.5 Occupants

Occupants constitute a main source of heating in densely occupied buildings, such as offices and shops, and, with the constant increase in building thermal insulation performance, they are becoming an even more important factor. Research is focusing on finding models or tools for the prediction of occupancy profiles, based either on statistical methods or on techniques typical of artificial intelligence such as neural networks or support vector machines [20, 23, 13]. The work of J. Page [17] provides a probabilistic formulation of the problem through the introduction of a Markov chain which gives the probability that a certain person is going to be present or not in the following time interval. The model can be trained and fitted to a single person, but it is hard to use for describing the overall occupancy of a building.

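In this thesis, we model the number of occupants $n_P$ in a building through a birth-death process with time-varying birth (arrival) and death (departure) rates. Assuming that at time $t_0$ the building is empty, we can define $n_P$ as follows:

\[ n_P(t) = \max\left(\Delta_{IN}[t_0, t] - \Delta_{OUT}[t_0, t],\, 0\right), \]

where $\Delta_{IN}[t_0, t]$ denotes the number of arrivals within $[t_0, t]$ and $\Delta_{OUT}[t_0, t]$ the number of departures within $[t_0, t]$. $\Delta_{IN}[t_0, t]$ and $\Delta_{OUT}[t_0, t]$ are independent Poisson processes with time-varying rates $\lambda_{IN}(\cdot)$ and $\lambda_{OUT}(\cdot)$, that is:

\[ \Pr(\Delta_{IN}[t_0, t] = k) = e^{-\int_{t_0}^{t} \lambda_{IN}(\eta)\, d\eta}\, \frac{\left(\int_{t_0}^{t} \lambda_{IN}(\eta)\, d\eta\right)^k}{k!}, \qquad \Pr(\Delta_{OUT}[t_0, t] = k) = e^{-\int_{t_0}^{t} \lambda_{OUT}(\eta)\, d\eta}\, \frac{\left(\int_{t_0}^{t} \lambda_{OUT}(\eta)\, d\eta\right)^k}{k!}. \]

Given that

\[ E\left[\Delta_{IN}[t_0, t] - \Delta_{OUT}[t_0, t]\right] = \int_{t_0}^{t} \lambda_{IN}(\eta)\, d\eta - \int_{t_0}^{t} \lambda_{OUT}(\eta)\, d\eta, \]

we can define the rates $\lambda_{IN}$ and $\lambda_{OUT}$ based on a nominal occupancy profile $\bar n_P(t)$, $t \in [t_0, t_f]$, so that both rates are nonnegative and the expected net number of arrivals follows $\bar n_P(t)$:

\[ \lambda_{IN}(t) = \begin{cases} \dot{\bar n}_P(t), & \dot{\bar n}_P(t) > 0 \\ 0, & \dot{\bar n}_P(t) \le 0 \end{cases} \qquad \lambda_{OUT}(t) = \begin{cases} -\dot{\bar n}_P(t), & \dot{\bar n}_P(t) < 0 \\ 0, & \dot{\bar n}_P(t) \ge 0. \end{cases} \]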
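The occupancy model above can be simulated with a short sketch. It assumes that the nominal profile is sampled on a uniform time grid and that the rates are piecewise constant between samples; the function names and the Knuth-style Poisson sampler are illustrative choices, not part of the thesis.

```python
import math
import random


def occupancy_rates(nominal, dt):
    """Arrival/departure rates from the slope of the sampled nominal profile."""
    lam_in, lam_out = [], []
    for prev, nxt in zip(nominal, nominal[1:]):
        slope = (nxt - prev) / dt
        lam_in.append(max(slope, 0.0))    # positive slope: arrivals
        lam_out.append(max(-slope, 0.0))  # negative slope: departures
    return lam_in, lam_out


def poisson_sample(mean, rng):
    """Knuth's multiplication method; adequate for the small means used here."""
    limit, count, prod = math.exp(-mean), 0, rng.random()
    while prod > limit:
        count += 1
        prod *= rng.random()
    return count


def sample_occupancy(nominal, dt, rng):
    """One realization of n_P at the grid points, starting from an empty building."""
    lam_in, lam_out = occupancy_rates(nominal, dt)
    arrivals = departures = 0
    path = [0]
    for l_in, l_out in zip(lam_in, lam_out):
        arrivals += poisson_sample(l_in * dt, rng)
        departures += poisson_sample(l_out * dt, rng)
        path.append(max(arrivals - departures, 0))
    return path
```

Calling `sample_occupancy([0, 10, 10, 4], 1.0, random.Random(0))` returns one random profile; repeating the call with the same generator produces further independent realizations around the nominal one.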

Figure 10a plots some possible realizations of $n_P$ given the nominal occupancy profile shown in Figure 10b.

Figure 10: (a) Some realizations of the occupancy profiles. (b) Nominal occupancy profile.

2.3.6 The outside temperature

The outside temperature $T_{OA}$ is assumed to be given by some accurate forecast and treated as a deterministic signal. Indeed, if the insulation level of the building is high, fluctuations around the forecast value have a limited impact and the effect of the internal heat gain is dominant.

2.3.7 Chilled water circuit

The chilled water circuit (CHWC) is modeled as a volume of water, with a certain thermal capacity, whose temperature is influenced by heat exchanges with the building and the chillers:

\[ C_{CW} \frac{dT_{CW}}{dt} = Q_{ext}(t) - Q_C(t), \tag{6} \]

where $Q_{ext}(t)$ is the cooling power exchanged with the zone at temperature $T_{ZA}$,

\[ Q_{ext}(t) = X(t)\, k_{cw} (T_{ZA}(t) - T_{CW}(t)), \tag{7} \]

and $Q_C(t)$ is the cooling power provided by the chiller plant. Heat losses in the circuit are neglected.

$T_{CW}$ is kept at some set point value $T_{CW\,SP}$ by the joint action of a proportional controller and a load disturbance compensator. The resulting cooling power request $Q_{C\,SP}$ is given by:

\[ Q_{C\,SP}(t) = \Phi_{[0, Q_{C,max}]}\left(k_{P,CW} \cdot (T_{CW}(t) - T_{CW\,SP}(t)) + Q_{ext}(t)\right), \]

where $k_{P,CW}$ is the proportional gain and $\Phi_{[a,b]}(\cdot)$ is the saturation function. Thus, if the chiller plant can satisfy the cooling power request (i.e. $Q_C = Q_{C\,SP}$), by plugging $Q_{C\,SP}$ into equation (6), we have that $T_{CW}$ is governed by:

\[ C_{CW} \frac{dT_{CW}}{dt} = -k_{P,CW} (T_{CW}(t) - T_{CW\,SP}(t)). \]
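The closed-loop behavior of the chilled water temperature can be illustrated with a forward Euler sketch of equation (6) under the proportional controller with load compensation. The constant exchanged power, the explicit Euler scheme and all numerical values are assumptions chosen only to make the example concrete.

```python
def simulate_chwc(t_cw0, t_cw_sp, k_p, c_cw, q_ext, q_c_max, dt, n_steps):
    """Forward Euler simulation of C_CW dT_CW/dt = Q_ext - Q_C.

    Q_C is the saturated request k_p * (T_CW - T_CW_SP) + Q_ext;
    q_ext is kept constant here for simplicity (an assumption).
    """
    t_cw = t_cw0
    trajectory = [t_cw]
    for _ in range(n_steps):
        q_c = min(max(k_p * (t_cw - t_cw_sp) + q_ext, 0.0), q_c_max)
        t_cw += dt * (q_ext - q_c) / c_cw
        trajectory.append(t_cw)
    return trajectory
```

As long as the request is not saturated, the tracking error shrinks by the factor `1 - dt * k_p / c_cw` at every step, which is the discrete counterpart of the first-order dynamics derived above; when the circuit starts at the set point, the temperature does not move.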

2.4 Stochastic optimal control problem with constraints

In this section, we define the energy management problem for the controlled system described in the previous section.

This is a stochastic hybrid system with three continuous state variables (the CHWC temperature, the zone temperature, and the state variable of the PI controller regulating the temperature of the zone) and two discrete state variables (the number of occupants in the zone and the on/off status of the chillers). The stochastic inputs acting on the system are given by the Poisson processes determining the evolution of the birth-death process modeling the number of occupants $n_P$. The discrete state component describing the local power network mode is neglected, since we assume that the distribution grid can (instantaneously) compensate its power unbalance.

The on/off status of the chillers will be modeled implicitly, through the scheduling parameter α (when α = 0, then chiller 2 is off and, when α = 1, then chiller 1 is off, whereas for all other cases both chillers are on).

Let us denote by $s$ the state of the stochastic hybrid system, including the discomfort variable $d$, and by $\mathcal{S}$ the state space. Consider the state-feedback control policy

\[ \pi : \mathcal{S} \times [t_0, t_f] \to [0, 1] \times [-\Delta_{max}, \Delta_{max}] \]

that maps $s \in \mathcal{S}$ and $t \in [t_0, t_f]$ into an appropriate value for the scheduling parameter $\alpha$ and the set point control variable $\Delta_{ZA} \in [-\Delta_{max}, \Delta_{max}]$ to be applied at time $t$ when the state value is equal to $s$. The objective is to determine an optimal policy $\pi^\star$ that minimizes the energy cost spent over the time horizon $[t_0, t_f]$, while not exceeding the maximum discomfort level $d_{max}$. In formulas:

\[ \min_{\pi} E^{\pi}_{s_0}\left[\int_{t_0}^{t_f} c_G(t)\, P_G(t)\, dt\right] \tag{8} \]
\[ \text{subject to: } d(t) \le d_{max}, \; \forall t \in [t_0, t_f], \]

where $P_G(t)$ denotes the power requested from the main distribution grid and $c_G(t)$ the price per unit of requested power, at time $t \in [t_0, t_f]$. Here, $s_0$ denotes the state value at time $t_0$ and $E^{\pi}_{s_0}$ denotes the expected value when the initial state is $s_0$ and the control policy $\pi$ is applied, since different initial state values and/or control policies induce different probability distributions on the system trajectories and, as a consequence, over the realizations of $P_G(t)$.

In view of the decoupling between the chiller optimization and the rest of the system dynamics, the solution to problem (8) can be structured into two phases:

1. Design an optimal policy $\pi_{\alpha} : \mathcal{S} \times [t_0, t_f] \to [0, 1]$ for the scheduling of the chillers, and, based on the obtained policy,

2. Design an optimal policy $\pi_{\Delta_{ZA}} : \mathcal{S} \times [t_0, t_f] \to [-\Delta_{max}, \Delta_{max}]$ for discomfort modulation.

It is worth noticing that the solution to the second phase depends on the designed map $\pi_{\alpha}$, which affects the power requested from the distribution grid. Policy $\pi^\star$ is then given by

\[ \pi^\star = (\pi_{\alpha}^\star, \pi_{\Delta_{ZA}}^\star) : \mathcal{S} \times [t_0, t_f] \to [0, 1] \times [-\Delta_{max}, \Delta_{max}], \]

where $\pi_{\alpha}^\star$ and $\pi_{\Delta_{ZA}}^\star$ denote the optimal policies obtained in the two phases.

We next address the solution to both phases and show that phase 1 reduces to a static optimization problem, whereas phase 2 can be reduced to a Dynamic Programming (DP) optimization problem for a discrete time stochastic hybrid system.

2.5 Chiller plant optimization

The scheduling parameter $\alpha$ has no influence on the evolution of the controlled system state variables and on the cooling power $Q_C$, but only affects the power request $P_G$ to the main grid. Since $P_G$ is given by equation (3), where $P_{L,Ch1}$ and $P_{L,Ch2}$ are either zero (when the chiller is off) or static functions of $\alpha$, $Q_C$, $T_{OA}$ and $T_{CW}$ (see equation (4)), the optimal map $\pi_{\alpha}^\star : \mathcal{S} \times [t_0, t_f] \to [0, 1]$ can be obtained by solving a nonlinear static optimization problem where $P_G$ is minimized with respect to $\alpha$ for each given set of values for $Q_C$, $T_{OA}$ and $T_{CW}$. Indeed, the resulting function $\alpha^\star(Q_C, T_{OA}, T_{CW})$ can be rewritten as a function of the state $s \in \mathcal{S}$, since $T_{OA}$ and $T_{CW}$ are state variables and $Q_C$ is a static function of $T_{CW}$, $T_{ZA}$ and $X$.
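One simple way to approximate the static optimization described above is an exhaustive search over a grid of values of the scheduling parameter. The sketch below is only an illustration: the grid search, the function names and the Gordon-Ng parameters used in the usage note are assumptions, and the thesis does not state which solver was actually used.

```python
def chiller_power(q_c, t_oa, t_cw, a):
    """Equation (4); a chiller with no cooling request is treated as off."""
    if q_c <= 0.0:
        return 0.0
    a1, a2, a3, a4 = a
    denom = t_cw - a3 * q_c
    return (a1 * t_oa * t_cw + a2 * (t_oa - t_cw) + a4 * t_oa * q_c) / denom - q_c


def optimal_alpha(q_c, t_oa, t_cw, a_ch1, a_ch2, q_max_ch1, q_max_ch2, n_grid=101):
    """Grid search for the alpha in [0, 1] minimizing P_G = P_L,Ch1 + P_L,Ch2."""
    best_alpha, best_power = None, float("inf")
    for i in range(n_grid):
        alpha = i / (n_grid - 1)
        q1, q2 = (1.0 - alpha) * q_c, alpha * q_c
        if q1 > q_max_ch1 or q2 > q_max_ch2:
            continue  # split violates a chiller capacity bound
        p_g = chiller_power(q1, t_oa, t_cw, a_ch1) + chiller_power(q2, t_oa, t_cw, a_ch2)
        if p_g < best_power:
            best_alpha, best_power = alpha, p_g
    if best_alpha is None:
        raise ValueError("no feasible split on the chosen grid")
    return best_alpha, best_power
```

For example, if chiller 1 has a non-zero fixed loss term (hypothetical `a_ch1 = (0.05, 0.0, 0.0, 1.0)`) while chiller 2 has none (`a_ch2 = (0.0, 0.0, 0.0, 1.0)`), and the request fits within the capacity of chiller 2, the search selects `alpha = 1.0`, that is, it switches chiller 1 off. Tabulating this function over a grid of `(q_c, t_oa, t_cw)` values gives an approximation of the optimal map.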

2.6 Temperature set point optimization

The optimal policy $\pi_{\Delta_{ZA}} : \mathcal{S} \times [t_0, t_f] \to [-\Delta_{max}, \Delta_{max}]$ for the set point modulator can be determined by solving the following constrained optimization problem:

\[ \min_{\pi_{\Delta_{ZA}}} E^{\pi_{\Delta_{ZA}}}_{s_0}\left[\int_{t_0}^{t_f} c_G(t)\, P_G^\star(t)\, dt\right] \tag{9} \]
\[ \text{subject to: } d(t) \le d_{max}, \; \forall t \in [t_0, t_f], \]

where $P_G^\star(t)$ is the power demand when the optimal scheduling policy $\pi_{\alpha}^\star$ is used.

If we assume that the set point is not modulated continuously but is changed every $\tau = \frac{t_f - t_0}{M}$ time units, and that the control variable $\Delta_{ZA}$ takes values only in a finite set $\mathcal{U}$, then problem (9) can be rephrased as a finite-horizon control problem for a discrete time stochastic system with a discrete control input set. The discrete time stochastic system executions are obtained by sampling the executions of the original continuous time system, with the understanding that the control input is held constant over each time frame $[\tau_k, \tau_{k+1})$, where we set $\tau_k := t_0 + k\tau$ for ease of notation.

Let $u_k := \Delta_{ZA}(\tau_k)$ and $x_k := s(\tau_k)$ respectively denote the control and state variables at the discrete time instant $k$, and $w_k$ the stochastic inputs affecting the system evolution in the time frame $[\tau_k, \tau_{k+1})$. Observe that the sampled version $z_k := d(\tau_k)$ of the discomfort variable is a discrete state variable evolving according to

\[ z_{k+1} = z_k + |u_k|\, \tau, \quad z_0 = 0. \tag{10} \]

Based on the introduced notation, problem (9) can be rewritten as follows:

\[ \min_{\nu} E^{\nu}_{s_0}\left[\sum_{k=0}^{M-1} c_k(x_k, u_k, w_k)\right] \tag{11} \]
\[ \text{subject to: } z_k \le d_{max}, \; \forall k \in [0, M], \]

where $\nu = (\nu_0, \nu_1, \dots, \nu_{M-1})$ with $\nu_k(\cdot) = \pi_{\Delta_{ZA}}(\cdot, \tau_k) : \mathcal{S} \to \mathcal{U}$ denotes the discrete time policy corresponding to the piecewise constant continuous time policy $\pi_{\Delta_{ZA}}$, and

\[ c_k(x, u, w) = \int_{\tau_k}^{\tau_{k+1}} c_G(t)\, P_G^\star(t)\, dt \]

is the one-step cost representing the energy cost spent in $[\tau_k, \tau_{k+1})$ when $s(\tau_k) = x$, $\Delta_{ZA}(\tau_k) = u$, and the stochastic inputs within $[\tau_k, \tau_{k+1})$ are given by $w$.

Problem (11) can be solved, in principle, through the DP approach. Here, we shall refer to the Q-iteration method.

Let us define the Q-functions $Q_k : \mathcal{S} \times \mathcal{U} \to \mathbb{R}_+$, $k = 0, 1, \dots, M-1$, where $Q_k(x, u)$ represents the expected cost incurred over the time window $[k, M)$, when the state at time $\tau_k$ is $x$, the control input at time $\tau_k$ is set equal to $u$ and maintained constant for the next $\tau$ minutes, and an optimal modulation policy is applied from time $\tau_{k+1}$ onwards. The Q-functions can be computed according to the following backward iterative procedure:

\[ Q_k(x, u) = E_{w_k}\left[c_k(x, u, w_k) + \min_{u' \in \mathcal{U}(z')} Q_{k+1}((\tilde x', z'), u')\right], \quad \forall x = (\tilde x, z) \in \mathcal{S},\; u \in \mathcal{U}(z), \tag{12} \]

for $k = 0, 1, \dots, M-2$, initialized at $k = M-1$ with

\[ Q_{M-1}(x, u) = E_{w_{M-1}}\left[c_{M-1}(x, u, w_{M-1})\right]. \tag{13} \]

In equation (12), $x \in \mathcal{S}$ is the state at time $\tau_k$ and $x' = (\tilde x', z') \in \mathcal{S}$ denotes the value taken by the next state (i.e., the state at time $\tau_{k+1}$) when the control input at time $\tau_k$ is set equal to $u$ for the next $\tau$ minutes, $z'$ being the value of the discomfort variable. $E_{w_k}[\cdot]$ denotes the expectation with respect to the stochastic inputs that are responsible for the probabilistic evolution of the system in the $k$-th time frame $[\tau_k, \tau_{k+1})$, and $\mathcal{U}(z')$ represents the set of admissible values for the control input $u$ when the value of the discomfort variable is $z'$, given that the discomfort cannot exceed $d_{max}$, $\forall k \in [0, M]$.

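For instance, if $\mathcal{U} = \left\{0, \frac{\Delta_{max}}{2}, \Delta_{max}\right\}$, then

\[ \mathcal{U}(z') = \begin{cases} \mathcal{U}, & d_{max} - z' \ge \Delta_{max}\tau \\ \left\{0, \frac{\Delta_{max}}{2}\right\}, & \frac{\Delta_{max}}{2}\tau \le d_{max} - z' < \Delta_{max}\tau \\ \{0\}, & d_{max} - z' < \frac{\Delta_{max}}{2}\tau. \end{cases} \]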
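The admissible control set of the example above amounts to keeping the controls whose discomfort increment still fits in the remaining budget. A minimal sketch, with an illustrative function name and a small tolerance added against rounding (both assumptions), could be:

```python
def admissible_controls(z, d_max, tau, controls):
    """Controls u whose discomfort increment |u| * tau fits in d_max - z.

    It is assumed that z <= d_max, which holds by construction when only
    admissible controls have been applied so far.
    """
    budget = d_max - z
    return [u for u in controls if abs(u) * tau <= budget + 1e-12]
```

With `controls = [0.0, 0.5, 1.0]`, `tau = 1.0` and `d_max = 2.0` (hypothetical values with the maximum set point variation equal to 1), the three branches of the case distinction are recovered for `z` equal to 0.5, 1.25 and 1.75 respectively.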

Based on the Q-functions, the optimal policy $\nu_k^\star$ for set point modulation can be expressed as

\[ \nu_k^\star(x) \in \arg\min_{u \in \mathcal{U}(z)} Q_k(x, u), \quad x = (\tilde x, z) \in \mathcal{S}. \]

Unfortunately, computing an exact solution to the DP equations (12) and (13) is impracticable, mainly because the Q-iteration involves computing the expected value $E_{w_k}[\cdot]$ with respect to the stochastic inputs affecting the system dynamics within $[\tau_k, \tau_{k+1})$. Moreover, the state has continuous components, and numerical computations require the Q-functions to be finitely parametrized. One then has to resort to numerical solutions based either on state space gridding or on a finite parametrization of the Q-functions (see, e.g., [3, 19]).

In the next chapter we propose two approaches to the approximate solution of the DP equations: the first one based on a certainty equivalence approach and the second one based on a controlled Markov chain abstraction of the system. These two solutions will then be compared on a numerical instance of our case study (see Chapter 4).

3 Approximate dynamic programming solutions based on abstraction and simulation

In Chapter 2, it has been shown how the optimal energy management problem for the considered microgrid can be decomposed into two phases: the chiller plant optimization, and the optimal zone temperature set point modulation. The latter phase involves solving a finite-horizon optimal control problem for a discrete time stochastic hybrid system, which can be addressed through dynamic programming. The goal of this chapter is to present two approaches to the approximate solution of the resulting dynamic programming equations.

The first approximate dynamic programming approach is based on the certainty equivalence principle and calculates the optimal zone temperature modulation policy by neglecting the stochastic uncertainty affecting the microgrid. The second approach takes this uncertainty into account, and rests on the approximation of the system through a controlled Markov chain and on the solution of the corresponding control problem for such a Markov chain abstraction.

3.1 ADP solution based on the certainty equivalence approach

The idea of the certainty equivalence approach is to replace the stochastic inputs to the system with their nominal (deterministic) component, thus neglecting the uncertainty affecting the system evolution, and then compute the optimal policy $\bar\nu_k : \mathcal{S} \to \mathcal{U}$, $k = 0, 1, \dots, M-1$, for the so-obtained deterministic system. This way, the computation of the expected values in equations (12) and (13) is avoided and the Q-iteration algorithm is reformulated as follows:

\[ \bar Q_k(x, u) = c_k(x, u, \bar w_k) + \min_{u' \in \mathcal{U}(z')} \bar Q_{k+1}((\tilde x', z'), u'), \quad k = 0, 1, \dots, M-2, \tag{14} \]
\[ \bar Q_{M-1}(x, u) = c_{M-1}(x, u, \bar w_{M-1}), \]

where $x' = (\tilde x', z')$ is the state at $k+1$ when inputs $u$ and $\bar w_k$ are applied from state $x$ at the discrete time instant $k$. Based on the so-obtained $\bar Q$-functions, we can compute the temperature set point modulation policy

\[ \bar\nu_k^\star(x) \in \arg\min_{u \in \mathcal{U}(z)} \bar Q_k(x, u), \]

which is optimal for the nominal system, but sub-optimal for the original stochastic system.
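The backward recursion (14) can be sketched on a finite state set as follows. The interface (`controls_for`, `step`, `cost`) is hypothetical, and the toy problem in the usage note, whose only state is the discomfort variable measured in integer units, is an assumption used to keep the example small; it is not the microgrid model.

```python
def ce_q_iteration(states, controls_for, step, cost, horizon):
    """Deterministic (certainty equivalence) backward Q-iteration.

    states          : finite list of hashable states
    controls_for(x) : admissible controls in state x
    step(k, x, u)   : next state under the nominal input (must be in states)
    cost(k, x, u)   : one-step cost under the nominal input
    Returns (q, policy) with q[k][(x, u)] and policy[k][x].
    """
    q = [dict() for _ in range(horizon)]
    policy = [dict() for _ in range(horizon)]
    for k in reversed(range(horizon)):
        for x in states:
            for u in controls_for(x):
                value = cost(k, x, u)
                if k < horizon - 1:
                    x_next = step(k, x, u)
                    # cost-to-go: best admissible control at the next state
                    value += min(q[k + 1][(x_next, u2)] for u2 in controls_for(x_next))
                q[k][(x, u)] = value
            policy[k][x] = min(controls_for(x), key=lambda u: q[k][(x, u)])
    return q, policy


# Toy usage: one unit of discomfort budget, hypothetical prices per step.
prices = [1.0, 3.0, 2.0]
q, policy = ce_q_iteration(
    states=[0, 1],
    controls_for=lambda z: [u for u in (0, 1) if z + u <= 1],
    step=lambda k, z, u: z + u,
    cost=lambda k, z, u: prices[k] * (10.0 - 4.0 * u),
    horizon=3,
)
```

In this toy instance the resulting policy spends the single unit of discomfort at the step with the highest price, which is the qualitative behavior one would expect from the set point modulator. In the microgrid problem the states would be the grid points of the partitioned state space and `cost` would be obtained by simulating the system over one time frame.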

The actual performance of the policy $\bar\nu^\star$ can be evaluated by estimating the average cost

\[ E^{\bar\nu^\star}_{s_0}\left[\sum_{k=0}^{M-1} c_k(x_k, u_k, w_k)\right] \]

through Monte Carlo simulations. This involves running $n$ simulations of the system fed by $n$ independently extracted realizations of the stochastic input $w^{(i)}$, $i = 1, 2, \dots, n$, and computing the empirical mean

\[ \frac{1}{n} \sum_{i=1}^{n} \sum_{k=0}^{M-1} c_k\left(x_k^{(i)}, \bar\nu_k^\star(x_k^{(i)}), w_k^{(i)}\right), \]

where $x^{(i)}$ is the state realization associated with realization $w^{(i)}$ when policy $\bar\nu^\star$ is applied.
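The empirical mean above can be computed with a generic simulation loop. The sketch below assumes that the system is available through hypothetical callables `policy`, `step`, `cost` and `sample_inputs`; none of these names comes from the thesis.

```python
import random


def monte_carlo_cost(policy, step, cost, sample_inputs, x0, horizon, n_runs, seed=0):
    """Empirical mean of the total cost of a policy over n_runs simulations.

    policy(k, x)       : control applied at step k in state x
    step(k, x, u, w)   : next state given control u and stochastic input w
    cost(k, x, u, w)   : one-step cost
    sample_inputs(rng) : one realization (w_0, ..., w_{horizon-1})
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_runs):
        x = x0
        w = sample_inputs(rng)
        for k in range(horizon):
            u = policy(k, x)
            total += cost(k, x, u, w[k])
            x = step(k, x, u, w[k])
    return total / n_runs
```

The same routine can be used to compare different policies on a common set of realizations, provided the same seed is passed in each call.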

Note that the state $x = (\tilde x, z)$ of the system has two components: a discrete state component $z$ that represents the discomfort variable and a continuous state component $\tilde x$, which comprises the zone temperature $T_{ZA}$, the thermostat control variable $X$, and the chilled water temperature $T_{CW}$. In order to solve equations (14) numerically, we shall then partition the continuous state space and take a grid point for each element of the partition. The recursion is then computed over the chosen grid points, taking the $\bar Q$-function computed at the earlier step as constant over each element of the partition. Each iteration involves computing the one-step cost, which is done by simulating the system over the corresponding time frame.

Regarding the continuous state component $\tilde x$, it is worth noticing that if the heating power transferred from the zone to the chilled water circuit is always perfectly compensated by the cooling power provided by the chillers, then, based on equation (6) governing the dynamics of the CHWC temperature, it is possible to take $T_{CW}(t) = T_{CW\,SP}$, $\forall t$. This entails that $T_{CW}$ is constant and can actually be removed from the set of continuous state variables, which is then composed of $T_{ZA}$ and $X$. This simplification helps reduce the size of the grid and holds under the assumption that the cooling power request does not exceed $Q_{C,max}$.

3.2 ADP solution based on a controlled Markov chain abstraction

In this section, we describe an ADP approach to solve the DP equations in (12) and (13) based on a Markov chain abstraction of the discrete time stochastic hybrid system modeling the microgrid. We start by describing the controlled Markov chain and then the costs associated with its (controlled) transitions. This will then lead to the reformulation of the DP equations on the Markov chain abstraction, and to the computation of the corresponding zone temperature set point modulation policy.

Figure 11: Number of occupants needed to saturate the chillers as a function of $T_{ZA\,SP}$ when $T_{OA}$ = 32 °C.

3.2.1 Definition of the controlled Markov chain abstraction

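A discrete time controlled Markov chain is defined by a triple $\{\mathcal{Q}, \mathcal{A}, p\}$, where $\mathcal{Q}$ is the state set, $\mathcal{A}$ the control set, and $p : \mathcal{Q} \times \mathcal{A} \times \mathcal{Q} \times \mathbb{N} \to [0, 1]$ is the controlled transition probability function. Specifically, $p(q, a, q', k)$ is the probability that a transition to $q' \in \mathcal{Q}$ occurs at time $k \in \mathbb{N}$ when the control input $a \in \mathcal{A}$ is applied from state $q \in \mathcal{Q}$.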

State and control sets

The control input set of the Markov chain is the same as that of the original hybrid model, i.e. $\mathcal{A} = \mathcal{U}$.

The state $q$ of the Markov chain accounts only for the state variables $T_{ZA}$, $n_P$, and $d$ at the sample times $\tau_k$, $k = 0, 1, \dots, M$, since $T_{CW}$ is assumed to be constant and equal to $T_{CW\,SP}$ as in the certainty equivalence approach.

As for the zone temperature $T_{ZA}$, if the chiller plant is able to provide the cooling power request $Q_{C\,SP}$ for the admissible cooling loads, then the control system is able, in stationary conditions, to make $T_{ZA}$ track its set point value $T_{ZA\,SP}$. If $\bar T_{ZA}$ is constant, the set of possible values of $T_{ZA}$ at each sample time $\tau_k$ is therefore given by $\{\bar T_{ZA} + \Delta_{ZA} : \Delta_{ZA} \in \mathcal{U}\}$. Figure 11 shows a numerical example (actually the one of the numerical case study) where the number of occupants needed to saturate the chiller plant has been evaluated as a function of the zone temperature set point $T_{ZA\,SP}$ when the outside ambient temperature is $T_{OA}$ = 32 °C.

As for the discomfort variable $d$, it takes values in some finite set as determined by the equation (10) governing its sampled version and by the upper bound $d_{max}$.

As for the number of occupants $n_P$, we describe in the next paragraph how we confine its range through the notion of $\varepsilon$-coverage occupancy tube.

The $\varepsilon$-coverage occupancy tube. The problem of determining the values taken by the component of the Markov chain state $q$ corresponding to the number of occupants $n_P$ at the sample times $\tau_k$, $k = 0, 1, \dots, M$, is reformulated as the problem of determining an $\varepsilon$-coverage tube containing all possible occupancy profiles along $[t_0, t_f]$, except for a set whose probability is smaller than some $\varepsilon \in (0, 1)$. This leads to the following chance-constrained problem:

\[ \min_{h_{low,k} \ge 0,\; h_{up,k} \ge 0,\; k = 1, 2, \dots, M} \; \sum_{k=1}^{M} \left(h_{low,k} + h_{up,k}\right) \tag{16} \]
\[ \text{subject to: } \Pr\left\{-h_{low,k} \le n_P(\tau_k) - E[n_P(\tau_k)] \le h_{up,k},\; \forall k\right\} \ge 1 - \varepsilon, \]

which can be solved through the scenario approach (see Section 7.1).

The scenario solution rests on the extraction of $N$ profiles $n_P^{(i)}(t)$, $t \in [t_0, t_f]$, $i = 1, 2, \dots, N$, and on the solution of the following convex optimization problem:

\[ \min_{h_{low,k} \ge 0,\; h_{up,k} \ge 0,\; k = 1, 2, \dots, M} \; \sum_{k=1}^{M} \left(h_{low,k} + h_{up,k}\right) \tag{17} \]
\[ \text{subject to: } -h_{low,k} \le n_P^{(i)}(\tau_k) - E[n_P(\tau_k)] \le h_{up,k},\; \forall k,\; i = 1, \dots, N, \]

where the constraint in probability is replaced by its sample version. In Figure 12 we plot an example of an $\varepsilon$-coverage tube where, by interpolating the points where the number of occupants is the highest at each sample time $\tau_k$, we obtain the upper profile $n_{P,max}$, and, similarly, by interpolating the points where the number of occupants is the smallest at each sample time $\tau_k$, we obtain the lower profile $n_{P,min}$.

Transition probability function

Dealing with a controlled Markov chain, the probability $p(q, u, q', k)$ that the Markov chain evolves from $q = (T_{ZA}, n_P, d)$ at time $k$ to $q' = (T'_{ZA}, n'_P, d')$ at time $k+1$ clearly depends on the control action $u \in U$ applied at $\tau_k$, given that $T'_{ZA}$ and $d'$ must satisfy $T'_{ZA} = \bar{T}_{ZA} + u$ and $d' = d + |u|\tau$. This means that $p((T_{ZA}, n_P, d), u, (T'_{ZA}, n'_P, d'), k) = 0$ if either $T'_{ZA} \neq \bar{T}_{ZA} + u$ or $d' \neq d + |u|\tau$. In the case when $T'_{ZA} = \bar{T}_{ZA} + u$ and $d' = d + |u|\tau$, instead, $p((T_{ZA}, n_P, d), u, (T'_{ZA}, n'_P, d'), k)$ is the probability of having $|n'_P - n_P|$ arrivals or departures (depending on the sign of $\Delta_P = n'_P - n_P$) in the interval $[\tau_k, \tau_{k+1}]$ for the birth-death process describing the occupancy profile. To have $p((T_{ZA}, n_P, d), u, (T'_{ZA}, n'_P, d'), k)$ well defined as a probability, i.e., summing up to 1 when $n'_P$ ranges within the ε-coverage tube, we assign to the extreme values admissible for $n'_P$ at time $\tau_{k+1}$ the probability associated to all arrivals/departures $\Delta_P$ within $[\tau_k, \tau_{k+1}]$ that would make $n_P + \Delta_P$ exit the ε-coverage tube.

Figure 12: An example of ε-coverage occupancy tube computed through the scenario approach
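The reassignment of probability mass to the extreme values of the tube can be sketched in Python as follows. This is only an illustration: the function name is hypothetical, and the distribution of the net change $\Delta_P$ over one interval is assumed to be available as a dictionary (in the thesis it would come from the birth-death process, which is not reproduced here).

```python
def clamp_to_tube(n_p, delta_pmf, n_min_next, n_max_next):
    """Occupancy transition probabilities restricted to the tube.

    n_p:        current number of occupants.
    delta_pmf:  dict mapping a net change Delta_P to its probability
                (assumed input; any distribution summing to 1).
    n_min_next, n_max_next: integer tube bounds at the next sample time.
    Returns a dict mapping each admissible n_P' to its probability.
    """
    if n_min_next > n_max_next:
        raise ValueError("empty tube at the next sample time")

    probs = {n: 0.0 for n in range(n_min_next, n_max_next + 1)}
    for delta, prob in delta_pmf.items():
        target = n_p + delta
        # Changes that would leave the tube are folded onto the nearest
        # extreme value, so that the probabilities still sum to 1.
        target = min(max(target, n_min_next), n_max_next)
        probs[target] += prob
    return probs


# Illustrative numbers only.
delta_pmf = {-2: 0.1, -1: 0.2, 0: 0.4, 1: 0.2, 2: 0.1}
probs = clamp_to_tube(3, delta_pmf, 2, 4)
```

In this example the mass of the changes $\Delta_P = -2$ and $\Delta_P = 2$, which would exit the tube $[2, 4]$, is added to the probabilities of the extreme values 2 and 4, respectively.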

3.2.2 Definition of the transition costs

Given the controlled Markov chain abstraction of the discrete time stochastic hybrid system modeling the microgrid, we need to associate to each admissible transition from state $q_k$ to $q_{k+1}$, when the control input $u_k$ is applied at time $k$, a cost $\hat{c}_k(q_k, u_k, q_{k+1})$ representing the energy consumption for that transition.

Admissible successor states $q_{k+1}$ of $q_k = (T_{ZA}, n_P, d)$ when the control input $u_k = \Delta_{ZA}$ is applied at time $k$ take the form $q_{k+1} = (T_{ZA} + \Delta_{ZA}, n'_P, d + |\Delta_{ZA}|\tau)$, and the corresponding transition cost can be determined by simulating the original system behavior within the time interval $[\tau_k, \tau_{k+1} = \tau_k + \tau)$. This requires appropriately initializing the system at time $\tau_k$, setting the zone temperature set point equal to $T_{ZA\,SP}(t) = \bar{T}_{ZA} + u_k$, $t \in [\tau_k, \tau_{k+1})$, feeding the system with the outside ambient temperature profile $T_{OA}(t)$, $t \in [\tau_k, \tau_{k+1})$, and with a suitable profile for the arrivals/departures bringing the number of occupants from $n_P$ at time $\tau_k$ to $n'_P$ at time $\tau_{k+1}$, and, finally, evaluating the energy consumption during the time window $[\tau_k, \tau_{k+1})$.

Figure 13: Different possible approximating profiles

As for the initialization of the system at time $\tau_k$, the zone temperature, the number of occupants, and the discomfort variable are set equal to the corresponding values in $q_k = (T_{ZA}, n_P, d)$. As for the other two state variables of the original microgrid system, i.e., the CHWC temperature and the thermostat variable, $T_{CW}$ is set equal to $T_{CW\,SP}$, whereas the thermostat variable $X$ is set equal to the value obtained by considering equation (5), describing the dynamics of the zone temperature, at the equilibrium:

$$
0 = X k_{cw} (T_{CW} - T_{ZA}) + Q_{People} + k_{out} (T_{OA} - T_{ZA}).
$$

This leads to the following initialization for $X$:

$$
X = -\frac{Q_{People} + k_{out} (T_{OA} - T_{ZA})}{k_{cw} (T_{CW} - T_{ZA})}
\tag{18}
$$

as a function of the zone temperature, the CHWC temperature, the outside ambient temperature and the number of occupants at time $\tau_k$.
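As a quick check of the initialization (18), the following Python sketch evaluates $X$ and verifies that it zeroes the equilibrium equation. The function name and all numerical values are hypothetical placeholders, not parameters of the microgrid model considered in the thesis.

```python
def thermostat_equilibrium(t_za, t_cw, t_oa, q_people, k_cw, k_out):
    """Thermostat variable X from equation (18).

    All arguments are plain floats; units and values are assumptions
    made only for this illustration.
    """
    denominator = k_cw * (t_cw - t_za)
    if denominator == 0:
        raise ValueError("X is undefined when T_CW equals T_ZA or k_cw is 0")
    return -(q_people + k_out * (t_oa - t_za)) / denominator


# Illustrative numbers only.
t_za, t_cw, t_oa = 24.0, 10.0, 30.0
q_people, k_cw, k_out = 0.5, 0.1, 0.05

x = thermostat_equilibrium(t_za, t_cw, t_oa, q_people, k_cw, k_out)

# Residual of the equilibrium equation; it should vanish up to rounding.
residual = x * k_cw * (t_cw - t_za) + q_people + k_out * (t_oa - t_za)
```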

As for the definition of a suitable profile for the arrivals/departures bringing the number of occupants from $n_P$ at time $\tau_k$ to $n'_P$ at time $\tau_{k+1}$, four different profiles are considered:

• Beginning: the occupants profile in the interval $[\tau_k, \tau_{k+1})$ is constant and equal to the final value $n'_P$ at instant $\tau_{k+1}$

• End: the occupants profile in the interval $[\tau_k, \tau_{k+1})$ is constant and equal to the initial value $n_P$ at instant $\tau_k$

• Middle: the occupants profile in the interval $[\tau_k, \tau_{k+1})$ is equal to the initial value $n_P$ in the first half interval $[\tau_k, \tau_k + \frac{\tau}{2})$ and to the final value $n'_P$ in the second half interval $[\tau_k + \frac{\tau}{2}, \tau_{k+1})$

• Triangular: the occupants profile is obtained by interpolating the initial value $n_P$ at instant $\tau_k$ and the final one $n'_P$ at instant $\tau_{k+1}$

In Figure 13 a graphical representation of the different approximating profiles is provided. We shall see in the following chapter that the triangular approximating profile is better suited to our context.
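The four approximating profiles can be written as functions of time, as in the Python sketch below. The function name and the string labels are hypothetical, and the triangular profile is assumed here to be a linear interpolation between the two end values.

```python
def occupancy_profile(kind, n_start, n_end, tau_k, tau):
    """Return a function t -> number of occupants on [tau_k, tau_k + tau).

    kind:    "beginning", "end", "middle" or "triangular" (labels chosen
             for this sketch).
    n_start: occupants at tau_k; n_end: occupants at tau_k + tau.
    """
    if kind not in ("beginning", "end", "middle", "triangular"):
        raise ValueError("unknown profile kind: " + str(kind))

    def profile(t):
        if not (tau_k <= t < tau_k + tau):
            raise ValueError("t must lie in [tau_k, tau_k + tau)")
        if kind == "beginning":
            # The change happens at the start: final value throughout.
            return n_end
        if kind == "end":
            # The change happens at the end: initial value throughout.
            return n_start
        if kind == "middle":
            return n_start if t < tau_k + tau / 2 else n_end
        # Triangular: linear interpolation (assumption of this sketch).
        return n_start + (n_end - n_start) * (t - tau_k) / tau

    return profile


# Illustrative numbers only: from 2 to 6 occupants over an interval of 10.
beginning = occupancy_profile("beginning", 2, 6, 0.0, 10.0)
end = occupancy_profile("end", 2, 6, 0.0, 10.0)
middle = occupancy_profile("middle", 2, 6, 0.0, 10.0)
triangular = occupancy_profile("triangular", 2, 6, 0.0, 10.0)
```

Each returned function could then be fed to a simulator of the interval $[\tau_k, \tau_{k+1})$ as the occupancy input.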
