

Decentralized Cloud Computing

Thesis Summary

Florin-Adrian Spătaru

Keywords: Distributed Systems, Cloud Service Orchestration, Blockchain, Smart Contracts

Timișoara and Pisa 2021


Table of Contents

1 Introduction
2 Context
    2.1 Cloud Computing
    2.2 Peer to peer systems
    2.3 Distributed Ledgers
3 Gateway Service
    3.1 Service Optimization Engine
    3.2 User Interface
    3.3 User Experience
4 Self-organization and Self-management in Resource Management Systems
    4.1 Framework
    4.2 Coalition Formation Strategies
5 Small-Scale CloudLightning System Experimentation and Evaluation
6 Resource assignment using Ethereum Smart Contracts
    6.1 Scheduling Methods
    6.2 Experimental Evaluation
    6.3 Discussion
7 A fault tolerant decentralized Cloud
    7.1 Component Administration Networks
    7.2 Fault tolerant Orchestration
    7.3 Discussion
8 Conclusion


Chapter 1

Introduction

During the rise of the Internet, early researchers envisioned a system of interconnected computers, structured as a peer to peer network, in which each computer is able to make its content accessible to the others, without the presence of a centralized server. Although this vision remains valid, the majority of today's Internet services are designed as centralized applications, which store and process impressive amounts of data in warehouse-scale computing centers. This trend has been advanced by the field of Cloud Computing, where resources in large data centres are rented to users. Cloud Service Providers are generally considered trusted actors that provide Quality of Service mechanisms.

In this context, peer to peer systems have continued to be used for file-sharing services and volunteer computing applications. However, the field of Internet of Things has introduced a need for storage and computational infrastructure close to the devices at the edge of the network. Using a peer to peer network to meet these needs allows sensors and devices to access computational and storage resources closer to their location. This, in turn, reduces network congestion by preprocessing data close to its source rather than sending it to the Cloud directly. Additionally, it can provide an economic return for the owner of the infrastructure which hosts such Services.

Lately, the emergence of Blockchain technologies, such as BitCoin [15] and Ethereum [4], has created economic incentives for participation in a globally distributed network of computers. Blockchains construct a public immutable ledger of transactions which serves two main purposes: transparency and auditability. The Ethereum network consists of over 25,000 nodes that collaborate to store and update the Ethereum Virtual Machine – a global replicated state machine capable of executing arbitrary code instructions assembled into a Smart Contract. The state machine is updated by transactions that are organized into blocks on a Blockchain. This, in turn, requires each node participating in the network to execute all transactions in a block locally, which hinders the performance of the system. The advantage is that each node can access the state machine information locally.

A Blockchain should not be used as a means to store large amounts of data, but rather the minimum amount of information needed to ensure the Application and Business logic. Even if the Blockchain paradigm provides a decentralized mechanism for advancing its state, users must rely on external components to create a decentralized application. This opens a market for providing the storage and computational power needed to create a fully decentralized application. Several companies are trying to fill this gap and are presented in the next section. This thesis proposes a Decentralized Cloud Platform, where Virtual Machine Instances, Containers, and accelerators like Graphic Processing Units (GPU), Many Integrated Core (MIC) cards or Data Flow Engines (DFE) can be provisioned from a peer-to-peer network maintained using Smart Contracts deployed on a public Blockchain. In order to achieve this, we leverage existing open-source technologies for Cloud Orchestration and design a Decentralized Resource Selection mechanism and Scheduling Protocols for resource allocation. The Cloud Orchestration software provides facilities for defining and deploying services that depend on infrastructure and on other services.

The Decentralized Resource Selection mechanism is facilitated by a Smart Contract on a public Blockchain and a Self-Organizing Self-Managing Resource Management System. The Smart Contract exposes operations for registering and removing worker nodes, and for creating Application Contracts. At first, we inspect the operational constraints and costs associated with outsourcing the selection logic to the Smart Contract. Then, we devise a selection mechanism that takes place outside the Smart Contract, which accepts an assignment based on resource reservation promises signed by the resource manager.

The main contributions of this thesis are:

1. The implementation of the Gateway Service, which bundles together several components that allow for the definition, composition, optimization, and deployment of Cloud Services which use HPC infrastructure like GPUs, MICs or DFEs. Additionally, the flexibility of the Cloud Service Provider is enhanced by allowing a User to design an Application composed of Abstract Services, for which an implementation is chosen based on the current state of the resources managed by the Cloud.

2. The design of a resource management framework which allows for the self-organization of collaborative management components that have individual goals. This Self-Organizing Self-Management (SOSM) framework allows for setting and updating global goals which are transmitted down the hierarchy, influencing the behaviour of the components downstream in order to align with the global objective. The system is experimentally evaluated with respect to the Service optimization process and the decisions made by the SOSM framework, and with respect to the overhead incurred by our System.

3. The design of a decentralized Cloud platform managing privately owned resources and ensuring fault tolerance in case of Service or Component failure. Component fault tolerance is enforced through the novel concept of a Component Administration Network, which is responsible for monitoring component state, assigning components work, and saving checkpoints related to their work. Service fault tolerance is then managed by an Orchestrator Component which saves checkpoints related to the Service state.

The thesis is organized as follows. Chapter 2 presents a survey of the literature related to Cloud Services, Peer to Peer networks, consensus protocols and Blockchain technology. Section 2.3 presents the current effort in the same direction as our thesis.

Chapter 3 presents the implementation of the Gateway Service components and provides a comparison with current approaches for Service Delivery. Chapter 4 presents the SOSM Framework and two predictive methods to create coalitions of resources that can be used to ease the resource selection process. The framework is evaluated by experimentation on an open source trace data set. Chapter 5 presents an experimental evaluation of the CloudLightning System on a small scale testbed.

Chapter 6 investigates the operational constraints associated with outsourcing the resource selection mechanism to a Smart Contract. It provides both a static analysis of the cost and a dynamic analysis over the cost and latency, using the same open source data set. Chapter 7 constructs a decentralized, fault-tolerant Cloud platform based on the results of the previous chapter. It defines the concept of Component Administration Networks and presents protocols for joining, leaving, and monitoring the network and components which are managed by the network. Finally, conclusions are given in Chapter 8.


Chapter 2

Context

This chapter outlines the technologies and literature related to this thesis. First, it presents the Cloud Computing paradigm together with Peer to Peer systems that provide similar functionality, without the guarantees offered by Clouds. The field of Distributed Ledgers is then surveyed in search of a consensus mechanism that can be used to validate and secure the set of operations that a Cloud system requires in order to function. Finally, efforts moving in the same direction are presented.

2.1 Cloud Computing

NIST defines Cloud Computing as “a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction” [13].

Cloud Computing has led to advancements in the field of computer applications, reducing the time-to-market through ease of deployment, management, and cost reduction. Clouds are enlarging the spectrum of capabilities they offer, adding Graphic Processing Units (GPUs), Many Integrated Cores units (MICs), and lately Field Programmable Gate Arrays (FPGAs). Nevertheless, these resources are mostly available to experienced users because they require in-depth configuration.

In Table 2.1 we outline the computing facilities offered by Cloud Service Providers (CSPs) to be integrated in consumer applications. Amazon Web Services (AWS), Google Compute, and Microsoft Azure are leading in terms of customers, infrastructure and service variety.

Table 2.1: Compute infrastructure available in Clouds

Compute type | Support | Details
Virtual Machine | Very Common | The virtual machine is the most common compute infrastructure, including a separate operating system from the host.
Container | Common | A process running in a different namespace, isolated from other applications running on the same host.
GPU | Common | GPUs can be attached to VM instances on-the-fly.
FPGA | AWS | AWS offers a development kit for writing FPGA images.
Serverless | Common | The Serverless computing paradigm allows users to trigger code in response to events.

Docker [14] is a technology that performs virtualization at the operating system level, using the Linux kernel to limit the view on the operating system. As opposed to Virtual Machines, which include a separate operating system, Containers are built on layers which may be shared by similar applications. Kubernetes [3] is an open-source resource management platform offered as a Service by most Cloud Providers. Mesos [11] is a similar technology, but does not offer orchestration features, in contrast to Kubernetes.

2.2 Peer to peer systems

The concept of Peer to Peer (P2P) has been made popular by file sharing services, beginning with Napster in 1999. In general, a peer to peer network is composed of nodes which play both the role of a client and the role of a server, contributing their resources (storage, bandwidth, computation) to the network. In Table 2.2 we list several peer to peer applications and some of their usage scenarios.

Table 2.2: Peer to peer applications

Application | Examples | Usage
File sharing | Bittorrent, Gnutella, IPFS | Content delivery networks, software distribution (Linux, games)
Anonymity | I2P, Tor | Anonymous and censorship-resistant protocols relay user traffic through several peers out of tens of thousands, using end-to-end encryption
Desktop grids | BOINC [1], XtremWeb [8] | Peer-to-peer file transfers and volunteer computation for the scientific community
Distributed Ledgers | BitCoin, Ethereum, etc. | The rise of digital currencies gives any peer the possibility to impose the order of transactions (mine a block) and append it to the distributed ledger

2.3 Distributed Ledgers

The development of algorithms for building resilient servers and distributed systems through replication started with the work of Lamport on Byzantine agreement [12, 17], evolved over the years, and is well summarized in [6]. The problem of achieving consensus in a group of nodes can rely on two parts: “(1) a (deterministic) state machine that implements the logic of the service to be replicated; and (2) a consensus protocol to disseminate requests among the nodes, such that each node executes the same sequence of requests on its instance of the service.” [5]

A Distributed Ledger is a distributed state machine which is replicated across a network of peers using a Consensus Protocol, without the need for centralized storage or management. The most popular example of a Distributed Ledger is a Blockchain, where transactions are organized in blocks which are linked through cryptographic hashes, forming a chain.

The majority of Blockchain implementations are crypto-currency systems and only allow the transfer of money, possibly together with a small scripting language to handle conditionals. A Smart Contract is a piece of code that allows for arbitrary computation on the distributed state machine, including loops and calls to other Smart Contract functions.

Depending on who is allowed to append blocks, we distinguish three types of Blockchains: permissionless, permissioned, and private.

Permissionless Blockchain protocols allow anyone to join the network and run a node which is able to broadcast transactions and contribute to the state of the system by mining blocks.



On the other hand, permissioned Blockchains are managed by known parties, which allows them to choose which nodes are able to modify the state of the ledger, and possibly to select which nodes are able to send transactions into the system. If a Blockchain is managed by one single entity, it is called a private Blockchain.

Permissioned systems can benefit from the vast literature on consensus, state replication and transaction broadcast in asynchronous networks where the connectivity is uncertain, or the nodes are subject to crashes or subversion by an adversary. Nevertheless, there are many start-ups which are developing Blockchain protocols based on pure intuition, without relying on established research.

In Table 2.3 we present several distributed ledger technologies, and indicate their type and whether they support Smart Contract execution.

Table 2.3: Popular Blockchain platforms

Ledger | Type | Smart Contracts
BitCoin | Permissionless | No
Ethereum | Permissionless | Yes
HyperLedger Sawtooth | Permissionless | No
HyperLedger Fabric | Permissioned | Future
Hashgraph | Permissioned | No
Ripple | Permissioned | Future
Stellar | Permissioned | No

Several companies are tackling the offering of Cloud services through the means of Blockchains, and are presented in Table 2.4. Ethereum itself is touted as the world computer, though its data storage and execution capabilities are drastically limited by the price of running smart contracts. Thus, these technologies rely on the Ethereum Blockchain to create tokens for their platforms and to raise investment funds through Initial Coin Offerings (ICOs).

Table 2.4: Blockchain based Technologies for Cloud Services

Name | Blockchain | Market | Storage | Services
Golem | Ethereum | No | No | SaaS
SONM | Ethereum | Yes | Yes | IaaS
iExec | Ethereum | Yes | Yes | PaaS
Enigma | External | Yes | Encrypted | PaaS
Decenter | Ethereum | Yes | Not defined | IaaS

The current efforts decouple the Blockchain infrastructure from the scheduling logic, using a trusted oracle (for randomized matchmaking) and smart contracts that play the role of schedulers. Since workers will tend to join worker pools where the payoff is better, this may cause centralization under a few large worker pools. In this scenario, the side-chain and scheduling logic should be able to scale with the number of placement alternatives. This thesis investigates the cost and latency associated with such a system and provides design guidelines. Finally, a decentralized architecture is proposed for the management of arbitrary resource types and for enforcing the fault tolerance of Services and Management Components.


Chapter 3

Gateway Service

The Gateway Service is a collection of components allowing for the definition, composition, optimization and deployment of HPC Services using the Cloud paradigm. The key contributions are:

• the modelling of infrastructure (VMs, Containers, Bare Metal servers, hardware Accelerators), services, and the relationships between them using the TOSCA specification

• the modelling of abstract services which can be instantiated by different explicit implementations through the process of Service Optimization

• the implementation of a User Interface by extending the Alien4Cloud platform with plugins that allow for the Service Optimization process and deployment on CloudLightning infrastructure

• the design of the specification and protocol, as well as the implementation, for the Service Optimization process, allowing for communication between the Gateway and the resource manager

• the implementation of the CloudLightning Orchestrator, which is able to deploy Applications composed of Services using heterogeneous resources (e.g. a VM Service linked to a Container Service which makes use of a hardware accelerator).

A preliminary version of this work has been published previously [7]. It is revised and enriched here to present the definitive version of these components. The Gateway Service allows Application Developers to publish Service Definitions and requirements. The End User is able to select and link several of these services into an Application. Depending on the parameters chosen by the End User (cost, performance) and the state of the system (load, energy efficiency), a scheduling system will recommend the placement of the services on the infrastructure. The Gateway then proceeds with deploying the Application services and informs the user about operational metrics: service status, service endpoints, credentials.

Several components are required to manage the life-cycle of an Application. A User Interface allows for the management of service definitions and application deployments. A Service Portfolio provides the means for storing Service Definitions (e.g., requirements, dependencies) and Application Topology definitions, consisting of one or multiple Services and the relationships between them. An Application Developer (AD) uses this component to store such definitions, which can later be used by an Application Operator (AO) to create a new Application Topology or deploy a version designed by the developer.

A novel concept of Application Abstraction allows a user of the platform to define an Application Topology consisting of Abstract Services. These kinds of Service Definitions are abstractions of the explicit Service Implementations, defining only dependencies on other services, but no requirements on the hardware type or accelerators. The Service Optimization Engine (SOE) allows for the inspection of the Service Portfolio in search of the explicit implementations and provides the SOSM System with a Blueprint of all combinations of implementations for the Application. After choosing the most suitable resources for a Blueprint, the user is presented with the explicit Application Topology. This topology is deployed by an Orchestrator, which manages the life-cycle of the Application.

Figure 3.1: Gateway Service components and interactions

Entity definitions

TOSCA (TOpology Specification for Cloud Applications) [16] is a specification designed by OASIS that allows for modelling the full software stack of an application, in order to design portable applications which have reproducible behaviour when migrating from one Cloud platform to another. The specification details a meta-model for the definition of both the structure and the management of applications. The Services composing an Application and the relationships between them and the infrastructure are defined using a Topology entity. Services and Infrastructure are represented as nodes, and relationships must have a source and a target node; additionally, relationships can have parameters that may be of use to the Orchestration Engine.

Figure 3.2 presents example relationships for the nodes we previously described. In general, a CLService and an Accelerator are hosted on a CLResource. A GPUSoftware and a GPU are hosted on a CLResource, and the GPU satisfies the GPUSoftware's requirement. A MICContainer and a Many Integrated Cores (MIC) unit are hosted on a DockerHost, and the MIC satisfies the MICContainer's requirement for a MIC. The AcceleratedBy relationship is depicted as a thin green line.



Figure 3.2: Accelerated Services Examples

There are two Application Topology types: abstract and explicit. An abstract topology defines at least one Abstract Service definition within its constituents. An explicit topology is composed only by explicit services and resources. The Service Optimization Engine provides the instantiation of an abstract topology with explicit Services based on the current state of the Resource Management system. The user can nevertheless create an explicit Application where the SOE only requests the required resources but it is not allowed to modify the topology.

The nodes, capabilities, and relationships defined in this thesis provide a starting point for the definition of heterogeneous Cloud Services, which may use accelerators to improve their performance.

3.1 Service Optimization Engine

A user does not need to have knowledge about hardware characteristics when designing or using an Application Topology. This is achieved through abstract service definitions. The Service Optimization Engine is responsible for encapsulating all possible implementations and for asking the Resource Manager system to select the most suitable one, given a set of constraints.

A ResourceBlueprint is constructed from the TOSCA Topology. Each service gets a corresponding ServiceRequest in the blueprint. The SOE will gather all implementations of a given service and will create a ResourceRequest following the requirements of the TOSCA Implementation (e.g. resource type, accelerator type, CPU cores, RAM amount). The Resource Manager System is responsible for selecting a single resource request to satisfy a service request.
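As an illustration of this expansion step, the following Python sketch builds a ResourceBlueprint from a topology and a Service Portfolio. The class layout and the helper accessors (topology.services, portfolio.implementations_of) are assumptions made for the example; the thesis implementation operates on TOSCA documents rather than on these simplified objects.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class ResourceRequest:
    implementation: str            # explicit implementation name
    resource_type: str             # e.g. "vm", "container", "bare-metal"
    accelerator: Optional[str]     # e.g. "GPU", "MIC", "DFE" or None
    cpu_cores: int
    ram_gb: float

@dataclass
class ServiceRequest:
    service_name: str                                   # (abstract) service in the topology
    options: List[ResourceRequest] = field(default_factory=list)

@dataclass
class ResourceBlueprint:
    services: List[ServiceRequest] = field(default_factory=list)

def build_blueprint(topology, portfolio) -> ResourceBlueprint:
    """Expand each service of the topology into a ServiceRequest that carries
    one ResourceRequest per known explicit implementation."""
    blueprint = ResourceBlueprint()
    for service in topology.services:                             # assumed topology model
        request = ServiceRequest(service_name=service.name)
        for impl in portfolio.implementations_of(service.name):   # assumed portfolio lookup
            request.options.append(ResourceRequest(
                implementation=impl.name,
                resource_type=impl.resource_type,
                accelerator=impl.accelerator,
                cpu_cores=impl.cpu_cores,
                ram_gb=impl.ram_gb,
            ))
        blueprint.services.append(request)
    return blueprint
```

The SOSM System would then pick exactly one ResourceRequest per ServiceRequest, yielding the explicit Application Topology.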

Figure 3.3 presents the connection of the two abstract nodes denoting the UI and the rendering engine, respectively.



There are two implementations for the UI and two implementations for the engine, totalling 4 possible instantiations for this application, though only the container implementation of the UI is presented, for brevity. On the left we have the SimpleRayTracingEngine container implementation, with no specific requirements for its host, leaving the resource assignment to the managing Orchestrator. On the right we have the PhiRayTracingEngine container implementation. This implementation requires deployment on a specific host which has a MIC attached that satisfies the Engine's requirement for a specific accelerator.

3.2 User Interface

A friendly User Interface has been developed to aid Cloud Users in visually defining CloudLightning Services and running CloudLightning Applications. The implemented plugins help the user design, optimize, deploy, and monitor the state of an Application. The Components interface allows for the upload and inspection of TOSCA definitions. All CloudLightning Base Types are packed as a zip archive, using the Cloud Service ARchive (CSAR) format [2], which is uploaded using this interface. There are two other interfaces: Topology templates and Applications. The Topology templates interface allows for the design of an Application Topology template and for saving it in the Application Portfolio for later use. The Applications interface allows for the creation of an Application Topology from scratch or by using a template defined in the previously described interface. Moreover, the Applications interface also contains a panel associated with the Service Optimization process and provides the means for Application deployment.

The Orchestrator plugin sends a request for one service at a time, checking each one's health before proceeding with any service that depends on it. For example, it waits for the rendering engine to be available and accessible on the exposed port before deploying the UI, which depends on this information being provided during container start-up. This process can be observed using the Runtime panel, present on the navigation bar of the Applications interface. Figure 3.4a presents the PhiRayTracingEngine starting up, with the RayTracingWebService on hold. Once the rendering engine is accessible, the Orchestrator can retrieve the host and port information and use it to set up the request for starting the UI. In Figure 3.4b we show the healthy rendering engine and the UI starting up.

(a) Rendering Engine start-up (b) UI start-up

Figure 3.4: Ray Tracing Application deployment
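A minimal Python sketch of this dependency-ordered deployment loop is given below. The orchestrator and topology handles, together with their deploy, is_healthy, and endpoint calls, are hypothetical stand-ins for the Marathon/OpenStack specific client code used by the real Orchestrator plugin.

```python
import time

def deploy_application(orchestrator, topology, poll_interval=5, timeout=600):
    """Deploy services one at a time in dependency order, waiting for each one
    to become healthy before starting the services that depend on it."""
    deployed = {}
    for service in topology.dependency_order():            # e.g. engine before UI
        # Pass the endpoints of already-deployed dependencies (host, port, ...)
        env = {dep: deployed[dep] for dep in service.depends_on}
        handle = orchestrator.deploy(service, environment=env)

        waited = 0
        while not orchestrator.is_healthy(handle):          # assumed health probe
            if waited >= timeout:
                raise RuntimeError(f"{service.name} failed to become healthy")
            time.sleep(poll_interval)
            waited += poll_interval

        deployed[service.name] = orchestrator.endpoint(handle)   # host/port info
    return deployed
```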

3.3 User Experience

Traditional deployment

Traditional service deployment involves having been granted access to an HPC infrastructure, where the user logs into a head node and is able to submit an application description to the queue of a resource manager. Usually, the application runs on the host OS and therefore all dependent libraries and the software stack must be installed at this level. An alternative is to run the application as a VM or Container and have all dependencies bundled within the image of the application.

The majority of HPC application users are experts in using the application they execute, but not in its configuration, and most probably not in managing heterogeneous infrastructure. Therefore, in order to facilitate the migration of HPC applications to the Cloud, the End-User experience should not involve the configuration and installation of the Application and the management of its underlying resources, but rather the design of the Application and some business level constraints (e.g. performance, cost). Besides, the availability of heterogeneous accelerators is strictly restricted to the technical expert, with little focus on the end-user who will actually benefit from the performance improvement.

Traditional on-premises HPC clusters require an IT department with expert training in infrastructure management and software configuration. Even then, substantial effort is put into configuration, ensuring all library dependencies are met without breaking the dependencies of other applications. Instead, using the CloudLightning System, applications can be configured using a high level language as described in Section 2.3, and the technical staff can focus on experimenting with different configurations to ensure the best performance.

Our approach

The migration of HPC-aware services to Cloud Deployments introduced challenges concerning their performance, encapsulation, and definition. There is little support for Applications that use heterogeneous resources, and it usually involves the selection of a specific version of software that works with a specific model of hardware accelerator. The End-User is responsible for managing both software and resource selection, which may lead to over-provisioning and conflicts in software-to-software and software-to-hardware communication.

The Gateway Service Components presented in this Chapter unify the solutions to the identified challenges. They provide a Service Portfolio, for storing service definitions, and an architecture defined by the CloudLightning Base Types, which gives guidelines for defining HPC-like services in a portable manner. They facilitate the design of Applications and Application templates through an easy to use interface, and manage the deployment of Services that are heterogeneous both in terms of hardware (conventional machines, accelerators) and encapsulation (i.e., Bare Metal, VM, and Container).

The challenges are overcome through several processes. First, Services and dependent libraries are packed as VM or Container images by an Application Developer, who can experiment with different configurations and determine the appropriate hardware characteristics for a given performance level. Second, we leverage the portability of the TOSCA specification to define the relationships between services, conventional infrastructure, and hardware accelerators. Third, through the means of the Resource Discovery and Service Optimization processes, we improve the flexibility of Application design and performance (for the End-User) and the flexibility of Resource Selection (for the Cloud Service Provider).


Chapter 4

Self-organization and Self-management in Resource Management Systems

The motivation for designing a hierarchical system is to divide the search space for finding a subset of specific components on the bottom level of the hierarchy. This level is represented by computing infrastructure, and the purpose of the management components (in the intermediate levels) is to reduce the space for reaching specific compute resources. Each management component possesses two types of Strategies, self-management and self-organization, which dictate the actions needed to move the component closer to an individual goal. The key contributions of this chapter are:

• the design of a novel, generic framework for self-organization and self-management in hierarchical systems; this includes mechanisms for communicating the desire of the top level components down the hierarchy and for assessing the state and efficiency of bottom level components up the hierarchy.

• the design and implementation of two Coalition Formation strategies for the bottom level of the hierarchy, responsible for matchmaking Service requirements with physical machines capabilities.

• the experimental evaluation of a self-organizing self-managing resource manager prototype using open source trace data; our system outperforms the original resource assignments (from the data set) in terms of resource utilization.

Section 4.1 revises and extends with further examples a paper published previously [9]. Section 4.2 revises, extends, and integrates two previously published papers [21, 19] with the presented framework.

4.1 Framework

Our proposed framework is visually represented in Figure 4.1. Generally, actions taken by a management component will affect the state of the components in the neighbouring levels. Some actions are communicated to a component down the hierarchy in order to update individual goals with the purpose of aligning them with the global goal. Such an action is further referred to as Impetus, and the process of transmitting Impetus is referred to as Directed Evolution. Since underlying components possess individual goals, the Impetus will generally be integrated taking them into account. The Impetus is transmitted as a vector of Weights.

A management component evaluates the actions taken by underlying components by receiving a vector of Metrics, which offers the managing component a Perception of the evolution taking place in the underlying levels. These metrics will further influence the Directed Evolution actions. Finally, the computing infrastructure properties and state must be assessed by the managing components. Our architecture considers the use of Assessment functions which can be weighted according to an individual component's goal, and thus determine the performance of the underlying infrastructure.

Figure 4.1: Proposed Framework

Management Components engage in Self-organization in order to minimize the management cost of the level they are a part of. The Self-organization process takes place within a single level, and can have as outcome any of the following operations: component creation, destruction, splitting and merging, and the transfer of underlying nodes between components. We define an individual goal of the managing components as reaching an equilibrium state. This state is reached when the Directed Evolution actions, transmitted from a superior level, result in no significant variation in the Perception of the inferior components. To determine the amount of contribution a component is providing, we use the notion of the Suitability Index. Therefore, the individual goal of each managing component is to maximize its Suitability Index and to take actions that result in a greater Suitability Index for the managed components.
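A minimal Python sketch of how a Suitability Index could be computed and used is shown below. The weighted-sum aggregation and the metric names are assumptions made for illustration; the thesis keeps the concrete Assessment functions and Weights configurable.

```python
import numpy as np

def suitability_index(metrics: np.ndarray, weights: np.ndarray) -> float:
    """Aggregate assessment metrics into a single Suitability Index using the
    Weights received as Impetus from the level above (weighted sum assumed)."""
    weights = weights / weights.sum()        # normalise the Impetus weights
    return float(np.dot(weights, metrics))

def rank_children(children_metrics, weights):
    """Rank child components by Suitability Index; a parent would prefer
    high-ranking children when driving new work down the hierarchy."""
    scored = [(suitability_index(m, weights), name)
              for name, m in children_metrics.items()]
    return sorted(scored, reverse=True)

# Example with two metrics per component, e.g. (utilisation, energy efficiency)
children = {"vRM-1": np.array([0.7, 0.4]), "vRM-2": np.array([0.5, 0.9])}
print(rank_children(children, np.array([0.6, 0.4])))   # vRM-2 ranks first
```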

4.2 Coalition Formation Strategies

In Figure 4.2 we present a mechanism to aggregate historical execution data and create coalitions that may be useful for satisfying future requests. An Aggregator will consume the task request history and create histograms specific to a Coalition Formation Strategy. Another instance of the Aggregator will consume the task usage history and determine the usage of each machine. The resources are then filtered, and machines which have available slots can participate in Coalition Formation for the next epoch. When a resource request reaches a bottom level component, a solution is searched for within the formed coalitions. If no suitable solution is found, coalitions can be enlarged with resources from other coalitions.



Figure 4.2: Data processing for Coalition Formation

Size Frequency Similarity is a Coalition Formation Strategy which uses the size of previously created coalitions to determine the most frequent coalition cardinality. The more frequent a coalition size, the more coalitions of this size will be created. We assume that a server can only join one coalition in a given epoch. However, considering containerization/virtualization of the resources in a coalition, multiple services can be assigned to the same coalition, given that the requirements do not exceed the capacity of any server. To achieve this, the strategy must employ coalition formation and selection procedures which aggregate usage information in order to filter out unavailable servers.
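The following Python sketch captures the core of the Size Frequency Similarity idea: pre-form coalitions for the next epoch in proportion to how often each coalition size was observed in the history. The proportional budget and the random packing of servers are illustrative assumptions, not the exact procedure used in the thesis.

```python
from collections import Counter
import random

def form_coalitions(past_coalition_sizes, available_servers, epoch_budget):
    """Build coalitions for the next epoch from the histogram of previously
    observed coalition sizes; each server joins at most one coalition."""
    histogram = Counter(past_coalition_sizes)
    total = sum(histogram.values())
    pool = list(available_servers)
    random.shuffle(pool)

    coalitions = []
    for size, count in histogram.most_common():
        n_coalitions = round(epoch_budget * count / total)   # proportional share
        for _ in range(n_coalitions):
            if len(pool) < size:
                return coalitions                             # ran out of free servers
            coalitions.append([pool.pop() for _ in range(size)])
    return coalitions

# Example: sizes seen in history, 12 free servers, budget of 5 coalitions
print(form_coalitions([2, 2, 2, 4, 4, 8], range(12), epoch_budget=5))
```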

In a similar manner, clusters can be constructed considering Constraint Frequency Similarity, in relation to recently requested constraints in task requirements. For example, in a dataset describing physical machines from a Google data centre, resources present 42 attributes, a subset of which have values in the realm of hundreds [18]. Using all of them to compute similarity is not only computationally inefficient, but also yields small and very similar clusters which do not improve the search for a suitable coalition for specific jobs.

Using an open source trace data set, experimental simulation has shown that this mechanism is able to efficiently match resource queries to predicted coalitions, except when the job size exceeds the capacity of the current managing component. Our proposed method is able to achieve better resource utilization across the set of machines. This is achieved by considering unused servers as idle, and prolonging the delay until they are started by packing as many Services as a resource can handle before moving to the next one.

In the hierarchical framework we proposed, jobs are submitted to the Cell Manager, which chooses one pRouter to drive the task down the hierarchy, followed by the pSwitch, then the vRM, always choosing the component with the highest Suitability Index. When the vRM assigns the job to a set of resources, it applies a VM compaction strategy, preferring servers that already have VM instances running over the ones that are idle. If not enough resources are managed by the selected vRM, a merging action can be performed with another vRM under the same pSwitch in order to obtain the required set of servers. This action can propagate up the hierarchy, leading to the merging of two pSwitches under the same pRouter.
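A compact Python sketch of this routing and compaction logic is shown below. The component objects, their suitability_index attribute, and the fits/place helpers are assumptions introduced for the example; the real system makes these decisions inside the SOSM components.

```python
def select_vrm(cell_manager):
    """Drive a request down the hierarchy, always picking the child component
    with the highest Suitability Index (Cell Manager -> pRouter -> pSwitch -> vRM)."""
    p_router = max(cell_manager.children, key=lambda c: c.suitability_index)
    p_switch = max(p_router.children, key=lambda c: c.suitability_index)
    return max(p_switch.children, key=lambda c: c.suitability_index)   # the vRM

def compact_assign(v_rm, job):
    """VM compaction: pack requests onto servers that already run instances and
    start idle servers only when no running server can host the request."""
    running = [s for s in v_rm.servers if s.running_vms > 0]
    idle = [s for s in v_rm.servers if s.running_vms == 0]
    placement = []
    for request in job.requests:
        candidates = [s for s in running if s.fits(request)] or \
                     [s for s in idle if s.fits(request)]
        if not candidates:
            return None            # not enough capacity: triggers a vRM merge
        server = candidates[0]
        server.place(request)      # assumed bookkeeping on the server object
        if server in idle:
            idle.remove(server)
            running.append(server)
        placement.append((request, server))
    return placement
```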

In Figure 4.3a we present the average utilization of the system. This is considerably improved compared to the original utilization computed from the trace data. The most important aspect leading to this result is the VM compaction strategy, which keeps servers idle until no other solution but to select them is possible.

(a) Utilization of underlying resources with respect to running servers

(b) vRack Managers number evolution

Figure 4.3: Resource utilization for SOSM management and vRM evolution

Figure 4.3b presents the evolution of the number of vRack Managers for each of the 8 identified resource types. The increase in the number of such components is due to the profile of the requested resources. Since the majority of requests are for a small number of servers, vRMs use only a small fraction of the servers under their control. Since no other vRM would be willing to transfer resources that are already in use, the best solution to increase the Suitability Index is to reduce the number of managed servers. This causes a vRM in this situation to split. The process continues until the vRMs resemble the task profiles.


Chapter 5

Small-Scale CloudLightning System Experimentation and Evaluation

The CloudLightning system has been evaluated by means of experiments on a small-scale testbed installed at the Norwegian University of Science and Technology (NTNU). The contribution of this chapter is twofold:

• presents a detailed visualization of the inner workings of the system presented in Chapter 4, on a small scale; this is omitted in this summary due to the complexity of the experiment

• presents evidence that the system does not impact the execution of Services in any significant aspect of their performance.

The use case services have been evaluated and profiled on the resources, and their performance values have been stored in the Service Portfolio. Both Genomics implementations run in Docker containers, a parameter that the SOSM system uses to identify the pRouter that manages resources using containers. The first is a CPU-only implementation, while the second requires a DFE accelerator to be attached to the computing node. In terms of speed-up, the second implementation has been profiled with an improvement of 8.78. Upscale is the service associated with the Oil and Gas use case. Both implementations run in containers; however, the GPU version does not show a significant performance increase. The RayTracing use case has 4 service implementations. The first two represent the front-end, which can be executed as a VM, Container, or Bare Metal. The back-end implementations follow, both running in containers. The first runs using only the CPU, while the PhiRayTracingEngine requires a MIC card to be attached and shows a performance increase of 4.0. Finally, Dense Matrix Multiplications are performed using the OpenBLAS or cuBLAS libraries. A CPU container implementation is the baseline, and there are two GPU implementations, one running in containers and the other on Bare Metal servers. Both GPU implementations show a performance increase of 4.0.

RayTracing: MIC and CPU execution

For this application, ten 3D images with a frame size of 765x512 are rendered at randomly selected viewpoints, making use of an open-source 3D church model. Figure 5.1 presents the utilization of the main CPU (Fig. 5.1a) and of the specialized cores inside the MIC card. The profiles over 20 executions show that the overhead of our system is insignificant. Figure 5.1b presents the average ratio (%) of cores used for rendering the images.

The power consumption of the RayTracing application was also inspected. Figure 5.2a and Figure 5.2b present the power consumption in the two deployment scenarios. Overall, based on the usage graphs, we can conclude that the performance of the RayTracing use case does not show any significant degradation. More in-depth graphs can be inspected in the Bitbucket repository [10].



(a) CPU usage for CPU-only Ray Tracing application

(b) MIC utilization for the MIC Ray Tracing application

Figure 5.1: Comparison between two implementations of Ray Tracing application

(a) CPU only with and without SOSM (b) MIC + CPU with and without SOSM

Figure 5.2: Power Consumption for RayTracing use case

Conclusion

A detailed inspection of the Suitability Index values behind the decisions taken by the SOSM System reveals the flexibility of the System with respect to Service delivery. The evaluation has shown the ability of the System to maximize the user's performance objectives, while pursuing its own goal of efficiently managing resources and power consumption.

To summarize the experiments carried out, the only overhead incurred by the CloudLightning system is the delay due to provisioning the resources, which accounts for around 8 seconds (on the small scale testbed), no matter which service is deployed. We reckon that this delay is a small cost compared to the facilities offered by our platform, namely: automatic selection and deployment of a variety of implementations, resource provisioning (and management), and service discovery. Alternative solutions generally require the End-User to manually provision the resources, deploy the services, and link them (in the case of multiple dependent services).

Except for the provisioning delay (which may also be experienced on other Cloud platforms) our platform accounts for no other negative effect on application performance (execution time) or power consumption, when running the applications in Containers.


Chapter 6

Resource assignment using Ethereum Smart Contracts

In this chapter we design and implement a decentralized scheduling mechanism based on Ethereum Smart Contracts and investigate the operational constraints and the gas cost for managing resource assignments. This chapter revises and extends a previously published paper [20], and has the following novel contributions:

• an architecture for a decentralized Cloud platform making use of the Gateway Service presented in Chapter 3, which connects to an Ethereum Smart Contract for resource selection and payment

• a study on the impact of four scheduling methods regarding transaction cost and latency

• an investigation of the constraints under which asynchronous interaction with the Smart Contract can take place, offering design principles for maintaining it

• an experimental evaluation using a scenario built from Cloud usage traces, which allows for a deeper investigation of the cost and latency in a real world setting.

6.1 Scheduling Methods

Four methods for scheduling a service are considered:

1. First Match – the resource array is iterated until the first resource that is available and matches the service requirements is assigned. This is the simplest form of Smart Contract ruled assignment which can be made.

2. Best Match – the resource array is iterated entirely and the available resource that matches the requirements with the minimum price is selected. We choose this as an archetype for any parameter minimization or maximization problem (a sketch of these two contract-ruled selections follows the list).

3. Offline Selected – the resource array is investigated by reading the Cloud Contract and applying an optimization function offline; the resource is then assigned by id.

4. Offline Synchronized – the resource is selected as in the previous method, but the clients can query a synchronization component to check whether the resource has already been requested by another client.
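The selection logic of the first two methods can be mirrored by the following Python sketch. The actual implementation in the thesis lives in the Smart Contract itself; the dictionary-based resource model and the fits check are simplifying assumptions made for illustration.

```python
def fits(resource, req):
    # Simplified capacity check; the contract also matches accelerator type, etc.
    return resource["cpu"] >= req["cpu"] and resource["ram"] >= req["ram"]

def first_match(resources, req):
    """First Match: take the first available resource that satisfies the request."""
    for idx, r in enumerate(resources):
        if r["available"] and fits(r, req):
            return idx
    return None

def best_match(resources, req):
    """Best Match: scan the whole array and keep the fitting resource with the
    minimum price (archetype for any minimization/maximization criterion)."""
    best_idx, best_price = None, float("inf")
    for idx, r in enumerate(resources):
        if r["available"] and fits(r, req) and r["price"] < best_price:
            best_idx, best_price = idx, r["price"]
    return best_idx
```

In the Offline Selected and Offline Synchronized variants the same scan happens off-chain, and only the chosen index is submitted to the Cloud Contract for assignment.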



Finish and cleanup

The management of currently running services is mediated by the Schedule structure. The Service Instances can be organized into a map, assigning an incremental value to each new instance requested. However, in this manner we limit the total number of services that can run for a user, because as Services finish there is no way to return and reuse the space they occupied, other than holding an array of unused indices. Since this involves further manipulation of data, we consider it best to organize the Service Instances in an array in the first place. When a Service is terminated, the place used for its instance can be reused using one of two approaches:

• Order removal: maintain the order of the Service Instances and shift all following instances one position to the left;

• Move removal: give up the order and replace the terminating Service Instance with the one in the last position.

The efficiency of the first approach is low and degrades as the array grows. For one Service Instance, it depends on the position of the instance in the array. The number of copy operations that have to be made is C = n − i, where n is the size of the array and i is the position of the Service that needs termination. When applications composed of multiple Services need to be terminated, all Service Instances need to be terminated individually. This has the effect that each instance except the first will be copied one or multiple times before being terminated.
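The two removal strategies can be sketched in Python as follows; the list stands in for the Service Instance array kept in contract storage, where every element copy is a paid storage write.

```python
def order_removal(instances, i):
    """Remove instance i while keeping the order: every following instance is
    shifted one slot to the left (C = n - i copies for 1-based position i)."""
    for j in range(i, len(instances) - 1):
        instances[j] = instances[j + 1]   # each copy is a storage write on-chain
    instances.pop()
    return instances

def move_removal(instances, i):
    """Remove instance i by overwriting it with the last element: a single copy
    regardless of the array size, at the price of losing the original order."""
    instances[i] = instances[-1]
    instances.pop()
    return instances
```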

Gas usage analysis

A static analysis is presented in Table 6.1. The base gas usage for the assignment operation is similar for the three methods. The column +1R expresses the additional cost incurred by each registered resource. The column +1S expresses the additional cost incurred by each running service. The First Match method consumes a further 772 gas units for each task running in the system, i.e., for each resource that is not available. This is the cost of checking the availability of a resource. In addition, the Best Match method adds 7,295 gas units for each resource that is registered and available. This is the cost of optimizing for the best solution, more specifically the computation of a resource's score and the storage of the minimum score.

Table 6.1: Cost in gas units (ETH)

Method | Base cost | +1 R | +1 S
Offline Selected | 159,740 (0.0035) | 0 | 0
First Match | 159,740 (0.0035) | 0 | 772
Best Match | 159,740 (0.0035) | 7,295 (0.00016) | 772
Finish | 49,826 (0.0011) | 0 | 27,880
Offline Reject | 21,268 (0.0004) | 0 | 0

The base gas usage for the Finish transaction is 49,826 units, and it increases by more than half of this amount for every Service Instance that needs to be copied in order to fill the gap. We can observe that the cost of moving 100 instances reaches 3 million units, and can deduce that terminating a Service followed by more than 250 instances will reach the gas limit of a block.

We also assess the rejection cost for the Offline Selected method, in case the resource became unavailable in the meantime. This cost is around 13% of the cost of the successful transaction.



We can compute the total gas usage cost for running a Service S using the following equation:

C(S) = C(T^s_S) + C(T^f_S) + \sum_{k=1}^{R} C(T^{r_k}_S),   (6.1)

where T^s_S represents the scheduling transaction, T^f_S represents the finish transaction, and T^{r_k}_S represents the k-th scheduling transaction that was rejected, with R ∈ {1, 2, ..., R_max} representing the number of rejected scheduling transactions, within a set limit R_max.
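Using the measured constants from Table 6.1, Equation (6.1) can be turned into a small cost helper. The Python sketch below is only illustrative; the usage pattern (number of available resources, running services, rejections, and copies at finish time) is hypothetical.

```python
# Gas costs taken from Table 6.1 (units of gas)
GAS = {
    "offline_selected": {"base": 159_740, "per_resource": 0,     "per_service": 0},
    "first_match":      {"base": 159_740, "per_resource": 0,     "per_service": 772},
    "best_match":       {"base": 159_740, "per_resource": 7_295, "per_service": 772},
}
FINISH_BASE, FINISH_PER_COPY = 49_826, 27_880
REJECT_COST = 21_268     # only relevant for the Offline Selected method

def total_gas(method, available_resources, running_services,
              rejected=0, copies_on_finish=0):
    """Equation (6.1): scheduling cost + finish cost + rejected scheduling attempts."""
    m = GAS[method]
    schedule = (m["base"]
                + m["per_resource"] * available_resources
                + m["per_service"] * running_services)
    finish = FINISH_BASE + FINISH_PER_COPY * copies_on_finish
    return schedule + finish + REJECT_COST * rejected

# e.g. Best Match with 800 available resources and 50 running services
print(total_gas("best_match", available_resources=800, running_services=50))
```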

6.2 Experimental Evaluation

Figure 6.1 shows the total cost (in ETH) due to gas usage. There is no rejection cost for the Contract-selected methods because enough resources are available. Under one half of the Offline Selected Service assignments have been rejected, and this cost is fairly small over the whole experiment. The Best Match method presents a very large cost for Service termination, mostly due to the small number of assignments that could be made in one block, leading to multiple copies.

Figure 6.1: Cost for accepted assign transactions (blue), rejected assign transactions (red), and finish transactions (orange)

In terms of latency, we consider two measures: makespan and finish latency. The makespan latency measures the amount of time passed since the assign transaction is sent until the receipt is received. The finish latency measures the amount of time passed since the finish transaction is sent until it is mined and the client informed. On the Ethereum node used, blocks have been mined every three seconds, on average.

Figure 6.2 shows the makespan (start) and finish latency for the four methods. The finish latencies of the first three methods are similar. Because the Offline Selected method has a higher chance of starting Services in the same block, it is able to reduce the cost of the finish transaction and fit more of them in one block. The Offline Synchronized method achieves even better results; because most services belonging to an Application are scheduled in the same block, they can be terminated in reverse order, reducing the cost of this transaction. This, in turn, maximizes the number of finish transactions that can fit in a block. A smaller finish latency translates into a fairer payment for the resources, reducing the total runtime for a Service.



Figure 6.2: Box Plot for Start and Finish Latency for average block interval = 4s

6.3 Discussion

The most efficient approach is to read the resource information from the Smart Contract, apply a selection optimization function offline, and ask the Cloud Contract to assign a specific resource. This method is significantly improved if the clients synchronize their decisions to limit the chance of selecting the same resource. Applying a simple optimization function in the Smart Contract proves 6.21 times more expensive than the offline synchronized variant in our experiments. Indeed, depending on the size of the set of resources, the Contract variant may cost up to 50 times more than the offline version in terms of gas, for 800 resources. This also increases the latency of the system, because few transactions are mined in each block. The inconvenience of the Offline Selected method is that conflicts with other users selecting the same resource may occur. In the experimental evaluation, the number of transactions rejected for this reason is more than half of all transactions. This can be mitigated by having the users synchronize their decisions through an external component.

In order to maximize the number of transactions that can be sent asynchronously, order must be maintained in the Service Instance array of each user. However, this implies a high cost for terminating instances, since all following instances need to be copied to fill the gap. We identified this cost to grow by 27,880 gas units (accounting for 0.0006 ETH = 0.06 USD) for each instance that needs to be copied. Moreover, the Reverse Order removal reduces the impact of this cost by not copying Service Instances if they need to be removed in the same time frame. If the Ethereum public network is used to implement the presented system, the cost of running a Service will consist of 0.53 USD for the gas usage plus the price of the resource for a given amount of time. The gas price alone is equivalent to renting a n1-standard-1 Virtual Machine (1 vCPU, 3.75 GB) on Google Compute Engine for 12 hours. If we consider the price of a resource on our platform to be half the price of Google, then a Service must have a run time of at least 24 hours before starting to benefit from the reduced price offered. Services with a shorter run time will pay more for the gas than for the actual resource utilization. Latency will also be substantially higher because the block rate is slower and the volume of other transactions is higher. A better variant is to create a new network where the only transactions are related to the Cloud platform. The latency of this system will be higher than in our experiments, depending on the number of resources and Cloud Contracts.


Chapter 7

A fault tolerant decentralized Cloud

The investigation done in the previous chapter has revealed the high cost and latency implications of using a Smart Contract to optimize resource selection. Therefore, a better solution is to agree off-chain on the particularities of Service requirements and use the Smart Contract only for verifying this agreement. This agreement is later used to deploy the Application and assess its execution status. In this chapter we present an architecture for the decentralized management of a public Cloud platform aggregating privately owned resources, which ensures Application Continuity in the presence of Service failures, Component Continuity in case of Management Component failure, and a fair price paid in the case of node failures which cannot be mitigated. This chapter revises a previously published paper [22] and the key contributions are:

• an architecture allowing for the decentralization of resource registration and assignment, using Smart Contracts

• the design of Component Administration Networks and their corresponding protocols, which act as a bridge between the Smart Contract World and the Software World and ensure Component Continuity by assigning them work, saving checkpoints, and monitoring their availability.

• a model for fault tolerant Application Orchestration using Component Administration Networks, which also ensures that the user is charged only for the amount of time a Service has executed.

In Figure 7.1 we present the proposed architecture. Central to this is a public Blockchain capable of Smart Contract execution. Although any such Blockchain can be used, we are constructing our protocols for the Ethereum network. The following Smart Contracts are to be deployed on the Blockchain:

1. Registry Contract – this is the entry point to our system. It contains information about registered Clouds and their status as well as registered Services and Applications.

2. Cloud Contract – contains information about resources, price, Cell Manager and Plug&Play Service endpoints

3. Application Contract – one is created for each deployed Application and is used to track status and payments.

Figure 7.1: Augmented decentralized architecture

The Gateway Service does not need to hold a priori knowledge of any Scheduler instance. It can read which Clouds are registered in the Registry Contract and is able to contact them. Service and Application catalogues are also stored in the Registry Contract. This allows any Gateway Service instance to have access to the same information, which in turn allows the decentralization of this component. The user is not required to reach a node where the Gateway is located, as he/she can also run it locally.
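A minimal sketch of how a Gateway Service instance could discover registered Clouds is shown below. Only the web3.py calls are real; the Registry Contract function names (getCloudCount, getCloud) and the returned tuple layout are hypothetical, since the actual contract interface is defined in the thesis implementation.

```python
from web3 import Web3

def discover_clouds(rpc_url, registry_address, registry_abi):
    """Return the active Clouds listed in the Registry Contract."""
    w3 = Web3(Web3.HTTPProvider(rpc_url))
    registry = w3.eth.contract(address=registry_address, abi=registry_abi)

    clouds = []
    for i in range(registry.functions.getCloudCount().call()):
        # Assumed entry: (Cloud Contract address, Cell Manager endpoint, active flag)
        cloud_address, cell_manager_url, active = registry.functions.getCloud(i).call()
        if active:
            clouds.append({"contract": cloud_address, "cell_manager": cell_manager_url})
    return clouds

# e.g. discover_clouds("http://localhost:8545", REGISTRY_ADDRESS, REGISTRY_ABI)
```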

The Orchestration process is decoupled from the Gateway. The nodes running the Orchestration Service make use of a replicated data storage mechanism provided by a Component Administration Network (CAN), which ensures the availability of the Orchestration process and the continuity of deployment steps. The CAN nodes are responsible for the availability of the Orchestration Service.

The Plug and Play Service used for registering resources is augmented with Blockchain reading and writing capabilities, acting as an interface between the resources and the Blockchain. The Resource Manager is also capable of deregistering a resource if it is deemed unavailable.

7.1 Component Administration Networks

A Component Administration Network has a two-fold purpose. First, it bridges the world of Smart Contracts and the world of Software Components. Second, it provides the means for monitoring and enforcing a set of replicas in order to tolerate faults. The network of nodes implements a replicated state machine which has two functions. First, it maintains a ledger of transactions related to the network of nodes and the supervised components, different from the Ethereum ledger. Second, the nodes execute a replicated file system to store data associated with the supervised components.

Figure 7.2 presents the layered architecture of this proposal. On the bottom layer, there is the peer to peer network that collaborates to maintain the Ethereum Blockchain, where the Registry Contract is deployed. Some of these nodes can also be part of the Component Administration Network.

The middle layer is concerned with operations for managing the nodes. This is the first layer at which we identify the leader of the network, which is responsible for ordering and validating all transactions related to the CAN. The replica nodes accept any state update from the leader. For this layer we propose the following protocols (a minimal membership sketch is given after Figure 7.2):

• Join – protocol for a new node to join the network

• LeadElect – protocol for electing a new leader when the current one is discovered to be faulty

Figure 7.2: Layered architecture of a Component Administration Network
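The node-management layer can be sketched as follows; the leader-selection rule (next live node in join order) and the timeout-based failure detection are illustrative assumptions, not prescriptions of the protocol:

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class CANMembership:
    """Minimal sketch of the node-management layer of a CAN."""
    nodes: List[str] = field(default_factory=list)
    last_seen: Dict[str, float] = field(default_factory=dict)
    leader: Optional[str] = None

    def join(self, node_id: str, now: float) -> None:
        # Join: the new node is appended to the membership list;
        # the first node to join becomes the initial leader.
        if node_id not in self.nodes:
            self.nodes.append(node_id)
        self.last_seen[node_id] = now
        if self.leader is None:
            self.leader = node_id

    def lead_elect(self, now: float, timeout: float) -> str:
        # LeadElect: if the current leader has not been seen within
        # `timeout` seconds, the next live node (in join order) takes over.
        if self.leader is not None and now - self.last_seen.get(self.leader, 0.0) <= timeout:
            return self.leader
        live = [n for n in self.nodes if now - self.last_seen.get(n, 0.0) <= timeout]
        if not live:
            raise RuntimeError("no live CAN nodes")
        self.leader = live[0]
        return self.leader
```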

The top layer is concerned with the administration of Components. Again, the CAN leader is responsible for ordering and validating state updates at this level. This layer defines the following protocols (a message-level sketch is given after the list):

• Register – a component gets registered with the network

• Deregister – a component has been unresponsive and is removed

• AssignWork – a component is assigned work

• CheckpointWork – a component requests the network to store a Checkpoint

• ReassignWork – a component has been removed and its work is reassigned to another

• FinishWork – a component is requested to terminate the execution of a given piece of work (termination of a Cloud Application).
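The concrete wire format of these protocols is not prescribed here; the sketch below simply enumerates the component-administration transactions and a hypothetical envelope for them:

```python
from dataclasses import dataclass
from enum import Enum, auto


class CANMessage(Enum):
    """Component-administration transactions ordered by the CAN leader."""
    REGISTER = auto()         # a component registers with the network
    DEREGISTER = auto()       # an unresponsive component is removed
    ASSIGN_WORK = auto()      # the leader assigns work to a component
    CHECKPOINT_WORK = auto()  # a component asks the network to store a checkpoint
    REASSIGN_WORK = auto()    # work of a removed component is moved to another one
    FINISH_WORK = auto()      # a component is asked to terminate a piece of work


@dataclass
class ComponentTx:
    """Envelope for a component-administration transaction."""
    kind: CANMessage
    component_id: str
    work_id: str = ""         # e.g. an Application Contract address
    payload: bytes = b""      # e.g. serialized checkpoint data
```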

7.2 Fault tolerant Orchestration

Using the concepts defined in the previous section, Orchestrators are components which register with the Component Administration Network. Work is represented by the Application Contracts, which the Orchestrator reads in order to perform the deployment.

Figure 7.3 presents the deployment continuity of an Application composed of two Services (e.g. the Ray Tracing UI and Engine) in the presence of an Orchestrator failure. After the successful resource discovery process, the End-User requests the Cloud Contract to create an Application Contract. The Orchestration CAN (OCAN) leader is then informed about this contract. Alternatively, the leader can subscribe to Ethereum events and get notified when a new Application Contract is created. In both cases the leader will select an Orchestrator replica, broadcast an AssignWork transaction, and inform the assigned Orchestrator.
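A minimal sketch of the leader's reaction to a new Application Contract is given below; the replica-selection policy (uniform random choice) and the callback names are assumptions made for illustration:

```python
import random
from dataclasses import dataclass


@dataclass
class AssignWork:
    """CAN transaction broadcast by the OCAN leader."""
    app_contract: str      # address of the new Application Contract
    orchestrator_id: str   # replica chosen to perform the deployment


class OCANLeader:
    """Minimal sketch of the leader handling a newly created Application Contract."""

    def __init__(self, replicas, can_broadcast, notify):
        self.replicas = replicas            # live Orchestrator replicas
        self.can_broadcast = can_broadcast  # callback: append a CAN transaction
        self.notify = notify                # callback: inform the chosen replica

    def on_application_created(self, app_contract: str) -> None:
        # Select a replica (here: uniformly at random), record the assignment
        # on the CAN ledger, then inform the assigned Orchestrator directly.
        orchestrator = random.choice(self.replicas)
        tx = AssignWork(app_contract=app_contract, orchestrator_id=orchestrator)
        self.can_broadcast(tx)
        self.notify(orchestrator, tx)
```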

(28)

24 CHAPTER 7. A FAULT TOLERANT DECENTRALIZED CLOUD

Figure 7.3: Example deployment continuity with failing Orchestrator

The Orchestrator reads the content of the Application Contract and proceeds with deployment. After each service deployment the Orchestrator initiates the checkpoint mechanism and saves the mapping from a service definition (in the Application Contract) to a service instance (the unique identifier used by the Hypervisor). In this manner, if the assigned Orchestrator fails during the deployment of multiple services, another replica can resume from the checkpoints.
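The checkpoint contents are not fixed by the protocol; the sketch below, with hypothetical field names, captures the information described above and in the following paragraphs (checkpoint type, proposer, timestamp, and the service-definition-to-instance mapping):

```python
import time
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Checkpoint:
    """Sketch of a checkpoint record stored by the Orchestrator via the OCAN.

    `kind` distinguishes deployment checkpoints (taken after each Service is
    deployed) from periodic liveness checkpoints (taken at the interval set in
    the Application Contract); `instances` maps a service definition name from
    the Application Contract to the instance identifier used by the Hypervisor.
    """
    app_contract: str
    kind: str                                   # "deployment" or "periodic"
    proposer: str                               # Orchestrator that took the checkpoint
    timestamp: float = field(default_factory=time.time)
    instances: Dict[str, str] = field(default_factory=dict)
```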

Additionally, the Orchestrator issues a checkpoint at the intervals set in the Application Contract. This checkpoint is used to collect payments, as it proves that all Services executed for the given interval. When a Service fails, a checkpoint is made and a set number of redeployments is attempted. If the Service can be redeployed, a new checkpoint is made. If the Service cannot be redeployed, the problem is assumed to lie with the Cloud, and a forced shutdown of the Application is requested from the leader. The leader then calls the corresponding function on the Application Contract.
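The following sketch summarizes this failure-handling logic; the callbacks and the retry limit are hypothetical names supplied by the surrounding Orchestrator implementation:

```python
def handle_service_failure(app, service, checkpoint, redeploy,
                           request_forced_shutdown, max_redeployments):
    """Sketch of the Orchestrator's reaction to a failed Service.

    `checkpoint(service, reason, instance)` stores a checkpoint on the OCAN,
    `redeploy(service)` returns the new instance identifier or None, and
    `request_forced_shutdown(app)` asks the OCAN leader to terminate the
    Application through the Application Contract.
    """
    checkpoint(service, "service-failure", None)
    for _ in range(max_redeployments):
        instance_id = redeploy(service)
        if instance_id is not None:
            # redeployment succeeded: record the fresh service instance
            checkpoint(service, "redeployed", instance_id)
            return True
    # the Service could not be redeployed: assume a Cloud-level problem
    request_forced_shutdown(app)
    return False
```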

The OCAN leader is responsible for updating the Application Contract with checkpoint properties, such as the checkpoint type, timestamp, and proposer; the actual checkpoint data can be retrieved from the OCAN network. The OCAN leader can also periodically request payment for the registered checkpoints. For this, the Application Contract must be funded with currency by the End-User. Nevertheless, the End-User only pays for the actual time the services have been running, based on the checkpoints.

During the execution of the Application, several entities dedicate their resources to enforcing the fault tolerance of the System and of the Application, and therefore need to be reimbursed. These entities are the resources, the Cloud managing the resources, the Component Administration Network, and the Orchestrator(s). We consider a method for making interim payments in order to avoid computing all payments in a single transaction, which may run out of gas.
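As an illustration only, an interim payment for one checkpointed interval could be split as follows; the share values are assumptions, not figures from the thesis, and in the proposed system the split would be fixed in the Application Contract:

```python
def interim_payment(runtime_seconds, price_per_second, shares=None):
    """Split the payment for one checkpointed interval between the entities."""
    # Illustrative default shares for the four reimbursed entities.
    shares = shares or {"resources": 0.70, "cloud": 0.10,
                        "can": 0.10, "orchestrator": 0.10}
    total = runtime_seconds * price_per_second
    return {entity: total * share for entity, share in shares.items()}
```

For example, `interim_payment(3600, 2e-6)` returns the per-entity amounts due for one hour of execution at the assumed price.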

7.3 Discussion

This chapter has presented a decentralized, fault tolerant mechanism for running Cloud Applications on privately owned resources. We start from the architecture of the CloudLightning framework and augment it to achieve decentralization and fault tolerance of the Application and of the Orchestration Service. A Blockchain capable of Smart Contract execution is used to intermediate payments and to hold information about the entities of the System.

We introduced the concept of Component Administration Networks, which provide a bridge from the Smart Contract world to the software world and ensure the fault tolerance of the network and of the supervised components. We showed, by simulation, that failure rates of up to f = 0.75 can be tolerated if enough nodes are willing to join the network periodically. These nodes are encouraged to join such a network in order to collect payments from the Users running Applications. Finally, we have shown how this network can be used to ensure the fault tolerance of the Orchestrators and Applications, and provided a method for interim payments to all entities ensuring the execution of an Application.

We reckon our solution has the following benefits: it creates a free, decentralized market for privately owned resources to meet the demand for Cloud Applications; it allows for the efficient management of resources according to high level business metrics, through the use of the SOSM Framework; and it enforces the fault tolerance of Application Deployment and ensures fair payment.


Chapter 8

Conclusion

The benefits of the solutions presented in this thesis are multiple. First, the End-User of the Cloud is relieved of the burden of selecting between different software implementations, hardware types, and their configurations. The Service Optimization protocol can be implemented by Resource Management Systems to flexibly select resources based on their own business metrics, while ensuring that the constraints set by the user are met. Second, the modularity of our design allows for the decentralization of the System components. This, in turn, paves the way for the creation of a free, decentralized market for privately owned resources to meet the demand for Cloud Applications. Third, our decentralized Cloud platform ensures fair payment and the continuity of Applications in the presence of Service or Orchestrator failures.

Impact

The Cloud Service delivery model and focus have shifted rapidly from the provisioning of Virtual Machines to the management of on-demand Containers, and a vast number of software packages and utilities have Docker Images available in the official repository. The Orchestrator presented in Chapter 3 is the first to allow the orchestration of hybrid applications, composed of mixed Virtual Machine and Container-packaged software. Moreover, the advances in the fields of Fog Computing and Edge Computing introduce a high level of heterogeneity in Application design, a level which was anticipated by the specifications and tools described in Chapter 3.

Proof of Work Blockchains have proved to be a viable solution to the Byzantine Generals Problem, with a massive increase in the scale of the network due to the reduced number of messages needed to reach consensus. Additionally, they are permissionless, meaning there is no need for the participating nodes to know or trust a block proposer in advance, yet they can agree on the validity of the transactions and of the mining proof. The blocks are replicated on all nodes and provide a transparent and tamper-proof log of events without the need to trust any of the block proposers a priori. One of the limiting factors on the throughput of such systems is that the proof needs to be hard enough to avoid a high number of chain forks, which decreases the block production rate.

The analysis we provided in Chapter 6 has impact beyond a specific Blockchain system because it tackles a problem fundamental to information management. For example, Ethereum 2.0 is currently transitioning to a Proof of Stake consensus algorithm in order to increase the block production rate. Nevertheless, block gas limits (or equivalent mechanisms in other Blockchains) will be maintained, because it remains a public system in which denial of service attacks need to be prevented. We reckon the operational constraints and costs investigated by us, as well as the recommendations we make, are of great value to any future research in the direction of resource assignment using Smart Contracts on permissionless Blockchains. The protocols described in Chapter 7 have been designed to minimize the amount of work that needs to be carried out in the Smart Contract, thus allowing a higher number of transactions to be mined in one block.

The volatility of the Ethereum cryptocurrency can be mitigated by using a separate, purpose-built chain that is independent of these high fluctuations. Additionally, our proposal allows nodes in the network to be reimbursed for jobs other than mining, such as checkpoint replication and service orchestration. This has the effect of reducing competition for mining, which lowers the proof difficulty and hence the amount of energy consumed in mining a block.

Permissioned Blockchains allow only a group of authority nodes to participate in the block creation process. They generally provide faster consensus and higher transaction throughput. Because the authority nodes need to trust each other before reaching consensus, it is easier for such systems to become centralized, if they do not start out that way in the first place. The research conducted in this thesis is nevertheless applicable to any Blockchain capable of executing quasi-Turing-complete code, since it optimizes the synchronization complexity.

Open Questions

The use of Proof of Work Blockchains ensures a transparent and cryptographically secure ordering of valid transactions. This has great value for the assets that are stored on the Blockchain, but it is not always possible to link an asset to the real-world object it describes. A weak enforcement of this link makes a system vulnerable to different types of attacks. In our proposed architecture, the assets are the computational resources and the components involved in managing the Cloud System. We have only investigated the fault tolerance of the proposed system, though our measures discourage a byzantine adversary from attempting some types of attacks, such as the Sybil attack. We exemplify the attack possibilities in what follows and reckon that these are important aspects that require formalization in the domain of Distributed Systems Security.

In a Sybil attack, a malicious adversary tries to register the same resource under different identities with the aim of receiving more work and gaining more reimbursements. First, the attacker would need to pass the Resource Manager registration test, which involves a benchmark. The benchmark may read some hardware information to prevent duplicate hardware registration, but a determined attacker can interfere with the benchmark and provide different values. If the attacker passes this test, multiple instances of the same resource can be selected to run services. Randomly assigned Orchestrators will monitor the different services, so the attacker needs either to run a service or to control the Orchestrator that monitors it. If the attacker does not control any Orchestrator, then running all services assigned to all Sybil copies of a resource will, at some point, result in unresponsiveness. This will be detected by the Orchestrators and the resource will not be paid. For any controlled Orchestrator, the attacker can simulate the execution of the service to receive reimbursement. Nonetheless, the controlled Orchestrator needs to follow the protocol for all other resources in order not to raise suspicion and be deregistered from the administration network. If the attacker tries to create Sybil copies of Orchestrators, he risks them becoming unresponsive in the case of a high number of managed services. The more Sybil copies an attacker makes, the higher the risk of becoming unresponsive and thus losing the opportunity for gains.

At first glance, the Sybil attack cannot succeed unless the attacker either controls some Orchestrators or intervenes in the monitoring process; a prerequisite for this scenario is that the attacker initially fools the Resource Manager. The scenario is, nevertheless, complex and requires an in-depth investigation at different levels of granularity.


References

[1] David P Anderson. “Boinc: A system for public-resource computing and storage”. In: proceedings of the 5th IEEE/ACM International Workshop on Grid Computing. IEEE Computer Society. 2004, pp. 4–10.

[2] G Breiter, Frank Leymann, and T Spatzier. “Topology and orchestration specification for cloud applications (TOSCA): Cloud service archive (CSAR)”. In: International Business Machines Corporation (2012).

[3] Brendan Burns, Brian Grant, David Oppenheimer, Eric Brewer, and John Wilkes. “Borg, Omega, and Kubernetes”. In: Queue 14.1 (Jan. 2016), 10:70–10:93. issn: 1542-7730. doi: 10.1145/2898442.2898444. url: http://doi.acm.org/10.1145/2898442.2898444.

[4] Vitalik Buterin et al. Ethereum white paper. 2013. url: https://github.com/ethereum/wiki/wiki/White-Paper.

[5] Christian Cachin and Marko Vukolić. “Blockchain Consensus Protocols in the Wild”. In: arXiv preprint arXiv:1707.01873 (2017).

[6] Bernadette Charron-Bost, Fernando Pedone, and André Schiper. “Replication”. In: LNCS 5959 (2010), pp. 19–40.

[7] Ioan Dragan, Teodor-Florin Fortiș, Marian Neagul, Dana Petcu, Teodora Selea, and Adrian Spataru. “Application Blueprints and Service Description”. In: Heterogeneity, High Performance Computing, Self-Organization and the Cloud. Springer, 2018, pp. 89–117.

[8] Gilles Fedak, Cecile Germain, Vincent Neri, and Franck Cappello. “XtremWeb: A generic global computing system”. In: Cluster Computing and the Grid, 2001. Proceedings. First IEEE/ACM International Symposium on. IEEE. 2001, pp. 582–587.

[9] Christos Filelis-Papadopoulos, Huanhuan Xiong, Adrian Spataru, Gabriel G Castañé, Dapeng Dong, George A Gravvanis, and John P Morrison. “A generic framework supporting self-organisation and self-management in hierarchical systems”. In: Parallel and Distributed Computing (ISPDC), 2017 16th International Symposium on. IEEE. 2017, pp. 149–156.

[10] Adrian Spataru and Gabriel Gonzales Castane. CloudLightning Use Case evaluation Bitbucket Repository - Ray Tracing. https://bitbucket.org/cloudlightning/evaluation_use_cases/src/09ff4fcd0176/raytracer/?at=master. [Last accessed: 15-08-2020]. 2018.

[11] Benjamin Hindman, Andy Konwinski, Matei Zaharia, Ali Ghodsi, Anthony D Joseph, Randy H Katz, Scott Shenker, and Ion Stoica. “Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center”. In: NSDI. Vol. 11. 2011, pp. 22–22.

[12] Leslie Lamport, Robert Shostak, and Marshall Pease. “The Byzantine generals problem”. In: ACM Transactions on Programming Languages and Systems (TOPLAS) 4.3 (1982), pp. 382–401.

[13] Peter Mell and Tim Grance. The NIST definition of cloud computing. 2011.
