
Department of Information Engineering TeCIP Institute

Retis Lab

Master Thesis in

Embedded Computing Systems

A dual-hypervisor for platforms supporting

hardware-assisted security and virtualization

Giorgiomaria Cicero

Supervisors: Prof. Giorgio Carlo Buttazzo, Dr. Alessandro Biondi


Abstract

The need for security and virtualization capabilities in modern cyber-physical systems is increasing and plays a crucial role in their design. During the last years, several software-based techniques have been proposed to achieve isolation and security features, offering secure computing services and storing confidential/sensitive data alongside the execution of multiple software components on the same platform. Notably, such architectures are typically denoted as systems with Multiple Independent Levels of Security (MILS). However, due to the increase of software complexity and the exposure of modern systems by means of connectivity infrastructures, security became a fundamental design objective, originating strong functional and reliability requirements that cannot generally be achieved with pure software techniques. To meet such requirements, chip makers started developing hardware-based solutions to realize trusted execution environments (TEEs), one of the most popular of which is the TrustZone technology developed by ARM. In parallel with the need for security features, virtualization established itself as the de-facto technology to support the execution of multiple software systems (possibly running upon different operating systems) on the same platform, with hypervisors being the most widespread solutions to achieve virtualization of the available computing resources. This thesis aims at proposing a software infrastructure for reconciling the virtualization capabilities offered by hypervisors with the need for executing multiple TEEs upon a shared platform. To this end, a dual-hypervisor solution is proposed to enable the execution of multiple domains in isolation, where each of them can comprise both a standard (i.e., non-secure) execution environment and a TEE, the latter executed in the secure world enabled by the ARM TrustZone technology. The design consists of two jointly-configured hypervisors, one managing non-secure domains, and another managing a set of virtualized TEEs, thus offering a further level of isolation by construction between the two worlds. A minimal software layer has also been introduced to orchestrate the two hypervisors and dispatch the corresponding interrupt signals.

The design has been realized by taking the XVISOR open-source hypervisor as a reference system. Experiments have finally been performed to validate the proposed approach and assess its performance upon an ARM Cortex-A15 processor.


Contents

1 Introduction 1

2 Background 4

2.1 Virtualization . . . 4

2.1.1 Types of Hypervisor . . . 8

2.1.2 Trap and Emulate . . . 9

2.1.3 ARM Virtualization Extension . . . 9

2.2 Trustzone . . . 12

2.2.1 Modes view and System IPs . . . 13

2.2.2 World Switching Mechanism . . . 15

2.3 ARMv7-A with VE and SE . . . 16

2.4 Xvisor . . . 20

3 State of the art 28

3.1 Related works based on TrustZone . . . 28

4 Proposed Design 30

4.1 Scheduling . . . 31

4.2 Secure Guests . . . 33

4.3 Inter-World Communication . . . 33

4.4 SMC Calling Convention . . . 33

4.5 System Configuration . . . 35

4.6 Checking barriers . . . 37

5 Implementation 39

5.1 S Xvisor . . . 40

5.2 NS Xvisor . . . 44

5.3 X Monitor . . . 46

5.4 Trusted Bootloader . . . 56

5.5 System Memory Map . . . 59

6 Tools 61

6.1 ARM Fast Models . . . 61

6.1.1 Fixed Virtual Platform Versatile Express Board . . . 61

6.2 ARM Cycle Models . . . 62


7 Evaluation Results 64

7.1 Memory footprint . . . 65

7.2 Trusted Bootloader latencies . . . 66

7.3 World’s context switch latencies . . . 66

7.4 Comparison between local and remote crypto algorithm . . . 67

7.5 Comparison between local and remote long block of code . . . 71

8 Conclusions 77

8.1 Future works . . . 78

Bibliography 79


1 Introduction

Over the last 50 years, virtualization has become popular in many different fields such as cloud computing, servers, embedded computing systems, etc. The reason mainly lies in the increasing performance of the hardware, which is able to host several systems, sometimes without even degrading software performance. Another important reason is the contribution that virtualization can give to fault tolerance techniques. Virtualization can in fact be used to isolate submodules and/or subsystems so that internal faults do not trigger external faults and/or system failures. ARINC 653, for instance, is a software specification for space and time partitioning in safety-critical avionics real-time operating systems (RTOS). It allows the hosting of multiple applications of different software levels on the same hardware in the context of an Integrated Modular Avionics architecture. A typical Integrated Modular Avionics application includes a Fault Detection, Isolation and Recovery (FDIR) partition that can even take decisions and dynamically recover partitions.

On the other hand, security, in terms of robustness of systems facing external software attacks, is becoming a major requirement for those systems that provide secure services and/or store confidential data such as cryptographic keys. Pure software techniques inherently contain bugs, since in most cases they are manually developed and thus subject to human error. Strict test plans can let the system gain reliability but, as stated by Edsger W. Dijkstra, "testing shows the presence, not the absence of bugs". One of the biggest motivations of this thesis is to use hardware-based techniques to enforce isolation in a virtualized environment. The ARM TrustZone technology is a good candidate for this purpose. TrustZone is hardware-based security built into SoCs by semiconductor chip designers who want to provide secure end points and a device root of trust. At the heart of the TrustZone approach is the concept of secure and non-secure worlds that are hardware-separated, with non-secure software blocked from accessing secure resources directly. Within the processor, software either resides in the secure world or the non-secure world; a switch between these two worlds is accomplished via software referred to as the secure monitor. This concept of secure (trusted) and non-secure (non-trusted) worlds extends beyond the processor to encompass memory, software, bus transactions, interrupts and peripherals within an SoC. The thesis focuses on virtualizing the two worlds so that stand-alone programs, with mixed criticality and security levels, can coexist on the same hardware platform. This solution supports system managers in securely merging heterogeneous legacy systems on the same board. These systems may already support TrustZone to isolate secure services and/or confidential data. When merging this kind of systems, the biggest requirement is strong isolation between them. The isolation needs to be achieved not only in the Non-Secure World but also in the Secure World, even if the secure part of the software is certified or anyway trusted.

Hypervisors, by construction, offer the possibility to have different programs, even from different developers and companies and with different purposes, criticality and security levels, on the same machine, with the idea of making the smallest possible effort in the porting phase (the ideal solution is to leave the legacy programs untouched, so that the same program can run bare-metal or in a virtualized environment).

Figure 1.1 shows a possible application of this work.

Fig. 1.1: Merging Systems with mixed criticality and security levels

Two or more different stand-alone systems, already built and functioning on dedicated hardware, can run on the same hardware platform, provided that each of them is able to run in a virtualized environment. Some of them may already use the TrustZone technology, or in any case have a secure side which can contain sensitive data, algorithms and/or programs to be secured.


The proposed design is based on two hypervisors running respectively in the Secure and Non-Secure World, thus providing the possibility to have several Secure Guests and several Non-Secure Guests, and ensuring isolation thanks to the TrustZone technology and to an off-line configuration that governs the possible connections between Secure and Non-Secure Guests. An additional active Guest can reside on the Secure side, such as an RTOS and/or a special Guest (such as an FDIR system) which plays the role of Health Monitor or Intrusion Detection System for the entire system.


2 Background

2.1 Virtualization

Virtualization began in the 1960s as a method of logically dividing the system resources provided by mainframe computers between different applications. Hardware virtualization or platform virtualization refers to the creation of a virtual machine that acts like a real computer with an operating system. Software executed on these virtual machines is separated from the underlying hardware resources.

Gerald J. Popek and Robert P. Goldberg, two American computer scientists, introduced in their 1974 article "Formal Requirements for Virtualizable Third Generation Architectures" [19] the Popek and Goldberg virtualization requirements: a set of conditions sufficient for a computer architecture to support system virtualization efficiently. They formally introduced the concepts of Virtual Machine (VM) and Virtual Machine Monitor (VMM). The VM is taken to be an efficient, isolated duplicate of the real machine. It shall be capable of virtualizing a full set of hardware resources, including a processor (or processors), memory and storage resources and peripheral devices. The VMM (also called Hypervisor) is the piece of software that provides the abstraction of a virtual machine. The VMM has three essential characteristics. First, the VMM provides an environment for programs which is essentially identical to the original machine; second, programs run in this environment show at worst only minor decreases in speed; third, the VMM has complete control of system resources.

There are three properties of interest when analyzing the environment created by a VMM when an arbitrary program is running:

• The efficiency property. All innocuous instructions are executed by the hardware directly, with no intervention at all by the VMM.

• The resource control property. It must be impossible for that arbitrary program to affect the system resources, i.e. memory, available to it; the allocator of the VMM is to be invoked upon any attempt.

• The equivalence property. A program running under the VMM should exhibit a behavior essentially identical to that demonstrated when running on an equivalent machine directly.


From the implementation point of view, the basic concept is based on partitioning the resources of a single computing platform into multiple segregated and virtualized execution environments. The biggest challenge in this technique is to run each environment completely isolated and independent from the others, thus allowing multiple operating systems to run on the same physical platform.

As said before, in the scenario where multiple OSs can run on the same physical platform, an additional layer must be introduced between the Guest OSs and the hardware: the Hypervisor (or VMM).

Fig. 2.1: Mixed OSs Environment

The first new concept is the presence of a hierarchical scheduling framework having at least two schedulers: the global one, used by the Hypervisor to schedule the different Virtual Machines (VMs), and the local one, used by the specific OS (hereafter also called Guest) to schedule the related tasks.

A virtualized system typically provides the following benefits [10]:

• Multiple Secure Environment: a system VM provides a sandbox that isolates one system environment from the others;

• Failure Isolation: virtualization helps isolate the effects of a failure to the VM where the failure occurred;

• Mixed OS Environment: a single hardware platform can support multiple operating systems concurrently;

• Better System Utilization: a virtualized system can be (dynamically or statically) re-configured for changing needs;


The main techniques used to realize virtualization are: System Emulation, Full-Virtualization and Para-Virtualization. Within Para-Virtualization we can also cite Binary Translation, while a typical Full-Virtualization approach is based on Hardware-Assisted Virtualization.

System Emulation This technique is mainly based on emulating all the hardware resources, so the VMM can control the entire execution of the VM.

• PRO

– complete isolation

– total portability, since VMs are not related to any specific hardware platform

– no modification of the guest OS

• CONS

– slow, since everything is emulated

– the efficiency property of the Popek and Goldberg virtualization requirements is violated

Para-Virtualization This technique is based on providing a software interface to the VM that is similar, but not identical, to the underlying real hardware. The basic idea is to reduce the portion of the guest’s execution time spent performing operations which are substantially more difficult to run in a virtual environment compared to the real host hardware. The VMM basically provides a set of APIs so that the requests can be directly run on the hardware, gaining efficiency. The guest OS needs to be modified in order to cooperate with the VMM. All the privileged and sensitive instructions are trapped and emulated (see Section 2.1.2).

• PRO

– more efficient than System Emulation

• CONS

– virtualized OSs need to be modified


– isolation is more challenging

Binary Translation This technique is based on intercepting Guest OS code and translating it. The idea is to control at run-time all the write operations to the PC, replacing privileged and sensitive instructions with hypercalls to pass the baton to the VMM.

• PRO

– no modification of the Guest OS, since it is not aware of virtualization

– better performance than System Emulation

• CONS

– can be a complex process

Hardware-Assisted Virtualization In hardware-assisted virtualization, the hardware provides architectural support that facilitates building a virtual machine monitor and allows guest OSes to be run in isolation. Hardware-assisted virtualization was first introduced on the IBM System/370 in 1972, for use with VM/370, the first virtual machine operating system. In 2005 and 2006, Intel and AMD provided additional hardware to support virtualization. A typical feature provided by the hardware, for instance to handle virtual memory for the Guest OS, is the presence of MMU-enforced traps, so that page faults can be easily handled. ARM, starting from ARMv7-A, introduced its own Virtualization Extensions. They are described more deeply in Section 2.1.3, since they have been used in this work.

• PRO

– hardware isolation

– no changes needed for the Guest OS

– the closest to the Popek and Goldberg virtualization requirements

• CONS

– requires explicit support in the host platform


The presented work is based on two of the previously listed techniques: para-virtualization and hardware-assisted virtualization. Para-virtualization is used in the Secure World, since there is no hardware support. The hardware-assisted technique is used in the Non-Secure World, since the ARM Virtualization Extension is fully supported.

2.1.1 Types of Hypervisor

In the same article, Formal Requirements for Virtualizable Third Generation Architectures [19], Gerald J. Popek and Robert P. Goldberg classified two types of hypervisor:

• Type-1, native or bare-metal Hypervisor. These hypervisors run directly on the host’s hardware to control the hardware and to manage guest operating systems. For this reason, they are sometimes called bare metal hypervisors.

• Type-2 or hosted Hypervisor. These hypervisors run on a conventional operating system (OS) just as other computer programs do. A guest operating system runs as a process on the host. Type-2 hypervisors abstract guest operating systems from the host operating system.

Fig. 2.2: Types of Hypervisor


2.1.2 Trap and Emulate

A solution to implement Para-Virtualization and Binary Translation is the Trap and Emulate technique. As the name suggests, this technique is based on trapping all the privileged and sensitive instructions executed by the Guest OS. Once they are trapped, the instruction can be easily emulated (or discarded if not allowed) by the VMM in Supervisor mode.

The privileged instructions already cause a trap in hardware when executed in User mode. The biggest problem is in handling sensitive instructions executed in User mode. Some of them (such as MCR/MRC, used for coprocessors) cause an undefined exception when executed in User mode. Some others have no effect (such as MSR, used to handle the Processor Status Register), while some others can cause unpredictable behaviours (such as MOVS PC, LR, used to return to User mode from other modes). A solution to this problem is to translate (off-line for Para-Virtualization and on-line for Binary Translation) all the sensitive instructions into hypercalls (SVC instructions). A big help comes from the ARM architecture, since instructions have a fixed length, which facilitates parsing the binary code and translating it into an SVC instruction. An example of "intercept and replace" is shown in Figure 2.3.

Fig. 2.3: Translation Sensitive Instruction into Hypercall
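To make the "intercept and replace" idea concrete, the following minimal C sketch shows how a sensitive instruction word could be rewritten off-line into an SVC (hypercall) word: the condition field of the original instruction is preserved, while the 24-bit SVC immediate encodes which instruction was replaced. The immediate layout is purely illustrative and does not reproduce the exact patch format of any specific hypervisor.

/* Illustrative off-line patching helper (assumption: the 24-bit SVC immediate
 * is free to encode an identifier of the replaced instruction). */
#include <stdint.h>

static uint32_t encode_hypercall(uint32_t orig_insn, uint32_t inst_id)
{
    uint32_t cond  = orig_insn & 0xF0000000u;  /* keep the original condition field        */
    uint32_t svc   = 0x0F000000u;              /* SVC opcode in bits [27:24]               */
    uint32_t imm24 = inst_id & 0x00FFFFFFu;    /* identifier of the replaced instruction   */

    return cond | svc | imm24;                 /* patched word written back to the binary  */
}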

2.1.3 ARM Virtualization Extension

The ARM Architecture virtualization extension and Large Physical Address Extension (LPAE) enable the efficient implementation of virtual machine hypervisors for ARM architecture compliant processors. To handle complex software with potentially large amounts of data, connected consumer devices and cloud computing demand energy efficient, high performance systems. The virtualization extensions provide the basis for ARM architecture compliant processors to address the needs of both client and server devices for the partitioning and management of complex software environments into virtual machines.

The Large Physical Address extension provides the means for each of the software environments to utilize efficiently the available physical memory when handling large amounts of data.


The basic idea of this architecture is based on the presence of an additional higher privileged mode: HYP Mode.

Fig. 2.4: ARM Virtualization Extensions Modes

The basic model of a virtualized system involves:

• a hypervisor, running in Hyp mode, that is responsible for switching Guest operating systems

• a number of Guest operating systems, each of which runs in the PL1 and PL0 modes (respectively Supervisor and User)

• for each Guest operating system, applications that usually run in User mode.

A Guest operating system, including all applications and tasks running under that operating system, runs on a virtual machine and the hypervisor switches between virtual machines. However, the Guest OS’s view is that it is running on an ARM processor. Normally, a Guest OS is completely unaware that it is running on a virtual machine.

Each virtual machine is identified by a virtual machine identifier (VMID), assigned by the hypervisor.

The key features of this extension are:

• Hyp mode is implemented to support Guest OS management. Hyp mode operates in its own virtual address space, that is different from the virtual address space accessed from PL0 and PL1 modes.

• The Virtualization Extensions provide controls to:

– Define virtual values for a small number of identification registers. A read of an identification register by a Guest OS or its applications returns the virtual value.

– Trap various other operations, including accesses to many other registers, and memory management operations. A trapped operation generates an exception that is taken to Hyp mode.

These controls are configured by software executing in Hyp mode.

• With the Security Extensions (TrustZone), the Virtualization Extensions control the routing of interrupts and asynchronous Data Abort exceptions to the appropriate one of:

– the current Guest OS

– a Guest OS that is not currently running

– the hypervisor

– the Secure monitor (this part is better explained in the next section)

• When an implementation includes the Virtualization Extensions, it provides independent translation regimes for memory accesses from:

– Hyp mode, the PL2 translation regime

– Supervisor and User modes, the PL1&0 translation regime.

• In the PL1&0 translation regime, address translation occurs in two stages:

– Stage 1 maps the Virtual Address (VA) to an Intermediate Physical Address (IPA). Typically, the Guest OS configures and controls this stage, and believes that the IPA is the Physical Address (PA).

– Stage 2 maps the IPA to the PA. Typically, the hypervisor controls this stage, and a Guest OS is completely unaware of this translation.


2.2 TrustZone

As said in the Introduction, this work started by exploring the ARM TrustZone technology and evaluating it as a means to enforce isolation (mainly spatial isolation) between different application domains. This section briefly explains the ARM TrustZone technology, taking [15] as a reference.

ARM TrustZone is a hardware security extension technology which aims to provide a secure execution environment by splitting computer resources between two execution worlds, namely the normal world and the secure world. TrustZone is supported on different flavors of ARM architectures, including architectures deployed on targets running regular applications, such as mobile devices, and architectures for controllers. As ARM is widely deployed on the majority of mobile and micro-controller devices, TrustZone’s goal is to provide security for those platforms. A system is usually only secured at the software level. However, a greater level of security can be achieved by building security checks into the hardware of the system. This idea is implemented by the concept of Trusted Execution Environments (TEE). The Trusted Execution Environment (TEE) is a secure area of the main processor. It guarantees that code and data loaded inside it are protected with respect to confidentiality and integrity. The TEE, as an isolated execution environment, provides security features such as isolated execution and integrity of Trusted Applications, along with confidentiality of their assets. In general terms, the TEE offers an execution space that provides a higher level of security than a Rich OS.

TrustZone provides support for a hardware-based TEE. A formal definition and specification of a TEE has been provided by GlobalPlatform [6]. The security of TrustZone is based on the idea of partitioning all of the System on Chip (SoC)’s hardware and software into two worlds: secure world and normal world. Hardware barriers are established to prevent normal world components from accessing secure world resources; the secure world is not restricted, since it has full control of the entire system (including the non-secure world). Specifically, the memory system prevents the normal world from accessing:

• regions of the physical memory designated as secure

• system controls that apply to the secure world

• state switching outside of a small number of approved mechanisms.

ARM introduced two versions of TrustZone: one for ARM-A and one for ARM-M. The one used in this work, and thus the one described in this document, is the ARM-A version.


2.2.1 Modes view and System IPs

The TrustZone idea is based on having a sort of additional bit checked in every instruction and transaction: the NS bit, which identifies which world is currently running. One could loosely say that a 32-bit processor with the Security Extensions becomes a 33-bit processor, where the 33rd bit is the NS bit. This is not a correct definition, but it gives a good overview of the mechanism. In fact, the NS bit resides in a special register (the Secure Configuration Register, SCR) in CP15, accessible only by the Secure World.

Fig. 2.5: Modes overview in a TrustZone-enabled Processor

As shown in Figure 2.5, the traditional User and Privileged modes are orthogonally split into the two different worlds. A new entity is introduced which has the highest privileges and is mainly responsible for the context switching between the two worlds: the Monitor. The mechanism used to switch between worlds is better explained in Section 2.2.2.

ARM has implemented this split-environment processor with various system IP additions. These unique components are used to enforce security restrictions while preserving the low power consumption and other advantages of ARM’s designs. Some of the features are described in the specification for ARM’s Advanced Microcontroller Bus Architecture version 3 (AMBA3). The main additional system IPs of the Security Extensions are: an AXI-to-APB Bridge, a modified Cache Controller, a modified DMA Controller, a TrustZone Address Space Controller (TZASC), a modified Generic Interrupt Controller (GIC) and a TrustZone Protection Controller (TZPC).

The AMBA3 AXI-to-APB Bridge allows for secure communication between a CPU and peripherals. The Advanced eXtensible Interface (AXI) bus, which is the main system bus, contains an active-high non-secure (NS) bit that indicates whether a read/write operation is directed to secure or non-secure memory. The Advanced Peripheral Bus (APB), whose low bandwidth reduces power consumption, connects to the AXI bus via a bridge. As the APB does not check for security due to backward-compatibility concerns, the bridge checks for appropriate permissions and blocks unauthorized requests.

Like the AXI-to-APB bridge, the Cache Controller also looks for an NS bit. This bit is basically treated like a 33rd address bit: the first 32 bits provide the location, and the NS bit indicates which world it refers to. Since both worlds share the same physical cache, the same location may have two distinct addresses, requiring the controller to look up the correct location. This also applies to the L2 cache and other smaller caches. The Direct Memory Access (DMA) Controller is used to transfer data to physical memory locations instead of devoting processor cycles to this task. This controller, which uses AXI, can handle Secure and Non-secure events simultaneously, with full support for interrupts and peripherals. It prevents non-secure accesses to secure memory.

The TrustZone Address Space Controller (TZASC) allows dynamic classification of AXI slave memory-mapped devices as secure or non-secure. Controlled by the secure world, the TZASC allows partitioning of a single memory unit rather than requiring separate secure and non-secure units. The TZASC allows an arbitrary number of partitions to be created.

The Generic Interrupt Controller (GIC) is a single hardware device that supports both Secure and Non-secure prioritized interrupt sources. Attempts by Normal world software to modify the configuration of an interrupt line configured as a Secure source will be prevented by the GIC hardware. Additionally, Non-secure software can only configure interrupts in the lower half of the priority range, preventing denial-of-service attacks.

Finally, the TrustZone Protection Controller (TZPC) is a signal-control unit. It has three 2-bit registers to control up to 8 signals.

The TrustZone Hardware, where hardware extensions enforce a separation of secure and non-secure software, is more resource-efficient than the use of two separate processors.

Figure 2.6 shows the block diagram of AXI Bus TZ-Enabled.


Fig. 2.6: AXI Bus TZ-Enabled [1]

2.2.2 World Switching Mechanism

In order to switch in a synchronous manner between worlds, TrustZone in ARM-A introduced a special form of Software Generated Interrupt called the Secure Monitor Call (SMC) (see Figure 2.5). When the processor executes the Secure Monitor Call (SMC), the core enters Secure Monitor mode to execute the Secure Monitor code. This instruction can only be executed in privileged modes, so when a User process wants to request a change from one world to the other it must first execute an SVC instruction. This changes the processor to a privileged mode, where the Supervisor call handler processes the SVC and executes an SMC.

The Secure Monitor mode is typically responsible for switching between worlds. The recommended way to return from an SMC call is to:

1. Toggle the NS bit in the SCR (so setting it if we are going to the Non Secure world or clearing it if we are going to the Secure world)

2. Execute a MOVS, SUBS or RFE.

All ARM implementations ensure that the processor cannot execute the prefetched instructions that follow MOVS, SUBS, or equivalents, with Secure access permissions.
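As a minimal sketch of step 1, the following C function (assuming GCC inline assembly on an ARMv7-A core with the Security Extensions; the function name is illustrative) toggles the NS bit of the SCR. The actual world switch then completes when the monitor returns with MOVS, SUBS or RFE as described above.

/* Illustrative SCR.NS toggle, to be executed in Monitor mode only. */
static inline void monitor_toggle_ns(void)
{
    unsigned int scr;

    __asm__ volatile ("mrc p15, 0, %0, c1, c1, 0" : "=r" (scr)); /* read SCR (CP15 c1,c1,0) */
    scr ^= 1u;                                                   /* flip the NS bit (bit 0) */
    __asm__ volatile ("mcr p15, 0, %0, c1, c1, 0" :: "r" (scr)); /* write SCR back          */
    __asm__ volatile ("isb");                                    /* synchronize the change  */
}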


However, the world switching mechanism is also supported asynchronously by the hardware, for instance when an interrupt for the Secure World is raised while the Non-Secure World is running, and/or vice versa. This mechanism is better explained in Chapters 4 and 5.

2.3 ARMv7-A with VE and SE

The ARM architecture used for this work is a 32-bit ARMv7-A Cortex-A15 with the Virtualization Extension and the Security Extension.

Processor modes As shown in Table 2.1, an ARMv7 processor has up to 9 different modes, depending on whether optional extensions have been implemented. The usr mode, which has privilege level 0, is where user space programs run. The svc mode, which has privilege level 1, is where most parts of the kernel execute. However, some kernel modules run in special modes instead of svc. For example, when a data abort exception happens, the processor switches to the abt mode automatically. The current processor mode is determined by the mode field (M) of the current program state register (CPSR). A processor mode change can be triggered by exceptions, such as the aforementioned data abort exception, or a privileged program can directly write the CPSR by executing an MSR CPSR_c, #imm instruction, where c stands for the control field, which includes the processor mode bits and the interrupt mask bits.

Mode Abbr. Privilege level Security state

User usr PL0 both

Supervisor svc PL1 both

System sys PL1 both

Abort abt PL1 both

IRQ irq PL1 both

FIQ fiq PL1 both

Undefined und PL1 both

Monitor* mon PL1 Secure only

Hyp** hyp PL2 Non-secure only

* only implemented with Security Extension
** only implemented with Virtualization Extension

Tab. 2.1: Processor modes

Processor states With the security extensions, a processor has two security states, namely the secure state (s) and the non-secure state (ns). The distinction between the two states is orthogonal to the mode protection based on privilege levels, except that the mon mode is only available in the secure state and the hyp mode, which is implemented with the virtualization extensions, only exists in the non-secure state. The current processor state is determined by the least significant bit of the secure configuration register (SCR) in the CP15 coprocessor.

Core registers Figure 2.7 compares the ARMv7 architecture core registers between the application level view and the system level view. From the application level perspective, an ARMv7 processor has 15 general-purpose 32-bit registers (R0 to R14), a 32-bit program counter R15, also known as PC, and a 32-bit application program state register (APSR). Two of these general-purpose registers can be used for special purposes: R13, also known as SP, is usually used as the stack pointer; R14, also known as LR, is usually used to store the return address. APSR is an application level alias for CPSR, and it must only be used to access the condition flags.

Fig. 2.7: ARMv7-A Core Register (picture from [1])

From the system level view, these registers are arranged into several banks, which means a register name is mapped to a collection of different physical registers, governed by the current processor mode. As shown in Figure 2.7, each mode except the system mode of the processor has:

• its own banked copy of stack pointer SP

• a register that holds a preferred return address for the exception (a banked copy, such as LR_mon, for PL1 modes, or a special register ELR_hyp for the hyp mode)


• a copy of the saved program status register SPSR to save the copy of the CPSR made on exception entry (except for the usr and sys modes).

Saving the value of the CPSR in banked SPSR registers means the exception handler can immediately restore the CPSR on exception return and examine the value the CPSR had when the exception was taken, for example to determine the previous processor mode when the exception took place. In addition, the fiq mode has banked copies of R8 to R12. For example, when a processor is executing in the fiq mode, R0 refers to R0_usr, but R12 refers to R12_fiq instead of R12_usr.

Note that processor core registers and program status registers are not banked between the secure state and the non-secure state. Therefore, a program can use registers to pass parameters between states. Also, during a processor state switch, a privileged program (mostly running in Monitor mode) saves the old state’s register values and restores the new state’s register values. The Monitor software implemented in this work is discussed in more depth in Chapters 4 and 5.

Coprocessors and System Registers The ARM architecture supports sixteen coprocessors, namely CP0 - CP15, of which CP15 (the System Control coprocessor) is reserved in the architecture for the control and configuration of the processor system. Hardware manufacturers can define other coprocessors for their own purposes. The system registers in CP15 are categorized in many groups, which include:

• virtual memory control registers function group (SCTLR, DACR, TTBR0, TTBR1, PRRR)

• PL1 Fault handling registers

• cache maintenance operations

• address translation operations

• Security Extensions registers.

Given the special purpose of the CP15 system registers, many of them are banked between the secure and non-secure states. However, the registers that configure the global system status, such as the SCR, are not banked. Table 2.2 lists some CP15 system registers that are used in this work.


Register Name Security state

VBAR Vector Base Address Register Banked in both states
MVBAR Monitor Vector Base Address Register Secure state, monitor mode
ISR Interrupt Status Register
SCR* Secure Configuration Register Secure state
TTBRx Translation Table Base Register (0), (1) Banked in both states
TTBCR Translation Table Base Control Register Banked in both states
DACR Domain Access Control Register
SCTLR System Control Register Banked in both states
NSACR* Non-Secure Access Control Register Secure state
SDER Secure Debug Enable Register

* only implemented with Security Extension

Tab. 2.2: Some CP15 Registers on ARMv7-A


2.4 Xvisor

This section describes Xvisor, the embedded open-source hypervisor used to validate the techniques proposed in this work, taking as reference [17].

Xvisor, as shown in Figure 2.8, is a complete monolithic Type-1 hypervisor that supports both full virtualization and para-virtualization. It aims at providing a lightweight hypervisor that can be used within embedded systems with low overhead and a small memory footprint. Xvisor primarily provides fully virtualized guests, using hardware-assisted full virtualization when possible (ARM Virtualization Extensions), and also provides para-virtualization in the form of optional VirtIO devices [20]. Xvisor aims at providing a light-weight, portable, and flexible virtualization solution. It provides a high-performance and low-memory-footprint virtualization solution for ARMv5, ARMv6, ARMv7a, ARMv7a-ve, ARMv8a, x86_64, and other CPU architectures.

Fig. 2.8: Xvisor System Architecture (picture from [17])

All core components of Xvisor, such as CPU virtualization, guest IO emulation, background threads, para-virtualization services, management services, and device drivers, run as a single software layer with no prerequisite tool or binary file. The guest OS runs on what Xvisor implementers call Normal vCPUs, having a lower privilege than Xvisor. Moreover, all background processing for device drivers and management purposes runs on Orphan vCPUs with the highest privilege. Guest configuration is maintained in the form of a tree data structure called a device tree [12]. In this way, no source code changes are required for creating a customized guest for embedded systems.

The most important advantage of Xvisor is its single software layer running with the highest privilege, in which all virtualization related services are provided. Xvisor’s context switches are lightweight, resulting in fast handling of nested page faults, special instruction traps, host interrupts, and guest IO events. Furthermore, all device drivers run directly as part of Xvisor with full privilege and without nested page tables, ensuring no degradation in device driver performance. In addition, the Xvisor vCPU scheduler is per-CPU and does not do load balancing for multi-processor systems. The multi-processor load balancer is a separate entity in Xvisor, independent of the vCPU scheduler. Both the vCPU scheduler and the load balancer are extensible in Xvisor.

Virtualization Techniques Regarding virtualization techniques, Xvisor provides Hardware-Assisted Virtualization and Para-Virtualization (in case no Virtualization Extension is available). In particular, the para-virtualization of the sensitive non-privileged instructions is performed off-line by using a Python script ("elf2cpatch.py") that generates a cpatch script from the Guest OS ELF.

The script mainly takes advantage of the fixed instruction length, since each sensitive instruction is translated into an SVC instruction. Taking as reference the example shown in Section 2.1.2, Figure 2.9 highlights the 32-bit instruction encoding according to the specific translated instruction.

Fig. 2.9: Xvisor Guest OS ELF Translation


The coded instruction is composed of a fixed part ([27:20]) and a variable part ([31:28] and [19:0]). The fixed part is needed to identify the SVC instruction. In particular, the Operation field for an SVC instruction is 0xF. The ID field is always set to 0, unless the translated instruction itself is an SVC instruction, in which case the ID field is set to 0xF. The variable part depends on the translated instruction. In Figure 2.9, the green field is specifically related to an MSR instruction using a register as source instead of an immediate value (in this example r0). The SUB_ID field is used to identify the MSR instruction inside the Xvisor handler (in this case it is set to 0b011), so it is customizable. The Condition and Rn fields are related to the specific translated instruction according to the following assembly syntax: MSR[cond] <coproc_register>, <Rn>. The R field is the destination PSR (0 for CPSR and 1 for SPSR). The variable part is instruction-dependent; in particular, the most significant 19 bits can be split into different fields and can also contain empty fields. The Condition field, instead, is always present, containing the original condition field.
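As an illustration of how such a patched word could be decoded by the hypercall handler, the following C sketch extracts the fields described above. Only the Operation/ID layout in bits [27:20] follows that description; the positions assumed for SUB_ID and Rn inside bits [19:0] are hypothetical and chosen only for this example.

/* Illustrative decoder for a patched hypercall word (field positions inside
 * bits [19:0] are assumptions made for this example). */
#include <stdint.h>

struct hypercall {
    uint32_t cond;    /* original condition field, bits [31:28]                         */
    uint32_t id;      /* ID field, bits [23:20]: 0x0, or 0xF if the original was an SVC */
    uint32_t sub_id;  /* class of the translated instruction (e.g. 0b011 for MSR)       */
    uint32_t rn;      /* source/destination register of the original instruction        */
};

static void decode_hypercall(uint32_t insn, struct hypercall *hc)
{
    hc->cond   = (insn >> 28) & 0xFu;
    hc->id     = (insn >> 20) & 0xFu;
    hc->sub_id = (insn >> 17) & 0x7u;  /* assumed position */
    hc->rn     =  insn        & 0xFu;  /* assumed position */
}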

Guest IO Emulation Embedded systems need to run legacy software as virtual machines or guests. This legacy embedded software might expect a particular type of hardware that the hypervisor has to emulate. It is thus imperative that hypervisors have minimal overhead in emulating guest IO events.

Figure 2.10 shows how Xvisor on ARM with Virtualization Extensions implements Guest IO Emulation. The scenario starts at (1), when a guest IO event is trapped by Xvisor ARM, and continues at (2), where the event is handled in a non-sleepable normal (or emulation) context. The non-sleepable normal context ensures a fixed and predictable overhead.

Fig. 2.10: Emulated Guest IO event (picture from [17])

Host Interrupts Embedded systems have to comply with stringent timing constraints when processing host interrupts. In virtualized environments, hypervisors can have additional overheads in processing host interrupts, which in turn affects host IO performance.

Xvisor’s host device drivers generally run as part of Xvisor with the highest privilege. Figure 2.11 shows how Xvisor on ARM with Virtualization Extensions handles host interrupts while a Guest OS is running. A scheduling overhead is incurred only if the host interrupt is routed to a guest that is not currently running.


Fig. 2.11: Host interrupts handling (picture from [17])

Memory Management The ARM architecture with Virtualization Extensions provides two-staged translation tables (or nested page tables) for guest memory virtualization. The guest OS is responsible for programming the stage 1 translation table, which carries out guest virtual address (GVA) to intermediate physical address (IPA) translation. ARM hypervisors are responsible for programming the stage 2 translation table to achieve intermediate physical address (IPA) to actual physical address (PA) translation. Translation table walks are required upon TLB misses. The number of levels of the stage 2 translation table accessed through this process affects the memory bandwidth and the overall performance of the virtualized system: N levels in the stage 1 translation table and M levels in the stage 2 translation table can require N×M memory accesses in the worst case. Clearly, the TLB-miss penalty is very expensive for guests on any virtualized system. To reduce the TLB-miss penalty in a two-staged MMU, ARM hypervisors create bigger pages in the stage 2 translation table.

Xvisor ARM pre-allocates contiguous host memory as guest RAM at guest creation time. It creates a separate three-level stage 2 translation table for each guest. Xvisor ARM can create 4KB, 2MB or 1GB translation table entries in stage 2. Additionally, it always creates the biggest possible translation table entry in stage 2, based on the IPA and PA alignment. Finally, the guest RAM being flat/contiguous (unlike in other hypervisors) helps cache speculative accesses, which further improves memory accesses for guests.
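The "biggest possible entry" policy can be sketched as follows. The C function below is illustrative and is not Xvisor's actual mapping code: it simply picks the largest stage 2 block size (1GB, 2MB or 4KB) allowed by the alignment of the IPA and PA and by the amount of memory still to be mapped.

/* Illustrative selection of the stage 2 block size based on alignment. */
#define SZ_4K (1UL << 12)
#define SZ_2M (1UL << 21)
#define SZ_1G (1UL << 30)

static unsigned long stage2_block_size(unsigned long ipa, unsigned long pa,
                                       unsigned long remaining)
{
    if (!((ipa | pa) & (SZ_1G - 1)) && remaining >= SZ_1G)
        return SZ_1G;   /* both addresses 1GB-aligned and enough left to map */
    if (!((ipa | pa) & (SZ_2M - 1)) && remaining >= SZ_2M)
        return SZ_2M;
    return SZ_4K;       /* fall back to the smallest granule */
}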

Virtual CPU Virtual machines are separated into two major categories (based on their use):

• System Virtual Machine: a system virtual machine provides a complete system platform which supports the execution of a complete operating system (OS).

• Process Virtual Machine: a process virtual machine is designed to run a single program, which means that it supports a single process.

Xvisor refers to system virtual machine instances as "Guest" instances and to the virtual CPUs of system virtual machines as "VCPUs".


A VCPU can be in exactly one state at any given instant of time. A VCPU state change can occur from various locations, such as architecture specific code, some hypervisor thread, the scheduler, some emulated device, etc. It is not possible to have an exhaustive list of all possible scenarios that would require a VCPU state change, but VCPU state changes have to strictly follow a finite-state machine (see Figure 2.12), which is ensured by the hypervisor scheduler.

Fig. 2.12: Finite-state Machine for VCPU state

UNKNOWN: the VCPU does not belong to any Guest and is not an Orphan VCPU. To enforce a lower memory footprint, Xvisor pre-allocates memory based on the maximum number of VCPUs and puts them in this state.

RESET: the VCPU is initialized and is waiting for someone to kick it to the READY state. To create a new VCPU, the VCPU scheduler picks up a VCPU in the UNKNOWN state from the pre-allocated VCPUs and initializes it. After initialization, the newly created VCPU is put in the RESET state.

READY: VCPU is ready to run on hardware.

RUNNING: VCPU is currently running on hardware.

PAUSED: the VCPU has been stopped and can resume later. A VCPU is set in this state (usually by architecture specific code) when it is detected that the VCPU is idle and can be scheduled out.

HALTED: VCPU has been stopped and cannot resume. A VCPU is set in this state (usually by architecture specific code) when some erroneous access is done by that VCPU.
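The state machine can be summarized with the following C sketch. The enumeration mirrors the states described above, while the allowed transitions are only inferred from those descriptions; the names are hypothetical, not Xvisor's actual definitions.

/* Illustrative VCPU state machine (transitions inferred from the text above). */
enum vcpu_state {
    VCPU_UNKNOWN,  /* pre-allocated, not assigned to any Guest          */
    VCPU_RESET,    /* initialized, waiting to be kicked to READY        */
    VCPU_READY,    /* ready to run on hardware                          */
    VCPU_RUNNING,  /* currently running on hardware                     */
    VCPU_PAUSED,   /* stopped, can resume later                         */
    VCPU_HALTED    /* stopped after an erroneous access, cannot resume  */
};

static int vcpu_transition_valid(enum vcpu_state from, enum vcpu_state to)
{
    switch (from) {
    case VCPU_UNKNOWN: return to == VCPU_RESET;
    case VCPU_RESET:   return to == VCPU_READY;
    case VCPU_READY:   return to == VCPU_RUNNING || to == VCPU_PAUSED || to == VCPU_HALTED;
    case VCPU_RUNNING: return to == VCPU_READY   || to == VCPU_PAUSED || to == VCPU_HALTED;
    case VCPU_PAUSED:  return to == VCPU_READY   || to == VCPU_HALTED;
    case VCPU_HALTED:  return 0;  /* cannot resume */
    }
    return 0;
}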


Guest Management A Guest instance consists of the fields shown in Table 2.3.

Field Description

ID Globally unique identification number

Device Tree Node Pointer to Guest device tree node

VCPU Count Number of VCPU instances belonging to this Guest
VCPU List List of VCPU instances belonging to this Guest
Guest Address Space Info Information required for managing Guest physical address space

Arch Private Architecture dependent context of this Guest

Tab. 2.3: Guest Instance Fields

A Guest Address Space is architecture independent abstraction which consist of the fields shown in Table 2.4.

Field Description

Device Tree Node Pointer to Guest Address Space device tree node
Guest Pointer to the Guest to which this Guest Address Space belongs

Region List A set of "Guest Regions"

Device Emulation Context Pointer to private information required by device emulation framework per Guest Address Space

Tab. 2.4: Guest Address Space

Each Guest Region has a unique Guest Physical Address (i.e. the physical address at which the region is accessible to Guest VCPUs) and a Physical Size (i.e. the size of the Guest Region). Furthermore, a Guest Region can take one of three forms:

• Real Guest Region: a Real Guest Region gives direct access to a Host Machine device/memory (e.g. RAM, UART, etc.). This type of region directly maps the guest physical address to a Host Physical Address (i.e. a physical address in the Host Machine).

• Virtual Guest Region: A Virtual Guest Region gives access to an emulated device (e.g. emulated PIC, emulated Timer, etc.). This type of region is typically linked with an emulated device. The architecture specific code is responsible for redirecting virtual guest region read/write access to the Xvisor device emulation framework.

• Aliased Guest Region: An Aliased Guest Region gives access to another Guest Region at an alternate Guest Physical Address.
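A possible C representation of a Guest Region, covering the three forms above, is sketched below. The structure and field names are hypothetical and only serve to illustrate the abstraction; they are not the actual Xvisor definitions.

/* Illustrative Guest Region descriptor. */
#include <stdint.h>

enum guest_region_kind {
    REGION_REAL,     /* direct map of host memory or a host device           */
    REGION_VIRTUAL,  /* backed by an emulated device                         */
    REGION_ALIASED   /* alias of another region at a different guest address */
};

struct guest_region {
    uint64_t guest_phys;            /* address at which the region is visible to Guest VCPUs */
    uint64_t phys_size;             /* size of the region                                    */
    enum guest_region_kind kind;
    union {
        uint64_t host_phys;         /* REGION_REAL: host physical address                    */
        void *emulated_dev;         /* REGION_VIRTUAL: emulated device context               */
        uint64_t alias_of;          /* REGION_ALIASED: guest physical address of the target  */
    } u;
};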

Hypervisor Scheduler The Xvisor scheduler is generic and pluggable with respect to the scheduling strategy (or scheduling algorithm). It updates per-CPU ready queues whenever it gets notifications from the hypervisor manager about VCPU state changes. The hypervisor scheduler uses a per-CPU hypervisor timer event to allocate a time slice to a VCPU. When a scheduler timer event expires for a CPU, the scheduler finds the next VCPU using some scheduling strategy (or algorithm) and configures the scheduler timer event for the next VCPU.

For Xvisor, a Normal VCPU is a black box (i.e. anything could be running on the VCPU) and an exception or interrupt is the only way to get back control. Whenever Xvisor is running, it can be in any one of the following contexts:

• IRQ Context, when serving an interrupt generated from some external device of host machine.

• Normal Context, when emulating some functionality or instruction or emulating IO on behalf of Normal VCPU in Xvisor.

• Orphan Context, when running some part of Xvisor as Orphan VCPU or Thread (Note: Hypervisor threads are described later.)

Xvisor has a special context called the Normal context. The hypervisor is in Normal context only when it is doing something on behalf of a Normal VCPU, such as handling exceptions, emulating IO, etc. The Normal context is non-sleepable, which means that a Normal VCPU cannot be scheduled out while it is in Normal context. In fact, a Normal VCPU is only scheduled out when Xvisor is exiting the IRQ Context or the Normal Context. This helps Xvisor ensure a predictable delay in handling exceptions or emulating IO.

The expected high-level steps involved in an architecture specific VCPU context switch are as follows:

• Save the arch registers (or arch_regs_t) from the stack (saved by the architecture specific exception or interrupt handler) to the current VCPU arch registers (or arch_regs_t).

• Restore the arch registers (or arch_regs_t) of the next VCPU on the stack (they will be restored when returning from the exception or interrupt handler C code).

• Switch context of architecture specific CPU resources such as MMU, Floating point subsystem, etc.

The possible scenarios in which a VCPU context switch is invoked by scheduler are as follows:

• When the time slice allotted to the current VCPU expires, a VCPU context switch is invoked. We call this situation VCPU preemption.

• If a Normal VCPU misbehaves (i.e. does invalid register/memory access) then architecture specific code can detect such situation and halt/pause the responsible Normal VCPU using APIs from hypervisor manager.

• An Orphan VCPU (or Thread) chooses to voluntarily pause (i.e. sleep).


• An Orphan VCPU (or Thread) chooses to voluntarily yield its time slice.

• The VCPU state can also be changed from some other VCPU using hypervisor manager APIs.

Scheduling algorithm Currently, Xvisor supports two scheduling algorithms:

• Fixed Priority Round-Robin (PRR) (default)

• Rate Monotonic (RM)


3 State of the art

This chapter gives an overview of some other hypervisors for ARM. The most popular open-source hypervisors for ARM are:

• XEN [4]

• Kernel-based Virtual Machine (KVM) [11]

• Xvisor [17]

• Xtratum [14]

• OKL4 [7]

Regarding proprietary hypervisors, the following products can be cited:

• IBM VM/360 [8]

• VMware [9]

• WindRiver Hypervisor

• AIR [22]

3.1 Related works based on TrustZone

In this section, some of the various related works based on the TrustZone technology are listed and shortly described.

OP-TEE [13]. OP-TEE stands for Open Portable Trusted Execution Environment. OP-TEE is an open-source project which contains a full implementation to make up a complete Trusted Execution Environment. The project has roots in a proprietary solution, initially created by ST-Ericsson and then owned and maintained by STMicroelectronics. In 2014, Linaro started working with STMicroelectronics to transform the proprietary TEE solution into an open-source TEE solution. In September 2015, the ownership was transferred to Linaro. OP-TEE consists of three components: the OP-TEE Client, which is the client API running in normal world user space; the OP-TEE Linux kernel driver, which handles the communication between normal world user space and the secure world; and the OP-TEE Trusted OS, which is the Trusted OS running in the secure world.

SierraTEE [21]. Sierraware has developed SierraTEE, a secure operating system developed for the ARM TrustZone hardware security extensions. SierraTEE is a comprehensive implementation of ARM TrustZone as well as of the GlobalPlatform System and IPC APIs. It provides a minimal secure kernel which can run in parallel with a more fully featured high-level OS, such as Linux, Android or BSD, on the same core. It also provides drivers for the Rich OS ("normal world") to communicate with the secure kernel ("secure world").

Sprobes [5]. Sprobes is a novel primitive that enables introspection of operating systems running on ARM TrustZone hardware. Using SPROBES, an introspection mechanism protected by TrustZone can instrument individual operating system instructions of its choice, receiving an unforgeable trap whenever any SPROBE is executed. The key challenge in designing SPROBES is preventing the rootkit from removing them; the authors identify a set of five invariants whose enforcement is sufficient to restrict rootkits to execute only approved, SPROBE-injected kernel code.

T-KVM [16]. The Trusted Kernel-based Virtual Machine (T-KVM) is a novel security architecture for the KVM-on-ARM hypervisor, proposed to satisfy the current market trend. T-KVM integrates software and hardware components to secure guest Operating Systems (OSes) and enable Trusted Computing in ARM virtual machines. The architecture combines four isolation layers: the ARM Virtualization and Security Extensions (ARM VE and TrustZone), the GlobalPlatform Trusted Execution Environment (TEE) APIs and the SELinux Mandatory Access Control (MAC) security policy.

LTZVisor [18]. The Lightweight TrustZone-assisted Hypervisor (LTZVisor) is a tool to understand, evaluate and discuss the benefits and limitations of using TrustZone hardware to assist virtualization. LTZVisor demonstrates how TrustZone can be adequately exploited to meet real-time needs, while presenting a low performance cost when running unmodified rich operating systems.


4 Proposed Design

The proposed design is a dual-hypervisor configuration on the same Cortex-A15 processor: one Xvisor running in the Non-Secure World and one in the Secure World. As mentioned in Section 2.3, the Non-Secure World has Virtualization Extension support, so the Non-Secure Xvisor (hereafter called NS Xvisor) provides hardware-assisted full virtualization. In the Secure World there is no support for the Virtualization Extension, so the Secure Xvisor (hereafter called S Xvisor) uses para-virtualization. In this scenario, the Non-Secure Guests are untouched, while the Secure Guests’ binaries must be patched off-line (as shown in Figure 2.9) in order to handle sensitive instructions executed in User mode and support the Trap and Emulate technique.

Fig. 4.1: Architecture Overview

An additional component called X Monitor, running in Monitor mode and belonging to the Secure World, has been designed in order to handle the context switches between the two worlds and all the interrupt sources of the Secure World while the Non-Secure World is running (see Section 5.3).

The Non-Secure Guests are supposed to be Rich OSs such as Linux or Basic Firmware. The Secure Guests can be:


• Secure Server. A passive component that provides services (such as crypto algorithms, secure data, etc.).

• Secure OS. An active component that is supposed to be a Secure OS (such as an RTOS) or a simple basic firmware.

4.1 Scheduling

As mentioned in Section 2.1, virtualized environments lead to the definition of a hierarchical scheduling framework that has at least two schedulers: the global one, used by the Hypervisor to schedule the different Virtual Machines (VMs), and the local one, used by the specific OS (hereafter also called Guest) to schedule the related tasks. In this work, this concept applies to each world. The presence of two hypervisors complicates the system scheduling, since only one world at a time can run on the same processor. The proposed design introduces an additional layer in order to schedule the two worlds, thus leading to the definition of a hierarchical scheduling.

Fig. 4.2: Schedulers Overview

Sched_0 The lowest level, named Global Scheduler, is handled by the X Monitor and it is based on Fixed Priority. The Secure World can be seen as an aperiodic task with the highest priority. Two different events can preempt the Non-Secure World:

• Synchronous call from the Non-Secure World

• Asynchronous interrupt for the Secure World

The first event is based on a synchronous SMC call by the Non-Secure World in order to request a service from the Secure World. In this case, the NS Xvisor directly asks for a preemption, releasing the processor to the Secure World. The SMC call switches the processor into Monitor mode. Subsequently, the X Monitor can perform the context switch. At this point the S Xvisor can process the request. As soon as the request has been processed and there are no ready tasks in the Secure World, the S Xvisor signals the completion of the activity by performing another SMC call. Now the processor is again in Monitor mode, so the X Monitor can switch the Non-Secure context back.


The second event that can trigger a preemption of the Non-Secure World is an interrupt for the Secure World. Since the Secure World has a higher priority, the execution of the Non-Secure World is preempted by the X Monitor, which performs a context switch so that the S Xvisor can process the interrupt. Once the interrupt has been processed and there are no ready tasks in the Secure World, the S Xvisor signals the completion of the activity by performing another SMC call. Now the processor is again in Monitor mode, so the X Monitor can switch back the Non-Secure context.
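The fixed-priority logic of the Global Scheduler can be summarized with the following C sketch; the entry reasons and names are illustrative and do not reproduce the actual X Monitor code described in Chapter 5.

/* Illustrative dispatch decision of the Global Scheduler (Sched_0). */
enum monitor_entry {
    ENTRY_NS_SMC_REQUEST,  /* synchronous SMC from the NS Xvisor requesting a secure service */
    ENTRY_SECURE_IRQ,      /* asynchronous interrupt targeting the Secure World              */
    ENTRY_S_SMC_COMPLETE   /* SMC from the S Xvisor: no more ready tasks in the Secure World */
};

enum world { WORLD_SECURE, WORLD_NON_SECURE };

static enum world monitor_dispatch(enum monitor_entry why)
{
    switch (why) {
    case ENTRY_NS_SMC_REQUEST: /* fall through: both events preempt the Non-Secure World */
    case ENTRY_SECURE_IRQ:
        return WORLD_SECURE;
    case ENTRY_S_SMC_COMPLETE:
    default:
        return WORLD_NON_SECURE; /* secure activity completed: resume the Non-Secure World */
    }
}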

Figure 4.3 shows a possible scenario in which both events occur.

Fig. 4.3: Scheduling Example

The X Monitor cannot be preempted, so each of its executions can be seen as a non-preemptive critical section. In the figure, the jobs performed by the X Monitor have been modeled with a fixed length for the sake of simplicity. However, this is a reasonable approximation, since the worst-case execution time of the X Monitor is easily computable thanks to its highly predictable code flow. All the timing aspects are better described in Chapter 7.

The Secure World cannot be preempted. This choice aims at a more predictable scheduling of the Secure World, at the cost of degrading the performance of the Non-Secure World, and is more suitable for guaranteeing the timing behavior of software running upon an RTOS in the Secure World. This solution partially implements the ARM SMC Calling Convention [3], which defines two types of calls: Fast Calls and Yielding Calls. In particular, only Fast Calls are implemented in this work, while Yielding Calls are left as future work. The ARM SMC Calling Convention adopted in this work is described in more detail in Section 4.4.
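To make the behavior of Sched_0 more concrete, the following C sketch illustrates how a monitor-level dispatcher could react to the events discussed above under the fixed-priority policy. The event and function names (x_monitor_event_t, switch_to_secure, switch_to_non_secure) are purely illustrative assumptions and do not correspond to the actual X Monitor code, which is presented in Chapter 5.

#include <stdbool.h>

/* Illustrative event types: the sources that can hand control between
 * the two worlds (see the list above). */
typedef enum {
    EVT_SMC_FROM_NS,      /* synchronous SMC issued by the Non-Secure World */
    EVT_SECURE_INTERRUPT, /* asynchronous interrupt for the Secure World    */
    EVT_SMC_FROM_S        /* completion signal issued by S Xvisor           */
} x_monitor_event_t;

/* Hypothetical context-switch primitives saving/restoring the banked
 * state of the two worlds. */
extern void switch_to_secure(void);
extern void switch_to_non_secure(void);

/* Fixed-priority dispatch: the Secure World always wins; the Non-Secure
 * World resumes only when S Xvisor signals completion. */
void x_monitor_dispatch(x_monitor_event_t evt)
{
    switch (evt) {
    case EVT_SMC_FROM_NS:       /* service request: preempt the NS world */
    case EVT_SECURE_INTERRUPT:  /* secure interrupt: preempt the NS world */
        switch_to_secure();
        break;
    case EVT_SMC_FROM_S:        /* no ready secure task left: go back    */
        switch_to_non_secure();
        break;
    }
}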

Sched_1 The intermediate level, named Sub-Global Scheduler, is handled by the corresponding hypervisor: the S Xvisor in the Secure World and the NS Xvisor in the Non-Secure World. In this work, both of them provide Prioritized Round Robin and Rate Monotonic scheduling (as described in Section 2.4 regarding the hypervisor scheduler).

Sched_2 The highest level, named Local Scheduler, is handled by the Guest Operating System, which schedules its own tasks within the related Virtual Machine. This level is outside the scope of this work.


4.2 Secure Guests

Secure Guests can run as active or passive guests. In both cases, an important constraint is that a Secure Guest must release the processor at job completion, so that it can switch its state to Pause. If a Secure Guest keeps the processor without signaling the completion of its activity, the S Xvisor scheduler cannot release the processor to the Non-Secure World, since it still has a ready task to execute. An active Guest releases the processor by executing a WFI (Wait For Interrupt) instruction, while a passive Guest does so with an SMC (Secure Monitor Call) instruction. Recall that the Secure Guests run in a para-virtualized environment, so both instructions are translated off-line into hypercalls. In both cases, the S Xvisor scheduler removes the corresponding Guest from the ready queue and, once the queue is empty, the S Xvisor can finally release the processor to the Non-Secure World by performing a real SMC.
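To illustrate this constraint, the fragment below sketches the main loop of a passive Secure Server and the idle path of an active Secure OS. The hypercall wrappers (hvc_signal_completion, hvc_wait_for_interrupt) and the request helpers are hypothetical names standing for the para-virtualized replacements of SMC and WFI produced by the off-line patching; they are not part of the Xvisor API.

#include <stdint.h>

/* Hypothetical hypercall wrappers: in the para-virtualized Secure
 * Guests, SMC and WFI are replaced off-line by hypercalls to S Xvisor. */
extern void hvc_signal_completion(void);  /* replaces SMC: job done, pause me */
extern void hvc_wait_for_interrupt(void); /* replaces WFI: idle until next event */

extern int  get_pending_request(uint32_t *func_id); /* illustrative */
extern void serve_request(uint32_t func_id);        /* illustrative */

/* Passive Secure Server: serve one request, then release the CPU so
 * that S Xvisor can hand the processor back to the Non-Secure World. */
void secure_server_main(void)
{
    uint32_t func_id;

    for (;;) {
        if (get_pending_request(&func_id))
            serve_request(func_id);
        hvc_signal_completion();   /* guest is moved to the Pause state */
    }
}

/* Active Secure OS idle task: yields via the (para-virtualized) WFI. */
void secure_os_idle(void)
{
    for (;;)
        hvc_wait_for_interrupt();  /* ready queue empty: release the CPU */
}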

4.3 Inter-World Communication

When a Non-Secure Guest needs to request a secure service from a Secure Guest, it must use the physical general-purpose registers for passing arguments, as specified by the ARM SMC Calling Convention [3]. The X Monitor preserves the r0-r7 general-purpose registers when performing a context switch caused by a synchronous SMC call from the Non-Secure World, so the Secure World finds the arguments in those physical registers. When going back to the Non-Secure World after the requested operation completes, the X Monitor only preserves the r0-r3 physical general-purpose registers, which can be modified by the Secure World: they are used to pass results (such as request accepted and performed, or request denied) from the Secure World to the Non-Secure World. The remaining r4-r7 registers are restored to the previous values of the Non-Secure World, hence remaining untouched.

Additionally, a Non-Secure Guest can share one or more buffers with the Secure World. The declaration of these memory sections must be provided off-line in the Device Tree by using the reserved-memory node; the configuration phase is described in more detail in Section 4.5. The shared memory address and size are sent to the Secure World through the physical general-purpose registers. Section 4.4 describes in detail how to use all the physical general-purpose registers in order to request a secure service.
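As an example of the register discipline described above, the following C fragment sketches how a Non-Secure client could issue a request: arguments travel in r0-r7, results come back in r0-r3, and r4-r7 are restored by the X Monitor. The specific mapping of function id, guest id and buffer address/size onto the registers is only an assumption for illustration; the authoritative layout is the one given in Table 4.1.

#include <stdint.h>

/*
 * Illustrative Non-Secure helper issuing an SMC towards the Secure World.
 * r0..r3 below carry the request (function id, target Secure Guest id,
 * shared buffer address and size); r0..r3 also carry the result on return.
 * Note: the smc instruction requires the ARM Security Extensions to be
 * enabled when assembling.
 */
static inline uint32_t ns_secure_request(uint32_t func_id,
                                         uint32_t s_guest_id,
                                         uint32_t buf_addr,
                                         uint32_t buf_size)
{
    register uint32_t r0 asm("r0") = func_id;
    register uint32_t r1 asm("r1") = s_guest_id;
    register uint32_t r2 asm("r2") = buf_addr;
    register uint32_t r3 asm("r3") = buf_size;

    /* SMC #0 traps into Monitor mode; the X Monitor saves r0-r7,
     * switches to the Secure context and, on the way back, restores
     * r4-r7 while leaving r0-r3 as returned by the Secure World. */
    asm volatile("smc #0"
                 : "+r"(r0), "+r"(r1), "+r"(r2), "+r"(r3)
                 :
                 : "memory");

    return r0; /* e.g. request accepted and performed, or request denied */
}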

4.4 SMC Calling Convention

In the ARM architecture, synchronous control is transferred between the Non-Secure state and the Secure state through Secure Monitor Call exceptions. SMC exceptions are handled by the Secure Monitor. The ARM SMC Calling Convention [3] provides two types of SMC Calls:

• Fast Calls execute atomic operations. The call appears to be atomic from the perspective of the calling process, and returns when the requested operation has completed

• Yielding Calls start operations that can be preempted by a Non-secure interrupt. The call can return before the requested operation has completed

The operation of the Secure Monitor, such as distinguishing between Fast Calls and Yielding Calls, is determined by the parameters that are passed in through the registers. However, the Monitor implemented in this work does not check the type of call and treats all of them as Fast Calls; Yielding Calls are left as future work.
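For reference, the ARM SMC Calling Convention encodes the call type directly in the function identifier passed in r0: bit 31 set denotes a Fast Call, bit 31 clear a Yielding Call, while the other fields select the calling convention, the service owner and the function number. The sketch below reproduces this encoding; the example owner and function number are placeholders, not identifiers defined by this work.

#include <stdint.h>

/* SMC Function Identifier layout defined by the ARM SMC Calling
 * Convention: bit 31 selects Fast (1) vs Yielding (0) calls, bit 30
 * selects the 64-bit convention, bits 29:24 the service owner and
 * bits 15:0 the function number. */
#define SMCCC_FAST_CALL        (1u << 31)
#define SMCCC_SMC64            (1u << 30)
#define SMCCC_OWNER(o)         (((uint32_t)(o) & 0x3Fu) << 24)
#define SMCCC_FUNC_NUM(n)      ((uint32_t)(n) & 0xFFFFu)

#define SMCCC_FUNC_ID(fast, owner, num) \
    (((fast) ? SMCCC_FAST_CALL : 0u) | SMCCC_OWNER(owner) | SMCCC_FUNC_NUM(num))

/* Example: a hypothetical Fast Call to a trusted-OS style service;
 * the owner (50) and function number (0x1) are placeholders only. */
#define EXAMPLE_SECURE_SERVICE  SMCCC_FUNC_ID(1, 50, 0x1)

/* Since the Monitor of this work treats every call as a Fast Call, a
 * future extension could discriminate the two types with a test such as: */
static inline int smccc_is_fast_call(uint32_t func_id)
{
    return (func_id & SMCCC_FAST_CALL) != 0;
}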

From the point of view of the user, secure functions are invoked from the Non-Secure World following the ARM SMC Calling Convention [3]. Table 4.1 describes how the registers are used to pass information such as the function id, the id of the requested guest, and the address and size of the source and destination buffers.

Tab. 4.1: SMC Calling Convention Register Usage


4.5 System Configuration

From now on, the System Manager denotes the entity responsible for configuring the system. The entire system configuration includes four main phases:

1. Hypervisors configuration
2. Resources allocation

3. Shared region configuration

4. Legal connections and entities configuration

Hypervisors configuration The System Manager needs to separately configure the two Xvisors such that there are no shared resources between them. These resources can be devices, such as timers, the RTC and UARTs, as well as memory, such as DRAM. Xvisor needs at least one timer to run. However, an advisable choice is to assign at least one timer and one UART to the Secure Xvisor, so that the user can interact with the S Xvisor through a console. The other devices can be assigned to the Non-Secure Xvisor, unless the board provides devices mainly intended for secure use, such as crypto acceleration engines. It is important to notice that the two hypervisors are not symmetric, especially regarding their privilege mode of operation.

Resources allocation Regarding the memory allocation, the System Manager shall ensure that the parts of RAM allocated to the two hypervisors do not overlap with each other. If the Non-Secure World overlaps a Secure region, this can cause a crash in the Non-Secure World while saving the Secure state (due to the hardware protection). The converse does not hold: if the Secure World overlaps a Non-Secure region, this can cause a crash in both worlds, since the technology gives the Secure World complete access (even to the Non-Secure regions). Another important aspect is that the System Manager shall reserve a memory region to allocate shared buffers between Secure and Non-Secure Guests; this memory region is hereafter called the Shared Region. Regarding the device allocation, after deciding which devices are intended for the Secure World, the System Manager shall configure the bootloader to set up the TrustZone Protection Controller to partition the devices during the boot phase. The TrustZone Protection Controller will then enforce hardware isolation of these devices, denying accesses from the Non-Secure World.
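The non-overlap requirement can be made explicit with a simple off-line check on the RAM partitions assigned to the two hypervisors, as in the sketch below; the region addresses are placeholders chosen for illustration and do not refer to the memory map of a specific board.

#include <stdint.h>
#include <stdbool.h>

/* A RAM region assigned to one of the two hypervisors. */
struct mem_region {
    uint32_t base;
    uint32_t size;
};

/* Two regions overlap iff neither ends before the other begins. */
static bool regions_overlap(const struct mem_region *a,
                            const struct mem_region *b)
{
    return (a->base < b->base + b->size) &&
           (b->base < a->base + a->size);
}

/* Example (placeholder addresses): the System Manager verifies that the
 * Secure and Non-Secure partitions are disjoint before building the two
 * Xvisor configurations. */
static const struct mem_region s_ram  = { 0x80000000u, 0x10000000u };
static const struct mem_region ns_ram = { 0x90000000u, 0x30000000u };

bool memory_config_is_valid(void)
{
    return !regions_overlap(&s_ram, &ns_ram);
}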

Shared Region configuration Each Guest (Secure and Non-Secure) is configured by means of a Unix-like Device Tree. The Device Tree contains information such as the VCPUs and the Address Space. In particular, the Address Space node (called aspace) lists the memory regions that the Guest can access. These regions can be associated with DRAM or with devices such as UARTs, MMC, timers, WDT, etc. Each of them has an address and a size which (of course) are not real host addresses, since these devices will be emulated by Xvisor. In fact, the address attribute is called guest_physical_addr, and it is translated into a different host_physical_addr by Xvisor during the initialization phase.

The System Manager shall manage the requests of Non-Secure Guests to share buffers with their registered Secure Guests. This operation is performed by using the related aspace nodes: in particular, a reserved-memory node shall be added to the Device Trees of both the Secure and the Non-Secure Guest. The idea is that the Non-Secure Guest specifies the shared buffers in terms of guest_physical_addr and physical_size; the System Manager then allocates all the requested buffers in the common Shared Region by finding free slots. Before guest creation, the NS Xvisor overrides the host_physical_addr attribute with the address of the allocated slot.

The Secure Guest has a reserved-memory node whose guest_physical_addr, host_physical_addr and physical_size attributes are equal to 0. The S Xvisor overrides these attributes before guest creation with the information of the corresponding non-secure slot. One constraint on configuring the shared buffer in the Secure Guest is that guest_physical_addr and host_physical_addr must coincide, thus providing a direct map between guest and physical addresses. This constraint mainly derives from the fact that a secure request can carry the physical address and size of shared buffers as arguments: these addresses are directly forwarded to the Secure Guest, which shall use them to access the corresponding memory locations. This configuration implies an additional constraint: shared buffers must be contiguous.
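The override step can be pictured as a very small allocator over the Shared Region, as sketched below: each buffer requested by a Non-Secure Guest receives a contiguous slot whose host physical address is written back into both reserved-memory views, with guest_physical_addr equal to host_physical_addr on the secure side. The data structure and the function are illustrative assumptions, not actual Xvisor types.

#include <stdint.h>

/* Attributes of a reserved-memory node, mirroring the Device Tree
 * fields discussed above (names are illustrative). */
struct shared_buf {
    uint32_t guest_physical_addr;
    uint32_t host_physical_addr;
    uint32_t physical_size;
};

/* Shared Region reserved off-line by the System Manager (placeholder bounds). */
static uint32_t shared_region_free      = 0xBF000000u;
static const uint32_t shared_region_end = 0xBF100000u;

/* Allocate the next contiguous slot and fix up both views of the buffer:
 * on the Non-Secure side only host_physical_addr is overridden, on the
 * Secure side the mapping is direct (gpa == hpa), as required for
 * forwarding physical addresses in secure requests. */
int shared_region_alloc(struct shared_buf *ns, struct shared_buf *s)
{
    if (shared_region_free + ns->physical_size > shared_region_end)
        return -1;                          /* no free slot left */

    ns->host_physical_addr = shared_region_free;
    s->guest_physical_addr = shared_region_free;
    s->host_physical_addr  = shared_region_free;  /* direct map */
    s->physical_size       = ns->physical_size;

    shared_region_free += ns->physical_size;
    return 0;
}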

Legal connections and entities configuration Each guest has a double id: the s/ns_guest_id, used for the secure services, and the hyp_guest_id, internally assigned by the hypervisor for management purposes. The former is the one externally visible and used by the Non-Secure World to address the Secure Guest. The System Manager shall specify the s/ns_guest_id and other additional security information for each Guest (both Secure and Non-Secure) by using the Device Tree. In particular, a new node shall be added to the Device Tree, mainly to identify the role of the guest and the legal inter-world connections. The node, called secure_services, is composed of 4 attributes for a Secure Guest and 3 attributes for a Non-Secure Guest. The secure_services node for a Secure Guest has the following attributes:

• tz_side: a string attribute that can be "secure" for Secure Guests and "non_secure" for Non-Secure Guests. If "secure" is specified for a Non-Secure Guest, or vice versa, the configuration phase fails without loading the Guests.

• character: a string attribute that can be "passive" or "active". In case of "active", the guest is treated as an active guest and is therefore not able to provide secure services: every request from the Non-Secure World to an active guest is discarded. In case of "passive", the guest is treated as a secure server that is always ready to provide secure services on demand.


• s_guest_id: an unsigned integer attribute that specifies the public id visible to the Non-Secure World and used to address requests.

• registered_guests: a vector of space-separated unsigned integers representing the list of Non-Secure Guests that are allowed to perform secure requests.

The secure_services node for a Non-Secure Guest has the following attributes:

• tz_side: a string attribute with the same meaning as specified above for the Secure Guest.

• ns_guest_id: an unsigned integer attribute that specifies the public id visible to the Secure World and used by the NS Xvisor to identify the requesting Guest in a transaction.

• registered_guests: a vector of space-separated unsigned integers representing the list of Secure Guests that are expected to be available to provide secure services.

For better system performance, it is advisable to use a symmetric configuration on both sides, especially for the registered_guests attribute. If the two configurations do not match, the one in the S Xvisor is the critical one, which must of course be correct: the NS Xvisor may even allow a Non-Secure Guest to perform requests to all Secure Guests, since the transaction will anyway be stopped by the S Xvisor (see Section 4.6 for more information about the checking barriers). Conversely, the NS Xvisor can also restrict the registered_guests of a Non-Secure Guest, denying even legal connections. This configuration cannot be modified on-line; it can only be changed off-line before relaunching the system.
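Once parsed from the Device Tree, the attributes listed above can be pictured as a per-guest record similar to the following C sketch; the type and field names are illustrative assumptions and do not correspond to the actual Xvisor data structures.

#include <stdint.h>
#include <stdbool.h>

#define MAX_REGISTERED_GUESTS 8   /* illustrative bound */

enum tz_side   { TZ_SECURE, TZ_NON_SECURE };
enum character { CH_PASSIVE, CH_ACTIVE };  /* meaningful for Secure Guests only */

/* Per-guest security information parsed from the secure_services node. */
struct secure_services {
    enum tz_side   side;               /* tz_side attribute                  */
    enum character character;          /* character attribute (Secure Guests) */
    uint32_t       public_id;          /* s_guest_id or ns_guest_id          */
    uint32_t       registered_guests[MAX_REGISTERED_GUESTS];
    uint32_t       num_registered;
};

/* True iff peer_id appears in the registered_guests list. */
bool is_registered(const struct secure_services *ss, uint32_t peer_id)
{
    for (uint32_t i = 0; i < ss->num_registered; i++)
        if (ss->registered_guests[i] == peer_id)
            return true;
    return false;
}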

4.6 Checking barriers

With checking barriers we denote all the points at which a secure request is subjected to a sanity check. Figure 4.4 shows the three check points that a secure request must pass. This can be considered a redundant configuration, introduced to enforce security and to keep each check as local as possible. This solution also improves the overall system performance, since it can avoid useless context switches between the worlds. In the figure, the association between Secure and Non-Secure Guests is highlighted by colors: a Non-Secure Guest can perform requests only to Secure Guests of the same color. In this work we developed check1 and check2, while check3 is up to the Secure Guest.


Fig. 4.4: Checking barriers
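Building on the per-guest record sketched in Section 4.5, check1 (performed by the NS Xvisor) and check2 (performed by the S Xvisor) can be summarized as follows. The function names and the exact split of responsibilities are an illustrative assumption consistent with the configuration rules of Section 4.5; check3 remains entirely guest-specific.

#include <stdbool.h>
#include <stdint.h>

/* See the secure_services record sketched in Section 4.5. */
struct secure_services;
extern bool is_registered(const struct secure_services *ss, uint32_t peer_id);
extern bool is_passive(const struct secure_services *ss);   /* hypothetical helper */

/* check1 (NS Xvisor): is the requesting Non-Secure Guest allowed to reach
 * the target Secure Guest? A failed check avoids a useless world switch. */
bool check1(const struct secure_services *ns_guest, uint32_t s_guest_id)
{
    return is_registered(ns_guest, s_guest_id);
}

/* check2 (S Xvisor): is the target a passive Secure Server, and is the
 * requesting Non-Secure Guest listed among its registered guests? This is
 * the authoritative check, regardless of the Non-Secure configuration. */
bool check2(const struct secure_services *s_guest, uint32_t ns_guest_id)
{
    return is_passive(s_guest) && is_registered(s_guest, ns_guest_id);
}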
