Enhanced network processing in the Cloud Computing era

UNIVERSITÀ DI PISA

DIPARTIMENTO DI INGEGNERIA DELL’INFORMAZIONE Dottorato di Ricerca in Ingegneria dell’Informazione

Activity Report by the Student Vincenzo MAFFIONE for the 1st, 2nd and 3rd year of the PhD Program, cycle XXXI

Tutor(s): Dr. Giuseppe LETTIERI

Research Activity

First year

In the first part of the 1st year I developed PTNET, a high-performance solution for Virtual Machine networking. PTNET is based on the Netmap project, combined with a network device pass-through solution for VMs. This solution has been implemented for the Linux kernel and the QEMU hypervisor, and experimentally evaluated. This activity involved several contributions to the stabilization and improvement of the Netmap open source software, which is publicly available on the GitHub platform. This work has been published as a paper at the IEEE LANMAN 2016 conference in Rome.

I have also been involved in the research activities of the EU project SSICLOPS. In this project I developed a Linux driver for a network interface card (NIC) implemented on an FPGA, in collaboration with the University of Cambridge.

In the second part of the year I worked on the design and implementation of PSPAT, together with my tutor. PSPAT is an architecture for high-performance packet scheduling for 10 Gbit and 40 Gbit NICs. PSPAT allows effective scheduling at very high packet rates (10-40 Mpps), and supports a large number of transmitting processes (in the thousands). The PSPAT architecture goes beyond the traditional schemes, as it delegates the scheduling activities to a dedicated process, separated from the transmitting clients.

Over the summer I participated in the Google Summer of Code 2016, with the FreeBSD foundation. In this project I implemented a PTNET prototype for the FreeBSD kernel and the bhyve hypervisor. At the end of the summer I presented my work at the FreeBSD developer summit in Belgrade, Serbia. The code was later incorporated in FreeBSD 12.x releases.

Second year

In my 2nd year of the PhD Program I kept exploring the research areas identified during the previous year. These topics include:

- Design and analysis of efficient architectures for networks of virtual machines;
- Definition and analysis of models to describe performance issues relevant for such architectures.

These activities have been carried out in the context of the EU project SSICLOPS, and in tight collaboration with my tutors.

The first part of the year was dedicated to the study of analytical models to describe throughput, latency and energy efficiency of single-producer-single-consumer (SPSC) packet processing systems. SPSC links are commonly used to interconnect the processing elements that constitute high-speed packet processing pipelines or graphs. The concept of a software packet processing graph is at the core of the Network Function Virtualization (NFV) paradigm. In the common case where each element is run by a different process (or thread), unexpected performance issues may arise depending on how the SPSC link is implemented and/or configured. As an example of such an unexpected problem, optimizations in one element may cause the whole processing graph to run at a lower throughput. More generally, it is not trivial to decide how to implement these links, given the multiple constraints and objective functions that may be specified by the user. This study tries to address this gap by providing an analysis of common problems and guidelines to fix them. We focused on different SPSC synchronization methods, including sleeping, notifications and busy waiting. The work has been published as a journal paper.

In the second part of the year I worked on a survey of different fast network I/O frameworks to be used for NFV, focusing on the data-path components (i.e. not taking into account the control/management layer). First, a set of desirable features was defined, including raw performance, flexibility, portability, amount of specialized hardware required and/or software to be rewritten, attention to energy consumption issues and so on. Second, a number of promising frameworks (e.g. Netmap, DPDK, Snabb, NetVM, …) were evaluated against these features. Although these fast I/O solutions share many common ideas, none was found to be better than the others with respect to all the features. As a consequence, the choice of a framework depends on which desirable features are deemed more important. The survey has been published as a conference paper.

In addition to that I have worked together with other SSICLOPS partners to implement and evaluate HyperNF, a scalable packet processing architecture for NFV. This work shows that the very common choice of performing hypervisor-side network I/O processing in a separate CPU is not usually optimal for resource utilization or overall throughput. Instead, in many situations it is more convenient to perform hypervisor-side processing using the same CPUs used to run the

Finally, in August we held a Netmap tutorial at SIGCOMM 2017, with the goal of explaining how the Netmap framework can be used to easily write and deploy fast I/O processing chains for NFV applications.

Third year

In the 3rd year of the PhD Program I kept exploring the research areas identified during the previous years. In particular, I addressed two main topics:

- Experimentation with novel architectures for network packet scheduling;
- Design and analysis of efficient Single Producer Single Consumer (SPSC) queues.

These activities have been carried out in the context of the EU project SSICLOPS, and in tight collaboration with my tutors.

The first part of the academic year was spent on finalizing our PSPAT work on packet scheduling, for its final publication in an international journal. PSPAT (Parallel Scheduling PArallel Transmission) is a novel architecture for high-performance packet scheduling and transmission, targeting use cases such as data centers, network service providers, and Network Function Virtualization (NFV). PSPAT overcomes the performance limitations of traditional packet scheduling architectures (e.g. Linux TC or FreeBSD ALTQ) by delegating the scheduling and transmission tasks to one or more dedicated threads. Client threads are decoupled from these dedicated threads by means of extremely efficient (and lockless) SPSC queues. PSPAT is an enabling technology for the above-mentioned use cases, because it allows scheduling at unprecedented packet rates, e.g., 10-40 million packets per second (Mpps).

The activity included extended experimentation and careful measurements on two independent PSPAT implementations (for Linux systems), to offer numerical evidence of its advantages over the existing alternatives. The work was finally published in May 2018 in an international journal.

The rest of the academic year was dedicated to the design, analysis and experimentation of cache-efficient SPSC queues. This research started as a follow-up of PSPAT, as we wanted to confirm that the SPSC queues used in PSPAT were indeed optimal in terms of interactions with the memory and cache subsystems. In particular, we were interested in comparing them against the more traditional Lamport’s lock-free queues. This investigation was then significantly broadened, by studying six general-purpose SPSC queues and their properties. Two of these queues are original contributions of our research work. All of the queues have been analyzed in depth to derive predictive models for the average number of cache misses under several workloads, including throughput-sensitive and latency-sensitive ones. Cache-miss behaviour is important because cache misses give the most relevant contribution to the overhead of an SPSC queue. Extensive experiments have been carried out to validate the models and measure throughput and latency. This research work has been submitted to an international journal, receiving positive feedback. A minor revision has also been submitted, and we are currently waiting for the response.
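As an illustration of the baseline mentioned above, the following is a minimal sketch of a Lamport-style lock-free SPSC queue written with C11 atomics. It is not the code used in PSPAT or in the submitted paper: the names (`lamport_queue`, `lq_enqueue`, `lq_dequeue`, `QLEN`) are hypothetical, and the sketch assumes a variant with free-running indices and a power-of-two capacity. Note that each operation reads the index owned by the other side, which is the kind of cross-core cache traffic that a cache-aware design tries to reduce.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define QLEN 8u /* hypothetical capacity; must be a power of two */

struct lamport_queue {
    _Atomic uint32_t write; /* next slot to fill; advanced only by the producer */
    _Atomic uint32_t read;  /* next slot to drain; advanced only by the consumer */
    void *slots[QLEN];      /* NULL is reserved to mean "queue empty" on dequeue */
};

static void lq_init(struct lamport_queue *q)
{
    /* Free-running indices are masked, so the capacity must be a power of two. */
    assert((QLEN & (QLEN - 1)) == 0);
    atomic_init(&q->write, 0);
    atomic_init(&q->read, 0);
}

/* Producer side: returns false if the queue is full. */
static bool lq_enqueue(struct lamport_queue *q, void *item)
{
    uint32_t w = atomic_load_explicit(&q->write, memory_order_relaxed);
    /* Reading the consumer's index may miss in the producer's cache. */
    uint32_t r = atomic_load_explicit(&q->read, memory_order_acquire);

    if (w - r == QLEN)
        return false;
    q->slots[w & (QLEN - 1)] = item;
    /* Release: the slot content becomes visible before the new index. */
    atomic_store_explicit(&q->write, w + 1, memory_order_release);
    return true;
}

/* Consumer side: returns NULL if the queue is empty. */
static void *lq_dequeue(struct lamport_queue *q)
{
    uint32_t r = atomic_load_explicit(&q->read, memory_order_relaxed);
    /* Reading the producer's index may miss in the consumer's cache. */
    uint32_t w = atomic_load_explicit(&q->write, memory_order_acquire);

    if (r == w)
        return NULL;
    void *item = q->slots[r & (QLEN - 1)];
    /* Release: the slot is handed back to the producer only after it is read. */
    atomic_store_explicit(&q->read, r + 1, memory_order_release);
    return item;
}
```

In this sketch exactly one thread may call `lq_enqueue` and exactly one thread may call `lq_dequeue`; no locks are needed because each index has a single writer.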

Research periods abroad

During the second and third year I spent 14 summer weeks as an intern at Google’s offices in Sunnyvale (CA) and Madison (WI), USA. It was an important educational experience where I could apply my background in virtual machine networking and efficient notification systems in the context of Google Cloud. In particular I worked on the Andromeda virtual switch, optimizing the notification system between the VMs and the virtual switch itself. Among other things, I had the chance to clearly see how the internal technologies used in Google’s virtualization platforms are aligned with the ideas and techniques that I have been working on during my PhD.

Publications

International Journals

[J1] G. Lettieri, V. Maffione, L. Rizzo: "A Study of I/O Performance of Virtual Machines", The Computer Journal, 28 September 2017

[J2] L. Rizzo, P. Valente, G. Lettieri, V. Maffione: "PSPAT: software packet scheduling at hardware speed", Computer Communications 120 (2018): 32-45.

[J3] V. Maffione, G. Lettieri, L. Rizzo: "Cache-aware design of general purpose Single Producer Single Consumer queues", Software Practice and Experience (2018)

International Conferences/Workshops with Peer Review

[C1] G. Lettieri, V. Maffione, L. Rizzo: "A survey of fast packet I/O technologies for Network Function Virtualization", VHPC 2017, 12th Workshop on Virtualization in High-Performance Cloud Computing

[C2] K. Yasukata, F. Huici, V. Maffione, G. Lettieri and M. Honda: "HyperNF: Building a High Performance, High Utilization and Fair NFV Platform", ACM Symposium on Cloud Computing (SoCC), September 2017

[C3] L. Rizzo, G. Lettieri, V. Maffione: "Very high speed link emulation with TLEM", 2016 IEEE International Symposium on Local and Metropolitan Area Networks (LANMAN)

[C4] V. Maffione, L. Rizzo, G. Lettieri: "Flexible Virtual Networking using netmap passthrough", IEEE LANMAN 2016, Rome, June 2016

[C5] L. Rizzo, S. Garzarella, G. Lettieri, V. Maffione: "A Study of Speed Mismatches Between Communicating Virtual Machines", IEEE/ACM ANCS 2016, Santa Clara, March 2016

Others

[O1] G. Lettieri, V. Maffione, L. Rizzo, M. Honda: "The Netmap framework for NFV applications", Full-Day tutorial held at ACM SIGCOMM 2017

Training activities

● “Fuzzy Logic and Fuzzy Systems” (3 credits)

● “Academic Writing and Presentation Skills” (4 credits)

● “Middleware and robotic software programming” (3 credits)

● “Signal processing and mining of big data: biological data as case study” (5 credits)

● “Introduzione allo studio delle equazioni differenziali alle derivate parziali” (4 credits)

● “Design concepts of Radio Frequency (RF) and microwave circuits for wireless applications” (4 credits)