Contents
1 INTRODUCTION 3
2 NETFPGA: A RESEARCH PLATFORM 7
2.1 The NetFPGA architecture 9 2.1.1 Pipeline Interface Details 11 2.1.2 THE IPV4 ROUTER 13 2.2 What is a FPGA? 17 2.2.1 INNER STRUCTURE 18 2.2.2 FPGA Design and Programming 21 2.3 Verilog: some hints 23 2.3.1 A Brief History 24 2.3.2 A simple example 25 2.3.3 Language fundamentals 27
3 PREFIX MATCH LOOKUP ALGORITHMS 29
3.1 Introduction to Prex Lookups 29 3.1.1 Variable{length prexes 30 3.1.2 Lookup Model 31 3.2 Non{algorithmic techniques for prex matching:
Ternary Content Addressable Memories (TCAM) 33 3.3 Tries based algorithms 35 3.3.1 Multibit Tries 37 3.3.2 Level{compression: Lulea Tries 40 3.3.3 Tree Bitmap Idea 43
3.4 Longest Prex Matching using Bloom Filters 47 3.4.1 Bloom Filter Theory 48 3.4.2 Basic Conguration 52 3.4.3 Optimization 54 3.4.4 Performance 56
4 IP LOOKUP WITH BLOOMING TREES ON NETFPGA 59
4.1 An overview 59 4.2 Minimal Perfect Hash Function: a base for fast prefix lookup 61 4.2.1 Blooming Tree 61 4.2.2 The MPHF construction 63 4.3 The MPHF lookup Algorithm 68 4.3.1 The Forwarding Plane 70 4.3.2 The Control Plane 90 4.4 A lookup example 94 4.5 The update process 98 4.6 Considerations and Optimizations 102
5 PERFORMANCE EVALUATION 105
5.1 Memory requirements 105 5.2 False Positive Probability 106 5.3 SRAM accesses 107 5.4 Update and Construction Time 108 5.5 Device Utilization 109 5.6 Development Simulations 111
Chapter 1
INTRODUCTION
This thesis work shows the implementation of a new solution for the IP lookup function carried out by routers in a network. Packet forwarding in IP routers is performed according to the packet destination address which is matched, in a Longest Prefix Match(LPM) fashion, against several thousands of entries
in a “Forwarding Table". This search for the Longest Prex
Match of the IP destination address is commonly referred to as IP lookup. The explosive growth of the Internet has translated into an unceasing reduction of the time-budget for packet pro-cessing and a growth of the number of entries in the Forwarding Tables, therefore this fundamental yet simple functionality has now become a critical task, which can often be the bottleneck in high performance routers. That is why a large variety of new algorithms have been presented, trying to improve the e-ciency and speed of the lookup. The Algorithm here proposed is based on data structures called Blooming Trees, compact and
fast techniques for membership queries. A Blooming Tree is a Bloom Filter based structure, which takes advantage of low false positive probability in order to reduce the mean number of memory accesses.
A Novel IP Lookup Algorithm with Minimal Perfect Hash Function for high performance routers based on NetFPGA
ity of an algorithm for high performance routers, given that it strongly inuences the mean time required for a lookup process. An array of parallel Blooming Trees accomplishes the Longest Prex Match function for the entries of the Forwarding Table by storing the entries belonging to the 16-32 bit range. Shorter entries, intead, are stored in a very simple Direct Addressing logical block. Direct Addressing uses the address itself (in this case only the 15 most signicant bits) as on oset to memory locations. Every Blooming Tree (hereafter BT) has been set up according to the MPH function [8], a scheme conceived to obtain memory ecient storage and fast item retrieval.
The implementation platform for this algorithm is the NetF-PGA [3] board, a new networking hardware which proves to be a perfect tool for research and experimentation. It is composed of a full programmable Field Programmable Gate Array (FPGA) core, four Gigabit Ethernet ports and four banks of Static and Dynamic Random Access Memories (S/DRAM). NetFPGA has been designed as part of the Stanford University project named Clean Slate, a program which focuses on unconventional, bold, and long{term research that tries to break the network's ossi-cation in order to improve it. This work is primarily focused on the central FPGA, where the Verilog language, an Hardware Description Language (HDL) describing directly the bit ows over the AND/OR/NOT ports, is adopted.
In details, a set of static Blooming Trees structure is asso-ciated to the actual Forwarding Table and stored in fast Block on{chip RAM, while a second structure that stores the next-hop data is located onto the bigger NetFPGA SRAM. The lookup mechanism consists of a query in the BT array searching for a match and, in the case of a positive search, a query to the SRAM is performed in order to verify the matching. Since a BT always provides a non-zero false positive probability there
could be an erroneous matching: in this case a new query is car-ried out. Finally if there is no correspondence for the searched IP address in the BT block of the algorithm, a simple Direct Addressing of the 15 most signicant bits is done.
A Novel IP Lookup Algorithm with Minimal Perfect Hash Function for high performance routers based on NetFPGA
A software control plane manages the algorithm, controlling the database construction and its update (adding or removing entries). In this sense the control module merges perfectly in the preexistent SCONE (Software Component of the NetFPGA). The thesis is organized as follows: in the second Chapter a complete NetFPGA overview is presented, discussing about its main components and performances together with a survey on the clean slate projects which better claries the scenario and the motivations that brought to the NetFPGA design. The
Verilog language fundamentals, adopted to program the FPGA) and the development environment (Xilinx ISE Foundation) [4] are also presented.
Chapter 3 describes the details of the IP lookup function in a modern router and the state{of{the{art algorithms actu-ally implemented onto the most modern machines. A hard-ware technique known as TCAM (Ternary Content Addressable Memory), nowadays the fastest but also the most expensive so-lution, is discussed and compared to the most used techniques based on Tries (Multi-bit, Binary, Lulea and Bitmap Tries). An extensive comparison between these implementation tries also to explain what are the best solutions for each case. At the end of this chapter, section 3.4 presents Bloom Filter as a general data structure for membership queries.Chapter 4 is a complete de-scription of the new algorithm, presented in two distinct parts: the forwarding process which contains the Blooming Trees ar-ray constructed in the Verilog language, and the control plane dened in C language. Various examples of lookup and update mechanisms to demonstrate the behavior of the whole algorithm are also reported. Finally, Chapter 5 shows the simulation re-sults and the expected performances of the entire work, giving particular attention to the critical aspects of our problem such as the mean memory access time, the mean number of memory
accesses, the update and construction mean time, an estimated false positive probability, the storage occupancy and the scala-bility.