
The new ENEA CRESCO3 and CRESCO4 HPC systems

F. Ambrosino, G. Bracco, A. Colavincenzo, A. Funel, G. Guarnieri, B. Mastroianni, S. Migliori, G. Ponti

D. Abate, G. Aprea, T. Bastianelli, F. Beone, R. Bertini, M. Caporicci, B. Calosso, M. Chinnici, R. Ciavarella, A. Cucurullo, P. D’Angelo, D. De Chiara, M. De Rosa, P. De Michele, G. Furini, D. Giammattei, S. Giusepponi, R. Guadagni,

A. Italiano, S. Magagnino, A. Mariano, G. Mencuccini, C. Mercuri, P. Ornelli, F. Palombi, S. Pecoraro, A. Perozziello, S. Pierattini, S. Podda, F. Poggi,

A. Quintiliani, A. Rocchi, C. Sciò, F. Simoni

Technical Unit for IT and ICT Systems Development
ENEA — Italian National Agency for New Technologies, Energy and Sustainable Economic Development
Lungotevere Thaon di Revel 76, 00196 Rome, Italy

Abstract. We introduce the CRESCO3 and CRESCO4 clusters, the new HPC systems at the ENEA Portici Research Center. CRESCO3 was released in May 2013, whereas CRESCO4 was installed, configured and tested after the summer of 2013. They both represent the latest offspring of the ENEA CRESCO HPC systems, the ENEA x86_64 cluster family, fully integrated in ENEAGRID, the infrastructure which gathers all the computational facilities located at several ENEA sites in Italy. CRESCO3 is installed in the old data center room which also hosts the CRESCO1 and CRESCO2 clusters, whereas CRESCO4 is hosted in a newly built computer room, with dedicated power supply and cooling subsystems.

1 Introduction

ENEAGRID is a distributed hardware/software infrastructure which gathers, connects and seamlessly makes available to users all ENEA computing facilities, i.e., HPC clusters and specific research instruments, such as lab equipment and remote rendering machines. Year by year, ENEAGRID evolves, reshaping and redefining itself according to ENEA strategic goals, continuously acquiring computational resources and software and keeping pace with worldwide HPC trends.

Here we introduce CRESCO3 and CRESCO4, the latest ENEA x86_64 Linux-operated HPC systems, hosted at the Portici Research Center, the major ENEAGRID computational site. CRESCO3 and CRESCO4 have been purchased in the framework of the projects of the Italian National Operational Programme (PON) 2007-2013, namely IT@CHA and LAMRECOR for CRESCO3, and TEDAT for CRESCO4 [1, 2, 3]. Moreover, unlike previous CRESCO HPC systems, CRESCO4 has been designed and configured in-house in all its elements: the HPC core (computing nodes, Ethernet and InfiniBand networks, storage system), the computer room, the cooling and the power supply subsystems, with separate tenders for each of the main items and with a good and effective interaction with each of the vendors. In this way we had full control of all the crucial components, with great advantages for the definition of the tender specifications and for all the subsequent phases: installation, testing and operation.

The paper is structured as follows: the CRESCO3 and CRESCO4 HPC systems are described in Section 2, covering the architectural details of the computing nodes together with a brief description of the ENEAGRID infrastructure. Section 3 details the new data center room, including the cooling and fire suppression systems. The power supply is described in Section 4. Finally, conclusions are drawn in Section 5.

2 Architecture of CRESCO3 and CRESCO4

The CRESCO computing laboratory at the Portici Research Center consists of various Linux x86_64 clusters. As the name suggests, the CRESCO3 cluster is the third member of the CRESCO HPC systems hosted at the ENEA Portici Research Center [4].

Figure 1: The ENEA CRESCO3 cluster.

It is composed of 84 server nodes distributed over two racks (see Fig. 1), and each node has the following characteristics:

• 2 CPU sockets, processor type AMD Opteron™ 6234, 12 cores at 2.4 GHz;
• 1 local HD of 500 GB SATA II;
• 1 InfiniBand QDR 40 Gbps interface;
• 2 GbE interfaces.

The whole CRESCO3 cluster has a total of 2016 cores; its peak computing power amounts to Rpeak = 19 Tflops and its measured HPL efficiency is 75%.
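As a consistency check (a sketch, assuming the nominal 4 double-precision floating-point operations per clock cycle per core of the Opteron 6200 family, a figure not stated in the text), the quoted peak can be reproduced as

\[
R_{\mathrm{peak}} = 2016\ \mathrm{cores} \times 2.4\ \mathrm{GHz} \times 4\ \mathrm{FLOP/cycle} \simeq 19.4\ \mathrm{Tflops},
\qquad
R_{\mathrm{max}} \simeq 0.75 \times R_{\mathrm{peak}} \simeq 14.5\ \mathrm{Tflops}.
\]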

Figure 2: The ENEA CRESCO4 cluster.

CRESCO4, the fourth and latest member of the family [4], consists of six racks: five are dedicated to computing nodes and one hosts the main InfiniBand switch, as shown in Fig. 2.

Here are some technical specifications of the machine. The system is composed of 38 Supermicro F617R3-FT chassis, each hosting 8 dual-CPU nodes. The whole CRESCO4 cluster has a total of 4864 cores. The peak computing power of the machine amounts to Rpeak = 101 Tflops and its measured HPL efficiency is 85%. Each node hosts:

• 2 CPU sockets, processor type Intel Xeon E5-2670 (codenamed Sandy Bridge), 8 cores at an operational frequency of 2.6 GHz;
• 64 GB of RAM (4 GB per core);
• 1 local HD of 500 GB;
• 1 InfiniBand QDR 40 Gbps interface;
• 2 GbE interfaces;
• BMC/IPMI 1.8 support and software for remote console management and control.
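The quoted CRESCO4 peak figure can be reproduced in the same way (a sketch, assuming the 8 double-precision FLOPs per cycle per core of the AVX-capable Sandy Bridge architecture, not stated explicitly in the text):

\[
R_{\mathrm{peak}} = 4864\ \mathrm{cores} \times 2.6\ \mathrm{GHz} \times 8\ \mathrm{FLOP/cycle} \simeq 101.2\ \mathrm{Tflops},
\qquad
R_{\mathrm{max}} \simeq 0.85 \times R_{\mathrm{peak}} \simeq 86\ \mathrm{Tflops},
\]

consistent with the ≈ 85 Tflops HPL result quoted in Section 3.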

The computing nodes access a DDN S2A9900 storage system, for a total capacity of 480 TB, and are interconnected via an InfiniBand 4×QDR QLogic/Intel 12800-180 switch (432 ports, 40 Gbps). The single-node architecture is summarized in Fig. 3, while Fig. 4 shows the configuration of the InfiniBand network of CRESCO3 and CRESCO4.

[Fig. 3 content: hwloc-style topology of a CRESCO4 node, showing two NUMA nodes of 32 GB each, one 8-core socket per NUMA node with 20 MB of shared L3 cache and 256 KB L2 / 32 KB L1 caches per core, plus the PCI devices qib0 (InfiniBand), eth0/eth1 (Ethernet) and sda (local disk).]
Figure 3: CRESCO4 single node architecture.

Figure 4: The IB 4xQDR (40 Gbps) network of CRESCO3 and CRESCO4 clusters.

2.1 Integration within ENEAGRID

Like their predecessor CRESCO systems, CRESCO3 and CRESCO4 are integrated within ENEAGRID, a large infrastructure which includes all the ENEA computing resources installed at the various ENEA research centres in Italy, altogether 6 sites distributed across the whole country and interconnected via the GARR network. The ENEAGRID infrastructure is shown in Fig. 5.

Figure 5: Geographical map of the ENEAGRID sites.

ENEAGRID is characterized by solid structural components, offering reliability and easy management, and by web interfaces which have been developed in–house and customized so as to ensure a friendly user environment:

• authentication via Kerberos v5;

• geographic filesystems: AFS/OpenAFS;

• GPFS parallel file system, also over WAN among different clusters and sites;
• resource management via LSF Multicluster;

• system monitoring via Zabbix;

• web access via in–house FARO interface, see [5];

• user management via in–house WARC interface, see [4].

The computing resources currently provided to the users are x86_64 Linux systems (altogether the CRESCO HPC clusters gather more than ∼10,000 cores) and dedicated systems (e.g. GPU systems). The computing resources are essentially spread across four ENEA research centres:

• ENEA Portici: CRESCO1 (672 cores), CRESCO2 (2496 cores), CRESCO3 (2016 cores), CRESCO4 (4864 cores)

• ENEA Frascati: CRESCOF (480 cores)
• ENEA Casaccia: CRESCOC (192 cores)
• ENEA Brindisi: CRESCOB (88 cores)
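As a quick check of the aggregate figure quoted above, summing the per-cluster core counts of the four sites gives

\[
672 + 2496 + 2016 + 4864 + 480 + 192 + 88 = 10\,808\ \mathrm{cores},
\]

i.e. slightly more than the ∼10,000 cores mentioned above.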

The new CRESCO3 and CRESCO4 clusters are located at the ENEA Portici site, where the largest fractions of computing power and manpower are concentrated. ENEA Portici is connected to the Internet through the GARR PoP of Napoli-Monte S. Angelo via two 1 Gbps links (GARR is the Italian Research & Education Network, planning and operating the national high-speed telecommunication network for University and Scientific Research, see [6]). In 2012 the in/out data transfers amounted to 230 TB, equivalent to an average bandwidth of 60 Mb/s, while the transfer peak value reached 80% of the available bandwidth. In 2013 an increase of 30% in the overall network traffic was observed with respect to 2012, due to the arrival of new users and to changes in software applications.
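The quoted average bandwidth follows directly from the yearly transfer volume (a back-of-the-envelope check, assuming decimal terabytes and a full calendar year):

\[
\frac{230 \times 10^{12}\ \mathrm{B} \times 8\ \mathrm{bit/B}}{3.15 \times 10^{7}\ \mathrm{s}} \simeq 5.8 \times 10^{7}\ \mathrm{bit/s} \approx 60\ \mathrm{Mb/s}.
\]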

3 Data center room and cooling system

The CRESCO4 computing facility is located in a new, dedicated computer room, adjacent to the room hosting the other three CRESCO clusters. Computing and cooling systems are located in the section of the room with the highest ceiling. Figure 6 shows the space arrangement of the computing and cooling facilities.

Figure 6: Map of the CRESCO4 facilities.

The cooling system is composed of four independent Climaveneta i-AFU F04 MOD A 50 units; these units are dual-circuit systems, featuring a direct expansion circuit with inverter and a secondary water circuit connected to external dry coolers (model BDC-078m). This technology makes it possible to take advantage of free cooling when the external conditions are favourable.

The cold air flows through the floor to a confined cold aisle in front of the computing nodes, where it is heated and then released into the hot aisle. The configuration has been designed so as to minimize the heat transfer into the cold aisle; indeed, the latter is very well confined and almost isolated from other heat sources. The cooling units take the air from the hot aisle, apply the refrigeration process and send it back under the raised floor.

The cooling system is dimensioned with four independent Climaveneta units; the dimensioning takes into account the maximum heat production of the computing nodes, about 110 kW, the nominal cooling capacity of each unit, 50 kW, and the presence of a further unit for redundancy purposes.
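In other words (a simple check based on the figures above), even with one unit reserved for redundancy the remaining three units cover the maximum IT heat load:

\[
3 \times 50\ \mathrm{kW} = 150\ \mathrm{kW} > 110\ \mathrm{kW}.
\]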

In the qualification tests of the system, the HPL benchmark yielded 85 Tflops with a computing electrical power consumption of 109 kW, a cooling power consumption of 35 kW and no free cooling during the benchmark. The resulting PUE (Power Usage Effectiveness) was 1.32, with 0.78 Tflops/kW. The optimization of the free-cooling settings is currently underway, and global results will be available after at least one year of data collection.
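The quoted efficiency figures follow from the measured powers (a sketch, taking the PUE as total power over IT power and normalizing the HPL result to the IT power alone):

\[
\mathrm{PUE} = \frac{109\ \mathrm{kW} + 35\ \mathrm{kW}}{109\ \mathrm{kW}} \simeq 1.32,
\qquad
\frac{85\ \mathrm{Tflops}}{109\ \mathrm{kW}} \simeq 0.78\ \mathrm{Tflops/kW}.
\]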

3.1 Fire suppression system

The fire suppression system consists of a set of aerosol potassium salt extinguishers that can be activated automatically or manually. The automatic activation is triggered by an electronic fire detection system based on optical smoke detectors. The detectors react to the presence of combustion products, and the fire control system starts an alarm sequence (audio and visual) for a prompt evacuation of the data center.

4 Power supply

The utility provides 9 kV power to the site main building, and transformers convert it to 400 Vac. In the site main power supply room a system of switchboards feeds a 400 kVA UPS (Schneider Electric MGE Galaxy 6000, 400 kVA/320 kW, efficiency at half load 94.4%, with 204 EXIDE P6V1700 batteries) shared between the pre-existing CRESCO1 and CRESCO2 clusters and the new CRESCO3 and CRESCO4 clusters. Utility faults are backed up by a 500 kVA diesel generator (Cummins Power Generation C550 D5). With the CRESCO4 installation the air conditioning power system has been switched to a dedicated power supply line, backed up in case of failure by another diesel generator (Volvo Penta, 485 kW). Three-phase power cables (100 m long) connect the site main power room with the main cluster switchboards, located in the entrance hall common to the already existing computer room, which hosts the CRESCO1, CRESCO2 and CRESCO3 clusters, and to the new CRESCO4 computing room (see Section 3).

In the new computing room we installed 7 racks to host the computing nodes and the network devices; 6 of them (the racks hosting the nodes) are equipped with 4 APC web-enabled three-phase 32 A PDUs, while the rack for the network devices is equipped with 2 PDUs of the same type. In order to have a redundant power supply system, we installed in the computer room 4 panel boards with 8 three-phase lines each. In this way, for a single rack, each of the 4 PDUs is connected to a different panel board, so that the system can tolerate the simultaneous fault of two out of four panel boards. Each PDU provides power to the computing elements via short C13-C14 cables.
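A rough check of the claimed tolerance to the loss of two panel boards (a sketch based on nominal figures, assuming the ≈110 kW IT load is spread roughly evenly over the six node racks and neglecting power-factor corrections):

\[
\frac{110\ \mathrm{kW}}{6\ \mathrm{racks}} \approx 18\ \mathrm{kW/rack},
\qquad
2 \times \sqrt{3} \times 400\ \mathrm{V} \times 32\ \mathrm{A} \approx 44\ \mathrm{kVA},
\]

so the two PDUs still fed by the surviving panel boards comfortably cover the per-rack load.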

5 Conclusions

We have presented the main architectural features of the new HPC systems CRESCO3 (2016 cores) and CRESCO4 (4864 cores). These facilities have been purchased in the framework of the projects of the Italian National Operational Programme (PON) 2007-2013. The design and realization of these infrastructures is mainly an in-house effort of the ENEA UTICT researchers and technicians. CRESCO3 and CRESCO4 are powerful computing tools at the disposal of the scientific community and, with an aggregate computing power of ∼120 Tflops, are at the moment the main HPC facilities of ENEAGRID.

References

[1] http://www.progettoitacha.it/.
[2] http://www.lamrecor.it/.
[3] http://www.utict.enea.it/it/progetti/utict-e-i-progetti/tedat.
[4] http://www.cresco.enea.it/.
[5] Rocchi A., Pierattini S., Bracco G., Migliori S., Beone F., Santoro A., Sciò C., and Podda S. FARO: the web portal to access the ENEAGRID computational infrastructure. International Workshop on Science Gateways (IWSG 2010), 2010.
