• Non ci sono risultati.

GPU-based parallelization of QuickScorer to speed-up document ranking with tree ensembles

N/A
N/A
Protected

Academic year: 2021

Condividi "GPU-based parallelization of QuickScorer to speed-up document ranking with tree ensembles"

Copied!
1
0
0

Testo completo

(1)

GPU-based Parallelization of QuickScorer

to Speed-up Document Ranking with Tree Ensembles

Francesco Lettich∗, Claudio Lucchese†, Franco Maria Nardini†, Salvatore Orlando∗,†,

Raffaele Perego†, Nicola Tonellotto†, and Rossano Venturini‡,†

Universit`a Ca’ Foscari Venezia, Italy †

ISTI, CNR, Pisa, Italy ‡

Universit`a di Pisa, Italy

Abstract. Scoring documents with learning-to-rank (LtR) models based on large ensembles of regression trees currently represents one of the most effective so-lutions to rank query results returned by large scale Information Retrieval sys-tems. However, such scoring models are very complex, and when deployed in real Web Search Engine infrastructures they are constrained within strict time budgets. This calls for very fast and efficient solutions, able to exploit all the computational resources offered by a given system. This paper investigates the opportunities offered by modern graphic cards (GPUs) to efficiently exploit LtR complex models based on trees ensembles to rank documents. To this end we propose GPUSCORER, a GPU-based parallelization of the state-of-the-art algo-rithm QUICKSCORERto score documents with tree ensembles. GPUSCORER

takes advantage of the huge computational power of GPUs to perform tree en-semble traversal by evaluating multiple documents simultaneously. We provide a concise experimental evaluation, and show that GPUSCORERis able to achieve speedups up to 32x over the sequential version of QUICKSCORER.

In this work we propose GPUSCORER, a GPU-based parallelization of QUICKSCO -RER[1], the state-of-the-art algorithm to score documents with tree ensembles.

GPUSCORER proposes a cache-conscious approach that builds on the blockwise variant of QUICKSCORER– where tree ensembles can be conveniently partitioned in disjoint tree-blocks – to leverage the massive computational power offered by modern GPUs. More precisely, GPUSCORERis made up by a processing pipeline structured in three different phases, where data structures and operations are conveniently redesigned or rearranged to take into account the features and limitations underlying the execution modeland memory hierarchy characterizing the GPUs.

In the experimental evaluation we show how our approach is able to achieve con-sistent speedups with respect to QUICKSCORER, especially when the number of leaves per tree becomes consistent (up to 32× with 64 leaves, and 31× with 32 leaves).

References

1. C. Lucchese, F. M. Nardini, S. Orlando, R. Perego, N. Tonellotto, and R. Venturini, “Quickscorer: A fast algorithm to rank documents with additive ensembles of regression trees,” in Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR). ACM, 2015, pp. 73–82.

Riferimenti

Documenti correlati

Martian (MOLA) analogues of Europa ridged terrain 0 50 100 150 200 250 300 350 Longitude [deg] -80 -60 -40 -20 0 20 40 60 80 Latit tude [deg] Pwyll Powys Darkspot Cilix Yelland

So, partly emulat- ing Giotto’s virtuosistic gesture, when he gives Pope Boni- facio VIII’s messenger a “simple” circle, and partly Chuang Tzu’s gesture, when he draws “the

78 Department of Physics and Astronomy, University College London, London, United Kingdom 79 Louisiana Tech University, Ruston LA, United States of America. 80 Laboratoire de

C’era dunque un clima di attesa e, come ricorda Ferdinando Ranalli nella sua minuziosa cronaca di quei momenti: “Fece meraviglia, il gior- no appresso che il principe aveva

The Case Study of Sedad Hakki Eldem (Doctorate School in Architecture, design and history of the Arts at the Univer- sity of Florence), results partially published in S.

– Ad ogni passo viene aggiunto l'arco di peso minimo che collega un nodo già raggiunto dell'albero con uno non ancora raggiunto. Prim: Shortest connection networks and

• Come nel caso degli alberi RB, questo inserimento può portare ad una violazione della condizione di bilanciamento. ‣ Negli alberi RB: il numero di nodi neri in ogni percorso

Collaudata nel 1974, la realizzazione comportò infatti un costo totale di 368 miliardi di lire (poco meno di 200 milioni di euro) per un costo unitario di circa 860 milioni di