Academic year: 2021

Condividi "Information Retrieval 18"

Copied!
1
0
0

Testo completo

(1)

Information Retrieval

18 June 2019

Ex 1 [points 3+3+3]

• Construct a 2-gram index for the set of strings S = {dat, daz, dit, dox, zox}.

• Design an algorithm that uses the overlap distance to find all strings of S that are candidates for being at edit distance 1 from a pattern P[1,p].

• Simulate the proposed algorithm on the string P = dix.
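
The three steps of Ex 1 can be sketched in Python as follows. This is only one possible reading of the exercise, not an official solution. It assumes that every string is padded with a boundary symbol `$` on both sides before its 2-grams are extracted, and it uses the usual counting bound: a string within `e` edits of P must share at least `p + k - 1 - k*e` padded k-grams with P. The names `kgrams`, `build_index` and `candidates` are illustrative.

```python
from collections import defaultdict

def kgrams(s, k=2, pad="$"):
    """Return the k-grams of s after padding both ends (padding is an assumption)."""
    padded = pad * (k - 1) + s + pad * (k - 1)
    return [padded[i:i + k] for i in range(len(padded) - k + 1)]

def build_index(strings, k=2):
    """Map every k-gram to the set of strings that contain it."""
    index = defaultdict(set)
    for s in strings:
        for g in kgrams(s, k):
            index[g].add(s)
    return index

def candidates(index, pattern, k=2, max_edits=1):
    """Strings sharing enough k-grams with pattern to possibly be within max_edits edits."""
    # One edit changes at most k of the p + k - 1 padded k-grams of the pattern.
    threshold = len(pattern) + k - 1 - k * max_edits
    if threshold <= 0:
        # The filter gives no information: every indexed string is a candidate.
        return sorted(set().union(*index.values()))
    overlap = defaultdict(int)
    for g in kgrams(pattern, k):
        for s in index.get(g, ()):
            overlap[s] += 1
    return sorted(s for s, count in overlap.items() if count >= threshold)

S = ["dat", "daz", "dit", "dox", "zox"]
index = build_index(S)
print(kgrams("dix"))             # ['$d', 'di', 'ix', 'x$']
print(candidates(index, "dix"))  # ['dit', 'dox']
```

The filter only produces candidates: each surviving string still has to be verified with a real edit-distance computation. Without padding, the bound for p = 3, k = 2 and one edit drops to zero and the filter discards nothing, which is why the padded variant is used here.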

Ex 3 [ranks 2+2+3+3] Given the list of integers S=(2,4,10,13,14,21).

• Show how to compress S via gap-coding followed by Gamma-coding.

• Show how to compress S via gap-coding followed by Delta-coding.

• Show how to compress S via Elias-Fano coding.

• Describe how to select the 5th integer of S, assuming that a Select_1 data structure is available over the array H.
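
The four points of Ex 3 can be checked with the sketch below. Several conventions here are assumptions, since textbooks differ: the first gap is the first integer itself; Delta is the standard Elias delta code, which drops the leading 1 of the binary part (some course notes keep it); Elias-Fano uses universe size u = max + 1, l = ceil(log2(u/n)) low bits, and a high-part array H written in negated unary (as many 1s as the bucket holds, then a 0). The `select_1` function is a naive linear scan standing in for a real constant-time Select_1 structure.

```python
def gaps(values):
    """Gap-code a sorted list: keep the first value, then store differences."""
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]

def gamma(x):
    """Elias gamma of x >= 1: (len - 1) zeros, then x in binary."""
    binary = format(x, "b")
    return "0" * (len(binary) - 1) + binary

def delta(x):
    """Elias delta of x >= 1: gamma(len), then x in binary without its leading 1."""
    binary = format(x, "b")
    return gamma(len(binary)) + binary[1:]

def elias_fano(values):
    """Return (L, H, l): low-bit strings, the bucket array H, and the low-bit width l."""
    n, u = len(values), values[-1] + 1   # assumption: universe size u = max + 1
    b = (u - 1).bit_length()             # b = ceil(log2 u) bits per value
    l = 0
    while (n << l) < u:                  # l = ceil(log2(u / n)) low bits
        l += 1
    h = b - l                            # remaining high bits
    low = [format(v & ((1 << l) - 1), "b").zfill(l) if l else "" for v in values]
    counts = [0] * (1 << h)              # how many values fall in each high-bit bucket
    for v in values:
        counts[v >> l] += 1
    H = "".join("1" * c + "0" for c in counts)
    return low, H, l

def select_1(bits, i):
    """1-based position of the i-th 1 in bits (naive scan, for illustration only)."""
    seen = 0
    for pos, bit in enumerate(bits, start=1):
        if bit == "1":
            seen += 1
            if seen == i:
                return pos
    raise IndexError("fewer than i ones")

def access(low, H, l, i):
    """i-th integer (1-based): the zeros before the i-th 1 of H give the high bits."""
    high = select_1(H, i) - i
    return (high << l) | (int(low[i - 1], 2) if l else 0)

S = [2, 4, 10, 13, 14, 21]
print(gaps(S))                        # [2, 2, 6, 3, 1, 7]
print([gamma(g) for g in gaps(S)])    # ['010', '010', '00110', '011', '1', '00111']
print([delta(g) for g in gaps(S)])    # ['0100', '0100', '01110', '0101', '1', '01111']
low, H, l = elias_fano(S)
print(low, H)                         # ['10', '00', '10', '01', '10', '01'] 10101011001000
print(access(low, H, l, 5))           # 14
```

Under these conventions the 5th 1 of H sits at position 8, so the high part is 8 - 5 = 3 (binary 011); appending the 5th low part 10 gives 01110 = 14.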

Ex 4 [ranks 4+3] Given a graph G of directed edges {(1,3), (3,1), (3,2), (1,4), (2,1)}.

• Simulate the execution of one step of the PageRank algorithm, starting with nodes 1 and 2 having score ¼ and node 3 having score ½, and assuming a teleportation step that always jumps to node 3.

• Simulate the HITS computation of a(3) and h(3), assuming that all a- and h-scores are initially set to 1.
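
Both simulations of Ex 4 can be reproduced with the sketch below, which is a hedged illustration rather than a reference solution. The exercise does not fix a damping factor, so `alpha = 85/100` is an assumption made only to get concrete numbers; node 4 is given initial score 0 because the stated scores already sum to 1. For HITS, the sketch performs one unnormalized update and exposes both common orderings: hubs computed from the freshly updated authorities (`sequential=True`) or from the old ones.

```python
from fractions import Fraction

EDGES = [(1, 3), (3, 1), (3, 2), (1, 4), (2, 1)]

def pagerank_step(scores, edges, alpha, teleport):
    """One step: r'(v) = alpha * sum_{u->v} r(u)/out(u) + (1 - alpha) * teleport(v)."""
    out_degree = {}
    for u, _ in edges:
        out_degree[u] = out_degree.get(u, 0) + 1
    new = {v: (1 - alpha) * teleport.get(v, 0) for v in scores}
    for u, v in edges:
        new[v] += alpha * scores[u] / out_degree[u]
    # Note: the score of a node without out-links (node 4) is not redistributed;
    # it is 0 at the start, so this does not affect the first step.
    return new

def hits_step(edges, a, h, sequential=True):
    """One unnormalized HITS update: a(v) = sum of h(u) over u->v, h(u) = sum of a(v) over u->v."""
    new_a = {v: 0 for v in a}
    for u, v in edges:
        new_a[v] += h[u]
    source = new_a if sequential else a  # which authority scores feed the hub update
    new_h = {v: 0 for v in h}
    for u, v in edges:
        new_h[u] += source[v]
    return new_a, new_h

scores = {1: Fraction(1, 4), 2: Fraction(1, 4), 3: Fraction(1, 2), 4: Fraction(0)}
teleport = {3: Fraction(1)}        # teleportation always jumps to node 3
alpha = Fraction(85, 100)          # assumption: damping factor not given in the text
print(pagerank_step(scores, EDGES, alpha, teleport))

ones = {v: 1 for v in scores}
a, h = hits_step(EDGES, ones, ones)
print(a[3], h[3])                  # 1 3
```

Before damping, the link contributions are 1/2 for node 1, 1/4 for node 2, and 1/8 each for nodes 3 and 4; the teleportation mass (1 - alpha) is then added to node 3 only. With the simultaneous HITS variant (`sequential=False`), h(3) is 2 instead of 3.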

Ex 5 [ranks 2+2] In the context of a text annotator like TagMe, define

• the link probability of an anchor text a with respect to the Wikipedia knowledge graph;

• the commonness of the link between an anchor a and a Wikipedia page p.
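
For reference, the two quantities of Ex 5 are commonly written as below. The notation is an assumption: link(a) denotes the number of times a occurs in Wikipedia as the anchor text of a link, freq(a) the number of times a occurs in Wikipedia text at all (as an anchor or not), and link(a, p) the number of times a is the anchor of a link pointing to page p.

```latex
\mathrm{lp}(a) = \frac{\mathrm{link}(a)}{\mathrm{freq}(a)},
\qquad
\Pr(p \mid a) = \frac{\mathrm{link}(a, p)}{\mathrm{link}(a)}
```

Intuitively, lp(a) measures how likely the text a is to be used as a link at all, while the commonness Pr(p | a) measures how often a, when it is a link, refers to p rather than to another page.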
