Information Retrieval

29 June 2017

Name: Surname: Student ID (Matricola):

Ex 1 [points 4+3+3] Let S = { dad, atom, mamo, oma, zoo } be a given set of strings.

- Build a 2-gram index over S

- Given pattern P = mom, show how the index executes the search for 1-edit error

- Given pattern P = mom, show how the index executes the search for 2-edit errors
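As an illustration of the bullets above, here is a minimal Python sketch of a 2-gram index over S. It assumes that no boundary padding symbols are added to the strings, and that approximate search uses the standard count filter (a string within k edits of P shares at least (|P| - q + 1) - q*k q-grams with P) followed by an explicit edit-distance check. All function names are hypothetical, and the course may expect a different convention.

```python
from collections import defaultdict

S = ["dad", "atom", "mamo", "oma", "zoo"]

def qgrams(s, q=2):
    """All overlapping q-grams of s, left to right (no padding: an assumption)."""
    return [s[i:i + q] for i in range(len(s) - q + 1)]

def build_index(strings, q=2):
    """Map each q-gram to the set of strings that contain it."""
    index = defaultdict(set)
    for s in strings:
        for g in qgrams(s, q):
            index[g].add(s)
    return index

def edit_distance(a, b):
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def search(index, strings, pattern, k, q=2):
    """Count filter on the q-gram index, then verification by edit distance."""
    threshold = (len(pattern) - q + 1) - q * k
    if threshold <= 0:
        # The filter gives no guarantee: every string remains a candidate.
        candidates = set(strings)
    else:
        counts = defaultdict(int)
        for g in qgrams(pattern, q):
            for s in index.get(g, ()):
                counts[s] += 1
        candidates = {s for s, c in counts.items() if c >= threshold}
    return sorted(s for s in candidates if edit_distance(pattern, s) <= k)

index = build_index(S)
one_error = search(index, S, "mom", k=1)
two_errors = search(index, S, "mom", k=2)
```

For |P| = 3 and q = 2 the bound is 2 - 2k, which is not positive for k >= 1, so under these assumptions the filter cannot prune anything and every string has to be verified.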

Ex 2 [points 4+3] Write and comment on the formula of PageRank and Personalized PageRank, and discuss how the Personalized PageRank can be used to estimate the “similarity” between two nodes u and v in a weighted graph.
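For reference, one common way to write the two formulas is sketched below. The symbols are assumptions rather than notation taken from the exam: \alpha is the damping factor, N the number of nodes, B(i) the set of nodes with an edge into i, out(j) the out-degree of j, w(j,i) the weight of edge (j,i), and \delta_{u,i} equals 1 if i = u and 0 otherwise.

```latex
% PageRank: the random surfer teleports uniformly over all N nodes
r(i) = \alpha \sum_{j \in B(i)} \frac{r(j)}{\mathrm{out}(j)} + \frac{1-\alpha}{N}

% Personalized PageRank with respect to node u on a weighted graph:
% transitions follow normalized edge weights, teleportation always returns to u
r_u(i) = \alpha \sum_{j \in B(i)} \frac{w(j,i)}{\sum_{k} w(j,k)} \, r_u(j) + (1-\alpha)\, \delta_{u,i}
```

Under this formulation, r_u(v) can be read as how strongly v is related to u; since the score is not symmetric, a similarity estimate would plausibly combine r_u(v) and r_v(u).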

Ex 3 [points 5] Consider the WAND algorithm over four posting lists by assuming that at some step the algorithm is examining the heads of the following lists:

t1  (…, 5, 6, 7, 8, 11) t2  (…, 2, 3, 5, 7, 8, 11) t3  (…, 8, 13, 15) t4  (…, 4, 5, 8, 9)

At that time the current threshold equals 2.3, and the upper bounds of the scores in each posting list are: ub_1 = 0.4, ub_2 = 2, ub_3 = 4, ub_4 = 0.1.

Which is the next docID whose full score is computed? (Motivate your answer)
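The pivot-selection step of WAND can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the docIDs hidden behind "…" are taken as already consumed, so each list's head is its first listed docID; the pivot is the first list, in docID order, at which the running sum of upper bounds strictly exceeds the threshold; and when the pivot is not yet aligned, the list with the smallest head is the one advanced. The function name is hypothetical.

```python
def next_full_evaluation(lists, upper_bounds, threshold):
    """Return the next docID that WAND fully scores, or None if there is none.

    lists: term -> sorted docIDs still to be examined (head first).
    upper_bounds: term -> maximum score contribution of that term.
    """
    pos = {t: 0 for t in lists}  # current head position in each list
    while True:
        # Keep only non-exhausted lists, ordered by their current docID.
        active = [t for t in lists if pos[t] < len(lists[t])]
        active.sort(key=lambda t: lists[t][pos[t]])

        # Find the pivot: first list where the cumulative upper bound
        # exceeds the threshold.
        acc, pivot = 0.0, None
        for t in active:
            acc += upper_bounds[t]
            if acc > threshold:
                pivot = lists[t][pos[t]]
                break
        if pivot is None:
            return None  # no remaining document can beat the threshold

        first = active[0]
        if lists[first][pos[first]] == pivot:
            return pivot  # all lists up to the pivot are aligned on it

        # Otherwise skip the first list forward to the first docID >= pivot.
        while pos[first] < len(lists[first]) and lists[first][pos[first]] < pivot:
            pos[first] += 1

lists = {
    "t1": [5, 6, 7, 8, 11],
    "t2": [2, 3, 5, 7, 8, 11],
    "t3": [8, 13, 15],
    "t4": [4, 5, 8, 9],
}
upper_bounds = {"t1": 0.4, "t2": 2.0, "t3": 4.0, "t4": 0.1}
result = next_full_evaluation(lists, upper_bounds, threshold=2.3)
```

With these inputs the lists are ordered t2, t4, t1, t3 by head docID, the cumulative bounds are 2.0, 2.1, 2.5, and the sketch returns docID 5 after advancing t2 and t4 to it.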

Ex 4 [points 5+3] You are given a binary tree T formed by n=5 nodes {a,b,c,d,e}, rooted in “a”, and having the following edges {(a,b), (a,c), (b,d), (c,e) }, where “d” is the left child of “b” and “e” is the left child of “c”.

- Show the succinct encoding of T (recall that it takes 2n+1 bits).

- Describe how to follow the path that starts from the root “a” and then goes right to “c” and finally goes left to “e”.
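One standard way to obtain a 2n+1-bit encoding is to complete the tree with dummy leaves, visit it in level order, and write 1 for each real node and 0 for each dummy. The Python sketch below follows that scheme. It assumes that "b" is the left child of "a" (implied by the path going right to "c"), uses 1-indexed bit positions, and computes rank by a linear scan instead of the constant-time rank structure a real succinct index would use. All names are hypothetical.

```python
from collections import deque

# node -> (left child, right child); None marks a missing child
tree = {"a": ("b", "c"), "b": ("d", None), "c": ("e", None)}

def encode(tree, root):
    """Level-order bit vector: 1 for a real node, 0 for a dummy leaf."""
    bits, labels = [], []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        if node is None:
            bits.append(0)  # dummy leaf, no children to enqueue
            continue
        bits.append(1)
        labels.append(node)  # the i-th 1 corresponds to labels[i - 1]
        left, right = tree.get(node, (None, None))
        queue.append(left)
        queue.append(right)
    return bits, labels

def rank1(bits, x):
    """Number of 1s in positions 1..x (linear scan, for illustration only)."""
    return sum(bits[:x])

def child(bits, x, right=False):
    """Position of the left (or right) child of the node at position x, or None."""
    pos = 2 * rank1(bits, x) + (1 if right else 0)
    return pos if pos <= len(bits) and bits[pos - 1] == 1 else None

bits, labels = encode(tree, "a")
at_c = child(bits, 1, right=True)   # from the root, go right
at_e = child(bits, at_c)            # then go left
reached = labels[rank1(bits, at_e) - 1]
```

In this scheme the left child of the node at position x sits at position 2*rank1(x) and the right child at 2*rank1(x) + 1; a 0 at that position means the child is absent.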

Ex 5 [LAB TEST] Let us assume that we have built a Lucene index, with a Whitespace Analyzer, over the following three documents:

- d1 = "The sea Mediterraneo is a very well known sea close to Italy."

- d2 = "Mediterraneo is a sea in front of Italy."

- d3 = "the name mediterraneo is for a sea!"

and then assume that you execute the following query with a Whitespace Analyzer (hint: pay attention to lower/upper case, spaces, …):

- q1 = "sea Mediterraneo"

Show which documents will be returned and in which order, commenting on your assumptions about the term frequencies.
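To reason about the term frequencies, the tokenization can be approximated in Python. This sketch is not Lucene code: it assumes that a whitespace analyzer only splits on whitespace, with no lowercasing and no punctuation removal, and it treats the query as an OR over its terms. It deliberately leaves out scoring, so it shows which documents match and with which term frequencies, not their final order.

```python
from collections import Counter

docs = {
    "d1": "The sea Mediterraneo is a very well known sea close to Italy.",
    "d2": "Mediterraneo is a sea in front of Italy.",
    "d3": "the name mediterraneo is for a sea!",
}
query = "sea Mediterraneo"

def whitespace_tokens(text):
    """Split on whitespace only: case and punctuation are preserved."""
    return text.split()

# Term frequencies per document, over the exact (case-sensitive) tokens.
term_freqs = {name: Counter(whitespace_tokens(text)) for name, text in docs.items()}

# For each document, the query terms it contains and their frequencies.
query_terms = whitespace_tokens(query)
matches = {}
for name, tf in term_freqs.items():
    hits = {t: tf[t] for t in query_terms if tf[t] > 0}
    if hits:
        matches[name] = hits

for name, hits in matches.items():
    print(name, hits)
```

Note that under this tokenization "sea!" and "mediterraneo" in d3 are different tokens from "sea" and "Mediterraneo".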
