• Non ci sono risultati.

Information Retrieval 29 January 2014 Ex 1 [ranks 3+4]

N/A
N/A
Protected

Academic year: 2021

Condividi "Information Retrieval 29 January 2014 Ex 1 [ranks 3+4]"

Copied!
1
0
0

Testo completo

(1)

Information Retrieval

29 January 2014 Ex 1 [ranks 3+4] Given a binary array B of size n,

 define Rank and Select primitives over B,

 describe a possible use of these primitives, commenting on the time/space complexity advantage and possible disadvantages.

Ex 2 [ranks 3]Given the string S=aaba, compute the length of its Arithmetic encoding without running the entire algorithm, and comment the answer.

Ex 3 [points 3+3+2]

 Given the array of scores A = [2, 5, 4, 3, 1, 10 , 8 , 9, 7 , 6] show the steps performed by the algorithm that identifies the top-5 elements in A (i.e., the five elements with the largest values in A).

 Indicate and discuss the time complexity of this algorithm.

 Explain how to modify the solution above to identify the k elements with the smallest values.

Ex 4 [points 4] Given a term-document matrix (m x n), describe how to project via LSI-technique a query vector of size m onto a reduced space of size k.

Ex 5 [points 4+2] Given the array of integers (1, 2, 5, 11, 12, 15, 16, 20, 23, 26, 31),

 Compress with Elias-Fano

 Comment on the space occupancy of the proposed solution.

Ex 6 [points 2] Prove that the algorithm K-means converges to a local minimum.

Riferimenti

Documenti correlati

Assume you are given 6 strings (aa, ab, bb, bc, ca, db) and you wish to construct a minimal ordered perfect

 SECOND MINIMUM, which defines the similarity between a pair of clusters as the second minimum distance between all pairs of items of the two clusters Ex 3 [points 3] Discuss the

Ex 1 [ranks 3+3+3] Given the words {abaco, adagio, baco, alano} construct a 2-gram index, and describe the algorithm that solves 1-edit distance over it. Finally test the

[r]

 Discuss its advantages in crawling with respect to the simple scheme that allocates URL to crawlers based on a static url_domain->crawler scheme.  Simulate its execution over

 Describe the algorithm that solves the top-k in auto-completion search, provided that k is not known at preprocessing time and possibly large (hint:. you need an additional

Show the algorithm that uses a k-gram index for restricting the E-approximate search over a subset of candidate strings within D.. Show the 3-gram index for D={barca, bit, borda,

Ex 5 [points 3] Compute the probability of error for a Bloom Filter that uses a bit- array of m bits, k hash functions and indexes a dictionary of