• Non ci sono risultati.

Information Retrieval

N/A
N/A
Protected

Academic year: 2021

Condividi "Information Retrieval"

Copied!
1
0
0

Testo completo

(1)

Information Retrieval

16 November 2020 – time 45 minutes

Question #1 [rank 6]. Given the three adjacency lists compress them via the Web graph algorithm by choosing always the best previous list to differentially encode the current one. (So no bound is set on the backward window to “copy”.)

14 à 3, 10, 11, 13, 14, 17, 19, 21, 25 15 à 5, 10, 11, 13, 14

16 à 3, 10, 11, 13, 21, 25, 30

Question #2 [rank 6]. You are given the following five binary vectors:

A = 01100, B = 00000, C = 11111, D = 00010, E = 10010.

Compute the groups of similar vectors using the clustering algorithm based on LSH and connected components, by assuming k=2 and L=2, with projections I1={1,2} and I2={3,4}.

Question #3 [rank 3+4]. Let us given a set of strings S = { bud, budy, turdy, tus }.

- Build a 2-gram index of S

- Given P = “tusdy”, show how the index executes the search for 1-edit errors based on the 2-gram index, a filter based on overlap distance, and then computing the dynamic programming matrix.

Question #4 [rank 3+5]. You are given the two files:

F_old = “cane ratto matt”, F_new = “cane gatto pane”, Assume a block size B=3 chars, and

• Show the execution of the algorithm rsync.

• Show the execution of the algorithm zsync.

Question #5 [rank 3]. Compress the string S = bababc via LZ77-gzip by assuming window w=3

Riferimenti

Documenti correlati

Although Guangzhou's manufacturing industry has more advantages in technology and high technological development than South Asia and Central Asia which makes Guangzhou maintaining

In the analyzed case, it can be noticed that the conformity control and number of samples have a significant impact on the level of reliability of concrete structures, whose safety

If further operation of the transformer is planned in the normative exploitation conditions, the marginal residual lifespan of the transformer corresponds to the

We show that segregation according to information preferences leads to a seg- regation of information: If groups do not interact with each other, within each group, only the

Eq. A thoroughly study of performances of Prussian Blue in presence of different cations was carried out. It has been demonstrated that the PB electrochemical activity is supported

 Command ls provides information about a file according to the specified options.  If the pathname is a directory, ls lists the files and subdirectories contained in that

It provides access to the case-law of the European Court of Human Rights (Grand Chamber, Chamber and Committee judgments, decisions, communicated cases, advisory opinions and

Refus d'accorder à un père, à l'occasion de la liquidation de sa pension, une bonification pour enfant, suite à l'adoption d'une loi nouvelle ayant un effet rétroactif uniquement