• Non ci sono risultati.

Lab on Algorithms for Big Data 5 June 2015 Exercise [rank 4+4+6]

N/A
N/A
Protected

Academic year: 2021

Condividi "Lab on Algorithms for Big Data 5 June 2015 Exercise [rank 4+4+6]"

Copied!
1
0
0

Testo completo

(1)

Lab on Algorithms for Big Data

5 June 2015

Exercise [rank 4+4+6] Given the set of strings S={bar, basr, box, cab, cat, dog}

construct:

 The ternary search trie (TST)

The last column of the Compressed permuterm index.

Show how the string b*r is searched in both data structures.

Exercise [rank 4+4] Given the sequence S = (1, 3, 4, 5, 9, 16, 23, 27, 28, 31, 40), show the Elias-Fano encoding of S, and comment on the execution of S[6]16.

Exercise

Students that did project #1 (edit distance 1) should address the following two additional exercises [rank 4+4]:

 Prove that the min-wise hashing technique estimates the Jaccard coefficient

 Let us given the two sets A={1,2,3,4,5} and B={2,4,6,7} drawn from the

universe {0, 1, …, 10}, simulate the computation of Jaccard(A,B) via min-wise hashing by using the following two permutations p1(x) = 3x+1 mod 11, p2(x) = 5x mod 11

Students that did project #2 (compression) should address the following two additional exercises [rank 5+3]:

 Construct the DFUDS encoding of the TST above.

 Show how to compute the first child of node u.

Riferimenti

Documenti correlati

Simulate the whole process, report the conditions of the PFR outlet, and plot the molar flow of the components versus the length of the reactor. How can the recovery of styrene

Let us given the strings S = {abac, abra, bacco, bei, acat}, show the structure of a Ternary Search tree built by inserting them in alphabetic order.. Describe how you can use

[rank 6+4] Given the string S = abaabba, construct the BWT of S, and then show how to locate the position in S of the 4-th character in the BWT (counting from 0) by assuming

Simulate the behavior of the algorithm MultiKey-Quicksort on the following array of 6 strings S=[zoo, horse, bath, abacus, bar, aargh].. Exercise

Write the pseudo-code of the Reservoir Sampling algorithm that random samples m items from a sequence of unknown length. Prove that the probability of extracting any element when

Exercise [rank 5] Discuss the optimality of the disk striping technique in combination with multi-way mergesort when sorting N items over D disks, in a model with.. internal memory

Exercise [rank 4+4] Let us given two arrays: A[1,n] of items and [1,n] that denotes the permutation of the positions in the range [1,n], such that [i] is the position where the

Exercise [points 4+4] Given the array A = [5, 2, 4, 8, 3, 1, 9, 6], build the RMQ data structure but ONLY for RMQ-queries which are either aligned to blocks or overlap