• Non ci sono risultati.

Data Science And Database Technology Homework 4

N/A
N/A
Protected

Academic year: 2021

Condividi "Data Science And Database Technology Homework 4"

Copied!
1
0
0

Testo completo

(1)

Data Science And Database Technology Homework 4

The following relations are given (primary keys are underlined):

CLEANING-COMPANY(CId, Name, Address, City, Region) OFFERED-SERVICES(CId, SId)

SERVICES(SId, ServiceName, Category)

BUILDING(BId, BuildingName, BuildingType, Address, City, Region) CLEANING-SERVICES(CId, BId, Date, SId, Cost, NumberOfHours) Assume the following cardinalities:

• card(CLEANING-COMPANY) = 104 tuples, distinct values of Region = 20

• card(OFFERED-SERVICES)= 2 · 105 tuples

• card(SERVICES)= 100 tuples, distinct values of Category = 10

• card(BUILDING)= 5 · 107 tuples, distinct values of City = 1000 distinct values of BuildingType = 10

• card(CLEANING-SERVICES)= 109 tuples,

MIN(Date) = 1/1/2010, MAX(Date) = 31/12/2019

Furthermore, assume the following reduction factor for the group by condition:

• having COUNT(*)>1 � 12.

• having SUM(Cost)≥1000 � 101. Consider the following SQL query:

select BId, SUM(Cost) as TotCost, SUM(NumHors) as TotHours from CLEANING-SERVICES CS, BUILDING B

where CS.Date>=1/1/2019 and CS.Date<=31/12/2019 and B.BuildingType <> ‘Office’

and B.City=’Turin’

and CS.BId=B.BId

and CS.SId IN ( select OS.SId

from CLEANING-COMPANY CC, SERVICES S, OFFERED-SERVICE OS where OS.SId=S.SId and OS.CId=CC.CId

and (Region=‘Piedmont’ or Region=‘Liguria’) and Category=‘IndoorCleaning’

group by OS.SId having COUNT(*)>1) group by CS.BId

having SUM(Cost)>=1000 Homework tasks For the SQL query:

1. Report the corresponding algebraic expression and specify the cardinality of each node (representing an inter- mediate result or a leaf). If necessary, assume a data distribution. Also analyze the GROUP BY anticipation.

2. Select one or more secondary physical structures to increase query performance. Justify your choice and report the corresponding execution plan (join orders, access methods, etc.).

Riferimenti

Documenti correlati

- To analyze the data correlation matrix associated with the analyzed dataset go back to the main process (click on button “Process”), temporarily disable the

When a ticket is issued, if it is associated to a CARDNO of a frequent-flyer customer ( CARDNO is NOT NULL ) it is necessary to update the covered miles by means of a new

A - Trigger to manage the insertion of a new record into the IMP table Insert statement. INSERT INTO IMP(EMPNO, ENAME, JOB, SAL) VALUES(4, 'NERI',

‐‐‐insert into the credits table the miles for the customer INSERT INTO CREDITS(TICKETID, CARDNO, MILES) VALUES(:NEW.TICKETID, :NEW.CARDNO, FLIGHTMILES);. ‐‐‐STEP

It is possible to find this node in a join (specify the join conditions, as in the example image) or in filter operations on the attributes of a table (WHERE conditions)... -

• An access predicate indicates, in the case of a B + Tree index, that the data is selected based on a certain attribute by descending the tree hierarchy.

The statistics are made on the number of expressed votes, the average vote and the total incomes obtained by the production company by introducing advertisement in the applications

Performing a 10-fold Stratified Cross-Validation, what is the impact the maximal gain and maximal depth parameters on the average accuracy achieved by Decision Tree.. Report at least