Data Base and Data Mining Group of Politecnico di Torino
DBG
The data mining process
Elena Baralis
Politecnico di Torino
DB M G
2 Non trivial extraction of
implicit
previously unknown
potentially useful
information from available data
Extraction is automatic
performed by appropriate algorithms
Extracted information is represented by means of abstract models
denoted as pattern
Data mining
DB M G
3Knowledge Discovery Process
data
selected
data preprocessed
data transformed
data pattern
knowledge
selection
preprocessing
transformation
data mining
interpretation
KDD = Knowledge Discovery from Data
DB M G
4
Descriptive methods
Extract interpretable models describing data
Example: client segmentation
Predictive methods
Exploit some known variables to predict
unknown or future values of (other) variables
Example: “spam” email detection
Analysis techniques
DB M G
5Analysis techniques
Association rule extraction
Classification
Clustering
model
training data
unclassified data classified data