Università degli Studi di L’Aquila Dipartimento di Informatica
Claudio Arbib
Data Mining and
optimization problems
(Julian Beever)
Digging into data
Looking for
1. decision support: for instance, compare a subject's features with the mean features of a given sample
2. a way to save time and money: for instance, limit tests and searches to just those cases that promise a good result
3. confirmation or refutation: for instance, check the experimental validity of an assumption
4. interpretation: for instance, explain a phenomenon by showing the mechanism that determines some result from some combination of values
Some tools
Classification: to group individuals into classes with relatively homogeneous features
Regression: to infer trends from data
Separation: to mark the differences between groups of individuals
Overlap: to highlight similarities among groups of individuals
Dramatis personae
The individual is described by a record (vector of attributes ∈ ℝ^n)
examples: age, weight, cholesterol, triglyceride
age, sex, income, education
weight, power, consumption, price
The population is the set of all the individuals considered
example: the resident population of Minorca
the car models of the C segment in the French market
the turkeys sold the day before Thanksgiving
The sample is a set of individuals for which all attributes are known, and is specially selected so as to give a reasonable
representation of the population
All can be represented as (sets of) points of ℝ^n
Example
Model                        Power  Weight  Weight/Power   Price
Alfa Romeo 145                  88    1165         13,24  29,992
Audi A3                         75    1090         14,53  40,230
Citroen Xsara 1.6               65    1078         16,58  28,212
Daewoo Lanos 1.6                78    1065         13,65  25,792
Fiat Bravo 100                  76    1050         13,82  28,792
Ford Focus 1.6                  74    1077         14,55  27,192
Honda Civic 1.6                118    1165          9,87  39,792
Hyundai Lantra 1.6              85    1210         14,24  28,942
Lancia Lybra                    76    1250         16,45  42,392
Mercedes A160                   75    1040         13,87  36,192
Mitsubishi Colt 1.6             66    1018         15,42  26,262
Nissan Almera 1.5               66    1190         18,03  26,242
Opel Astra 1.6                  74    1106         14,95  28,192
Peugeot 306 1.6                 72    1070         14,86  29,792
Renault Mégane Classic 1.6      79    1020         12,91  31,392
Rover 45 1.6                    80    1175         14,69  32,706
Seat Leon 1.6                   77    1150         14,94  27,492
Skoda Octavia 1.6               75    1205         16,07  26,970
Toyota Corolla 1.6              81    1065         13,15  26,992
Volkswagen Golf 1.6             77    1171         15,21  32,730
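Each row of the table is one individual, that is, one point of ℝ^n, and the Weight/Power column is derived from two of the other attributes. The following is a minimal Python sketch of that representation, using three rows of the table. The names `Car`, `as_point` and `weight_to_power` are illustrative and not part of the slides, and the commas in the table are assumed to be decimal commas.

```python
from typing import NamedTuple

class Car(NamedTuple):
    model: str
    power: float
    weight: float
    price: float  # same unit as the Price column; decimal comma assumed

# Three rows copied from the table.
sample = [
    Car("Alfa Romeo 145", 88, 1165, 29.992),
    Car("Honda Civic 1.6", 118, 1165, 39.792),
    Car("Volkswagen Golf 1.6", 77, 1171, 32.730),
]

def as_point(car):
    """Map a record to a point of R^3: (power, weight, price)."""
    return (car.power, car.weight, car.price)

def weight_to_power(car):
    """Derived attribute, rounded to two decimals as in the table."""
    return round(car.weight / car.power, 2)

points = [as_point(c) for c in sample]
ratios = {c.model: weight_to_power(c) for c in sample}
print(ratios)
```

The computed ratios can be compared with the Weight/Power column as a quick consistency check on the data.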
Example
C segment: price vs. weight/power
[Figure: scatter plot of the models; axes: weight/power (g/W), from 12,00 to 19,00, and price (M£), from 20,000 to 45,000. A reference model A is marked, together with the individuals dominated by A and the individuals that dominate A.]
What are the “best” solutions?
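The dominance relation in the plot can be sketched in a few lines of Python. This sketch assumes that both criteria are to be minimized (a lower price and a lower weight/power ratio are both better), which the slides suggest but do not state. The choice of the Alfa Romeo 145 as A is hypothetical, since the slide does not say which model A is, and the commas in the table are read as decimal commas.

```python
# Each car is (weight/power, price), values copied from the table.
cars = {
    "Alfa Romeo 145": (13.24, 29.992),
    "Toyota Corolla 1.6": (13.15, 26.992),
    "Audi A3": (14.53, 40.230),
    "Nissan Almera 1.5": (18.03, 26.242),
}

def dominates(a, b):
    """True if a is no worse than b on every criterion and strictly
    better on at least one (all criteria minimized, by assumption)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    better = any(x < y for x, y in zip(a, b))
    return no_worse and better

A = cars["Alfa Romeo 145"]  # hypothetical choice of A
dominated_by_A = [m for m, v in cars.items() if dominates(A, v)]
dominating_A = [m for m, v in cars.items() if dominates(v, A)]

print(dominated_by_A)  # ['Audi A3']
print(dominating_A)    # ['Toyota Corolla 1.6']
```

The Nissan Almera appears in neither list: it is cheaper than A but has a worse weight/power ratio, so the two models are not comparable.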
Example
C segment: price vs. weight/power
[Figure: the same scatter plot of weight/power (g/W) and price (M£), with three models labelled: Renault Mégane Classic, Toyota Corolla, Daewoo Lanos.]
What are the “best” solutions?
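One reading of “best” is the set of models that no other model dominates. The sketch below computes that set from the full table by brute force, assuming again that price and weight/power are both minimized and that the commas in the table are decimal commas. The function names are illustrative.

```python
# (model, weight/power, price) copied from the table.
cars = [
    ("Alfa Romeo 145", 13.24, 29.992),
    ("Audi A3", 14.53, 40.230),
    ("Citroen Xsara 1.6", 16.58, 28.212),
    ("Daewoo Lanos 1.6", 13.65, 25.792),
    ("Fiat Bravo 100", 13.82, 28.792),
    ("Ford Focus 1.6", 14.55, 27.192),
    ("Honda Civic 1.6", 9.87, 39.792),
    ("Hyundai Lantra 1.6", 14.24, 28.942),
    ("Lancia Lybra", 16.45, 42.392),
    ("Mercedes A160", 13.87, 36.192),
    ("Mitsubishi Colt 1.6", 15.42, 26.262),
    ("Nissan Almera 1.5", 18.03, 26.242),
    ("Opel Astra 1.6", 14.95, 28.192),
    ("Peugeot 306 1.6", 14.86, 29.792),
    ("Renault Mégane Classic 1.6", 12.91, 31.392),
    ("Rover 45 1.6", 14.69, 32.706),
    ("Seat Leon 1.6", 14.94, 27.492),
    ("Skoda Octavia 1.6", 16.07, 26.970),
    ("Toyota Corolla 1.6", 13.15, 26.992),
    ("Volkswagen Golf 1.6", 15.21, 32.730),
]

def dominates(a, b):
    """a is no worse than b everywhere and strictly better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def nondominated(items):
    """Keep the items whose criteria are dominated by no other item."""
    return [a for a in items
            if not any(dominates(b[1:], a[1:]) for b in items)]

front = sorted(nondominated(cars), key=lambda c: c[1])
names = [c[0] for c in front]
print(names)
```

On the full table this yields four models: the three labelled in the plot plus the Honda Civic 1.6. The Honda's weight/power of 9,87 lies below the plotted range, which starts at 12,00, and that may be why it is not labelled in the figure.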
Example
C segment: price vs. weight/power
[Figure: the same scatter plot of weight/power (g/W) and price (M£), with the efficiency curve drawn.]
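One possible way to use the efficiency curve is as a lookup: for a given budget, which efficient model offers the lowest weight/power? The sketch below does this for the three models labelled in the previous plot. This reading of the curve is an assumption rather than something stated on the slide. The function name `best_within_budget` is hypothetical, and prices are in the unit of the Price column with commas read as decimal commas.

```python
# The three labelled models: (model, weight/power, price), from the table.
front = [
    ("Renault Mégane Classic", 12.91, 31.392),
    ("Toyota Corolla", 13.15, 26.992),
    ("Daewoo Lanos", 13.65, 25.792),
]

def best_within_budget(budget):
    """Return the labelled model with the lowest weight/power whose
    price does not exceed the budget, or None if none is affordable."""
    affordable = [car for car in front if car[2] <= budget]
    return min(affordable, key=lambda car: car[1], default=None)

for budget in (25.0, 27.0, 35.0):
    print(budget, best_within_budget(budget))
```

Raising the budget never makes the attainable weight/power worse, which is what gives the curve its monotone shape.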