Outline
[read Chapter 2]
[suggested exercises 2.2, 2.3, 2.4, 2.6]
Learning from examples
General-to-specific ordering over hypotheses
Version spaces and candidate elimination algorithm
Picking new examples
The need for inductive bias
Note: a simple approach assuming no noise; illustrates key concepts
Training Examples for EnjoySport

Sky    Temp   Humid    Wind     Water   Forecast   EnjoySport
Sunny  Warm   Normal   Strong   Warm    Same       Yes
Sunny  Warm   High     Strong   Warm    Same       Yes
Rainy  Cold   High     Strong   Warm    Change     No
Sunny  Warm   High     Strong   Cool    Change     Yes
Representing Hypotheses

Many possible representations.
Here, h is a conjunction of constraints on attributes.
Each constraint can be
- a specific value (e.g., Water = Warm)
- don't care (e.g., Water = ?)
- no value allowed (e.g., Water = ∅)
For example,
Sky  AirTemp  Humid  Wind  Water  Forecast
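This representation can be sketched as a tuple of constraints, with a small helper to test whether an instance satisfies a hypothesis (a minimal sketch; the tuple encoding with "?" for don't-care and None for ∅ is an assumption, not from the slides):

```python
# Minimal sketch of the constraint representation above (assumed encoding:
# a string for a specific value, "?" for don't care, None for ∅).

def matches(h, x):
    """True if instance x satisfies every constraint in hypothesis h.

    A None (∅) constraint equals neither "?" nor any attribute value,
    so a hypothesis containing ∅ matches no instance at all."""
    return all(c == "?" or c == v for c, v in zip(h, x))

h = ("Sunny", "?", "?", "Strong", "?", "?")
print(matches(h, ("Sunny", "Warm", "High", "Strong", "Cool", "Same")))  # True
print(matches(h, ("Rainy", "Warm", "High", "Strong", "Cool", "Same")))  # False
```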
Prototypical Concept Learning Task

Given:
- Instances X: Possible days, each described by the attributes Sky, AirTemp, Humidity, Wind, Water, Forecast
- Target function c: EnjoySport : X → {0, 1}
- Hypotheses H: Conjunctions of literals. E.g. <?, Cold, High, ?, ?, ?>
- Training examples D: Positive and negative examples of the target function <x1, c(x1)>, ..., <xm, c(xm)>

The inductive learning hypothesis: Any hypothesis found to approximate the target function well over a sufficiently large set of training examples will also approximate the target function well over other unobserved examples.

Instances, Hypotheses, and More-General-Than
(Figure: instances X and hypotheses H, ordered from specific to general)
x1 = <Sunny, Warm, High, Strong, Cool, Same>
x2 = <Sunny, Warm, High, Light, Warm, Same>
h1 = <Sunny, ?, ?, Strong, ?, ?>
h2 = <Sunny, ?, ?, ?, ?, ?>
h3 = <Sunny, ?, ?, ?, Cool, ?>
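The more-general-than-or-equal relation pictured above can be sketched as a per-attribute check (a minimal sketch; the tuple encoding with "?" and None for ∅ is an assumption):

```python
# Sketch of the >=g relation: h1 >=g h2 iff every instance satisfying h2
# also satisfies h1. For conjunctive hypotheses this reduces to checking
# that each constraint of h1 covers the corresponding constraint of h2.

def more_general_or_equal(h1, h2):
    def covers(c1, c2):
        # "?" covers anything; a specific value covers only itself;
        # ∅ (None) is covered by everything and covers nothing but ∅.
        return c1 == "?" or (c1 == c2 and c1 is not None) or c2 is None
    return all(covers(c1, c2) for c1, c2 in zip(h1, h2))

h1 = ("Sunny", "?", "?", "Strong", "?", "?")
h2 = ("Sunny", "?", "?", "?", "?", "?")
print(more_general_or_equal(h2, h1))  # True: h2 drops the Wind constraint
print(more_general_or_equal(h1, h2))  # False
```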
Find-S Algorithm

1. Initialize h to the most specific hypothesis in H
2. For each positive training instance x
   For each attribute constraint ai in h
     If the constraint ai in h is satisfied by x
     Then do nothing
     Else replace ai in h by the next more general constraint that is satisfied by x
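The steps above can be sketched directly (a minimal sketch; the tuple encoding with "?" and None for ∅, and the function name, are assumptions):

```python
# A minimal sketch of Find-S for the conjunctive representation above.

def find_s(examples):
    """examples: list of (instance_tuple, label) pairs; uses positives only."""
    n = len(examples[0][0])
    h = [None] * n                      # most specific hypothesis <∅,...,∅>
    for x, label in examples:
        if not label:                   # Find-S ignores negative examples
            continue
        for i, (c, v) in enumerate(zip(h, x)):
            if c is None:
                h[i] = v                # first positive: adopt its values
            elif c != "?" and c != v:
                h[i] = "?"              # next more general constraint
    return tuple(h)

train = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), True),
    (("Sunny", "Warm", "High",   "Strong", "Warm", "Same"), True),
    (("Rainy", "Cold", "High",   "Strong", "Warm", "Change"), False),
    (("Sunny", "Warm", "High",   "Strong", "Cool", "Change"), True),
]
print(find_s(train))  # ('Sunny', 'Warm', '?', 'Strong', '?', '?')
```

Run on the four EnjoySport examples, this reproduces the h4 of the search trace that follows.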
Hypothesis Space Search by Find-S

(Figure: search from specific to general through hypothesis space H)
x1 = <Sunny, Warm, Normal, Strong, Warm, Same>, +
x2 = <Sunny, Warm, High, Strong, Warm, Same>, +
x3 = <Rainy, Cold, High, Strong, Warm, Change>, -
x4 = <Sunny, Warm, High, Strong, Cool, Change>, +
h0 = <∅, ∅, ∅, ∅, ∅, ∅>
h1 = <Sunny, Warm, Normal, Strong, Warm, Same>
h2 = h3 = <Sunny, Warm, ?, Strong, Warm, Same>
h4 = <Sunny, Warm, ?, Strong, ?, ?>
Complaints about Find-S

- Can't tell whether it has learned the concept
- Can't tell when training data is inconsistent
- Picks a maximally specific h (why?)

Version Spaces
A hypothesis h is consistent with a set of training examples D of target concept c if and only if h(x) = c(x) for each training example <x, c(x)> in D.

    Consistent(h, D) ≡ (∀<x, c(x)> ∈ D) h(x) = c(x)

The version space, VS_{H,D}, with respect to hypothesis space H and training examples D, is the subset of hypotheses from H consistent with all training examples in D.

The List-Then-Eliminate Algorithm:
1. VersionSpace ← a list containing every hypothesis in H
2. For each training example <x, c(x)>, remove from VersionSpace any hypothesis h for which h(x) ≠ c(x)

Example Version Space
S:  { <Sunny, Warm, ?, Strong, ?, ?> }

    <Sunny, ?, ?, Strong, ?, ?>   <Sunny, Warm, ?, ?, ?, ?>   <?, Warm, ?, Strong, ?, ?>

G:  { <Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?> }
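Consistent and List-Then-Eliminate can be sketched as follows (a minimal sketch; the tuple encoding and the restriction to attribute values seen in the data are assumptions made to keep the enumeration small):

```python
# Sketch of Consistent(h, D) and List-Then-Eliminate, enumerating only
# hypotheses built from observed attribute values plus "?" (the all-∅
# hypothesis is omitted: it cannot be consistent once a positive is seen).

from itertools import product

def matches(h, x):
    return all(c == "?" or c == v for c, v in zip(h, x))

def consistent(h, examples):
    """Consistent(h, D): h(x) = c(x) for every <x, c(x)> in D."""
    return all(matches(h, x) == label for x, label in examples)

def list_then_eliminate(examples):
    n = len(examples[0][0])
    domains = [sorted({x[i] for x, _ in examples}) + ["?"] for i in range(n)]
    return [h for h in product(*domains) if consistent(h, examples)]

train = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), True),
    (("Sunny", "Warm", "High",   "Strong", "Warm", "Same"), True),
    (("Rainy", "Cold", "High",   "Strong", "Warm", "Change"), False),
    (("Sunny", "Warm", "High",   "Strong", "Cool", "Change"), True),
]
for h in sorted(list_then_eliminate(train)):
    print(h)  # the six hypotheses of the version space shown above
```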
Representing Version Spaces

The General boundary, G, of version space VS_{H,D} is the set of its maximally general members.

The Specific boundary, S, of version space VS_{H,D} is the set of its maximally specific members.

Every member of the version space lies between these boundaries:

    VS_{H,D} = {h ∈ H | (∃s ∈ S)(∃g ∈ G)(g ≥ h ≥ s)}

where x ≥ y means x is more general than or equal to y.

Candidate Elimination Algorithm
G ← maximally general hypotheses in H
S ← maximally specific hypotheses in H
For each training example d, do
- If d is a positive example
  - Remove from G any hypothesis inconsistent with d
  - For each hypothesis s in S that is not consistent with d
    - Remove s from S
    - Add to S all minimal generalizations h of s such that
      1. h is consistent with d, and
      2. some member of G is more general than h
    - Remove from S any hypothesis that is more general than another hypothesis in S
- If d is a negative example
  - Remove from S any hypothesis inconsistent with d
  - For each hypothesis g in G that is not consistent with d
    - Remove g from G
    - Add to G all minimal specializations h of g such that
      1. h is consistent with d, and
      2. some member of S is more specific than h
    - Remove from G any hypothesis that is less general than another hypothesis in G

Example Trace

(Figure: trace of the S and G boundaries as each of the four training examples is processed.)
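The algorithm can be sketched for the conjunctive representation used above (a minimal sketch, not the textbook's pseudocode verbatim; helper names and the tuple encoding are assumptions, and since the S boundary stays a singleton for conjunctive H, the S-side pruning step is omitted):

```python
# Compact sketch of candidate elimination, maintaining only the S and G
# boundary sets ("?" = don't care, None = ∅).

def matches(h, x):
    return all(c == "?" or c == v for c, v in zip(h, x))

def more_general_or_equal(h1, h2):
    def covers(c1, c2):
        return c1 == "?" or (c1 == c2 and c1 is not None) or c2 is None
    return all(covers(c1, c2) for c1, c2 in zip(h1, h2))

def min_generalize(s, x):
    """Minimal generalization of s consistent with positive instance x."""
    return tuple(v if c is None else c if c == v else "?"
                 for c, v in zip(s, x))

def min_specializations(g, domains, x):
    """Minimal specializations of g that exclude negative instance x."""
    return [g[:i] + (v,) + g[i + 1:]
            for i, c in enumerate(g) if c == "?"
            for v in domains[i] if v != x[i]]

def candidate_elimination(examples):
    n = len(examples[0][0])
    domains = [sorted({x[i] for x, _ in examples}) for i in range(n)]
    S = {(None,) * n}                  # most specific boundary
    G = {("?",) * n}                   # most general boundary
    for x, label in examples:
        if label:                      # positive example
            G = {g for g in G if matches(g, x)}
            S = {min_generalize(s, x) for s in S}   # no-op if s already matches
            S = {s for s in S if any(more_general_or_equal(g, s) for g in G)}
        else:                          # negative example
            S = {s for s in S if not matches(s, x)}
            new_G = set()
            for g in G:
                if not matches(g, x):
                    new_G.add(g)
                else:                  # replace g by its minimal specializations
                    new_G |= {h for h in min_specializations(g, domains, x)
                              if any(more_general_or_equal(h, s) for s in S)}
            # drop any member less general than another member of G
            G = {g for g in new_G
                 if not any(g2 != g and more_general_or_equal(g2, g)
                            for g2 in new_G)}
    return S, G

train = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), True),
    (("Sunny", "Warm", "High",   "Strong", "Warm", "Same"), True),
    (("Rainy", "Cold", "High",   "Strong", "Warm", "Change"), False),
    (("Sunny", "Warm", "High",   "Strong", "Cool", "Change"), True),
]
S, G = candidate_elimination(train)
print(sorted(S))  # [('Sunny', 'Warm', '?', 'Strong', '?', '?')]
print(sorted(G))  # [('?', 'Warm', '?', '?', '?', '?'), ('Sunny', '?', '?', '?', '?', '?')]
```

On the four EnjoySport examples this reproduces the final S and G boundaries shown in the version space figures.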
What Next Training Example?

S:  { <Sunny, Warm, ?, Strong, ?, ?> }

    <Sunny, ?, ?, Strong, ?, ?>   <Sunny, Warm, ?, ?, ?, ?>   <?, Warm, ?, Strong, ?, ?>

G:  { <Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?> }
How Should These Be Classified?

S:  { <Sunny, Warm, ?, Strong, ?, ?> }

    <Sunny, ?, ?, Strong, ?, ?>   <Sunny, Warm, ?, ?, ?, ?>   <?, Warm, ?, Strong, ?, ?>

G:  { <Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?> }

<Sunny, Warm, Normal, Strong, Cool, Change>
<Rainy, Cool, Normal, Light, Warm, Same>

What Justifies this Inductive Leap?
+ <Sunny, Warm, Normal, Strong, Cool, Change>
+ <Sunny, Warm, Normal, Light, Warm, Same>

S: <Sunny, Warm, Normal, ?, ?, ?>

Why believe we can classify the unseen instance?
An UNBiased Learner

Idea: Choose H that expresses every teachable concept (i.e., H is the power set of X)

Consider H' = disjunctions, conjunctions, negations over previous H. E.g.,

    <Sunny, Warm, Normal, ?, ?, ?> ∨ ¬<?, ?, ?, ?, ?, Change>

What are S, G in this case?
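A toy demonstration (not from the slides; the two-attribute instance space and all names are assumptions) of why the unbiased H cannot generalize: if hypotheses are arbitrary subsets of X, the S boundary is exactly the set of observed positives, the G boundary excludes only the observed negatives, and every unseen instance is classified both ways.

```python
# With H = power set of X, the boundaries collapse onto the training data
# itself, so the version space never commits on an unseen instance.

from itertools import product

X = set(product(["Sunny", "Rainy"], ["Warm", "Cold"]))  # tiny instance space
positives = {("Sunny", "Warm")}
negatives = {("Rainy", "Cold")}

s_boundary = set(positives)   # most specific consistent subset-hypothesis
g_boundary = X - negatives    # most general consistent subset-hypothesis

unseen = ("Sunny", "Cold")
print(unseen in s_boundary)   # False: the S boundary votes negative
print(unseen in g_boundary)   # True:  the G boundary votes positive
```

So for any unseen instance the learner can only answer "don't know".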
Inductive Bias

Consider
- concept learning algorithm L
- instances X, target concept c
- training examples Dc = {<x, c(x)>}
- let L(xi, Dc) denote the classification assigned to the instance xi by L after training on data Dc.

Definition: The inductive bias of L is any minimal set of assertions B such that for any target concept c and corresponding training examples Dc

    (∀xi ∈ X)[(B ∧ Dc ∧ xi) ⊢ L(xi, Dc)]

Inductive Systems and Equivalent Deductive Systems

(Figure: an inductive system — the candidate elimination algorithm using hypothesis space H, taking training examples and a new instance and outputting a classification of the new instance or "don't know" — is equivalent to a deductive system: a theorem prover given the same training examples and new instance plus the assertion "H contains the target concept".)
Three Learners with Different Biases

1. Rote learner: Store examples. Classify xi iff it matches a previously observed example.
2. Version space candidate elimination algorithm
3. Find-S
Summary Points

1. Concept learning as search through H
2. General-to-specific ordering over H
3. Version space candidate elimination algorithm
4. S and G boundaries characterize learner's uncertainty
5. Learner can generate useful queries
6. Inductive leaps possible only if learner is biased
7. Inductive learners can be modelled by equivalent deductive systems