Laurea Laurea MAGISTRALE in MAGISTRALE in COMPUTER SCIENCECOMPUTER SCIENCE

(1)

1

Laurea

Laurea MAGISTRALE in MAGISTRALE in COMPUTER SCIENCE COMPUTER SCIENCE

Corso di

ARTIFICIAL INTELLIGENCE Stefano Ferilli

Questi lucidi sono stati preparati per uso didattico. Essi contengono materiale originale di proprietà dell'Università degli Studi di Bari e/o figure di proprietà di altri autori, società e organizzazioni di cui e' riportato il riferimento. Tutto o parte del materiale può essere fotocopiato per uso personale o didattico ma non può essere distribuito per uso commerciale. Qualunque altro uso richiede una specifica autorizzazione da parte dell'Università degli Studi di Bari e degli altri autori coinvolti.

Search / 1 Search / 1

Irrevocable/tentative strategies - Blind/informed Irrevocable/tentative strategies - Blind/informed

search - U

search - Use of heuristics: A, admissibility – se of heuristics: A, admissibility – Performance measures

Performance measures

Problem State Space

●

Representing the problem state space allows

●

Formally defining the problem in terms of initial situation to be turned into a goal situation

●

Defining the solving process as a combination of

–

Pre-defined techniques

●

E.g., an operator/rule defining a step in the process

–

Search techniques

●

Exploration of the state space searching for a path – the “best”

path to attain the goal

●

Problem of finding a sequence of operations that turn one database into another

= finding a path in a directed graph

●

Nodes = problem states

●

Arcs = operators/rules whose application allows to go from a state to another

●

A general way to build the interpreter that acts as an inference engine :

endowing it with a generalized graph-search

●

By building a search tree, allows it to explore the problem space (database of configurations) looking for a path to reach the goal configuration

Modeling the Solution Process

●

Need to consider

●

Representation of each node in the search process

●

Selection of applicable operators

●

Topology of the search process

and direction in which carrying out the search

●

Possible use of (local- or global-level) knowledge to guide the search

●

When specifying the set of operators several aspects must be taken into account

●

E.g., in the case of rules:

–

The implicit assumptions in the informal description of the problem

–

The generality of the rules

–

The possibility of using in the rules previous knowledge in order to reduce the problem complexity

8-puzzle

●

Problem

●

Starting from any initial layout of 8 numbered tiles in a frame, place the tiles by increasing order along the frame border

–

E.g.: Initial state Final State

–

How to describe it?

–

Does the description affect the computational effort?

2 8 3

1 6 4

7 5

1 8

7 6

2

5

4

3

(2)

8-puzzle

●

Representing states is quite obvious

●

E.g., square matrices of characters

–

Possibly linearized

●

How to represent moves?

●

Problem space (set of all possible configurations):

9! = 362.880 configurations

●

Using a 3x3 matrix to represent the problem state, this will be the item in the global database

●

To model the problem all possible configurations created starting from the initial configuration by applying all possible moves must be described

2 8 3 1 6 4 7 5

8 2 3 1

6 4 7 5 2 8 3 1 6 4 7 5 2 8 3 1 6 4 7 5 2 8 3 1 6 4 7 5 2 8 3 1

6 4 7 5 2 8 3 6 7 1 4 5

2 8 3 1 6 4 7 5 2 8 3

1 6 4 7 5

2 8 3 1 6 4 7 5

2 8 3 1 6 4 7 5 2 8 3 1 6 4 7 5

2 8 3 1 6 4

7 5

2 8 3 1 6 4

7 5 2 8 3

6 1 4 7 5 2 8 3

1 6 4 7 5 2 8 3 6 1 4 7 5

2 8 3 6 7 1 4 5

2 8 3 1 6 4 7 5

2 8 1 3

6 4 7 5

2 8 3 1 4 6 7 5 2 8 3

6 1 4 7 5 2 8 1 3

6 4 7 5

2 8 3 1 6 7 4

5 2 8 3 1

6 4 7 5 8-puzzle

8-puzzle

●

How to represent the operators/rules?

●

2 options

–

Describing all possible moves for all numbered tiles in all possible legal positions

–

Describing moves of the “empty” tile

●

Just 4 moves

–

Move empty to left

–

Move empty to right

–

Move empty upward

–

Move empty downward

●

Moves may be expressed as production rules

–

“preconditions” (antecedents) define the state configurations in which a move is possible

8-puzzle

●

Rules

●

IF you are not in row (1) THEN move empty upward

●

Goal

●

May be explicit

–

The final state defined by the game

●

Or implicit

–

Expressed by a general boolean condition (goal condition) of the kind

●

“Sum of row (1) = 6”

AND “Sum of row (2) = 12”

AND “#”

Control Strategies

●

May be used by the problem solver in the solution process, intended as a path toward a goal state

●

Irrevocable

–

Rules are chosen and never reconsidered

●

Possible only if much knowledge about the problem is available, up to a deterministic solution (algorithm)

●

Tentative

–

The chosen rules may be reconsidered to obtain a better performance

●

Needed when the deterministic solution is impractical

Control Strategies

●

Irrevocable strategies

●

Efficient if knowledge about the problem is available

●

Even just “local” information that allows to proceed towards the goal

–

E.g., a function associated to each state description that provides an idea of the distance from the goal

–

Rule selection

●

Choose rule that decreases the function value

●

If no rule available to decrease the function value,

arbitrarily choose one that does not increase it

(3)

Control Strategies

●

Example: 8-puzzle

●

Distance from goal = number of misplaced tiles with respect to the final configuration

2 8 3 1 6 4 7 5

2 8 3 1 4 7 5

2 8 3 1 6 4 7 5

2 8 3 1

6 4 7 5 8 2 3

1 6 4 GOAL 7 5

4 6 3 3 2

1 Tentative Strategies

●

Solution searched tentatively

by trying to improve efficiency of the procedure through a reduction of the search space

●

Both selective

–

e.g., backtracking

and generative techniques may be used

–

most used is generate-and-test (with greedy techniques)

●

For problems involving limited search, tentative (or anyhow exhaustive) control strategies may turn out to be perfectly adequate and efficient

Tentative Strategies

●

Selective technique with backtracking

–

A search tree is built to search a path in the problem state space

●

Nodes = states of the problem

●

Root = initial state

●

Operators are available : applied to a state, transform it into another state

–

Each node represents a stage in the solution

●

If any element in the search space is structured into n components, after i stages a partial solution will be available

●

If information is available to understand when, from a partial solution, a complete solution cannot be reached, then the search process may be stopped on that path and go back to try others

●

Exhaustive technique!

–

Exploits the whole search space of the problem

8-puzzle

●

A semi-rigid and systematic strategy

●

Always move the empty tile (if possible) to the left, then upward, then to the right, then downward

●

Stop if

–

Generated a description already encountered

–

Applied a number of rules one does not want to exceed

●

Depth bound

–

No other applicable rule available

●

Technique effective if the db is well evaluated and more information is available to choose the move to make

●

We choose to

●

Not examine

the tree beyond a given depth

–

Define a depth bound DB

●

Ignore paths that produce already visited configurations

2 8 3 1 6 4 7 5

2 8 3 1 6 4

7 5 2 8 3

1 6 4 7 5 2 8 3 1 6 4 7 5 2 8 3 1 6 4

7 5 2 8 3 1 6 4

7 5 2 8 3 1 6 4 7 5

2 8 3 1

6 4 7 5

2 8 3 1

6 4 7 5

1° 2° 3° 4°

5°

6°

7° 7°

7°

6°

DB = 6

N queens problem

●

n queens to be placed in a nxn chessboard so that they do not mutually threat each other

●

States?

●

Operators?

●

Goal?

●

Cost?

–

E.g.: 8 queens:

(4)

N queens problem

●

Termination condition

●

A generic rule might be expressed as follows

●

For 1  i,j  N: R

ij –

Precondition

●

i = 1 (no queen in the schema)

●

1 < i  N (a queen in (i–1)-th row of the array)

–

Action

●

Place a queen in row i, column j of the array

●

For N = 4, the backtracking algorithm backtracks 22 times before finding a solution

●

A more efficient algorithm might be obtained by using a “more informed” ordering on rules

●

E.g., using function

–

diag(i,j) = length of the longest diagonal through cell C(i,j)

●

R

_ij

will be chosen before R

_mn

if diag(i,j) < diag(m,n)

●

For N = 4, the order of rules will be

–

R

₁₂

R

₂₄

R

₃₁

R

43

and the algorithm goes back just twice

Backtracking Strategy

●

recursive procedure in LISP-like [Nilsson, 1980]

●

backtrack(Data)

–

1. IF term(Data) return NIL;

●

term(.) predicate : true for arguments that satisfy the termination condition of the production system

●

If it terminates with success, the empty list NIL is returned

–

2. IF deadend(Data) return FAIL;

●

deadend(.) predicate : true for arguments that are known not to belong to a solution path

●

In such a case, the procedure returns a failure

–

3. Rules  apprules(Data);

●

apprules(.) : function that computes the rules applicable to its argument, and orders them

–

Arbitrarily or heuristically

–

4. Loop: IF null(Rules) return FAIL;

●

If there are no more rules to apply, the procedure fails

Backtracking Strategy

–

5. r  first(Rules);

●

Choose the best rule to apply

–

6. Rules  tail(Rules);

●

Reduces the list of applicable rules by removing the selected rule

–

7. Rdata  r(Data);

●

Apply rule R to produce a new DB

–

8. Path  backtrack(Rdata);

●

Recursively call the procedure on the new DB

–

9. IF Path = FAIL go Loop;

●

If recursive call fails, try another loop

–

10. return new list with R it its head cons(R, Path);

●

Otherwise add R to the list of succesfully applied rules

–

Cons : a LISP function that adds an element in first

position in the previous list

List constructor in LISP dialects

Backtracking Strategy

●

Comments

●

The procedure terminates only if producing a DB that satisfies the termination condition

–

The list of rules to produce the DB is in 10

–

When termination fails (steps 2 and 4) the procedure goes back to a higher level

●

The procedure might never terminate

–

It may indefinitely generate new non-terminal DBs or loop. This may be avoided by defining a depth bound

–

The loop on DBs may be limited by maintaining a list of the produced DBs and checking new generated DBs against those in the list

●

Heuristic information on the problem may be exploited in rule ordering (step 3)

Backtracking Strategy

●

A more efficient backtracking procedure

●

To avoid loops, pass the list of DBs as an argument to the procedure

–

When calling the procedure for the first time, it only contains the initial DB as a singleton

●

In case of termination, the procedure returns a

sequence of rules that are applicable to the initial

DB to produce one that satisfies the termination

condition

(5)

Backtracking Strategy

●

Recursive procedure backtrack1(Dlist)

–

1. Data  first(Dlist);

●

Dlist : list of all DBs on a path walked backward up to the initial node. Data : most recently produced DB

–

2. IF member(Data, tail(Dlist)), return FAIL;

●

Procedure fails if visiting again a previous DB

–

3. IF term(Data), return NIL;

–

4. IF deadend(Data), return FAIL;

–

5. IF length(Dlist) > Bound, return FAIL;

●

Procedure fails if too many rules were applied

●

Bound : a global variable specified before the procedure is called for the first time

–

6. Rules  apprules(Data);

–

7. Loop: IF null(Rules) return FAIL;

Backtracking Strategy

–

8. r  first(Rules)

–

9. Rules  tail(Rules)

–

10. Rdata  r(Data)

–

11. Rdlist  cons(Rdata, Dlist);

●

List of DBs is inspected and extended with the current DB

–

12. Path  backtrack1(Rdlist)

–

13. IF Path = FAIL, go to 7 (Loop)

–

14. return cons(R, Path)

Backtracking Strategy

●

Both strategies proceed step-wise

●

Failure at level n returns control to level n – 1, where another rule is attempted

●

However, sometimes failure is due to the choice of inappropriate rules made several levels before

●

In such a case, jumping many levels back would improve efficiency

Graph Search Strategy

●

Finding a sequence of operators that transform a state of the problem into another

= Finding a path in a directed graph

●

Nodes = problem states

●

Arcs = application of operators

●

All possible paths on the graph are represented, by generating a search tree rooted in the node that is the initial state

●

Strategy : explore the tree until producing a state that satisfies termination (coincides with the goal state)

General Graph Search Strategy

●

A graph may be explicitly specified

●

Implemented using suitable abstract data structures (e.g., adjacency or incidence matrices)

●

Generally impractical for AI applications for which the graph is unknown a-priori

or implemented dynamically

●

Generated during exploration

●

The control strategy itself makes explicit part of an implicitly specified graph

–

Solution path built starting from an initial node n

0

and applying the set of rules that allow to generate the nodes one after the other

General Graph Search Strategy

●

Dynamic generation of the problem graph (and the search tree)

●

Need for a successor operator, to be applied to a node to generate all of its successors

–

Nodes that can be obtained through application of suitable operators/rules

●

Expanding a node =

Applying the successor operator to a node

–

Expanding n

0

, then the successor of n

0

, and so on makes explicit a graph defined implicitly

●

Control strategy = the process to make explicit,

through the generation of the search tree, a portion

of an implicit graph that is sufficient to include a

goal node

(6)

General Search Method

●

Informal description

●

function general_search(problem, strategy) return solution or failure

–

Initialize search tree using the initial state of the problem

–

Loop do

●If there are no nodes to expand then return failure

●else

–

choose a leaf node to expand according to strategy

–

expand node and add resulting nodes to the tree

–

end

Graphsearch

–

1. Generate a search tree T

_r

consisting solely of the start node n

₀

. Add n

₀

to an ordered list OPEN

–

2. Generate a list CLOSED, initially empty

–

3. If OPEN is empty, exit with FAILURE

–

4. Choose first node n in OPEN, extract it and add it to CLOSED

–

5. If n is the goal node exit with SUCCESS with the solution obtained by tracing back a path along the arcs in T

_r

from n no n

0

–

6. Expand node n generating a set M of successors. Add the nodes in M as successors of n in T

r

adding an arc from n to each element in M. Add these elements of M to OPEN

–

7. Reorder list OPEN based on some criterion

–

8. Go to step 3

Graphsearch

●

Some considerations

●

The graphsearch procedure generates all successors of a node

●

The algorithm can be modified so as to generate one successor at a time

–

The node is not put in CLOSED until all of its successors are generated

–

Usually preferred

●

Application of rules to a DB to produce new DBs is computationally expensive

Control Strategies

●

Main problem for the problem solver:

choosing an applicable rule

●

Important feature: amount of knowledge used to select the operators to be applied

Arbitrary Selection

Knowledge-driven Selection

Control Strategies

●

Computational efficiency depends on the computational costs of the control strategy

●

Due to

–

Cost of applying the operators/rules

–

Control costs

●

Low control cost high cost to apply operators

●

High control cost low cost to apply operators

Total cost for P.S.

Control strategy cost

Rule application cost (cost of the heuristics)

C om pu ta ti on al c os t

Availability of knowledge

(7)

Control Strategies

●

2 extremes:

●

Few available knowledge implies that rule selection is arbitrarily made

–

Low control cost

–

High rule application cost because many may be attempted

●

Control strategy guided by a deeper problem knowledge, which allows to choose the “right” rule at the right moment

–

High control cost, due to the use of problem domain- related information

–

Low rule application cost, because the production system goes straight to the solution

Graph-Search Procedures

●

Search may be more or less informed

●

Ranges between blind search

–

Fruitful only for exhaustive exploration of the state space and informed search

–

Knowledge is available to choose the most promising node and/or the best/optimal path so as to dramatically reduce the search space

Search Strategies for Problem Solving

●

A systematization

–

[N. Nilsson, “Principles of Artificial Intelligence” 1980]

Blind

Depth first Breadth

first Non-

deterministic Beam

search Best first climbing Hill

Informed procedures

Graph-Search Procedures

●

Blind (Uninformed) Graph-Search Techniques

●

No heuristic criterion to sort nodes in OPEN

●

Ordering is just syntactic

–

Ascending or descending or other

●

Strictly speaking, should not be considered in the development of knowledge-based system

–

May be reduced to the graph search algorithm

●

Depth-first search

●

Breadth-first search

●

Search with Depth Iteration

●

Bidirectional search

Graphsearch

●

Can be used to perform a search

●

In breadth (Breadth first)

–

New nodes are added at the end of OPEN

●

FIFO ordering

●

In depth (Depth first)

–

New nodes are added at the beginning of OPEN

●

LIFO ordering

●

Heuristic (Best First)

–

New nodes are ordered using a goodness criterion

●

Local heuristic

Search Strategies for Problem Solving

●

Depth-first Search

●

Orders nodes in OPEN in descending order based on the depth of the search tree

–

Deeper nodes are the first in the list

–

Nodes at the same depth are arbitrarily ordered

–

So, the deepest node is always selected for expansion

●

Breadth-first Search

●

Orders nodes in OPEN in ascending order based on the depth of the search tree

–

“in breadth” because the expansion of nodes in the

search tree proceeds along borders of equal depth

(8)

Search Strategies for Problem Solving

●

Depth-first Search (DFS)

Numbers = expansion order generated by DFS 0

1 8

c

2 5 9 c 1 c

2 3 4 1

3 1 c 4

6 7 1

0 1 1

Search Strategies for Problem Solving

●

Breadth-first Search (BFS)

Numbers = expansion order generated by BFS 0

1 2

c

3 4 5 c 6 c

7 8 1

3 1

c 4

9 1

0 1

1 1

2 Search Strategies for Problem Solving

●

Search with Depth Iteration [Korf, 1985]

●

Also in uninformed techniques efficiency can be improved by leveraging the linear memory requirements of the depth-first search, finding a goal node at minimum depth

●

Subsequent depth-first searches are carried out, each with a depth bound increased by 1 with respect to the previous one, until a goal node is found

●

The number of nodes expanded by the search with depth iteration is not much larger than that of the nodes expanded by the breadth-first search

Search Strategies for Problem Solving

●

Search with Depth Iteration [Korf, 1985]

Numbers = expansion order generated by DFID 0

1,3, 9

2,6,1 6 4,1 c

0 5,1

3 ^7,1 ₇ c ^8,2 ₀

11 12 14 15 c 18 19 21 22

Search Strategies for Problem Solving

●

Bidirectional Search

●

Idea: searching simultaneously both starting from the initial state toward the goal state, and from the goal state toward the initial state,

letting the two borders meet in the middle

S Goal

node

monodirectional

Informed Search

●

Exhaustive search is impractical in problems having exponential complexity

●

Knowledge and experience about the problem used to detect most promising paths

●

Heuristic knowledge

–

Greek “eurisko” = to find

helps in doing “insightful” choices

●

Search not avoided, but reduced

●

Generally allows to find a good solution in acceptable time

●

Under some conditions completeness and

optimality guaranteed

(9)

Informed Search

●

Example: route finding

●

Given a set of towns to visit, a start town and a destination town, the aim is finding the roads to go to reach the latter from the former, covering the least possible distance

–

If all towns are connected, the problem is decidable and can be solved tentatively

–

Complete exploration generates a combinatorial explosion

●

Total time is O(n!)

–

For n towns the number of different paths is

●

(n-1)! = (n-1) • (n-2) •………. • 2 • 1

–

Examining a path requires a time proportional to n

–

Knowledge must be used to reduce the search space

Informed Search

●

Best first

●

The procedure to search a solution path ( graph search) may exploit local-level knowledge to choose the best rule to apply and generate the solution path with a good success probability

–

Search proceeds non-uniformly starting from the initial node, using a preference criterion in the selection of the nodes to expand based on the specific knowledge for each problem (heuristic)

Heuristic Evaluation Functions

●

In general, problem knowledge is available through a state evaluation function

●

Heuristic evaluation function f : n  R

●

Depends on the state only

–

8-puzzle: number of misplaced tiles

–

Checkers or Chess: advantage in pieces

–

Route-finding: the closest town (or the town which is closest to the goal via a straight line)

Best-first Search Algorithm

●

At each step, chooses the node on the border having the best value of f

●

Most promising node

●

Best = “less” if estimating the distance from the solution

●

Can be implemented using a priority queue ordered based on the value of the heuristic evaluation function

Best-first Strategy

●

Example: route finding

Step 1 A

Best-first Strategy

●

Example: route finding

Passo 1 A

3 Step 2

B D

5 1

C

(10)

Best-first Strategy

●

Example: route finding

Passo 1 A

3 Passo 2

B D

5 1

C Step 3

D

E F

4 6

Best-first Strategy

●

Example: route finding

Passo 1 A

3 Passo 2

B D

5 1

C Passo 3

D

E F

4 6

Step 4

B

G H

6 5

Best-first Strategy

●

Example: route finding

Passo 1 A

3 Passo 2

B D

5 1

C Passo 3

D

E F

4 6

Passo 4

B

G H

6 5

Step 5

E

G I

2 1

Best-first Strategy

●

Example: route finding

Passo 1 A

3 Passo 2

B D

5 1

C Passo 3

D

E F

4 6

Passo 4

B

G H

6 5

Passo 5

E

G I

2 1

Step 6

I

Best-first Strategy

●

Example: route finding

Passo 1 A

3 Passo 2

B D

5 1

C Passo 3

D

E F

4 6

Passo 4

B

G H

6 5

Passo 5

E

G I

2 1

Passo 6

I Step 7

G

2 8 3 1 6 4 7 5

8 2 3 1

6 4 7 5

2 8 3 1 6 4 7 5 2 8 3 1 6 4 7 5 2 8 3 1 6 4 7 5 8 2 3 1

6 4 7 5

2 8 3 1 6 4 7 5 2 8 3

1 6 4 7 5

2 8 3 6 1 4 7 5

2 8 3 1 6 4 7 5 2 8 3 1 6 4 7 5 2 8 3

6 1 4 7 5 2 8 3

1 6 4 7 5

2 8 3 1

6 4 7 5 5

5 3

5 2

4 5

6 4

3 3

3 4 3 4

1 0 2

Number in green  node expansion order

Circled values

= node goodness expressed as number of misplaced tiles (distance from goal)

1 Start node

Goal node 3

Toward useless exploration

8-puzzle: best-first with heuristic function = number of misplaced tiles

(11)

●

Using heuristics in search process (1)

●

1. decide which node to expand first, instead of proceeding blindly

–

Expanding the “most promising” node

●

“ordered” or “best-first” search

–

Local evaluation function f

●

To decide which is the best node to expand

●

Possible ideas:

–

Distance measure between an arbitrary node and the goal set (evaluation of what is locally optimum)

–

Application-related scores assigned to configurations

Example: Route finding

●

Possible evaluation function

●

Straightline distance from state n to destination

Example: Route finding

●

Route finding with Best-first search

●

In general, the algoritm does not guarantee convergence

–

Choice is greedy and would guarantee the optimal solution only if the graph were complete

Example: Route finding

–

From Arad to Bucharest

●

Greedy: Arad, Sibiu, Fagaras, Bucharest (450)

●

Optimum: Arad, Sibiu, Rimnicu, Pitesti, Bucharest (418)

–

From Iasi to Fagaras: ... false start ... or loop

●

Using heuristics in search process (2)

●

2. Level-wise search

–

decide, while expanding a node, which successors to generate first, instead of generating all indiscriminately

●

i.e., which operators applying to the next node/state

–

Node is partially expanded

●

Heuristic information used to reduce the number of offspring nodes to consider, possibly intending to complete the search later

–

Albeit the use of local heuristics may reduce the quantity of search needed to explore large graphs, sometimes memory may be out before finding a satisfactory path

●

In such cases, one may decide to truncate the search tree, guaranteeing that the paths left out are reconsidered in subsequent stages

●

Using heuristics in search process (3)

●

3. Search with successor limitation

–

Decide that some nodes will be discarded from the search tree

●

The search tree is pruned

–

Nodes seemingly not essential to the solution

●

E.g., expanding all successors except those having lower values than f

–

Should be done when reliable problem knowledge is available

●

A non-expanded node might be on the only solution path

(12)

Heuristic Search for an Optimal Path

●

Aim: not only the best node, but the best solution path!

●

Not just finding as soon as possible a generic path to the solution, but finding an optimum/best path

●

Need to define “best”

–

Costs may be associated to arcs to represent the cost of applying a rule

●

C(n

_i

, n

_j

)

–

Assumed to be positive and larger than an arbitrarily chosen positive number 

–

Aim: minimizing the cost of the path between two nodes

●

Path = sequence of arcs that make up the route from the source node to a destination node

 cost of the path = sum of all costs associated to arcs that make up the path

●

In modeling search strategies for a production system, the problem is simply

●

Finding a best path between a given node s

–

Representing the initial DB and another node t (tip node)

–

Representing another target/goal DB

●

DBs satisfying the termination condition define a set {t_i} called goal set

●

Searching for a best path means guaranteeing the minimum cost path

●

Interest in minimizing the combination of

–

Path cost, and

–

Associated search cost

●

Such combinations are never actually computed, but estimated based on the intuition associated to the problem knowledge

–

Indeed, to evaluate the path cost and the probability that a node is on the best path one needs to make estimations

●

Let

–

g(n) be the cost of the minimum cost path between the starting node s and node n

–

h(n) be the cost of the minimum cost path between a node n and the goal node

●

Function

–

f(n) = g(n) + h(n)

is the cost of the minimum cost path from source node s to node goal, constrained to passing from node n

–

Note: f(s) = h(s) is the cost of a minimum cost path from s to the goal node (not constrained to any n)

●

For any node n, let us denote

●

h*(n) heuristic factor

–

The estimation of h(n)

●

g*(n) depth factor

–

The cost of the lowest cost path found up to node n

●

A good search algorithm will use as an evaluation function

●

f(n) = g(n) + h*(n)

n

0

6 2

7 5

3 4

f(n

0

)= cost of the lowest cost path to the GOAL

f(n) = g(n) + h(n) cost of the lowest cost path from the source node to the GOAL node, constrained to pass from n

g(n) (in this example = 8)

Is the cost of the best path from s to n

(13)

Graph search A*

●

Search for an optimum path [Nilsson, 1980]

–

1. Generate a search graph G consisting only of the starting node n

0

. Add n

0

to an ordered list OPEN

–

2. Generate a list CLOSED initially empty

–

3. If OPEN is empty, exit with FAILURE

–

4. Choose the first node n in OPEN, extract it and add it to CLOSED

–

5. If n is a goal node, exit with SUCCESS with the solution obtained by tracing back the path along the pointers from n to n

0

in G

–

6. Expand n by generating a set M of its successors that are not already ancestors of n in G. Add such elements of M in G as successors of n

Graph search A*

●

Search for an optimum path [Nilsson, 1980]

–

7. generate a pointer to n from each element in M that is not already in G (i.e., in OPEN or CLOSED). Add such elements of M in OPEN. For all elements m in M that are already in OPEN or CLOSED, replace their pointer with one that points to n if the best path to m found so far is that through n. For all elements in M already in CLOSED, change the pointer of each of its descendents in G so that they point back in correspondence to the best paths found so far for such descendants

–

8. sort list OPEN by ascending values of f. In case of ties on the minimum value of f, prefer the deepest node in the search tree

–

9. goto 3

Graph search A*

●

Procedure is general

–

Generates an explicit graph G (search graph) and a subset T (search tree) in step 7

●

OPEN = leaves of the search tree

●

CLOSED = non-terminal nodes

–

In particular, at step 3

●

OPEN = nodes in the search tree not yet selected for expansion

●

CLOSED = terminals chosen for expansion that did not generate successors or non-terminals

–

Step 8 sorts the nodes so that the best one can be expanded in 4

●

When this is a goal node the process terminates with SUCCESS

Graph search A*

●

If the node already exists, one may

●

1. make the currently expanded node point to the existing node corresponding to its successor, instead of the new one

●

2. if keeping track at each node of the best (shortest, cheaper) path, one may determine if the new path is better or not than the previous one, and the change is made only in the former case

Nodes in CLOSED Nodes in OPEN

Black arrows along arcs = pointers defining the parents of nodes in the search tree

when expanding 1 4

2 5 3 1

6 n

0

4

2 5 3 1

6 Graph after expanding node 1

n0

(14)

Graph search A*

●

Conditions guaranteeing the A* algorithm applied to the graph to always find the optimum path

●

Conditions on graph

–

All nodes in the graph have a finite number of successors (if any)

–

All arcs in the graph have costs  than a value  (>0)

●

Conditions on h*

–

For all nodes n in the search graph, it is h*(n)  h(n)

●

h* never overestimates the real value h

–

(optimistic estimation)

Graph search A*

●

So, A* is a graphsearch algorithm using a heuristic function that is a lower bound on h*

●

h=0 is certainly a lower bound on h*

 breadth-first search finds the minimum cost path as a special case of A*

●

Admissibility of A*

●

A search algorithm is admissible if, for any graph, it terminates in an optimum (minimum cost) path from s to a goal node, when a path from s to a goal node exists

●

Proof [Nilsson, 1968]

–

N. Nilsson: “Artificial Intelligence”

–

S.J. Russell, P. Norvig: “Artificial Intelligence: a Modern Approach”

A* Route finding

(15)

A* Route finding

●

In the graphsearch procedure, when a node n is expanded, some of its successors may already be in OPEN or CLOSED

●

Posing a reasonable restriction on h may avoid the (computationally heavy) effort to test if a node has already been generated or not

–

A heuristic function h satisfies the monotonic restriction if for all nodes n

i

and n

j

 n

j

is a successor of n

i

,

●

h(n

_i

) – h(n

_j

)  C(n

_i

,n

_j

) with h(t) = 0

●

i.e., h(n

_i

)  h(n

_j

) + C(n

_i

,n

_j

)

–

The estimation of the optimal path to the goal from node n

_i

cannot be larger than the cost of the arc from n

i

to n

j

plus the estimation of the optimal path from n

j

to the goal

●

In the 8 puzzle, h(n) = w(n) (number of misplaced tiles) satisfies the monotonic restriction

●

It can be proven that, if the monotonic restriction is satisfied,

●

A* has already found an optimal path to any node selected for expansion

–

i.e., if A* chooses n for expansion, g(n) = g*(n)

●

f values for the sequence of nodes expanded by A*

are non-decreasing

●

So, the effect of such a restriction is to increase the chance that the first discovered path to a node is an optimal path

Comparison of A* algorithms

●

Precision depends on the heuristic function

●

h depends on the quantity of knowledge on the problem

–

h(n) = 0 means complete lack of information

●

Even if it is a minimum on h*(n) and thus, by definition, yields an admissible algorithm

●

If two version A

₁

and A

₂

of A* exist, with evaluation functions

–

f

₁

(n) = g

₁

(n) + h

₁

(n)

–

f

₂

(n) = g

₂

(n) + h

₂

(n) with both h

1

and h

2

lower bounds on h*, we say that A

2

is more informed than A

₁

if, for all non-goal nodes, h

2

(n) > h

₁

(n)

Comparison of A* algorithms

●

It can be proven that

●

If A

1

and A

2

are two version of A* such that A

2

is more informed than A

1

, upon termination of the search ran on any graph in which a path exists from n

₀

to the goal node, any node expanded by A

₂

is also expanded by A

1 –

A

1

expands at least as many nodes as A

2

●

A more informed algorithm expanding less nodes than a less informed one is intuitive, but this does not mean that it is more efficient

–

Computational complexity to compute the heuristics may

increase, limiting efficiency. In any case, the number of

expanded nodes is one of the factors affecting efficiency

(16)

Example

●

Back to the 8-puzzle

●

A more articulate evaluation function than the previously proposed one (w(n) only)

–

f(n) = d(n) + w(n)

●

Where d(n) is the depth of n and w(n) is the number of misplaced tiles

–

For the start node it is f(s) = 0 + 4 = 4

●

Warning!

–

f(n) = d(n) yields the breadth-first process

2 8 3 1 6 4 7 5

8 2 3 1

6 4 7 5

2 8 3 1 6 4 7 5 2 8 3 1 6 4 7 5 2 8 3 1 6 4 7 5 8 2 3 1

6 4 7 5

2 8 3 1 6 4 7 5 2 8 3

1 6 4 7 5

2 8 3 6 1 4 7 5

2 8 3 1 4 6

7 5 2 8 3 1 6 4 7 5 2 8 3

6 1 4 7 5 2 8 3

1 6 4 7 5

2 8 3 1

6 4 7 5 4

6 4

6 2

4 5

6 6

5 5

6 7 5 7

5 5 7

Number in green  order of nodes expansion

Evaluation of f 1 Start node

Goal node 3

●

h = 0  ensures admissibility but leads to breadth- first, usually inefficient

●

h = highest possible value among the lower bounds on h*  expands the smallest number of nodes consistent with admissibility

●

Admissibility can be traded for more heuristic power, using h which is not a lower bound on h*

–

In the 8-puzzle, instead of w(n), consider the Manhattan distance P(n)

●

P(n) = 

_i

d

_i

where d

_i

= distance of tile i from its final position

–

I.e., sum of distances of tiles from the final configuration

2 8 3 4 1 6 7 5

2 8 3 4 1 6 7 5 4 3 8

7 5

57 57 59

3 4 1

2 8 1 3 6 4 7 5 59

2 8 3 4 1 6 7 5

59 59

2 8

9 1 6 4 7 2

2 8 3 1 6 4 7 5

57 57

2 8 3 1 6 4 7 5

59 57

2 8 3 1 6 4 7 5

59 58 2

1 6 8 3 4 7 5

2 8 3 4 1 6

7 5

59 62

2 8 3 6 1 4 7 5

2 8 3 1 6 4 7 5

55 57

2 8 3 1 6 7 5 4 55

10

5 7 6

8 2 1 6

3 5

Example: 8-puzzle

●

A still better estimation is

●

h(n) = P(n) + 3 S(n)

–

Where S(n) is a sequence score obtained by checking the tiles around non-central quadrants

●

1 a tile in the middle

●

2 each tile not followed by its correct successor

●

0 otherwise

–

Not a lower bound on h*, but

–

Allows a solution path in 18 steps albeit not ensuring optimality at all

●

Another factor affecting the heuristic power of the search algorithm is the effort spent to compute f

–

The best function would seem to be equal to h*

●

Minimum number of expansions

–

Actually, g is essential to guarantee that (at least) one path is found, but it is convenient to reduce it with respect to h

●

E.g., using

–

f = g + wh with w a positive number

●

So, the factors affecting the heuristic power of A* are:

●

Cost of the path

●

Number of expanded nodes in finding the path

●

Computational effort needed to compute h

(17)

(Generic graph-search algorithms) (Best-first search)

A*

(Uniform cost) = 0 = depth (Breadth first)

≤ h Depth first (LIFO ordering)

f*

h*

h f

= g* + h*

Performance Measures

●

Penetrance

●

P = L / T

–

L length of the solution path found

–

T total number of nodes generated during the search

●

Including the final node, excluding the start node

●

Gives an idea of how focused the search is towards an objective

–

Maximum value = 1

●

Reached if successor operator is precise

–

Small values for blind search

●

Measure how ‘elongated’ or ‘bushy’ the generated tree is

Performance Measures

●

Example: 8-puzzle

●

F = g (breadth-first) f = g + w(n) f = g + P(n) + 3 S(n)

●

5/46 5/13 18/43

●

0.108 0.385 0.419

Performance Measures

●

Actual branching value

●

B

–

Constant number of successors each node should have in an imaginary tree of depth equal to the length of the path and with a total number of nodes equal to the number of nodes generated during search

●

Depends on the length of path L and on the number of generated nodes T

–

B + B

²

+ ... B

^L

= T

–

B close to 1  search focused towards the goal

–

Bushy graph  large B

●

Penetrance can be expressed in terms of B

–

P = L/T = L (B – 1) / B (B

^L

– 1) B

B - 1 (B

^L

Laurea Laurea MAGISTRALE in MAGISTRALE in COMPUTER SCIENCECOMPUTER SCIENCE

Laurea

Laurea MAGISTRALE in MAGISTRALE in COMPUTER SCIENCE COMPUTER SCIENCE

Corso di

ARTIFICIAL INTELLIGENCE Stefano Ferilli

Search / 1 Search / 1

Irrevocable/tentative strategies - Blind/informed Irrevocable/tentative strategies - Blind/informed

search - U

search - Use of heuristics: A*, admissibility – se of heuristics: A*, admissibility – Performance measures

Performance measures

Problem State Space

Representing the problem state space allows

Formally defining the problem in terms of initial situation to be turned into a goal situation

Defining the solving process as a combination of

Pre-defined techniques

E.g., an operator/rule defining a step in the process

Search techniques

Exploration of the state space searching for a path – the “best”

path to attain the goal

Problem of finding a sequence of operations that turn one database into another

= finding a path in a directed graph

Nodes = problem states

Arcs = operators/rules whose application allows to go from a state to another

A general way to build the interpreter that acts as an inference engine :

endowing it with a generalized graph-search

By building a search tree, allows it to explore the problem space (database of configurations) looking for a path to reach the goal configuration

Modeling the Solution Process

Need to consider

Representation of each node in the search process

Selection of applicable operators

Topology of the search process

and direction in which carrying out the search

Possible use of (local- or global-level) knowledge to guide the search

When specifying the set of operators several aspects must be taken into account

E.g., in the case of rules:

The implicit assumptions in the informal description of the problem

The generality of the rules

The possibility of using in the rules previous knowledge in order to reduce the problem complexity

8-puzzle

Problem

Starting from any initial layout of 8 numbered tiles in a frame, place the tiles by increasing order along the frame border

E.g.: Initial state Final State

How to describe it?

Does the description affect the computational effort?

2 8 3

1 6 4

7 5

1 8

7 6

2

5

4

3

8-puzzle

Representing states is quite obvious

E.g., square matrices of characters

Possibly linearized

How to represent moves?

Problem space (set of all possible configurations):

9! = 362.880 configurations

Using a 3x3 matrix to represent the problem state, this will be the item in the global database

To model the problem all possible configurations created starting from the initial configuration by applying all possible moves must be described

2 8 3 1 6 4 7 5

8 2 3 1

6 4 7 5 2 8 3 1 6 4 7 5 2 8 3 1 6 4 7 5 2 8 3 1 6 4 7 5 2 8 3 1

6 4 7 5 2 8 3 6 7 1 4 5

2 8 3 1 6 4 7 5 2 8 3

1 6 4 7 5

2 8 3 1 6 4 7 5

2 8 3 1 6 4 7 5 2 8 3 1 6 4 7 5

2 8 3 1 6 4

7 5

2 8 3 1 6 4

7 5 2 8 3

6 1 4 7 5 2 8 3

1 6 4 7 5 2 8 3 6 1 4 7 5

2 8 3 6 7 1 4 5

2 8 3 1 6 4 7 5

2 8 1 3

6 4 7 5

search - Use of heuristics: A, admissibility – se of heuristics: A, admissibility – Performance measures