Sistemi per il Governo dei Robot

(1)

Sistemi per il Governo dei Robot

Silvia Rossi - Lezione 5

(2)

Planning graph

Directed, leveled graph with alternating layers of nodes

Odd layers represent candidate

propositions that might hold at time I

Even layers represent candidate actions that we might possibly execute at time I, including maintain actions

We can only execute one real action at any step, but the data structure keeps track of

(3)

Basic idea

Construct a graph that encodes constraints on possible plans

Use this “planning graph” to constrain search for a valid plan

Planning graph can be built for each

problem in relatively short time

(4)

Problem handled by GraphPlan

STRIPS operators: conjunctive

preconditions, no conditional or universal effects, no negations

Finds “shortest” plans (by some definition)

Sound, complete and will terminate with

failure if there is no plan.

(5)

Planning graph

Nodes divided into “levels”, arcs go from one level to the next

For each time period, a state level and an action level

Arcs represent preconditions, adds

and deletes

(6)

What actions and what literals?

Add an action in level Ai if all its

preconditions are present in level Pi

Add a precondition in level Pi if it is the effect of some action in level Ai-1

(including no-ops – frame problem)

Level P0 has all the literals from the initial

state

(7)

Simple NASA domain

Literals:

at X L X is at location L fuel R rocket has fuel in X R X is in the rocket

Actions:

load X L load X onto rocket at location L (X and R must be at L)

unload X L unload X from rocket at location L (R must be at L, X must be in L)

move L L’ move rocket from L to L’

(R must be at L and have fuel; moving uses up the fuel)

Graph representation:

(8)

Example planning graph

AT A L AT B L AT R L FUEL R

LOAD A L LOAD B L MOVE L P

IN A R IN B R FUEL R

AT A L AT B L

AT R L

AT R P

MOVE P L

IN A R IN B R AT R P

UNLOAD A P UNLOAD B P

AT A P AT B P

(9)

Exclusion relations (mutexes)

Two actions (or literals) are mutually exclusive at some stage if no valid plan could contain

both.

Can quickly find and mark some mutexes:

Inconsistent Effects: one action negates an effect of the other

Interference: One of the effects of one action is the negation of the precondition of the other

Competing needs: If some precond of one action is mutex with a precond of the other action

Inconsistent support: Two propositions are mutex if all

(10)

Example planning graph

AT A L AT B L

AT R L

AT R P

MOVE P L

AT A P AT B P NOP

NOP NOP NOP

NOPNOP NOP NOP

NOP NOP NOP INCONSISTENT EFFECTS

MOVE L P

MOVE P L

(11)

Example planning graph

PROPS ACTIONS PROPS ACTIONS PROPS ACTIONS GOALS

AT A L AT B L

AT R L

AT R P

MOVE P L

AT A P AT B P NOP

NOP NOP NOP

NOPNOP NOP NOP

NOP NOP NOP

(12)

A slightly different planning graph…

AT A L AT B P AT R L FUEL R

LOAD A L

MOVE L P

IN A R FUEL R

AT A L AT B P AT R L

AT R P

LOAD A L LOAD B P MOVE L P

MOVE P L

AT A P AT B P NOP

NOP NOP NOP

NOPNOP NOP

NOP NOP NOP MOVE L P

NOP

AT R L

AT R P

LOAD A L LOAD B P

NOP

COMPETING NEEDS

INTERFERENCE

INCONSISTENT SUPPORT

(13)

Mutex: Summary

Actions A,B are mutex at level i if no valid plan could possibly contain both at i:

inconsistent effects: A’s effects negate one of B’s effects

interference: A’s effects delete one of B’s preconditions, or vice- versa

competing needs: A & B have mutex preconditions

Propositions P,Q are mutex at level i if no valid plan could possibly contain both at i

inconsistent support: If at i, all ways to achieve P exclude all ways to achieve Q

(14)

GraphPlan algorithm (without termination)

Grow the planning graph (PG) until all goals are reachable and not mutex. (If PG levels off first, fail)

Search the PG for a valid plan

If none found, add a level to the PG and try

again

(15)

Valid plans

A valid plan is a planning graph where:

Actions at the same level don’t interfere (delete each other’s preconditions or add effects)

Each action’s preconditions are made true by the plan

Goals are satisfied

(16)

Extending the planning graph

Action level: For each possible instantiation of each operator (including no-ops), add if all of its preconditions are in the previous level and no

two are mutex.

Proposition level: Add all effects of actions in previous level (including no-ops).

! Mark pairs of propositions mutex if all ways to create one are mutex of all ways to create the

other

(17)

Creating the planning graph is usually fast Theorem 1:

! The size of the t-level PG and the time to create it are polynomial in t, n (= number of objects),

! m (= number of operators) and p (=

propositions in the initial state)

(18)

Searching for a plan

Backward chain on the planning graph

Complete all goals at one level before going back:

At each level, pick a subset of actions to achieve the goals that are not mutex. Their preconditions become the goals at the earlier level.

Build subset by picking each goal and choosing an action to add.

Use one already selected if possible.

Do forward checking on remaining goals.

(19)

SATplan

Formulate the planning problem as a CSP Assume that the plan has k actions

Create a binary variable for each possible action a:

Action(a,i) (TRUE if action a is used at step i)

Create variables for each proposition that can hold at different points in time:

Proposition(p,i) (TRUE if proposition p holds at step i)

(20)

Constraints

Only one action can be executed at each time step (XOR constraints)

Constraints describing effects of actions

Persistence: if an action does not change a proposition p, then p’s value remains

unchanged

A proposition is true at step i only if some

action (possibly a maintain action) made it true

Constraints for initial state and goal state

(21)

Planning as CSP

Constraint-satisfaction problem (CSP)

^Given:

set of discrete variables,

domains of the variables, and

constraints on the specific values a set of variables can take in combination,

Find an assignment of values to all the variables which respects all constraints

Compile the planning problem as a constraint-satisfaction problem (CSP)

Use the planning graph to define a CSP

(22)

Representing the Planning Graph as a CSP

(23)

Summary: classical planning

Representations in planning Representation of action:

preconditions + effects Approaches:

State-space search Plan-space search

Constraint-based search

(24)

(25)

Real-world planning domains

Real-world domains are complex and don’t satisfy the assumptions of STRIPS or partial-order planning methods

Some of the characteristics we may need to deal with:

Modeling and reasoning about resources Representing and reasoning about time

Planning at different levels of abstractions Conditional outcomes of actions

Uncertain outcomes of actions Exogenous events

Incremental plan development Dynamic real-time replanning

(26)

Planning vs. scheduling

Planning: given one or more goal, generate a sequence of actions to achieve the goal(s)

Scheduling: given a set of actions and constraints, allocate resources and assign times to the actions so that no constraints are violated

Traditionally, planning is done with specialized logical reasoning methods

Traditionally, scheduling is done with constraint satisfaction, linear programming, or OR methods

However, planning and scheduling are closely interrelated and can’t always be separated

(27)

Hierarchical Planning Brief History

Originally developed about 25 years ago Knowledge-based  Scalable

Task Hierarchy is a form of domain-specific knowledge

Practical, applied to real world problems (linear time vs. exponential)

Lack of theoretical understanding until

early 1990’s

(28)

Hierarchical Decomposition

(29)

HTN Planning

Capture hierarchical structure of the planning domain

Planning domain contains non-primitive actions and schemas for reducing them

Reduction schemas:

given by the designer

express preferred ways to accomplish a

task

(30)

Hierarchical decomposition

Hierarchical decomposition, or hierarchical task

network (HTN) planning, uses abstract operators to incrementally decompose a planning problem from a high-level goal statement to a primitive plan network

Primitive operators represent actions that are executable, and can appear in the final plan

Non-primitive operators represent goals

(equivalently, abstract actions) that require further decomposition (or operationalization) to be executed

There is no “right” set of primitive actions: One agent’s goals are another agent’s actions!

(31)

HTN planning: example

(32)

Hierarchical Decomposition

FROM A HIGH-LEVEL ACTION TO A

PARTIALLY ORDERED SET OF LOWER-

LEVEL ACTIONS.

(33)

HTN Planning

A plan library contains both primitive and non-primitive actions.

Non-primitive actions have external

preconditions, as well as external effects.

Sometimes useful to distinguish

between primary effects and secondary

effects.

(34)

HTN Formalization (1)

State: list of ground atoms Tasks:

Primitive tasks: do[f(x₁, …, x_n)]

Non-primitive tasks:

Goal task: achieve(l) (l is a literal) Compound task: perform[t(x₁, …, x_n)]

Plan Library

Operator:

[operator f(x₁, …, x_n) (pre: l₁, …, l_n) (post: l’₁, …, l’_n)]

Method: Decompose(α, d)

α is a non-primitive task and d is a task network

(35)

Task Reduction

(36)

Example action descriptions

Action(BuyLand, PRECOND:money, EFFECT: Land ⋀ ⌉Money)

Action(GetLoan, PRECOND: GoodCredit, EFFECT: Money ⋀ Mortage) Action(BuildHouse, PRECOND:Land, EFFECT: House)

Action(GetPermit, PRECOND: Land, EFFECT: Permit) Action(HireBuilder, EFFECT: Contract)

Action(Construction, PRECOND: Permit ⋀ Contract, EFFECT: HouseBuild ⋀ ⌉Permit)

(37)

Example action descriptions

Decompose(BuildHouse,

Plan(STEPS:{S1: GetPermit, S2: HireBuilder, S3: Construction, S4: PayBuilder}

ORDERINGS: {Start ≺ S1 ≺ S3 ≺ S4 ≺ Finish, Start ≺ S2 ≺ S3,

LINKS: {Start Land S1, Start Money S4, S1 Permit S3,

S2 Contract S3, S3 HouseBuilt S4,

(38)

Correctness

• A decomposition should be a correct implementation of the action.

• A plan d implements an action a correctly if d is a complete and consistent partial-order plan for the problem of achieving the effects of a given the preconditions of a (result of a sound POP).

• The plan library contains several decompositions for any high- level action.

• Each decomposition might have different preconditions and

effects. The preconditions of the high-level action should be the

(39)

HTN Planning Algorithm (intuition)

Problem reduction:

Decompose tasks into subtasks Handle constraints

Resolve interactions

If necessary, backtrack and try other

decompositions

(40)

Basic HTN Procedure

1. Input a planning problem P

2. If P contains only primitive tasks, then

resolve the conflicts and return the result. If the conflicts cannot be resolved, return

failure

3. Choose a non-primitive task t in P 4. Choose an expansion for t

5. Replace t with the expansion

6. Find interactions among tasks in P and

suggest ways to handle them. Choose one.

7. Go to 2

(41)

For each decomposition d of an action a

Remove the high level action, and insert/

reuse

actions for each action in d Reuse -> subtask sharing

Merge the ordering constraints (If there is an ordering constraint of the form B ≺ a,

should every step of d come after B?

(42)

Comments on HTN planning

The major idea is to gain efficiency by using the library of preconstructed plans.

When there is recursion, it is undecidable even if the underlying state space is finite.

recursion can be ruled out

the length of solutions can be bound

(43)

SHOP (Simple Hierarchical Ordered Planner) Domain-independent algorithm for

Ordered Task Decomposition

Sound/complete

Input:

State: a set of ground atoms Task List: a linear list of tasks

Domain: methods, operators, axioms

Output: one or more plans, it can return:

the first plan it finds all possible plans

a least-cost plan

all least-cost plans

(44)

Initial task list: ((travel home park))

Initial state: ((at home) (cash 20) (distance home park 8)) Methods (task, preconditions, subtasks):

(:method (travel ?x ?y)

((at x) (walking-distance ?x ?y)) ' ((!walk ?x ?y)) 1) (:method (travel ?x ?y)

((at ?x) (have-taxi-fare ?x ?y))

' ((!call-taxi ?x) (!ride ?x ?y) (!pay-driver ?x ?y)) 1) Axioms:

(:- (walking-dist ?x ?y) ((distance ?x ?y ?d) (eval (<= ?d 5)))) (:- (have-taxi-fare ?x ?y)

((have-cash ?c) (distance ?x ?y ?d) (eval (>= ?c (+ 1.50 ?d)))) Primitive operators (task, delete list, add list)

(:operator (!walk ?x ?y) ((at ?x)) ((at ?y)))

…

Simple Example

OPTIONAL COST;

DEFAULT IS 1

(45)

Precond: Precond:

(TRAVEL HOME PARK)

(!WALK HOME PARK)

(!CALL-TAXI HOME)(!RIDE HOME PARK)(!PAY-DRIVER HOME PARK)

FAIL (DISTANCE > 5)

SUCCEED (WE HAVE $20, AND THE FARE IS ONLY

$9.50) SUCCEED

SUCCEED (AT HOME) (walking-distance

Home park)

(have-taxi-fare home park)

(AT PARK) (CASH 10.50)

(DISTANCE HOME

Simple Example (Continued)

(AT HOME)

INITIAL STATE:

FINAL STATE:

(AT HOME) (CASH 20)

(DISTANCE HOME PARK 8)

(46)

The SHOP Algorithm

procedure SHOP (state S, task-list T, domain D) 1. if T = nil then return nil

2. t₁ = the first task in T

3. U = the remaining tasks in T

4. if t is primitive & an operator instance o matches t₁ then 5. P = SHOP (o(S), U, D)

6. if P = FAIL then return FAIL 7. return cons(o,P)

8. else if t is non-primitive

& a method instance m matches t₁ in S

& m’s preconditions can be inferred from S then 9. return SHOP (S, append (m(t₁), U), D)

10. else

11. return FAIL

12. end if

state S; task list T=( t₁ ,t₂,…) state o(S) ; task list T=(t₂, …)

task list T=( t₁ ,t₂,…) NONDETERMINIST IC CHOICE AMONG

ALL METHODS M WHOSE

PRECONDITIONS CAN BE INFERRED

FROM S

(47)

Pianificare e agire in domini non deterministici

In questo contesto l’ indeterminismo è dovuto a conoscenza incompleta o non corretta

Due situazioni di indeterminismo:

Indeterminismo delimitato, e indeterminismo senza limitazioni

(48)

Bounded Indeterminancy

Le azioni possono avere effetti impredicibili

ma in ogni caso compresi in un insieme noto e finito di casi.

Due modi per pianificare in presenza di bounded determinacy:

Conformant Planning ovvero pianificazione senza fase di percezione (sensorless planning) e

Conditional Planning (contingency planning).

(49)

Sensorless Planning

L’ algoritmo di pianificazione deve assicurare che il piano fornisca una soluzione al

problema in presenza anche in condizioni in cui non è possibile usare sensori.

Si basa sulla coercizione, cioè si assume che il mondo possa essere costretto ad

andare in un determinato stato, anche

quando si hanno informazioni parziali sullo stato corrente. Poiché la coercizione non è spesso possibile questo algoritmo non è

(50)

Sensorless Planning esempio

L’ uso della coercizione è suggerito, quando non è possibile null’altro ed è molto pericolosa l’ inattività.

Esempi si hanno in relazione alla medicina di urgenza, esempio : un paziente viene

ricoverato con febbre altissima, gli si da un

antibiotico a largo spettro e un medicinale

per abbassare la febbre prima di decidere,

in base al risultato di ulteriori accertamenti,

(51)

Sensorless Planning pianificazione

Il processo di ricerca del piano coinvolge lo “belief state space” in cui ogni stato è di fatto un insieme di stati.

Ricordate il problema dell’ aspirapolvere: esiste un piano nel caso “sensorless” e nel caso in cui vale la legge: Se

l’aspirapolvere aspira in uno stato in cui non c’è nulla da aspirare lo sporca.

(52)

(53)

Conditional Planning 1

Due situazioni possibili:

L’ambiente è completamente “osservabile” (e l’agente ovviamente può osservarlo),

L’ambiente è parzialmente “osservabile”.

In entrambi i casi va esteso il linguaggio con il quale si esprime il piano.

(54)

Conditional planning in ambiente totalmente osservabile

Estensione di Strips per introdurre il non determinismo nelle azioni.:

“or” fra due o più effetti,

cioè eseguire A ci porta in uno stato a priori non determinato sappiamo quello che

potremo conoscere solo usando la percezione.

Per esempio, nel problema dell’aspirapolvere che non funzioni correttamente:

(55)

Conditional planning in ambiente totalmente osservabile 1

Costrutto per rappresentare

“ConditionalEffects”

when“condition” : “effect”

Questo costrutto non introduce

indeterminismo, il risultato infatti è univoco in dipendenza dallo stato in cui l’ azione è

applicata

richiede una fase di percezione per testare se

“condition” è vera

(56)

Conditional planning in ambiente totalmente osservabile 3

Estensione del linguaggio per esprimere piani condizionali:

If “test” then “plan_A” else “plan_B”

Dove test è una funzione booleana sullo stato.

Sia plan_A che plan_B possono contenere

steps condizionali (ogni piano diviene un

(57)

Conditional planning

Il conditional planning è un algoritmo che affronta l’incertezza dell’informazione

introducendo nel piano fasi di “percezione” cioè fasi in cui, a execution time si interroga il mondo sul suo stato.

Queste “fasi di percezione” sono i “conditional steps ” cioè i punti in cui una particolare azione viene scelta in conseguenza del reale stato dell’

ambiente.

(58)

Algoritmo di pianificazione

Il conditionalplan deve funzionare

comunque indipendentemente da quale sarà l’ effetto reale dell’ esecuzione dell’

azione ovvero i risultati dei test

Spazio di ricerca a grafo “and” “or” in cui i nodi “or” e “and” si alternano, i nodi “or”

modellano lo stato e gli archi da essi

uscenti sono labellati con le azione e i nodi

“and” hanno come figli i possibili effetti

(59)

Rappresentazione grafica step condizionale

Esempio: "double Murphy"

vacuum cleaner sometimes deposits dirt when it moves to a clean destination square

and sometimes deposits dirt if Suck is applied to a clean square.

(60)

[Left, if AtL and CleanL and CleanR then []

else Suck].

(61)

Passi dell’ algoritmo

IL FALLIMENTO SIGNIFICA: NON C’È PIANO PRIVO DI CICLI.

(62)

Esempio piano ciclico e sue possibili descrizioni

Grafo parziale