Sistemi per il Governo dei Robot
Silvia Rossi - Lezione 5
Planning graph
Directed, leveled graph with alternating layers of nodes
Odd layers represent candidate
propositions that might hold at time I
Even layers represent candidate actions that we might possibly execute at time I, including maintain actions
We can only execute one real action at any step, but the data structure keeps track of
Basic idea
Construct a graph that encodes constraints on possible plans
Use this “planning graph” to constrain search for a valid plan
Planning graph can be built for each
problem in relatively short time
Problem handled by GraphPlan
STRIPS operators: conjunctive
preconditions, no conditional or universal effects, no negations
Finds “shortest” plans (by some definition)
Sound, complete and will terminate with
failure if there is no plan.
Planning graph
Nodes divided into “levels”, arcs go from one level to the next
For each time period, a state level and an action level
Arcs represent preconditions, adds
and deletes
What actions and what literals?
Add an action in level Ai if all its
preconditions are present in level Pi
Add a precondition in level Pi if it is the effect of some action in level Ai-1
(including no-ops – frame problem)
Level P0 has all the literals from the initial
state
Simple NASA domain
Literals:
at X L X is at location L fuel R rocket has fuel in X R X is in the rocket
Actions:
load X L load X onto rocket at location L (X and R must be at L)
unload X L unload X from rocket at location L (R must be at L, X must be in L)
move L L’ move rocket from L to L’
(R must be at L and have fuel; moving uses up the fuel)
Graph representation:
Example planning graph
AT A L AT B L AT R L FUEL R
LOAD A L LOAD B L MOVE L P
IN A R IN B R FUEL R
AT A L AT B L
AT R L
AT R P
LOAD A L LOAD B L MOVE L P
MOVE P L
AT A L AT B L AT R L FUEL R
IN A R IN B R AT R P
UNLOAD A P UNLOAD B P
AT A P AT B P
Exclusion relations (mutexes)
Two actions (or literals) are mutually exclusive at some stage if no valid plan could contain
both.
Can quickly find and mark some mutexes:
Inconsistent Effects: one action negates an effect of the other
Interference: One of the effects of one action is the negation of the precondition of the other
Competing needs: If some precond of one action is mutex with a precond of the other action
Inconsistent support: Two propositions are mutex if all
Example planning graph
AT A L AT B L AT R L FUEL R
LOAD A L LOAD B L MOVE L P
IN A R IN B R FUEL R
AT A L AT B L
AT R L
AT R P
LOAD A L LOAD B L MOVE L P
MOVE P L
AT A L AT B L AT R L FUEL R
IN A R IN B R AT R P
UNLOAD A P UNLOAD B P
AT A P AT B P NOP
NOP NOP NOP
NOPNOP NOP NOP
NOP NOP NOP INCONSISTENT EFFECTS
MOVE L P
MOVE P L
Example planning graph
PROPS ACTIONS PROPS ACTIONS PROPS ACTIONS GOALS
AT A L AT B L AT R L FUEL R
LOAD A L LOAD B L MOVE L P
IN A R IN B R FUEL R
AT A L AT B L
AT R L
AT R P
LOAD A L LOAD B L MOVE L P
MOVE P L
AT A L AT B L AT R L FUEL R
IN A R IN B R AT R P
UNLOAD A P UNLOAD B P
AT A P AT B P NOP
NOP NOP NOP
NOPNOP NOP NOP
NOP NOP NOP
A slightly different planning graph…
AT A L AT B P AT R L FUEL R
LOAD A L
MOVE L P
IN A R FUEL R
AT A L AT B P AT R L
AT R P
LOAD A L LOAD B P MOVE L P
MOVE P L
AT A L AT B L AT R L FUEL R
IN A R IN B R AT R P
UNLOAD A P UNLOAD B P
AT A P AT B P NOP
NOP NOP NOP
NOPNOP NOP
NOP NOP NOP MOVE L P
NOP
AT R L
AT R P
LOAD A L LOAD B P
NOP
COMPETING NEEDS
INTERFERENCE
INCONSISTENT SUPPORT
Mutex: Summary
Actions A,B are mutex at level i if no valid plan could possibly contain both at i:
inconsistent effects: A’s effects negate one of B’s effects
interference: A’s effects delete one of B’s preconditions, or vice- versa
competing needs: A & B have mutex preconditions
Propositions P,Q are mutex at level i if no valid plan could possibly contain both at i
inconsistent support: If at i, all ways to achieve P exclude all ways to achieve Q
GraphPlan algorithm (without termination)
Grow the planning graph (PG) until all goals are reachable and not mutex. (If PG levels off first, fail)
Search the PG for a valid plan
If none found, add a level to the PG and try
again
Valid plans
A valid plan is a planning graph where:
Actions at the same level don’t interfere (delete each other’s preconditions or add effects)
Each action’s preconditions are made true by the plan
Goals are satisfied
Extending the planning graph
Action level: For each possible instantiation of each operator (including no-ops), add if all of its preconditions are in the previous level and no
two are mutex.
Proposition level: Add all effects of actions in previous level (including no-ops).
! Mark pairs of propositions mutex if all ways to create one are mutex of all ways to create the
other
Creating the planning graph is usually fast Theorem 1:
! The size of the t-level PG and the time to create it are polynomial in t, n (= number of objects),
! m (= number of operators) and p (=
propositions in the initial state)
Searching for a plan
Backward chain on the planning graph
Complete all goals at one level before going back:
At each level, pick a subset of actions to achieve the goals that are not mutex. Their preconditions become the goals at the earlier level.
Build subset by picking each goal and choosing an action to add.
Use one already selected if possible.
Do forward checking on remaining goals.
SATplan
Formulate the planning problem as a CSP Assume that the plan has k actions
Create a binary variable for each possible action a:
Action(a,i) (TRUE if action a is used at step i)
Create variables for each proposition that can hold at different points in time:
Proposition(p,i) (TRUE if proposition p holds at step i)
Constraints
Only one action can be executed at each time step (XOR constraints)
Constraints describing effects of actions
Persistence: if an action does not change a proposition p, then p’s value remains
unchanged
A proposition is true at step i only if some
action (possibly a maintain action) made it true
Constraints for initial state and goal state
Planning as CSP
Constraint-satisfaction problem (CSP)
Given:set of discrete variables,
domains of the variables, and
constraints on the specific values a set of variables can take in combination,
Find an assignment of values to all the variables which respects all constraints
Compile the planning problem as a constraint-satisfaction problem (CSP)
Use the planning graph to define a CSP
Representing the Planning Graph as a CSP
Summary: classical planning
Representations in planning Representation of action:
preconditions + effects Approaches:
State-space search Plan-space search
Constraint-based search
Real-world planning domains
Real-world domains are complex and don’t satisfy the assumptions of STRIPS or partial-order planning methods
Some of the characteristics we may need to deal with:
Modeling and reasoning about resources Representing and reasoning about time
Planning at different levels of abstractions Conditional outcomes of actions
Uncertain outcomes of actions Exogenous events
Incremental plan development Dynamic real-time replanning
Planning vs. scheduling
Planning: given one or more goal, generate a sequence of actions to achieve the goal(s)
Scheduling: given a set of actions and constraints, allocate resources and assign times to the actions so that no constraints are violated
Traditionally, planning is done with specialized logical reasoning methods
Traditionally, scheduling is done with constraint satisfaction, linear programming, or OR methods
However, planning and scheduling are closely interrelated and can’t always be separated
Hierarchical Planning Brief History
Originally developed about 25 years ago Knowledge-based Scalable
Task Hierarchy is a form of domain-specific knowledge
Practical, applied to real world problems (linear time vs. exponential)
Lack of theoretical understanding until
early 1990’s
Hierarchical Decomposition
HTN Planning
Capture hierarchical structure of the planning domain
Planning domain contains non-primitive actions and schemas for reducing them
Reduction schemas:
given by the designer
express preferred ways to accomplish a
task
Hierarchical decomposition
Hierarchical decomposition, or hierarchical task
network (HTN) planning, uses abstract operators to incrementally decompose a planning problem from a high-level goal statement to a primitive plan network
Primitive operators represent actions that are executable, and can appear in the final plan
Non-primitive operators represent goals
(equivalently, abstract actions) that require further decomposition (or operationalization) to be executed
There is no “right” set of primitive actions: One agent’s goals are another agent’s actions!
HTN planning: example
Hierarchical Decomposition
FROM A HIGH-LEVEL ACTION TO A
PARTIALLY ORDERED SET OF LOWER-
LEVEL ACTIONS.
HTN Planning
A plan library contains both primitive and non-primitive actions.
Non-primitive actions have external
preconditions, as well as external effects.
Sometimes useful to distinguish
between primary effects and secondary
effects.
HTN Formalization (1)
State: list of ground atoms Tasks:
Primitive tasks: do[f(x1, …, xn)]
Non-primitive tasks:
Goal task: achieve(l) (l is a literal) Compound task: perform[t(x1, …, xn)]
Plan Library
Operator:
[operator f(x1, …, xn) (pre: l1, …, ln) (post: l’1, …, l’n)]
Method: Decompose(α, d)
α is a non-primitive task and d is a task network
Task Reduction
Example action descriptions
Action(BuyLand, PRECOND:money, EFFECT: Land ⋀ ⌉Money)
Action(GetLoan, PRECOND: GoodCredit, EFFECT: Money ⋀ Mortage) Action(BuildHouse, PRECOND:Land, EFFECT: House)
Action(GetPermit, PRECOND: Land, EFFECT: Permit) Action(HireBuilder, EFFECT: Contract)
Action(Construction, PRECOND: Permit ⋀ Contract, EFFECT: HouseBuild ⋀ ⌉Permit)
Example action descriptions
Decompose(BuildHouse,
Plan(STEPS:{S1: GetPermit, S2: HireBuilder, S3: Construction, S4: PayBuilder}
ORDERINGS: {Start ≺ S1 ≺ S3 ≺ S4 ≺ Finish, Start ≺ S2 ≺ S3,
LINKS: {Start Land S1, Start Money S4, S1 Permit S3,
S2 Contract S3, S3 HouseBuilt S4,
Correctness
• A decomposition should be a correct implementation of the action.
• A plan d implements an action a correctly if d is a complete and consistent partial-order plan for the problem of achieving the effects of a given the preconditions of a (result of a sound POP).
• The plan library contains several decompositions for any high- level action.
• Each decomposition might have different preconditions and
effects. The preconditions of the high-level action should be the
HTN Planning Algorithm (intuition)
Problem reduction:
Decompose tasks into subtasks Handle constraints
Resolve interactions
If necessary, backtrack and try other
decompositions
Basic HTN Procedure
1. Input a planning problem P
2. If P contains only primitive tasks, then
resolve the conflicts and return the result. If the conflicts cannot be resolved, return
failure
3. Choose a non-primitive task t in P 4. Choose an expansion for t
5. Replace t with the expansion
6. Find interactions among tasks in P and
suggest ways to handle them. Choose one.
7. Go to 2
For each decomposition d of an action a
Remove the high level action, and insert/
reuse
actions for each action in d Reuse -> subtask sharing
Merge the ordering constraints (If there is an ordering constraint of the form B ≺ a,
should every step of d come after B?
Comments on HTN planning
The major idea is to gain efficiency by using the library of preconstructed plans.
When there is recursion, it is undecidable even if the underlying state space is finite.
recursion can be ruled out
the length of solutions can be bound
SHOP (Simple Hierarchical Ordered Planner) Domain-independent algorithm for
Ordered Task Decomposition
Sound/complete
Input:
State: a set of ground atoms Task List: a linear list of tasks
Domain: methods, operators, axioms
Output: one or more plans, it can return:
the first plan it finds all possible plans
a least-cost plan
all least-cost plans
Initial task list: ((travel home park))
Initial state: ((at home) (cash 20) (distance home park 8)) Methods (task, preconditions, subtasks):
(:method (travel ?x ?y)
((at x) (walking-distance ?x ?y)) ' ((!walk ?x ?y)) 1) (:method (travel ?x ?y)
((at ?x) (have-taxi-fare ?x ?y))
' ((!call-taxi ?x) (!ride ?x ?y) (!pay-driver ?x ?y)) 1) Axioms:
(:- (walking-dist ?x ?y) ((distance ?x ?y ?d) (eval (<= ?d 5)))) (:- (have-taxi-fare ?x ?y)
((have-cash ?c) (distance ?x ?y ?d) (eval (>= ?c (+ 1.50 ?d)))) Primitive operators (task, delete list, add list)
(:operator (!walk ?x ?y) ((at ?x)) ((at ?y)))
…
Simple Example
OPTIONAL COST;
DEFAULT IS 1
Precond: Precond:
(TRAVEL HOME PARK)
(!WALK HOME PARK)
(!CALL-TAXI HOME)(!RIDE HOME PARK)(!PAY-DRIVER HOME PARK)
FAIL (DISTANCE > 5)
SUCCEED (WE HAVE $20, AND THE FARE IS ONLY
$9.50) SUCCEED
SUCCEED (AT HOME) (walking-distance
Home park)
(have-taxi-fare home park)
(AT PARK) (CASH 10.50)
(DISTANCE HOME
Simple Example (Continued)
(AT HOME)
INITIAL STATE:
FINAL STATE:
(AT HOME) (CASH 20)
(DISTANCE HOME PARK 8)
The SHOP Algorithm
procedure SHOP (state S, task-list T, domain D) 1. if T = nil then return nil
2. t1 = the first task in T
3. U = the remaining tasks in T
4. if t is primitive & an operator instance o matches t1 then 5. P = SHOP (o(S), U, D)
6. if P = FAIL then return FAIL 7. return cons(o,P)
8. else if t is non-primitive
& a method instance m matches t1 in S
& m’s preconditions can be inferred from S then 9. return SHOP (S, append (m(t1), U), D)
10. else
11. return FAIL
12. end if
state S; task list T=( t1 ,t2,…) state o(S) ; task list T=(t2, …)
task list T=( t1 ,t2,…) NONDETERMINIST IC CHOICE AMONG
ALL METHODS M WHOSE
PRECONDITIONS CAN BE INFERRED
FROM S
Pianificare e agire in domini non deterministici
In questo contesto l’ indeterminismo è dovuto a conoscenza incompleta o non corretta
Due situazioni di indeterminismo:
Indeterminismo delimitato, e indeterminismo senza limitazioni
Bounded Indeterminancy
Le azioni possono avere effetti impredicibili
ma in ogni caso compresi in un insieme noto e finito di casi.
Due modi per pianificare in presenza di bounded determinacy:
Conformant Planning ovvero pianificazione senza fase di percezione (sensorless planning) e
Conditional Planning (contingency planning).
Sensorless Planning
L’ algoritmo di pianificazione deve assicurare che il piano fornisca una soluzione al
problema in presenza anche in condizioni in cui non è possibile usare sensori.
Si basa sulla coercizione, cioè si assume che il mondo possa essere costretto ad
andare in un determinato stato, anche
quando si hanno informazioni parziali sullo stato corrente. Poiché la coercizione non è spesso possibile questo algoritmo non è
Sensorless Planning esempio
L’ uso della coercizione è suggerito, quando non è possibile null’altro ed è molto pericolosa l’ inattività.
Esempi si hanno in relazione alla medicina di urgenza, esempio : un paziente viene
ricoverato con febbre altissima, gli si da un
antibiotico a largo spettro e un medicinale
per abbassare la febbre prima di decidere,
in base al risultato di ulteriori accertamenti,
Sensorless Planning pianificazione
Il processo di ricerca del piano coinvolge lo “belief state space” in cui ogni stato è di fatto un insieme di stati.
Ricordate il problema dell’ aspirapolvere: esiste un piano nel caso “sensorless” e nel caso in cui vale la legge: Se
l’aspirapolvere aspira in uno stato in cui non c’è nulla da aspirare lo sporca.
Conditional Planning 1
Due situazioni possibili:
L’ambiente è completamente “osservabile” (e l’agente ovviamente può osservarlo),
L’ambiente è parzialmente “osservabile”.
In entrambi i casi va esteso il linguaggio con il quale si esprime il piano.
Conditional planning in ambiente totalmente osservabile
Estensione di Strips per introdurre il non determinismo nelle azioni.:
“or” fra due o più effetti,
cioè eseguire A ci porta in uno stato a priori non determinato sappiamo quello che
potremo conoscere solo usando la percezione.
Per esempio, nel problema dell’aspirapolvere che non funzioni correttamente:
Conditional planning in ambiente totalmente osservabile 1
Costrutto per rappresentare
“ConditionalEffects”
when“condition” : “effect”
Questo costrutto non introduce
indeterminismo, il risultato infatti è univoco in dipendenza dallo stato in cui l’ azione è
applicata
richiede una fase di percezione per testare se
“condition” è vera
Conditional planning in ambiente totalmente osservabile 3
Estensione del linguaggio per esprimere piani condizionali:
If “test” then “plan_A” else “plan_B”
Dove test è una funzione booleana sullo stato.
Sia plan_A che plan_B possono contenere
steps condizionali (ogni piano diviene un
Conditional planning
Il conditional planning è un algoritmo che affronta l’incertezza dell’informazione
introducendo nel piano fasi di “percezione” cioè fasi in cui, a execution time si interroga il mondo sul suo stato.
Queste “fasi di percezione” sono i “conditional steps ” cioè i punti in cui una particolare azione viene scelta in conseguenza del reale stato dell’
ambiente.
Algoritmo di pianificazione
Il conditionalplan deve funzionare
comunque indipendentemente da quale sarà l’ effetto reale dell’ esecuzione dell’
azione ovvero i risultati dei test
Spazio di ricerca a grafo “and” “or” in cui i nodi “or” e “and” si alternano, i nodi “or”
modellano lo stato e gli archi da essi
uscenti sono labellati con le azione e i nodi
“and” hanno come figli i possibili effetti
Rappresentazione grafica step condizionale
Esempio: "double Murphy"
vacuum cleaner sometimes deposits dirt when it moves to a clean destination square
and sometimes deposits dirt if Suck is applied to a clean square.
[Left, if AtL and CleanL and CleanR then []
else Suck].
Passi dell’ algoritmo
IL FALLIMENTO SIGNIFICA: NON C’È PIANO PRIVO DI CICLI.
Esempio piano ciclico e sue possibili descrizioni
Grafo parziale