Multi-Agent Planning:
Models, Algorithms, Complexity
Ronen I. Brafman
Computer Science Department Ben-Gurion University
Joint work with Raz Nissim (BGU) and Carmel Domshlak (Technion)
Part I:
Background, Model,
Complexity, and Decomposition
Part II:
Multi-Agent Distributed Search,
Privacy, and Selfish Agents
AI Planning:
Components
Input: a model (States, Actions, Initial State, Goal Condition)
Model of the set of possible states of the world
Description of the current/initial state
Model of the agent’s actions
Description of the goal/task
Output: a plan
A plan of action that achieves the goal or performs the task
AI Planning
Studies: models, languages, and algorithms for
describing and solving different types of sequential decision problems
Original motivation: allow autonomous robots to achieve goals without requiring an explicit program
“R2D2, Get me some coffee”
Well known application: NASA’s Mars Exploration Rovers […]
Can be viewed as a form of automated programming
Many different applications supporting autonomy and task automation
AI Planning
Classical AI Planning
Single agent
Actions are deterministic and instantaneous
Initial state is fully known
The goal is to reach one of a set of states satisfying a set of propositions
A plan is a sequence of actions that, when applied in the initial state, leads to one of the goal states
Running Example:
The Logistics Domain
Every state of the world describes the location of packages, trucks, and airplanes
Initial state:
package p1 is in the post-office in Brescia, p2 is in Hotel Canoa, and p3 is in Beer-Sheva
Truck in Verona airport, and a truck in Beer-Sheva
Airplane in Verona airport
Goal: p1 and p2 in Beer-Sheva, p3 in Hotel Canoa
Actions: we can load and unload packages into/from a truck or a plane, we can move trucks between locations in the same country, and we can fly planes between
airports
STRIPS [Fikes & Nilsson, 71]
A concise language for describing classical AI planning problems
Recall: (States, Actions, Initial State, Goal Condition)
States – described via a set P of propositions
Any assignment to P is a possible world
Initial state: Simply one such assignment
Goal condition: partial assignment over P
Any state satisfying it is a goal state
STRIPS
Components: (States, Actions, Initial State, Goal Condition)
Actions – described by two lists
Preconditions: list of propositions
Must be true for the action to be applied in a state
Effects: list of literals
Become true following the action’s application
Propositions not mentioned in the effects list remain unchanged
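The semantics above can be sketched in a few lines of Python (an illustrative sketch, not code from the talk; the proposition and action names are invented):

```python
from collections import namedtuple

# A state is a frozenset of true propositions; an action has a set of
# precondition propositions and effects split into add and delete sets.
Action = namedtuple("Action", ["name", "pre", "add", "delete"])

def applicable(state, action):
    # All preconditions must hold in the state.
    return action.pre <= state

def apply_action(state, action):
    # Deleted propositions become false, added ones become true;
    # everything else remains unchanged (the STRIPS assumption).
    assert applicable(state, action)
    return (state - action.delete) | action.add

# Toy instance: loading package p1 onto a truck at the post office.
load = Action("Load(p1,truck,PO)",
              pre=frozenset({"at(p1,PO)", "at(truck,PO)"}),
              add=frozenset({"in(p1,truck)"}),
              delete=frozenset({"at(p1,PO)"}))

s0 = frozenset({"at(p1,PO)", "at(truck,PO)"})
s1 = apply_action(s0, load)
```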
Logistics Domain Example
Propositions: at(p1,BresciaPO), at(p1,Canoa), at(p1,VRN), at(p1,TLV), at(p1,Beer-Sheva), in(p1,ital-truck), in(p1,isra-truck), in(p1,plane), at(ital-truck,BresciaPO), …
Initial state: True propositions:
at(p1,BresciaPO), at(p2,Canoa), at(p3,BS), at(ital-truck, VRN), at(isra-truck, BS),
at(plane, VRN)
Goal: at(p1,BS) & at(p2,BS) & at(p3, Canoa)
Logistics Domain Example
Actions:
Load(package,vehicle,location)
Preconditions: at(package,location), at(vehicle,location)
Effects: -at(package, location), in(package, vehicle)
Unload -- similar
Move(vehicle,loc1,loc2)
Preconditions: at(vehicle, loc1)
Effects: -at(vehicle,loc1), at(vehicle, loc2)
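As a concrete illustration, the Load/Unload/Move schemas can be grounded for one package, one truck, and two locations, and solved with a naive breadth-first planner (a toy sketch with invented names, not the talk's implementation):

```python
from collections import deque

def make_actions():
    # Ground Load/Unload/Move for package p1 and one truck.
    locs = ["PO", "Airport"]
    acts = []
    for l in locs:
        acts.append(("Load(p1,truck,%s)" % l,
                     {"at(p1,%s)" % l, "at(truck,%s)" % l},
                     {"in(p1,truck)"}, {"at(p1,%s)" % l}))
        acts.append(("Unload(p1,truck,%s)" % l,
                     {"in(p1,truck)", "at(truck,%s)" % l},
                     {"at(p1,%s)" % l}, {"in(p1,truck)"}))
    for a in locs:
        for b in locs:
            if a != b:
                acts.append(("Move(truck,%s,%s)" % (a, b),
                             {"at(truck,%s)" % a},
                             {"at(truck,%s)" % b}, {"at(truck,%s)" % a}))
    return acts

def bfs_plan(init, goal, actions):
    # Breadth-first search over states: returns a shortest action sequence.
    frontier = deque([(frozenset(init), [])])
    seen = {frozenset(init)}
    while frontier:
        state, plan = frontier.popleft()
        if goal <= state:
            return plan
        for name, pre, add, dele in actions:
            if pre <= state:
                nxt = (state - frozenset(dele)) | frozenset(add)
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, plan + [name]))
    return None

plan = bfs_plan({"at(p1,PO)", "at(truck,PO)"}, {"at(p1,Airport)"},
                make_actions())
```

BFS returns a shortest plan here: load at the post office, move to the airport, unload.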
Planning Ain’t Easy
STRIPS planning is hard:
PSPACE-Complete [Bylander, 1994]
NP-complete for practical purposes (i.e., plans that are not too long)
Tractability?
In special cases with special structure, under severe restrictions (e.g., single effect + more restrictions)
Can “multi-agent” structure help us?
Intuitively: enables problem decomposition along agents
Multi-Agent Planning
An extremely popular area of research
Results for queries to Google Scholar:
Automated planning: 1,620,000
Artificial intelligence planning: 1,170,000
STRIPS planning: 161,000
Classical planning: 936,000
Multi-agent planning: 1,080,000
Multi-Agent Planning
Most work focuses on complex models and task allocation, attempting to model
realistic problems
Often models communication, resources, concurrent actions, etc.
Few attempts to define a core model, like STRIPS, and to use it to study basic
computational complexity issues and
algorithms
Our Agenda
Define simplest reasonable multi-agent planning model
Model should be easily extendible/adaptable to diverse multi-agent settings (cooperative, non-cooperative, etc.)
Use it to study the theory and practice of multi-agent planning and single-agent planning by decomposition
Complexity
Algorithms
Common core related algorithms for different variants
Classes of MA Systems Studied
Agents as Abstractions
Goal: Decompose single agent
planning into a multi-agent planning problem
Example: Logistics planning. Natural agents can be trucks and planes.
Result: Provably efficient (polytime)
planning when the abstract agents
loosely interact.
Perspectives We Study
Agents as Abstractions
Agents as Distributed Components
Goal: Distributed planning for a distributed system of agents
Example: Autonomous agents seeking survivors in a disaster area
Result: Fully distributed planning
algorithms.
Perspectives We Study
Agents as Abstractions
Agents as Distributed Components
Both cases correspond to fully cooperative, possibly distributed teams of agents (real or abstract)
Focus of Part I
Perspectives We Study
Agents as Abstractions
Agents as Components
Privacy Preserving Agents with Common Goal
Goal: A group of agents seek to solve a
common goal, but they do not want to share private information
Example: Companies collaborating on a project, different departments in one company
Example: Agents with complex models that are difficult to transmit, or proprietary
Result: Fully distributed, privacy-preserving algorithms
Perspectives We Study
Agents as Abstractions
Agents as Components
Privacy Preserving Agents with Common Goal
Selfish Agents with Shared Task
Setting: group of utility maximizing agents with a common task
Example: a group of contractors required for building a house
Result: distributed, strategy-proof, socially efficient, privacy preserving mechanism
In lay terms: a distributed protocol that leads to an optimal solution, preserves privacy, and agents have no incentive to lie or deviate from it.
Also: stable coalitions of agents and algorithms that generate stable plans
Core Model: MA-STRIPS
A minimalistic extension of STRIPS
Only change: associate each action with an agent
Formally:
STRIPS: (States, Actions, Initial State, Goal Condition)
MA-STRIPS: (States, Actions1,…,Actionsm, Initial State, Goal Condition)
All variants we study are based on MA-STRIPS
Logistics Example
Agents: every truck and every plane
Truck actions: load into, unload from, move truck
Load(p?,Ital-Truck,l?), Move(Ital-Truck,l1?,l2?)
Plane actions: load into, unload from, fly plane
Notice:
move actions involve the truck only
load actions involve packages and may have implications for what other agents can or cannot do
If the truck doesn’t unload a package at the airport, the plane cannot load it
Private and Public Variables
Variable v is private to Ai iff it does not appear in the description of any action of another agent
Example: at(Ital-Truck, VRN)
All other variables are public
Example: at(p1, TLV)
Private and Public Actions
Action a ∈ Ai is private to Ai iff its description contains only private variables
Example: Move(Isra-Truck, BS, TLV)
Private actions neither affect nor are affected by actions of other agents
All other actions are public
Example: Load(p1,Ital-Truck, BresciaPO)
A public action may have a private precondition: at(Ital-Truck, BresciaPO)
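This private/public split follows directly from the action descriptions: a variable mentioned by only one agent is private, and an action mentioning only private variables is private. A hedged sketch (toy logistics fragment; all names invented for illustration):

```python
def variables_of(action):
    # An action is a (pre, add, delete) triple of sets of propositions.
    pre, add, delete = action
    return pre | add | delete

def classify(agent_actions):
    """agent_actions: dict mapping agent -> list of (pre, add, delete) sets."""
    users = {}  # variable -> set of agents whose actions mention it
    for agent, acts in agent_actions.items():
        for act in acts:
            for v in variables_of(act):
                users.setdefault(v, set()).add(agent)
    private_vars = {v for v, ag in users.items() if len(ag) == 1}
    # An action is private iff it mentions only private variables.
    public_acts = {
        agent: [a for a in acts if not variables_of(a) <= private_vars]
        for agent, acts in agent_actions.items()
    }
    return private_vars, public_acts

# Toy fragment: truck move is private; both load actions are public
# because they share the package location variable at(p1,TLV).
move = ({"at(truck,BS)"}, {"at(truck,TLV)"}, {"at(truck,BS)"})
load = ({"at(p1,TLV)", "at(truck,TLV)"}, {"in(p1,truck)"}, {"at(p1,TLV)"})
fly  = ({"at(p1,TLV)", "at(plane,TLV)"}, {"in(p1,plane)"}, {"at(p1,TLV)"})
pv, pub = classify({"truck": [move, load], "plane": [fly]})
```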
Agents as Abstractions
Transform single-agent to multi-agent
Move from STRIPS to MA-STRIPS by partitioning the set of actions
Agents are simply an abstraction for a set of actions
Known as factored planning, or planning by decomposition
Main question: does this additional structure of a multi-agent problem buy us anything?
Main intuition: if the “agents” have little interaction with each other, we can treat this as multiple smaller problems
Fully Decoupled: Fully Decomposable
Agents have no influence on each other
An agent’s actions affect only preconditions of its own actions
Each agent can plan separately for the sub-goals it can achieve
The different plans do not interact
Many smaller single agent planning problems
Exponential decrease in complexity
Example: Logistics model with UPS and Fed-Ex
Each has its own packages, trucks, and planes
We can solve each company’s problem separately
Fully Coupled: No Decomposition
Each agent’s actions impact the ability of all other agents to use their actions
Nothing to decompose
Loosely Coupled:
Some Decomposition Possible
There are interactions
But most agents affect few other agents
Trucks in different cities do not affect each other
Can we quantify the coupling level of a problem?
How does it impact the complexity of the
problem?
Coupling Level:
Agent Interaction Graph (AIG)
Nodes – agents
Edge between Ai and Aj if some action a ∈ Ai adds/deletes a precondition of some a′ ∈ Aj, or negates an effect of a′
[Figure: example AIGs for the logistics domain, over Isra-Truck, Plane, Ital-Truck, and over Isra-Truck, Fra-Truck, Ital-Truck, Sicily-Truck, Plane1, Plane2]
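The edge definition translates directly into code; a minimal sketch over (pre, add, delete) triples (toy data, invented names):

```python
from itertools import combinations

def interaction_edges(agent_actions):
    """agent_actions: dict agent -> list of (pre, add, delete) sets.
    Ai and Aj are connected if some action of one adds/deletes a
    precondition of an action of the other, or negates one of its effects."""
    def affects(a, b):
        pre_a, add_a, del_a = a
        pre_b, add_b, del_b = b
        # a touches a precondition of b, or deletes something b adds.
        return bool(((add_a | del_a) & pre_b) or (del_a & add_b))
    edges = set()
    for i, j in combinations(agent_actions, 2):
        if any(affects(a, b) or affects(b, a)
               for a in agent_actions[i] for b in agent_actions[j]):
            edges.add(frozenset((i, j)))
    return edges

# Truck's load deletes at(p1,VRN), a precondition of the plane's load.
truck_load = ({"at(p1,VRN)", "at(truck,VRN)"}, {"in(p1,truck)"}, {"at(p1,VRN)"})
plane_load = ({"at(p1,VRN)", "at(plane,VRN)"}, {"in(p1,plane)"}, {"at(p1,VRN)"})
edges = interaction_edges({"truck": [truck_load], "plane": [plane_load]})
```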
Coupling Measure:
AIG’s Tree Width
A graph’s tree-width is a measure of its cliquishness
≥ (largest clique size) − 1
Join-tree of a graph = tree whose nodes are sets of the graph’s nodes
Constraint 1: every clique’s nodes must reside in some join- tree node
Constraint 2: if graph node n appears in tree nodes N and N’, then it must appear in all nodes N’’ on the path from N to N’
Graph’s treewidth: min over its join trees of max{node size} − 1
No cycles (tree): width 1
Fully connected: width m − 1 (m = # of agents)
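Computing treewidth exactly is NP-hard in general, but checking a given join tree is easy: verify the two constraints and report its width. A sketch (invented representation):

```python
from collections import defaultdict, deque

def join_tree_width(bags, tree_edges, graph_edges):
    """bags: list of sets of graph nodes; tree_edges: (i, j) bag-index pairs.
    Returns the tree's width, or None if a join-tree constraint fails."""
    # Constraint 1: every graph edge (a 2-clique) lies inside some bag.
    if not all(any({u, v} <= b for b in bags) for u, v in graph_edges):
        return None
    # Constraint 2: for each graph node, the bags containing it must
    # form a connected subtree (checked by BFS over holder bags).
    adj = defaultdict(list)
    for i, j in tree_edges:
        adj[i].append(j)
        adj[j].append(i)
    for v in set().union(*bags):
        holders = {i for i, b in enumerate(bags) if v in b}
        seen, queue = set(), deque([next(iter(holders))])
        while queue:
            i = queue.popleft()
            if i in seen:
                continue
            seen.add(i)
            queue.extend(j for j in adj[i] if j in holders and j not in seen)
        if seen != holders:
            return None
    return max(len(b) for b in bags) - 1

# Path graph a-b-c: a join tree with bags {a,b} and {b,c} has width 1.
w = join_tree_width([{"a", "b"}, {"b", "c"}],
                    [(0, 1)],
                    [("a", "b"), ("b", "c")])
```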
[Figure: a join tree over Isra-Truck, Fra-Truck, Ital-Truck, Sicily-Truck, Plane1, Plane2; width = 2]
[Figure: a join tree for the Isra-Truck / Plane / Ital-Truck example; width = 1]
MAP as a CSP
Iterate over increasing values of δ: the maximal number of public actions per agent in the plan
One variable Vi per agent
Dom(Vi) = sequences of length at most δ of pairs (ai, ti)
ai = action, ti = (abstract) execution time
Abstract time induces an ordering over public actions
Dummy variable/agent with null plan: (a_init, 0), (a_goal, m·δ)
MAP as CSP
Global constraints:
If (a,t) appears in some sequence then for every precondition p of a there exists a pair (a′,t′) such that a′ produces p and t′ < t
No action at time t′ < t″ < t destroys p
Local constraints:
There is a local plan that achieves the private preconditions of every public action in its sequence, starting from the state following the previous action
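For a totally ordered assignment of timed public actions, the global constraints reduce to a forward simulation check; a minimal sketch (invented representation, with private preconditions omitted since local planning handles them):

```python
def globally_consistent(timed_actions, init_facts, goal):
    """timed_actions: list of (t, pre, add, delete); times must be distinct.
    Checks that each action's public preconditions are produced before its
    time and not destroyed in between, by simulating in time order."""
    facts = set(init_facts)
    for t, pre, add, delete in sorted(timed_actions):
        if not pre <= facts:
            return False  # precondition unsupported at time t
        facts = (facts - delete) | add
    return goal <= facts

# Toy public plan: load p1 in Brescia, then unload it in Verona.
plan = [
    (1, {"at(p1,Brescia)"}, {"in(p1,truck)"}, {"at(p1,Brescia)"}),
    (2, {"in(p1,truck)"}, {"at(p1,VRN)"}, {"in(p1,truck)"}),
]
ok = globally_consistent(plan, {"at(p1,Brescia)"}, {"at(p1,VRN)"})
```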
[Figure: a public plan for switching two packages between Beer-Sheva and Brescia]
MAP as
CSP+Planning
The CSP part (global constraints on public actions) takes care of inter-agent coordination
The rest is done by local planning by the agent
If the coordination is simple, the rest is local planning by the agents on much smaller domains with fewer variables
Each such problem should be exponentially easier to solve
Main complexity factor: Solving the global CSP
Now with the local plans
Constraint Graph = Agent
Interaction Graph
Complexity
Exponential in ω, the width of the constraint graph (= AIG)
Polynomial in domain size
Domain size exponential in δ
Polynomial in local planning
Bottom line: poly-time planning, as long as the system remains loosely coupled and balanced
Loosely coupled: constant/logarithmic tree width ω
Balanced: workload reasonably balanced (δ)
Each agent performs at most a logarithmic part of the public plan
A Fully Distributed Algorithm
Any DisCSP algorithm would solve this CSP distributedly
However, none works in practice:
Assume binary constraints
Huge domains
Very weak heuristics, in comparison to
planning
A Fully Distributed Algorithm
Reformulate the CSP breaking each variable into 3
Actions var: sequences of actions
Time var: time of each action
Req var: who supplies the preconditions of each action
Agents generate plans locally
Plan requirements (preconditions) serve as “landmarks” for other agents
Backtracking to previous agents if cannot plan with these requirements
Agents are ordered based on their ability to achieve more sub-goals and on their chances of failing (because they have many landmarks)
Many more details for making this efficient, but it’s all subsumed by forward search
Empirical Results
Practical Difficulties
CSP method does not scale up well when δ > 3
Solution: adapt state-of-the-art heuristic forward search algorithms to MAP
Not all problems have natural distributed structure; this is a problem when trying to solve general single-agent problems
Answer 1: No one said this method works on all domains
Answer 2: Try to discover such structure, if possible
Next: Part II
Automated Decomposition
How can we use our methods on single-agent problems that do not have natural multi-agent structure?
Automated decomposition method
Natural candidate: clustering/partitioning algorithms
Problem: the objective of current methods is not aligned with our task
Objective 1: few public actions to decrease interaction
Single agent, no public actions…
Objective 2: Many agents to increase decomposition
Objective 3: Balanced agents to prevent bottlenecks
Automated Decomposition
Our solution: find a partition that maximizes the measure Γ({Ai})
Optimal partition computed using the METIS graph partitioning package
For some natural multi-agent domains, partition is not what you expect
E.g., few agents are grouped together
Able to partition domains without apparent multi-agent structure
Benefit, however, is not too high
Γ({Ai}i=1,…,m) = Pr(a ∈ Ai ∧ a is private) · Pr(a ∉ Ai)
How Well Does Γ({Ai}) Predict Performance?
[Figure: Runtime/Max Runtime vs. symmetry score for logistics 9-0, rovers 5, satellites 6, zenotravel 9]
Summary
MA-STRIPS is a simple extension of STRIPS that allows us to model diverse multi-agent planning problems
Agents: “real” agents, or abstractions that capture structure in the action set that help us decompose the problem
When the system is loosely coupled, we can decompose the problem into a global CSP + local planning
As we increase the number of agents, as long as
“interaction complexity” (ω) remains bounded and load remains balanced, we scale polynomially
References
Nissim & Brafman: Cost-optimal planning by self-interested agents. AAAI 2013
Nissim, Apsel & Brafman: Tunneling and Decomposition-Based State Reduction for Optimal Planning. ECAI 2012
Nissim & Brafman: Multi-Agent A* for Parallel and Distributed Systems.
AAMAS 2012
Brafman & Domshlak: On the complexity of planning for agents teams and its implications for single agent planning. AIJ 198: 52-71 (2013)
Brafman, Domshlak, Engel & Tennenholtz: Transferable Utility Planning Games. AAAI 2010
Nissim, Brafman & Domshlak: A general, fully distributed multi-agent planning algorithm. AAMAS 2010
Brafman, Domshlak, Engel & Tennenholtz: Planning Games. IJCAI 2009