Multi-Agent Planning:
Models, Algorithms, Complexity
Ronen I. Brafman
Computer Science Department Ben-Gurion University
Joint work with Raz Nissim (BGU) and Carmel Domshlak (Technion)
Part II:
Multi-Agent Distributed Search,
Privacy, and Selfish Agents
Perspectives We Study
Agents as Abstractions
Agents as Components
Privacy Preserving Agents with Common Goal
Goal: a group of agents seeks to achieve a common goal, but they do not want to share private information
Example: Companies collaborating on a project, different departments in one company
Example: Agents with complex models that are difficult to transmit, or proprietary
Result: fully distributed, privacy-preserving algorithms
Selfish Agents with Shared Task
Setting: a group of utility-maximizing agents with a common task
Example: a group of contractors required for building a house
Result: distributed, strategy-proof, socially efficient, privacy preserving mechanism
Also: stable coalitions and stable plans
In lay terms: a distributed protocol that leads to an optimal solution, preserves privacy, and gives agents no incentive to lie or deviate from it.
Core Model: MA-STRIPS
All variants we study are based on MA-STRIPS
MA-STRIPS = STRIPS + an agent associated with each action
STRIPS Model:
(States, Actions, Initial State, Goal Condition)
MA-STRIPS Model:
(States, Actions_1, …, Actions_m, Initial State, Goal Condition)
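The tuple above can be sketched as a data structure. This is a minimal illustration; the class and field names (Action, MAStrips) are my own assumptions, not the talk's code.

```python
from dataclasses import dataclass

# A minimal sketch of the MA-STRIPS tuple above. All names here
# (Action, MAStrips) are illustrative assumptions, not the talk's code.

@dataclass(frozen=True)
class Action:
    name: str
    preconditions: frozenset  # facts that must hold before execution
    effects: frozenset        # facts that hold after execution

@dataclass
class MAStrips:
    actions: dict             # agent name -> tuple of Actions (the partition A_1..A_m)
    initial_state: frozenset  # facts true in the initial state
    goal: frozenset           # facts required in a goal state

    def agent_of(self, action):
        """MA-STRIPS associates exactly one agent with each action."""
        return next(ag for ag, acts in self.actions.items() if action in acts)
```

States are left implicit here: as in STRIPS, a state is just a set of facts over the problem's variables.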
Logistics Example
Each truck and each airplane is an agent
Truck actions: load into, unload from, move truck
Load(p?,Ital-Truck,l?), Move(Ital-Truck,l1?,l2?)
Plane actions: load into, unload from, fly plane
Notice:
the move action involves the truck only
the load action involves multiple agents
If the truck loads a package at the airport, the plane cannot load it
Private and Public Variables
Variable v is private to A_i iff it does not appear in the description of any action of another agent
Example: at(Ital-Truck, VRN)
All other variables are public
Example: at(p1, TLV)
Private and Public Actions
Action a ∈ A_i is private to A_i iff its description contains only private variables
Example: move(Isra-Truck, BS, TLV)
Private actions neither affect nor are affected by the actions of other agents
All other actions are public
Example: Load(p1,Ital-Truck, BresciaPO)
A public action may have a private precondition: at(Ital-Truck, BresciaPO)
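The definitions above can be checked mechanically. In this sketch an action is reduced to just the set of variables appearing in its description; the specific actions and variable strings are illustrative, loosely following the logistics example.

```python
# Sketch of the private/public split above: a variable is private to an
# agent iff it appears only in that agent's action descriptions, and an
# action is private iff it mentions only private variables. Actions are
# reduced to the set of variables in their descriptions; names are illustrative.

def private_variables(actions_by_agent, agent):
    mine = set().union(*actions_by_agent[agent])
    others = set().union(*(v for ag, acts in actions_by_agent.items()
                           if ag != agent for v in acts))
    return mine - others

def is_private_action(action_vars, agent, actions_by_agent):
    return action_vars <= private_variables(actions_by_agent, agent)

move   = {"at(Ital-Truck,BR)", "at(Ital-Truck,VRN)"}                # truck only
unload = {"at(Ital-Truck,VRN)", "in(p1,Ital-Truck)", "at(p1,VRN)"}  # mentions p1's location
load_p = {"at(Plane,VRN)", "at(p1,VRN)", "in(p1,Plane)"}            # plane also mentions it

actions = {"Ital-Truck": [move, unload], "Plane": [load_p]}
```

Here move is private to Ital-Truck, while unload is public because at(p1,VRN) also appears in the plane's load action.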
MAP as CSP+Planning
CSP makes sure that the agents' public plans are consistent
Local planning extends public plan of each agent into a full plan
If the coordination is simple, local planning requires planning on much smaller domains with fewer variables
Each such problem should be exponentially easier to solve
Main complexity factor: difficulty of solving the global CSP
Public plans:
Plane: L(3,TLV), U(4,VRN), L(5,VRN), U(6,TLV)
Isra-Truck: L(1,BS), U(2,TLV), L(9,TLV), U(10,BS)
Ital-Truck: L(1,BR), U(2,VRN), L(5,VRN), U(6,BR)
Full plans:
Plane: L(3,TLV), fly(TLV,VRN), U(4,VRN), L(5,VRN), fly(VRN,TLV), U(6,TLV)
Isra-Truck: L(1,BS), Move(BS,TLV), U(2,TLV), L(9,TLV), Move(TLV,BS), U(10,BS)
Ital-Truck: L(1,BR), Move(BR,VRN), U(2,VRN), L(5,VRN), Move(VRN,BR), U(6,BR)
Practical Difficulties
CSP method does not scale up well when delta > 3
Solution: adapt state-of-the-art heuristic forward search algorithms to MAP
Not all problems have a natural distributed structure; this is a problem when trying to solve general single-agent problems
Answer 1: No one said this method works on all domains
Answer 2: Try to discover such structure, if possible
If You’re Using Forward Search, Why Not Centralized?
Why not send the models to one agent, who will perform the planning?
Less robust (prone to failures)
Problematic if agent models are complex
Generative action model implemented as a simulator
Requires electing a trusted center and having it do all the work
Main Issue: Requires Giving Up Private Information
Privacy
Agents do not always want to share all their information with the other agents
Ad-hoc teams of agents
Different companies
Agents seeking to maximize their utility will lie when beneficial
In MAP: private variables and their values and private actions may contain confidential business data, such as inventory levels or local processes
Public actions/variables are its interface to the world
Private actions and variables are just that – private
Multi-Agent Forward Search
A multi-agent version of forward search
Can be used for optimal planning (MAD-A*)
Can be used for satisficing search (MAD-X)
The algorithm is inherently distributed
Has a built-in pruning effect
Does not require sharing private information
MAD-A*
Each agent performs A* search with the following changes:
Maintain own open and closed list
Expand a node using its own actions only
Send this node to other agents who have an operator applicable to this node
Run a distributed optimal solution detection procedure
Method outperforms all current MA planning algorithms
In fact, it was able to solve 6 previously unsolvable single-agent problems (most likely thanks to the pruning effect)
MAD-A*
Each agent performs the following:
Initialize open and closed list to empty
A designated initial agent initializes its open list with the initial state
Retrieve first node n from open list
If n is a solution, perform distributed optimality check
Otherwise, expand n using own actions only
Compute the h-value of each child n’ and add it to the open list
If n’ was obtained by applying a public action
Send n’ to all agents to which n’ is relevant
Message contains n’, g and h values, and creating action
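The per-agent loop above can be sketched in code. This is a single-agent skeleton only: the network layer is stubbed with a plain outbox list, there is no distributed optimality check, and all names (MADAgent, expand_once, the action tuple format) are illustrative assumptions, not the MAD-A* implementation.

```python
import heapq
from itertools import count

# Skeleton of one agent's MAD-A* loop as described above; names and the
# action representation are illustrative, and messaging is stubbed.

class MADAgent:
    def __init__(self, heuristic, actions):
        self.h = heuristic
        self.actions = actions      # list of (name, is_public, cost, apply_fn)
        self.open = []              # own open list of (f, g, tie, state)
        self.closed = set()
        self.outbox = []            # (state, g, creating_action) messages to send
        self._tie = count()         # tie-breaker so states are never compared

    def push(self, state, g):
        heapq.heappush(self.open, (g + self.h(state), g, next(self._tie), state))

    def expand_once(self):
        """Pop the best node and expand it with this agent's own actions only."""
        _, g, _, state = heapq.heappop(self.open)
        if state in self.closed:
            return
        self.closed.add(state)
        for name, is_public, cost, apply_fn in self.actions:
            child = apply_fn(state)
            if child is None:       # action not applicable in this state
                continue
            self.push(child, g + cost)
            if is_public:           # only states created by a public action are sent
                self.outbox.append((child, g + cost, name))

    def receive(self, state, g):
        """States arriving from other agents join the open list like own children."""
        self.push(state, g)
```

In a full implementation the outbox messages would go only to agents with an applicable operator, and a goal state would trigger the distributed optimality check.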
Super-linear Speedup?
Notice that in some cases we get super-linear speedup
This implies that this is more than simple distribution
Where does the extra power come from?
Pruning!
Pruning Effect of MAD-A*
Recall: agents send other agents a state only if their last action was public
We exclude plans in which there are two adjacent private actions that belong to two different agents!
Each legal plan has a permutation that satisfies this property and is a legal plan with the same outcome
a_pr1 a_pr2 a_pr2 a_pb2 … a_pb5 a_pb1  =  a_pr2 a_pr2 a_pb2 … a_pb5 a_pr1 a_pb1
MA-FD
A general, scalable framework for solving the MA planning problem
Enables developing new algorithms, heuristics, and extensions to richer planning formalisms, for optimal and satisficing planning
Based on Fast Downward [Helmert, 2006]
FD is rich with implementations of many heuristics and algorithms
Uses FD’s translator and preprocessor, adjusted to multi-agent planning
Standard input + number of agents, their names, and their IP addresses
No shared memory
Agents communicate via TCP/IP
This allows usage on multi-core machines, networked computers/robots, cloud
All heuristics available for FD are immediately accessible
Maintaining Privacy
DisCSP – immediate
Agents only coordinate on their public actions
Forward search
Agents need only expand using their own operators
Agents need only know about public actions to determine state relevance
Heuristic function is now computed locally based on the agent’s own model and the public actions of the others
Different agents will have different heuristic values
Not a problem!
MAD-A* remains optimal, provided all heuristics are admissible
Selfish Agents
Goal: getting a group of selfish agents with private information to perform cost-optimal planning
Main contributions:
A centralized VCG mechanism for optimal planning
A privacy preserving, distributed mechanism which is ex-post faithful
Ex-post faithful = it is in the best interest of each agent to truthfully follow all aspects of the algorithm, regardless of the agents’ private values, provided all other agents follow the algorithm
Selfish Agents
Selfish agents maximize their own utility function
Will not perform computations or take actions that lower their utility
May lie or deviate from the agreed protocol if profitable
We assume:
P_i is the payment given to agent i for its participation
Cost_i(π) is the cost of i’s actions in plan π
u_i(π, P) = P_i - Cost_i(π)
Vickrey-Clarke-Groves Mechanisms
A general technique in mechanism design
Agents report private information to a trusted center
Center computes solution and a payment to be made to the agents
Payments are computed based on the solutions to the marginal problems in which agent i does not participate.
Guarantees
Efficient: outputs a solution with maximal sum of utilities
This is called maximizing social welfare
Strategy-proof: truth-telling is a weakly dominant strategy
VCG Mechanism for Optimal Multi-Agent Planning
Agents report their private actions and all action costs to a center
Given agents’ reports, the center finds an optimal solution π* as well as an optimal solution π-i* for each marginal planning problem Π-i
Each agent receives the following payment:
Payments are the social cost/benefit of the agent – the impact its participation has on others’ utilities
Main drawback: centralized, and requires revealing information to the center
P_i = Σ_{j≠i} (Cost_j(π-i*) - Cost_j(π*))
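The payment rule above translates directly into code. In this sketch a plan is represented only by its per-agent costs; the function name and dictionary representation are my own assumptions.

```python
# Sketch of the VCG payment above:
#   P_i = sum over j != i of (Cost_j(pi*_{-i}) - Cost_j(pi*))
# Plan costs are per-agent dicts; an agent absent from a plan has cost 0.

def vcg_payment(i, marginal_costs, optimal_costs):
    """marginal_costs: Cost_j(pi*_{-i}); optimal_costs: Cost_j(pi*)."""
    agents = set(marginal_costs) | set(optimal_costs)
    return sum(marginal_costs.get(j, 0) - optimal_costs.get(j, 0)
               for j in agents if j != i)
```

For instance, if removing agent a forces b's cost from 3 up to 4, a's payment is 1: a is paid exactly the extra cost its absence would impose on the others.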
Example – Logistics Domain
3 agents: A1,A2,A3 2 packages p1,p2, 2 locations L,R
All agents and packages at L
Package destination is R
Load/Unload costs as follows:
P1: A1 = 1, A2 = 2, A3 = 2
P2: A1 = 2, A2 = 1, A3 = 2
Move costs 1 for all agents
Optimal solution cost = 6
A1 handles P1, A2 handles P2
Optimal cost of Π-A1, Π-A2 = 8
Replacing A1 or A2 by A3 adds 2
P1 = P2 = 8 - 3 = 5
Optimal solution cost to Π-A3 = 6
P3 = 6 - 6 = 0
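The numbers above can be checked directly under the stated costs. The per-agent plan costs below are transcribed from the example (handling a package = load + move + unload), not computed by a planner.

```python
# Checking the logistics payments above. Per-agent plan costs are
# transcribed from the example, not produced by a planner.

cost_opt   = {"A1": 3, "A2": 3, "A3": 0}  # A1 takes P1, A2 takes P2: 1+1+1 each
cost_no_A1 = {"A2": 3, "A3": 5}           # A3 replaces A1 on P1: 2+1+2
cost_no_A2 = {"A1": 3, "A3": 5}           # A3 replaces A2 on P2: 2+1+2
cost_no_A3 = {"A1": 3, "A2": 3}           # A3 was idle anyway

def payment(i, marginal, optimal):
    agents = set(marginal) | set(optimal)
    return sum(marginal.get(j, 0) - optimal.get(j, 0) for j in agents if j != i)

P1 = payment("A1", cost_no_A1, cost_opt)
P2 = payment("A2", cost_no_A2, cost_opt)
P3 = payment("A3", cost_no_A3, cost_opt)
```

This reproduces P1 = P2 = 5 and P3 = 0 from the slide.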
Distributed Mechanism
Goal: preserve privacy while maintaining VCG’s strong guarantees, and supporting a distributed implementation
Solution:
The agents compute the solution as well as the payments
We use MAD-A* to maintain privacy
Agents have new opportunities to manipulate the mechanism beyond lying about costs, such as not performing all steps
We prove that they have no motivation to deviate individually
Selfish MAD-A*
Run MAD-A* on Π with all agents to find π*
For all agents i
Run MAD-A* on Π-i with all agents except i to find π-i*
Agents j≠i compute P_ij and send it to the bank
The bank computes P_i = Σ_{j≠i} P_ij
The bank pays P_i to every agent i and broadcasts π* to all agents
P_ij is the extra cost j incurs if i is removed:
P_ij = Cost_j(π-i*) - Cost_j(π*)
Optimizing Selfish MAD-A*
Selfish MAD-A* requires running m+1 sequential search runs, duplicating search effort
We can improve this by executing a multi-goal search on one (distributed) search space
States are additionally identified by the agent participating in their formation
A goal state is treated as a candidate solution for Π-i if i does not participate in the plan leading to it
States irrelevant to the remaining unsolved problems are pruned
Experimental Results
Discussion
The person who “orders” the plan from the team has to pay more than the cost of the optimal plan
In cases where some agent has very high value, this can be substantial
Makes sense either if agents are replaceable
E.g., there are many electricians and plumbers
Or if plan optimality is very important
E.g., a government wants to build a road quickly
A well known issue in mechanism design that arises in routing, too
In planning and routing we seek a least cost path from source to sink
Planning Games
Planning games formalize and study situations where the goal is not an optimal plan, but rather a stable plan
Intuitively, a plan is stable if the agents that perform it do not have motivation to quit the team
There are different ways of formalizing them
The plan must also specify payment to each agent
Combine ideas from planning and coalitional game theory
Coalitional game theory studies stable coalitions
The planning model is MA-STRIPS
The algorithms extend our CSP formulation to handle stability
Summary
Using the MA-STRIPS model, we investigated planning problems in which agents are not fully cooperative
May want to maintain private information
May have own utility function
MAD-A* provides a very efficient, privacy-preserving method for solving these problems
This method forms the basis for practical planning algorithms in both contexts
Future Work
How private is private?
Agents do not explicitly report their private information, but some of it may be deduced by others from the messages sent during the algorithm
Handling more complex models
Interacting actions
Uncertainty
Task allocation – can be captured by giving different agents similar actions
Applications
More realistic models for certain settings
How do we get this to work when some agents are indispensable?
How do we get this to work when some agents are replaceable?
References
Nissim & Brafman: Cost-optimal planning by self-interested agents. AAAI 2013
Nissim, Apsel & Brafman: Tunneling and Decomposition-Based State Reduction for Optimal Planning. ECAI 2012
Nissim & Brafman: Multi-Agent A* for Parallel and Distributed Systems. AAMAS 2012
Brafman & Domshlak: On the complexity of planning for agent teams and its implications for single agent planning. AIJ 198: 52-71 (2013)
Brafman, Domshlak, Engel & Tennenholtz: Transferrable Utility Planning Games. AAAI 2010
Nissim, Brafman & Domshlak: A general, fully distributed multi- agent planning algorithm. AAMAS 2010
Brafman, Domshlak, Engel & Tennenholtz: Planning Games. IJCAI 2009