
(1)

Multi-Agent Planning:

Models, Algorithms, Complexity

Ronen I. Brafman

Computer Science Department, Ben-Gurion University

Joint work with Raz Nissim (BGU) and Carmel Domshlak (Technion)

(2)

Part II:

Multi-Agent Distributed Search,

Privacy, and Selfish Agents

(3)

Perspectives We Study

 Agents as Abstractions (Part I)

 Agents as Components (Part I)

(4)

Perspectives We Study

 Agents as Abstractions

 Agents as Components

 Privacy Preserving Agents with Common Goal

Goal: A group of agents seeks to achieve a common goal, but the agents do not want to share private information

Example: Companies collaborating on a project, different departments in one company

Example: Agents with complex models that are difficult to transmit, or proprietary

Result: fully distributed, privacy-preserving algorithms

(5)

Perspectives We Study

Agents as Abstractions

Agents as Components

Privacy Preserving Agents with Common Goal

 Selfish Agents with Shared Task

Setting: a group of utility-maximizing agents with a common task

Example: a group of contractors required for building a house

Result: a distributed, strategy-proof, socially efficient, privacy-preserving mechanism

 Also: stable coalitions and stable plans

 In lay terms: a distributed protocol that leads to an optimal solution, preserves privacy, and agents have no incentive to lie or deviate from it.

(6)

Core Model: MA-STRIPS

 All variants we study are based on MA-STRIPS

 MA-STRIPS = STRIPS + an agent associated with each action

 STRIPS Model:

 (States, Actions, Initial State, Goal Condition)

 MA-STRIPS Model:

 (States, Actions_1, …, Actions_m, Initial State, Goal Condition)
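As a minimal sketch, the model might be represented as follows; the Python names are illustrative assumptions, since the paper defines the model set-theoretically rather than as code:

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List

Fact = str  # a ground state variable, e.g. "at(p1, TLV)"

@dataclass(frozen=True)
class Action:
    name: str
    preconditions: FrozenSet[Fact]
    add_effects: FrozenSet[Fact]
    del_effects: FrozenSet[Fact]

@dataclass
class MAStrips:
    facts: FrozenSet[Fact]
    actions: Dict[str, List[Action]]   # one action set per agent (Actions_1..m)
    initial_state: FrozenSet[Fact]
    goal: FrozenSet[Fact]              # plain STRIPS is the one-agent special case
```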

(7)

Logistics Example

 Each truck and each airplane is an agent

 Truck actions: load into, unload from, move truck

 Load(p?,Ital-Truck,l?), Move(Ital-Truck,l1?,l2?)

 Plane actions: load into, unload from, fly plane

 Notice:

the move action involves the truck only

the load action involves multiple agents

 If a truck loads a package at the airport, the plane cannot load it

(8)

Private and Public Variables

 Variable v is private to A_i if it does not appear in the description of any action of another agent

 Example: at(Ital-Truck, VRN)

 All other variables are public

 Example: at(p1, TLV)

(9)

Private and Public Actions

 Action a_i ∈ A_i is private to A_i if its description contains only private variables

 Example: move(Isra-Truck, BS, TLV)

 Private actions neither affect nor are affected by the actions of other agents

 All other actions are public

 Example: Load(p1,Ital-Truck, BresciaPO)

 A public action may have a private precondition: at(Ital-Truck, BresciaPO)
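These two definitions translate directly into code. Here is a sketch over the MAStrips structure from the earlier snippet, assuming (as an illustration) that an action's "description" is its preconditions plus add and delete effects:

```python
def action_facts(a):
    return a.preconditions | a.add_effects | a.del_effects

def private_facts(problem, agent):
    """Facts that appear only in `agent`'s own action descriptions."""
    own = set().union(*(action_facts(a) for a in problem.actions[agent]))
    others = set().union(*(action_facts(a)
                           for other, acts in problem.actions.items()
                           if other != agent for a in acts))
    return own - others

def is_private(problem, agent, action):
    """An action is private iff its description contains only private facts."""
    return action_facts(action) <= private_facts(problem, agent)
```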

(10)

MAP as CSP+Planning

 The CSP makes sure that the agents’ public plans are consistent

 Local planning extends public plan of each agent into a full plan

 If the coordination is simple, local planning requires planning on much smaller domains with fewer variables

 It should be exponentially easier to solve each such problem

 Main complexity factor: difficulty of solving the

global CSP
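To make the decomposition concrete, here is a toy sketch of the two levels. The timestamped plan encoding, the brute-force search over candidate public plans, and all function names are simplifying assumptions made for illustration (the paper's actual formulation is a distributed CSP over public action sequences); the local-planning level is only indicated by a comment.

```python
from itertools import product

def executable(steps, init, goal):
    """Check that a merged public sequence is executable and achieves the goal."""
    state = set(init)
    for _, pre, eff in steps:
        if not pre <= state:
            return False
        state |= eff
    return goal <= state

def coordinate(candidates, init, goal):
    """CSP level: pick one candidate public plan per agent whose merged,
    timestamp-ordered sequence is consistent. Each plan is a list of
    (timestamp, preconditions, effects) over public facts only."""
    agents = list(candidates)
    for choice in product(*(candidates[a] for a in agents)):
        merged = sorted((s for plan in choice for s in plan), key=lambda s: s[0])
        if executable(merged, init, goal):
            # Local planning level (not shown): each agent extends its chosen
            # public plan with private actions on its own, smaller domain.
            return dict(zip(agents, choice))
    return None
```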

(11)

Consistent public plans, one per agent:

Plane: L(3,TLV), U(4,VRN), L(5,VRN), U(6,TLV)

Isra-Truck: L(1,BS), U(2,TLV), L(9,TLV), U(10,BS)

Ital-Truck: L(1,B), U(2,VRN), L(5,VRN), U(6,B)

(12)

The public plans extended into full local plans:

Plane: L(3,TLV), fly(TLV,VRN), U(4,VRN), L(5,VRN), fly(VRN,TLV), U(6,TLV)

Isra-Truck: L(1,BS), Move(BS,TLV), U(2,TLV), L(9,TLV), Move(TLV,BS), U(10,BS)

Ital-Truck: L(1,BR), Move(BR,VRN), U(2,VRN), L(5,VRN), Move(VRN,BR), U(6,BR)

(13)

[Figure slide.]
(14)

Practical Difficulties

 The CSP method of Part I does not scale up well when δ > 3

 Solution (this part): adapt state-of-the-art heuristic forward-search algorithms to MAP

 Not all problems have a natural distributed structure; this is an issue when trying to solve general single-agent problems this way

 Answer 1: No one said this method works on all domains

 Answer 2: Try to discover such structure, if possible

(15)

If You’re Using Forward Search, Why Not Centralized?

 Why not send all models to one agent who will perform the planning?

 Less robust (prone to failures)

 Problematic if agent models are complex

e.g., a generative action model implemented as a simulator

 Requires electing a trusted center and having it do all the work

Main Issue: Requires Giving Up Private Information

(16)

Privacy

 Agents do not always want to share all their information with the other agents

 Ad-hoc teams of agents

 Different companies

 Agents seeking to maximize their utility will lie when beneficial

 In MAP: private variables, their values, and private actions may contain confidential business data, such as inventory levels or local processes

 An agent’s public actions/variables are its interface to the world

 Private actions and variables are just that – private

(17)

Multi-Agent Forward Search

 A multi-agent version of forward search

 Can be used for optimal planning (MAD-A*)

 Can be used for satisficing search (MAD-X)

 The algorithm is inherently distributed

 Has a built-in pruning effect

 Does not require sharing private information

(18)

MAD-A*

 Each agent performs A* search with the following changes:

Maintain own open and closed list

Expand a node using its own actions only

Send this node to other agents who have an operator applicable to this node

Run a distributed optimal solution detection procedure

 Method outperforms all current MA planning algorithms

 In fact, it was able to solve 6 previously unsolvable single-agent problems (most likely thanks to the pruning effect)

(19)

MAD-A*

 Each agent performs the following:

 Initialize the open and closed lists to empty

 The designated initial agent seeds its open list with the initial state

Retrieve first node n from open list

If n is a solution, perform distributed optimality check

 Otherwise, expand n using own actions only

Compute h-values and add all children n’ to the open list

If n’ was obtained by applying a public action

 Send n’ to all agents to which n’ is relevant

 Message contains n’, its g and h values, and the creating action
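To make the loop concrete, here is a minimal single-process simulation of the per-agent step on a two-agent toy instance. The class layout, the toy domain, and the simplified termination test (returning the first goal popped, standing in for the distributed optimality check) are illustrative assumptions, not the actual MA-FD code; delete effects are omitted to keep the toy short.

```python
import heapq
from itertools import count

class Agent:
    """One MAD-A*-style search agent (illustrative, not the MA-FD classes)."""
    def __init__(self, name, actions, h):
        self.name, self.actions, self.h = name, actions, h
        self.open, self.closed, self.inbox = [], {}, []
        self.tick = count()  # tie-breaker so the heap never compares states

    def push(self, state, g, plan):
        heapq.heappush(self.open, (g + self.h(state), g, next(self.tick), state, plan))

    def step(self, team, goal):
        for state, g, plan in self.inbox:      # fold in states received from others
            self.push(state, g, plan)
        self.inbox.clear()
        if not self.open:
            return None
        f, g, _, state, plan = heapq.heappop(self.open)
        if goal <= state:
            return plan, g                     # real MAD-A*: distributed optimality check
        if self.closed.get(state, float("inf")) <= g:
            return None                        # already expanded at least as cheaply
        self.closed[state] = g
        for name, public, pre, add, cost in self.actions:  # own actions only
            if pre <= state:
                child = state | add            # delete effects omitted in this toy
                self.push(child, g + cost, plan + [name])
                if public:                     # send states reached by a public action
                    for other in team:
                        if other is not self:
                            other.inbox.append((child, g + cost, plan + [name]))
        return None

# Toy instance: the truck's public "load" enables the plane's public "fly".
h0 = lambda s: 0                               # trivially admissible heuristic
truck = Agent("truck", [("load", True, frozenset({"at-depot"}), frozenset({"loaded"}), 1)], h0)
plane = Agent("plane", [("fly", True, frozenset({"loaded"}), frozenset({"delivered"}), 1)], h0)
team, goal = [truck, plane], frozenset({"delivered"})
truck.push(frozenset({"at-depot"}), 0, [])     # the designated initial agent
result = None
while result is None:
    for agent in team:
        result = result or agent.step(team, goal)
print(result)                                  # (['load', 'fly'], 2)
```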

(20)-(27)

[Figure slides.]

(28)

Super-linear Speedup?

 Notice that in some cases we get super-linear speedup

 This implies we are getting more than simple distribution

 Where does the extra power come from?

 Pruning!

(29)

Pruning Effect of MAD-A*

 Recall: agents send other agents a state only if their last action was public

 We exclude plans in which there are two adjacent private actions that belong to two different agents!

 Each legal plan has a permutation that satisfies this property and is a legal plan with the same outcome

$a^1_{pr}\, a^2_{pr}\, a^2_{pr}\, a^2_{pb} \ldots a^5_{pb}\, a^1_{pb} \;=\; a^2_{pr}\, a^2_{pr}\, a^2_{pb} \ldots a^5_{pb}\, a^1_{pr}\, a^1_{pb}$

(reading $a^i_{pr}$ / $a^i_{pb}$ as a private/public action of agent $i$: agent 1’s private action is deferred to just before its own public action)

(30)

MA-FD

 A general, scalable framework for solving the MA planning problem

Enables developing new algorithms, heuristics, and extensions to richer planning formalisms, for optimal and satisficing planning

 Based on Fast Downward [Helmert, 2006]

FD is rich with implementations of many heuristics and algorithms

 Uses FD’s translator and preprocessor, adjusted to multi-agent planning

Standard input + # of agents, their names, and their IP addresses

 No shared memory

 Agents communicate via TCP/IP

This allows usage on multi-core machines, networked computers/robots, and the cloud

 All heuristics available for FD are immediately accessible

(31)

Maintaining Privacy

 DisCSP – immediate

Agents only coordinate on their public actions

 Forward search

Agents need only expand using their own operators

Agents need only know about public actions to determine state relevance

Heuristic function is now computed locally based on the agent’s own model and the public actions of the others

 Different agents will have different heuristic values

 Not a problem!

 MAD-A* remains optimal, provided all heuristics are admissible

(32)

Selfish Agents

 Goal: getting a group of selfish agents with private information to perform cost-optimal planning

 Main contributions:

 A centralized VCG mechanism for optimal planning

 A privacy-preserving, distributed mechanism that is ex-post faithful

 Ex-post faithful = it is in the best interest of each agent to truthfully follow all aspects of the algorithm, regardless of the agents’ private values, provided all other agents follow the algorithm

(33)

Selfish Agents

 Selfish agents maximize their own utility function

 Will not perform computations or take actions that lower their utility

 May lie or diverge from agreed protocol if profitable

 We assume:

P_i is the payment given to agent i for its participation

Cost_i(π) is the cost of i’s actions in plan π

$u_i(\pi, P) = P_i - \mathrm{Cost}_i(\pi)$

(34)

Vickrey-Clarke-Groves Mechanisms

 A general technique in mechanism design

 Agents report private information to a trusted center

 Center computes solution and a payment to be made to the agents

Payments are computed based on the solutions to the marginal problems in which agent i does not participate.

 Guarantees

Efficient: outputs a solution with the maximal sum of utilities

 This is called maximizing social welfare

Strategy-proof: truth-telling is a weakly dominant strategy

(35)

VCG Mechanism for Optimal Multi-Agent Planning

 Agents report their private actions and all action costs to a center

 Given the agents’ reports, the center finds an optimal solution π* as well as an optimal solution π*_{-i} for each marginal planning problem Π_{-i}

 Each agent receives the following payment:

 Payments are the social cost/benefit of the agent – the impact its participation has on others’ utilities

 Main drawback: centralized, and requires revealing information to the center

$P_i = \sum_{j \neq i} \big( \mathrm{Cost}_j(\pi^*_{-i}) - \mathrm{Cost}_j(\pi^*) \big)$

(36)

Example – Logistics Domain

3 agents: A1, A2, A3; 2 packages: P1, P2; 2 locations: L, R

All agents and packages at L

Package destination is R

Load/Unload costs as follows:

P1: A1 = 1, A2 = 2, A3 = 2

P2: A1 = 2, A2 = 1, A3 = 2

Move costs 1 for all agents

 Optimal solution cost = 6

A1 handles P1, A2 handles P2

Optimal cost of Π_{-A1}, Π_{-A2} = 8

Replacing A1 or A2 by A3 adds 2

P1 = P2 = 8 – 3 = 5

 Optimal solution cost of Π_{-A3} = 6

P3 = 6 – 6 = 0

$P_i = \sum_{j \neq i} \big( \mathrm{Cost}_j(\pi^*_{-i}) - \mathrm{Cost}_j(\pi^*) \big)$
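As a cross-check of these numbers, here is a short script that applies the payment formula to the per-agent plan costs read off the slide (the variable names are illustrative):

```python
# Cost of each agent's actions in pi*: A1 ships P1 (1+1+1), A2 ships P2 (total 6).
cost_in_opt = {"A1": 3, "A2": 3, "A3": 0}
# Others' costs in each marginal optimum pi*_{-i}.
cost_in_marginal = {
    "A1": {"A2": 3, "A3": 5},   # A3 replaces A1 on P1 (2+1+2), total 8
    "A2": {"A1": 3, "A3": 5},   # A3 replaces A2 on P2, total 8
    "A3": {"A1": 3, "A2": 3},   # A3 was idle, plan unchanged, total 6
}

def vcg_payment(i):
    # P_i = sum over j != i of Cost_j(pi*_{-i}) - Cost_j(pi*)
    return sum(cost_in_marginal[i][j] - cost_in_opt[j] for j in cost_in_marginal[i])

for i in ("A1", "A2", "A3"):
    print(i, vcg_payment(i))    # A1 5, A2 5, A3 0, matching the slide
```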

(37)

Distributed Mechanism

 Goal: preserve privacy while maintaining VCG’s strong guarantees, and supporting a distributed implementation

 Solution:

 The agents compute the solution as well as the payments

 We use MAD-A* to maintain privacy

 Agents have new opportunities to manipulate the mechanism beyond lying about costs, such as not performing all steps

 We prove that they have no motivation to deviate individually

(38)

Selfish MAD-A*

 Run MAD-A* on Π with all agents to find π*

 For all agents i

 Run MAD-A* on Π_{-i} with all agents except i to find π*_{-i}

 Agents j ≠ i compute P_ij and send it to the bank

 The bank computes P_i

 The bank pays P_i to every agent i and broadcasts π* to all agents

 P_ij is the cost j incurs if i is removed

P

ij

= (Cost

j

(

j¹i

å p *

-i

) -Cost

j

( p *))

Pi = Pij

j¹i

å
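A minimal skeleton of this protocol, where `mad_astar` and `cost` are hypothetical placeholders standing in for a full distributed MAD-A* run and for agent j's locally computed Cost_j(plan):

```python
def selfish_mad_astar(agents, mad_astar, cost):
    plan_star = mad_astar(agents)               # pi*: everyone participates
    payments = {i: 0 for i in agents}
    for i in agents:
        plan_minus_i = mad_astar([j for j in agents if j != i])   # pi*_{-i}
        for j in agents:
            if j != i:
                # Agent j reports P_ij = Cost_j(pi*_{-i}) - Cost_j(pi*) to the bank
                payments[i] += cost(j, plan_minus_i) - cost(j, plan_star)
    # The bank pays P_i to each agent i and broadcasts pi*.
    return plan_star, payments
```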

(39)

Optimizing Selfish MAD-A*

 Selfish MAD-A* requires running m+1 sequential search runs, duplicating search effort

 We can improve this by executing a multi-goal search on one (distributed) search space

 States are additionally identified by the agent participating in their formation

 A goal state is treated as a candidate solution for Π_{-i} if i does not participate in the plan leading to it

 States irrelevant to the remaining unsolved problems are pruned

(40)

Experimental Results

(41)

Discussion

 The person who “orders” the plan from the team has to pay more than the cost of the optimal plan

In cases where some agent has very high value, this can be substantial

Makes sense either if agents are replaceable

 E.g., there are many electricians and plumbers

Or if plan optimality is very important

 E.g., a government wants to build a road quickly

 A well-known issue in mechanism design that arises in routing, too

In planning and routing we seek a least cost path from source to sink

(42)

Planning Games

 Planning games formalize and study situations where the goal is not an optimal plan, but rather a stable plan

 Intuitively, a plan is stable if the agents that perform it do not have motivation to quit the team

There are different ways of formalizing them

The plan must also specify payment to each agent

 Combine ideas from planning and coalitional game theory

Coalitional game theory studies stable coalitions

The planning model is MA-STRIPS

The algorithms extend our CSP formulation to handle stability

(43)

Summary

 Using the MA-STRIPS model, we investigated planning problems in which agents are not fully cooperative

 May want to maintain private information

 May have own utility function

 MAD-A* provides a very efficient, privacy-preserving method for solving these problems

 This method forms the basis for practical planning algorithms in both contexts

(44)

Future Work

 How private is private?

Agents do not explicitly report their private information, but some of it may be deduced by others based on the information sent during the algorithm

 Handling more complex models

Interacting actions

Uncertainty

Task allocation – can be captured by giving different agents similar actions

 Applications

 More realistic models for certain settings

How do we get this to work when some agents are indispensable?

How do we get this to work when some agents are replaceable?

(45)

References

 Nissim & Brafman: Cost-optimal planning by self-interested agents. AAAI 2013

 Nissim, Apsel & Brafman: Tunneling and Decomposition-Based State Reduction for Optimal Planning. ECAI 2012

 Nissim & Brafman: Multi-Agent A* for Parallel and Distributed Systems. AAMAS 2012

 Brafman & Domshlak: On the complexity of planning for agents teams and its implications for single agent planning. AIJ 198: 52-71 (2013)

 Brafman, Domshlak, Engel & Tennenholtz: Transferrable Utility Planning Games. AAAI 2010

 Nissim, Brafman & Domshlak: A general, fully distributed multi- agent planning algorithm. AAMAS 2010

 Brafman, Domshlak, Engel & Tennenholtz: Planning Games.

IJCAI 2009
