• Non ci sono risultati.

Adversarial patrolling with spatially uncertain alarm signals

N/A
N/A
Protected

Academic year: 2021

Condividi "Adversarial patrolling with spatially uncertain alarm signals"

Copied!
38
0
0

Testo completo

(1)

Contents lists available atScienceDirect

Artificial

Intelligence

www.elsevier.com/locate/artint

Adversarial

patrolling

with spatially

uncertain

alarm

signals

Nicola Basilico

a

,

, Giuseppe De Nittis

b

, Nicola Gatti

b

aDepartmentofComputerScience,UniversityofMilan,Milano,Italy

bDipartimentodiElettronica,InformazioneeBioingegneria,PolitecnicodiMilano,Milano,Italy

a

r

t

i

c

l

e

i

n

f

o

a

b

s

t

r

a

c

t

Articlehistory:

Received 15 June 2015

Received in revised form 6 August 2016 Accepted 26 February 2017

Available online 4 March 2017

Keywords:

Security games Adversarial patrolling Algorithmic game theory

When securing complex infrastructures orlarge environments, constantsurveillance of everyarea is not affordable.To cope withthisissue, a commoncountermeasureis the usageof cheapbut wide-rangedsensors, abletodetect suspicious events thatoccur in largeareas,supportingpatrollerstoimprovetheeffectivenessoftheirstrategies.However, suchsensors are commonly affectedby uncertainty. In the present paper, wefocus on spatiallyuncertainalarmsignals.Thatis,thealarmsystemisabletodetect anattackbut itisuncertainontheexactposition wheretheattackistakingplace.Thisiscommonwhen theareatobesecurediswide,suchasinborderpatrollingandfairsitesurveillance.We propose,tothebestofourknowledge,thefirstPatrollingSecurityGamewhereaDefender

issupportedbyaspatiallyuncertainalarmsystem,whichnon-deterministicallygenerates

signals onceatargetisunderattack.WeshowthatfindingtheoptimalstrategyisFNP-hard evenintreegraphsandAPX-hardinarbitrarygraphs.Weprovidetwo(exponentialtime) exactalgorithms andtwo (polynomialtime)approximationalgorithms.Finally,weshow that,withoutfalsepositivesandmisseddetections,thebestpatrollingstrategyreducesto stayinaplace,waitforasignal,and respondtoitatbest.Thisstrategyisoptimal even withnon-negligiblemisseddetectionrates,which,unfortunately,affecteverycommercial alarmsystem. We evaluate our methods in simulation,assessing both quantitativeand qualitativeaspects.

©2017ElsevierB.V.Allrightsreserved.

1. Introduction

SecurityGamesmodelthetaskofprotectingphysicalenvironmentsasanon-cooperativegamebetweena

Defender and

an

Attacker

[1].Thesegamesusuallytakeplaceundera

Stackelberg (a.k.a. leader–follower)

paradigm[2],wheretheDefender (leader) commits to astrategy and the Attacker(follower) first observes such commitmentandthen best responds to it. As discussed in the seminal work [3],finding a leader–follower equilibrium is computationally tractable in games with one follower andcomplete information, while it becomes hard in Bayesian games with differenttypes of Attacker. The availability of such computationallytractable aspects of SecurityGames led tothe development ofalgorithms capable of scalingup tolargeproblems,makingthemdeployableinthesecurityenforcingsystemsofseveralreal-worldapplications. The first notable examples are the deployment ofpolice checkpoints atthe Los AngelsInternationalAirport [4] andthe schedulingoffederalairmarshalsovertheU.S.domesticairlineflights[5].Morerecentcasestudiesincludethepositioning of U.S.CoastGuardpatrols to securecrowdedplaces,bridges, andferries [6]andthearrangementof cityguards tostop fare evasioninLos AngelesMetro [7].Finally,a similar approachisbeingtested andevaluated inUganda, Africa, forthe

*

Corresponding author.

E-mailaddress:[email protected](N. Basilico). http://dx.doi.org/10.1016/j.artint.2017.02.007 0004-3702/©2017 Elsevier B.V. All rights reserved.

(2)

protection of wildlife [8]. Thus, Security Games emerged asan interesting game theoretical tool andthen showed their on-the-fieldeffectivenessinanumberofrealsecurityscenarios.

We focus ona specific classof security games, calledPatrolling Security Games. Thesegames are modeled as infinite-horizon extensive-form games in which the Defender controls one or more patrollers moving within an environment, represented asa finite graph.The Attacker, besides havingknowledge of the strategy to whichthe Defender committed to, can observethe movementsof thepatrollers atany time andusesuch informationin decidingthe mostconvenient timeandtargetlocationtoattack[9].Whenmultiplepatrollersareavailable,coordinatingthematbestisingeneralahard taskwhich,besidescomputationalaspects,mustalsokeepintoaccountcommunicationissues[10].However,thepatrolling problemis tractable, even withmultiple patrollers,in border security(e.g., lineand cyclegraphs), when patrollershave homogeneousmovingandsensingcapabilitiesandalltheverticescomposingthebordersharethesamefeatures[11]. Scal-ing thismodelinvolved thestudy ofhowto compute patrollingstrategies inscenarios wherethe Attacker isallowed to performmultipleattacks[12].Similarly,coordinationstrategiesamongmultipledefendersare investigatedin[13].In[14], theauthors studythecaseinwhich thereisa temporaldiscount onthe targets. Extensions arediscussed in[15], where coordination strategiesbetweendefenders areexplored, in[16],wherea resourcecan covermultipletargets, andin[17]

whereattackscanbe detectedatdifferentstageswithdifferentassociated utilities.Finally,some theoreticalresultsabout propertiesofspecific patrollingsettingsareprovided in[18]. Inthepresentpaper,weprovide anewmodelofPatrolling SecurityGamesinwhichtheDefenderissupportedbyan

alarm system deployed

intheenvironment.

1.1. Motivating scenarios

Often,inlargeenvironments,aconstantsurveillanceofeveryareaisnotaffordablewhilefocused inspectionstriggered by alarms are more convenient.Real-world applications includeUAVs surveillance oflarge infrastructures [19], wildfires detectionwithCCDcameras[20],agriculturalfieldsmonitoring[21],surveillancebased onwirelesssensornetworks[22], andborderpatrolling[23].Alarmsystemsareinpracticeaffectedby

detection uncertainty,

e.g.,misseddetectionsandfalse positives,and

localization (a.k.a.

spatial)uncertainty,e.g.,thealarmsystemisuncertainabouttheexacttargetunderattack. Wesummarilydescribetwopracticalsecurityproblemsthatcanbeascribedtothiscategory.Wereportthemasexamples, presentingfeatures andrequirementsthatourmodelcan properlydealwith. Intherestofthe paperwewill necessarily take a generalstance,butwe encourage thereaderto keep inmind thesetwo casesasreferenceapplications fora real deploymentofourmodel.

1.1.1. Fight to illegal poaching

Poachingisawidespreadenvironmentalcrimethatcausestheendangermentofwildlifeinseveralregionsoftheworld. Itsdevastatingimpactmakesthedevelopmentofsurveillancetechniquestocontrastthiskindofactivitiesoneofthemost importantmattersinnationalandinternationaldebates. Poachingtypicallytakesplaceovervastandsavageareas,making itcostly andineffective tosolelyrelyon persistentpatrolby rangersquads.Toovercomethisissue,recentdevelopments havefocused onprovidingrangers withenvironmental monitoringsystemstobetter plantheir inspections, concentrating theminareaswithlargelikelihoodofspottingacrime.SuchsystemsincludetheuseofUAVsflyingoverthearea,alarmed fences, andon-the-field sensors trying to recognize anomalous activities.1 In all these cases, technologiesare meant to

workasanalarmsystem:oncetheillegalactivityisrecognized,asignal issenttotherangersbasestationfromwherea responseisundertaken.Inthegreatmajorityofcases,asignalcorrespondstoaspatiallyuncertainlocalizationoftheillegal activity.Forexample,acamera-equippedUAVcanspotthepresenceofapickupinaforbiddenarea butcannotderivethe actuallocationtowhichpoachersaremoving.Inthesameway,alarmed fencesandsensorscanonlytransmitthelocation ofviolated entrancesorforbiddenpassages.Inall thesecasesasignal impliesarestricted,yetnotprecise,localizationof thepoachingactivity.TheuseofSecurityGamesinthisparticulardomainisnotnew(see,forexample,[8]).However, our model allows thecomputation of alarm responsestrategies fora given alarmsystem deployed on thefield. This can be doneby adoptingadiscretizationoftheenvironment,whereeach targetcorrespondstoasector, valuesare relatedtothe expectedpopulationofanimalsinthatsector,anddeadlinesrepresenttheexpectedcompletiontimeofillegalhunts(these parameterscanbederivedfromdata,asdiscussedin[8]).

1.1.2. Safety of fair sites

Fairsarelargepubliceventsattendedbythousandsofvisitors,wheretheproblemofguaranteeingsafetyforthehosting facilitiescanbeveryhard.Forexample,Expo2015,therecentUniversalExpositionhostedinMilan,Italy,sawanaverageof about100,000visitsperday.Thisposestheneedforcarefullyaddressing safetyrisks,whichcanalsoderivefromplanned act of vandalism or terrorist attacks. Besides security guards patrols, fair sites are often endowed with locally installed monitoringsystems.Expo2015employedaround200bafflegatesand400metaldetectorsattheentranceofthesite.The internalareawasconstantlymonitoredby4000surveillancecamerasandby700guards.Likely,whenoneormoreofthese devices/personnelidentifiedasecuritybreach,asignalwassenttothecontrolroomtogetherwithacircumscribedrequest of intervention. This approach is required because, especially in this kind of environments, detecting a security breach

(3)

andneutralizing itare verydifferenttasks. Thelatterone,in particular,usually requiresagreater effort involvingspecial equipmentandpersonnelwhoseemploymentonademandbasisismuchmoreconvenient.Moreover,thedetectinglocation ofa threatisinmanycasesdifferentfromthelocation whereitcouldbe neutralized,making therequestofintervention spatially uncertain.Forinstance, consider asecurity guard ora surveillance camera detecting the visitors’ reactions to a shootingrampageperformedbysomeattacker.Inexampleslikethese, wecanrestrict theareawherethesecuritybreach happenedbutnopreciseinformationaboutthelocationcanbegatheredsincetheattackerwillprobablyhavemoved.Our modelcouldbe appliedtoprovideapolicy withwhichscheduleinterventionsuponasecuritybreachisdetectedinsome particular section ofthesite. Insuch case, targetscouldcorrespond tobuildings orother installationswhere visitorscan go.Values anddeadlinescanbechosen accordingtotheimportance oftargets, theirexpectedcrowding,andtherequired responsepriority.

1.2. Alarms and security games

While theproblemofmanaging a sensornetworkto optimallyguard security-criticalinfrastructures hasbeen investi-gatedinrestricteddomains,e.g.[24],theproblemofintegratingalarmsignalstogetherwithadversarialpatrollingisalmost completelyunexplored.Theonlyworkthatcanbeclassifiedunderthisscopeis[25].Thepaperproposesaskeletonmodel of analarm systemwheresensors havenospatial uncertaintyindetecting attackson single targets. Theauthors analyse howsensoryinformationcanimprovetheeffectivenessofpatrollingstrategiesinadversarialsettings.Theyshowthat,when sensors arenotaffectedby falsenegativesandfalse positives,thebeststrategyprescribes thatthepatrollerjustresponds toan alarmsignalrushingtothetarget underattack withoutpatrollingtheenvironment.Asaconsequence,insuch cases themodeltreatmentbecomes trivial.Ontheother hand,whensensorsareaffectedonlyby falsenegatives,thetreatment canbecarriedoutbymeansofaneasyvariationofthealgorithmforthecasewithoutsensors[9].Inthelastcase,where false positivesareadmitted,theproblembecomescomputationally intractable.Tothebestofourknowledge,noprevious resultdealingwithspatialuncertainalarmsignalsinadversarialpatrollingispresentintheliterature.

Effectively exploitinganalarm systemanddetermining agood deploymentforit(e.g.,selecting thelocation whereto installsensors)arecomplementarybutradicallydifferentproblems.Theresultsweprovideinthisworklieinthescopeof the firstone while thetreatment ofthesecond one isleft forfuture works.In other words,we assume thata deployed alarm systemis givenandwe dealwiththeproblemofstrategicallyexploitingitatbest.Anyapproach tosearch forthe optimaldeploymentshouldknow,inprinciple,howtoevaluatepossibledeployments.Insuchsense,ourproblemneedsto beaddressedbeforeonemightdealwiththedeploymentone.

1.3. Contributions

In thispaper,wepropose thefirst SecurityGamethat integratesaspatially uncertainalarmsystemin game-theoretic settings forpatrolling.2 Analarmsignalcarriestheinformationaboutthesetoftargetsthatcan beunderattackanditis

describedbytheprobabilityofbeinggeneratedwheneachtargetisattacked.Theanalysisandthemainresultswepropose inthisworkaredevotedtothebasicgamemodelwheretheDefendercancontrolasinglepatrollerandthealarmsystem is immune tofalse positivesandfalse negatives,making spatialuncertainty itsonly limitation.As ourresultsshow,such assumptions donot playdownthe significanceofthearisingcomputational challengeswhoseresolutionisaprerequisite formorecomplexsettings.Indeed,extensionsofthismodelthatgeneralizetomulti-resourcesettings[27]andthatconsider falsenegatives[28]arebuiltonthebasicresultprovidedinthiswork.Thegameweconsidercanbedecomposedinafinite numberoffinite-horizonsubgames,eachcalledSignalResponseGamefrom

v (SRG-v)

andcapturingthesituationinwhich theDefenderisinavertexv and theAttackerattackedatarget,andaninfinite-horizongame,calledPatrollingGame(PG), in whichtheDefendermovesinabsence ofanyalarmsignal. Weshow thatSRG-v is FNP-hardfortree-basedtopologies andthat,forarbitrarygraphs,isAPX-hard.Weprovidetwoexactalgorithms.Thefirstone,basedondynamicprogramming, performsabreadth-firstsearch,whilethesecondone,basedonbranch-and-boundapproach,performsadepth-firstsearch. Weusethesametwoapproachestodesigntwoapproximationalgorithms.

Then,westudythePG,showingthatwhennofalsepositivesandnomisseddetectionsarepresent,theoptimalDefender strategy isto stayina fixedlocation, waitforasignal, andrespondtoitatbest.Thisstrategy keepsbeingoptimaleven whennon-negligiblemisseddetectionratesareallowed.Weexperimentallyevaluatethescalabilityofourexactalgorithms andwecomparethemwiththeapproximation onesintermsofsolutionqualityandcomputetimes,investigatinginhard instances thegapbetweenourhardness resultsandthetheoreticalguaranteesofourapproximationalgorithms.Weshow thatourapproximationalgorithmsprovideveryhighqualitysolutionseveninhardinstances.Finally,weprovideanexample ofresolutionforarealisticinstance,basedonExpo2015,andweshowthat ourexactalgorithmscanbeappliedforsuch kind ofsettings.Moreover, inourrealistic instancewe assesshow theoptimalpatrolling strategycoincides witha static placementevenwhenallowingafalsenegativerateoflessorequalto30%.

(4)

Fig. 1. Example of patrolling setting. 1.4. Paper structure

InSection2,weintroduceourgamemodel.InSection3,westudytheproblemoffindingthestrategyoftheDefender forrespondingtoanalarmsignal.InSection4,westudythepatrollingproblem. InSection5,weexperimentallyevaluate our algorithms.In Section 6, we briefly discussthe mainSecurity Games research directions that havebeenexplored in thelastdecades.Finally,Section7concludesthepaper.

Appendix A

includesthemosttechnicalproofsofthepaperwhile

Appendix Bdiscussessomeparticularresultsobtainedforspecialclassesofinstances.

Appendix C

reportssome additional experimentalresultsand

Appendix D

providesatablesummarizingthenotationsymbolsusedinthepaper.

2. Problemstatement

Inthissectionweformalizetheproblemwestudy.Moreprecisely,inSection2.1wedescribethepatrollingsettingand thegamemodel,whileinSection2.2westatethecomputationalquestionsweshalladdressinthiswork.

2.1. Game model

Initially,inSection2.1.1,weintroduceabasicpatrollingsecuritygamemodelintegratingthemainfeaturesfrommodels currently studied in literature. Next, in Section 2.1.2, we extend our game model by introducing alarm signals. In Sec-tion 2.1.3, we depict the game tree ofour patrolling security game withalarm signalsand we decompose it in notable subgamestofacilitateitsstudy.Toeasepresentation,wesummarizedournotationsymbolsin

Table D.3

.

2.1.1. Basic patrolling security game

As iscustomary inthe Artificial Intelligenceliterature [9,14], we dealwithdiscrete, both intermsof spaceandtime, patrollingsettings,representinganapproximationofacontinuousenvironment.Specifically,wemodeltheenvironmentto be patrolledasan undirectedconnectedgraph G

= (

V

,

E

)

,wherevertices representdifferentareasconnectedby various corridors/roads represented through theedges. Timeis discretized inturns. Edges areassigned weights representingthe numberoftimestepsneededtotraversethem.Ifnotstatedotherwise,weshallassumeunitaryweights.With

ω

i,j weshall denotethe shortesttime togo fromi to j. Wedefine T

V the subset ofsensiblevertices,called

targets,

that mustbe protectedfrompossibleattacks.Eachtarget

t

T is characterizedbyavalue

π

(

t

)

∈ (

0

,

1

]

andapenetrationtime

d

(

t

)

∈ N

+, whichmeasuresthenumberofturnsneededtocompleteanattackto t.

Example1. We report in Fig. 1 an example of patrolling setting. Here, V

= {

v0

,

v1

,

v2

,

v3

,

v4

}

, T

= {

t1

,

t2

,

t3

,

t4

}

where

ti

=

vifor

i

∈ {

1

,

2

,

3

,

4

}

.Allthetargets

t present

thesamevalue

π

(

t

)

andthesamepenetrationtime

d

(

t

)

. Ateachturn,an

Attacker

A

anda

Defender

D

playsimultaneouslyhavingthefollowingavailableactions:

if

A

hasnot attackedinthepreviousturns,it canobservethepositionof

D

inthegraph

G

3 anddecide whetherto

attackatargetortowaitfora turn.Theattackisinstantaneous,meaningthatthereisnodelaybetweenthedecision toattackandtheactualpresenceofathreatintheselectedtarget4;

D

hasnoinformationabouttheactionsundertakenby

A

inpreviousturnsandselectsthenextvertextopatrolamong thoseadjacenttothecurrentone;eachmovementisanon-preemptivetraversalofasingleedge

(

v

,

v

)

E.

Thegamemayconcludeincorrespondenceofanyofthetwofollowingevents.Thefirstoneiswhen

D

patrolsatarget

t that isunderattackby

A

fromlessthan

d

(

t

)

turns.Insuch casetheattack ispreventedand

A

iscaptured.Thesecond 3 Partial observability of Aover the position of Dcan be introduced, as discussed in[29].

4 This is a worst-case assumption according to which Ais as strong as possible. It can be relaxed by associating execution costs to the A’s actions, as

(5)

Fig. 2. Examples of alarm systems.

one iswhentarget

t is

attackedand

D

doesnotpatrol

t during

the

d

(

t

)

turnsthat followthebeginning oftheattack. In such case, theattack issuccessfuland

A

escapes withoutbeingcaptured.When

A

iscaptured,

D

receivesa utility of1 and

A

receives autility of0.When anattack to

t has

success,

D

receives1

π

(

t

)

and

A

receives

π

(

t

)

.The gamemay notconcludeif

A

decidestowaitforeveryobservedpositionof

D

,thusneverattacking.Insuchcase,

D

receives1 and

A

receives 0.(Anotherintuitive waytothinkatthispayoffstructure istoassume that

D

receivesan initialutility of1 and then loses

π

(

t

)

whenevertarget

t is

successfullycompromised.)Notice thatthegameis constantsumandthereforeit is equivalenttoazero-sumgamethroughapositiveaffinetransformation.

The above gamemodel isinextensive form(beingplayed sequentially), withimperfectinformation (

D

not observing the actions undertaken by

A

), and withinfinite horizon(

A

beingin the position to wait forever). The fact that

A

can observetheactions undertakenby

D

beforeactingmakesthe

leader–follower equilibrium

thenaturalsolutionconceptfor ourproblem, where

D

isthe

leader and

A

isthe

follower.

Sincewe focusonzero-sum games,theleader’s strategyatthe leader–follower equilibriumis its maxminstrategy andit canbe found by employing linear mathematicalprogramming, whichrequirespolynomialtimeinthenumberofactionsavailabletotheplayers[31].

2.1.2. Introducing alarm signals

Weextendthegamemodeldescribedintheprevioussectionbyintroducinga

spatial uncertain alarm system that

canbe exploitedby

D

.The basicideais thatan alarmsystemusesa numberofsensorsspread overtheenvironment togather informationaboutpossibleattacksandraisesanalarmsignalatanytimeanattackoccurs.Thealarmsignalprovidessome information aboutthe location (target)wherethe attackis ongoing,butit isaffectedby uncertainty.In otherwords, the alarmsystemdetectsanattackbutitisuncertainaboutthetargetunderattack.Formally,thealarmsystemisdefinedasa pair

(

S

,

p

)

,where

S

= {

s1

,

· · · ,

sm

}

isasetof

m

1 signals and p

:

S

×

T

→ [

0

,

1

]

isafunctionthatspecifiestheprobability ofhavingthesystemgeneratingasignal s given thattarget

t has

beenattacked,wedenotesuch probabilitywith p

(

s

|

t

)

. With aslightabuseofnotation, forasignal

s we

define T

(

s

)

= {

t

T

|

p

(

s

|

t

)

>

0

}

and,similarly, foratarget

t we

have

S

(

t

)

= {

s

S

|

p

(

s

|

t

)

>

0

}

.Inthiswork,weassumethat:

thealarmsystemisnotaffectedbyfalsepositives(signalsgeneratedwhennoattackhasoccurred).Formally,

p

(

s

| )

=

0,where



indicatesthatnotargetsareunderattack;

thealarmsystemisnotaffectedbyfalsenegatives(signalsnotgeneratedeventhoughanattackhasoccurred).Formally,

p

(

|

t

)

=

0, where

indicates that no signals have been generated; in Section 4 we will show that the optimal strategieswecomputeunderthisassumptioncanpreserveoptimalityeveninpresenceofnon-negligiblefalsenegatives rates.

Example2.We report two examples of alarm systemsfor the patrolling settingdepicted in

Fig. 1

. The first example is reportedin

Fig. 2

(a).Itisalow-accuracyalarmsystemthatgeneratesthesamesignalanytimeatargetisunderattackand thereforethatdoesnotprovideanyinformationaboutthetargetunderattack.Thesecondexampleisreportedin

Fig. 2

(b). Itprovidesmoreaccurateinformationaboutthelocalizationoftheattackthanthepreviousexample.Here,thereceptionof asignal

s

i implies,underauniformstrategyof

A

,ahighprobabilityofanattackontarget

t

i.Namely,when

t

iisattacked thealarmsystemgenerates

s

i withhighprobabilityandadifferentsignalotherwise.

(6)

Giventhepresence ofanalarm systemdefinedasabove, thegamemechanismchanges inthefollowingway.At each turn, before deciding its next move,

D

observes whether ornot a signal hasbeen generated by the alarm systemand then makes its decision considering such information. This introduces in our game a node of chance implementing the non-deterministicselectionofsignals.

2.1.3. The game tree and its decomposition

Herewedepict thegametreeofourgamemodel,decomposing itinsomerecurrentsubgames.Aportionofthegame isdepictedin

Fig. 3

.Suchtreecanbereadalongthefollowingsteps.

Root of the tree.

A

decideswhethertowaitforaturn(thisactionisdenotedas



toindicatethatnotargetisattacked) ortoattackatarget

t

T (this actionisdenotedbythelabel

t of

theattackedtarget).

Second level of the tree.

N

denotesthealarmsystem,representedbyanature-typeplayer.Itsbehavioris

a priori specified

bytheconditional probabilitymassfunction p, whichdeterminesthegeneratedsignal giventheattackperformedby

A

.Inparticular,itisusefultodistinguishbetweentwocases:

(a) if

A

doesnotattack(action



),thennosignalwillbegeneratedundertheassumptionthat p

(

| )

=

1; (b) if

A

attackstarget

t,

asignal

s will

bedrawnfrom

S

(

t

)

withprobability

p

(

s

|

t

)

(recallthat

p

(⊥

|

t

)

=

0).

Third level of the tree.

D

observesthesignalgeneratedbythealarmsystemanddecidesthenextvertextopatrolamong thoseadjacenttothecurrentone(thecurrentvertexisinitiallychosenby

D

).

Fourth level of the tree and on. It isusefultodistinguishbetweentwocases:

(a) ifnoattackispresent,thenthetreeofthesubgamestartingfromhereisthesameofthetreeofthewholegame, exceptforthepositionof

D

thatmaybe differentfromtheinitialone(noticethatinthiscaseeachof

A

’sdecision nodesisasingletoninformationset,modelingthefactthat

A

canobserve

D

’sposition);

(b) ifanattackistakingplaceontarget

t,

thenonly

D

canact.

Suchgametreecanbedecomposedinanumberoffiniterecurrentsubgamessuchthatthebeststrategiesoftheagents ineachsubgame areindependentfromthoseinothersubgames. Thisdecompositionallowsustoapplya

divide et impera

approach,simplifyingtheresolutionoftheproblemoffindinganequilibrium.Moreprecisely,wedenotewith



oneofthese subgames.Wedefine



asagamesubtreethatcanbeextractedfromthetreeof

Fig. 3

asfollows.Given

D

’scurrentvertex

v

V , selectadecisionnodefor

A

andcallit

i.

Then,extractthesubtreerootedin

i discarding

thebranchcorrespondingto action



(noattack).5Intuitively,suchsubgamemodelstheplayers’interactionwhentheDefenderisinsomegivenvertex v and theAttacker willperforman attack. Asa consequence,each



obtainedinsuch wayis finite(once anattack on

t

started,themaximumlengthofthegameis

d

(

t

)

).Moreover,thesetof

different



swecanextractisfinitesincewehaveone subgameforeachpossiblecurrentvertexfor

D

.As aconsequence,we canextractatmost

|

V

|

differentsubgames. Notice that,duetotheinfinitehorizon,eachsubgamecanrecuran infinitenumberoftimesalongthegametree.However,being suchrepetitionsindependentandthegamezero-sum,weonlyneedtosolveonesubgametoobtaintheoptimalstrategyto beappliedineachofitsrepetitions.Inotherwords,whenassumingthatanattackwillbeperformed,theagents’strategies can be split in a number of independent strategiessolely depending on thecurrent position of

D

. The reasonwhy we discarded thebranchcorresponding toaction



ineach subgameis thatwe seektodeal withsuch complementary case exploitingasimplebackwardinductionapproach,asexplainedlater.

First,wecall

Signal

Response Game from v the subgame



definedasaboveandcharacterizedbyavertex

v representing

thecurrentvertexof

D

(forbrevity,weshalluseSRG-v).InanSRG-v,thegoalof

D

istofindthebeststrategystartingfrom vertex v torespondtoanyalarmsignal.AlltheSRG-vsareindependentandthusthebeststrategyineach subgamedoes notdependonthestrategiesoftheothersubgames.TheintuitionisthatthebeststrategyinanSRG-v doesnotdependon theverticesvisitedby

D

before theattack.Givenan SRG-v,we denoteby

σ

vD,s thestrategyof

D

onceobservedsignal

s,

by

σ

vD thestrategyprofile

σ

vD

= (

σ

vD,s1

. . . ,

σ

vD,sm

)

of

D

,andby

σ

vAthestrategyof

A

.LetusnoticethatinanSRG-v,given

asignal

s,

D

istheonlyagent thatplaysandthereforeeachsequenceofmovescan becollapsedina singleaction.Thus, SRG-v isessentiallyatwo-levelgameinwhich

A

decidesthetargettoattackand

D

decidesthesequenceofmovesonthe graph.

Then,accordingtoclassicalbackward inductionarguments[32],oncewehavefoundthebeststrategiesofeach SRG-v, we can substitute the subgameswith the agents’ equilibriumutilities andthen we can find the best strategy of

D

for patrollingthe verticeswheneverno alarmsignal has beenraised andthebest strategyof attack for

A

.We call thislast problem

Patrolling Game (for

conciseness,we shallusePG).We denoteby

σ

D and

σ

A thestrategiesof

D

and

A

, respec-tively,inthePG.

5 Rigorously speaking, our definition of subgame is not compliant with the definition provided in Game Theory[32], which requires that all the actions

of a node belong to the same subgame (and therefore we could not separate action from actions t).

However, we can slightly change the structure of

our game making our definition of subgame compliant with the one from game theory. More precisely, it is sufficient to split each node of Ainto two nodes: in the first Adecides to attack a target or to wait for one turn, and in the second, conditioned to the fact that Adecided to attack, Adecides which target to attack. This way, the subtree whose root is the second node of Ais a subgame compliant with game theory. It can be easily observed that this change to the game tree structure does not affect the behavior of the agents.

(7)

N. Basilico et al. / Artificial Int elligence 246 (2017) 220–257

Fig. 3. Gametree,v isassumedtothebecurrentvertexforD,r isacollapsedsequenceofmoves,calledroute,thatDperformsonthegraph,dashedlinesindicateinformationsets,andfunctionUA(ri,tj)

(8)

2.2. The computational questions we pose

Ourcontributionsfocusonthedesignofalgorithmstofindan equilibriumforourgameanddevelopalongfourcentral questions.Thefirstonestemsdirectlyfromourproblemdefinition.

Question1.

Which is the best patrolling strategy for

D

maximizing its expected utility?

Clearly,thisproblemisrelatedtowhatwecalledPGinourgamedecomposition.Inordertobuildan answer,wepose otherthreequestionsthat,instead,involvetheothersubgamecalledSRG-v.

Question2.

Given a starting vertex v and a signal s, is there any strategy allowing

D

to visit all the targets in T

(

s

)

, each within its deadline?

Question3.

Given a starting vertex v and a signal s, is there any pure strategy giving

D

an expected utility of at least k? Question4.

Given a starting vertex v and a signal s, is there any mixed strategy giving

D

an expected utility of at least k?

These questions are not independent from each other. In particular, answering to the last three is a prerequisite to solving the first one. For thisreason, we shall take a bottom-up approach answering the above questions starting from the last three andthen dealing withthe first one atthe whole-gamelevel. Also, Questions 3 and 4 are not easierthan

Question 2sohardnessresultsforthislastonecanbeextendedtotheothers.

3. Signalresponsegame

InthissectionwestartbydealingwiththeresolutionofSRG-v.Specifically,inSection3.1weprovethehardnessofthe problem, analyzingits computationalcomplexity.Then, inSection 3.2andinSection 3.3we proposetwo algorithms,the firstbasedon

dynamic programming (breadth-first

search)whilethesecondadoptsa

branch and bound (depth-first

search) paradigm. Furthermore,we provide a variation for each algorithm, approximating the optimal solution. For the sake of presentation,themosttechnicalproofsarereportedinAppendix A.

3.1. Complexity results

Theaimofthissectionistoassessthecomputationalcomplexityoffindinganexactorapproximateequilibriumforour gamemodel.Furthermore,weaimatidentifyingthesourceofhardnessofourproblemandthemostefficientalgorithmfor solvingit.

Each SRG-v is characterized by

|

T

|

actions available to

A

, each corresponding to a target t, and by O

(

|

V

|

maxt{d(t)}

)

decisionnodesof

D

.Theportionofgametreeplayedby

D

canbesafelyreducedbyobservingthat

D

willmovebetween anytwo targets along a shortestpath.This allows usto discardfrom thetree all the decisionnodes where

D

occupies a non-targetvertex. Nevertheless, this reduction keeps the size of the game tree exponential in the parameters of the game,specifically O

(

|

T

|

|T|

)

.6 Noticethat,theexponentialsizeofthegametreedoesnotconstituteaproofthatfindingthe equilibriumstrategiesofanSRG-v requiresexponentialtimeintheparametersofthegamebecauseitdoesnotexcludethe existence ofsome compactrepresentationof

D

’sstrategies,e.g., Markovianstrategies.Thefirst resultweprovide implies thatsucharepresentationisveryunlikelytoexistorthat,ifitexists,itunlikelycanbecomputedinpolynomialtime.Call

gv theexpectedutilityof

A

fromSRG-v (1

gv isthenthecorrespondingutilityfor

D

).Wedefinethefollowingproblem.

Definition1

(k–SRG-v).

Thedecisionproblem

k–SRG-v is

definedas: INSTANCE:aninstanceofSRG-v;

QUESTION:isthereany

σ

D suchthat,when

A

playsitsbestresponse,itholdsthat gv

k?

Theorem1.

k–SRG-v is

NP-hard even when

|

S

|

=

1 and

the graph is a tree.

7

Theaboveresultshowsthatw.r.t.treegraphs:

answeringto

Question 1

isFNP-hard,

answeringto

Questions 2,

3,4isNP-hard. 6 A more accurate bound is O(|T|min{|T|,maxt{d(t)}}).

(9)

Forthe sake ofcompleteness,noticethat a maxminstrategywithsupport upperboundedby

|

T

|

(that is,thenumberof actionsavailabletotheminplayer

A

)alwaysexists[33].

Now,giventhat weestablishedthatthemainsourceofhardnessoftheproblemiscomputingthestrategyspaceof

D

, wefocusontheproblemoffindinganefficientrepresentationforit.Westartwithsomedefinitions.

Definition2

(Route).

Givenastartingvertexv and asignal

s,

aroute

r is

afinitesequenceofverticeswhere,called

r

(

i

)

the

i-th vertex,

r

(

0

)

=

v and r

(

i

)

T

(

s

)

forany

i

>

0.Withaslightoverloadofnotation, call T

(

r

)

thesetoftargetsfromthe sequence.

We restrict ourattention toa special class ofroutes that we call covering. Fora route r and i

>

0, define Ar

(

r

(

i

))

=



i−1

h=0

ω

r(h),r(h+1).Such value gives thetime neededby

D

toarrive attarget

r

(

i

)

after following agraph walk along the

shortestpathsbetweenconsecutiveverticesofthesequence

r

(

0

),

r

(

1

),

. . . ,

r

(

i

)

.

Definition3

(Covering route).

Acoveringrouteisaroute

r where

Ar

(

t

)

d

(

t

)

forany

t

T

(

r

)

.

Coveringroutesareabstractionsfor

D

’savailablepure strategieswhenthecurrentvertexis v and asignal s has been generated.If

r is

acoveringrouteforvertex

v and

signal

s and

T

(

r

)

T

(

s

)

isthesetoftargetsin

r,

then wecanalways instantiate

r to

agraphwalkfor

D

(asequenceofmoveson G) thatguarantees tocapture

A

onanytarget in

T

(

r

)

.Such walk issimplyobtainedbystarting from

r

(

0

)

=

v and thencoveringanyshortestpathbetween

r

(

i

)

and

r

(

i

+

1

)

.Thetotal temporalcostofsuchwalkisexpressedby

c

(

r

)

=

Ar

(

r

(|

T

(

r

)|))

.Weshallinformallyrefertosuchprocessas

walking a route r.

Coveringroutesthen constitutethe actionspacefor

D

is aSRG-v game.Evenwhenconsidering asingle signal

s,

the numbersuchofsuchroutesis O

(

|

T

(

s

)

|

|T(s)|

)

intheworst case.However,somecoveringrouteswillneverbeplayedby

D

duetoanyofthefollowingtwodominancearguments[34] anddiscardingdominatedroutesiscrucialinthedesignofan efficientalgorithm.

Definition4 (Intra-set dominance). Given a starting vertex v, a signal s and two different covering routes r

,

r such that

T

(

r

)

=

T

(

r

)

,if

c

(

r

)

c

(

r

)

then

r dominates r

.

Definition5

(Inter-set dominance).

Givenastartingvertex

v,

asignal

s and

twodifferentcoveringroutes

r

,

r,ifT

(

r

)

T

(

r

)

then

r dominates r

.

Furthermore,itisconvenienttointroducetheconceptof

covering set,

whichisstrictlyrelatedtotheconceptofcovering route.Itisdefinedasfollows.

Definition6

(Covering set).

Givenastartingvertex

v and

asignal

s,

acoveringset(covset) Q is asubset ofT

(

s

)

suchthat thereexistsacoveringroute

r with T

(

r

)

=

Q .

Let usfocuson Definition 4.It suggeststhat we can safelyuseonly one route per coveringset. Coveringsets suffice for describingall the outcomesofthe game,since theagents’payoffs dependonly onthe fact that

A

attacksa target t

that is coveredby

D

ornot, andinthe worst caseare O

(

2|T(s)|

)

,witha remarkablereduction ofthe search spacew.r.t. O

(

|

T

(

s

)

|

|T(s)|

)

. However, any algorithmrestricting on covering sets should be able todetermine whether ornot a set of

targetsisacoveringone,whichisadifficultproblemaswell.

Definition7

(COV-SET).

ThedecisionproblemCOV-SETisdefinedas: INSTANCE:aninstanceofSRG-v withatargetset

T ;

QUESTION:is

T a

coveringset?(Equivalently,doesT admit anycoveringroute

r?)

Theorem2.

COV-SET is

NP-complete.

Therefore,computingacoveringrouteforagivensetoftargets(ordecidingthatnocoveringrouteexists)isnotdoable in polynomial time unlessP

=

NP. Thisshowsthat, whilecovering sets sufficefordefining the payoffsofthe gameand thereforethesizeofthepayoffsmatrixcanbeboundedbythenumberofcoveringsets,inpracticewealsoneedcovering routes to certify that a givensubset of targets is covering. The impossibility to confine our algorithms to the space of covering setsseems tosuggesta complexityworse than O

(

2|T(s)|

)

.However,inthenext section weprovidean algorithm

withcomplexity

O

(

2|T(s)|

)

(neglectingpolynomialterms)toenumerateallandonlythecoveringsetsand,foreachofthem,

theassociatedcoveringroutewithminimumcost.

Let usfocuson Definition 5. Inter-Setdominancecan be leveraged tointroduce theconcept of

maximal covering

sets (androutes)whichcouldenableafurtherreductioninthesetofactionsavailableto

D

.

(10)

Definition8

(MAXIMALITY).

Givenacoveringset Q

=

T

(

r

)

forsome

r,

wesaythat Q and r are maximalifthereisnoother coveringroute

r

suchthat Q

T

(

r

)

.

In the best case, when there is a route covering all the targets, the numberof maximal covering sets is 1 (and we can safelyrestrict to a single minimum cost coveringroute over that set), while thenumber ofcovering sets (including the non-maximal ones) is 2|T(s)|. Thus, considering only maximal covering sets allows an exponential reduction of the payoffsmatrix.In theworst case, whenall the possiblesubsetscomposed of

|

T

(

s

)|/

2 targets are maximal coveringsets, thenumberofmaximalcoveringsetsis

O

(

2|T(s)|−2

)

,whilethenumberofcoveringsetsis

O

(

2|T(s)|−1

)

,allowingareduction

ofthepayoffsmatrixby afactorof2.Furthermore,ifweknew

a priori that

Q is a maximalcoveringset,wecould avoid searching forcovering routes for anyset of targetsthat strictly contains Q . When designing an algorithm to solve this problem,

Definition 5

couldthenbeexploitedtointroducepruningtechniquestosaveaveragecomputetime.However,the followingresultshowsthatdecidingifacoveringsetismaximalishard.

Definition9

(MAX–COV-SET).

ThedecisionproblemMAX–COV-SETisdefinedas:

INSTANCE:aninstanceofSRG-v withatargetset T and acoveringset

T



T ;

QUESTION:is Tmaximal?

Theorem3.

There is no polynomial-time algorithm for MAX–COV-SET unless

P

=

NP.

Nevertheless,weshow hereafter thatthereexists analgorithm computingallandonlythemaximal coveringsetsand oneroute foreach ofthem (whichpotentiallyleads toanexponentialreduction ofthetime neededforsolvingthelinear program) with only an additional polynomial cost w.r.t. the enumeration of all the covering sets. Therefore, neglecting polynomialterms,ouralgorithmhasacomplexityof

O

(

2|T(s)|

)

.

Finally,wefocuson thecomplexityofapproximatingthebest solutioninanSRG-v.When

D

restrictsitsstrategiesto bepure,theproblemisclearlynotapproximableinpolynomialtimeevenwhentheapproximationratiodependson

|

T

(

s

)|

. Thebasicintuitionisthat,ifagameinstanceadmitsthemaximal coveringroutethatcoversall thetargetsandthevalue ofall the targetsis 1,then eitherthe maximalcovering route is played returninga utility of1 to

D

oranyother route isplayed returninga utilityof 0,butno polynomial-timealgorithmcanfindthe maximalcovering routecovering allthe targets,unlessP

=

NP.Ontheotherhand,itisinterestingtoinvestigatethecaseinwhichnorestrictiontopurestrategies ispresent.Weshowthattheproblemkeepsbeinghard.

Theorem4.

The

optimization version of k–SRG-v, say OPT–SRG-v, is APX-hard even in the restricted case in which the graph is metric, there is only one signal s, all targets t

T

(

s

)

have the same penetration time d

(

t

)

, and there exists a maximal covering route covering all the targets.

Theabovetheoremallowsustomaketwoimportantremarks.

Remark1. The above result does not exclude the existence of constant-ratio approximation algorithms for OPT–SRG-v.

We conjecture that itis unlikely.OPT–SRG-v presents similarities withthe (metric)DEADLINE–TSP, where thegoal isto findthelongestpathofverticeseach traversed beforeitsdeadline. TheDEADLINE–TSPdoesnot admitanyconstant-ratio approximationalgorithm [35]andthebest-knownapproximationalgorithm haslogarithmicapproximation ratio[36].The followingobservationscanbeproducedabouttherelationshipsbetweenOPT–SRG-v andDEADLINE–TSP:

whenthe maximal route coveringall the targetsintheOPT–SRG-v exists, the optimalsolutionof theOPT–SRG-v is alsooptimalfortheDEADLINE–TSPappliedtothesamegraph;

when themaximal route covering all thetargets inthe OPT–SRG-v doesnot exist,the optimalsolutions ofthe two problemsaredifferent,evenwhenwerestrictustopure-strategysolutionsfortheOPT–SRG-v;

approximatingtheoptimalsolutionoftheDEADLINE–TSPdoesnotgiveadirecttechnique toapproximateOPT–SRG-v, since we should enumerateall the subsets of targetsand foreach subset of targetswe wouldneed to execute the approximation of the DEADLINE–TSP, butthis would require exponential time. We notice in addition that even the totalnumberofsetsoftargetswithlogarithmicsizeisnotpolynomial,being

(

2log2(|T|)

)

,andthereforeanyalgorithm

enumeratingthemwouldrequireexponentialtime;

whentheoptimalsolutionoftheOPT–SRG-v israndomized,examplesofoptimalsolutionsinwhichmaximalcovering routes are not played can be produced, showing that at the optimum it is not strictly necessary to play maximal coveringroutes,butevenapproximationssuffice.

Remark2.Ifitwerepossibletomap DEADLINE–TSPinstancestoOPT–SRG-v instanceswherethemaximalcoveringroute

covering all thetargetsexists, then itwould trivially followthat OPT–SRG-v doesnot admit anyconstant-approximation ratio.Wewerenotabletofindsuchamappingandwe conjecturethat,ifthereisan approximation-preservingreduction

(11)

fromDEADLINE–TSPtoOPT–SRG-v,thenwecannotrestrict tosuch instances.ThestudyofinstancesofOPT–SRG-v where mixedstrategiesmaybeoptimalmakethetreatmentverychallenging.

3.2. Dynamic-programming algorithm

Westartbypresentingtwoalgorithms.Thefirstoneisexact,whilethesecondoneisanapproximationalgorithm.Both algorithmsarebasedonadynamicprogrammingapproach.

3.2.1. Exact algorithm

Inthissectionweprovideanalgorithmtocomputethestrategiesavailableto

D

when

v is

thecurrentvertexandsignal

s has beengeneratedbythealarmsystem.Theideaistoadoptadynamicprogrammingmethodthatenumeratescovering setsofincreasingcardinalities.Eachcoveringsetisobtainedbyasequenceofexpansionsthat,startingfromtheemptyset, add onetarget ateach iteration.Wedenoteacovering setby Qk

v,t where

k indicates

itscardinalitywhile v and t denote the startingvertexof

D

andthelasttarget addedtothe set,respectively.The algorithmoperatesinsuch awaythat the sequence oftargetscorresponding to theexpansionsmadeto obtain Qvk,t isa covering routeforthat set.Informally,we shall call it the

generative route of

Qkv,t and we willdenote withc

(

Qkv,t

)

its temporalcost. We choose toobtain thisby requiringanyexpansiontobe admissiblewithrespectto threeconditions.Givenaset Qk

v,t a newset Qvk,+w1

=

Qkv,t

∪ {

w

}

canbeobtainedif:

1. w isnotcoveredinthecurrentcoveringset,thatis

w

T

(

s

)

\

Qk v,t;

2. w canbecoveredby

d

(

w

)

byextendingthegenerativerouteof Qvk,t withashortestwalkfrom

t to w,

thatis

d

(

w

)

c

(

Qkv,t

)

+

ω

t,w;

3. call

t

atargetvisitedbyashortestpathfrom

t to w,

if

t



/

Qk

v,t thenitcannotbecovered,thatis

d

(

t

)

<

c

(

Qvk,t

)

+

ω

t,t. Conditions 1 and 2requireanyexpansion to formanew coveringset consistentwith

Definition 6

,thus ensuring the algorithm’ssoundness.Condition 3requires

Q

k

v,t tobea

proper descriptor

ofitsgenerativeroute,meaningthatsuchroute, oncewalked,coversuniquelythetargetsincludedin

Q

k

v,t andnottargetsoutsideit.Thislastrequirementisanoperational choice to reduce the numberof expansionsmadein each iteration ofthe algorithm. Consider forinstance agraph with degree boundedby 3, Condition 3allows usto generateintheworst case3 new covering setsateach expansion round insteadof

|

T

|

.NoticethatunderCondition 3thealgorithmcanstillgenerateanymaximalcoveringset.

Lemma5.

Any maximal covering set can be generated from the empty set with a sequence of admissible expansions.

Proofsketch. Consideranexpansionwhich,byaddinga target w, violatesCondition 3foratarget

t

.Such expansioncan always be split intwo admissible ones. Thefirst adding

t

,the second adding w. The samerationaleworks formultiple expansionsandmultiple

t

andclearlyappliestomaximalcoveringsetswhichareasubsetofcoveringsets.

2

Condition 3can beexploitedtoreducetherequiredcompute timebut,rigorouslyspeaking,itpresentsadrawback.To establish iftarget w can be added to set Qk

v,t the algorithmneeds tocheck every shortest pathfrom

t to

w, andsuch paths can be,inthe worstcase, exponentiallymany.We cancope withthisby adoptingthe followingsimplification. We fixasetofcanonicalshortestpathsbyrunningtheFloyd–Warshallalgorithm.Then,given

t and

w, wefetchthecanonical shortestpathbetweenthemandwecheckCondition 3assuming thatsuch pathisunique.Iftheconditionisnot verified underthisassumptionthenitisalsonot verifiedinitsoriginal definition.Ifotherwise,thecondition isverified,then the algorithm(assumingvalidityofConditions 1and 2)mightobtainacoveringsetQ

¯

k+1

v,w whichisnotaproperone,meaning that by walkingits generative route atleastone target t

∈ ¯

/

Qkv,+w1 gets covered.Let usassume that this isthe caseand, w.l.o.g., that

t

 isthe onlyinvalidating target. Thealgorithm’s currentsolution representation(the set Q

¯

k+1

v,w) would then be inconsistent withthe actual solution(the generative route).By Lemma 5 though, weknow that thealgorithm would generatealsoset Qkv+,w2 makingtwoadmissibleexpansionswith

t

and

w to Q

kv,t.Since Qvk+,w2

= ¯

Qvk,+w1

∪ {

t

}

thenon-proper covering set Q

¯

k+1

v,w isremovableby inter-set dominance(Definition 5). Obviously thisworkaroundcomes atan additional cost:thealgorithmunnecessarilygeneratestheset Q

¯

vk+,w1 whichundertheproperdefinitionofCondition 3wouldhavebeen avoided.Still,theabovesolutionturnedoutveryconvenientsinceexponentialmultiplicityofshortestpathsisveryunlikely ingraphsrepresentingrealenvironments.

In Algorithm 1 we provide full technical details. Covering sets obtainedby the algorithm are groupedin collections:

C

k

v,t denotes thecollectionofallcoveringsets ofcardinality

k where

thelastexpansionaddedtarget

t.

Aftertherequired initializationsteps(Lines1and2)foranygenerated

Q

vk,t1wecomputethesetofadmissibleexpansions

Q

+(Line6)andwe applyeachoneofthem(Line8).InStep9,wemakeuseofaprocedurecalled

Search

(

Q

,

C)

where

Q is

acoveringsetand

C

isacollectionofcoveringsets.Theprocedureoutputs

Q if

Q

C

and

otherwise.(Weadoptedanefficientimplementation ofsuch procedurewhichcanrunin O

(

|

T

(

s

)

|)

.Moreprecisely,werepresentacoveringset Q as abinary vectoroflength

(12)

|

T

(

s

)

|

wherethe

i-th

componentisset to1 iftarget ti

Q and 0 otherwise.A collectionofcovering sets C can thenbe representedasa binary treewith depth

|

T

(

s

)|

.The membershipof acovering set Q to collectionC is represented with a branchofthetreeinsuch awaythat if

t

i

Q then wehavea left edgeatdepth

i

1 onsuch branch.We caneasily determineif Q

C

bycheckingiftraversingaleft(right)edgeinthetreeeachtimeweread a1 (0)in Q ’s binaryvector wereachaleafnodeatdepth

|

T

(

s

)

|

.Theinsertionofanewcoveringsetinthecollectioncanbedoneinthesamewayby traversingexistingedgesandexpandingthetreewherenecessary.)Ifthenewlygeneratedcoveringsetisnotpresentinits collectionorisalreadypresentwithahighercost(Step10),thencollectionandcostareupdated(Steps11and12).

Algorithm1DP-ComputeCovSets(v

,

s). 1:∀tT(s): C0v,v= {∅} 2: c(∅)=0 3: for all k∈ {1 . . .|T(s)|}do 4: forall tT(s)do 5: forall Qkv,t1∈ C k−1 v,t do

6: Q+= {wT(s)|Conditions 1–3 are satisfied}

7: forall wQ+do 8: Qk v,w=Qkv,t1∪ {w} 9: U=Search(Qk v,w,Ckv,w) 10: if c(U)>c(Qvk,t1)+ωt,wthen 11: Ck v,w= Ckv,w∪ {Qvk,w} 12: c(Qk v,w)=c(Qkv,t1)+ωt,w 13: return{Ck v,t:tT(s),k≤ |T(s)|}

After

Algorithm 1

completeditsexecution,foranyarbitrary

T



T we caneasilyobtainthetemporalcostofitsshortest coveringrouteas

c

(

T

)

=

min

QY|T|c

(

Q

)

where Y|T|

= ∪

tT

{

Search

(

T

,

C

|T

|

v,t

)}

(notice that ifT isnot acovering setthen c

(

T

)

= ∞

).Forthe sakeof simplicity,

Algorithm 1doesnotspecifyhowtocarryouttwosub-taskswedescribeinthefollowing.

Thefirstoneisthe

annotation of dominated covering sets.

EachtimeSteps11and12areexecuted,acoveringsetisadded tosomecollection.Letuscallit Q and assumeithascardinality

k.

Eachtimeanew

Q has

tobeincludedatcardinality

k,

wemarkallthecoveringsetsatcardinality

k

1 thataredominatedby Q (Definition 5).Thenumberofsetsthatcanbe dominatedisintheworstcase

|

Q

|

sinceeachofthemhastobesearchedincollection

C

kv,t1foreachfeasibleterminal

t and,

iffound,markedasdominated.Thenumberofterminaltargetsandthecardinalityof Q are atmost

n and,

asdescribed above,the Search procedure takes O

(|

T

(

s

)|)

.Therefore,dominatedcoveringsetscanbeannotatedwitha O

(|

T

(

s

)|

3

)

extra cost ateachiteration of

Algorithm 1

.We canonly markandnotdeletedominated coveringsets sincethey cangenerate non-dominatedonesinthenextiteration.

The secondtaskis the

generation of routes.

Todothiswe maintaina listofgeneratingroutes byiteratively appending terminal vertex w to the generative route of Qvk,t1 when set Qkv,t1

∪ {

w

}

is included inits corresponding collection.At theendofthealgorithmonlyroutesthatcorrespondtonon-dominatedcoveringsetsarereturned.Maintainingsuch alist introducesa O

(

1

)

cost.

Theorem6.Algorithm 1is an exact algorithm and has worst-case complexity of O

(

|

T

(

s

)

|

22|T(s)|

)

since it has to compute covering sets up to cardinality

|

T

(

s

)

|

. With annotations of dominances and routes generation the whole algorithm yields a worst-case complexity of O

(|

T

(

s

)|

52|T(s)|

)

.

Noticethatthealgorithmisexactsinceitisbasedonanenumerationprocedure.

Example3.We providea simpleexampleof executionof

Algorithm 1

. Considera probleminstancewitha single signal, arbitrarytargetvalueswhiletopologyandpenetrationtimesareasfollow:

(13)

Wereporttheexpansionsmadebythealgorithmforincreasingcardinalities(valueof

k)

inthefollowingtable. k=0 k=1 k=2 k=3 ∅,c=0 Q1 v,t3= {t3},c=1 Q 2 v,t1= {t1,t3},c=2 Q 3 v,t2= {t1,t2,t3},c=3 Q3 v,t4= {t1,t3,t4},c=4 Q2 v,t2= {t2,t3},c=2 Q 3 v,t1= {t1,t2,t3},c=3 Q3 v,t4= {t2,t3,t4},c=4 Q2 v,t4= {t3,t4},c=2 Q 3 v,t1= {t1,t3,t4},c=4, Q3 v,t2= {t2,t3,t4},c=4,Q1 v,t4= {t4},c=1 Q 2 v,t3= {t3,t4},c=2 Q 3 v,t1= {t1,t3,t4},c=3, Q3 v,t2= {t2,t3,t4},c=3,

NoticethatConstraint 3intervenesbothwhenexpandingcoveringsetswith

k

=

1 and

k

=

2.Inthetable,

c indicates

the temporal costofthe relativecovsetwhilethecovsetmarked with



dominatestheone markedwith

whilethecovset markedwith



dominatestheonemarkedwith



.

3.2.2. Approximation algorithm

The dynamicprogrammingalgorithmpresentedintheprevious section cannotbe directlyadoptedtoapproximatethe maximal coveringroutes. Wenotice that even inthe casewe introduce a logarithmic upper bound over thesize of the coveringsetsgeneratedby

Algorithm 1

,wecouldobtainanumberofroutesthatis

O

(

2log2(|T(s)|)

)

,andthereforeexponential.

Thus, ourgoalistodesigna polynomial-timealgorithm thatgeneratesa polynomialnumberof

good covering

routes. We observethatifwehaveatotalorderovertheverticesandweworkoveracompletegraphofthetargetswhereeachedge correspondstotheshortestpath,wecanfindinpolynomialtimethemaximalcoveringroutessubjecttotheconstraintthat, givenanypairoftargets

t

,

tinaroute,

t can

precede

t

intherouteonlyif

t precedes t

 intheorder.Wecall

monotonic

a routesatisfyingagiventotalorder.

Algorithm 2

returns themaximalmonotoniccoveringrouteswhenthetotalorderis lexicographic(trivially,tochangetheorder,itissufficienttore-labelthetargets).

Algorithm 2isbasedondynamicprogrammingandworksasfollows.R

(

k

,

l

)

isamatrixstoringineachcellone route, while L

(

k

,

l

)

isamatrixstoringineachcellthemaximumlatenessofthecorrespondingroute(seebelowforthemeaning of

k and l).

Themaximumlatenessofaroute

r captures

themaximumdelaybetweenatarget’s firstvisitanditsdeadline. Formally, itisdefinedasmaxtT(s)Ar

(

t

)

d

(

t

)

. Theroute storedin R

(

k

,

l

)

isthe onewiththe minimumlatenessamong all the monotonic onescovering l targets wheretk is thefirst visitedtarget. Thus, basically,when l

=

1, R

(

k

,

l

)

contains theroute



v

,

tk



,while,when

l

>

1,

R

(

k

,

l

)

isdefinedappendingto R

(

k

,

1

)

thebest(intermsofminimizingthemaximum lateness)route R

(

k

,

l

1

)

forevery

k



>

k, inordertosatisfy thetotalorder.The wholesetofroutesin R are returned.8 Thecomplexityof

Algorithm 2

is O

(|

T

(

s

)|

3

)

,exceptthetimeneededtofindalltheshortestpaths.

Algorithm2MonotonicLongestRoute(v

,

s). 1:∀k,k∈ {1,2,. . . ,|T(s)|},R(k,k)= ∅,L(k,k)= +∞,CR(k)= ∅,CL(k)= +∞ 2: for allk∈ {|T(s)|,|T(s)|−1,. . . ,1}do 3: foralll∈ {1,2,. . . ,|T(s)|}do 4: if l=1 then 5: R(k,l)= v,tk 6: L(k,l)=ωv,tkd(tk) 7: else 8: forall ks.t. |T(s)|≥k>|T(s)|−k do 9: CR(k)= R(k,1),R(k,l−1) 10: CL(k)=max{L(k,1),ωv,tk+ωtk,k−ωv,k+L(k,l−1)} 11: j=arg minj{CL(j)} 12: if CL(j)0 then 13: R(k,l)CR(j) 14: L(k,l)CL(j) 15: return R

We usedifferenttotalorders(breakingtiesrandomly)over thesetoftargets,collecting all theroutesgeneratedusing eachtotalorder:

increasingorder

ω

v,t:therationaleisthattargetscloseto v will bevisitedbeforetargetsfarfrom

v;

increasingorder

d

(

t

)

:therationaleisthattargetswithshortdeadlineswillbevisitedbeforetargetswithlongdeadlines;

increasing order

d

(

t

)

ω

v,t (wecallthisquantity

excess time):

therationaleisthattargetswithshortexcesstimewill bevisitedbeforetargetswithlongexcesstime.

8 We notice that dominance can be applied to discard dominated routes. However, in this case, the improvement would be negligible since the total

(14)

Algorithm3Branch-and-Bound(v,

s, ρ

). 1: C Lmax← ∅ 2: C Lmin← ∅ 3: for all tT(s)do 4: ifωv,td(t)then 5: Tree-Search(ρ· |T(s)|,v,t) 6: return C Lmax

Inaddition,weusea randomrestartgeneratingrandompermutationsoverthetargets.

Theorem7.Algorithm 2provides an approximation with ratio

(

|T1(s)|

)

.

Proofsketch. Theworstcasefortheapproximationratioofouralgorithmoccurswhenthecoveringrouteincludingallthe targetsexistsandeachcoveringroute returnedby ourheuristic algorithmvisitsonlyonetarget. Inthatcase,the optimal expectedutilityof

D

is1.Ouralgorithm,intheworst caseinwhich

π

(

t

)

=

1 foreverytarget

t,

returnsanapproximation ratio

(

|T1(s)|

)

.Itisstraightforwardtoseethat,inothercases,theapproximationratioislarger.

2

3.3. Branch-and-bound algorithms

Thedynamicprogrammingalgorithmpresentedintheprevioussectionessentiallyimplementsabreadth-firstsearch.In some specificsituations,depth-firstsearchcould outperformbreadth-firstsearch,e.g.,whenpenetration timesare relaxed andgoodheuristicsleadadepth-firstsearchtofindinabrieftimethemaximalcoveringroute,avoidingtoscanan expo-nentialnumberofroutesasthebreadth-first searchwoulddo. Inthissection,we adoptthebranch-and-bound approach to design both an exact algorithm andan approximation algorithm. Inparticular, in Section 3.3.1 we describe ourexact algorithm,whileinSection3.3.2wepresenttheapproximationone.

3.3.1. Exact algorithm

Ourbranch-and-boundalgorithm(see

Algorithm 3

)isatree-searchbasedalgorithmworkingonthespaceofthecovering routesandreturningasetofcoveringroutesR. Itworksasfollows.

Initialstep. Weexploittwoglobalsetvariables,

C L

min and

C L

maxinitiallysettoempty(Steps 1–2of

Algorithm 3

).These variables contain

closed covering

routes,namely coveringrouteswhich cannot befurther expandedwithout violatingthe penetrationtimeofatleastonetargetduringthevisit.

C L

maxcontainsthecoveringroutesreturnedbythealgorithm,while

C Lmin is usedforpruningasdiscussedbelow.Givena starting vertexv and a signal s, foreach target

t

T

(

s

)

such that

ω

v,t

d

(

t

)

wegenerateacoveringroute

r

= 

v

,

t



with

r

(

0

)

=

v and r

(

1

)

=

t (Steps 5 of

Algorithm 3

).Thus,

D

hasatleast onecoveringroute pertargetthat canbecoveredintimefrom v. Noticethatif, forsome

t,

such minimalroutedoesnot exist,then target

t cannot

becovered (even theshortestpathfromthe starting vertexv cannot guaranteecapture).This doesnotguaranteethat

A

willattack

t with

fullprobabilitysince,dependingonthevalues

π

,

A

couldfindmoreprofitable torandomizeoveradifferentsetoftargets.Themeaningofparameter

ρ

(usedinLine 5of

Algorithm 3

)isdescribedbelow.

Routeexpansions. Thesubsequentstepsessentiallyevolveoneachbranchaccordingtoadepth-firstsearchwith back-trackinglimitedby

ρ

.The choiceof

ρ

directlyinfluences thebehavior ofthealgorithmandconsequently itscomplexity. Eachnode inthesearch treerepresentsa route

r built

sofarstarting froman initialroute



v

,

t



.Ateach iteration,route

r is expandedbyinsertinga newtargetataparticularposition.Wedenotewith

r

+

(

q

,

p

)

theroute obtainedbyinserting target q after the p-th target in r. Notice that every expansion of r will preserve the relative orderwith which targets alreadypresentin

r will

bevisited.Thecollectionofallthefeasibleexpansions

r

+s(i.e.,theonesthatarecoveringroutes) isdenotedby R+ anditisorderedaccordingtoaheuristicthatwe describebelow.

Algorithm 6

,described below,isused togenerate

R

+(Step 1of

Algorithm 4

).Ineachopenbranch(i.e.,

R

+

= ∅

),ifthedepthofthenodeinthetreeissmalleror equalto



ρ

· |

T

(

s

)

|

thenbacktrackingisdisabled(Steps 7–10of

Algorithm 4

),while,ifthedepthislargerthansuchvalue, isenabled(Steps 5–6of

Algorithm 4

).Thisisequivalenttofixtherelativeorderofthefirst(atmost)



ρ

· |

T

(

s

)

|

inserted targetsinthe currentroute.Inthiscase, with

ρ

=

0 wedonot relyon theheuristicsatall,full backtrackingis enabled, thetreeisfullyexpandedandthereturned R is complete,i.e.,itcontains allthenon-dominatedcovering routes.Route

r

isrepeatedlyexpandedinagreedyfashionuntilnoinsertionispossible.Asaresult,

Algorithm 4

generatesatmost

|

T

(

s

)|

coveringroutes.

Pruning.Algorithm 5 is incharge ofupdating C Lmin andC Lmax each timea route r cannot be expandedand, conse-quently, theassociated branchmustbe closed.We call C Lmin the

minimal set

ofclosed routes. Thismeans that aclosed route

r belongs

to C Lmin onlyifC Lmin doesnotalreadycontainanother

r



r. Steps 1–4of

Algorithm 5

implementsuch condition:first,inSteps 2–3anyroute

r

such that

r



r is removedfrom

C L

min,thenroute

r is

insertedin

C L

min.Routes in

C L

min areusedby

Algorithm 6

inSteps 2and 6forpruningduringthesearch.Moreprecisely,aroute

r is

notexpanded withatarget

q at

position p if thereexistsaroute

r



C Lminsuchthat

r



r+

(

q

,

p

)

.Thispruningruleissafesinceby def-initionif

r



C Lmin,thenallthepossibleexpansionsof

r

areunfeasibleandif

r



r then r can beobtainedbyexpanding from

r

.Thispruningmechanismexplainswhyoncea route

r is

closedisalways inserted inC Lmin withoutcheckingthe

Riferimenti

Documenti correlati

2D: Two-dimensional; 3D: Three-dimensional; bpMRI: Biparametric MRI; CAD: Computer-aided diagnosis; csPCa: Clinically significant PCa; DCE: Dynamic contrast enhanced;

Based on the previous discussion of the results, we can conclude that literature and practitioners seem to be generally aligned on the digital twin paradigm; nevertheless, even if

In (E) DCP of 3 ROIs (yellow circles): the leftmost hyporeflective ROI (leftmost yellow circle) that corresponded to cyst on structural B-scan OCT (visualized in (F) pointed out by

Since all industrial patents, regardless of the technological sector they belong to, are subject to the same law, estimating the legal component of the value of

The impact of these same personality traits was also evaluated on a modified version of the task in which the difference in immediate reward magnitude between disadvantageous

Genetic Department, Institute for Maternal and Child Health - IRCCS “Burlo Garofolo”, Trieste, Italy. 4 Pharmacy and Clinical Pharmacology Department,

We conclude with suggestive evidence that the underlying links between past slavery and current inequality run through the political exclusion of former slaves and the

View of the crystal packing of: (A) BE_I and (B) BE_IV (S enantiomer in green, R enantiomer in pink). For sake of clarity, only the set of coordinates having the highest