Adversarial patrolling with spatially uncertain alarm signals


Contents lists available at ScienceDirect

Artificial Intelligence

www.elsevier.com/locate/artint


Nicola Basilico a,∗, Giuseppe De Nittis b, Nicola Gatti b

a Department of Computer Science, University of Milan, Milano, Italy

b Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Milano, Italy

Article info

Article history:
Received 15 June 2015
Received in revised form 6 August 2016
Accepted 26 February 2017
Available online 4 March 2017

Keywords:
Security games
Adversarial patrolling
Algorithmic game theory

Abstract

When securing complex infrastructures or large environments, constant surveillance of every area is not affordable. To cope with this issue, a common countermeasure is the usage of cheap but wide-ranged sensors, able to detect suspicious events that occur in large areas, supporting patrollers to improve the effectiveness of their strategies. However, such sensors are commonly affected by uncertainty. In the present paper, we focus on spatially uncertain alarm signals. That is, the alarm system is able to detect an attack but it is uncertain on the exact position where the attack is taking place. This is common when the area to be secured is wide, such as in border patrolling and fair site surveillance. We propose, to the best of our knowledge, the first Patrolling Security Game where a Defender is supported by a spatially uncertain alarm system, which non-deterministically generates signals once a target is under attack. We show that finding the optimal strategy is FNP-hard even in tree graphs and APX-hard in arbitrary graphs. We provide two (exponential time) exact algorithms and two (polynomial time) approximation algorithms. Finally, we show that, without false positives and missed detections, the best patrolling strategy reduces to stay in a place, wait for a signal, and respond to it at best. This strategy is optimal even with non-negligible missed detection rates, which, unfortunately, affect every commercial alarm system. We evaluate our methods in simulation, assessing both quantitative and qualitative aspects.

1. Introduction

Security Games model the task of protecting physical environments as a non-cooperative game between a Defender and an Attacker [1]. These games usually take place under a Stackelberg (a.k.a. leader–follower) paradigm [2], where the Defender (leader) commits to a strategy and the Attacker (follower) first observes such commitment and then best responds to it. As discussed in the seminal work [3], finding a leader–follower equilibrium is computationally tractable in games with one follower and complete information, while it becomes hard in Bayesian games with different types of Attacker. The availability of such computationally tractable aspects of Security Games led to the development of algorithms capable of scaling up to large problems, making them deployable in the security enforcing systems of several real-world applications. The first notable examples are the deployment of police checkpoints at the Los Angeles International Airport [4] and the scheduling of federal air marshals over the U.S. domestic airline flights [5]. More recent case studies include the positioning of U.S. Coast Guard patrols to secure crowded places, bridges, and ferries [6] and the arrangement of city guards to stop fare evasion in Los Angeles Metro [7]. Finally, a similar approach is being tested and evaluated in Uganda, Africa, for the protection of wildlife [8]. Thus, Security Games emerged as an interesting game theoretical tool and then showed their on-the-field effectiveness in a number of real security scenarios.

∗ Corresponding author.

We focus on a specific class of security games, called Patrolling Security Games. These games are modeled as infinite-horizon extensive-form games in which the Defender controls one or more patrollers moving within an environment, represented as a finite graph. The Attacker, besides having knowledge of the strategy to which the Defender committed, can observe the movements of the patrollers at any time and use such information in deciding the most convenient time and target location to attack [9]. When multiple patrollers are available, coordinating them at best is in general a hard task which, besides computational aspects, must also take into account communication issues [10]. However, the patrolling problem is tractable, even with multiple patrollers, in border security (e.g., line and cycle graphs), when patrollers have homogeneous moving and sensing capabilities and all the vertices composing the border share the same features [11]. Scaling this model involved the study of how to compute patrolling strategies in scenarios where the Attacker is allowed to perform multiple attacks [12]. Similarly, coordination strategies among multiple defenders are investigated in [13]. In [14], the authors study the case in which there is a temporal discount on the targets. Extensions are discussed in [15], where coordination strategies between defenders are explored, in [16], where a resource can cover multiple targets, and in [17], where attacks can be detected at different stages with different associated utilities. Finally, some theoretical results about properties of specific patrolling settings are provided in [18]. In the present paper, we provide a new model of Patrolling Security Games in which the Defender is supported by an alarm system deployed in the environment.

1.1. Motivating scenarios

Often, in large environments, a constant surveillance of every area is not affordable while focused inspections triggered by alarms are more convenient. Real-world applications include UAV surveillance of large infrastructures [19], wildfire detection with CCD cameras [20], agricultural field monitoring [21], surveillance based on wireless sensor networks [22], and border patrolling [23]. Alarm systems are in practice affected by detection uncertainty, e.g., missed detections and false positives, and localization (a.k.a. spatial) uncertainty, e.g., the alarm system is uncertain about the exact target under attack. We summarily describe two practical security problems that can be ascribed to this category. We report them as examples, presenting features and requirements that our model can properly deal with. In the rest of the paper we will necessarily take a general stance, but we encourage the reader to keep in mind these two cases as reference applications for a real deployment of our model.

1.1.1. Fight to illegal poaching

Poaching is a widespread environmental crime that causes the endangerment of wildlife in several regions of the world. Its devastating impact makes the development of surveillance techniques to counter this kind of activity one of the most important matters in national and international debates. Poaching typically takes place over vast and wild areas, making it costly and ineffective to rely solely on persistent patrol by ranger squads. To overcome this issue, recent developments have focused on providing rangers with environmental monitoring systems to better plan their inspections, concentrating them in areas with a large likelihood of spotting a crime. Such systems include the use of UAVs flying over the area, alarmed fences, and on-the-field sensors trying to recognize anomalous activities.¹ In all these cases, technologies are meant to work as an alarm system: once the illegal activity is recognized, a signal is sent to the rangers' base station from where a response is undertaken. In the great majority of cases, a signal corresponds to a spatially uncertain localization of the illegal activity. For example, a camera-equipped UAV can spot the presence of a pickup in a forbidden area but cannot derive the actual location to which poachers are moving. In the same way, alarmed fences and sensors can only transmit the location of violated entrances or forbidden passages. In all these cases a signal implies a restricted, yet not precise, localization of the poaching activity. The use of Security Games in this particular domain is not new (see, for example, [8]). However, our model allows the computation of alarm response strategies for a given alarm system deployed on the field. This can be done by adopting a discretization of the environment, where each target corresponds to a sector, values are related to the expected population of animals in that sector, and deadlines represent the expected completion time of illegal hunts (these parameters can be derived from data, as discussed in [8]).

1.1.2. Safety of fair sites

Fairs are large public events attended by thousands of visitors, where the problem of guaranteeing safety for the hosting facilities can be very hard. For example, Expo 2015, the recent Universal Exposition hosted in Milan, Italy, saw an average of about 100,000 visits per day. This poses the need for carefully addressing safety risks, which can also derive from planned acts of vandalism or terrorist attacks. Besides security guard patrols, fair sites are often endowed with locally installed monitoring systems. Expo 2015 employed around 200 baffle gates and 400 metal detectors at the entrance of the site. The internal area was constantly monitored by 4000 surveillance cameras and by 700 guards. Likely, when one or more of these devices/personnel identified a security breach, a signal was sent to the control room together with a circumscribed request of intervention. This approach is required because, especially in this kind of environment, detecting a security breach and neutralizing it are very different tasks. The latter, in particular, usually requires a greater effort involving special equipment and personnel whose employment on a demand basis is much more convenient. Moreover, the location where a threat is detected is in many cases different from the location where it could be neutralized, making the request of intervention spatially uncertain. For instance, consider a security guard or a surveillance camera detecting the visitors' reactions to a shooting rampage performed by some attacker. In examples like these, we can restrict the area where the security breach happened but no precise information about the location can be gathered since the attacker will probably have moved. Our model could be applied to provide a policy with which to schedule interventions once a security breach is detected in some particular section of the site. In such a case, targets could correspond to buildings or other installations where visitors can go. Values and deadlines can be chosen according to the importance of targets, their expected crowding, and the required response priority.

1.2. Alarms and security games

While the problem of managing a sensor network to optimally guard security-critical infrastructures has been investigated in restricted domains, e.g. [24], the problem of integrating alarm signals together with adversarial patrolling is almost completely unexplored. The only work that can be classified under this scope is [25]. The paper proposes a skeleton model of an alarm system where sensors have no spatial uncertainty in detecting attacks on single targets. The authors analyse how sensory information can improve the effectiveness of patrolling strategies in adversarial settings. They show that, when sensors are not affected by false negatives and false positives, the best strategy prescribes that the patroller just responds to an alarm signal rushing to the target under attack without patrolling the environment. As a consequence, in such cases the model treatment becomes trivial. On the other hand, when sensors are affected only by false negatives, the treatment can be carried out by means of an easy variation of the algorithm for the case without sensors [9]. In the last case, where false positives are admitted, the problem becomes computationally intractable. To the best of our knowledge, no previous result dealing with spatially uncertain alarm signals in adversarial patrolling is present in the literature.

Effectively exploiting an alarm system and determining a good deployment for it (e.g., selecting the location where to install sensors) are complementary but radically different problems. The results we provide in this work lie in the scope of the first one while the treatment of the second one is left for future works. In other words, we assume that a deployed alarm system is given and we deal with the problem of strategically exploiting it at best. Any approach to search for the optimal deployment should know, in principle, how to evaluate possible deployments. In such sense, our problem needs to be addressed before one might deal with the deployment one.

1.3. Contributions

In this paper, we propose the first Security Game that integrates a spatially uncertain alarm system in game-theoretic settings for patrolling.² An alarm signal carries the information about the set of targets that can be under attack and it is described by the probability of being generated when each target is attacked. The analysis and the main results we propose in this work are devoted to the basic game model where the Defender can control a single patroller and the alarm system is immune to false positives and false negatives, making spatial uncertainty its only limitation. As our results show, such assumptions do not play down the significance of the arising computational challenges whose resolution is a prerequisite for more complex settings. Indeed, extensions of this model that generalize to multi-resource settings [27] and that consider false negatives [28] are built on the basic result provided in this work. The game we consider can be decomposed in a finite number of finite-horizon subgames, each called Signal Response Game from v (SRG-v) and capturing the situation in which the Defender is in a vertex v and the Attacker attacked a target, and an infinite-horizon game, called Patrolling Game (PG), in which the Defender moves in absence of any alarm signal. We show that SRG-v is FNP-hard for tree-based topologies and that, for arbitrary graphs, it is APX-hard. We provide two exact algorithms. The first one, based on dynamic programming, performs a breadth-first search, while the second one, based on a branch-and-bound approach, performs a depth-first search. We use the same two approaches to design two approximation algorithms.

Then, we study the PG, showing that when no false positives and no missed detections are present, the optimal Defender strategy is to stay in a fixed location, wait for a signal, and respond to it at best. This strategy keeps being optimal even when non-negligible missed detection rates are allowed. We experimentally evaluate the scalability of our exact algorithms and we compare them with the approximation ones in terms of solution quality and compute times, investigating in hard instances the gap between our hardness results and the theoretical guarantees of our approximation algorithms. We show that our approximation algorithms provide very high quality solutions even in hard instances. Finally, we provide an example of resolution for a realistic instance, based on Expo 2015, and we show that our exact algorithms can be applied for such kind of settings. Moreover, in our realistic instance we assess how the optimal patrolling strategy coincides with a static placement even when allowing a false negative rate of less than or equal to 30%.

Fig. 1. Example of patrolling setting.

1.4. Paper structure

In Section 2, we introduce our game model. In Section 3, we study the problem of finding the strategy of the Defender for responding to an alarm signal. In Section 4, we study the patrolling problem. In Section 5, we experimentally evaluate our algorithms. In Section 6, we briefly discuss the main Security Games research directions that have been explored in the last decades. Finally, Section 7 concludes the paper. Appendix A includes the most technical proofs of the paper while Appendix B discusses some particular results obtained for special classes of instances. Appendix C reports some additional experimental results and Appendix D provides a table summarizing the notation symbols used in the paper.

2. Problem statement

In this section we formalize the problem we study. More precisely, in Section 2.1 we describe the patrolling setting and the game model, while in Section 2.2 we state the computational questions we shall address in this work.

2.1. Game model

Initially, in Section 2.1.1, we introduce a basic patrolling security game model integrating the main features from models currently studied in literature. Next, in Section 2.1.2, we extend our game model by introducing alarm signals. In Section 2.1.3, we depict the game tree of our patrolling security game with alarm signals and we decompose it in notable subgames to facilitate its study. To ease presentation, we summarized our notation symbols in Table D.3.

2.1.1. Basic patrolling security game

As is customary in the Artificial Intelligence literature [9,14], we deal with patrolling settings that are discrete in terms of both space and time, representing an approximation of a continuous environment. Specifically, we model the environment to be patrolled as an undirected connected graph $G = (V, E)$, where vertices represent different areas connected by various corridors/roads represented through the edges. Time is discretized in turns. Edges are assigned weights representing the number of time steps needed to traverse them. If not stated otherwise, we shall assume unitary weights. With $\omega^*_{i,j}$ we shall denote the shortest time to go from $i$ to $j$. We define $T \subseteq V$ as the subset of sensitive vertices, called targets, that must be protected from possible attacks. Each target $t \in T$ is characterized by a value $\pi(t) \in (0, 1]$ and a penetration time $d(t) \in \mathbb{N}^+$, which measures the number of turns needed to complete an attack on $t$.
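The ingredients of this model can be written down compactly. The following is a minimal sketch, assuming unitary edge weights and a hypothetical star-shaped instance (it is not the graph of Fig. 1); the names `shortest_times`, `adjacency`, `value`, and `penetration` are illustrative choices, not notation from the paper. It computes the shortest travel times $\omega^*_{i,j}$ with a breadth-first search from every vertex.

```python
from collections import deque


def shortest_times(adjacency):
    """All-pairs shortest travel times on an undirected graph with unit weights."""
    omega = {}
    for source in adjacency:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adjacency[u]:
                if w not in dist:          # first visit gives the shortest time
                    dist[w] = dist[u] + 1
                    queue.append(w)
        omega[source] = dist
    return omega


# Hypothetical instance: v0 is a hub connected to four targets v1..v4.
adjacency = {
    "v0": ["v1", "v2", "v3", "v4"],
    "v1": ["v0"],
    "v2": ["v0"],
    "v3": ["v0"],
    "v4": ["v0"],
}
targets = ["v1", "v2", "v3", "v4"]
value = {t: 0.5 for t in targets}        # pi(t) in (0, 1]
penetration = {t: 2 for t in targets}    # d(t), a positive number of turns

omega = shortest_times(adjacency)
```

With non-unitary weights, the breadth-first search would have to be replaced by a weighted shortest-path routine such as Dijkstra's algorithm.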

Example 1. We report in Fig. 1 an example of patrolling setting. Here, $V = \{v_0, v_1, v_2, v_3, v_4\}$, $T = \{t_1, t_2, t_3, t_4\}$ where $t_i = v_i$ for $i \in \{1, 2, 3, 4\}$. All the targets $t$ present the same value $\pi(t)$ and the same penetration time $d(t)$.

At each turn, an Attacker $\mathcal{A}$ and a Defender $\mathcal{D}$ play simultaneously having the following available actions:

• if $\mathcal{A}$ has not attacked in the previous turns, it can observe the position of $\mathcal{D}$ in the graph $G$³ and decide whether to attack a target or to wait for a turn. The attack is instantaneous, meaning that there is no delay between the decision to attack and the actual presence of a threat in the selected target⁴;

• $\mathcal{D}$ has no information about the actions undertaken by $\mathcal{A}$ in previous turns and selects the next vertex to patrol among those adjacent to the current one; each movement is a non-preemptive traversal of a single edge $(v, v') \in E$.

³ Partial observability of $\mathcal{A}$ over the position of $\mathcal{D}$ can be introduced, as discussed in [29].

⁴ This is a worst-case assumption according to which $\mathcal{A}$ is as strong as possible. It can be relaxed by associating execution costs to the $\mathcal{A}$'s actions, as

Fig. 2. Examples of alarm systems.

The game may conclude in correspondence of any of the two following events. The first one is when $\mathcal{D}$ patrols a target $t$ that has been under attack by $\mathcal{A}$ for fewer than $d(t)$ turns. In such case the attack is prevented and $\mathcal{A}$ is captured. The second one is when target $t$ is attacked and $\mathcal{D}$ does not patrol $t$ during the $d(t)$ turns that follow the beginning of the attack. In such case, the attack is successful and $\mathcal{A}$ escapes without being captured. When $\mathcal{A}$ is captured, $\mathcal{D}$ receives a utility of 1 and $\mathcal{A}$ receives a utility of 0. When an attack on $t$ succeeds, $\mathcal{D}$ receives $1 - \pi(t)$ and $\mathcal{A}$ receives $\pi(t)$. The game may not conclude if $\mathcal{A}$ decides to wait for every observed position of $\mathcal{D}$, thus never attacking. In such case, $\mathcal{D}$ receives 1 and $\mathcal{A}$ receives 0. (Another intuitive way to think of this payoff structure is to assume that $\mathcal{D}$ receives an initial utility of 1 and then loses $\pi(t)$ whenever target $t$ is successfully compromised.) Notice that the game is constant sum and therefore it is equivalent to a zero-sum game through a positive affine transformation.
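The payoff structure just described fits in a few lines. The sketch below is only an illustration: the function name `utilities` and the outcome labels `"captured"`, `"success"`, and `"no_attack"` are hypothetical names chosen here, not terminology from the paper.

```python
def utilities(outcome, target_value=None):
    """Return the pair (Defender utility, Attacker utility) for a game outcome.

    outcome is one of "captured", "success", "no_attack"; target_value is
    pi(t) of the attacked target and is only needed when the attack succeeds.
    """
    if outcome in ("captured", "no_attack"):
        return 1.0, 0.0
    if outcome == "success":
        if target_value is None:
            raise ValueError("a successful attack needs the value of the target")
        return 1.0 - target_value, target_value
    raise ValueError(f"unknown outcome: {outcome}")


u_defender, u_attacker = utilities("success", target_value=0.25)
```

In every outcome the two utilities add up to 1, which is the constant-sum property noted in the text.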

The above game model is in extensive form (being played sequentially), with imperfect information ($\mathcal{D}$ not observing the actions undertaken by $\mathcal{A}$), and with infinite horizon ($\mathcal{A}$ being in the position to wait forever). The fact that $\mathcal{A}$ can observe the actions undertaken by $\mathcal{D}$ before acting makes the leader–follower equilibrium the natural solution concept for our problem, where $\mathcal{D}$ is the leader and $\mathcal{A}$ is the follower. Since we focus on zero-sum games, the leader's strategy at the leader–follower equilibrium is its maxmin strategy and it can be found by employing linear mathematical programming, which requires polynomial time in the number of actions available to the players [31].
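As a reminder of what such a program looks like, the following is the textbook maxmin linear program for a finite zero-sum game, given as a sketch. The action sets $R$ (for $\mathcal{D}$) and $B$ (for $\mathcal{A}$), the probabilities $x_a$, and the utility function $U_{\mathcal{D}}(a, b)$ are generic placeholder symbols introduced here for illustration; they are not notation taken from the paper.

```latex
\begin{align*}
\max_{x,\,u} \quad & u \\
\text{s.t.} \quad & \sum_{a \in R} x_a \, U_{\mathcal{D}}(a, b) \ge u && \forall b \in B \\
& \sum_{a \in R} x_a = 1 \\
& x_a \ge 0 && \forall a \in R
\end{align*}
```

The program has one variable per leader action plus $u$, and one constraint per follower action, so its size is polynomial in the number of actions.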

2.1.2. Introducing alarm signals

We extend the game model described in the previous section by introducing a spatially uncertain alarm system that can be exploited by $\mathcal{D}$. The basic idea is that an alarm system uses a number of sensors spread over the environment to gather information about possible attacks and raises an alarm signal at any time an attack occurs. The alarm signal provides some information about the location (target) where the attack is ongoing, but it is affected by uncertainty. In other words, the alarm system detects an attack but it is uncertain about the target under attack. Formally, the alarm system is defined as a pair $(S, p)$, where $S = \{s_1, \ldots, s_m\}$ is a set of $m \ge 1$ signals and $p : S \times T \to [0, 1]$ is a function that specifies the probability of having the system generating a signal $s$ given that target $t$ has been attacked; we denote such probability with $p(s \mid t)$. With a slight abuse of notation, for a signal $s$ we define $T(s) = \{t \in T \mid p(s \mid t) > 0\}$ and, similarly, for a target $t$ we have $S(t) = \{s \in S \mid p(s \mid t) > 0\}$. In this work, we assume that:

• the alarm system is not affected by false positives (signals generated when no attack has occurred). Formally, $p(s \mid \triangle) = 0$, where $\triangle$ indicates that no targets are under attack;

• the alarm system is not affected by false negatives (signals not generated even though an attack has occurred). Formally, $p(\bot \mid t) = 0$, where $\bot$ indicates that no signals have been generated; in Section 4 we will show that the optimal strategies we compute under this assumption can preserve optimality even in presence of non-negligible false negative rates.
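The sets $T(s)$ and $S(t)$ and the no-false-negative assumption can be illustrated with a small sketch. The dictionary `p` below is a hypothetical alarm system invented for this example (it is not one of the systems of Fig. 2), and the function names are illustrative; pairs missing from the dictionary are treated as having probability 0.

```python
def targets_of(signal, p):
    """T(s): targets whose attack can generate the given signal."""
    return {t for (s, t), prob in p.items() if s == signal and prob > 0}


def signals_of(target, p):
    """S(t): signals that can be generated when the given target is attacked."""
    return {s for (s, t), prob in p.items() if t == target and prob > 0}


def has_no_false_negatives(p, tolerance=1e-9):
    """True when, for every target, the signal probabilities sum to 1,
    i.e. the 'no signal' outcome has probability 0 under any attack."""
    totals = {}
    for (_, t), prob in p.items():
        totals[t] = totals.get(t, 0.0) + prob
    return all(abs(total - 1.0) <= tolerance for total in totals.values())


# Hypothetical alarm system: keys are (signal, target), values are p(s | t).
p = {
    ("s1", "t1"): 0.75, ("s2", "t1"): 0.25,
    ("s1", "t2"): 0.0,  ("s2", "t2"): 1.0,
}
```

Here an attack on `t2` always raises `s2`, so receiving `s1` rules `t2` out, while receiving `s2` leaves both targets possible.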

Example 2. We report two examples of alarm systems for the patrolling setting depicted in Fig. 1. The first example is reported in Fig. 2(a). It is a low-accuracy alarm system that generates the same signal any time a target is under attack and therefore does not provide any information about the target under attack. The second example is reported in Fig. 2(b). It provides more accurate information about the localization of the attack than the previous example. Here, the reception of a signal $s_i$ implies, under a uniform strategy of $\mathcal{A}$, a high probability of an attack on target $t_i$. Namely, when $t_i$ is attacked the alarm system generates $s_i$ with high probability and a different signal otherwise.
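The claim about a uniform strategy of the Attacker is an application of Bayes' rule. The sketch below uses made-up probabilities (0.7 for the matching signal and 0.1 for each of the others); these numbers are assumptions for illustration and are not read off Fig. 2(b). The function name `posterior` is likewise hypothetical.

```python
def posterior(signal, p, prior):
    """Probability of each target being under attack given the received signal."""
    joint = {t: prior[t] * p.get((signal, t), 0.0) for t in prior}
    total = sum(joint.values())
    if total == 0:
        raise ValueError("the signal cannot be generated under this prior")
    return {t: weight / total for t, weight in joint.items()}


targets = ["t1", "t2", "t3", "t4"]
signals = ["s1", "s2", "s3", "s4"]

# Assumed system: signal s_i is raised with probability 0.7 when t_i is attacked,
# and each other signal with probability 0.1.
p = {
    (s, t): (0.7 if s[1:] == t[1:] else 0.1)
    for s in signals
    for t in targets
}
uniform_prior = {t: 1.0 / len(targets) for t in targets}

belief = posterior("s1", p, uniform_prior)
```

Under these assumed numbers, receiving `s1` moves the probability of an attack on `t1` from 0.25 to about 0.7.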


Given the presence of an alarm system defined as above, the game mechanism changes in the following way. At each turn, before deciding its next move, $\mathcal{D}$ observes whether or not a signal has been generated by the alarm system and then makes its decision considering such information. This introduces in our game a chance node implementing the non-deterministic selection of signals.

2.1.3. The game tree and its decomposition

Here we depict the game tree of our game model, decomposing it in some recurrent subgames. A portion of the game is depicted in Fig. 3. Such tree can be read along the following steps.

• Root of the tree. $\mathcal{A}$ decides whether to wait for a turn (this action is denoted as $\triangle$ to indicate that no target is attacked) or to attack a target $t \in T$ (this action is denoted by the label $t$ of the attacked target).

• Second level of the tree. $\mathcal{N}$ denotes the alarm system, represented by a nature-type player. Its behavior is a priori specified by the conditional probability mass function $p$, which determines the generated signal given the attack performed by $\mathcal{A}$. In particular, it is useful to distinguish between two cases:

(a) if $\mathcal{A}$ does not attack (action $\triangle$), then no signal will be generated under the assumption that $p(\bot \mid \triangle) = 1$;

(b) if $\mathcal{A}$ attacks target $t$, a signal $s$ will be drawn from $S(t)$ with probability $p(s \mid t)$ (recall that $p(\bot \mid t) = 0$).

• Third level of the tree. $\mathcal{D}$ observes the signal generated by the alarm system and decides the next vertex to patrol among those adjacent to the current one (the current vertex is initially chosen by $\mathcal{D}$).

• Fourth level of the tree and on. It is useful to distinguish between two cases:

(a) if no attack is present, then the tree of the subgame starting from here is the same as the tree of the whole game, except for the position of $\mathcal{D}$ that may be different from the initial one (notice that in this case each of $\mathcal{A}$'s decision nodes is a singleton information set, modeling the fact that $\mathcal{A}$ can observe $\mathcal{D}$'s position);

(b) if an attack is taking place on target $t$, then only $\mathcal{D}$ can act.

Such game tree can be decomposed in a number of finite recurrent subgames such that the best strategies of the agents in each subgame are independent from those in other subgames. This decomposition allows us to apply a divide et impera approach, simplifying the resolution of the problem of finding an equilibrium. More precisely, each of these subgames is a game subtree that can be extracted from the tree of Fig. 3 as follows. Given $\mathcal{D}$'s current vertex $v \in V$, select a decision node for $\mathcal{A}$ and call it $i$. Then, extract the subtree rooted in $i$ discarding the branch corresponding to action $\triangle$ (no attack).⁵ Intuitively, such subgame models the players' interaction when the Defender is in some given vertex $v$ and the Attacker will perform an attack. As a consequence, each subgame obtained in such way is finite (once an attack on $t$ started, the maximum length of the game is $d(t)$). Moreover, the set of different subgames we can extract is finite since we have one subgame for each possible current vertex for $\mathcal{D}$. As a consequence, we can extract at most $|V|$ different subgames. Notice that, due to the infinite horizon, each subgame can recur an infinite number of times along the game tree. However, since such repetitions are independent and the game is zero-sum, we only need to solve one subgame to obtain the optimal strategy to be applied in each of its repetitions. In other words, when assuming that an attack will be performed, the agents' strategies can be split in a number of independent strategies solely depending on the current position of $\mathcal{D}$. The reason why we discarded the branch corresponding to action $\triangle$ in each subgame is that we seek to deal with such complementary case exploiting a simple backward induction approach, as explained later.

First, we call Signal Response Game from $v$ the subgame defined as above and characterized by a vertex $v$ representing the current vertex of $\mathcal{D}$ (for brevity, we shall use SRG-v). In an SRG-v, the goal of $\mathcal{D}$ is to find the best strategy starting from vertex $v$ to respond to any alarm signal. All the SRG-vs are independent and thus the best strategy in each subgame does not depend on the strategies of the other subgames. The intuition is that the best strategy in an SRG-v does not depend on the vertices visited by $\mathcal{D}$ before the attack. Given an SRG-v, we denote by $\sigma^v_{\mathcal{D},s}$ the strategy of $\mathcal{D}$ once observed signal $s$, by $\sigma^v_{\mathcal{D}}$ the strategy profile $\sigma^v_{\mathcal{D}} = (\sigma^v_{\mathcal{D},s_1}, \ldots, \sigma^v_{\mathcal{D},s_m})$ of $\mathcal{D}$, and by $\sigma^v_{\mathcal{A}}$ the strategy of $\mathcal{A}$. Let us notice that in an SRG-v, given a signal $s$, $\mathcal{D}$ is the only agent that plays and therefore each sequence of moves can be collapsed in a single action. Thus, SRG-v is essentially a two-level game in which $\mathcal{A}$ decides the target to attack and $\mathcal{D}$ decides the sequence of moves on the graph.
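The two-level structure can be made concrete with a small sketch of the Attacker's expected utility. Several things here are assumptions made for illustration: a collapsed sequence of moves is represented as a tuple of the targets it reaches in time, the Attacker is taken to earn $\pi(t)$ when the attacked target is not among them and 0 otherwise, and the names `attacker_utility`, `expected_attacker_utility`, and `sigma_d` are hypothetical.

```python
def attacker_utility(covered_targets, target, value):
    """Attacker payoff for one collapsed Defender action and one attacked target."""
    return 0.0 if target in covered_targets else value[target]


def expected_attacker_utility(target, sigma_d, p, value):
    """Expected Attacker payoff for attacking `target`.

    sigma_d maps each signal to a mixed strategy, i.e. a dict from a tuple of
    covered targets to the probability of playing it; p maps (signal, target)
    to p(s | t).
    """
    total = 0.0
    for signal, mixed_strategy in sigma_d.items():
        p_signal = p.get((signal, target), 0.0)
        for covered_targets, probability in mixed_strategy.items():
            total += p_signal * probability * attacker_utility(
                covered_targets, target, value
            )
    return total


# Hypothetical instance: one signal, two targets that cannot both be reached in time.
value = {"t1": 0.5, "t2": 0.5}
p = {("s", "t1"): 1.0, ("s", "t2"): 1.0}
sigma_d = {"s": {("t1",): 0.5, ("t2",): 0.5}}

u_t1 = expected_attacker_utility("t1", sigma_d, p, value)
```

In this toy instance the Defender randomizes evenly, so attacking either target yields the Attacker 0.25 in expectation.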

Then, according to classical backward induction arguments [32], once we have found the best strategies of each SRG-v, we can substitute the subgames with the agents' equilibrium utilities and then we can find the best strategy of $\mathcal{D}$ for patrolling the vertices whenever no alarm signal has been raised and the best strategy of attack for $\mathcal{A}$. We call this last problem Patrolling Game (for conciseness, we shall use PG). We denote by $\sigma_{\mathcal{D}}$ and $\sigma_{\mathcal{A}}$ the strategies of $\mathcal{D}$ and $\mathcal{A}$, respectively, in the PG.

⁵ Rigorously speaking, our definition of subgame is not compliant with the definition provided in Game Theory [32], which requires that all the actions of a node belong to the same subgame (and therefore we could not separate action $\triangle$ from actions $t$). However, we can slightly change the structure of our game making our definition of subgame compliant with the one from game theory. More precisely, it is sufficient to split each node of $\mathcal{A}$ into two nodes: in the first $\mathcal{A}$ decides to attack a target or to wait for one turn, and in the second, conditioned to the fact that $\mathcal{A}$ decided to attack, $\mathcal{A}$ decides which target to attack. This way, the subtree whose root is the second node of $\mathcal{A}$ is a subgame compliant with game theory. It can be easily observed that this change to the game tree structure does not affect the behavior of the agents.

N. Basilico et al. / Artificial Intelligence 246 (2017) 220–257

Fig. 3. Game tree; $v$ is assumed to be the current vertex for $\mathcal{D}$, $r$ is a collapsed sequence of moves, called route, that $\mathcal{D}$ performs on the graph, dashed lines indicate information sets, and function $U_{\mathcal{A}}(r_i, t_j)$

2.2. The computational questions we pose

Our contributions focus on the design of algorithms to find an equilibrium for our game and develop along four central questions. The first one stems directly from our problem definition.

Question 1. Which is the best patrolling strategy for $\mathcal{D}$ maximizing its expected utility?

Clearly, this problem is related to what we called PG in our game decomposition. In order to build an answer, we pose three other questions that, instead, involve the other subgame called SRG-v.

Question 2. Given a starting vertex $v$ and a signal $s$, is there any strategy allowing $\mathcal{D}$ to visit all the targets in $T(s)$, each within its deadline?

Question 3. Given a starting vertex $v$ and a signal $s$, is there any pure strategy giving $\mathcal{D}$ an expected utility of at least $k$?

Question 4. Given a starting vertex $v$ and a signal $s$, is there any mixed strategy giving $\mathcal{D}$ an expected utility of at least $k$?

These questions are not independent from each other. In particular, answering the last three is a prerequisite to solving the first one. For this reason, we shall take a bottom-up approach answering the above questions starting from the last three and then dealing with the first one at the whole-game level. Also, Questions 3 and 4 are not easier than Question 2 so hardness results for this last one can be extended to the others.

3. Signal response game

In this section we start by dealing with the resolution of SRG-v. Specifically, in Section 3.1 we prove the hardness of the problem, analyzing its computational complexity. Then, in Section 3.2 and in Section 3.3 we propose two algorithms, the first based on dynamic programming (breadth-first search) while the second adopts a branch and bound (depth-first search) paradigm. Furthermore, we provide a variation for each algorithm, approximating the optimal solution. For the sake of presentation, the most technical proofs are reported in Appendix A.

3.1. Complexity results

The aim of this section is to assess the computational complexity of finding an exact or approximate equilibrium for our game model. Furthermore, we aim at identifying the source of hardness of our problem and the most efficient algorithm for solving it.

Each SRG-v is characterized by $|T|$ actions available to $\mathcal{A}$, each corresponding to a target $t$, and by $O(|V|^{\max_t\{d(t)\}})$ decision nodes of $\mathcal{D}$. The portion of game tree played by $\mathcal{D}$ can be safely reduced by observing that $\mathcal{D}$ will move between any two targets along a shortest path. This allows us to discard from the tree all the decision nodes where $\mathcal{D}$ occupies a non-target vertex. Nevertheless, this reduction keeps the size of the game tree exponential in the parameters of the game, specifically $O(|T|^{|T|})$.⁶ Notice that the exponential size of the game tree does not constitute a proof that finding the equilibrium strategies of an SRG-v requires exponential time in the parameters of the game because it does not exclude the existence of some compact representation of $\mathcal{D}$'s strategies, e.g., Markovian strategies. The first result we provide implies that such a representation is very unlikely to exist or that, if it exists, it is unlikely that it can be computed in polynomial time. Call $g_v$ the expected utility of $\mathcal{A}$ from SRG-v ($1 - g_v$ is then the corresponding utility for $\mathcal{D}$). We define the following problem.

Definition 1 (k–SRG-v). The decision problem k–SRG-v is defined as:
INSTANCE: an instance of SRG-v;
QUESTION: is there any $\sigma_{\mathcal{D}}$ such that, when $\mathcal{A}$ plays its best response, it holds that $g_v \le k$?

Theorem 1. k–SRG-v is NP-hard even when $|S| = 1$ and the graph is a tree.⁷

The above result shows that w.r.t. tree graphs:

• answering Question 1 is FNP-hard,

• answering Questions 2, 3, 4 is NP-hard.

⁶ A more accurate bound is $O(|T|^{\min\{|T|, \max_t\{d(t)\}\}})$.

For the sake of completeness, notice that a maxmin strategy with support upper bounded by $|T|$ (that is, the number of actions available to the min player $\mathcal{A}$) always exists [33].

Now, given that we established that the main source of hardness of the problem is computing the strategy space of $\mathcal{D}$, we focus on the problem of finding an efficient representation for it. We start with some definitions.

Definition 2 (Route). Given a starting vertex $v$ and a signal $s$, a route $r$ is a finite sequence of vertices where, called $r(i)$ the $i$-th vertex, $r(0) = v$ and $r(i) \in T(s)$ for any $i > 0$. With a slight overload of notation, call $T(r)$ the set of targets from the sequence.

We restrict our attention to a special class of routes that we call covering. For a route $r$ and $i > 0$, define $A_r(r(i)) = \sum_{h=0}^{i-1} \omega^*_{r(h),r(h+1)}$. Such value gives the time needed by $\mathcal{D}$ to arrive at target $r(i)$ after following a graph walk along the shortest paths between consecutive vertices of the sequence $r(0), r(1), \ldots, r(i)$.

Definition 3 (Covering route). A covering route is a route $r$ where $A_r(t) \le d(t)$ for any $t \in T(r)$.

Covering routes are abstractions for $\mathcal{D}$'s available pure strategies when the current vertex is $v$ and a signal $s$ has been generated. If $r$ is a covering route for vertex $v$ and signal $s$ and $T(r) \subseteq T(s)$ is the set of targets in $r$, then we can always instantiate $r$ to a graph walk for $\mathcal{D}$ (a sequence of moves on $G$) that guarantees to capture $\mathcal{A}$ on any target in $T(r)$. Such a walk is simply obtained by starting from $r(0) = v$ and then covering any shortest path between $r(i)$ and $r(i+1)$. The total temporal cost of such a walk is expressed by $c(r) = A_r(r(|T(r)|))$. We shall informally refer to such a process as walking a route $r$.
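As an illustration of Definitions 2 and 3, the following minimal Python sketch computes the arrival times $A_r(r(i))$, checks whether a route is covering, and returns its temporal cost $c(r)$. It assumes that the shortest-path distances $\omega^*$ are already available as a nested dictionary `dist` and that penetration times are stored in `deadline`; these names and the toy instance are hypothetical and only serve the example.

```python
from itertools import accumulate

def arrival_times(route, dist):
    """Return A_r(r(i)) for every index i of the route, with A_r(r(0)) = 0."""
    legs = [dist[a][b] for a, b in zip(route, route[1:])]
    return [0] + list(accumulate(legs))

def is_covering(route, dist, deadline):
    """A route is covering if every target r(i), i > 0, is reached by d(r(i))."""
    times = arrival_times(route, dist)
    return all(t <= deadline[x] for x, t in zip(route[1:], times[1:]))

def cost(route, dist):
    """Temporal cost c(r): arrival time at the last vertex of the route."""
    return arrival_times(route, dist)[-1]

# Hypothetical toy instance: starting vertex "v" and two targets.
dist = {
    "v": {"t1": 1, "t2": 2},
    "t1": {"v": 1, "t2": 2},
    "t2": {"v": 2, "t1": 2},
}
deadline = {"t1": 1, "t2": 3}

good = ["v", "t1", "t2"]   # arrives at t1 at time 1 and at t2 at time 3
bad = ["v", "t2", "t1"]    # arrives at t1 at time 4, past its deadline
```

On this instance, `good` is covering with cost 3, while `bad` is not, since the same set of targets visited in a different order misses the deadline of `t1`.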

Covering routes then constitute the action space for $\mathcal{D}$ in an SRG-v game. Even when considering a single signal $s$, the number of such routes is $O(|T(s)|^{|T(s)|})$ in the worst case. However, some covering routes will never be played by $\mathcal{D}$ due to either of the following two dominance arguments [34], and discarding dominated routes is crucial in the design of an efficient algorithm.

Definition 4 (Intra-set dominance). Given a starting vertex $v$, a signal $s$ and two different covering routes $r, r'$ such that $T(r) = T(r')$, if $c(r) \le c(r')$ then $r$ dominates $r'$.

Definition 5 (Inter-set dominance). Given a starting vertex $v$, a signal $s$ and two different covering routes $r, r'$, if $T(r) \supset T(r')$ then $r$ dominates $r'$.
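The two dominance rules of Definitions 4 and 5 can be sketched as a filter over a list of covering routes. This is only an illustrative sketch: each route is summarized by the hypothetical pair `(targets, cost)`, where `targets` is a `frozenset` standing for $T(r)$ and `cost` stands for $c(r)$.

```python
def prune_dominated(routes):
    """Keep the cheapest route per target set, then drop non-maximal target sets.

    routes: iterable of (targets, cost) pairs, with targets a frozenset T(r).
    Returns a dict mapping each surviving target set to its minimum cost.
    """
    # Intra-set dominance: among routes with the same T(r), keep the cheapest.
    best = {}
    for targets, c in routes:
        if targets not in best or c < best[targets]:
            best[targets] = c
    # Inter-set dominance: drop a set if another one strictly contains it.
    return {q: c for q, c in best.items()
            if not any(q < other for other in best)}

# Hypothetical routes described by (T(r), c(r)).
routes = [
    (frozenset({"t1"}), 1),
    (frozenset({"t1", "t2"}), 4),
    (frozenset({"t1", "t2"}), 3),
    (frozenset({"t3"}), 2),
]
survivors = prune_dominated(routes)
```

Here the route over `{t1}` is removed by inter-set dominance and the more expensive route over `{t1, t2}` by intra-set dominance.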

Furthermore, it is convenient to introduce the concept of covering set, which is strictly related to the concept of covering route. It is defined as follows.

Definition 6 (Covering set). Given a starting vertex $v$ and a signal $s$, a covering set (covset) $Q$ is a subset of $T(s)$ such that there exists a covering route $r$ with $T(r) = Q$.

Let us focus on Definition 4. It suggests that we can safely use only one route per covering set. Covering sets suffice for describing all the outcomes of the game, since the agents' payoffs depend only on whether or not $\mathcal{A}$ attacks a target $t$ that is covered by $\mathcal{D}$, and in the worst case are $O(2^{|T(s)|})$, with a remarkable reduction of the search space w.r.t. $O(|T(s)|^{|T(s)|})$. However, any algorithm restricting on covering sets should be able to determine whether or not a set of targets is a covering one, which is a difficult problem as well.

Definition 7 (COV-SET). The decision problem COV-SET is defined as:
INSTANCE: an instance of SRG-v with a target set $T$;
QUESTION: is $T$ a covering set? (Equivalently, does $T$ admit any covering route $r$?)

Theorem 2. COV-SET is NP-complete.

Therefore, computing a covering route for a given set of targets (or deciding that no covering route exists) is not doable in polynomial time unless P = NP. This shows that, while covering sets suffice for defining the payoffs of the game and therefore the size of the payoffs matrix can be bounded by the number of covering sets, in practice we also need covering routes to certify that a given subset of targets is covering. The impossibility to confine our algorithms to the space of covering sets seems to suggest a complexity worse than $O(2^{|T(s)|})$. However, in the next section we provide an algorithm with complexity $O(2^{|T(s)|})$ (neglecting polynomial terms) to enumerate all and only the covering sets and, for each of them, the associated covering route with minimum cost.

Let us focus on Definition 5. Inter-set dominance can be leveraged to introduce the concept of maximal covering sets (and routes), which could enable a further reduction in the set of actions available to $\mathcal{D}$.


Definition 8 (MAXIMALITY). Given a covering set $Q = T(r)$ for some $r$, we say that $Q$ and $r$ are maximal if there is no other covering route $r'$ such that $Q \subset T(r')$.

In the best case, when there is a route covering all the targets, the number of maximal covering sets is 1 (and we can safely restrict to a single minimum-cost covering route over that set), while the number of covering sets (including the non-maximal ones) is $2^{|T(s)|}$. Thus, considering only maximal covering sets allows an exponential reduction of the payoffs matrix. In the worst case, when all the possible subsets composed of $|T(s)|/2$ targets are maximal covering sets, the number of maximal covering sets is $O(2^{|T(s)|-2})$, while the number of covering sets is $O(2^{|T(s)|-1})$, allowing a reduction of the payoffs matrix by a factor of 2. Furthermore, if we knew a priori that $Q$ is a maximal covering set, we could avoid searching for covering routes for any set of targets that strictly contains $Q$. When designing an algorithm to solve this problem, Definition 5 could then be exploited to introduce pruning techniques to save average compute time. However, the following result shows that deciding if a covering set is maximal is hard.

Definition 9 (MAX–COV-SET). The decision problem MAX–COV-SET is defined as:
INSTANCE: an instance of SRG-v with a target set $T$ and a covering set $T' \subset T$;
QUESTION: is $T'$ maximal?

Theorem 3. There is no polynomial-time algorithm for MAX–COV-SET unless P = NP.

Nevertheless, we show hereafter that there exists an algorithm computing all and only the maximal covering sets and one route for each of them (which potentially leads to an exponential reduction of the time needed for solving the linear program) with only an additional polynomial cost w.r.t. the enumeration of all the covering sets. Therefore, neglecting polynomial terms, our algorithm has a complexity of $O(2^{|T(s)|})$.

Finally, we focus on the complexity of approximating the best solution in an SRG-v. When $\mathcal{D}$ restricts its strategies to be pure, the problem is clearly not approximable in polynomial time even when the approximation ratio depends on $|T(s)|$. The basic intuition is that, if a game instance admits the maximal covering route that covers all the targets and the value of all the targets is 1, then either the maximal covering route is played, returning a utility of 1 to $\mathcal{D}$, or any other route is played, returning a utility of 0, but no polynomial-time algorithm can find the maximal covering route covering all the targets, unless P = NP. On the other hand, it is interesting to investigate the case in which no restriction to pure strategies is present. We show that the problem keeps being hard.

Theorem 4. The optimization version of k–SRG-v, say OPT–SRG-v, is APX-hard even in the restricted case in which the graph is metric, there is only one signal $s$, all targets $t \in T(s)$ have the same penetration time $d(t)$, and there exists a maximal covering route covering all the targets.

The above theorem allows us to make two important remarks.

Remark 1. The above result does not exclude the existence of constant-ratio approximation algorithms for OPT–SRG-v. We conjecture that it is unlikely. OPT–SRG-v presents similarities with the (metric) DEADLINE–TSP, where the goal is to find the longest path of vertices each traversed before its deadline. The DEADLINE–TSP does not admit any constant-ratio approximation algorithm [35] and the best-known approximation algorithm has logarithmic approximation ratio [36]. The following observations can be produced about the relationships between OPT–SRG-v and DEADLINE–TSP:

• when the maximal route covering all the targets in the OPT–SRG-v exists, the optimal solution of the OPT–SRG-v is also optimal for the DEADLINE–TSP applied to the same graph;

• when the maximal route covering all the targets in the OPT–SRG-v does not exist, the optimal solutions of the two problems are different, even when we restrict ourselves to pure-strategy solutions for the OPT–SRG-v;

• approximating the optimal solution of the DEADLINE–TSP does not give a direct technique to approximate OPT–SRG-v, since we should enumerate all the subsets of targets and for each subset of targets we would need to execute the approximation of the DEADLINE–TSP, but this would require exponential time. We notice in addition that even the total number of sets of targets with logarithmic size is not polynomial, being on the order of $2^{\log^2(|T|)}$, and therefore any algorithm enumerating them would require exponential time;

• when the optimal solution of the OPT–SRG-v is randomized, examples of optimal solutions in which maximal covering routes are not played can be produced, showing that at the optimum it is not strictly necessary to play maximal covering routes, but even approximations suffice.

Remark 2. If it were possible to map DEADLINE–TSP instances to OPT–SRG-v instances where the maximal covering route covering all the targets exists, then it would trivially follow that OPT–SRG-v does not admit any constant approximation ratio. We were not able to find such a mapping and we conjecture that, if there is an approximation-preserving reduction from DEADLINE–TSP to OPT–SRG-v, then we cannot restrict to such instances. The study of instances of OPT–SRG-v where mixed strategies may be optimal makes the treatment very challenging.

3.2. Dynamic-programming algorithm

We start by presenting two algorithms. The first one is exact, while the second one is an approximation algorithm. Both algorithms are based on a dynamic programming approach.

3.2.1. Exact algorithm

In this section we provide an algorithm to compute the strategies available to $\mathcal{D}$ when $v$ is the current vertex and signal $s$ has been generated by the alarm system. The idea is to adopt a dynamic programming method that enumerates covering sets of increasing cardinalities. Each covering set is obtained by a sequence of expansions that, starting from the empty set, add one target at each iteration. We denote a covering set by $Q^k_{v,t}$, where $k$ indicates its cardinality while $v$ and $t$ denote the starting vertex of $\mathcal{D}$ and the last target added to the set, respectively. The algorithm operates in such a way that the sequence of targets corresponding to the expansions made to obtain $Q^k_{v,t}$ is a covering route for that set. Informally, we shall call it the generative route of $Q^k_{v,t}$ and we will denote with $c(Q^k_{v,t})$ its temporal cost. We choose to obtain this by requiring any expansion to be admissible with respect to three conditions. Given a set $Q^k_{v,t}$, a new set $Q^{k+1}_{v,w} = Q^k_{v,t} \cup \{w\}$ can be obtained if:

1. $w$ is not covered in the current covering set, that is, $w \in T(s) \setminus Q^k_{v,t}$;

2. $w$ can be covered by $d(w)$ by extending the generative route of $Q^k_{v,t}$ with a shortest walk from $t$ to $w$, that is, $d(w) \ge c(Q^k_{v,t}) + \omega^*_{t,w}$;

3. call $t'$ a target visited by a shortest path from $t$ to $w$; if $t' \notin Q^k_{v,t}$ then it cannot be covered, that is, $d(t') < c(Q^k_{v,t}) + \omega^*_{t,t'}$.

Conditions 1 and 2 require any expansion to form a new covering set consistent with Definition 6, thus ensuring the algorithm's soundness. Condition 3 requires $Q^k_{v,t}$ to be a proper descriptor of its generative route, meaning that such a route, once walked, covers uniquely the targets included in $Q^k_{v,t}$ and not targets outside it. This last requirement is an operational choice to reduce the number of expansions made in each iteration of the algorithm. Consider for instance a graph with degree bounded by 3: Condition 3 allows us to generate in the worst case 3 new covering sets at each expansion round instead of $|T|$. Notice that under Condition 3 the algorithm can still generate any maximal covering set.

Lemma 5. Any maximal covering set can be generated from the empty set with a sequence of admissible expansions.

Proof sketch. Consider an expansion which, by adding a target $w$, violates Condition 3 for a target $t'$. Such an expansion can always be split in two admissible ones, the first adding $t'$, the second adding $w$. The same rationale works for multiple expansions and multiple $t'$, and clearly applies to maximal covering sets, which are a subset of covering sets. □

Condition 3 can be exploited to reduce the required compute time but, rigorously speaking, it presents a drawback. To establish if target $w$ can be added to set $Q^k_{v,t}$, the algorithm needs to check every shortest path from $t$ to $w$, and such paths can be, in the worst case, exponentially many. We can cope with this by adopting the following simplification. We fix a set of canonical shortest paths by running the Floyd–Warshall algorithm. Then, given $t$ and $w$, we fetch the canonical shortest path between them and we check Condition 3 assuming that such a path is unique. If the condition is not verified under this assumption, then it is also not verified in its original definition. If, otherwise, the condition is verified, then the algorithm (assuming validity of Conditions 1 and 2) might obtain a covering set $\bar{Q}^{k+1}_{v,w}$ which is not a proper one, meaning that by walking its generative route at least one target $t' \notin \bar{Q}^{k+1}_{v,w}$ gets covered. Let us assume that this is the case and, w.l.o.g., that $t'$ is the only invalidating target. The algorithm's current solution representation (the set $\bar{Q}^{k+1}_{v,w}$) would then be inconsistent with the actual solution (the generative route). By Lemma 5 though, we know that the algorithm would generate also set $Q^{k+2}_{v,w}$ by making two admissible expansions with $t'$ and $w$ to $Q^k_{v,t}$. Since $Q^{k+2}_{v,w} = \bar{Q}^{k+1}_{v,w} \cup \{t'\}$, the non-proper covering set $\bar{Q}^{k+1}_{v,w}$ is removable by inter-set dominance (Definition 5). Obviously this workaround comes at an additional cost: the algorithm unnecessarily generates the set $\bar{Q}^{k+1}_{v,w}$, which under the proper definition of Condition 3 would have been avoided. Still, the above solution turned out to be very convenient, since exponential multiplicity of shortest paths is very unlikely in graphs representing real environments.

In Algorithm 1 we provide full technical details. Covering sets obtained by the algorithm are grouped in collections: $C^k_{v,t}$ denotes the collection of all covering sets of cardinality $k$ where the last expansion added target $t$. After the required initialization steps (Lines 1 and 2), for any generated $Q^{k-1}_{v,t}$ we compute the set of admissible expansions $Q^+$ (Line 6) and we apply each one of them (Line 8). In Step 9, we make use of a procedure called Search$(Q, C)$, where $Q$ is a covering set and $C$ is a collection of covering sets. The procedure outputs $Q$ if $Q \in C$ and $\emptyset$ otherwise. (We adopted an efficient implementation of such a procedure which can run in $O(|T(s)|)$. More precisely, we represent a covering set $Q$ as a binary vector of length $|T(s)|$ where the $i$-th component is set to 1 if target $t_i \in Q$ and 0 otherwise. A collection of covering sets $C$ can then be represented as a binary tree with depth $|T(s)|$. The membership of a covering set $Q$ to collection $C$ is represented with a branch of the tree in such a way that if $t_i \in Q$ then we have a left edge at depth $i - 1$ on such branch. We can easily determine if $Q \in C$ by checking if, traversing a left (right) edge in the tree each time we read a 1 (0) in $Q$'s binary vector, we reach a leaf node at depth $|T(s)|$. The insertion of a new covering set in the collection can be done in the same way by traversing existing edges and expanding the tree where necessary.) If the newly generated covering set is not present in its collection or is already present with a higher cost (Step 10), then collection and cost are updated (Steps 11 and 12).
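The binary-tree representation used by Search can be sketched in Python as a trie over 0/1 vectors. This is a minimal illustration under our own assumptions: the class name `CovSetTrie`, the helper `to_bits`, and the use of nested dictionaries for the tree edges are hypothetical choices, not the implementation adopted in the paper.

```python
class CovSetTrie:
    """Collection of covering sets stored as root-to-leaf branches of a binary tree."""

    def __init__(self, n_targets):
        self.n = n_targets   # every stored vector has length |T(s)|
        self.root = {}       # each node maps a bit (0 or 1) to a child node

    def insert(self, bits):
        assert len(bits) == self.n
        node = self.root
        for b in bits:       # traverse existing edges, create missing ones
            node = node.setdefault(b, {})

    def search(self, bits):
        assert len(bits) == self.n
        node = self.root
        for b in bits:       # one step per target: O(|T(s)|) overall
            if b not in node:
                return False
            node = node[b]
        return True          # a full-depth branch exists, so the set is stored

def to_bits(q, ordered_targets):
    """Binary vector of a covering set q w.r.t. a fixed ordering of the targets."""
    return tuple(1 if t in q else 0 for t in ordered_targets)

targets = ["t1", "t2", "t3"]
trie = CovSetTrie(len(targets))
trie.insert(to_bits({"t1", "t3"}, targets))
```

Both insertion and membership test touch exactly one node per target, which matches the $O(|T(s)|)$ cost stated in the text.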

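Algorithm 1 DP-ComputeCovSets(v, s).
 1: $\forall t \in T(s)$: $C^0_{v,v} = \{\emptyset\}$
 2: $c(\emptyset) = 0$
 3: for all $k \in \{1 \ldots |T(s)|\}$ do
 4:   for all $t \in T(s)$ do
 5:     for all $Q^{k-1}_{v,t} \in C^{k-1}_{v,t}$ do
 6:       $Q^+ = \{w \in T(s) \mid$ Conditions 1–3 are satisfied$\}$
 7:       for all $w \in Q^+$ do
 8:         $Q^k_{v,w} = Q^{k-1}_{v,t} \cup \{w\}$
 9:         $U = $ Search$(Q^k_{v,w}, C^k_{v,w})$
10:         if $c(U) > c(Q^{k-1}_{v,t}) + \omega^*_{t,w}$ then
11:           $C^k_{v,w} = C^k_{v,w} \cup \{Q^k_{v,w}\}$
12:           $c(Q^k_{v,w}) = c(Q^{k-1}_{v,t}) + \omega^*_{t,w}$
13: return $\{C^k_{v,t} : t \in T(s), k \le |T(s)|\}$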
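The enumeration by increasing cardinality can be sketched in Python as follows. This is a simplified illustration rather than a transcription of Algorithm 1: it enforces Conditions 1 and 2 only (Condition 3, which is an operational choice to reduce the number of expansions, is omitted), it uses a dictionary keyed by the pair (covering set, last target) in place of the collections $C^k_{v,t}$ and of Search, and the names `covering_sets`, `min_cost`, `dist` and `deadline` are hypothetical.

```python
def covering_sets(v, targets, dist, deadline):
    """Map (covering set, last target) to the cost of its cheapest generative route."""
    layer = {(frozenset(), v): 0}          # k = 0: empty set, defender still at v
    found = {}
    for _ in range(len(targets)):          # one pass per cardinality k
        nxt = {}
        for (q, last), c in layer.items():
            for w in targets - q:          # Condition 1: w not yet in the set
                arrival = c + dist[last][w]
                if arrival > deadline[w]:  # Condition 2 violated: w reached too late
                    continue
                key = (q | {w}, w)
                if arrival < nxt.get(key, float("inf")):
                    nxt[key] = arrival     # keep only the cheapest generative route
        found.update(nxt)
        layer = nxt
    return found

def min_cost(found, subset):
    """Cheapest covering route over all terminals; infinity if subset is not covering."""
    subset = frozenset(subset)
    return min((c for (q, _), c in found.items() if q == subset),
               default=float("inf"))

# Hypothetical toy instance.
dist = {
    "v": {"t1": 1, "t2": 2},
    "t1": {"v": 1, "t2": 2},
    "t2": {"v": 2, "t1": 2},
}
found = covering_sets("v", {"t1", "t2"}, dist, {"t1": 1, "t2": 3})
tight = covering_sets("v", {"t1", "t2"}, dist, {"t1": 1, "t2": 2})
```

With the first set of penetration times, `{t1, t2}` is a covering set of cost 3 (visit `t1` first); with the tighter deadline on `t2` it is no longer covering and its cost is infinite.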

After Algorithm 1 completed its execution, for any arbitrary $T' \subseteq T$ we can easily obtain the temporal cost of its shortest covering route as $c^*(T') = \min_{Q \in Y_{|T'|}} c(Q)$, where $Y_{|T'|} = \bigcup_{t \in T'} \{\mathrm{Search}(T', C^{|T'|}_{v,t})\}$ (notice that if $T'$ is not a covering set then $c^*(T') = \infty$). For the sake of simplicity, Algorithm 1 does not specify how to carry out two sub-tasks we describe in the following.

The first one is the annotation of dominated covering sets. Each time Steps 11 and 12 are executed, a covering set is added to some collection. Let us call it $Q$ and assume it has cardinality $k$. Each time a new $Q$ has to be included at cardinality $k$, we mark all the covering sets at cardinality $k - 1$ that are dominated by $Q$ (Definition 5). The number of sets that can be dominated is in the worst case $|Q|$, since each of them has to be searched in collection $C^{k-1}_{v,t}$ for each feasible terminal $t$ and, if found, marked as dominated. The number of terminal targets and the cardinality of $Q$ are at most $n$ and, as described above, the Search procedure takes $O(|T(s)|)$. Therefore, dominated covering sets can be annotated with an $O(|T(s)|^3)$ extra cost at each iteration of Algorithm 1. We can only mark and not delete dominated covering sets, since they can generate non-dominated ones in the next iteration.

The second task is the generation of routes. To do this we maintain a list of generating routes by iteratively appending terminal vertex $w$ to the generative route of $Q^{k-1}_{v,t}$ when set $Q^{k-1}_{v,t} \cup \{w\}$ is included in its corresponding collection. At the end of the algorithm only routes that correspond to non-dominated covering sets are returned. Maintaining such a list introduces an $O(1)$ cost.

Theorem 6. Algorithm 1 is an exact algorithm and has worst-case complexity of $O(|T(s)|^2 2^{|T(s)|})$, since it has to compute covering sets up to cardinality $|T(s)|$. With annotations of dominances and routes generation the whole algorithm yields a worst-case complexity of $O(|T(s)|^5 2^{|T(s)|})$.

Notice that the algorithm is exact since it is based on an enumeration procedure.

Example 3. We provide a simple example of execution of Algorithm 1. Consider a problem instance with a single signal and arbitrary target values, while topology and penetration times are as follows:

We report the expansions made by the algorithm for increasing cardinalities (value of $k$) in the following table.

k = 0    | k = 1                      | k = 2                          | k = 3
∅, c = 0 | Q^1_{v,t3} = {t3}, c = 1   | Q^2_{v,t1} = {t1,t3}, c = 2    | Q^3_{v,t2} = {t1,t2,t3}, c = 3
         |                            |                                | Q^3_{v,t4} = {t1,t3,t4}, c = 4
         |                            | Q^2_{v,t2} = {t2,t3}, c = 2    | Q^3_{v,t1} = {t1,t2,t3}, c = 3
         |                            |                                | Q^3_{v,t4} = {t2,t3,t4}, c = 4
         |                            | Q^2_{v,t4} = {t3,t4}, c = 2    | Q^3_{v,t1} = {t1,t3,t4}, c = 4
         |                            |                                | Q^3_{v,t2} = {t2,t3,t4}, c = 4, ♦
         | Q^1_{v,t4} = {t4}, c = 1   | Q^2_{v,t3} = {t3,t4}, c = 2    | Q^3_{v,t1} = {t1,t3,t4}, c = 3
         |                            |                                | Q^3_{v,t2} = {t2,t3,t4}, c = 3

Notice that Condition 3 intervenes both when expanding covering sets with $k = 1$ and $k = 2$. In the table, $c$ indicates the temporal cost of the relative covset, and marker symbols pair each dominated covset (e.g., the one marked with ♦) with the covset that dominates it.

3.2.2. Approximation algorithm

The dynamic programming algorithm presented in the previous section cannot be directly adopted to approximate the maximal covering routes. We notice that, even in the case we introduce a logarithmic upper bound over the size of the covering sets generated by Algorithm 1, we could obtain a number of routes that is $O(2^{\log^2(|T(s)|)})$, and therefore exponential. Thus, our goal is to design a polynomial-time algorithm that generates a polynomial number of good covering routes. We observe that if we have a total order over the vertices and we work over a complete graph of the targets where each edge corresponds to the shortest path, we can find in polynomial time the maximal covering routes subject to the constraint that, given any pair of targets $t, t'$ in a route, $t$ can precede $t'$ in the route only if $t$ precedes $t'$ in the order. We call monotonic a route satisfying a given total order. Algorithm 2 returns the maximal monotonic covering routes when the total order is lexicographic (trivially, to change the order, it is sufficient to re-label the targets).

Algorithm 2 is based on dynamic programming and works as follows. $R(k, l)$ is a matrix storing in each cell one route, while $L(k, l)$ is a matrix storing in each cell the maximum lateness of the corresponding route (see below for the meaning of $k$ and $l$). The maximum lateness of a route $r$ captures the maximum delay between a target's first visit and its deadline. Formally, it is defined as $\max_{t \in T(s)} (A_r(t) - d(t))$. The route stored in $R(k, l)$ is the one with the minimum lateness among all the monotonic ones covering $l$ targets where $t_k$ is the first visited target. Thus, basically, when $l = 1$, $R(k, l)$ contains the route $\langle v, t_k \rangle$, while, when $l > 1$, $R(k, l)$ is defined by appending to $R(k, 1)$ the best (in terms of minimizing the maximum lateness) route $R(k', l - 1)$ for every $k' > k$, in order to satisfy the total order. The whole set of routes in $R$ is returned.8 The complexity of Algorithm 2 is $O(|T(s)|^3)$, except the time needed to find all the shortest paths.

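Algorithm 2 MonotonicLongestRoute(v, s).
 1: $\forall k, k' \in \{1, 2, \ldots, |T(s)|\}$: $R(k, k') = \emptyset$, $L(k, k') = +\infty$, $CR(k) = \emptyset$, $CL(k) = +\infty$
 2: for all $k \in \{|T(s)|, |T(s)| - 1, \ldots, 1\}$ do
 3:   for all $l \in \{1, 2, \ldots, |T(s)|\}$ do
 4:     if $l = 1$ then
 5:       $R(k, l) = \langle v, t_k \rangle$
 6:       $L(k, l) = \omega^*_{v,t_k} - d(t_k)$
 7:     else
 8:       for all $k'$ s.t. $|T(s)| \ge k' > |T(s)| - k$ do
 9:         $CR(k') = \langle R(k, 1), R(k', l - 1) \rangle$
10:         $CL(k') = \max\{L(k, 1), \omega^*_{v,t_k} + \omega^*_{t_k,t_{k'}} - \omega^*_{v,t_{k'}} + L(k', l - 1)\}$
11:       $j = \arg\min_j \{CL(j)\}$
12:       if $CL(j) \le 0$ then
13:         $R(k, l) \leftarrow CR(j)$
14:         $L(k, l) \leftarrow CL(j)$
15: return $R$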

We use different total orders (breaking ties randomly) over the set of targets, collecting all the routes generated using each total order:

• increasing order of $\omega^*_{v,t}$: the rationale is that targets close to $v$ will be visited before targets far from $v$;

• increasing order of $d(t)$: the rationale is that targets with short deadlines will be visited before targets with long deadlines;

• increasing order of $d(t) - \omega^*_{v,t}$ (we call this quantity excess time): the rationale is that targets with short excess time will be visited before targets with long excess time.
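The three total orders listed above can be sketched with plain sorting keys. This is only an illustration: the function name `total_orders` and the dictionaries `dist` (standing for $\omega^*$) and `deadline` (standing for $d$) are hypothetical, and ties are resolved here by Python's stable sort rather than randomly as in the text.

```python
def total_orders(v, targets, dist, deadline):
    """Return the targets sorted by distance from v, by deadline, and by excess time."""
    by_distance = sorted(targets, key=lambda t: dist[v][t])
    by_deadline = sorted(targets, key=lambda t: deadline[t])
    by_excess = sorted(targets, key=lambda t: deadline[t] - dist[v][t])
    return by_distance, by_deadline, by_excess

# Hypothetical toy values, chosen so that no ties occur.
dist = {"v": {"t1": 3, "t2": 1, "t3": 2}}
deadline = {"t1": 5, "t2": 6, "t3": 3}
orders = total_orders("v", ["t1", "t2", "t3"], dist, deadline)
```

Each of the three resulting sequences can then be used to re-label the targets before running the monotonic-route computation.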

8 We notice that dominance can be applied to discard dominated routes. However, in this case, the improvement would be negligible since the total


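Algorithm 3 Branch-and-Bound(v, s, ρ).
1: $CL_{max} \leftarrow \emptyset$
2: $CL_{min} \leftarrow \emptyset$
3: for all $t \in T(s)$ do
4:   if $\omega^*_{v,t} \le d(t)$ then
5:     Tree-Search($\rho \cdot |T(s)|$, $\langle v, t \rangle$)
6: return $CL_{max}$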

In addition, we use a random restart, generating random permutations over the targets.

Theorem 7. Algorithm 2 provides an approximation with ratio $\Omega\left(\frac{1}{|T(s)|}\right)$.

Proof sketch. The worst case for the approximation ratio of our algorithm occurs when the covering route including all the targets exists and each covering route returned by our heuristic algorithm visits only one target. In that case, the optimal expected utility of $\mathcal{D}$ is 1. Our algorithm, in the worst case in which $\pi(t) = 1$ for every target $t$, returns an approximation ratio $\Omega\left(\frac{1}{|T(s)|}\right)$. It is straightforward to see that, in other cases, the approximation ratio is larger. □

3.3. Branch-and-bound algorithms

The dynamic programming algorithm presented in the previous section essentially implements a breadth-first search. In some specific situations, depth-first search could outperform breadth-first search, e.g., when penetration times are relaxed and good heuristics lead a depth-first search to find the maximal covering route in a brief time, avoiding a scan of an exponential number of routes as the breadth-first search would do. In this section, we adopt the branch-and-bound approach to design both an exact algorithm and an approximation algorithm. In particular, in Section 3.3.1 we describe our exact algorithm, while in Section 3.3.2 we present the approximation one.

3.3.1. Exact algorithm

Our branch-and-bound algorithm (see Algorithm 3) is a tree-search based algorithm working on the space of the covering routes and returning a set of covering routes $R$. It works as follows.

Initial step. We exploit two global set variables, $CL_{min}$ and $CL_{max}$, initially set to empty (Steps 1–2 of Algorithm 3). These variables contain closed covering routes, namely covering routes which cannot be further expanded without violating the penetration time of at least one target during the visit. $CL_{max}$ contains the covering routes returned by the algorithm, while $CL_{min}$ is used for pruning as discussed below. Given a starting vertex $v$ and a signal $s$, for each target $t \in T(s)$ such that $\omega^*_{v,t} \le d(t)$ we generate a covering route $r = \langle v, t \rangle$ with $r(0) = v$ and $r(1) = t$ (Step 5 of Algorithm 3). Thus, $\mathcal{D}$ has at least one covering route per target that can be covered in time from $v$. Notice that if, for some $t$, such a minimal route does not exist, then target $t$ cannot be covered (even the shortest path from the starting vertex $v$ cannot guarantee capture). This does not guarantee that $\mathcal{A}$ will attack $t$ with full probability since, depending on the values $\pi$, $\mathcal{A}$ could find it more profitable to randomize over a different set of targets. The meaning of parameter $\rho$ (used in Line 5 of Algorithm 3) is described below.

Route expansions. The subsequent steps essentially evolve on each branch according to a depth-first search with backtracking limited by $\rho$. The choice of $\rho$ directly influences the behavior of the algorithm and consequently its complexity. Each node in the search tree represents a route $r$ built so far starting from an initial route $\langle v, t \rangle$. At each iteration, route $r$ is expanded by inserting a new target at a particular position. We denote with $r^+(q, p)$ the route obtained by inserting target $q$ after the $p$-th target in $r$. Notice that every expansion of $r$ will preserve the relative order with which targets already present in $r$ will be visited. The collection of all the feasible expansions $r^+$ (i.e., the ones that are covering routes) is denoted by $R^+$ and it is ordered according to a heuristic that we describe below. Algorithm 6, described below, is used to generate $R^+$ (Step 1 of Algorithm 4). In each open branch (i.e., $R^+ \ne \emptyset$), if the depth of the node in the tree is smaller than or equal to $\rho \cdot |T(s)|$ then backtracking is disabled (Steps 7–10 of Algorithm 4), while, if the depth is larger than such value, it is enabled (Steps 5–6 of Algorithm 4). This is equivalent to fixing the relative order of the first (at most) $\rho \cdot |T(s)|$ inserted targets in the current route. In this case, with $\rho = 0$ we do not rely on the heuristics at all, full backtracking is enabled, the tree is fully expanded and the returned $R$ is complete, i.e., it contains all the non-dominated covering routes. Route $r$ is repeatedly expanded in a greedy fashion until no insertion is possible. As a result, Algorithm 4 generates at most $|T(s)|$ covering routes.
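The insertion operator $r^+(q, p)$ and the collection of feasible expansions can be sketched as follows. This is a hedged illustration, not Algorithm 6: it omits both the heuristic ordering and the pruning discussed in the text, it assumes that position $p = 0$ denotes the starting vertex $v$ (so that a target can also be inserted before the first target), and all function and variable names are hypothetical.

```python
def is_covering(route, dist, deadline):
    """True if every target after r(0) is reached within its penetration time."""
    elapsed = 0
    for a, b in zip(route, route[1:]):
        elapsed += dist[a][b]
        if elapsed > deadline[b]:
            return False
    return True

def insert_after(route, q, p):
    """r+(q, p): new route with target q placed right after position p of r."""
    return route[:p + 1] + [q] + route[p + 1:]

def feasible_expansions(route, targets, dist, deadline):
    """All single-target insertions into the route that remain covering routes."""
    expansions = []
    for q in targets:
        if q in route:
            continue
        for p in range(len(route)):
            candidate = insert_after(route, q, p)
            if is_covering(candidate, dist, deadline):
                expansions.append(candidate)
    return expansions

# Hypothetical toy instance.
dist = {
    "v": {"t1": 1, "t2": 2},
    "t1": {"v": 1, "t2": 2},
    "t2": {"v": 2, "t1": 2},
}
deadline = {"t1": 1, "t2": 3}
expansions = feasible_expansions(["v", "t2"], ["t1", "t2"], dist, deadline)
```

Starting from the route $\langle v, t_2 \rangle$, the only feasible expansion inserts `t1` before `t2`; appending `t1` at the end would reach it after its penetration time. A route with an empty list of expansions is closed in the sense used in the text.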

Pruning. Algorithm 5 is in charge of updating $CL_{min}$ and $CL_{max}$ each time a route $r$ cannot be expanded and, consequently, the associated branch must be closed. We call $CL_{min}$ the minimal set of closed routes. This means that a closed route $r$ belongs to $CL_{min}$ only if $CL_{min}$ does not already contain another $r' \subseteq r$. Steps 1–4 of Algorithm 5 implement such a condition: first, in Steps 2–3 any route $r'$ such that $r' \supseteq r$ is removed from $CL_{min}$, then route $r$ is inserted in $CL_{min}$. Routes in $CL_{min}$ are used by Algorithm 6 in Steps 2 and 6 for pruning during the search. More precisely, a route $r$ is not expanded with a target $q$ at position $p$ if there exists a route $r' \in CL_{min}$ such that $r' \subseteq r^+(q, p)$. This pruning rule is safe since by definition if $r' \in CL_{min}$, then all the possible expansions of $r'$ are unfeasible, and if $r' \subseteq r$ then $r$ can be obtained by expanding from $r'$. This pruning mechanism explains why, once a route $r$ is closed, it is always inserted in $CL_{min}$ without checking the