Contents lists available atScienceDirect
Artificial
Intelligence
www.elsevier.com/locate/artint
Adversarial
patrolling
with spatially
uncertain
alarm
signals
Nicola Basilico
a,
∗
, Giuseppe De Nittis
b, Nicola Gatti
baDepartmentofComputerScience,UniversityofMilan,Milano,Italy
bDipartimentodiElettronica,InformazioneeBioingegneria,PolitecnicodiMilano,Milano,Italy
a
r
t
i
c
l
e
i
n
f
o
a
b
s
t
r
a
c
t
Articlehistory:
Received 15 June 2015
Received in revised form 6 August 2016 Accepted 26 February 2017
Available online 4 March 2017
Keywords:
Security games Adversarial patrolling Algorithmic game theory
When securing complex infrastructures orlarge environments, constantsurveillance of everyarea is not affordable.To cope withthisissue, a commoncountermeasureis the usageof cheapbut wide-rangedsensors, abletodetect suspicious events thatoccur in largeareas,supportingpatrollerstoimprovetheeffectivenessoftheirstrategies.However, suchsensors are commonly affectedby uncertainty. In the present paper, wefocus on spatiallyuncertainalarmsignals.Thatis,thealarmsystemisabletodetect anattackbut itisuncertainontheexactposition wheretheattackistakingplace.Thisiscommonwhen theareatobesecurediswide,suchasinborderpatrollingandfairsitesurveillance.We propose,tothebestofourknowledge,thefirstPatrollingSecurityGamewhereaDefender
issupportedbyaspatiallyuncertainalarmsystem,whichnon-deterministicallygenerates
signals onceatargetisunderattack.WeshowthatfindingtheoptimalstrategyisFNP-hard evenintreegraphsandAPX-hardinarbitrarygraphs.Weprovidetwo(exponentialtime) exactalgorithms andtwo (polynomialtime)approximationalgorithms.Finally,weshow that,withoutfalsepositivesandmisseddetections,thebestpatrollingstrategyreducesto stayinaplace,waitforasignal,and respondtoitatbest.Thisstrategyisoptimal even withnon-negligiblemisseddetectionrates,which,unfortunately,affecteverycommercial alarmsystem. We evaluate our methods in simulation,assessing both quantitativeand qualitativeaspects.
©2017ElsevierB.V.Allrightsreserved.
1. Introduction
SecurityGamesmodelthetaskofprotectingphysicalenvironmentsasanon-cooperativegamebetweena
Defender and
an
Attacker
[1].ThesegamesusuallytakeplaceunderaStackelberg (a.k.a. leader–follower)
paradigm[2],wheretheDefender (leader) commits to astrategy and the Attacker(follower) first observes such commitmentandthen best responds to it. As discussed in the seminal work [3],finding a leader–follower equilibrium is computationally tractable in games with one follower andcomplete information, while it becomes hard in Bayesian games with differenttypes of Attacker. The availability of such computationallytractable aspects of SecurityGames led tothe development ofalgorithms capable of scalingup tolargeproblems,makingthemdeployableinthesecurityenforcingsystemsofseveralreal-worldapplications. The first notable examples are the deployment ofpolice checkpoints atthe Los AngelsInternationalAirport [4] andthe schedulingoffederalairmarshalsovertheU.S.domesticairlineflights[5].Morerecentcasestudiesincludethepositioning of U.S.CoastGuardpatrols to securecrowdedplaces,bridges, andferries [6]andthearrangementof cityguards tostop fare evasioninLos AngelesMetro [7].Finally,a similar approachisbeingtested andevaluated inUganda, Africa, forthe*
Corresponding author.E-mailaddress:[email protected](N. Basilico). http://dx.doi.org/10.1016/j.artint.2017.02.007 0004-3702/©2017 Elsevier B.V. All rights reserved.
protection of wildlife [8]. Thus, Security Games emerged asan interesting game theoretical tool andthen showed their on-the-fieldeffectivenessinanumberofrealsecurityscenarios.
We focus ona specific classof security games, calledPatrolling Security Games. Thesegames are modeled as infinite-horizon extensive-form games in which the Defender controls one or more patrollers moving within an environment, represented asa finite graph.The Attacker, besides havingknowledge of the strategy to whichthe Defender committed to, can observethe movementsof thepatrollers atany time andusesuch informationin decidingthe mostconvenient timeandtargetlocationtoattack[9].Whenmultiplepatrollersareavailable,coordinatingthematbestisingeneralahard taskwhich,besidescomputationalaspects,mustalsokeepintoaccountcommunicationissues[10].However,thepatrolling problemis tractable, even withmultiple patrollers,in border security(e.g., lineand cyclegraphs), when patrollershave homogeneousmovingandsensingcapabilitiesandalltheverticescomposingthebordersharethesamefeatures[11]. Scal-ing thismodelinvolved thestudy ofhowto compute patrollingstrategies inscenarios wherethe Attacker isallowed to performmultipleattacks[12].Similarly,coordinationstrategiesamongmultipledefendersare investigatedin[13].In[14], theauthors studythecaseinwhich thereisa temporaldiscount onthe targets. Extensions arediscussed in[15], where coordination strategiesbetweendefenders areexplored, in[16],wherea resourcecan covermultipletargets, andin[17]
whereattackscanbe detectedatdifferentstageswithdifferentassociated utilities.Finally,some theoreticalresultsabout propertiesofspecific patrollingsettingsareprovided in[18]. Inthepresentpaper,weprovide anewmodelofPatrolling SecurityGamesinwhichtheDefenderissupportedbyan
alarm system deployed
intheenvironment.1.1. Motivating scenarios
Often,inlargeenvironments,aconstantsurveillanceofeveryareaisnotaffordablewhilefocused inspectionstriggered by alarms are more convenient.Real-world applications includeUAVs surveillance oflarge infrastructures [19], wildfires detectionwithCCDcameras[20],agriculturalfieldsmonitoring[21],surveillancebased onwirelesssensornetworks[22], andborderpatrolling[23].Alarmsystemsareinpracticeaffectedby
detection uncertainty,
e.g.,misseddetectionsandfalse positives,andlocalization (a.k.a.
spatial)uncertainty,e.g.,thealarmsystemisuncertainabouttheexacttargetunderattack. Wesummarilydescribetwopracticalsecurityproblemsthatcanbeascribedtothiscategory.Wereportthemasexamples, presentingfeatures andrequirementsthatourmodelcan properlydealwith. Intherestofthe paperwewill necessarily take a generalstance,butwe encourage thereaderto keep inmind thesetwo casesasreferenceapplications fora real deploymentofourmodel.1.1.1. Fight to illegal poaching
Poachingisawidespreadenvironmentalcrimethatcausestheendangermentofwildlifeinseveralregionsoftheworld. Itsdevastatingimpactmakesthedevelopmentofsurveillancetechniquestocontrastthiskindofactivitiesoneofthemost importantmattersinnationalandinternationaldebates. Poachingtypicallytakesplaceovervastandsavageareas,making itcostly andineffective tosolelyrelyon persistentpatrolby rangersquads.Toovercomethisissue,recentdevelopments havefocused onprovidingrangers withenvironmental monitoringsystemstobetter plantheir inspections, concentrating theminareaswithlargelikelihoodofspottingacrime.SuchsystemsincludetheuseofUAVsflyingoverthearea,alarmed fences, andon-the-field sensors trying to recognize anomalous activities.1 In all these cases, technologiesare meant to
workasanalarmsystem:oncetheillegalactivityisrecognized,asignal issenttotherangersbasestationfromwherea responseisundertaken.Inthegreatmajorityofcases,asignalcorrespondstoaspatiallyuncertainlocalizationoftheillegal activity.Forexample,acamera-equippedUAVcanspotthepresenceofapickupinaforbiddenarea butcannotderivethe actuallocationtowhichpoachersaremoving.Inthesameway,alarmed fencesandsensorscanonlytransmitthelocation ofviolated entrancesorforbiddenpassages.Inall thesecasesasignal impliesarestricted,yetnotprecise,localizationof thepoachingactivity.TheuseofSecurityGamesinthisparticulardomainisnotnew(see,forexample,[8]).However, our model allows thecomputation of alarm responsestrategies fora given alarmsystem deployed on thefield. This can be doneby adoptingadiscretizationoftheenvironment,whereeach targetcorrespondstoasector, valuesare relatedtothe expectedpopulationofanimalsinthatsector,anddeadlinesrepresenttheexpectedcompletiontimeofillegalhunts(these parameterscanbederivedfromdata,asdiscussedin[8]).
1.1.2. Safety of fair sites
Fairsarelargepubliceventsattendedbythousandsofvisitors,wheretheproblemofguaranteeingsafetyforthehosting facilitiescanbeveryhard.Forexample,Expo2015,therecentUniversalExpositionhostedinMilan,Italy,sawanaverageof about100,000visitsperday.Thisposestheneedforcarefullyaddressing safetyrisks,whichcanalsoderivefromplanned act of vandalism or terrorist attacks. Besides security guards patrols, fair sites are often endowed with locally installed monitoringsystems.Expo2015employedaround200bafflegatesand400metaldetectorsattheentranceofthesite.The internalareawasconstantlymonitoredby4000surveillancecamerasandby700guards.Likely,whenoneormoreofthese devices/personnelidentifiedasecuritybreach,asignalwassenttothecontrolroomtogetherwithacircumscribedrequest of intervention. This approach is required because, especially in this kind of environments, detecting a security breach
andneutralizing itare verydifferenttasks. Thelatterone,in particular,usually requiresagreater effort involvingspecial equipmentandpersonnelwhoseemploymentonademandbasisismuchmoreconvenient.Moreover,thedetectinglocation ofa threatisinmanycasesdifferentfromthelocation whereitcouldbe neutralized,making therequestofintervention spatially uncertain.Forinstance, consider asecurity guard ora surveillance camera detecting the visitors’ reactions to a shootingrampageperformedbysomeattacker.Inexampleslikethese, wecanrestrict theareawherethesecuritybreach happenedbutnopreciseinformationaboutthelocationcanbegatheredsincetheattackerwillprobablyhavemoved.Our modelcouldbe appliedtoprovideapolicy withwhichscheduleinterventionsuponasecuritybreachisdetectedinsome particular section ofthesite. Insuch case, targetscouldcorrespond tobuildings orother installationswhere visitorscan go.Values anddeadlinescanbechosen accordingtotheimportance oftargets, theirexpectedcrowding,andtherequired responsepriority.
1.2. Alarms and security games
While theproblemofmanaging a sensornetworkto optimallyguard security-criticalinfrastructures hasbeen investi-gatedinrestricteddomains,e.g.[24],theproblemofintegratingalarmsignalstogetherwithadversarialpatrollingisalmost completelyunexplored.Theonlyworkthatcanbeclassifiedunderthisscopeis[25].Thepaperproposesaskeletonmodel of analarm systemwheresensors havenospatial uncertaintyindetecting attackson single targets. Theauthors analyse howsensoryinformationcanimprovetheeffectivenessofpatrollingstrategiesinadversarialsettings.Theyshowthat,when sensors arenotaffectedby falsenegativesandfalse positives,thebeststrategyprescribes thatthepatrollerjustresponds toan alarmsignalrushingtothetarget underattack withoutpatrollingtheenvironment.Asaconsequence,insuch cases themodeltreatmentbecomes trivial.Ontheother hand,whensensorsareaffectedonlyby falsenegatives,thetreatment canbecarriedoutbymeansofaneasyvariationofthealgorithmforthecasewithoutsensors[9].Inthelastcase,where false positivesareadmitted,theproblembecomescomputationally intractable.Tothebestofourknowledge,noprevious resultdealingwithspatialuncertainalarmsignalsinadversarialpatrollingispresentintheliterature.
Effectively exploitinganalarm systemanddetermining agood deploymentforit(e.g.,selecting thelocation whereto installsensors)arecomplementarybutradicallydifferentproblems.Theresultsweprovideinthisworklieinthescopeof the firstone while thetreatment ofthesecond one isleft forfuture works.In other words,we assume thata deployed alarm systemis givenandwe dealwiththeproblemofstrategicallyexploitingitatbest.Anyapproach tosearch forthe optimaldeploymentshouldknow,inprinciple,howtoevaluatepossibledeployments.Insuchsense,ourproblemneedsto beaddressedbeforeonemightdealwiththedeploymentone.
1.3. Contributions
In thispaper,wepropose thefirst SecurityGamethat integratesaspatially uncertainalarmsystemin game-theoretic settings forpatrolling.2 Analarmsignalcarriestheinformationaboutthesetoftargetsthatcan beunderattackanditis
describedbytheprobabilityofbeinggeneratedwheneachtargetisattacked.Theanalysisandthemainresultswepropose inthisworkaredevotedtothebasicgamemodelwheretheDefendercancontrolasinglepatrollerandthealarmsystem is immune tofalse positivesandfalse negatives,making spatialuncertainty itsonly limitation.As ourresultsshow,such assumptions donot playdownthe significanceofthearisingcomputational challengeswhoseresolutionisaprerequisite formorecomplexsettings.Indeed,extensionsofthismodelthatgeneralizetomulti-resourcesettings[27]andthatconsider falsenegatives[28]arebuiltonthebasicresultprovidedinthiswork.Thegameweconsidercanbedecomposedinafinite numberoffinite-horizonsubgames,eachcalledSignalResponseGamefrom
v (SRG-v)
andcapturingthesituationinwhich theDefenderisinavertexv and theAttackerattackedatarget,andaninfinite-horizongame,calledPatrollingGame(PG), in whichtheDefendermovesinabsence ofanyalarmsignal. Weshow thatSRG-v is FNP-hardfortree-basedtopologies andthat,forarbitrarygraphs,isAPX-hard.Weprovidetwoexactalgorithms.Thefirstone,basedondynamicprogramming, performsabreadth-firstsearch,whilethesecondone,basedonbranch-and-boundapproach,performsadepth-firstsearch. Weusethesametwoapproachestodesigntwoapproximationalgorithms.Then,westudythePG,showingthatwhennofalsepositivesandnomisseddetectionsarepresent,theoptimalDefender strategy isto stayina fixedlocation, waitforasignal, andrespondtoitatbest.Thisstrategy keepsbeingoptimaleven whennon-negligiblemisseddetectionratesareallowed.Weexperimentallyevaluatethescalabilityofourexactalgorithms andwecomparethemwiththeapproximation onesintermsofsolutionqualityandcomputetimes,investigatinginhard instances thegapbetweenourhardness resultsandthetheoreticalguaranteesofourapproximationalgorithms.Weshow thatourapproximationalgorithmsprovideveryhighqualitysolutionseveninhardinstances.Finally,weprovideanexample ofresolutionforarealisticinstance,basedonExpo2015,andweshowthat ourexactalgorithmscanbeappliedforsuch kind ofsettings.Moreover, inourrealistic instancewe assesshow theoptimalpatrolling strategycoincides witha static placementevenwhenallowingafalsenegativerateoflessorequalto30%.
Fig. 1. Example of patrolling setting. 1.4. Paper structure
InSection2,weintroduceourgamemodel.InSection3,westudytheproblemoffindingthestrategyoftheDefender forrespondingtoanalarmsignal.InSection4,westudythepatrollingproblem. InSection5,weexperimentallyevaluate our algorithms.In Section 6, we briefly discussthe mainSecurity Games research directions that havebeenexplored in thelastdecades.Finally,Section7concludesthepaper.
Appendix A
includesthemosttechnicalproofsofthepaperwhileAppendix Bdiscussessomeparticularresultsobtainedforspecialclassesofinstances.
Appendix C
reportssome additional experimentalresultsandAppendix D
providesatablesummarizingthenotationsymbolsusedinthepaper.2. Problemstatement
Inthissectionweformalizetheproblemwestudy.Moreprecisely,inSection2.1wedescribethepatrollingsettingand thegamemodel,whileinSection2.2westatethecomputationalquestionsweshalladdressinthiswork.
2.1. Game model
Initially,inSection2.1.1,weintroduceabasicpatrollingsecuritygamemodelintegratingthemainfeaturesfrommodels currently studied in literature. Next, in Section 2.1.2, we extend our game model by introducing alarm signals. In Sec-tion 2.1.3, we depict the game tree ofour patrolling security game withalarm signalsand we decompose it in notable subgamestofacilitateitsstudy.Toeasepresentation,wesummarizedournotationsymbolsin
Table D.3
.2.1.1. Basic patrolling security game
As iscustomary inthe Artificial Intelligenceliterature [9,14], we dealwithdiscrete, both intermsof spaceandtime, patrollingsettings,representinganapproximationofacontinuousenvironment.Specifically,wemodeltheenvironmentto be patrolledasan undirectedconnectedgraph G
= (
V,
E)
,wherevertices representdifferentareasconnectedby various corridors/roads represented through theedges. Timeis discretized inturns. Edges areassigned weights representingthe numberoftimestepsneededtotraversethem.Ifnotstatedotherwise,weshallassumeunitaryweights.Withω
∗i,j weshall denotethe shortesttime togo fromi to j. Wedefine T⊆
V the subset ofsensiblevertices,calledtargets,
that mustbe protectedfrompossibleattacks.Eachtargett
∈
T is characterizedbyavalueπ
(
t)
∈ (
0,
1]
andapenetrationtimed
(
t)
∈ N
+, whichmeasuresthenumberofturnsneededtocompleteanattackto t.Example1. We report in Fig. 1 an example of patrolling setting. Here, V
= {
v0,
v1
,
v2
,
v3
,
v4
}
, T= {
t1,
t2,
t3,
t4}
whereti
=
vifori
∈ {
1,
2,
3,
4}
.Allthetargetst present
thesamevalueπ
(
t)
andthesamepenetrationtimed
(
t)
. Ateachturn,anAttacker
A
andaDefender
D
playsimultaneouslyhavingthefollowingavailableactions:•
ifA
hasnot attackedinthepreviousturns,it canobservethepositionofD
inthegraphG
3 anddecide whethertoattackatargetortowaitfora turn.Theattackisinstantaneous,meaningthatthereisnodelaybetweenthedecision toattackandtheactualpresenceofathreatintheselectedtarget4;
•
D
hasnoinformationabouttheactionsundertakenbyA
inpreviousturnsandselectsthenextvertextopatrolamong thoseadjacenttothecurrentone;eachmovementisanon-preemptivetraversalofasingleedge(
v,
v)
∈
E.Thegamemayconcludeincorrespondenceofanyofthetwofollowingevents.Thefirstoneiswhen
D
patrolsatargett that isunderattackby
A
fromlessthand
(
t)
turns.Insuch casetheattack ispreventedandA
iscaptured.Thesecond 3 Partial observability of Aover the position of Dcan be introduced, as discussed in[29].4 This is a worst-case assumption according to which Ais as strong as possible. It can be relaxed by associating execution costs to the A’s actions, as
Fig. 2. Examples of alarm systems.
one iswhentarget
t is
attackedandD
doesnotpatrolt during
thed
(
t)
turnsthat followthebeginning oftheattack. In such case, theattack issuccessfulandA
escapes withoutbeingcaptured.WhenA
iscaptured,D
receivesa utility of1 andA
receives autility of0.When anattack tot has
success,D
receives1−
π
(
t)
andA
receivesπ
(
t)
.The gamemay notconcludeifA
decidestowaitforeveryobservedpositionofD
,thusneverattacking.Insuchcase,D
receives1 andA
receives 0.(Anotherintuitive waytothinkatthispayoffstructure istoassume thatD
receivesan initialutility of1 and then losesπ
(
t)
whenevertargett is
successfullycompromised.)Notice thatthegameis constantsumandthereforeit is equivalenttoazero-sumgamethroughapositiveaffinetransformation.The above gamemodel isinextensive form(beingplayed sequentially), withimperfectinformation (
D
not observing the actions undertaken byA
), and withinfinite horizon(A
beingin the position to wait forever). The fact thatA
can observetheactions undertakenbyD
beforeactingmakestheleader–follower equilibrium
thenaturalsolutionconceptfor ourproblem, whereD
istheleader and
A
isthefollower.
Sincewe focusonzero-sum games,theleader’s strategyatthe leader–follower equilibriumis its maxminstrategy andit canbe found by employing linear mathematicalprogramming, whichrequirespolynomialtimeinthenumberofactionsavailabletotheplayers[31].2.1.2. Introducing alarm signals
Weextendthegamemodeldescribedintheprevioussectionbyintroducinga
spatial uncertain alarm system that
canbe exploitedbyD
.The basicideais thatan alarmsystemusesa numberofsensorsspread overtheenvironment togather informationaboutpossibleattacksandraisesanalarmsignalatanytimeanattackoccurs.Thealarmsignalprovidessome information aboutthe location (target)wherethe attackis ongoing,butit isaffectedby uncertainty.In otherwords, the alarmsystemdetectsanattackbutitisuncertainaboutthetargetunderattack.Formally,thealarmsystemisdefinedasa pair(
S,
p
)
,whereS
= {
s1,
· · · ,
sm}
isasetofm
≥
1 signals and p:
S×
T→ [
0,
1]
isafunctionthatspecifiestheprobability ofhavingthesystemgeneratingasignal s given thattargett has
beenattacked,wedenotesuch probabilitywith p(
s|
t)
. With aslightabuseofnotation, forasignals we
define T(
s)
= {
t∈
T|
p(
s|
t)
>
0}
and,similarly, foratargett we
haveS
(
t)
= {
s∈
S|
p(
s|
t)
>
0}
.Inthiswork,weassumethat:•
thealarmsystemisnotaffectedbyfalsepositives(signalsgeneratedwhennoattackhasoccurred).Formally,p
(
s| )
=
0,where
indicatesthatnotargetsareunderattack;•
thealarmsystemisnotaffectedbyfalsenegatives(signalsnotgeneratedeventhoughanattackhasoccurred).Formally,p
(
⊥
|
t)
=
0, where⊥
indicates that no signals have been generated; in Section 4 we will show that the optimal strategieswecomputeunderthisassumptioncanpreserveoptimalityeveninpresenceofnon-negligiblefalsenegatives rates.Example2.We report two examples of alarm systemsfor the patrolling settingdepicted in
Fig. 1
. The first example is reportedinFig. 2
(a).Itisalow-accuracyalarmsystemthatgeneratesthesamesignalanytimeatargetisunderattackand thereforethatdoesnotprovideanyinformationaboutthetargetunderattack.ThesecondexampleisreportedinFig. 2
(b). Itprovidesmoreaccurateinformationaboutthelocalizationoftheattackthanthepreviousexample.Here,thereceptionof asignals
i implies,underauniformstrategyofA
,ahighprobabilityofanattackontargett
i.Namely,whent
iisattacked thealarmsystemgeneratess
i withhighprobabilityandadifferentsignalotherwise.Giventhepresence ofanalarm systemdefinedasabove, thegamemechanismchanges inthefollowingway.At each turn, before deciding its next move,
D
observes whether ornot a signal hasbeen generated by the alarm systemand then makes its decision considering such information. This introduces in our game a node of chance implementing the non-deterministicselectionofsignals.2.1.3. The game tree and its decomposition
Herewedepict thegametreeofourgamemodel,decomposing itinsomerecurrentsubgames.Aportionofthegame isdepictedin
Fig. 3
.Suchtreecanbereadalongthefollowingsteps.•
Root of the tree.A
decideswhethertowaitforaturn(thisactionisdenotedas toindicatethatnotargetisattacked) ortoattackatargett
∈
T (this actionisdenotedbythelabelt of
theattackedtarget).•
Second level of the tree.N
denotesthealarmsystem,representedbyanature-typeplayer.Itsbehaviorisa priori specified
bytheconditional probabilitymassfunction p, whichdeterminesthegeneratedsignal giventheattackperformedby
A
.Inparticular,itisusefultodistinguishbetweentwocases:(a) if
A
doesnotattack(action),thennosignalwillbegeneratedundertheassumptionthat p(
⊥
| )
=
1; (b) ifA
attackstargett,
asignals will
bedrawnfromS
(
t)
withprobabilityp
(
s|
t)
(recallthatp
(⊥
|
t)
=
0).•
Third level of the tree.D
observesthesignalgeneratedbythealarmsystemanddecidesthenextvertextopatrolamong thoseadjacenttothecurrentone(thecurrentvertexisinitiallychosenbyD
).•
Fourth level of the tree and on. It isusefultodistinguishbetweentwocases:(a) ifnoattackispresent,thenthetreeofthesubgamestartingfromhereisthesameofthetreeofthewholegame, exceptforthepositionof
D
thatmaybe differentfromtheinitialone(noticethatinthiscaseeachofA
’sdecision nodesisasingletoninformationset,modelingthefactthatA
canobserveD
’sposition);(b) ifanattackistakingplaceontarget
t,
thenonlyD
canact.Suchgametreecanbedecomposedinanumberoffiniterecurrentsubgamessuchthatthebeststrategiesoftheagents ineachsubgame areindependentfromthoseinothersubgames. Thisdecompositionallowsustoapplya
divide et impera
approach,simplifyingtheresolutionoftheproblemoffindinganequilibrium.Moreprecisely,wedenotewith
oneofthese subgames.Wedefine
asagamesubtreethatcanbeextractedfromthetreeof
Fig. 3
asfollows.GivenD
’scurrentvertexv
∈
V , selectadecisionnodeforA
andcalliti.
Then,extractthesubtreerootedini discarding
thebranchcorrespondingto action(noattack).5Intuitively,suchsubgamemodelstheplayers’interactionwhentheDefenderisinsomegivenvertex v and theAttacker willperforman attack. Asa consequence,each
obtainedinsuch wayis finite(once anattack on
t
started,themaximumlengthofthegameis
d
(
t)
).Moreover,thesetofdifferent
swecanextractisfinitesincewehaveone subgameforeachpossiblecurrentvertexfor
D
.As aconsequence,we canextractatmost|
V|
differentsubgames. Notice that,duetotheinfinitehorizon,eachsubgamecanrecuran infinitenumberoftimesalongthegametree.However,being suchrepetitionsindependentandthegamezero-sum,weonlyneedtosolveonesubgametoobtaintheoptimalstrategyto beappliedineachofitsrepetitions.Inotherwords,whenassumingthatanattackwillbeperformed,theagents’strategies can be split in a number of independent strategiessolely depending on thecurrent position ofD
. The reasonwhy we discarded thebranchcorresponding toactionineach subgameis thatwe seektodeal withsuch complementary case exploitingasimplebackwardinductionapproach,asexplainedlater.
First,wecall
Signal
Response Game from v the subgamedefinedasaboveandcharacterizedbyavertex
v representing
thecurrentvertexof
D
(forbrevity,weshalluseSRG-v).InanSRG-v,thegoalofD
istofindthebeststrategystartingfrom vertex v torespondtoanyalarmsignal.AlltheSRG-vsareindependentandthusthebeststrategyineach subgamedoes notdependonthestrategiesoftheothersubgames.TheintuitionisthatthebeststrategyinanSRG-v doesnotdependon theverticesvisitedbyD
before theattack.Givenan SRG-v,we denotebyσ
vD,s thestrategyofD
onceobservedsignals,
byσ
vD thestrategyprofileσ
vD= (
σ
vD,s1. . . ,
σ
vD,sm)
ofD
,andbyσ
vAthestrategyofA
.LetusnoticethatinanSRG-v,givenasignal
s,
D
istheonlyagent thatplaysandthereforeeachsequenceofmovescan becollapsedina singleaction.Thus, SRG-v isessentiallyatwo-levelgameinwhichA
decidesthetargettoattackandD
decidesthesequenceofmovesonthe graph.Then,accordingtoclassicalbackward inductionarguments[32],oncewehavefoundthebeststrategiesofeach SRG-v, we can substitute the subgameswith the agents’ equilibriumutilities andthen we can find the best strategy of
D
for patrollingthe verticeswheneverno alarmsignal has beenraised andthebest strategyof attack forA
.We call thislast problemPatrolling Game (for
conciseness,we shallusePG).We denotebyσ
D andσ
A thestrategiesofD
andA
, respec-tively,inthePG.5 Rigorously speaking, our definition of subgame is not compliant with the definition provided in Game Theory[32], which requires that all the actions
of a node belong to the same subgame (and therefore we could not separate action from actions t).
However, we can slightly change the structure of
our game making our definition of subgame compliant with the one from game theory. More precisely, it is sufficient to split each node of Ainto two nodes: in the first Adecides to attack a target or to wait for one turn, and in the second, conditioned to the fact that Adecided to attack, Adecides which target to attack. This way, the subtree whose root is the second node of Ais a subgame compliant with game theory. It can be easily observed that this change to the game tree structure does not affect the behavior of the agents.
N. Basilico et al. / Artificial Int elligence 246 (2017) 220–257
Fig. 3. Gametree,v isassumedtothebecurrentvertexforD,r isacollapsedsequenceofmoves,calledroute,thatDperformsonthegraph,dashedlinesindicateinformationsets,andfunctionUA(ri,tj)
2.2. The computational questions we pose
Ourcontributionsfocusonthedesignofalgorithmstofindan equilibriumforourgameanddevelopalongfourcentral questions.Thefirstonestemsdirectlyfromourproblemdefinition.
Question1.
Which is the best patrolling strategy for
D
maximizing its expected utility?Clearly,thisproblemisrelatedtowhatwecalledPGinourgamedecomposition.Inordertobuildan answer,wepose otherthreequestionsthat,instead,involvetheothersubgamecalledSRG-v.
Question2.
Given a starting vertex v and a signal s, is there any strategy allowing
D
to visit all the targets in T(
s)
, each within its deadline?Question3.
Given a starting vertex v and a signal s, is there any pure strategy giving
D
an expected utility of at least k? Question4.Given a starting vertex v and a signal s, is there any mixed strategy giving
D
an expected utility of at least k?These questions are not independent from each other. In particular, answering to the last three is a prerequisite to solving the first one. For thisreason, we shall take a bottom-up approach answering the above questions starting from the last three andthen dealing withthe first one atthe whole-gamelevel. Also, Questions 3 and 4 are not easierthan
Question 2sohardnessresultsforthislastonecanbeextendedtotheothers.
3. Signalresponsegame
InthissectionwestartbydealingwiththeresolutionofSRG-v.Specifically,inSection3.1weprovethehardnessofthe problem, analyzingits computationalcomplexity.Then, inSection 3.2andinSection 3.3we proposetwo algorithms,the firstbasedon
dynamic programming (breadth-first
search)whilethesecondadoptsabranch and bound (depth-first
search) paradigm. Furthermore,we provide a variation for each algorithm, approximating the optimal solution. For the sake of presentation,themosttechnicalproofsarereportedinAppendix A.3.1. Complexity results
Theaimofthissectionistoassessthecomputationalcomplexityoffindinganexactorapproximateequilibriumforour gamemodel.Furthermore,weaimatidentifyingthesourceofhardnessofourproblemandthemostefficientalgorithmfor solvingit.
Each SRG-v is characterized by
|
T|
actions available toA
, each corresponding to a target t, and by O(
|
V|
maxt{d(t)})
decisionnodesof
D
.TheportionofgametreeplayedbyD
canbesafelyreducedbyobservingthatD
willmovebetween anytwo targets along a shortestpath.This allows usto discardfrom thetree all the decisionnodes whereD
occupies a non-targetvertex. Nevertheless, this reduction keeps the size of the game tree exponential in the parameters of the game,specifically O(
|
T|
|T|)
.6 Noticethat,theexponentialsizeofthegametreedoesnotconstituteaproofthatfindingthe equilibriumstrategiesofanSRG-v requiresexponentialtimeintheparametersofthegamebecauseitdoesnotexcludethe existence ofsome compactrepresentationofD
’sstrategies,e.g., Markovianstrategies.Thefirst resultweprovide implies thatsucharepresentationisveryunlikelytoexistorthat,ifitexists,itunlikelycanbecomputedinpolynomialtime.Callgv theexpectedutilityof
A
fromSRG-v (1−
gv isthenthecorrespondingutilityforD
).Wedefinethefollowingproblem.Definition1
(k–SRG-v).
Thedecisionproblemk–SRG-v is
definedas: INSTANCE:aninstanceofSRG-v;QUESTION:isthereany
σ
D suchthat,whenA
playsitsbestresponse,itholdsthat gv≤
k?Theorem1.
k–SRG-v is
NP-hard even when|
S|
=
1 andthe graph is a tree.
7Theaboveresultshowsthatw.r.t.treegraphs:
•
answeringtoQuestion 1
isFNP-hard,•
answeringtoQuestions 2,
3,4isNP-hard. 6 A more accurate bound is O(|T|min{|T|,maxt{d(t)}}).Forthe sake ofcompleteness,noticethat a maxminstrategywithsupport upperboundedby
|
T|
(that is,thenumberof actionsavailabletotheminplayerA
)alwaysexists[33].Now,giventhat weestablishedthatthemainsourceofhardnessoftheproblemiscomputingthestrategyspaceof
D
, wefocusontheproblemoffindinganefficientrepresentationforit.Westartwithsomedefinitions.Definition2
(Route).
Givenastartingvertexv and asignals,
arouter is
afinitesequenceofverticeswhere,calledr
(
i)
thei-th vertex,
r
(
0)
=
v and r(
i)
∈
T(
s)
foranyi
>
0.Withaslightoverloadofnotation, call T(
r)
thesetoftargetsfromthe sequence.We restrict ourattention toa special class ofroutes that we call covering. Fora route r and i
>
0, define Ar(
r(
i))
=
i−1h=0
ω
∗r(h),r(h+1).Such value gives thetime neededbyD
toarrive attargetr
(
i)
after following agraph walk along theshortestpathsbetweenconsecutiveverticesofthesequence
r
(
0),
r(
1),
. . . ,
r(
i)
.Definition3
(Covering route).
Acoveringrouteisarouter where
Ar(
t)
≤
d(
t)
foranyt
∈
T(
r)
.Coveringroutesareabstractionsfor
D
’savailablepure strategieswhenthecurrentvertexis v and asignal s has been generated.Ifr is
acoveringrouteforvertexv and
signals and
T(
r)
⊆
T(
s)
isthesetoftargetsinr,
then wecanalways instantiater to
agraphwalkforD
(asequenceofmoveson G) thatguarantees tocaptureA
onanytarget inT
(
r)
.Such walk issimplyobtainedbystarting fromr
(
0)
=
v and thencoveringanyshortestpathbetweenr
(
i)
andr
(
i+
1)
.Thetotal temporalcostofsuchwalkisexpressedbyc
(
r)
=
Ar(
r(|
T(
r)|))
.Weshallinformallyrefertosuchprocessaswalking a route r.
Coveringroutesthen constitutethe actionspaceforD
is aSRG-v game.Evenwhenconsidering asingle signals,
the numbersuchofsuchroutesis O(
|
T(
s)
|
|T(s)|)
intheworst case.However,somecoveringrouteswillneverbeplayedbyD
duetoanyofthefollowingtwodominancearguments[34] anddiscardingdominatedroutesiscrucialinthedesignofan efficientalgorithm.
Definition4 (Intra-set dominance). Given a starting vertex v, a signal s and two different covering routes r
,
r such thatT
(
r)
=
T(
r)
,ifc
(
r)
≤
c(
r)
thenr dominates r
.Definition5
(Inter-set dominance).
Givenastartingvertexv,
asignals and
twodifferentcoveringroutesr
,
r,ifT(
r)
⊃
T(
r)
then
r dominates r
.Furthermore,itisconvenienttointroducetheconceptof
covering set,
whichisstrictlyrelatedtotheconceptofcovering route.Itisdefinedasfollows.Definition6
(Covering set).
Givenastartingvertexv and
asignals,
acoveringset(covset) Q is asubset ofT(
s)
suchthat thereexistsacoveringrouter with T
(
r)
=
Q .Let usfocuson Definition 4.It suggeststhat we can safelyuseonly one route per coveringset. Coveringsets suffice for describingall the outcomesofthe game,since theagents’payoffs dependonly onthe fact that
A
attacksa target tthat is coveredby
D
ornot, andinthe worst caseare O(
2|T(s)|)
,witha remarkablereduction ofthe search spacew.r.t. O(
|
T(
s)
|
|T(s)|)
. However, any algorithmrestricting on covering sets should be able todetermine whether ornot a set oftargetsisacoveringone,whichisadifficultproblemaswell.
Definition7
(COV-SET).
ThedecisionproblemCOV-SETisdefinedas: INSTANCE:aninstanceofSRG-v withatargetsetT ;
QUESTION:is
T a
coveringset?(Equivalently,doesT admit anycoveringrouter?)
Theorem2.COV-SET is
NP-complete.Therefore,computingacoveringrouteforagivensetoftargets(ordecidingthatnocoveringrouteexists)isnotdoable in polynomial time unlessP
=
NP. Thisshowsthat, whilecovering sets sufficefordefining the payoffsofthe gameand thereforethesizeofthepayoffsmatrixcanbeboundedbythenumberofcoveringsets,inpracticewealsoneedcovering routes to certify that a givensubset of targets is covering. The impossibility to confine our algorithms to the space of covering setsseems tosuggesta complexityworse than O(
2|T(s)|)
.However,inthenext section weprovidean algorithmwithcomplexity
O
(
2|T(s)|)
(neglectingpolynomialterms)toenumerateallandonlythecoveringsetsand,foreachofthem,theassociatedcoveringroutewithminimumcost.
Let usfocuson Definition 5. Inter-Setdominancecan be leveraged tointroduce theconcept of
maximal covering
sets (androutes)whichcouldenableafurtherreductioninthesetofactionsavailabletoD
.Definition8
(MAXIMALITY).
Givenacoveringset Q=
T(
r)
forsomer,
wesaythat Q and r are maximalifthereisnoother coveringrouter
suchthat Q⊂
T(
r)
.In the best case, when there is a route covering all the targets, the numberof maximal covering sets is 1 (and we can safelyrestrict to a single minimum cost coveringroute over that set), while thenumber ofcovering sets (including the non-maximal ones) is 2|T(s)|. Thus, considering only maximal covering sets allows an exponential reduction of the payoffsmatrix.In theworst case, whenall the possiblesubsetscomposed of
|
T(
s)|/
2 targets are maximal coveringsets, thenumberofmaximalcoveringsetsisO
(
2|T(s)|−2)
,whilethenumberofcoveringsetsisO
(
2|T(s)|−1)
,allowingareductionofthepayoffsmatrixby afactorof2.Furthermore,ifweknew
a priori that
Q is a maximalcoveringset,wecould avoid searching forcovering routes for anyset of targetsthat strictly contains Q . When designing an algorithm to solve this problem,Definition 5
couldthenbeexploitedtointroducepruningtechniquestosaveaveragecomputetime.However,the followingresultshowsthatdecidingifacoveringsetismaximalishard.Definition9
(MAX–COV-SET).
ThedecisionproblemMAX–COV-SETisdefinedas:INSTANCE:aninstanceofSRG-v withatargetset T and acoveringset
T
⊂
T ;QUESTION:is Tmaximal?
Theorem3.
There is no polynomial-time algorithm for MAX–COV-SET unless
P=
NP.Nevertheless,weshow hereafter thatthereexists analgorithm computingallandonlythemaximal coveringsetsand oneroute foreach ofthem (whichpotentiallyleads toanexponentialreduction ofthetime neededforsolvingthelinear program) with only an additional polynomial cost w.r.t. the enumeration of all the covering sets. Therefore, neglecting polynomialterms,ouralgorithmhasacomplexityof
O
(
2|T(s)|)
.Finally,wefocuson thecomplexityofapproximatingthebest solutioninanSRG-v.When
D
restrictsitsstrategiesto bepure,theproblemisclearlynotapproximableinpolynomialtimeevenwhentheapproximationratiodependson|
T(
s)|
. Thebasicintuitionisthat,ifagameinstanceadmitsthemaximal coveringroutethatcoversall thetargetsandthevalue ofall the targetsis 1,then eitherthe maximalcovering route is played returninga utility of1 toD
oranyother route isplayed returninga utilityof 0,butno polynomial-timealgorithmcanfindthe maximalcovering routecovering allthe targets,unlessP=
NP.Ontheotherhand,itisinterestingtoinvestigatethecaseinwhichnorestrictiontopurestrategies ispresent.Weshowthattheproblemkeepsbeinghard.Theorem4.
The
optimization version of k–SRG-v, say OPT–SRG-v, is APX-hard even in the restricted case in which the graph is metric, there is only one signal s, all targets t∈
T(
s)
have the same penetration time d(
t)
, and there exists a maximal covering route covering all the targets.Theabovetheoremallowsustomaketwoimportantremarks.
Remark1. The above result does not exclude the existence of constant-ratio approximation algorithms for OPT–SRG-v.
We conjecture that itis unlikely.OPT–SRG-v presents similarities withthe (metric)DEADLINE–TSP, where thegoal isto findthelongestpathofverticeseach traversed beforeitsdeadline. TheDEADLINE–TSPdoesnot admitanyconstant-ratio approximationalgorithm [35]andthebest-knownapproximationalgorithm haslogarithmicapproximation ratio[36].The followingobservationscanbeproducedabouttherelationshipsbetweenOPT–SRG-v andDEADLINE–TSP:
•
whenthe maximal route coveringall the targetsintheOPT–SRG-v exists, the optimalsolutionof theOPT–SRG-v is alsooptimalfortheDEADLINE–TSPappliedtothesamegraph;•
when themaximal route covering all thetargets inthe OPT–SRG-v doesnot exist,the optimalsolutions ofthe two problemsaredifferent,evenwhenwerestrictustopure-strategysolutionsfortheOPT–SRG-v;•
approximatingtheoptimalsolutionoftheDEADLINE–TSPdoesnotgiveadirecttechnique toapproximateOPT–SRG-v, since we should enumerateall the subsets of targetsand foreach subset of targetswe wouldneed to execute the approximation of the DEADLINE–TSP, butthis would require exponential time. We notice in addition that even the totalnumberofsetsoftargetswithlogarithmicsizeisnotpolynomial,being(
2log2(|T|))
,andthereforeanyalgorithmenumeratingthemwouldrequireexponentialtime;
•
whentheoptimalsolutionoftheOPT–SRG-v israndomized,examplesofoptimalsolutionsinwhichmaximalcovering routes are not played can be produced, showing that at the optimum it is not strictly necessary to play maximal coveringroutes,butevenapproximationssuffice.Remark2.Ifitwerepossibletomap DEADLINE–TSPinstancestoOPT–SRG-v instanceswherethemaximalcoveringroute
covering all thetargetsexists, then itwould trivially followthat OPT–SRG-v doesnot admit anyconstant-approximation ratio.Wewerenotabletofindsuchamappingandwe conjecturethat,ifthereisan approximation-preservingreduction
fromDEADLINE–TSPtoOPT–SRG-v,thenwecannotrestrict tosuch instances.ThestudyofinstancesofOPT–SRG-v where mixedstrategiesmaybeoptimalmakethetreatmentverychallenging.
3.2. Dynamic-programming algorithm
Westartbypresentingtwoalgorithms.Thefirstoneisexact,whilethesecondoneisanapproximationalgorithm.Both algorithmsarebasedonadynamicprogrammingapproach.
3.2.1. Exact algorithm
Inthissectionweprovideanalgorithmtocomputethestrategiesavailableto
D
whenv is
thecurrentvertexandsignals has beengeneratedbythealarmsystem.Theideaistoadoptadynamicprogrammingmethodthatenumeratescovering setsofincreasingcardinalities.Eachcoveringsetisobtainedbyasequenceofexpansionsthat,startingfromtheemptyset, add onetarget ateach iteration.Wedenoteacovering setby Qk
v,t where
k indicates
itscardinalitywhile v and t denote the startingvertexofD
andthelasttarget addedtothe set,respectively.The algorithmoperatesinsuch awaythat the sequence oftargetscorresponding to theexpansionsmadeto obtain Qvk,t isa covering routeforthat set.Informally,we shall call it thegenerative route of
Qkv,t and we willdenote withc(
Qkv,t)
its temporalcost. We choose toobtain thisby requiringanyexpansiontobe admissiblewithrespectto threeconditions.Givenaset Qkv,t a newset Qvk,+w1
=
Qkv,t∪ {
w}
canbeobtainedif:1. w isnotcoveredinthecurrentcoveringset,thatis
w
∈
T(
s)
\
Qk v,t;2. w canbecoveredby
d
(
w)
byextendingthegenerativerouteof Qvk,t withashortestwalkfromt to w,
thatisd
(
w)
≥
c(
Qkv,t)
+
ω
t∗,w;3. call
t
atargetvisitedbyashortestpathfromt to w,
ift
∈
/
Qkv,t thenitcannotbecovered,thatis
d
(
t)
<
c(
Qvk,t)
+
ω
∗t,t. Conditions 1 and 2requireanyexpansion to formanew coveringset consistentwithDefinition 6
,thus ensuring the algorithm’ssoundness.Condition 3requiresQ
kv,t tobea
proper descriptor
ofitsgenerativeroute,meaningthatsuchroute, oncewalked,coversuniquelythetargetsincludedinQ
kv,t andnottargetsoutsideit.Thislastrequirementisanoperational choice to reduce the numberof expansionsmadein each iteration ofthe algorithm. Consider forinstance agraph with degree boundedby 3, Condition 3allows usto generateintheworst case3 new covering setsateach expansion round insteadof
|
T|
.NoticethatunderCondition 3thealgorithmcanstillgenerateanymaximalcoveringset.Lemma5.
Any maximal covering set can be generated from the empty set with a sequence of admissible expansions.
Proofsketch. Consideranexpansionwhich,byaddinga target w, violatesCondition 3foratarget
t
.Such expansioncan always be split intwo admissible ones. Thefirst addingt
,the second adding w. The samerationaleworks formultiple expansionsandmultiplet
andclearlyappliestomaximalcoveringsetswhichareasubsetofcoveringsets.2
Condition 3can beexploitedtoreducetherequiredcompute timebut,rigorouslyspeaking,itpresentsadrawback.To establish iftarget w can be added to set Qk
v,t the algorithmneeds tocheck every shortest pathfrom
t to
w, andsuch paths can be,inthe worstcase, exponentiallymany.We cancope withthisby adoptingthe followingsimplification. We fixasetofcanonicalshortestpathsbyrunningtheFloyd–Warshallalgorithm.Then,givent and
w, wefetchthecanonical shortestpathbetweenthemandwecheckCondition 3assuming thatsuch pathisunique.Iftheconditionisnot verified underthisassumptionthenitisalsonot verifiedinitsoriginal definition.Ifotherwise,thecondition isverified,then the algorithm(assumingvalidityofConditions 1and 2)mightobtainacoveringsetQ¯
k+1v,w whichisnotaproperone,meaning that by walkingits generative route atleastone target t
∈ ¯
/
Qkv,+w1 gets covered.Let usassume that this isthe caseand, w.l.o.g., thatt
isthe onlyinvalidating target. Thealgorithm’s currentsolution representation(the set Q¯
k+1v,w) would then be inconsistent withthe actual solution(the generative route).By Lemma 5 though, weknow that thealgorithm would generatealsoset Qkv+,w2 makingtwoadmissibleexpansionswith
t
andw to Q
kv,t.Since Qvk+,w2= ¯
Qvk,+w1∪ {
t}
thenon-proper covering set Q¯
k+1v,w isremovableby inter-set dominance(Definition 5). Obviously thisworkaroundcomes atan additional cost:thealgorithmunnecessarilygeneratestheset Q
¯
vk+,w1 whichundertheproperdefinitionofCondition 3wouldhavebeen avoided.Still,theabovesolutionturnedoutveryconvenientsinceexponentialmultiplicityofshortestpathsisveryunlikely ingraphsrepresentingrealenvironments.In Algorithm 1 we provide full technical details. Covering sets obtainedby the algorithm are groupedin collections:
C
kv,t denotes thecollectionofallcoveringsets ofcardinality
k where
thelastexpansionaddedtargett.
Aftertherequired initializationsteps(Lines1and2)foranygeneratedQ
vk−,t1wecomputethesetofadmissibleexpansionsQ
+(Line6)andwe applyeachoneofthem(Line8).InStep9,wemakeuseofaprocedurecalledSearch
(
Q,
C)
whereQ is
acoveringsetandC
isacollectionofcoveringsets.TheprocedureoutputsQ if
Q∈
C
and∅
otherwise.(Weadoptedanefficientimplementation ofsuch procedurewhichcanrunin O(
|
T(
s)
|)
.Moreprecisely,werepresentacoveringset Q as abinary vectoroflength|
T(
s)
|
wherethei-th
componentisset to1 iftarget ti∈
Q and 0 otherwise.A collectionofcovering sets C can thenbe representedasa binary treewith depth|
T(
s)|
.The membershipof acovering set Q to collectionC is represented with a branchofthetreeinsuch awaythat ift
i∈
Q then wehavea left edgeatdepthi
−
1 onsuch branch.We caneasily determineif Q∈
C
bycheckingiftraversingaleft(right)edgeinthetreeeachtimeweread a1 (0)in Q ’s binaryvector wereachaleafnodeatdepth|
T(
s)
|
.Theinsertionofanewcoveringsetinthecollectioncanbedoneinthesamewayby traversingexistingedgesandexpandingthetreewherenecessary.)Ifthenewlygeneratedcoveringsetisnotpresentinits collectionorisalreadypresentwithahighercost(Step10),thencollectionandcostareupdated(Steps11and12).Algorithm1DP-ComputeCovSets(v
,
s). 1:∀t∈T(s): C0v,v= {∅} 2: c(∅)=0 3: for all k∈ {1 . . .|T(s)|}do 4: forall t∈T(s)do 5: forall Qkv−,t1∈ C k−1 v,t do6: Q+= {w∈T(s)|Conditions 1–3 are satisfied}
7: forall w∈Q+do 8: Qk v,w=Qkv−,t1∪ {w} 9: U=Search(Qk v,w,Ckv,w) 10: if c(U)>c(Qvk−,t1)+ωt∗,wthen 11: Ck v,w= Ckv,w∪ {Qvk,w} 12: c(Qk v,w)=c(Qkv−,t1)+ωt∗,w 13: return{Ck v,t:t∈T(s),k≤ |T(s)|}
After
Algorithm 1
completeditsexecution,foranyarbitraryT
⊆
T we caneasilyobtainthetemporalcostofitsshortest coveringrouteasc∗
(
T)
=
minQ∈Y|T|c
(
Q)
where Y|T|= ∪
t∈T{
Search(
T,
C
|T|
v,t
)}
(notice that ifT isnot acovering setthen c∗(
T)
= ∞
).Forthe sakeof simplicity,Algorithm 1doesnotspecifyhowtocarryouttwosub-taskswedescribeinthefollowing.
Thefirstoneisthe
annotation of dominated covering sets.
EachtimeSteps11and12areexecuted,acoveringsetisadded tosomecollection.Letuscallit Q and assumeithascardinalityk.
EachtimeanewQ has
tobeincludedatcardinalityk,
wemarkallthecoveringsetsatcardinality
k
−
1 thataredominatedby Q (Definition 5).Thenumberofsetsthatcanbe dominatedisintheworstcase|
Q|
sinceeachofthemhastobesearchedincollectionC
kv−,t1foreachfeasibleterminalt and,
iffound,markedasdominated.Thenumberofterminaltargetsandthecardinalityof Q are atmost
n and,
asdescribed above,the Search procedure takes O(|
T(
s)|)
.Therefore,dominatedcoveringsetscanbeannotatedwitha O(|
T(
s)|
3)
extra cost ateachiteration ofAlgorithm 1
.We canonly markandnotdeletedominated coveringsets sincethey cangenerate non-dominatedonesinthenextiteration.The secondtaskis the
generation of routes.
Todothiswe maintaina listofgeneratingroutes byiteratively appending terminal vertex w to the generative route of Qvk−,t1 when set Qkv−,t1∪ {
w}
is included inits corresponding collection.At theendofthealgorithmonlyroutesthatcorrespondtonon-dominatedcoveringsetsarereturned.Maintainingsuch alist introducesa O(
1)
cost.Theorem6.Algorithm 1is an exact algorithm and has worst-case complexity of O
(
|
T(
s)
|
22|T(s)|)
since it has to compute covering sets up to cardinality|
T(
s)
|
. With annotations of dominances and routes generation the whole algorithm yields a worst-case complexity of O(|
T(
s)|
52|T(s)|)
.Noticethatthealgorithmisexactsinceitisbasedonanenumerationprocedure.
Example3.We providea simpleexampleof executionof
Algorithm 1
. Considera probleminstancewitha single signal, arbitrarytargetvalueswhiletopologyandpenetrationtimesareasfollow:Wereporttheexpansionsmadebythealgorithmforincreasingcardinalities(valueof
k)
inthefollowingtable. k=0 k=1 k=2 k=3 ∅,c=0 Q1 v,t3= {t3},c=1 Q 2 v,t1= {t1,t3},c=2 Q 3 v,t2= {t1,t2,t3},c=3 Q3 v,t4= {t1,t3,t4},c=4 Q2 v,t2= {t2,t3},c=2 Q 3 v,t1= {t1,t2,t3},c=3 Q3 v,t4= {t2,t3,t4},c=4 Q2 v,t4= {t3,t4},c=2 Q 3 v,t1= {t1,t3,t4},c=4, Q3 v,t2= {t2,t3,t4},c=4,♦ Q1 v,t4= {t4},c=1 Q 2 v,t3= {t3,t4},c=2 Q 3 v,t1= {t1,t3,t4},c=3, Q3 v,t2= {t2,t3,t4},c=3,NoticethatConstraint 3intervenesbothwhenexpandingcoveringsetswith
k
=
1 andk
=
2.Inthetable,c indicates
the temporal costofthe relativecovsetwhilethecovsetmarked withdominatestheone markedwith♦
whilethecovset markedwithdominatestheonemarkedwith.3.2.2. Approximation algorithm
The dynamicprogrammingalgorithmpresentedintheprevious section cannotbe directlyadoptedtoapproximatethe maximal coveringroutes. Wenotice that even inthe casewe introduce a logarithmic upper bound over thesize of the coveringsetsgeneratedby
Algorithm 1
,wecouldobtainanumberofroutesthatisO
(
2log2(|T(s)|))
,andthereforeexponential.Thus, ourgoalistodesigna polynomial-timealgorithm thatgeneratesa polynomialnumberof
good covering
routes. We observethatifwehaveatotalorderovertheverticesandweworkoveracompletegraphofthetargetswhereeachedge correspondstotheshortestpath,wecanfindinpolynomialtimethemaximalcoveringroutessubjecttotheconstraintthat, givenanypairoftargetst
,
tinaroute,t can
precedet
intherouteonlyift precedes t
intheorder.Wecallmonotonic
a routesatisfyingagiventotalorder.
Algorithm 2
returns themaximalmonotoniccoveringrouteswhenthetotalorderis lexicographic(trivially,tochangetheorder,itissufficienttore-labelthetargets).Algorithm 2isbasedondynamicprogrammingandworksasfollows.R
(
k,
l)
isamatrixstoringineachcellone route, while L(
k,
l)
isamatrixstoringineachcellthemaximumlatenessofthecorrespondingroute(seebelowforthemeaning ofk and l).
Themaximumlatenessofarouter captures
themaximumdelaybetweenatarget’s firstvisitanditsdeadline. Formally, itisdefinedasmaxt∈T(s)Ar(
t)
−
d(
t)
. Theroute storedin R(
k,
l)
isthe onewiththe minimumlatenessamong all the monotonic onescovering l targets wheretk is thefirst visitedtarget. Thus, basically,when l=
1, R(
k,
l)
contains theroute v,
tk,while,whenl
>
1,R
(
k,
l)
isdefinedappendingto R(
k,
1)
thebest(intermsofminimizingthemaximum lateness)route R(
k,
l−
1)
foreveryk
>
k, inordertosatisfy thetotalorder.The wholesetofroutesin R are returned.8 ThecomplexityofAlgorithm 2
is O(|
T(
s)|
3)
,exceptthetimeneededtofindalltheshortestpaths.Algorithm2MonotonicLongestRoute(v
,
s). 1:∀k,k∈ {1,2,. . . ,|T(s)|},R(k,k)= ∅,L(k,k)= +∞,CR(k)= ∅,CL(k)= +∞ 2: for all∀k∈ {|T(s)|,|T(s)|−1,. . . ,1}do 3: forall∀l∈ {1,2,. . . ,|T(s)|}do 4: if l=1 then 5: R(k,l)= v,tk 6: L(k,l)=ω∗v,tk−d(tk) 7: else 8: forall ks.t. |T(s)|≥k>|T(s)|−k do 9: CR(k)= R(k,1),R(k,l−1) 10: CL(k)=max{L(k,1),ω∗v,tk+ω ∗ tk,k−ω ∗ v,k+L(k,l−1)} 11: j=arg minj{CL(j)} 12: if CL(j)≤0 then 13: R(k,l)←CR(j) 14: L(k,l)←CL(j) 15: return RWe usedifferenttotalorders(breakingtiesrandomly)over thesetoftargets,collecting all theroutesgeneratedusing eachtotalorder:
•
increasingorderω
∗v,t:therationaleisthattargetscloseto v will bevisitedbeforetargetsfarfromv;
•
increasingorderd
(
t)
:therationaleisthattargetswithshortdeadlineswillbevisitedbeforetargetswithlongdeadlines;•
increasing orderd
(
t)
−
ω
∗v,t (wecallthisquantityexcess time):
therationaleisthattargetswithshortexcesstimewill bevisitedbeforetargetswithlongexcesstime.8 We notice that dominance can be applied to discard dominated routes. However, in this case, the improvement would be negligible since the total
Algorithm3Branch-and-Bound(v,
s, ρ
). 1: C Lmax← ∅ 2: C Lmin← ∅ 3: for all t∈T(s)do 4: ifω∗v,t≤d(t)then 5: Tree-Search(ρ· |T(s)|,v,t) 6: return C LmaxInaddition,weusea randomrestartgeneratingrandompermutationsoverthetargets.
Theorem7.Algorithm 2provides an approximation with ratio
(
|T1(s)|)
.Proofsketch. Theworstcasefortheapproximationratioofouralgorithmoccurswhenthecoveringrouteincludingallthe targetsexistsandeachcoveringroute returnedby ourheuristic algorithmvisitsonlyonetarget. Inthatcase,the optimal expectedutilityof
D
is1.Ouralgorithm,intheworst caseinwhichπ
(
t)
=
1 foreverytargett,
returnsanapproximation ratio(
|T1(s)|)
.Itisstraightforwardtoseethat,inothercases,theapproximationratioislarger.2
3.3. Branch-and-bound algorithms
Thedynamicprogrammingalgorithmpresentedintheprevioussectionessentiallyimplementsabreadth-firstsearch.In some specificsituations,depth-firstsearchcould outperformbreadth-firstsearch,e.g.,whenpenetration timesare relaxed andgoodheuristicsleadadepth-firstsearchtofindinabrieftimethemaximalcoveringroute,avoidingtoscanan expo-nentialnumberofroutesasthebreadth-first searchwoulddo. Inthissection,we adoptthebranch-and-bound approach to design both an exact algorithm andan approximation algorithm. Inparticular, in Section 3.3.1 we describe ourexact algorithm,whileinSection3.3.2wepresenttheapproximationone.
3.3.1. Exact algorithm
Ourbranch-and-boundalgorithm(see
Algorithm 3
)isatree-searchbasedalgorithmworkingonthespaceofthecovering routesandreturningasetofcoveringroutesR. Itworksasfollows.Initialstep. Weexploittwoglobalsetvariables,
C L
min andC L
maxinitiallysettoempty(Steps 1–2ofAlgorithm 3
).These variables containclosed covering
routes,namely coveringrouteswhich cannot befurther expandedwithout violatingthe penetrationtimeofatleastonetargetduringthevisit.C L
maxcontainsthecoveringroutesreturnedbythealgorithm,whileC Lmin is usedforpruningasdiscussedbelow.Givena starting vertexv and a signal s, foreach target
t
∈
T(
s)
such thatω
∗v,t≤
d(
t)
wegenerateacoveringrouter
=
v,
twithr
(
0)
=
v and r(
1)
=
t (Steps 5 ofAlgorithm 3
).Thus,D
hasatleast onecoveringroute pertargetthat canbecoveredintimefrom v. Noticethatif, forsomet,
such minimalroutedoesnot exist,then targett cannot
becovered (even theshortestpathfromthe starting vertexv cannot guaranteecapture).This doesnotguaranteethatA
willattackt with
fullprobabilitysince,dependingonthevaluesπ
,A
couldfindmoreprofitable torandomizeoveradifferentsetoftargets.Themeaningofparameterρ
(usedinLine 5ofAlgorithm 3
)isdescribedbelow.Routeexpansions. Thesubsequentstepsessentiallyevolveoneachbranchaccordingtoadepth-firstsearchwith back-trackinglimitedby
ρ
.The choiceofρ
directlyinfluences thebehavior ofthealgorithmandconsequently itscomplexity. Eachnode inthesearch treerepresentsa router built
sofarstarting froman initialroute v,
t.Ateach iteration,router is expandedbyinsertinga newtargetataparticularposition.Wedenotewith
r
+(
q,
p
)
theroute obtainedbyinserting target q after the p-th target in r. Notice that every expansion of r will preserve the relative orderwith which targets alreadypresentinr will
bevisited.Thecollectionofallthefeasibleexpansionsr
+s(i.e.,theonesthatarecoveringroutes) isdenotedby R+ anditisorderedaccordingtoaheuristicthatwe describebelow.Algorithm 6
,described below,isused togenerateR
+(Step 1ofAlgorithm 4
).Ineachopenbranch(i.e.,R
+= ∅
),ifthedepthofthenodeinthetreeissmalleror equaltoρ
· |
T(
s)
|
thenbacktrackingisdisabled(Steps 7–10ofAlgorithm 4
),while,ifthedepthislargerthansuchvalue, isenabled(Steps 5–6ofAlgorithm 4
).Thisisequivalenttofixtherelativeorderofthefirst(atmost)ρ
· |
T(
s)
|
inserted targetsinthe currentroute.Inthiscase, withρ
=
0 wedonot relyon theheuristicsatall,full backtrackingis enabled, thetreeisfullyexpandedandthereturned R is complete,i.e.,itcontains allthenon-dominatedcovering routes.Router
isrepeatedlyexpandedinagreedyfashionuntilnoinsertionispossible.Asaresult,
Algorithm 4
generatesatmost|
T(
s)|
coveringroutes.
Pruning.Algorithm 5 is incharge ofupdating C Lmin andC Lmax each timea route r cannot be expandedand, conse-quently, theassociated branchmustbe closed.We call C Lmin the