Braz J Infect Dis. 2017;21(1):112–115
www.elsevier.com/locate/bjid
The Brazilian Journal of INFECTIOUS DISEASES
Brief communication
Snow’s case revisited: new tool in geographic profiling of epidemiology
Alessio Papini a,∗, Ugo Santosuosso b
a University of Florence, Department of Biology, Firenze, Italy
b University of Florence, Department of Clinical and Experimental Medicine, Firenze, Italy
Article info
Article history: Received 27 July 2016; Accepted 30 September 2016; Available online 9 November 2016
Keywords: Geographic profiling; Geographic epidemiology; Cholera; John Snow
Abstract
The Geographic Profiling technique is used to find the origin of a series of crimes. The method was recently extended to other fields. One of the best-known datasets in epidemiology is that collected by John Snow during an outbreak of cholera in London. We wrote Python scripts to apply Geographic Profiling to Snow’s historical dataset in order to identify the origin of the infection. We modified the method by applying a weight to each point of the map where cases of cholera were reported. The weight was proportional to the number of cases at a given location.
This modification of the Geographic Profiling method identified on the map an area of maximum probability for the infection source that was a few meters wide and included the historically known source of cholera, that is, the “classical” water pump at Broad Street.
The method appears to be a useful complement for identifying the source of epidemics when the available data about the cases of infection can be summarized on a map.
© 2016 Sociedade Brasileira de Infectologia. Published by Elsevier Editora Ltda. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Introduction
Geographic Profiling (GP) is an analytic tool widely used in criminology to identify on a map the area of highest probability assumed to contain the origin of linked events, typically crimes committed by a serial offender [1]. The method was extended from criminology to other fields in which it is possible to identify a series of linked events that might have originated from a starting point in space (represented on a two-dimensional map). Fields of application other than criminology have included invasion by alien species [2–5], bumble-bee foraging and nest location [6,7], and infectious disease targeting [8,9].
GP uses the coordinates of the mapped events to create a probability surface, the so-called geoprofile [1]. The geoprofile does not indicate the exact origin of the events, but rather prioritizes a series of geographical points, based on the data [1]. The geoprofile provides on the map a decreasing probability density of finding the source of the events drawn on the map [1].
∗ Corresponding author.
E-mail address: [email protected] (A. Papini).
http://dx.doi.org/10.1016/j.bjid.2016.09.010
1413-8670/© 2016 Sociedade Brasileira de Infectologia. Published by Elsevier Editora Ltda.
The model does not simply search for the geographical center of the events. Instead, it considers a distance-decay function, such that the probability of an event decreases with increasing distance from the center of origin, and a buffer zone around that center, within which the probability of an event tends to zero [1]. The distance-decay function is related to maximizing parsimony in movement, in economic and energy terms. Surprisingly, such functions have been found to apply not only to humans (criminals), but also to invasive (non-human) species [2,3] and infectious diseases [8–10].
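The combination of a distance-decay function and a buffer zone can be sketched with a commonly cited form of Rossmo's function. This is only an illustration: the text does not state which exact function the authors' scripts use, and the name `rossmo_term` and the exponents `f` and `g` are assumptions chosen for the example.

```python
def rossmo_term(d, B, f=1.2, g=1.2):
    """Score contributed by one event at distance d from a candidate source.

    Outside the buffer (d > B) the score decays as 1 / d**f.
    Inside the buffer it shrinks as the candidate gets closer to the event;
    the two branches meet at d == B, where the score peaks.
    f and g are illustrative exponents, not values taken from the paper.
    """
    if d < 0 or B <= 0:
        raise ValueError("d must be non-negative and B positive")
    if d > B:
        return 1.0 / d ** f
    return B ** (g - f) / (2.0 * B - d) ** g


B = 30  # buffer radius in pixels, the value reported in the Methods
profile = [(d, rossmo_term(d, B)) for d in (0, 15, 30, 60, 120)]
```

In this sketch the score rises from the event location up to the edge of the buffer zone and then decays with distance, which is the qualitative behavior described above.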
Recognizing the source of the spreading of “something” (generally a threat) has always been an important task, and it requires analytical tools [11]. One of the best-known cases in epidemiology is the cholera outbreak in London in 1854, studied by John Snow [12] and widely cited as a seminal work in spatial epidemiology [13 and references therein]. Dr. Snow marked the cholera cases and the water pumps on a map of London and searched for the area with the highest number of cases, thereby discovering that the origin of the outbreak (the so-called focus of infection) was a contaminated water pump in Broad Street. The cholera cases drawn by Snow on the map of London can be converted into a dataset of coordinates, which was already used by Le Comber et al. [8] to test the GP method for targeting infectious diseases. Le Comber et al. [8] were able to mark a restricted area on the map of London containing the famous water pump of Broad Street (see Fig. 1C and D in their article). These authors used as input data the individual addresses where deaths due to cholera had occurred, that is, 321 addresses, while the total number of cases amounted to 575, since more than one case might have occurred at the same address. Le Comber et al. [8] used this approach “to avoid the possible problem of spatial temporal non-independence due to secondary infections at a given address”. Our approach instead included all cases, assigning to each point (address) a weight proportional to the number of cases. We disregarded possible secondary human-to-human contagion, since cholera is not easily transmitted from person to person, while its transmission is known to be mainly food- or water-borne [14]. For this reason, we interpreted multiple cases at the same address as independent events and hence summable.
Therefore, we propose here a new method of applying GP in which each point of the map is assigned a weight proportional to the number of cases that occurred at that point.
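One plausible reading of this weighting can be sketched as follows, under the assumption that the weight simply multiplies each address's contribution to the score of a candidate cell. The text does not specify how the weights enter the authors' calculation, so this is not presented as their implementation; the function names, the toy points, and the placeholder `simple_decay` function (which has no buffer zone) are all hypothetical.

```python
import math


def score(cell, points, decay, use_weights=True):
    """Sum the (optionally weighted) contributions of all addresses to one cell."""
    cx, cy = cell
    total = 0.0
    for x, y, w in points:
        d = math.hypot(cx - x, cy - y)  # Euclidean distance in pixels
        total += (w if use_weights else 1) * decay(d)
    return total


def best_cell(points, decay, width, height, use_weights=True):
    """Return the grid cell with the highest score (exhaustive search)."""
    cells = [(x, y) for x in range(width) for y in range(height)]
    return max(cells, key=lambda c: score(c, points, decay, use_weights))


def simple_decay(d):
    """Placeholder decay function for illustration only."""
    return 1.0 / (1.0 + d)


# Hypothetical toy data: (x, y, number of cases at that address).
points = [(2, 2, 5), (8, 2, 1), (8, 8, 1)]
weighted_peak = best_cell(points, simple_decay, 11, 11)
unweighted_peak = best_cell(points, simple_decay, 11, 11, use_weights=False)
```

With these toy points the weighted peak moves to the address with five cases, while the unweighted peak sits on the address that is closest to the other two, which illustrates why addresses with many cases can sharpen the search area.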
Methods
The positions of the cases on the map were acquired with Neuronmorpho (http://www.southampton.ac.uk/~dales/morpho/), a plugin of ImageJ (National Institutes of Health; http://rsb.info.nih.gov/ij/) that can read a map position from a mouse click, building a CSV file containing the coordinates point by point. Weights were added manually. Our method calculates the GP by weighting each point of the map in direct proportion to the number of cases that occurred at that point. That is, some points of the map are more important than others. The data were analyzed with a Python script (Geoprof3.0.2.py).
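As an illustration of the input described above, the following sketch reads a CSV of coordinates with a manually added weight column and checks the totals. The column names `x`, `y`, and `weight`, the sample rows, and the function name are assumptions made for this example; the actual layout of the authors' file is not given in the text.

```python
import csv
import io

# Hypothetical file content: pixel coordinates plus a manually added weight.
SAMPLE_CSV = """x,y,weight
120,85,3
131,90,1
140,70,2
"""


def load_weighted_points(handle):
    """Read (x, y, weight) tuples from an open CSV file or file-like object."""
    reader = csv.DictReader(handle)
    return [
        (float(row["x"]), float(row["y"]), int(row["weight"]))
        for row in reader
    ]


points = load_weighted_points(io.StringIO(SAMPLE_CSV))
n_addresses = len(points)               # number of distinct map points
n_cases = sum(w for _, _, w in points)  # total cases across all points
```

On the full dataset, the same two totals would be expected to match the 321 addresses and 575 cases reported in the Introduction, which gives a simple sanity check on manually entered weights.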
Crucial for the GP analysis is the assignment of the value of B, corresponding to the radius of the buffer zone [2]. In our analysis we used B = 30, corresponding to a buffer zone of 30 pixels (about 15 m on our map), which is quite small compared with other GP analyses in other fields, such as those on malaria cases in Cairo [9]. We also evaluated other B values, calculating their impact on the analysis. The GP technique is described in detail in Papini et al. [3]. The variable B (the buffer zone) of course depends on the map magnification and on the map resolution, since B is expressed in pixels, while the actual meaning of the buffer zone can be understood only if it is expressed in meters or km.
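The dependence of B on map resolution can be made concrete with a small conversion sketch. The scale of 0.5 m per pixel follows from the figures given above (30 pixels for about 15 m); the function names and the second, finer resolution are hypothetical.

```python
# Map scale implied by the text: a 30-pixel buffer covers about 15 m.
METERS_PER_PIXEL = 15.0 / 30.0


def buffer_in_meters(b_pixels, meters_per_pixel=METERS_PER_PIXEL):
    """Convert a buffer radius from pixels to meters for a given map scale."""
    return b_pixels * meters_per_pixel


def buffer_in_pixels(b_meters, meters_per_pixel):
    """Convert a buffer radius in meters to whole pixels on another map."""
    return round(b_meters / meters_per_pixel)


b_m = buffer_in_meters(30)               # buffer used in this analysis, in meters
b_px_fine = buffer_in_pixels(b_m, 0.25)  # same buffer on a hypothetical 0.25 m/pixel map
```

Expressing B in meters first and converting to pixels per map keeps the buffer zone comparable when the same analysis is repeated on a map with a different magnification.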
The Python scripts were written by the authors and can be retrieved from the site www.unifi.it/caryologia/PapiniPrograms.html. The scripts were executed with Python 2.7.3 (http://www.python.org/), running on the Ubuntu 12.04 LTS operating system, kernel 2.6.32. The Python programs (version >= 2.6) need the NumPy (http://www.numpy.org/), SciPy (http://www.scipy.org/), Matplotlib (http://matplotlib.org/), Scikit-learn (http://scikit-learn.org), and Python Imaging Library, PIL (http://www.pythonware.com/products/pil/) libraries installed. A note about the software is provided as Supplementary material (SoftwareUsesupplementary.pdf).
Fig. 1 – Results obtained by considering only the addresses on the map as the dataset. No weight is assigned to each address on the basis of the number of recorded cases.
Results and discussion
Fig. 1 shows the results obtained by considering only the addresses on the map as the dataset, corresponding to the analysis by Le Comber et al. [8]; that is, no weight was assigned to an address on the basis of the number of recorded cases. In Fig. 2 we show the GP analysis with weights assigned to each point of the map on the basis of the number of cases. The result is quite striking, since the red area, representing the area of the map with the points with 95% of highest probability, included the pump of Broad Street. This area was about 30 m in diameter. Compared with the method that does not use the number of cases as weights (shown in Fig. 1), the total area of highest probability of containing the source was hence much smaller.
Counting the pixels with the highest probability of containing the source of the infection, we found that the red pixels (those with highest probability) decreased substantially when passing from considering only the addresses to using the whole dataset with weights, that is, from 36533 to 10068 (visible as the reduction in size of the red area from Fig. 1 to Fig. 2).
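The pixel count can be illustrated with a short sketch. It assumes that a "red" pixel is one whose score is at least 95% of the maximum score on the surface; this is one possible interpretation of the threshold, not a definition confirmed by the text, and the function name and the toy surface are hypothetical. The two pixel counts in the last lines are the ones reported above.

```python
def count_high_pixels(surface, fraction=0.95):
    """Count cells whose score is at least `fraction` of the surface maximum."""
    peak = max(max(row) for row in surface)
    return sum(1 for row in surface for value in row if value >= fraction * peak)


# Hypothetical 3x3 score surface, for illustration only.
surface = [
    [0.10, 0.20, 0.10],
    [0.20, 1.00, 0.96],
    [0.10, 0.50, 0.20],
]
n_red = count_high_pixels(surface)

# Relative reduction between the reported counts (addresses only vs. weighted).
unweighted_red, weighted_red = 36533, 10068
reduction = 1 - weighted_red / unweighted_red
```

Applied to the reported counts, `reduction` comes out at roughly 0.72, meaning the weighted analysis left a little over a quarter of the red pixels found when only addresses were used.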
Fig. 2 – GP analysis with weights assigned to each point of the map on the basis of the number of cholera cases. The red area (that with the highest probability of containing the infection source) is only about 30 m in diameter and includes the famous pump of Broad Street.
Treating each case as a single point, even when several were located at the same position on the map (that is, at the same address), produced an area of red pixels only slightly larger than that obtained with weights (data not shown).
Measured on the map, the area of maximum probability of finding the source produced by the weighted GP analysis was about 30 m in diameter and contained the well-known source of the cholera cases in London, that is, the famous pump of Broad Street recognized by Snow [12]. This result shows that the use of weights proportional to the number of cases at each address largely increases the precision of the analysis; that is, it reduces the area of maximum probability in which to look for the source compared with other GP techniques, such as those employed by Le Comber et al. [8] and Verity et al. [11].
Conclusion
Weighted geoprofiling can be a useful method to identify the center of origin of a disease outbreak when multiple cases of infection are found at the same point of the map (normally corresponding to a residence), largely reducing the priority points and hence achieving higher precision in delimiting the source search area.
The use of weights for multiple cases of infection at the same address is a good choice only when secondary person-to-person infections can be considered improbable (as is likely the case for cholera); otherwise, as stated by Le Comber et al. [8], it is necessary to use each address (point on the map) as input data with the same weight = 1.
Funding
Financial support by the Italian Ministry of Research (MUR), Fondi di Ateneo.
Conflicts of interest
The authors declare no conflicts of interest.
Appendix A. Supplementary data
Supplementary data associated with this article can be found, in the online version, at http://dx.doi.org/10.1016/j.bjid.2016.09.010.
References
1. Rossmo DK. Geographic profiling. Boca Raton, FL: CRC Press; 2000.
2. Stevenson MD, Rossmo DK, Knell RJ, Le Comber SC. Geographic profiling as a novel spatial tool for targeting the control of invasive species. Ecography. 2012;35:1–12.
3. Papini A, Mosti S, Santosuosso U. Tracking the origin of the invading Caulerpa (Caulerpales, Chlorophyta) with geographic profiling, a criminological technique for a killer alga. Biol Invasions. 2013;15:1613–21.
4. Cini A, Anfora G, Escudero-Colomar LA, et al. Tracking the invasion of the alien fruit pest Drosophila suzukii in Europe. J Pest Sci. 2014;87:559–66.
5. Santosuosso U, Papini A. Methods for Geographic Profiling of biological invasions with multiple origin sites. Int J Environ Sci Technol. 2016;13:2037–44.
6. Raine NE, Rossmo DK, Le Comber SC. Geographic profiling applied to testing models of bumble-bee foraging. J R Soc Interface. 2009;6:307–19.
7. Suzuki-Ohno Y, Inoue MN, Ohno K. Applying geographic profiling used in the field of criminology for predicting the nest locations of bumble bees. J Theor Biol. 2010;265:211–7.
8. Le Comber SC, Rossmo DK, Hassan AN, Fuller DO, Beier JC. Geographic profiling as a novel spatial tool for targeting infectious disease control. Int J Health Geogr. 2011;10:35.
9. Smith CM, Downs SH, Mitchell A, Hayward AC, Fry H, Le Comber SC. Spatial targeting for bovine tuberculosis control: can the locations of infected cattle be used to find infected badgers? PLOS ONE. 2015;10:e0142710.
10. Le Comber SC, Stevenson MD. From Jack the Ripper to epidemiology and ecology. Trends Ecol Evol. 2012;27:307–8.
11. Verity R, Stevenson MD, Rossmo DK, Nichols RA, Le Comber SC. Spatial targeting of infectious disease control: identifying multiple, unknown sources. Methods Ecol Evol. 2014;5:647–55.
12. Snow J. Snow on cholera. A reprint of two papers by John Snow, MD, together with a biographical memoir by BW Richardson, MD, and an introduction by Wade Hampton Frost. New York: The Commonwealth Fund; 1936.
13. Shiode N, Shiode S, Rod-Thatcher E, Rana S, Vinten-Johansen P. The mortality rates and the space-time patterns of John Snow’s cholera epidemic map. Int J Health Geogr. 2015;14:21.
14. Sack DA, Sack RB, Nair GB, Siddique AK. Cholera. Lancet. 2004;363:223–33.