Contents lists available at ScienceDirect

Epidemics

journal homepage: www.elsevier.com/locate/epidemics

Research Paper

Clustering of contacts relevant to the spread of infectious disease

Xiong Xiao (a,d), Albert Jan van Hoek (a), Michael G. Kenward (a), Alessia Melegaro (b), Mark Jit (a,c,∗)

(a) Faculty of Epidemiology and Population Health, London School of Hygiene and Tropical Medicine, Keppel Street, London WC1E 7HT, United Kingdom
(b) DONDENA Centre for Research on Social Dynamics & Public Policy, Università Bocconi, Via Guglielmo Röntgen n. 1, 20136 Milan, Italy
(c) Modelling and Economics Unit, Public Health England, 61 Colindale Avenue, London NW9 5EQ, United Kingdom
(d) Department of Epidemiology and Biostatistics, West China School of Public Health, Sichuan University, Chengdu, China

Article info

Article history:
Received 18 January 2016
Received in revised form 4 August 2016
Accepted 23 August 2016
Available online 26 August 2016

Keywords: Clustering; Contacts; Infectious diseases; Mathematical modelling; Varicella-zoster virus

Abstract

Objective: Infectious disease spread depends on contact rates between infectious and susceptible individuals. Transmission models are commonly informed using empirically collected contact data, but the relevance of different contact types to transmission is still not well understood. Some studies select contacts based on a single characteristic such as proximity (physical/non-physical), location, duration or frequency. This study aimed to explore whether clusters of contacts similar to each other across multiple characteristics could better explain disease transmission.

Methods: Individual contact data from the POLYMOD survey in Poland, Great Britain, Belgium, Finland and Italy were grouped into clusters by the k-medoids clustering algorithm with a Manhattan distance metric to stratify contacts using all four characteristics. Contact clusters were then used to fit a transmission model to sero-epidemiological data for varicella-zoster virus (VZV) in each country.

Results and discussion: Across the five countries, 9–15 clusters were found to optimise both quality of clustering (measured using average silhouette width) and quality of fit (measured using several information criteria). Of these, 2–3 clusters were most relevant to VZV transmission, characterised by (i) 1–2 clusters of age-assortative contacts in schools, (ii) a cluster of less age-assortative contacts in non-school settings. Quality of fit was similar to using contacts stratified by a single characteristic, providing validation that single stratifications are appropriate. However, using clustering to stratify contacts using multiple characteristics provided insight into the structures underlying infection transmission, particularly the role of age-assortative contacts, involving school age children, for VZV transmission between households.

© 2016 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

1. Introduction

Mathematical models of infectious disease transmission require assumptions about mixing between different subgroups in a population that can potentially lead to transmission between infected and susceptible individuals. The simplest assumption is that everyone has the same probability of contacting each other, but this can sometimes lead to misleading results (Keeling and Rohani, 2008). Indeed, many infection control interventions such as vaccinating children (Thorrington et al., 2015) or closing schools during a pandemic (House et al., 2011) are predicated on the assumption that certain subgroups in the population are the main transmitters.

Abbreviations: AIC, Akaike Information Criterion; AICc, small-sample-size corrected Akaike Information Criterion; ASW, average silhouette width; BIC, Bayesian Information Criterion; PAM, partitioning around medoids; VZV, varicella-zoster virus; WAIFW, who acquires infection from whom.

∗ Corresponding author at: Faculty of Epidemiology and Population Health, London School of Hygiene and Tropical Medicine, Keppel Street, London WC1E 7HT, United Kingdom.

E-mail addresses: Xiong.Xiao@lshtm.ac.uk (X. Xiao), Albert.VanHoek@lshtm.ac.uk (A.J. van Hoek), Mike.Kenward@lshtm.ac.uk (M.G. Kenward), alessia.melegaro@unibocconi.it (A. Melegaro), mark.jit@lshtm.ac.uk (M. Jit).

A more realistic assumption is to subdivide the population based on some characteristic, and introduce a matrix of contact rates capable of transmitting infection between each subgroup, called the “who acquires infection from whom” (WAIFW) matrix (Vynnycky and White, 2010). Age is the characteristic most commonly used as a source of heterogeneity in mixing patterns. A model with age-stratified contact rates can be fitted to age-specific data on infection history (such as sero-epidemiological data, which marks the prevalence of previous infection) to estimate the age-specific effective contact rates in the WAIFW matrix.

To inform the elements of the WAIFW matrix, the number of social contacts that individuals in different age groups report can be empirically measured and used as a proxy for the actual contact rates underlying transmission (Beutels et al., 2006; Edmunds et al., 1997; Wallinga et al., 2006). The largest such study is a diary-based survey of 7290 participants in eight European countries conducted in 2006 as part of the POLYMOD project (Mossong et al., 2008). Since then contact studies have been carried out in other parts of the world using similar methodology.

In these studies, participants are asked to record the contacts that they have made over a single day and classify them using a number of characteristics (such as physical/non-physical, long/short, home/school/work etc.). Since it is unrealistic to measure effective contacts (i.e. contacts that can transmit a particular infection) between individuals directly, some form of self-reported social behaviour such as face-to-face conversation or skin-to-skin contact is used as a proxy. It is assumed that the age distribution of these social contacts is related to the age distribution of effective contacts by a constant proportionality factor, an assumption referred to as the “social contact hypothesis” (Wallinga et al., 2006). Hence the age-specific transmission matrix is completely described by estimating this factor. Subsequent analyses (Melegaro et al., 2011) found that stratifying the WAIFW matrix according to different characteristics of contacts (with a different proportionality constant for each type of contact) significantly improved the goodness of fit of the models to serological markers of past infection for respiratory infections. This implies that different characteristics of social contact contribute differently to infection transmission.

A previous study explored the use of formal clustering algorithms to group POLYMOD survey respondents based on the number and location of their contacts (Kretzschmar and Mikolajczyk, 2009). The study found that respondents across different countries fell into a similar range of contact profiles, with the work, school and household contact profiles most common. However, this does not tell us whether there are certain types of contacts (rather than respondents) which may be particularly relevant for the transmission of particular infectious diseases. Furthermore, the relevant clusters of contacts may be disease-specific, since different infections have separate routes of transmission.

Hence an alternative approach would be to explore what kind of contacts (rather than respondents) are most relevant to infection transmission. The only previous work in this area has focused on single dimensions of contacts (Melegaro et al., 2011). This simply indicated that “intimate” (i.e. physical, home, long-duration and frequent) contacts are better able to explain age-dependent patterns in the acquisition of serological markers for varicella-zoster virus (VZV) and parvovirus B19, the two infections examined in the study. Since the dimensions of intimacy are highly correlated, it may be more informative to take all the characteristics of social contacts into account collectively when stratifying the WAIFW matrix. In particular, contacts made by different respondents could be grouped into clusters, and then examined to identify which clusters best explain patterns of infection acquisition in the corresponding populations. Here, we explore the use of clustering algorithms to determine clusters of social contacts which are similar to each other based on multi-dimensional characteristics of social contacts.

A range of clustering algorithms have been developed such as hierarchical clustering, partitioning clustering and latent class clustering (Everitt et al., 2011). In hierarchical clustering, each element is plotted on a graph to determine the optimal number of clusters. When the number of elements becomes very large (such as the thousands of contacts for each country in the POLYMOD survey), this graphical method becomes very cumbersome. In latent class clustering, a multivariate distribution is imposed on the data and the validation of the clustering results depends on several assumptions such as the parametric form of the multivariate distribution, the dependency between variables and the approximation of likelihood estimation. Since social contact data include different types of variables (binary and ordinal), some of which are structured, it is difficult to identify an appropriate multivariate distribution to describe the data. Hence we used partitioning clustering, and in particular the k-medoids method, to classify contacts.

We then investigate whether these clusters enhance our understanding of transmission patterns by using them to fit a transmission dynamic model to age-dependent patterns in the acquisition of varicella-zoster virus serological markers, as an example of a childhood respiratory infection with clear, long-lived markers of past infection and no vaccination history at the time of data collection.

2. Methods

2.1. Data sources

Age-specific contact matrices were constructed using social contact data from participants of the POLYMOD project (Mossong et al., 2008) living in Poland (15808 contacts, 1003 participants), Great Britain (11052 contacts, 996 participants), Belgium (8810 contacts, 747 participants), Finland (10319 contacts, 973 participants) and Italy (15788 contacts, 842 participants). Contacts with any missing information were discarded, so our dataset differs slightly from previous analyses (Melegaro et al., 2011). Contact matrices were adjusted for population size and reciprocity using well-described procedures (Melegaro et al., 2011; Wallinga et al., 2006).

Models were fitted to data on the presence of antibodies to VZV from serum samples collected in 1996 from 1300 participants aged 0–19 in Poland, 2091 participants aged 0–20 in England and Wales (fitted to Great British contacts), 2760 participants aged 0–39 in Belgium, 2500 participants aged 0–79 in Finland and 2517 participants aged 0–79 in Italy (Vyse et al., 2004). The samples were collected from unlinked anonymised (apart from age) residual sera following microbiological or biochemical investigations (Osborne et al., 2000), and tested as part of the European Commission-funded second European Sero-epidemiological Network (ESEN2) (Melegaro et al., 2011; Nardone et al., 2007). Children under 5 years old were oversampled with the sample size in each age group ranging from 117 to 192. For those aged 5–20 years approximately 100 sera in each one-year age group were tested. All serological tests for VZV-specific IgG were performed at Preston Public Health Laboratory using a commercial ELISA assay.

Population data were obtained from national statistics offices in the five countries considered as in previous analyses (Melegaro et al., 2011).

2.2. Cluster analysis of social contact data

A cluster is defined as a group of social contacts whose members are more similar to each other than to non-members. The similarity of two contacts is defined using the Manhattan distance measure between the two (Everitt et al., 2011), based on four contact characteristics: proximity (physical, non-physical), duration (<5 min, 5–15 min, 15 min to 1 h, 1–4 h, >4 h), frequency (daily, weekly, monthly, a few times a year, first time) and location (home, work, school, leisure, transport, other). Variables representing each characteristic were recoded so that the distance between any two contacts lies between 0 and 1, so that the Manhattan distance is equivalent to Gower’s similarity coefficient, which is an appropriate proximity measure for mixed data (Gower, 1971) (see Appendix A1 in Supplementary data for details). Contacts recorded as taking place in multiple locations were assigned to a single location based on the following hierarchy: home > work > school > leisure > other > transport. The hierarchy was based on the putative duration of contacts in the given settings as suggested by previous investigators (Kretzschmar and Mikolajczyk, 2009). Social contacts were classified into clusters using the Partitioning Around Medoids (PAM) algorithm, the most common realisation of k-medoids clustering. This approach pre-defines a fixed number of clusters, selects typical elements (medoids) as centroids of each cluster and assigns other elements to a cluster according to their distance to the centroid (Kaufman and Rousseeuw, 1990).
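The distance and partitioning steps described above can be sketched as follows. This is a minimal illustration in Python, not the authors' R code. The function names, the range scaling of the binary and ordinal variables, the 0/1 mismatch distance for location, the deterministic farthest-first initialisation and the simplified alternating medoid update (rather than the full PAM build and swap phases) are all assumptions made for illustration; the recoding actually used is described in Appendix A1 of the paper.

```python
import numpy as np

def gower_manhattan(ordinal, nominal):
    """Pairwise distance in [0, 1]: mean of per-variable distances.

    ordinal: (n, p) numeric codes for binary/ordinal variables
    nominal: (n, m) integer codes for unordered variables (e.g. location)
    """
    ordinal = np.asarray(ordinal, dtype=float)
    nominal = np.asarray(nominal)
    # Scale every ordinal/binary column to [0, 1] by its observed range.
    low = ordinal.min(axis=0)
    span = ordinal.max(axis=0) - low
    span[span == 0] = 1.0
    scaled = (ordinal - low) / span
    # Manhattan distance on the scaled columns.
    d_ord = np.abs(scaled[:, None, :] - scaled[None, :, :]).sum(axis=2)
    # Simple mismatch (0 or 1) for each nominal column.
    d_nom = (nominal[:, None, :] != nominal[None, :, :]).sum(axis=2)
    n_vars = ordinal.shape[1] + nominal.shape[1]
    return (d_ord + d_nom) / n_vars

def k_medoids(dist, k, n_iter=100):
    """Simplified alternating k-medoids on a precomputed distance matrix."""
    # Assumed initialisation: most central point, then farthest-first.
    medoids = [int(np.argmin(dist.sum(axis=1)))]
    while len(medoids) < k:
        nearest = dist[:, medoids].min(axis=1)
        medoids.append(int(np.argmax(nearest)))
    medoids = np.array(medoids)
    for _ in range(n_iter):
        labels = np.argmin(dist[:, medoids], axis=1)
        new_medoids = medoids.copy()
        for c in range(k):
            members = np.flatnonzero(labels == c)
            if members.size == 0:
                continue  # keep the old medoid for an empty cluster
            within = dist[np.ix_(members, members)].sum(axis=1)
            new_medoids[c] = members[np.argmin(within)]
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids
    labels = np.argmin(dist[:, medoids], axis=1)
    return medoids, labels

# Hypothetical toy contacts: columns are physical (0/1), duration (0-4),
# frequency (0-4); location is coded 0 = home, 4 = transport.
ordinal = [[1, 4, 0], [1, 4, 0], [1, 3, 0], [0, 0, 4], [0, 0, 4], [0, 1, 4]]
location = [[0], [0], [0], [4], [4], [4]]
dist = gower_manhattan(ordinal, location)
medoids, labels = k_medoids(dist, k=2)
```

In this toy example the three long, physical home contacts and the three brief, non-physical transport contacts end up in separate clusters. A full PAM implementation would additionally test medoid swaps across clusters, which matters for less clearly separated data.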

The number of clusters was varied between 2 and 15, with the upper limit included to avoid possible problems with data sparsity. The optimal number of clusters for the PAM algorithm was then determined using average silhouette width (ASW), which measures how well each contact is assigned to its cluster (Kaufman and Rousseeuw, 1990). Replication analysis (Breckenridge, 2000; Walesiak et al., 2008) was performed to assess the robustness of the clustering results, by splitting the dataset into several subsets and using the same algorithm separately for each subset to compare classification agreement.

2.3. Fitting dynamic transmission models
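As a rough illustration of the ASW criterion, the sketch below computes the average silhouette width from a precomputed distance matrix and a vector of cluster labels, following the standard definition of Kaufman and Rousseeuw. The function name and the toy data are hypothetical, and the code assumes at least two clusters are present.

```python
import numpy as np

def average_silhouette_width(dist, labels):
    """Mean silhouette width; requires at least two distinct clusters."""
    dist = np.asarray(dist, dtype=float)
    labels = np.asarray(labels)
    clusters = np.unique(labels)
    widths = np.zeros(len(labels))
    for i in range(len(labels)):
        own = labels == labels[i]
        n_own = own.sum()
        if n_own == 1:
            continue  # the silhouette of a singleton is defined as 0
        # a: mean distance to the other members of the same cluster
        a = dist[i, own].sum() / (n_own - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(dist[i, labels == c].mean() for c in clusters if c != labels[i])
        denom = max(a, b)
        widths[i] = 0.0 if denom == 0 else (b - a) / denom
    return widths.mean()

# Hypothetical one-dimensional example: two tight, well-separated groups.
points = np.array([0.0, 1.0, 10.0, 11.0])
toy_dist = np.abs(points[:, None] - points[None, :])
good = average_silhouette_width(toy_dist, [0, 0, 1, 1])
bad = average_silhouette_width(toy_dist, [0, 1, 0, 1])
```

Values close to 1 indicate that contacts sit much closer to their own cluster than to the nearest alternative; in the toy example the natural grouping scores far higher than the interleaved one.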

We constructed a realistic age-structured (Schenzle, 1984) susceptible-infected-recovered (SIR) dynamic model of VZV transmission (with no natural mortality assumed). In this model, the next generation matrix, representing the potential number of infection transmission events per person, is calculated as the product of the (adjusted) age-dependent contact matrix, a proportionality factor q representing the proportion of contacts that can potentially lead to a new infection and a vector representing the size of the susceptible population in each age group. For each cluster i of contacts ($1 \le i \le N$ where N is the total number of clusters), we generated a separate contact matrix $C_i$ with a corresponding proportionality factor $q_i$. Hence the next generation matrix is the linear combination of all contact matrices $C_1 q_1 w + \dots + C_N q_N w$ where w is a vertical vector of the population size in each age group. The basic reproduction number R0 was calculated as the principal eigenvalue of the next generation matrix.
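The construction of the next generation matrix and R0 can be sketched numerically as below. This is only an illustration under stated assumptions: the paper does not spell out how the population vector w enters each matrix element, so the code assumes that row a of the combined contact matrix is scaled by the population size of age group a, and it omits any infectious-period term. The function names and the numbers in the example are hypothetical.

```python
import numpy as np

def next_generation_matrix(contact_matrices, q, w):
    """Combine per-cluster contact matrices into a next generation matrix.

    contact_matrices: array of shape (N, A, A), one matrix per cluster
    q: length-N proportionality factors, one per cluster
    w: length-A population sizes by age group
    """
    contact_matrices = np.asarray(contact_matrices, dtype=float)
    q = np.asarray(q, dtype=float)
    w = np.asarray(w, dtype=float)
    # Linear combination sum_i q_i * C_i, shape (A, A).
    mixed = np.tensordot(q, contact_matrices, axes=1)
    # Assumption: scale row a by the population size of age group a.
    return w[:, None] * mixed

def basic_reproduction_number(ngm):
    """R0 as the dominant eigenvalue (spectral radius) of the matrix."""
    return float(np.max(np.abs(np.linalg.eigvals(ngm))))

# Hypothetical two-cluster, two-age-group example.
C = [[[2.0, 0.0], [0.0, 1.0]],   # cluster 1: purely assortative contacts
     [[0.0, 1.0], [1.0, 0.0]]]   # cluster 2: purely between-group contacts
r0_assortative = basic_reproduction_number(
    next_generation_matrix(C, q=[0.5, 0.0], w=[1.0, 1.0]))
r0_between = basic_reproduction_number(
    next_generation_matrix(C, q=[0.0, 0.5], w=[4.0, 1.0]))
```

A cluster whose fitted factor is zero drops out of the sum entirely, which is how the paper's notion of a "relevant" cluster (non-zero $q_i$) shows up in this formulation.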

A grid search algorithm was used to find the $q_i$ values which maximised the binomial likelihood of the model given sero-epidemiological data for VZV infections (Melegaro et al., 2011; Ogunjimi et al., 2009). We defined relevant clusters as those with corresponding best-fitting $q_i$ values that were non-zero, and hence which contributed to the next generation matrix. An iterative method was used to obtain the proportion of immunes at each age group as a function of the $q_i$ values. Quality of model fit with different numbers of clusters was measured using the Akaike Information Criterion (AIC), its small-sample-size corrected version (AICc) and the Bayesian Information Criterion (BIC). Goodness of fit was also compared to models with contacts stratified by a single characteristic only.

2.4. Uncertainty intervals

Uncertainty intervals for the proportionality factor q and basic reproduction number R0 were estimated by bootstrap sampling from both social contact data and sero-epidemiological data with the same sample size as the original data (Melegaro et al., 2011). For social contact data, bootstrap samples were generated by re-sampling participants’ identity numbers. Social contact matrices were then estimated for each bootstrap sample. For the sero-epidemiological data, bootstrap samples were generated by randomly drawing from a Bernoulli distribution with probability of success equal to the observed seroprevalence for the specific age group. After this, the newly generated social contact matrices and sero-epidemiological data were matched to repeatedly estimate the transmission parameters.
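The two resampling steps described above might look like the following sketch. The function names and example numbers are hypothetical, and the code assumes every age group has at least one tested sample. Drawing one Bernoulli value per tested individual and summing them is equivalent to a single binomial draw per age group, which is what the second function does.

```python
import numpy as np

def bootstrap_participants(participant_ids, rng):
    """Resample participant identity numbers with replacement."""
    ids = np.asarray(participant_ids)
    return rng.choice(ids, size=ids.size, replace=True)

def bootstrap_serology(n_tested, n_positive, rng):
    """Simulate seropositive counts per age group.

    Each tested individual is a Bernoulli draw with success probability
    equal to the observed seroprevalence of that age group.
    """
    n_tested = np.asarray(n_tested)
    prevalence = np.asarray(n_positive) / n_tested
    return rng.binomial(n_tested, prevalence)

# Hypothetical inputs: four survey participants, three age groups.
rng = np.random.default_rng(2016)
resampled_ids = bootstrap_participants([101, 102, 103, 104], rng)
simulated_positive = bootstrap_serology([10, 20, 30], [0, 20, 15], rng)
```

Each bootstrap replicate would then rebuild the contact matrices from the contacts of the resampled participants and refit the model to the simulated serology, giving a distribution of q and R0 from which percentile intervals can be read.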

All analyses were performed in R version 2.15.2 (R Core Team, 2012), using the cluster and clusterSim packages.

2.5. Sensitivity analysis

To explore whether between-country differences in fitted models were due to heterogeneity in the age ranges over which seroprevalence data were available, we repeated our analysis using only seroprevalence data in the range 0–20 years (which is the maximum range for which data were available in all five countries).

3. Results

3.1. Optimal number of clusters

Evaluating the optimal number of clusters involves three components: the quality of the clustering (measured using ASW), how well the resulting clusters fit the transmission model of VZV seroprevalence (measured using AIC, AICc and BIC) and the number of clusters that are actually relevant to this model fit (Fig. 1).

In all countries, ASW decreased between 2 and 5 clusters (indicating worse clustering quality), probably because most contact characteristics have either 2 (physical/non-physical) or 5 (other characteristics) levels of encoding. After 5 clusters, the ASW gradually increased, with the increases occurring at the points where the number of relevant clusters increased. The three measures of goodness of fit to VZV seroprevalence data (AIC, AICc and BIC) were highly consistent with each other.
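For reference, the three goodness-of-fit measures can be computed from a binomial log-likelihood using their standard definitions, as in the sketch below. The function names are hypothetical, and the choice of what counts as the number of observations (for example, age groups versus individual sera) is an assumption the paper does not specify.

```python
import math

def binomial_log_likelihood(n_tested, n_positive, predicted):
    """Binomial log-likelihood of observed seropositive counts.

    predicted: model seroprevalence per age group, strictly between 0 and 1
    """
    ll = 0.0
    for n, k, p in zip(n_tested, n_positive, predicted):
        # log of the binomial coefficient C(n, k)
        ll += math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        ll += k * math.log(p) + (n - k) * math.log(1.0 - p)
    return ll

def information_criteria(log_lik, n_params, n_obs):
    """Return (AIC, AICc, BIC); AICc requires n_obs > n_params + 1."""
    aic = 2.0 * n_params - 2.0 * log_lik
    aicc = aic + 2.0 * n_params * (n_params + 1) / (n_obs - n_params - 1)
    bic = n_params * math.log(n_obs) - 2.0 * log_lik
    return aic, aicc, bic
```

All three criteria penalise the number of free parameters, which here would be the number of $q_i$ factors, so adding clusters is only rewarded when the likelihood improves enough to offset the penalty.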

We determined the optimal number of clusters by maximising both the quality of the clustering and the goodness of fit, up to the point where further increases in the number of clusters would not noticeably change the quality of clustering, the model fit or the number of relevant clusters. The optimal number of clusters is 12 in Great Britain, 14 in Italy, 15 in Belgium, 8 in Finland and 7 in Poland. The related corrected Rand index for the clustering in each country ranges from 89% to 95%, which suggests that the clustering is robust (Breckenridge, 2000). The Rand index represents the average agreement between two clusterings of the data over multiple replications (with ten replications used in this analysis) if contact data are randomly divided into two samples and each sample is clustered into the same number of clusters separately.

3.2. Comparisons between different stratifications

When contact data are stratified by clusters, only 2–3 clusters in each country are found to be relevant to VZV transmission (i.e. have non-zero $q_i$ values when the model is fitted to VZV serology). Previous work stratifying contacts by a single characteristic suggested that more intimate contacts (physical, home/school, frequent, long duration) best explain VZV transmission (Melegaro et al., 2011). When clustering contacts using multiple characteristics, the most relevant clusters are still likely to contain more intimate contacts. However, the picture is more nuanced and varies across the five countries.

Details of the most relevant clusters are given in Table 2, Fig. 2, Fig. 3 and Appendix A.2 (Supplementary data). Briefly, in Finland and Italy, the countries with the most strongly age-assortative overall contact matrices, three contact clusters are relevant. Overall assortativity is driven by two clusters dominated by daily school contacts: one physical and one non-physical. A third cluster, dominated by non-physical leisure contacts in Finland and physical other contacts in Italy, is also assortative but less so than the school contacts. In both countries there are two clusters with home contacts which are not found to be relevant.

In Great Britain, Belgium and Poland, only two clusters are relevant: one consisting mainly of highly assortative daily physical school contacts, and the second of less assortative physical home contacts. The home clusters show a strong secondary diagonal probably representing inter-generational contacts, particularly in Poland. In all three countries, there are non-relevant home and school clusters more dominated by longer (>4 h) contacts than the home and school clusters actually found relevant.

Fig. 1. Quality of clustering (measured using ASW), quality of fit (measured using AIC, AICc and BIC) and number of relevant clusters for models in each country with different numbers of clusters. The vertical dashed line shows the model with the optimal number of clusters.

Fig. 2. The distributions of the characteristics of contact for each cluster in Great Britain. Each cluster corresponds to one of the numbers below the x-axis. The red star indicates a relevant cluster, whose estimated q is not equal to zero. Results for other countries are given in Appendix A2 (Supplementary data). (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Table 1 compares the best fitting model with the optimal number of clusters in each country, with models of contacts stratified by single characteristics as used in previous analyses (Melegaro et al., 2011). In every country, a model based on clustering contacts using multiple characteristics has a better quality of fit (measured using AIC) than models with contacts stratified by a single characteristic. However, the improved fit is at the expense of less precise estimates (larger uncertainty intervals) of R0. The models based on clusters with multiple characteristics estimate that R0 ranges from 3.79 (95% range 2.79–9.44) in Great Britain to 7.61 (95% range 5.91–32.33) in Poland. The point estimates are similar to estimates using single characteristics and consistent with a range of 3–8 in the literature (Goeyvaerts et al., 2010; Iozzi et al., 2010; Melegaro et al., 2011; Ogunjimi et al., 2009).

Table 1
Model outcomes for different stratifications of social contact data. Parentheses give the 95% interval for the bootstrap distributions of q and R0.

| Stratification | Parameter | Great Britain | Italy | Belgium | Finland | Poland |
|---|---|---|---|---|---|---|
| All contacts | q | 0.057 (0.048–0.066) | 0.033 (0.027–0.039) | 0.091 (0.068–0.114) | 0.076 (0.065–0.089) | 0.057 (0.046–0.071) |
| | R0 | 4.65 (3.95–5.50) | 4.59 (3.74–5.64) | 7.72 (5.81–10.31) | 5.84 (4.95–7.01) | 6.87 (5.49–8.90) |
| | AIC | 264 | 295 | 277 | 194 | 181 |
| By proximity | q (physical) | 0.090 (0.073–0.107) | 0.043 (0.026–0.052) | 0.114 (0.083–0.142) | 0.030 (0.000–0.095) | 0.078 (0.058–0.095) |
| | q (non-physical) | 0.000 (0.000–0.011) | 0.002 (0.000–0.036) | 0.000 (0.000–0.036) | 0.154 (0.056–0.221) | 0.000 (0.000–0.055) |
| | R0 | 3.70 (3.29–4.40) | 3.55 (3.12–4.71) | 5.74 (4.46–7.64) | 8.34 (5.78–11.19) | 5.47 (4.47–7.43) |
| | AIC | 202 | 273 | 253 | 182 | 158 |
| By duration | q (<15 min) | 0.000 (0.000–0.000) | 0.000 (0.000–0.020) | 0.000 (0.000–0.001) | 0.000 (0.000–0.287) | 0.000 (0.000–0.121) |
| | q (15–60 min) | 0.000 (0.000–0.000) | 0.000 (0.000–0.130) | 0.000 (0.000–0.535) | 0.000 (0.000–0.117) | 0.000 (0.000–0.000) |
| | q (>1 h) | 0.081 (0.060–0.096) | 0.043 (0.018–0.050) | 0.121 (0.026–0.148) | 0.107 (0.042–0.131) | 0.079 (0.027–0.103) |
| | R0 | 3.87 (3.28–5.63) | 3.89 (3.37–5.20) | 5.39 (4.15–9.20) | 4.51 (3.97–9.01) | 5.27 (4.31–8.15) |
| | AIC | 235 | 285 | 250 | 186 | 186 |
| By location | q (home) | 0.185 (0.092–0.213) | 0.021 (0.000–0.074) | 0.105 (0.000–0.305) | 0.044 (0.000–0.131) | 0.224 (0.082–0.326) |
| | q (school) | 0.000 (0.000–0.111) | 0.044 (0.023–0.060) | 0.154 (0.000–0.298) | 0.142 (0.068–0.191) | 0.025 (0.000–0.049) |
| | q (work) | 0.000 (0.000–0.000) | 0.000 (0.000–0.012) | 0.000 (0.000–0.000) | 0.000 (0.000–0.000) | 0.000 (0.000–0.000) |
| | q (others) | 0.000 (0.000–0.337) | 0.008 (0.000–0.065) | 0.000 (0.000–0.120) | 0.026 (0.000–0.179) | 0.000 (0.000–0.229) |
| | R0 | 5.11 (3.93–8.93) | 3.93 (3.36–5.07) | 4.97 (3.98–8.54) | 6.15 (4.35–8.56) | 7.63 (6.07–10.77) |
| | AIC | 192 | 281 | 251 | 183 | 108 |
| By frequency | q (daily) | 0.029 (0.000–0.056) | 0.039 (0.025–0.046) | 0.162 (0.076–0.220) | 0.120 (0.051–0.144) | 0.089 (0.062–0.115) |
| | q (weekly) | 0.155 (0.057–0.268) | 0.000 (0.000–0.094) | 0.000 (0.000–0.129) | 0.015 (0.000–0.405) | 0.000 (0.000–0.279) |
| | q (less than weekly) | 0.000 (0.000–0.000) | 0.000 (0.000–0.085) | 0.000 (0.000–0.101) | 0.000 (0.000–0.082) | 0.000 (0.000–0.000) |
| | R0 | 4.62 (3.68–6.60) | 4.19 (3.50–5.33) | 4.83 (3.80–7.50) | 5.35 (4.60–7.17) | 5.53 (4.03–9.82) |
| | AIC | 166 | 282 | 231 | 183 | 187 |
| By clusters | Clusters | 12 | 14 | 15 | 8 | 7 |
| | Relevant clusters | 2 | 3 | 2 | 3 | 2 |
| | q (first cluster) | 0.565 (0.000–0.723) | 0.071 (0.000–0.225) | 0.204 (0.000–0.336) | 0.303 (0.000–0.371) | 0.246 (0.000–0.323) |
| | q (second cluster) | 0.085 (0.000–0.178) | 0.127 (0.000–0.202) | 0.443 (0.000–0.948) | 0.183 (0.000–0.329) | 0.081 (0.000–0.129) |
| | q (third cluster) | | 0.064 (0.000–0.105) | | 0.030 (0.000–0.357) | |
| | R0 | 3.79 (2.79–9.44) | 4.68 (2.92–5.36) | 4.49 (3.22–8.25) | 6.19 (4.92–23.29) | 7.61 (5.91–32.33) |
| | AIC | 160 | 260 | 226 | 180 | 106 |
| | Rand index | 0.89 | 0.91 | 0.91 | 0.95 | 0.90 |

Table 2
Characteristics of the most relevant clusters in each country.

| Country | Cluster | Location | Physical | Frequency | Duration | Assortativity | Importance |
|---|---|---|---|---|---|---|---|
| Great Britain | 1 (home) | Home | Y | Mixed | Mixed | – | ++ |
| | 2 (school) | School | Y | Daily | Mixed | + | + |
| Belgium | 1 (home) | Home | Y | Daily | Mixed | – | – |
| | 2 (school) | School | Y | Daily | Mixed | + | + |
| Poland | 1 (home) | Home | Y | Daily | Mixed | – | ++ |
| | 2 (school) | School (mostly) | Y | Daily (mostly) | Mixed | + | + |
| Finland | 1 (school touch) | School | Y | Daily (mostly) | Mixed | ++ | + |
| | 2 (school non-touch) | School | N | Daily (mostly) | Mixed | ++ | – |
| | 3 (leisure) | Leisure | N | Mixed | Mixed | + | + |
| Italy | 1 (school touch) | School | Y | Daily (mostly) | Mixed | ++ | + |
| | 2 (school non-touch) | School | N | Daily (mostly) | Mixed | ++ | + |
| | 3 (other) | Other | Y | Mixed | Mixed | + | – |

Fig. 4 shows model predictions of VZV seropositivity using the optimal number of clusters, compared with the observed proportion of seropositive samples. On the whole, model predictions fit the sero-epidemiological data well.

3.3. Sensitivity analysis

When only seroprevalence data for 0–20 year olds are used in Belgium, Finland and Italy to ensure comparability with Great Britain and Poland, the overall results are not greatly altered. The same three highly assortative clusters (dominated by two clusters of school contacts) are relevant in Italy. In Finland, two assortative clusters (dominated by school and leisure contacts respectively) are relevant, so there is little overall change in age-assortativity. In Belgium, there is a slight shift as the highly assortative cluster dominated by school contacts is dropped but a less assortative cluster dominated by home contacts remains relevant, so overall the contact matrix is somewhat less assortative. Full details are in Appendix A3 (Supplementary data).

Fig. 3. The corrected age-specific contact matrices for the two identified relevant clusters (cluster numbers 1 and 5) with sampling weighting and reciprocal correction.

Fig. 4. Proportions of samples positive for VZV antibodies by age (blue circles with size proportional to sample size), and model predictions of prevalence (red dashed line) by age. The grey lines give the 95% bootstrap interval. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

4. Discussion

Understanding social contact rates is essential to most realistic models of infectious disease transmission. Contact studies such as POLYMOD and its successors have advanced the field by allowing such rates to be measured empirically rather than assumed. However, infection transmission events cannot usually be directly observed so assumptions have to be made about the contribution of different types of reported contacts used as proxies for contacts leading to transmission. We have conducted the first study to investigate the contacts that best explain past infection status data, taking into account more than one characteristic at a time using clustering algorithms.

When these reported contacts are clustered using clustering algorithms, only a few of these clusters are found to be relevant to VZV transmission. VZV transmission in all five countries is dominated by a group of highly assortative contacts and a group of less assortative contacts. However, the balance between the two groups changes across countries: the former group dominates in Finland and Italy, while the latter group dominates in Poland. Great Britain and Belgium lie in between. These country characteristics are robust to constraining the VZV data used to fit models to data from 0 to 20 year olds, so they do not appear to be driven by country differences in the availability of VZV seroprevalence data.

Interestingly, although school contacts appear to play a key role across all five countries in providing the assortativeness in contacts, non-assortativeness in contacts derives from home, leisure or other contacts. This matches information from VZV surveillance in England that suggests school and preschool children play a key role in transmission (Brisson et al., 2001). This explanation has face validity since VZV is transmitted between, as well as within, households (Organisation for Economic Cooperation and Development, 2008). Of note, the proportion of children 4 years and under enrolled in education in 2006 (when the POLYMOD contact study was conducted) is highest in Belgium, followed by Italy, United Kingdom, Finland and finally Poland. This exactly matches the order in which assortative contacts dominate the clusters relevant to VZV transmission in the five countries, with the exception of Finland. In Finland, although formal education starts late (at age 7), there is high enrolment in publicly subsidised day-care which is not captured in standardised inter-country statistics.

The quality of fit for models based on clusters stratified by multiple variables was only slightly better than for previous models based on contacts stratified by a single variable alone. The similarity between fit quality with single and multiple stratifications of contacts suggests that a single characteristic largely succeeds in capturing the dichotomy between relevant and non-relevant contacts. This is probably because contact characteristics are highly associated with each other (e.g. contacts that are physical, long duration, home and frequent at the same time are disproportionately represented among all contacts). Nevertheless, by considering several contact characteristics simultaneously, our clustering method gives insights into the infection transmission process that may not be seen when stratifying by a single characteristic. For instance, we can determine the relevance of short duration physical contacts, for which single characteristic stratification would produce conflicting conclusions depending on whether the stratification was by duration or proximity.

Recent analyses have shown that using an age-specific proportionality factor enables models to better fit VZV serology in many European countries (Goeyvaerts et al., 2010; Santermans et al., 2015). In our analysis, we vary proportionality factors according to other contact characteristics, but assume that they are age-independent within each cluster of contacts. This was done for simplicity and to allow direct comparison with previous work (Melegaro et al., 2011) but may have resulted in a poorer fit to data.

5. Conclusions

Using the k-medoids clustering algorithm to identify clusters of contacts relevant to VZV transmission gives similar model fit quality and R0 values to contacts stratified by a single characteristic. However, considering several characteristics simultaneously provides key insights into the type of contacts that are most relevant to infection transmission, and those that seem not to play an important role. Firstly, it confirms that single-characteristic stratifications are able to provide optimal fits to data, probably because they are able to capture the most intimate contacts responsible for most of transmission. Secondly, we find that school-age children play a key role in almost all contacts relevant to VZV transmission across five countries, and contacts at school in particular are most important in countries with highly assortative contact matrices. This kind of analysis may therefore provide an alternative or supplementary approach to previous single characteristic stratification models, as well as validation for previous methods of contact stratification. Extending such analyses to other countries and infections may be useful for future work.

Conflicts of interest

None.

Author contributions

MJ and XX designed the study. XX carried out the analysis, wrote the code and drew the conclusions with input from MJ, AJvH, MGK and AM. XX and MJ wrote the first draft of the manuscript. All authors contributed to critical revisions of the manuscript and approved the final article.

Acknowledgments

We thank John Edmunds, Mirjam Kretzschmar and Rafael Mikolajczyk for helpful discussions. This work was supported by the National Institute for Health Research Health Protection Research Unit (NIHR HPRU) in Immunisation at the London School of Hygiene and Tropical Medicine in partnership with Public Health England (PHE). The views expressed are those of the authors and not necessarily those of the NHS, the NIHR, the Department of Health or Public Health England. Research leading to these results has received funding from the European Research Council under the European Union’s Seventh Framework Programme (FP7/2007-2013)/ERC Grant agreement 283955 (DECIDE) to AM. The funders did not influence study design, collection, analysis and interpretation of data, writing of this report, or the decision to submit for publication.

Appendix A. Supplementary data

Supplementary data associated with this article can be found, in the online version, at http://dx.doi.org/10.1016/j.epidem.2016.08.001.

References

Beutels,P.,Shkedy,Z.,Aerts,M.,VanDamme,P.,2006.Socialmixingpatternsfor transmissionmodelsofclosecontactinfections:exploringself-evaluationand diary-baseddatacollectionthroughaweb-basedinterface.Epidemiol.Infect. 134,1158–1166,http://dx.doi.org/10.1017/S0950268806006418.

Breckenridge,J.,2000.Validatingclusteranalysis:consistentreplicationand symmetry.MultivariateBehav.Res.35,261–285.

Brisson,M.,Edmunds,W.J.,Law,B.,Gay,N.J.,Walld,R.,Brownell,M.,Roos,L.L., Roos,L.,DeSerres,G.,2001.Epidemiologyofvaricellazostervirusinfectionin CanadaandtheUnitedKingdom.Epidemiol.Infect.127,305–314.

Edmunds,W.J.,O’Callaghan,C.J.,Nokes,D.J.,1997.Whomixeswithwhom?A methodtodeterminethecontactpatternsofadultsthatmayleadtothe spreadofairborneinfections.Proc.Biol.Sci.264,949–957,http://dx.doi.org/ 10.1098/rspb.1997.0131.

Everitt,B.S.,Landau,S.,Leese,M.,Stahl,D.,2011.Clusteranalysis.In:WileySeries inProbabilityandStatistics.Wiley,Chichester,UK,http://dx.doi.org/10.1007/ BF00154794.

Goeyvaerts,N.,Hens,N.,Ogunjimi,B.,Aerts,M.,Shkedy,Z.,Damme,P.,VanBeutels, P.,2010.Estimatinginfectiousdiseaseparametersfromdataonsocialcontacts andserologicalstatus.J.R.Stat.Soc.Ser.CAppl.Stat.59,255–277,http://dx. doi.org/10.1111/j.1467-9876.2009.00693.x.

Gower, J.C., 1971. A General Coefficient of Similarity and Some of Its Properties. Biometrics 27, 857, http://dx.doi.org/10.2307/2528823.

House, T., Baguelin, M., Van Hoek, A.J., White, P.J., Sadique, Z., Eames, K., Read, J.M., Hens, N., Melegaro, A., Edmunds, W.J., Keeling, M.J., 2011. Modelling the impact of local reactive school closures on critical care provision during an influenza pandemic. Proc. Biol. Sci. 278, 2753–2760, http://dx.doi.org/10.1098/rspb.2010.2688.

Iozzi, F., Trusiano, F., Chinazzi, M., Billari, F.C., Zagheni, E., Merler, S., Ajelli, M., Del Fava, E., Manfredi, P., 2010. Little Italy: an agent-based approach to the estimation of contact patterns - fitting predicted matrices to serological data. PLoS Comput. Biol. 6, e1001021, http://dx.doi.org/10.1371/journal.pcbi.1001021.

Kaufman, L., Rousseeuw, P.J., 1990. Finding Groups in Data: An Introduction to Cluster Analysis (Wiley Series in Probability and Statistics). John Wiley & Sons.

Keeling, M.J., Rohani, P., 2008. Modeling Infectious Diseases in Humans and Animals. Princeton University Press, http://dx.doi.org/10.1038/453034a.

Kretzschmar, M., Mikolajczyk, R.T., 2009. Contact profiles in eight European countries and implications for modelling the spread of airborne infectious diseases. PLoS One 4, e5931, http://dx.doi.org/10.1371/journal.pone.0005931.

Melegaro, A., Jit, M., Gay, N., Zagheni, E., Edmunds, W., 2011. What types of contacts are important for the spread of infections? Using contact survey data to explore European mixing patterns. Epidemics 3, 143–151, http://dx.doi.org/10.1016/j.epidem.2011.04.001.

Mossong, J., Hens, N., Jit, M., Beutels, P., Auranen, K., Mikolajczyk, R., Massari, M., Salmaso, S., Tomba, G.S., Wallinga, J., Heijne, J., Sadkowska-Todys, M., Rosinska, M., Edmunds, W.J., 2008. Social contacts and mixing patterns relevant to the spread of infectious diseases. PLoS Med. 5, e74.

Nardone, A., de Ory, F., Carton, M., Cohen, D., van Damme, P., Davidkin, I., Rota, M.C., de Melker, H., Mossong, J., Slacikova, M., Tischer, A., Andrews, N., Berbers, G., Gabutti, G., Gay, N., Jones, L., Jokinen, S., Kafatos, G., de Aragón, M.V.M., Schneider, F., Smetana, Z., Vargova, B., Vranckx, R., Miller, E., 2007. The comparative sero-epidemiology of varicella zoster virus in 11 countries in the European region. Vaccine 25, 7866–7872, http://dx.doi.org/10.1016/j.vaccine.2007.07.036.

Ogunjimi, B., Hens, N., Goeyvaerts, N., Aerts, M., Van Damme, P., Beutels, P., 2009. Using empirical social contact data to model person to person infectious disease transmission: an illustration for varicella. Math. Biosci. 218, 80–87, http://dx.doi.org/10.1016/j.mbs.2008.12.009.

Organisation for Economic Cooperation and Development, 2008. Education at a Glance, Statistics/Education at a Glance/2008: OECD Indicators [WWW Document]. URL http://www.oecd-ilibrary.org/education/education-at-a-glance-2008eag-2008-en (accessed 7.26.16).

Osborne, K., Gay, N., Hesketh, L., Morgan-Capner, P., Miller, E., 2000. Ten years of serological surveillance in England and Wales: methods, results, implications and action. Int. J. Epidemiol. 29, 362–368.

R Core Team, 2012. R: A language and environment for statistical computing.

Santermans, E., Goeyvaerts, N., Melegaro, A., Edmunds, W.J., Faes, C., Aerts, M., Beutels, P., Hens, N., 2015. The social contact hypothesis under the assumption of endemic equilibrium: elucidating the transmission potential of VZV in Europe. Epidemics 11, 14–23, http://dx.doi.org/10.1016/j.epidem.2014.12.005.

Schenzle, D., 1984. An age-structured model of pre- and post-vaccination measles transmission. IMA J. Math. Appl. Med. Biol. 1, 169–191.

Thorrington, D., Jit, M., Eames, K., 2015. Targeted vaccination in healthy school children - can primary school vaccination alone control influenza? Vaccine 33, 5415–5424, http://dx.doi.org/10.1016/j.vaccine.2015.08.031.

Vynnycky, E., White, R., 2010. An Introduction to Infectious Disease Modelling. Oxford University Press, Oxford.

Vyse, A.J., Gay, N.J., Hesketh, L.M., Morgan-Capner, P., Miller, E., 2004. Seroprevalence of antibody to varicella zoster virus in England and Wales in children and young adults. Epidemiol. Infect. 132, 1129–1134.

Walesiak, M., Dudek, A., Dudek, M., 2008. clusterSim: Searching for optimal clustering procedure for a data set.

Wallinga, J., Teunis, P., Kretzschmar, M., 2006. Using data on social contacts to estimate age-specific transmission parameters for respiratory-spread infectious agents. Am. J. Epidemiol. 164, 936–944, http://dx.doi.org/10.1093/aje/kwj317.
