Method Article
PCA-based method for managing and analyzing single-spot analysis referenced to spectral imaging for artworks diagnostics
Giacomo Marchioro, Claudia Daffara *
Dept. of Computer Science, University of Verona, Strada le Grazie 15, 37134, Verona, Italy
ABSTRACT
Artworks diagnostics is based on the joint use of several nondestructive techniques to acquire complementary information on the materials. A common practice in the field is to perform the analyses with single-spot analytical techniques, e.g. spectroscopy-based, after a preliminary screening of the artwork with full-field imaging-based techniques. We present a method and its practical implementation for fusing and analyzing data collected using analytical systems that acquire single-spot measurements mapped to spectral imaging stacks. The fused dataset of single-spot and imaging observations is analyzed using principal component analysis (PCA). The effectiveness of the method for artworks diagnostics is shown on spectroscopy and imaging datasets of an ancient canvas painting. The results of the PCA on the final fused dataset are compared against the PCA performed on the original datasets from single-spot and imaging measurements taken separately. We propose two practical implementations of the procedure, one based on a graphical user interface (GUI) and open-source GIS software (QGIS), the other based on an open-source Python module, named SPOLVERRO, specifically developed for this project and released on a public repository. The method allows conservation scientists to analyze effectively the heterogeneous datasets acquired in a diagnostic campaign.
Single-spot spectroscopy data are referenced on imaging data.
The sampling area of each spectroscopy spot is used for extracting and averaging the respective imaging data values.
The final matrix is analyzed using PCA for extracting further information.
© 2020 Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
DOI of original article: http://dx.doi.org/10.1016/j.microc.2019.104469
* Corresponding author.
E-mail address: claudia.daffara@univr.it (C. Daffara).
http://dx.doi.org/10.1016/j.mex.2020.100799
2215-0161/© 2020 Published by Elsevier B.V.
MethodsX
ARTICLE INFO
Method name: PCA-based method for managing and analyzing single-spot analysis referenced to spectral imaging for artworks diagnostics
Keywords: Cultural Heritage, PCA, GIS, XRF, Multispectral imaging, Hyperspectral imaging, Spectroscopy, Data fusion, Artworks
Article history: Received 26 November 2019; Accepted 13 January 2020; Available online 23 January 2020
Specification Table
Subject Area: Environmental Science
More specific subject area: Diagnostics of Cultural Heritage, multi-modal data fusion
Method name: PCA-based method for managing and analyzing single-spot analysis referenced to spectral imaging for artworks diagnostics
Name and reference of original method: NA
Resource availability: https://github.com/giacomomarchioro/spolverro
Method details
Background
During a diagnostic campaign, different nondestructive techniques are used for investigating the artwork [1–4]. We can distinguish between systems that are able to acquire images (imaging systems) and systems that are able to acquire only a single spot for each measurement, e.g. spectroscopy systems. In this method we call the latter single-spot systems, where by “spot” we mean the sampling area, i.e. the area and the associated volume of the artwork from which most of the analytical signal is collected. Even though in recent years researchers have been developing many spectroscopy scanning systems able to scan relatively large areas (in the order of the square metre), these systems are often expensive and not commercially available. Furthermore, in the case of large artworks (such as frescos and large canvas paintings) the time required for scanning the entire surface is often prohibitive. In current practice, imaging techniques comprising multispectral and hyperspectral cameras are used for exploring large areas, while hand-held and portable spectrometers are often used in a limited number of selected points for gathering further information. The management of the different data collected using the various techniques is crucial for taking full advantage of the information contained. In practice, however, conservators and conservation scientists often lack the required computer literacy for combining and analyzing heterogeneous datasets.
In this context, we propose a method and its practical implementation for fusing and analyzing the data collected from analytical single-spot systems mapped to spectral imaging stacks. The procedure was originally developed to support the analysis of spectroscopy and imaging data routinely acquired during the diagnostics of ancient artworks in the museums of Venice [5]; it was then standardized and is released here as an available resource.
The need to perform the method in a reproducible way, for example by using programming languages to write all the steps of the analysis as a reproducible script, may limit the adoption of the method to specific users. Hence, in parallel with the implementation of the method using scripts, we show an alternative practical solution based on software with a graphical user interface (GUI). Geographical information systems (GIS) can be used effectively in the management of the data collected during the diagnostic campaign on artworks [6]. Because of the wider user community, GIS are generally stable and mature and can be used as an alternative to ad hoc solutions for Cultural Heritage [7]. Cultural Heritage operators and scholars using GIS are helped by the GUI; however, because of the many functionalities of this software, they often need to be guided step by step.
The general procedure to design data management and analysis in artwork diagnostics can be summarized in the steps in Fig. 1. In the first phase, imaging systems are used for exploring a large area or the entire area of the artwork. If the acquired images are distorted, it is preferable to correct the distortion before proceeding with the following steps. In the second phase, the conservation scientist performs a first analysis of the data and plans, together with conservators and art historians, further investigations based on, for example, single-spot systems such as portable X-ray fluorescence (XRF), FT-IR, and Raman spectrometers. During the acquisition, the position of the measurements performed with the single-spot technique is recorded and subsequently mapped to the imaging data. At this stage, the heterogeneous data collected from imaging and single-spot techniques can be fused in a single dataset. Based on the position of the referenced single-spot measurement and on the estimated sampling area of the single-spot technique (Fig. 2), the software extracts the pixels of the imaging technique enclosed in the boundary of the spot and computes the average of the pixel values belonging to the same multispectral channel. The result is a data matrix p × n, where p indicates the number of rows representing the observations, equal to the number of spots collected, and n indicates the number of columns containing the values of each single-spot analysis and the respective averaged multispectral imaging values. In the last phase, the final data matrix can be analyzed using multivariate analysis. In this method, we propose principal component analysis (PCA) for extracting meaningful information from the data fusion process and validating the procedure.
Software needed
The general procedure described in the previous section has been implemented by taking advantage of available free and open-source tools, and the developed code is made available as a Python module that we called SPOLVERRO. For implementing the software we used the Python 3.6 programming language, together with third-party open-source scientific libraries to make the workflow automated and more reproducible. The Python Shapefile Library (PyShp v. 2.1) is used for reading the shapefile vector data format, while raster data (i.e. imaging data) are processed using the Rasterio library (v. 0.10) [8], which is based on the Geospatial Data Abstraction Library (GDAL) [9] by the Open Source Geospatial Foundation. For the analysis of the data we used the NumPy and SciPy libraries, together with the scikit-learn library [10] for computing the PCA. The Matplotlib library [11] is used for plotting the outputs.
Regarding the alternative implementation of the method based on GIS software, we chose QGIS (version 2.16 Nødebo or greater), a free and open-source multiplatform software with a wide community of users able to provide support. QGIS can be used for performing the first four steps of the procedure (Fig. 3).
Hardware needed: imaging and single-spot instruments
To apply the method, at least one imaging technique and one single-spot technique are needed for performing the measurements. The method presented is more effective when multispectral or hyperspectral imaging systems are combined with portable or handheld spectrometers, as shown. However, the method can in principle also be applied with systems that record a limited number of bands or variables. The method, as reported here, assumes that the imaging technique has a spatial resolution equal to or higher than that of the single-spot technique. However, it can also be applied with imaging techniques with lower spatial resolution, assuming a certain homogeneity of the artwork material.
Fig. 2. Illustration of sampling area and associated experimental surface for the single-spot analysis. In the application shown,
Step 1 – acquisition and correction of the imaging data
The first step consists in acquiring the imaging data. It is preferable to correct any distortion of the images to ensure a correct referencing of the acquisitions collected using the single-spot techniques. The distortions can be corrected using specific software; otherwise, if ortho-photos are available as ground truth, the images can be registered manually using control points and the “Georeferencer Plugin” of the QGIS software. If the imaging systems output the images in a format not readable by QGIS, the GDAL drivers can be used for converting the multispectral or hyperspectral image stack to a suitable format, e.g. GeoTIFF or .lan. If we assign a projection when importing the data in GIS, we must be sure to use a planar projection, e.g. EPSG:32632, and not to select ellipsoidal measurements.
Step 2 – planning of the single-spot analyses campaign
After the acquisition of the imaging dataset, a first inspection of the images collected is preparatory to the next step. In this case, false-colour composite imaging is often helpful for identifying, together with the conservators and art historians, the problematic areas (for example, addressing questions such as detecting original or restored areas of the artwork) and guiding the following sampling using single-spot analyses. There are many strategies for facilitating this process; for example, a vector layer in the GIS software can be created for storing the positions where the analysis will be performed. This planning step is very important in case the single-spot instrument is not available for the duration of the entire diagnostic campaign and hence the time (and cost) of measurements must be optimized.
Step 3 – acquisition and mapping of the single-spot analyses
The position where the single-spot measurements are acquired must be recorded accurately. If mechanical positioning systems or optical tracking systems are available, they can be used for automatically assigning the position to each spectrometry measurement relative to the image coordinate reference system. Otherwise, GIS software can be used for creating a new vector layer and the operator can manually reference the single-spot measurements to the imaging data previously imported. It is good practice to take photos during the acquisition and establish a unique ID for each measurement to ensure the correct referencing of the single points. In this way, we can store the position of the points (for instance, as a shapefile) and later join the corresponding single-spot analysis data. This can be done by exporting the spectroscopy data in a common format, e.g. csv, and by importing it as an attribute table, which can be joined with the shapefile using the built-in QGIS “join feature” in the layer properties with the ID as join field. The same procedure can be carried out using the Python Shapefile Library.
Fig. 3. A screenshot of a QGIS project, where XRF measurements are mapped to multiple raster images over a large mural
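The join by ID described in Step 3 can be illustrated with a minimal sketch. It uses only the Python standard library and plain dictionaries in place of a shapefile, so it is not the SPOLVERRO or QGIS implementation; the function name `join_by_id`, the field name `ID`, the spot coordinates, and the element counts are all hypothetical values chosen for illustration.

```python
import csv
import io

def join_by_id(spots, csv_text, id_field="ID"):
    """Attach spectroscopy values to each mapped spot through a shared ID."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Index the exported spectroscopy table by the ID column.
    table = {row[id_field]: row for row in reader}
    joined = []
    for spot in spots:
        row = table.get(spot[id_field])
        if row is None:
            # A mapped spot without spectroscopy data is skipped, not guessed.
            continue
        values = {k: float(v) for k, v in row.items() if k != id_field}
        joined.append({**spot, **values})
    return joined

# Hypothetical data: two mapped spots and a CSV export with three rows.
spots = [
    {"ID": "xrf_01", "x": 120.5, "y": 340.0},
    {"ID": "xrf_02", "x": 410.0, "y": 95.25},
]
csv_text = "ID,Fe,Pb,Hg\nxrf_01,1200,5300,40\nxrf_02,800,6100,2500\nxrf_03,10,20,30\n"
joined = join_by_id(spots, csv_text)
```

Rows of the CSV whose ID has no mapped position (here `xrf_03`) are left out of the result, which makes a missing or mistyped ID visible as a change in the number of observations.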
Step 4 – extract the imaging data within the sampling area of the single-spot technique
For creating a new matrix containing both the single-spot analyses and the corresponding signal recorded with the imaging techniques in the same position, the corresponding imaging values must be extracted for each location of the single-spot analyses. For performing this task, the sampling area of the spectroscopy technique used must be estimated beforehand. The imaging data within the sampling area are extracted and averaged for each spectral channel or band of the imaging technique. This procedure can be implemented programmatically using Python and Rasterio. In this case, we approximate the sampling area to a square of side equal to the diameter of the sampling area. Knowing the corresponding physical dimension of the pixel on the artwork surface, a grid of points equivalent to the sampling area is calculated. For each point the nearest raster pixel for each channel is extracted. Otherwise, using QGIS, a similar procedure can be carried out using the “Point sampling tool” plugin or the SAGA function “Raster Values to Points” (in QGIS v. > 3.14).
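The extraction described above can be sketched with NumPy on an image stack already loaded in memory. This is an illustration under stated assumptions, not the SPOLVERRO code: the function name `extract_spot_mean` and its parameters are hypothetical, the stack is assumed to have shape (bands, rows, cols), and the raster origin is assumed to be the top-left corner with y growing downwards.

```python
import numpy as np

def extract_spot_mean(cube, x, y, spot_diameter, pixel_size, origin=(0.0, 0.0)):
    """Average each band over a square of side `spot_diameter` centred on (x, y).

    cube: array of shape (bands, rows, cols).
    x, y, spot_diameter, pixel_size, origin: all in the same physical unit.
    """
    half = spot_diameter / 2.0
    # Number of grid points per side, one per pixel width (at least one).
    n = max(int(round(spot_diameter / pixel_size)), 1)
    # Grid of points spaced by one pixel and centred on the spot.
    offsets = (np.arange(n) + 0.5) * pixel_size - half
    gx, gy = np.meshgrid(x + offsets, y + offsets)
    # Nearest raster pixel for each grid point (no interpolation).
    cols = np.floor((gx - origin[0]) / pixel_size).astype(int)
    rows = np.floor((gy - origin[1]) / pixel_size).astype(int)
    # Discard grid points that fall outside the raster.
    inside = (rows >= 0) & (rows < cube.shape[1]) & (cols >= 0) & (cols < cube.shape[2])
    if not inside.any():
        raise ValueError("the sampling area lies outside the raster")
    samples = cube[:, rows[inside], cols[inside]]  # shape (bands, n_points)
    return samples.mean(axis=1)

# Hypothetical two-band stack: band 0 is uniform, band 1 has a bright 2x2 patch.
cube = np.zeros((2, 10, 10))
cube[0] = 1.0
cube[1, 4:6, 4:6] = 8.0
means = extract_spot_mean(cube, x=5.0, y=5.0, spot_diameter=2.0, pixel_size=1.0)
```

With a spot diameter of two pixels the square covers exactly the bright patch, so the second band averages to the patch value; a larger diameter dilutes it with the surrounding pixels.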
Step 5 – analysis of the final data matrix
The final data matrix is composed of a number of rows corresponding to the number of measurements performed with the single-spot technique. Each row is the joint vector comprising the single-spot analysis data and the corresponding averaged imaging values extracted within the sampling area for each spectral channel or band. In other words, each row is an observation while each column represents a variable. Different preprocessing of the data can be carried out before applying multivariate analysis to the data. However, it might be reasonable to exclude noisy channels before applying further processing. In this method, we suggest using principal component analysis after the standardization of the dataset. The standardization is performed by removing the mean and by scaling the fused dataset to unit variance. The processing can be carried out via scripts using scikit-learn [10]; otherwise, for non-Python users, using available free and open-source tools with a GUI such as Gnumeric. Finally, clustering techniques can be used to identify clusters in the score plots obtained with the PCA.
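The standardization and PCA of the fused matrix can be sketched as follows. To keep the example self-contained it computes the PCA with NumPy's singular value decomposition rather than with scikit-learn, which is the library the method itself uses; the function names are hypothetical, and the matrices are random placeholders with the shapes of the case study (47 spots, 32 imaging bands) plus an assumed 5 XRF variables, not real measurements.

```python
import numpy as np

def standardize(matrix):
    """Remove the column mean and scale each column to unit variance."""
    mean = matrix.mean(axis=0)
    std = matrix.std(axis=0)
    std[std == 0] = 1.0  # leave constant columns at zero instead of dividing by zero
    return (matrix - mean) / std

def pca_scores(matrix, n_components=3):
    """Return the PCA scores and explained variance ratios of the standardized data."""
    z = standardize(np.asarray(matrix, dtype=float))
    # SVD of the standardized data: rows of vt are the principal directions.
    u, s, vt = np.linalg.svd(z, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = s ** 2 / np.sum(s ** 2)
    return scores, explained[:n_components]

# Placeholder data only: random numbers standing in for the real datasets.
rng = np.random.default_rng(0)
xrf = rng.normal(size=(47, 5))        # single-spot variables (assumed count)
imaging = rng.normal(size=(47, 32))   # averaged values of the imaging bands
fused = np.hstack([xrf, imaging])     # one row per spot, one column per variable
scores, explained = pca_scores(fused)
```

The first three columns of `scores` are the coordinates that would be drawn in a score plot; running `pca_scores` on `xrf` and `imaging` alone gives the separate analyses against which the fused result can be compared.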
Method validation
The method has been validated on datasets of an ancient large canvas painting (called telero), a masterpiece by the Venetian artist Carpaccio, collected during the Iperion [12] project as presented in our publication [5]. In the case study, the multispectral imaging technique is combined with the XRF spectroscopy single-spot technique. A multispectral cube was collected using a Vis-NIR scanner (16 bands in the visible and 16 bands in the near infrared up to 2250 nm), while a set of 47 spot analyses was performed using a hand-held XRF spectrometer over the red areas of the painting. To validate the method we first performed steps 1–4 of the procedure, then we tested the effectiveness of the method for analyzing the two datasets separately and after the fifth step of the procedure that fuses the two datasets together. Once the values for each spot had been extracted, we standardized the Vis-NIR dataset and the XRF dataset separately, performed the PCA on each dataset, and evaluated the results against the PCA computed on the two datasets fused in a single representation after the standardization. Previously, each observation was categorized into two groups, one comprising the various mixtures of pigments and labelled using colour depending on the attributed pigment, the other using different markers representing data collected in a compromised area (circle marker) or not (triangle marker).
The results are shown in Fig. 4, depicting the score plots of the first three principal components of the PCA performed on the datasets. The score plot of the XRF dataset (Fig. 4a) highlights some clusters with different mixtures of pigments. The score plot with the first three principal components of the dataset obtained by extracting the values from the Vis-NIR multispectral cube (Fig. 4b) shows two clusters separated by the first principal component, one representing the compromised pictorial layers and the other relative to the data collected on the other areas. However, in this case the clusters relative to the pigment mixture are not clear. Finally, the score plot obtained using the joint dataset matrix (Fig. 4c) clearly distinguishes the data collected on the damaged areas from the others. Furthermore, we can identify different clusters depending on the pigments used in the pictorial layer.
Hints
In general, we must consider that even if the data is referenced on the artwork surface, the signal may come from different depths of the pictorial layer, depending on the properties of the materials and the radiation used. Hence, the information retrieved must be considered relative to an unknown portion of volume rather than to the sampling area. The contribution of artwork stratigraphy materials to the collected signal is not easy to determine and generally it varies depending on the radiation wavelength. This issue concerns both the use of analytical single-spot techniques and multispectral imaging techniques.
In step 4, the user may average the imaging data on an area greater than the effective sampling area of the single-spot technique. In general, the size of the area to be averaged should also take into account the error in mapping the single-spot measurement to the raster image.
In step 4, it can be useful to calculate, besides the average, also the standard deviation for each multispectral channel. This can give an insight into the homogeneity of the sampled area. In artworks diagnostics this issue may be crucial due to the high heterogeneity of the materials. If the standard deviation is high, errors in the process of mapping the single-spot measurements to the imaging data may have great influence on the final result.
In step 4, the user should know how the software computes the extraction of the values from the image stack in the sampling area. In particular, it must be clarified if the values are interpolated or are extracted from the nearest pixel. In our case, we extracted the values from the nearest pixel.
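As a small illustration of this hint, the per-channel standard deviation can be computed next to the mean from the pixel values extracted in one sampling area. The function name `spot_statistics`, the sample values, and the choice of the sample standard deviation (`ddof=1`) are assumptions made for this sketch.

```python
import numpy as np

def spot_statistics(samples):
    """Return the per-band mean and standard deviation of one sampling area.

    samples: array of shape (bands, n_pixels) with the extracted pixel values.
    """
    samples = np.asarray(samples, dtype=float)
    mean = samples.mean(axis=1)
    if samples.shape[1] > 1:
        std = samples.std(axis=1, ddof=1)  # sample standard deviation
    else:
        std = np.zeros(samples.shape[0])   # a single pixel carries no spread
    return mean, std

# Hypothetical values: band 0 is homogeneous, band 1 varies inside the spot.
samples = [[10.0, 10.0, 10.0, 10.0],
           [2.0, 4.0, 6.0, 8.0]]
mean, std = spot_statistics(samples)
```

A band with a large standard deviation relative to its mean, like the second band here, marks a spot where a small mapping error could change the averaged value noticeably.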
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Fig. 4. Score plots of the first three principal components of the PCA performed on the XRF spectroscopy dataset (a), on the
Acknowledgements
This work was partially supported by the Scan4Reco project funded by the European Union Horizon 2020 Framework Programme for Research and Innovation under grant agreement no 665091. We would like to thank Nicole de Manincor, Enrico Fiorin, Marco Raffaelli for collecting the data used for the validation of the method, and Giovanni Allegri for suggestions on the projection systems used for the GIS analysis.
References
[1] H. Liang, Advances in multispectral and hyperspectral imaging for archaeology and art conservation, Appl. Phys. A-Mater. 106 (2012) 309–323, doi: http://dx.doi.org/10.1007/s00339-011-6689-1.
[2] K. Janssens, G. Van der Snickt, F. Vanmeert, S. Legrand, G. Nuyts, M. Alfeld, L. Monico, W. Anaf, W. De Nolf, M. Vermeulen, Non-invasive and non-destructive examination of artistic pigments, paints, and paintings by means of X-ray methods, Top. Curr. Chem. 374 (2016) 81, doi: http://dx.doi.org/10.1007/s41061-016-0079-2.
[3] M. Alfeld, L. de Viguerie, Recent developments in spectroscopic imaging techniques for historical paintings - a review, Spectrochim. Acta B 136 (2017) 81–105, doi: http://dx.doi.org/10.1016/j.sab.2017.08.003.
[4] C. Daffara, S. Parisotto, D. Ambrosini, Multipurpose, dual-mode imaging in the 3–5 μm range (MWIR) for artwork diagnostics: a systematic approach, Opt. Laser Eng. 104 (2018) 266–273, doi: http://dx.doi.org/10.1016/j.optlaseng.2017.10.006.
[5] N. de Manincor, G. Marchioro, E. Fiorin, M. Raffaelli, O. Salvadori, C. Daffara, Integration of multispectral visible-infrared imaging and pointwise X-ray fluorescence data for the analysis of a large canvas painting by Carpaccio, Microchem. J. 153 (2020), doi: http://dx.doi.org/10.1016/j.microc.2019.104469.
[6] W. Schmid (Ed.), GRADOC - Graphic Documentation Systems in Mural Painting Conservation. Research Seminar, Rome 16–20 November 1999, ICCROM, Rome, 2000.
[7] A. Amat, C. Miliani, B.G. Brunetti, Non-invasive multi-technique investigation of artworks: a new tool for on-the-spot data documentation and analysis, J. Cult. Herit. 14 (2013) 23–30, doi: http://dx.doi.org/10.1016/j.culher.2012.02.015.
[8] S. Gillies, et al., Rasterio: Geospatial Raster I/O for Python programmers, (2013). https://github.com/mapbox/rasterio.
[9] F. Warmerdam, The geospatial data abstraction library, in: B. Hall, M.G. Leahy (Eds.), Open Source Approaches in Spatial Data Handling, Springer-Verlag, Berlin Heidelberg, 2008, pp. 87–104, doi: http://dx.doi.org/10.1007/978-3-540-74831-1.
[10] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, Scikit-learn: machine learning in Python, J. Mach. Learn. Res. 12 (2011) 2825–2830.
[11] J.D. Hunter, Matplotlib: a 2D graphics environment, Comput. Sci. Eng. 9 (2007) 90–95, doi: http://dx.doi.org/10.1109/MCSE.2007.55.
[12] IPERION CH Integrated Platform for the European Research Infrastructure ON Cultural Heritage, Molab Facility, (2019).