bibliometrix : An R-tool for comprehensive science mapping analysis

Contents lists available at ScienceDirect

Journal of Informetrics

journal homepage: www.elsevier.com/locate/joi

Regular article


Massimo Aria a,∗, Corrado Cuccurullo b

a Department of Economics and Statistics, Università degli Studi di Napoli Federico II, Via Cintia, C.sso M.te S. Angelo, 80126 Naples, Italy
b Department of Economics and Management, Università della Campania Luigi Vanvitelli, Corso Gran Priorato di Malta, Capua, CE, Italy

article info

Article history:
Received 14 February 2017
Received in revised form 27 August 2017
Accepted 27 August 2017
Available online 12 September 2017

Keywords: Bibliometrics, Science mapping, Workflow, Co-citation, Bibliographic coupling, R package

abstract

The use of bibliometrics is gradually extending to all disciplines. It is particularly suitable for science mapping at a time when the emphasis on empirical contributions is producing voluminous, fragmented, and controversial research streams. Science mapping is complex and unwieldy because it is multi-step and frequently requires numerous and diverse software tools, which are not all necessarily freeware. Although automated workflows that integrate these software tools into an organized data flow are emerging, in this paper we propose a unique open-source tool, designed by the authors, called bibliometrix, for performing comprehensive science mapping analysis. bibliometrix supports a recommended workflow to perform bibliometric analyses. As it is programmed in R, the proposed tool is flexible and can be rapidly upgraded and integrated with other statistical R-packages. It is therefore useful in a constantly changing science such as bibliometrics.

© 2017 Elsevier Ltd. All rights reserved.

1. Introduction

The number of academic publications is increasing at a rapid pace and it is becoming increasingly unfeasible to remain current with everything that is being published. Moreover, the emphasis on empirical contributions has resulted in voluminous and fragmented research streams (Briner & Denyer, 2012). This hampers the ability to accumulate knowledge and actively collect evidence through a set of previous research papers. Therefore, literature reviews are increasingly assuming a crucial role in synthesizing past research findings to effectively use the existing knowledge base, advance a line of research, and provide evidence-based insight into the practice of exercising and sustaining professional judgment and expertise (Rousseau, 2012).

Scholars use different qualitative and quantitative literature reviewing approaches to understand and organize earlier findings. Among these, bibliometrics has the potential to introduce a systematic, transparent, and reproducible review process based on the statistical measurement of science, scientists, or scientific activity (Broadus, 1987; Diodato, 1994; Pritchard, 1969). Unlike other techniques, bibliometrics provides more objective and reliable analyses. The overwhelming volume of new information, conceptual developments, and data is the milieu where bibliometrics becomes useful: it provides a structured analysis of a large body of information to infer trends over time and themes researched, to identify shifts in the boundaries of the disciplines, to detect the most prolific scholars and institutions, and to present the "big picture" of extant research (Crane, 1972).

∗ Corresponding author.

E-mail addresses: aria@unina.it, massimo.aria@unina.it (M. Aria), corrado.cuccurullo@unicampania.it (C. Cuccurullo).

http://dx.doi.org/10.1016/j.joi.2017.08.007


Although the use of bibliometrics has, over time, been extended to all disciplines, bibliometric analysis is complex because it entails several steps that employ numerous and diverse analyses and mapping software tools, which are frequently available only under commercial licenses (Guler, Waaijer, and Palmblad, 2016). These difficulties are compounded by the reality that few researchers and practitioners are trained in how to review literature and to identify evidence-based practices (Briner & Denyer, 2012). The cumbersome nature of the process reduces the possibilities and the potential of bibliometrics, especially for scholars who have no general programming skills.

Recently, automated workflows to assemble specialized software into a comprehensive and organized data flow have begun to emerge for bibliometrics. They are particularly well suited to multi-step analyses using different types of software tools (Guler, Waaijer, Mohammed, & Palmblad, 2016). In this paper, we propose a unique tool, developed in the R language, which follows a classic logical bibliometric workflow that we reconstruct. We have designed and produced an R-tool for comprehensive bibliometric analyses. R is a language and environment for statistical computing and graphics (R Core Team, 2016). It provides a wide variety of statistical and graphical techniques and is highly extensible (Matloff, 2011). In addition to enabling statistical operations, it is an object-oriented and functional programming language; hence, you can automate your analyses and create new functions. It has an open-software nature, which means it is well supported by the user community and new functions are regularly contributed by users, many of whom are prominent statisticians. As it is programmed in R, the proposed tool is flexible, can be rapidly upgraded, and can be integrated with other statistical R-packages. It is therefore useful in a constantly changing field such as bibliometrics.

The aim of this paper is twofold. First, we present the proposed open-source bibliometrix R-package for performing comprehensive bibliometric analyses, comparing it to other important software tools. Secondly, we discuss how the proposed tool supports a recommended workflow for performing bibliometric studies. We illustrate the main bibliometrix functions in this workflow, using all the articles written in English on bibliometrics in the management, business, and public administration domains over a span of 30 years.

2. Recommended workflow for science mapping

The general science mapping workflow was described by Börner, Chen, and Boyack (2003). Cobo, López-Herrera, Herrera-Viedma, and Herrera (2011a) compared science mapping software tools using a similar workflow. A standard workflow consists of five stages (Zupic & Čater, 2015):

1. Study design;
2. Data collection;
3. Data analysis;
4. Data visualization;
5. Interpretation.

In study design, scholars define the research question(s) and choose the appropriate bibliometric methods that can answer the question(s). Three general types of research questions can be answered using bibliometrics for science mapping: (i) identifying the knowledge base of a topic or research field and its intellectual structure; (ii) examining the research front (or conceptual structure) of a topic or research field; and (iii) producing a social network structure of a particular scientific community. In study design, one of the most significant choices for scholars is the time span, or the decision to divide the time span into time slices. Bibliometric analysis can be performed at a specific point in time to represent a static picture of the field at that moment; alternatively, the time span can be divided into multiple time periods to capture the development of the field through time.

In data collection, scholars select the database that contains the bibliometric data, filter the core document set, and export the data from the selected database. This step can involve constructing one's own database (Waltman, 2016).

For data analysis, one or more bibliometric or statistical software tools are employed. Alternatively, scholars can write their own computer code to meet their requirements.

The fourth stage is data visualization. Scholars must decide what visualization method is to be used on the results of the third step and then employ the appropriate mapping software.

The last stage is interpretation, where scholars interpret and describe their findings. Although bibliometric methods will frequently reveal the structure of a field differently from the classification of traditional literature reviews, they are not a substitute for extensive reading in the field. Scholars with in-depth knowledge of the field have a clear, distinctive advantage.

The second to fourth stages are typically software-assisted and include different sub-stages.

2.1. Data collection

Data collection is divided into three sub-stages. The first is data retrieval. Many online bibliographic databases, where metadata regarding scientific works are stored, can be sources of bibliographic information, such as Clarivate Analytics Web of Science (WoS at http://www.webofknowledge.com), Scopus (http://www.scopus.com), Google Scholar (http://scholar.google.com), and ScienceDirect (http://www.sciencedirect.com/) (Cobo et al., 2011a). They do not cover the scientific fields and journals in the same manner and hence the choice is not neutral (Waltman, 2016; Zupic & Čater, 2015). Other similar databases exist for specific disciplines (e.g., Medline, Astrophysics Data System), patent data, and digital materials (e.g., arXiv, DBLP, CiteSeerXPatent).

Table 1
Most common bibliometric techniques per unit of analysis (adapted from Cobo et al., 2011).

Bibliometric technique    Unit of analysis used                      Kind of relation
Bibliographic coupling    Author                                     Common references in authors' oeuvres
                          Document                                   Common references in documents
                          Journal                                    Common references in journals' oeuvres
Co-citation               Author                                     Co-cited authors
                          Reference                                  Co-cited documents
                          Journal                                    Co-cited journals
Co-author                 Author                                     Co-occurrence of authors in the author list of a document
                          Country from affiliation                   Co-occurrence of countries in the address list of a document
                          Institution from affiliation               Co-occurrence of institutions in the address list of a document
Co-word                   Keyword, or term extracted from title,     Co-occurrence of terms in a document
                          abstract or document's body

The second sub-stage is data loading and converting, where scholars must convert data into a suitable format for the employed bibliometric tools.

The final sub-stage is data cleaning. The quality of the result depends on the quality of the data. Several preprocessing methods can be applied, for example, to detect duplicate and misspelled elements. Although the majority of bibliometric data are reliable, cited references can contain multiple versions of the same publication and different spellings of an author's name. Moreover, because authors are typically abbreviated by their surname and initials, a problem can arise with common names. Cited journals can also appear in slightly different forms. Books have different editions, which can appear as different citations.

2.2. Data analysis

Data analysis entails descriptive analysis and network extraction. Different approaches have been developed to extract networks using different units of analysis (Table 1). For example, co-word analysis (Callon, Courtial, Turner, & Bauin, 1983) uses the most important words or keywords of documents to study the conceptual structure of a research field. It is the only method that uses the actual content of the documents to construct a similarity measure; the others connect documents indirectly through citations. Co-word analysis produces semantic maps of a field that facilitate the understanding of its cognitive structure. It can be applied to document keywords, abstracts, or full texts. The unit of analysis is usually a concept or keyword, not a document, author, or journal. Another common bibliometric analysis is co-author analysis, which examines the authors and their affiliations to study the social structure and collaboration networks (Glänzel, 2001; Peters & Van Raan, 1991). The most common analysis in bibliometrics is citation analysis. It employs citation counts as a measure of similarity between documents, authors, and journals. Citation analysis can be decomposed into bibliographic coupling and co-citation analysis. Examples are author coupling (Zhao & Strotmann, 2008), author co-citation (White & McCain, 1998; White & Griffith, 1981), journal coupling (Gao & Guan, 2009; Small & Koenig, 1977; Yan & Ding, 2012), and journal co-citation (McCain, 1991).

A bibliographic coupling connection is established by the authors of the articles in question, whereas a co-citation connection is established by the authors who are citing the documents analysed. That is, bibliographic coupling (Kessler, 1963) analyses the citing documents, whereas co-citation analysis (Small, 1973) studies the cited documents. Although bibliographic coupling is helpful in detecting the connections of research groups (Yang, Han, Wolfram, & Zhao, 2016), co-citation analysis, when examined over time, is also helpful in detecting a shift in paradigms and schools of thought. The choice of the technique to employ depends on the goals of the analysis. Usually, co-citation analysis is performed for mapping older papers (prospective analysis – it is dynamic and is best performed within different time slices), whereas bibliographic coupling is used to map a current research front (retrospective analysis – it does not change over time). Recently, Klavans and Boyack (2017) suggested that direct citations are more accurate in representing a research front than bibliographic coupling and co-citation.
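To make the co-citation idea concrete, the following minimal Python sketch counts how many citing documents cite each pair of references together. It is only an illustration of the concept, not bibliometrix code: the toy data, the reference labels, and the function name cocitation_counts are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each citing document mapped to the references it cites.
citing_docs = {
    "doc1": ["Small 1973", "Kessler 1963", "White 1981"],
    "doc2": ["Small 1973", "Kessler 1963"],
    "doc3": ["Small 1973", "White 1981"],
}


def cocitation_counts(reference_lists):
    """Count, for every pair of references, how many documents cite both."""
    counts = Counter()
    for refs in reference_lists:
        # sorted(set(...)) drops duplicates and fixes the order inside a pair,
        # so ("A", "B") and ("B", "A") are counted as the same pair.
        for pair in combinations(sorted(set(refs)), 2):
            counts[pair] += 1
    return counts


cocit = cocitation_counts(citing_docs.values())
for pair, n in sorted(cocit.items()):
    print(pair, n)
```

Because the counts depend on who cites the references later, they can grow as new citing documents appear, which is why co-citation analysis is described above as dynamic.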

Once the network has been built, a normalization process is commonly performed over the relations (edges) between its nodes (vertices) using similarity measures such as Salton's cosine, Jaccard's coefficient, and Pearson's correlation.
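As a rough illustration of this normalization step, the Python sketch below applies the standard textbook definitions of Salton's cosine and Jaccard's coefficient to a small co-occurrence matrix. It assumes a symmetric matrix whose diagonal holds the total occurrences of each item; the matrix values and function names are hypothetical, and the details of any particular software implementation may differ.

```python
import math


def salton_cosine(c_ij, c_ii, c_jj):
    """Salton's cosine: co-occurrences divided by the geometric mean of occurrences."""
    return c_ij / math.sqrt(c_ii * c_jj)


def jaccard(c_ij, c_ii, c_jj):
    """Jaccard's coefficient: co-occurrences divided by the size of the union."""
    return c_ij / (c_ii + c_jj - c_ij)


def normalize(matrix, measure):
    """Apply a similarity measure to every off-diagonal cell of a co-occurrence matrix.

    Assumption: matrix[i][i] holds the number of occurrences of item i.
    The diagonal of the result is left at 0.0 (self-similarity is not used).
    """
    n = len(matrix)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and matrix[i][i] > 0 and matrix[j][j] > 0:
                out[i][j] = measure(matrix[i][j], matrix[i][i], matrix[j][j])
    return out


# Hypothetical keyword co-occurrence matrix for three keywords.
cooc = [
    [4, 2, 0],
    [2, 9, 3],
    [0, 3, 4],
]
cos = normalize(cooc, salton_cosine)
jac = normalize(cooc, jaccard)
```

Both measures rescale raw counts so that frequently occurring items do not dominate the network simply because they occur often.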

Finally, data reduction is helpful in identifying subfields. With the normalized data, different techniques can be used to build the map. Various dimensionality reduction techniques can be applied, such as principal component analysis/factor analysis, multidimensional scaling (MDS), multiple correspondence analysis (MCA), and clustering algorithms.

2.3. Data visualization

Analysis methods allow useful knowledge to be extracted from data and represented through intuitive visualizations or maps such as bi-dimensional maps, dendrograms, and social networks. Network analysis allows us to perform a statistical analysis over the maps generated to indicate different measures of the entire network or measures of the relationship or the overlapping of the different clusters detected.

Visualization techniques are used to represent a science map and the result of the different analyses. For example, networks can be represented using heliocentric maps (de Moya-Anegon et al., 2005), geometrical models (Skupin, 2009), thematic networks (Bailón-Moreno, Jurado-Alameda, & Ruiz-Baños, 2006; Cobo, López-Herrera, Herrera-Viedma, & Herrera, 2011b), or maps where the proximity between items represents their similarity (van Eck & Waltman, 2010). Alternatively, temporal analysis aims to indicate the conceptual, intellectual, or social evolution of the research field by discovering patterns, trends, seasonality, and outliers. Burst detection, a temporal analysis, aims to identify features that have high intensity over finite durations of time periods. To demonstrate the evolution in different time periods, cluster strings (Small, 2006; Small & Upham, 2009; Upham & Small, 2010) and thematic areas (Cobo et al., 2011b) can be used. Finally, geospatial analysis aims to discover where an event occurs and its impact on the neighbouring areas.

3. Related bibliometric software tools

3.1. Software tools for science mapping

Numerous software tools support bibliometric analysis; however, many of these do not assist scholars in a complete recommended workflow. The most relevant tools are CitNetExplorer (van Eck & Waltman, 2014), VOSviewer (van Eck & Waltman, 2010), SciMAT (Cobo, López-Herrera, Herrera-Viedma, & Herrera, 2012), BibExcel (Persson, Danell, & Schneider, 2009), Science of Science (Sci2) Tool (Sci2 Team, 2009), CiteSpace (Chen, 2006), and VantagePoint (www.thevantagepoint.com).

CitNetExplorer and VOSviewer are two free Java applications, designed by van Eck and Waltman, for analysing and visualizing citation networks of scientific collections. CitNetExplorer allows the user to (i) analyse the development of a research field over time, (ii) identify the core literature on a research topic, and (iii) explore the publication oeuvre of a researcher and its influence on the publications of other researchers. VOSviewer addresses the graphical representation of bibliometric maps and is especially useful for displaying large bibliometric maps in an easy-to-interpret manner.

SciMAT is an open source software tool developed to perform a science mapping analysis under a longitudinal framework. SciMAT provides three different modules: (i) management of a knowledge base and its entities; (ii) science mapping analysis; and (iii) visualization of the generated results.

BibExcel is designed to assist a scholar in analysing bibliographic data, or any data of a textual nature formatted in a similar manner. It generates data files that can be imported into Excel, or any program that accepts tabbed data records, for further processing. However, BibExcel does not include any module to visualize and map the results.

The Science of Science (Sci2) Tool is free software that supports the temporal, geospatial, topical, and network analysis and visualization of bibliographic collections.

CiteSpace is a free Java application for visualizing and analysing trends and patterns in scientific literature. It focuses on identifying critical points in the development of a field or a domain, especially intellectual turning points and pivotal points.

VantagePoint is commercial software for science mapping analysis. Its major strength is the ability to read virtually any structured text content. It supports more than 190 different import filters. Moreover, VantagePoint includes a tool for visualizing the main bibliometric maps.

3.2. R-packages for bibliometric analysis

In the R environment, other packages addressing bibliometrics have been published recently on the official repository (CRAN, The Comprehensive R Archive Network, https://cran.r-project.org/). Each of them provides specific analysis functions; however, none addresses the entire workflow. For example, the primary aim of CITAN (Gagolewski, 2011) – CITation ANalysis package for R statistical computing Environment – is to support scholars with a tool (i) for preprocessing and cleaning bibliographic data retrieved from Scopus and (ii) for calculating the most popular indices of scientific impact. Moreover, CITAN provides metrics such as h-index, g-index, and L-index. Unlike bibliometrix, CITAN (i) can use only data from Scopus and (ii) has no functions for co-citation, bibliographic coupling, scientific collaboration, co-word analysis, or text extraction from titles and abstracts.

ScientoText (Uddin, 2016) is another recent package that is perhaps the most comparable to the bibliometrix R-package. Nevertheless, although ScientoText states that it uses data from the WoS and Scopus databases, it currently has no functions for importing and converting data.

H-indexCalculator (Alavifard, 2015) uses only data from the Clarivate Analytics WoS for calculating the h-index.

Finally, Scholar (Keirstead, 2015) offers functionalities similar to those of the well-known software tool Publish or Perish (Harzing, 2007). It enables data to be extracted from Google Scholar for one or more researchers for analysing citations and calculating certain impact metrics. As with CITAN, Scholar does not include any function for co-citation, bibliographic coupling, scientific collaboration, co-word analysis, or text extraction from titles and abstracts. Moreover, it can only use data from Google Scholar, with all its limitations with respect to the WoS and Scopus databases (Bar-Ilan, 2007; Yang & Meho, 2006).

4. bibliometrix and the recommended science mapping workflow

The bibliometrix R-package (http://www.bibliometrix.org) provides a set of tools for quantitative research in bibliometrics and scientometrics. It is written in the R language, which is an open-source environment and ecosystem. The existence of substantial, effective statistical algorithms, access to high-quality numerical routines, and integrated data visualization tools are perhaps the strongest reasons to prefer R to other languages for scientific computation.

Fig. 1 illustrates the bibliometrix workflow supporting the second through fourth stages of the recommended science mapping workflow presented in Section 2.

Fig. 1. bibliometrix and the recommended science mapping workflow.

1. Data collection. bibliometrix supports the following sub-stage:
   a. Data loading and conversion to R data frame (Section 4.1).
2. Data analysis, articulated in three sub-stages:
   a. Descriptive analysis of a bibliographic data frame (Section 4.2);
   b. Network creation for bibliographic coupling, co-citation, collaboration, and co-occurrence analyses (Section 4.3);
   c. Normalization (Section 4.4).
3. Data visualization:
   a. Conceptual structure mapping (Section 4.3e);
   b. Network mapping (Section 4.5).

To describe the main functions of bibliometrix (Table 2), we analysed articles on bibliometrics in the management and business fields between 1985 and 2015. The data is available on the bibliometrix website (http://www.bibliometrix.org/datasets/bibliometricmanagementbusinesspa.txt).

4.1. Data loading and converting to R data frame

Data collection is a task composed of different subtasks as follows.

• Data retrieval. bibliometrix works with data extracted from the two main bibliographic databases, namely Clarivate Analytics WoS and Scopus. The bibliometrix tutorial (http://www.bibliometrix.org/index.html#header3-16) assists scholars with querying these databases. Moreover, bibliometrix connects with the Scopus API to automatically collect metadata regarding the complete scientific production of a list of scholars. In the example that follows, we used Clarivate Analytics – Web of Science core collection – Social Sciences Citation Index (SSCI) and Science Citation Index Expanded (SCI-Expanded) and chose (1) the generic keyword "bibliometric*" as the topic, (2) only articles written in English for the document type, (3) "management", "business", and "public administration" as subject categories, and (4) the time span 1985–2015.

• Data loading and converting (hereinafter, square brackets denote the R syntax for commands). The export files are read into R with the readFiles function (Table 2) [… businesspa.txt)], which creates a large character object called D. The function supports plain text (for the Clarivate Analytics database) and BibTeX (for both Clarivate Analytics and Scopus databases) formats and allows multiple export files to be imported simultaneously. These can be converted into a data frame using the convert2df function [M <- convert2df(D, dbsource = "isi", format = "plaintext")]. convert2df creates a bibliographic data frame with cases corresponding to documents and variables to field tags in the original export file. Each document contains several elements such as authors' names, title, keywords and other information. These elements constitute the bibliographic attributes of a document, also called the metadata. We have chosen to use standard column names for the bibliographic data frame, adopting the field tags proposed by Clarivate Analytics also for Scopus collections. This facilitates merging different sources and applying R routines. Table 3 contains the structure of the bibliometrix data frame considering the main field tags. The column "class" reports the data type of each data frame column.

Table 2
Main bibliometrix functions.

Data loading and converting (output: bibliographic data frame)
  readFiles()              Loads a sequence of Scopus and Clarivate Analytics WoS export files into R
  convert2df()             Creates a bibliographic data frame
  retrievalByAuthorID()    Uses Scopus API search to obtain information regarding documents on a set of authors using Scopus ID

Descriptive bibliometric analysis (output: tables of results)
  biblioAnalysis()         Returns an object of class bibliometrix
  summary() and plot()     Summarize the main results of the bibliometric analysis
  citations()              Identifies the most cited references or authors
  localCitations()         Identifies the most cited local authors
  dominance()              Calculates the authors' dominance ranking
  Hindex()                 Measures productivity and citation impact of a scholar
  lotka()                  Estimates Lotka's law coefficients for scientific productivity
  keywordGrowth()          Calculates yearly cumulative occurrences of top keywords/terms
  keywordAssociation()     Associates authors' keywords to keywords plus

Document x Attribute matrix creation (output: Document x Attribute matrix)
  metaTagExtraction()      Extracts other field tags, different from the standard WoS/Scopus codify
  termExtraction()         Extracts and stems terms from textual fields (abstract, title, author's keywords, and others) of a bibliographic data frame
  cocMatrix()              Computes a Document x Attribute matrix

Normalization (output: similarity matrix)
  normalizeSimilarity()    Calculates association strength, inclusion index, Jaccard's coefficient, and Salton's similarity coefficient among objects of a bibliographic network

Data reduction (output: word occurrence matrix, MCA, and clustering results)
  conceptualStructure()    Creates conceptual structure map of a scientific field using MCA and clustering

Network matrix creation (output: network matrix and historical network matrix)
  biblioNetwork()          Calculates the most frequently used bibliographic coupling, co-citation, collaboration, and co-occurrence networks
  histNetwork()            Creates a historical co-citation network from a bibliographic data frame

Mapping (output: network graph, network Pajek format for VOSviewer, historiograph, and semantic map)
  networkPlot()            Plots a bibliographic network using internal R library or VOSviewer software
  histPlot()               Plots a historical direct citation network
  conceptualStructure()    Plots conceptual structure map of a scientific field using MCA and clustering

Table 3
bibliometrix data frame structure.

Field Tag    Class              Description
UT           CHARACTER          Unique Article Identifier
AU           CHARACTER          Authors
TI           CHARACTER          Document Title
SO           CHARACTER          Publication Name (or Source)
JI           CHARACTER          ISO Source Abbreviation
DT           CHARACTER          Document Type
DE           CHARACTER          Authors' Keywords
ID           CHARACTER          Keywords associated by WoS or Scopus database
AB           LARGE CHARACTER    Abstract
C1           CHARACTER          Author Address
RP           CHARACTER          Reprint Address
CR           LARGE CHARACTER    Cited References
TC           NUMERIC            Times Cited
PY           NUMERIC            Year
SC           CHARACTER          Subject Category
DB           CHARACTER          Bibliographic Database

Table 4
Element list of a bibliometrix object.

List element          Description
Articles              Total number of documents
Authors               Authors' frequency distribution
AuthorsFrac           Authors' frequency distribution (fractionalized)
FirstAuthors          First author of each document
nAUperPaper           Number of authors per document
Appearances           Number of author appearances
nAuthors              Total number of authors
AuMultiAuthoredArt    Number of authors of multi-authored articles
Years                 Publication year of each document
FirstAffiliation      Affiliation of the first author for each document
Affiliations          Frequency distribution of affiliations (of all co-authors for each document)
Afffrac               Fractionalized frequency distribution of affiliations (of all co-authors for each paper)
CO                    Affiliation country of first author
Countries             Affiliation countries' frequency distribution
TotalCitation         Number of times each document has been cited
TCperYear             Yearly average number of times each document has been cited
Sources               Frequency distribution of the sources (journals, books, others)
DE                    Frequency distribution of the authors' keywords
ID                    Frequency distribution of keywords associated to the document by Clarivate Analytics Web of Science and Scopus databases

• Data cleaning. bibliometrix does not have specific routines dedicated to data cleaning. However, its main functions (e.g., loading and converting, citation analysis) include a set of cleaning rules such as: (i) transform full text into uppercase, (ii) remove non-alphanumeric characters, (iii) remove punctuation symbols and extra spaces, and (iv) truncate author's first and middle names to the initials.
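The four cleaning rules listed under "Data cleaning" can be sketched in a few lines of Python. This is a hypothetical illustration of the rules as stated, not the code used inside bibliometrix: the function names clean_text and abbreviate_author are invented here, author names are assumed to arrive as "Surname, Given names", and only ASCII letters and digits are kept.

```python
import re


def clean_text(text):
    """Rules (i)-(iii): uppercase, drop non-alphanumeric characters, collapse spaces."""
    text = text.upper()
    # Assumption: anything outside A-Z, 0-9 and space is replaced by a space,
    # which also removes punctuation symbols.
    text = re.sub(r"[^A-Z0-9 ]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def abbreviate_author(name):
    """Rule (iv): keep the surname and truncate first and middle names to initials.

    Assumption: the input has the form 'Surname, Given names'.
    """
    surname, _, given = name.partition(",")
    initials = "".join(part[0] for part in re.split(r"[\s.\-]+", given) if part)
    return clean_text(f"{surname} {initials}")


print(clean_text("Science-mapping:  an R tool!"))
print(abbreviate_author("Kostoff, Ronald N."))
```

Reducing every name to a surname plus initials makes variants such as "Kostoff, Ronald N." and "Kostoff, R. N." collapse to the same string, at the cost of merging distinct authors who share a surname and initials.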

4.2. Descriptive analysis of a bibliographic data frame

The descriptive analysis of the bibliographic data frame uses many functions.

• The biblioAnalysis function calculates the main bibliometric measures using simple syntax [results <- biblioAnalysis(M, sep = ";")]. The biblioAnalysis function returns an object of class "bibliometrix", which is a list containing the elements reported in Table 4.

• The functions summary and plot summarize the main results of the bibliometric analysis. They display the principal information regarding the bibliographic data frame and six tables. summary accepts two additional arguments: k is a formatting value that indicates the number of rows for each table; pause is a logical value (TRUE or FALSE) used to permit (or not) a pause in screen scrolling. For example, choosing k = 10, we expressed the desire to view the first ten authors or first ten sources. The results are displayed in Tables 5–10 and in Fig. 2.

Table 5
Descriptive analysis: Main information regarding the collection.

Description                            Value
Articles                               304
Period                                 1985–2015
Annual Percentage Growth Rate          12.40
Average citations per article          26.56
Authors                                617
Author Appearances                     801
Authors of single authored articles    32
Authors of multi authored articles     585
Articles per Author                    0.493
Authors per Article                    2.03
Co-Authors per Articles                2.63
Collaboration Index                    2.49

Table 6
Descriptive analysis: Top 10 – Most productive authors.

Author         No. of Articles    Author         No. of Articles Fractionalized
Kostoff RN     16                 Kostoff RN     7.77
Kajikawa Y     9                  Vogel R        3.50
Porter AL      9                  Porter AL      3.46
Abramo G       5                  Kajikawa Y     3.00
D'Angelo CA    5                  Shilbury D     3.00
Moed HF        5                  Talukdar D     2.33
Bowles CA      4                  Hicks D        2.08
Hicks D        4                  Eom SB         2.00
Lee PC         4                  Mcmillan GS    2.00
Sakata I       4                  Saetren H      2.00

Table 7
Descriptive analysis: Top 10 – Most cited papers.

Paper                                                                             Total Citations (TC)    TC per Year
Chen HC, Chiang RHL, Storey VC, (2012), Mis Q.                                    386                     77.20
Daim TU, Rueda G, Martin H, Gerdsri P, (2006), Technol. Forecast. Soc. Chang.     240                     21.82
Moed HF, Burger WJM, Frankfort JG, Vanraan AFJ, (1985), Res. Policy               232                     7.25
Kostoff RN, Scaller RR, (2001), IEEE Trans. Eng. Manage.                          220                     13.75
Loh L, Venkatraman N, (1992), Inf. Syst. Res.                                     210                     8.40
Volberda HW, Foss NJ, Lyles MA, (2010), Organ Sci.                                202                     28.86
Ramos-Rodriguez AR, Ruiz-Navarro J, (2004), Strateg. Manage. J.                   190                     14.62
Murray F, (2002), Res. Policy                                                     187                     12.47
Melin G, (2000), Res. Policy                                                      187                     11.00
Gambardella A, (1992), Res. Policy                                                139                     5.56

Table 8
Descriptive analysis: Top 10 – Most productive countries (based on first author's affiliation).

Country        No. of Articles    % of Articles
USA            89                 29.5
Netherlands    25                 8.3
England        21                 6.9
Germany        20                 6.6
Italy          20                 6.6
Japan          15                 5.0
Spain          15                 5.0
Sweden         10                 3.3
Taiwan         10                 3.3
Australia      8                  2.7

Table 9
Descriptive analysis: Top 10 – Most frequent journals.

Sources                                            No. of Articles    % of Articles
Research Policy                                    58                 19.1
Technological Forecasting and Social Change        56                 18.4
Technology Analysis & Strategic Management         17                 5.6
Technovation                                       15                 4.9
International Journal of Technology Management     6                  2.0
R&D Management                                     6                  2.0
Science and Public Policy                          6                  2.0
African Journal of Business Management             5                  1.6
Journal of Technology Transfer                     5                  1.6
Journal of Business Ethics                         4                  1.3

Table 10
Descriptive analysis: Top 10 – Most frequent keywords.

Author Keywords (DE)     No. of Articles    Keywords-Plus (ID)    No. of Articles
Bibliometrics            87                 Science               82
Bibliometric Analysis    30                 Innovation            39
Citation Analysis        26                 Performance           33
Nanotechnology           17                 Journals              30
Research                 17                 Technology            30
Innovation               16                 Impact                28
Analysis                 13                 Knowledge             26
Scientometrics           13                 Management            26
Text Mining              10                 Bibliometrics         25
Patent Analysis          9                  Citation Analysis     25

Fig. 2. Publications per year 1985–2015.

• The citations function generates the frequency table of the most cited references or the most cited first authors (of references). For each document, cited references are in a single string stored in the "CR" column of the data frame. For a correct extraction, we must identify the separator field among different references used by the selected database. Typically, the WoS default separator is ";". The bibliometrix tutorial also describes other separators.

Cited references frequently have numerous inconsistencies in the data format. For example, some databases, such as Scopus, do not have a standardized format. The citations function also implements a set of cleaning rules as described in Section 4.1.

Table 11 contains the most frequently cited documents [CR <- citations(M, field = "article", sep = ";")]. The localCitations function [CR <- localCitations(M, results, sep = ";")] generates the frequency table of the most local cited authors. Local citations measure how many times an author included in this collection has been cited by other authors also in the collection. Table 12 reports the most frequent local authors.

Table 11
Citation analysis: Top 10 – Most cited references.

Cited Reference                                                                        Citations
Ramos-Rodriguez AR, 2004, Strategic Manage J, V25, P981, Doi 101002/Smj397             32
Small H, 1973, J Am Soc Inform Sci, V24, P265, Doi 101002/Asi4630240406                29
Mccain KW, 1990, J Am Soc Inform Sci, V41, P433                                        26
Nelson RR, 1982, Evolutionary Theory                                                   22
White HR, 1981, J Am Soc Inform Sci, V32, P163, Doi 101002/Asi4630320302               21
Price DJ, 1963, Little Sci Big Sci                                                     19
Cohen WM, 1990, Admin Sci Quart, V35, P128, Doi 102307/2393553                         18
Daim TU, 2006, Technol Forecast Soc, V73, P981, Doi 101016/Jtechfore200604004          18
Hoffman DL, 1993, J Consum Res, V19, P505, Doi 101086/209319                           18
Nerur SP, 2008, Strateg Manage J, V29, P319, Doi 101002/SMJ659                         18
Small H, 1974, Sci Stud, V4, P17, Doi 101177/030631277400400102                        18
White HD, 1998, J Am Soc Inform Sci, V49, P327                                         18

Table 12
Local citation analysis: Top 10 – Most cited authors.

Local Cited Author    Citations
Kostoff RN            317
Narin F               103
Porter AL             96
Leydesdorff L         84
Moed HF               71
Pilkington A          53
Hicks D               48
Meyer M               47
Martin B              45
Pavitt K              43

• The authors' h-index is an author-level metric that attempts to measure both the productivity and citation impact of the publications of a scientist or scholar (Hirsch, 2005). The index is based on the set of the scientist's most cited papers and the number of citations that they have received in other publications. The Hindex function calculates the authors' h-index and its variants (g-index and m-index) in a bibliographic collection (van Eck & Waltman, 2008). Function arguments are the following: M, a bibliographic data frame, and authors, a character vector containing the authors' names for which you want to calculate the h-index. For example, to calculate the h-index, in this collection, for author Ronald Kostoff, you would use [Hindex(M, authors = "KOSTOFF R", sep = ";")].
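For readers unfamiliar with these indices, the following Python sketch computes the h-index and the g-index from a list of per-paper citation counts, using their usual definitions: h is the largest number such that h papers have at least h citations each, and g is the largest number such that the top g papers together have at least g squared citations. It is a plain illustration with hypothetical function names and data, not the implementation behind the Hindex function; the g-index here is capped at the number of papers, which is one common convention, and the m-index is not covered.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


def g_index(citations):
    """Largest g such that the g most cited papers have at least g*g citations in total.

    Assumption: g cannot exceed the number of papers.
    """
    ranked = sorted(citations, reverse=True)
    total, g = 0, 0
    for rank, cites in enumerate(ranked, start=1):
        total += cites
        if total >= rank * rank:
            g = rank
    return g


# Hypothetical citation counts for one author's papers.
papers = [10, 8, 5, 4, 3]
print(h_index(papers), g_index(papers))
```

The g-index gives more weight to highly cited papers than the h-index, so it is never smaller than h for the same list of papers.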

4.3. Network creation for bibliographic coupling, co-citation, collaboration, and co-occurrence analyses

A document’s attributes are connected to each other through the document itself (e.g., author(s) to journal, keywords to publication date). These connections between different attributes can be represented through a Document × Attribute matrix.

cocMatrix is the general function to create a rectangular Document × Attribute matrix, which we call A. An attribute is an item of information associated with the document and stored in a field tag within the bibliographic data frame (e.g., authors, publication source, keywords, cited references, affiliations).

In some cases, this matrix can be interpreted as a bipartite or two-mode network (e.g., where the attribute is author, keyword, or cited reference). For example, to create a Document × Cited reference matrix, we must use the field tag "CR" [A <- cocMatrix(M, Field = "CR", sep = ";")]. In this case, A is a rectangular binary matrix (and also a bipartite network) where each row is a document and each column is a cited reference of the collection. The generic element $a_{ij}$ is "1" if document i has cited reference j; otherwise it is "0". The j-th column sum $a_{+j}$ is the number of documents citing reference j. The i-th row sum $a_{i+}$ is the number of references cited by document i.
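The structure of A can be illustrated with a small hand-made example in base R. The matrix below is invented toy data (three documents, four cited references) and is not produced by cocMatrix; it only shows how the row and column sums described above are read.

```r
# Toy Document x Cited reference matrix (invented data, not cocMatrix output).
A <- matrix(c(1, 1, 0, 0,
              1, 0, 1, 0,
              1, 1, 1, 1),
            nrow = 3, byrow = TRUE,
            dimnames = list(paste0("doc", 1:3), paste0("ref", 1:4)))

col_sums <- colSums(A)  # a_{+j}: number of documents citing reference j
row_sums <- rowSums(A)  # a_{i+}: number of references cited by document i
```

Here ref1 is cited by all three documents, and doc3 cites all four references.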

Using the cocMatrix function, several matrices can be computed, such as:

• Document × Author [A <- cocMatrix(M, Field = "AU", sep = ";")];

• Document × Country. Authors’ countries is not a standard attribute of the bibliographic data frame. We must extract this information from the affiliation attribute using the metaTagExtraction function [M <- metaTagExtraction(M, Field = "AU_CO", sep = ";"); A <- cocMatrix(M, Field = "AU_CO", sep = ";")]. metaTagExtraction allows the following additional field tags to be extracted: authors’ countries (Field = "AU_CO"), first author of each cited reference (Field = "CR_AU"), publication source of each cited reference (Field = "CR_SO"), and affiliation for each co-author (Field = "AU_UN");

• Document × Authors’ keyword [A <- cocMatrix(M, Field = "DE", sep = ";")] or Document × Keyword Plus [A <- cocMatrix(M, Field = "ID", sep = ";")].

a) Bibliographic coupling

Two articles are said to be bibliographically coupled if at least one cited source appears in the bibliographies or reference lists of both articles (Kessler, 1963). A bibliographic coupling network can be obtained using the general formula:

$$B_{coup} = A \times A^{\top}$$

where A is a Document × Cited reference matrix and $A^{\top}$ is its transpose. Element $b_{ij}$ indicates how many bibliographic couplings exist between documents i and j. $B_{coup}$ is a non-negative and symmetric matrix, $B_{coup} = B_{coup}^{\top}$.

The strength of the bibliographic coupling of two articles i and j is defined simply by the number of references that the articles have in common, as given by the element $b_{ij}$ of matrix $B_{coup}$.

The biblioNetwork function calculates, starting from a bibliographic data frame, the most frequently used bibliographic coupling networks, such as those of documents, authors, sources, keywords, and countries. To use biblioNetwork, two arguments must be set. The first is the type of analysis: in this case, the analysis argument is "coupling" (the alternatives are "co-citation", "collaboration", and "co-occurrences"). The second is the network unit of analysis, which can be "authors", "references", "sources", "countries", "keywords", "author_keywords", "titles", or "abstracts".

The following code calculates a classical document bibliographic coupling network [NetMatrix <- biblioNetwork(M, analysis = "coupling", network = "references", sep = ";")]. Articles with only a small number of references would tend to be more weakly bibliographically coupled when bibliographic coupling strength is simply measured according to the number of references that articles have in common.
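The coupling formula can be checked by hand on a toy matrix. The base-R sketch below uses invented data and plain matrix algebra rather than biblioNetwork; the object names are illustrative only.

```r
# Toy Document x Cited reference matrix (invented data).
A <- matrix(c(1, 1, 0, 0,
              1, 0, 1, 0,
              1, 1, 1, 1),
            nrow = 3, byrow = TRUE,
            dimnames = list(paste0("doc", 1:3), paste0("ref", 1:4)))

# Bibliographic coupling between documents: A times its transpose.
B_coup <- A %*% t(A)   # equivalent to tcrossprod(A)

shared_1_3 <- B_coup["doc1", "doc3"]  # references shared by doc1 and doc3
refs_per_doc <- diag(B_coup)          # for a binary A, the diagonal equals the row sums of A
```

In this toy example doc1 cites only two references, so its coupling strength with any other document cannot exceed two, which is the weak-coupling effect noted above for articles with short reference lists.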

b) Co-citation analysis

Co-citation of two articles occurs when both are cited in a third article. Thus, co-citation is the counterpart of bibliographic coupling. A co-citation network can be obtained using the general formula:

$$B_{cocit} = A^{\top} \times A$$

where A is a Document × Cited reference matrix and $A^{\top}$ is its transpose.

Similar to matrix $B_{coup}$, matrix $B_{cocit}$ is also symmetric. Element $b_{ij}$ indicates how many co-citations exist between cited documents i and j. The main diagonal of $B_{cocit}$ contains the number of documents in our data frame in which a reference is cited. That is, the diagonal element $b_{ii}$ is the number of local citations of reference i. The biblioNetwork function provides a classical reference co-citation network [NetMatrix <- biblioNetwork(M, analysis = "co-citation", network = "references", sep = ";")].
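The same toy data can illustrate the co-citation product and the meaning of its diagonal. This is a base-R sketch on invented data, not the biblioNetwork implementation.

```r
# Toy Document x Cited reference matrix (invented data).
A <- matrix(c(1, 1, 0, 0,
              1, 0, 1, 0,
              1, 1, 1, 1),
            nrow = 3, byrow = TRUE,
            dimnames = list(paste0("doc", 1:3), paste0("ref", 1:4)))

# Co-citation between cited references: transpose of A times A.
B_cocit <- t(A) %*% A   # equivalent to crossprod(A)

cocit_1_2 <- B_cocit["ref1", "ref2"]  # documents citing both ref1 and ref2
local_citations <- diag(B_cocit)      # documents citing each reference
```

ref1 and ref2 are cited together by doc1 and doc3, so their co-citation count is 2, and the diagonal reproduces the column sums of A.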

c) Collaboration analysis

A scientific collaboration network is a network where nodes are authors and links are co-authorships. Co-authorship is one of the most well-documented forms of scientific collaboration (Glänzel & Schubert, 2004). An author collaboration network can be obtained using the general formula:

$$B_{coll} = A^{\top} \times A$$

where A is a Document × Author matrix and $A^{\top}$ is its transpose. Element $b_{ij}$ indicates how many collaborations exist between authors i and j. The diagonal element $b_{ii}$ is the number of documents authored or co-authored by researcher i. The biblioNetwork function calculates an authors’ collaboration network [NetMatrix <- biblioNetwork(M, analysis = "collaboration", network = "authors", sep = ";")] or a country collaboration network [NetMatrix <- biblioNetwork(M, analysis = "collaboration", network = "countries", sep = ";")].
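As a minimal sketch of how such a matrix arises from delimited author fields, the base-R code below builds a Document × Author matrix from three invented author strings separated by ";" and then forms the collaboration matrix. The author names and object names are placeholders, and the parsing is a simplification of what a real bibliographic data frame requires.

```r
# Invented author fields, one string per document, names separated by ";".
AU <- c("AUTHOR A;AUTHOR B",
        "AUTHOR A",
        "AUTHOR B;AUTHOR C;AUTHOR A")

author_lists <- strsplit(AU, ";", fixed = TRUE)
authors <- sort(unique(unlist(author_lists)))

# Document x Author binary matrix: 1 if the author signed the document.
A <- t(sapply(author_lists, function(x) as.numeric(authors %in% x)))
dimnames(A) <- list(paste0("doc", seq_along(AU)), authors)

# Author collaboration matrix: transpose of A times A.
B_coll <- crossprod(A)

joint_AB <- B_coll["AUTHOR A", "AUTHOR B"]  # documents co-authored by A and B
docs_per_author <- diag(B_coll)             # documents signed by each author
```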

d) Co-word analysis

The aim of co-word analysis is to draw the conceptual structure of a framework using a word co-occurrence network to map and cluster terms extracted from keywords, titles, or abstracts in a bibliographic collection [NetMatrix <- biblioNetwork(M, analysis = "co-occurrences", network = "keywords", sep = ";")].

A co-word network can be obtained using the general formula:

$$B_{coc} = A^{\top} \times A$$

where A is a Document × Word matrix, $A^{\top}$ is its transpose, and Word is, alternatively, authors’ keywords, Keywords Plus, or terms extracted from titles or abstracts. Element $b_{ij}$ indicates how many co-occurrences exist between words i and j. The diagonal element $b_{ii}$ is the number of documents containing word i.

The termExtraction function extracts terms from a textual field (e.g., abstract, title, authors’ keywords), deletes stop-words, and applies Porter’s stemming algorithm (Porter, 1980). Stemming is the process of reducing inflected (or sometimes derived) words to their word stem, base, or root form, typically a written word form. Hence, this function normalizes terms before the co-occurrence analysis is performed [M <- termExtraction(M, Field = "TI", stemming = TRUE, language = "english", verbose = TRUE)].

The bibliometrix R-package provides the conceptualStructure function, which performs multiple correspondence analysis (MCA) to draw a conceptual structure of the field and K-means clustering to identify clusters of documents that express common concepts.

MCA is an exploratory multivariate technique for the graphical and numerical analysis of multivariate categorical data (Benzécri, 1982; Greenacre & Blasius, 2006; Lebart, Morineau, & Warwick, 1984). MCA performs a homogeneity analysis of an indicator matrix to obtain a low-dimensional Euclidean representation of the original data (Gifi, 1990). In co-word analysis, MCA is applied to a Document × Word matrix A. The words are plotted on a two-dimensional map [CS <- conceptualStructure(M, field = "ID", minDegree = 5, k.max = 5, stemming = FALSE, labelsize = 5)]. The results are interpreted based on the relative positions of the points and their distribution along the dimensions: the more similar words are in distribution, the closer they are represented on the map (Fig. 3) (Cuccurullo, Aria, & Sarto, 2016).

4.4. Normalization

The normalizeSimilarity function allows the user to normalize bibliographic coupling, co-citation, and co-occurrence data by calculating a similarity measure. This function computes the following measures (van Eck & Waltman, 2009): the association strength (also called the proximity index), the inclusion index (also called Simpson’s coefficient), Jaccard’s coefficient, and Salton’s cosine.

Fig. 3. Conceptual map and keyword clusters.

The association strength is the ratio between the observed and expected strength under the assumption of probabilistic independence:

$$SA_{ij} = \frac{b_{ij}}{b_{ii}\, b_{jj}}$$

[S <- normalizeSimilarity(NetMatrix, type = "association")].

The inclusion index is an overlap metric that measures how much a set is included in another:

$$SI_{ij} = \frac{b_{ij}}{\min(b_{ii}, b_{jj})}$$

[S <- normalizeSimilarity(NetMatrix, type = "inclusion")].

Jaccard’s index (or Jaccard’s similarity coefficient) is a relative measure of the intersection of two sets. It is calculated as the ratio between the intersection and the union of the two objects:

$$SJ_{ij} = \frac{b_{ij}}{b_{ii} + b_{jj} - b_{ij}}$$

[S <- normalizeSimilarity(NetMatrix, type = "jaccard")].

Fig. 4. Keyword Plus co-occurrence network (Kamada & Kawai layout).

Salton’s index relates the intersection of the two objects to the geometric mean of the size of both:

$$SS_{ij} = \frac{b_{ij}}{\sqrt{b_{ii}\, b_{jj}}}$$

[S <- normalizeSimilarity(NetMatrix, type = "salton")].

The square of Salton’s index is also called the equivalence index [S <- normalizeSimilarity(NetMatrix, type = "equivalence")].
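The measures of this section can be reproduced with a few lines of base R, which may help when checking results by hand. The sketch below applies the four definitions, plus the equivalence index, to an invented symmetric co-occurrence matrix whose diagonal holds the item frequencies. It does not call normalizeSimilarity, and all object names are illustrative.

```r
# Toy symmetric co-occurrence matrix (invented counts); the diagonal b_ii
# holds the number of documents containing each word.
words <- c("w1", "w2", "w3")
B <- matrix(c(4, 2, 1,
              2, 3, 0,
              1, 0, 2),
            nrow = 3, byrow = TRUE, dimnames = list(words, words))
d <- diag(B)

S_assoc  <- B / outer(d, d)              # association strength: b_ij / (b_ii * b_jj)
S_incl   <- B / outer(d, d, pmin)        # inclusion index:      b_ij / min(b_ii, b_jj)
S_jacc   <- B / (outer(d, d, "+") - B)   # Jaccard:              b_ij / (b_ii + b_jj - b_ij)
S_salton <- B / sqrt(outer(d, d))        # Salton's cosine:      b_ij / sqrt(b_ii * b_jj)
S_equiv  <- S_salton^2                   # equivalence index: squared Salton's index
```

For the pair (w1, w2), with 2 co-occurrences and frequencies 4 and 3, this gives an association strength of 1/6, an inclusion index of 2/3, a Jaccard index of 0.4, and an equivalence index of 1/3.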

4.5. Network mapping

All bibliometric networks can be graphically visualized or modelled. The networkPlot function plots a network created by biblioNetwork using R routines or using the VOSviewer software by Nees Jan van Eck and Ludo Waltman (van Eck & Waltman, 2010; van Eck, Waltman, & Noyons, 2010; Waltman, van Eck, & Noyons, 2010).

Fig. 4 displays a Keyword Plus co-occurrence network using the Kamada-Kawai layout (Kamada & Kawai, 1989). The network is drawn by selecting the 40 vertices with the highest degree [COC <- biblioNetwork(M, analysis = "co-occurrences", network = "keywords", sep = ";"); networkPlot(COC, n = 40, size = TRUE, remove.multiple = TRUE, Title = "Term co-occurrences", type = "kamada", labelsize = 0.5)].

Fig. 5. Country collaboration network (Sphere layout).

Fig. 5 displays another example of a bibliographic network, considering collaboration links between countries. In this case, we used the sphere layout [M <- metaTagExtraction(M, Field = "AU_CO"); CC <- biblioNetwork(M, analysis = "collaboration", network = "countries", sep = ";"); networkPlot(CC, n = 44, size = TRUE, remove.multiple = FALSE, Title = "Country Collaboration", type = "sphere")].
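The vertex selection used for both figures (keeping the n vertices with the highest degree) can be sketched in base R as follows. The matrix is invented, degree is assumed here to mean the number of distinct neighbours, and the code is only an illustration of the idea, not the routine used by networkPlot.

```r
# Toy symmetric co-occurrence matrix for five keywords (invented links).
kw <- c("a", "b", "c", "d", "e")
COC <- matrix(0, 5, 5, dimnames = list(kw, kw))
edges <- rbind(c("a", "b"), c("a", "c"), c("a", "d"), c("b", "c"), c("b", "e"))
COC[edges] <- 1        # a two-column character matrix indexes by dimnames
COC <- COC + t(COC)    # make the matrix symmetric

# Degree assumed to be the number of distinct neighbours of each vertex.
deg <- rowSums(COC > 0)

# Keep the n vertices with the highest degree and the links among them.
n <- 3
keep <- names(sort(deg, decreasing = TRUE))[seq_len(n)]
sub_network <- COC[keep, keep, drop = FALSE]
```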

bibliometrix also performs historiographic analysis, as proposed by Garfield (2004) [histResults <- histNetwork(M, n = 20, sep = ";")]. The histPlot function plots a chronological citation network (called a historiograph; see Fig. 6 and Table 13) that represents a chronological map of the most relevant citations resulting from a bibliographic collection [histPlot(histResults, size = FALSE)].

Fig. 6. Historiograph.

5. Conclusions

Science mapping is becoming an essential activity for scholars of all scientific disciplines. As the number of publications continues to expand at increasing rates and publications develop fragmentarily, the task of accumulating knowledge becomes more complicated. Determining the intellectual structure and the research front of scientific domains is important not only for research but also for policy-making and practice.

Specialized software tools commonly perform only certain steps of science mapping analysis. Only a small number of these allow scholars to follow the complete workflow. bibliometrix is an open-source tool for executing a comprehensive science mapping analysis of scientific literature. It was programmed in R to be flexible and to facilitate integration with other statistical and graphical packages. Indeed, bibliometrics is a constantly changing science, and bibliometrix has the flexibility to be quickly upgraded and integrated. Its development can rely on a large and active community of developers formed by prominent researchers. Some advantages are direct: the sources are published on GitHub, which permits shared development. Other advantages are indirect: in an environment composed of thousands of packages, bibliometrix can be a step in a larger workflow, exploiting other R solutions.

We are already working on new developments. They concern (i) the extension of compatibility with other bibliographic databases such as PubMed, (ii) the improvement of reference disambiguation by string metric-based algorithms, (iii) the introduction of direct citation (Klavans & Boyack, 2017) and tri-citation analysis (Marion, 2002; McCain, 2009), and (iv) the use of hybrid methods that combine bibliometric and semantic approaches (Glänzel & Thijs, 2012; Thijs, Schiebel, & Glänzel, 2013). The last-mentioned development includes term-burst detection through expectile smoothing (Schnabel & Eilers, 2009), thematic mapping and evolution (Cobo et al., 2011b), and latent semantic analysis (Dumais, 2004).

Table 13
Historiograph legend.

| ID | Reference | DOI | Local citations | Total citations |
|---|---|---|---|---|
| 1985-1 | MOED HF, 1985, RES POLICY | 10.1016/0048-7333(85)90012-5 | 14 | 232 |
| 1988-2 | NARIN F, 1988, RES POLICY | 10.1016/0048-7333(88)90039-X | 9 | 44 |
| 1993-3 | NEDERHOF AJ, 1993, RES POLICY | 10.1016/0048-7333(93)90005-3 | 6 | 61 |
| 1993-4 | HOFFMAN DL, 1993, J CONSUM RES | 10.1086/209319 | 18 | 67 |
| 1995-5 | PORTER AL, 1995, TECHNOL FORECAST SOC | 10.1016/0040-1625(95)00022-3 | 8 | 89 |
| 1995-6 | USDIKEN B, 1995, ORGAN STUD | 10.1177/017084069501600306 | 11 | 80 |
| 1997-7 | WATTS RJ, 1997, TECHNOL FORECAST SOC | 10.1016/S0040-1625(97)00050-4 | 14 | 112 |
| 1998-8 | RINIA EJ, 1998, RES POLICY | 10.1016/S0048-7333(98)00026-2 | 10 | 113 |
| 2001-9 | KOSTOFF RN, 2001, IEEE T ENG MANAGE | 10.1109/17.922473 | 13 | 220 |
| 2004-10 | RAMOS-RODRIGUEZ AR, 2004, STRATEGIC MANAGE J | 10.1002/SMJ.397 | 32 | 190 |
| 2004-11 | KOSTOFF RN, 2004, TECHNOL FORECAST SOC | 10.1016/S0040-1625(03)00048-9 | 6 | 100 |
| 2005-12 | KOSTOFF RN, 2005, TECHNOL FORECAST SOC | 10.1016/J.TECHFORE.2005.02.001 | 6 | 14 |
| 2006-13 | DAIM TU, 2006, TECHNOL FORECAST SOC | 10.1016/J.TECHFORE.2006.04.004 | 18 | 240 |
| 2006-14 | SCHILDT HA, 2006, ENTREP THEORY PRACT | 10.1111/J.1540-6520.2006.00126.X | 8 | 59 |
| 2006-15 | PILKINGTON A, 2006, TECHNOVATION | 10.1016/J.TECHNOVATION.2005.01.009 | 6 | 38 |
| 2007-16 | KOSTOFF RN, 2007, TECHNOL FORECAST SOC | 10.1016/J.TECHFORE.2007.02.007 | 6 | 26 |
| 2007-17 | KOSTOFF RN, 2007, TECHNOL FORECAST SOC | 10.1016/J.TECHFORE.2007.04.004 | 7 | 47 |
| 2007-18 | BIEMANS W, 2007, J PROD INNOVAT MANAG | 10.1111/J.1540-5885.2007.00245.X | 6 | 20 |
| 2008-19 | KAJIKAWA Y, 2008, TECHNOL FORECAST SOC | 10.1016/J.TECHFORE.2008.04.007 | 7 | 44 |
| 2008-20 | SHIBATA N, 2008, TECHNOVATION | 10.1016/J.TECHNOVATION.2008.03.009 | 7 | 83 |
| 2008-21 | KAJIKAWA Y, 2008, TECHNOL FORECAST SOC | 10.1016/J.TECHFORE.2007.05.005 | 14 | 76 |
| 2008-22 | NERUR SP, 2008, STRATEG MANAGE J | 10.1002/SMJ.659 | 18 | 96 |
| 2008-23 | CHARVET FF, 2008, J BUS LOGIST | <NA> | 9 | 34 |
| 2009-24 | KAJIKAWA Y, 2009, TECHNOL FORECAST SOC | 10.1016/J.TECHFORE.2009.04.004 | 7 | 34 |
| 2009-25 | ABRAMO G, 2009, RES POLICY | 10.1016/J.RESPOL.2008.11.001 | 6 | 50 |

Author contributions

Massimo Aria, Corrado Cuccurullo: Conceived and designed the analysis; Collected the data; Contributed data or analysis tools; Performed the analysis; Wrote the paper.

Acknowledgements

The authors would like to thank the editor and referees for their helpful comments. These have allowed us to significantly improve the quality of this paper.

References

Alavifard, S. (2015). hindexcalculator: H-index calculator using data from a Web of Science (WoS) citation report. R package version 1.0.0. https://CRAN.R-project.org/package=hindexcalculator

Börner, K., Chen, C., & Boyack, K. W. (2003). Visualizing knowledge domains. Annual Review of Information Science and Technology, 37(1), 179–255.

Bailón-Moreno, R., Jurado-Alameda, E., & Ruiz-Baños, R. (2006). The scientific network of surfactants: Structural analysis. Journal of the American Society for Information Science and Technology, 57(7), 949–960.

Bar-Ilan, J. (2007). Which h-index? A comparison of WoS, Scopus and Google Scholar. Scientometrics, 74(2), 257–271.

Benzécri, J. P. (1982). L’Analyse des Données. II. L’analyse des correspondances. Paris: Dunod.

Briner, R. B., & Denyer, D. (2012). Systematic review and evidence synthesis as a practice and scholarship tool. In Handbook of evidence-based management: Companies, classrooms and research. pp. 112–129.

Broadus, R. (1987). Toward a definition of bibliometrics. Scientometrics, 12(5–6), 373–379.

Callon, M., Courtial, J.-P., Turner, W. A., & Bauin, S. (1983). From translations to problematic networks: An introduction to co-word analysis. Social Science Information, 22(2), 191–235. http://dx.doi.org/10.1177/053901883022002003

Chen, C. (2006). CiteSpace II: Detecting and visualizing emerging trends and transient patterns in scientific literature. Journal of the Association for Information Science and Technology, 57(3), 359–377.

Cobo, M. J., López-Herrera, A. G., Herrera-Viedma, E., & Herrera, F. (2011a). Science mapping software tools: Review, analysis, and cooperative study among tools. Journal of the American Society for Information Science and Technology.

Cobo, M. J., López-Herrera, A. G., Herrera-Viedma, E., & Herrera, F. (2011b). An approach for detecting, quantifying, and visualizing the evolution of a research field: A practical application to the fuzzy sets theory field. Journal of Informetrics, 5(1), 146–166.

Cobo, M. J., López-Herrera, A. G., Herrera-Viedma, E., & Herrera, F. (2012). SciMAT: A new science mapping analysis software tool. Journal of the American Society for Information Science and Technology, 63(8), 1609–1630.

Crane, D. (1972). Invisible colleges: Diffusion of knowledge in scientific communities. Chicago: University of Chicago Press.

Cuccurullo, C., Aria, M., & Sarto, F. (2016). Foundations and trends in performance management. A twenty-five years bibliometric analysis in business and public administration domains. Scientometrics, 108(2), 595–611.

Diodato, V. (1994). Dictionary of bibliometrics. Binghamton, NY: Haworth Press.

Dumais, S. T. (2004). Latent semantic analysis. Annual Review of Information Science and Technology, 38, 189–230.

Gagolewski, M. (2011). Bibliometric impact assessment with R and the CITAN package. Journal of Informetrics, 5(4), 678–692.

Gao, X., & Guan, J. (2009). Networks of scientific journals: An exploration of Chinese patent data. Scientometrics, 80(1), 283–302.

Garfield, E. (2004). Historiographic mapping of knowledge domains literature. Journal of Information Science, 30(2), 119–145.

Gifi, A. (1990). Nonlinear multivariate analysis. John Wiley & Sons Incorporated.

Glänzel, W., & Schubert, A. (2004). Analysing scientific networks through co-authorship. Handbook of quantitative science and technology research (11), pp. 257–279.

Glänzel, W., & Thijs, B. (2012). Using core documents for detecting and labelling new emerging topics. Scientometrics, 91(2), 399–416. http://dx.doi.org/10.1007/s11192-011-0591-7

Glänzel, W. (2001). National characteristics in international scientific co-authorship relations. Scientometrics, 51(1), 69–115.

Greenacre, M., & Blasius, J. (Eds.). (2006). Multiple correspondence analysis and related methods. CRC Press.

Guler, A. T., Waaijer, C. J., Mohammed, Y., & Palmblad, M. (2016). Automating bibliometric analyses using Taverna scientific workflows: A tutorial on integrating Web Services. Journal of Informetrics, 10(3), 830–841.

Guler, A. T., Waaijer, C. J., & Palmblad, M. (2016). Scientific workflows for bibliometrics. Scientometrics, 107(2), 385–398.

Harzing, A. W. (2007). Publish or Perish. Available from http://www.harzing.com/pop.htm

Hirsch, J. E. (2005). An index to quantify an individual’s scientific research output. Proceedings of the National Academy of Sciences of the United States of America, 16569–16572.

Kamada, T., & Kawai, S. (1989). An algorithm for drawing general undirected graphs. Information Processing Letters, 31(1), 7–15.

Keirstead, J. (2015). scholar: Analyse citation data from Google Scholar. R package.

Kessler, M. M. (1963). Bibliographic coupling between scientific papers. Journal of the Association for Information Science and Technology, 14(1), 10–25.

Klavans, R., & Boyack, K. W. (2017). Which type of citation analysis generates the most accurate taxonomy of scientific and technical knowledge? Journal of the Association for Information Science and Technology, 68(4), 984–998.

Lebart, L., Morineau, A., & Warwick, K. M. (1984). Multivariate descriptive statistical analysis (correspondence analysis and related techniques for large matrices). Chichester: Wiley.

Marion, L. (2002). A tri-citation analysis exploring the citation image of Kurt Lewin. Proceedings of the American Society for Information Science and Technology, 39(1), 3–13.

Matloff, N. (2011). The art of R programming: A tour of statistical software design. No Starch Press.

McCain, K. W. (1991). Mapping economics through the journal literature: An experiment in journal cocitation analysis. Journal of the American Society for Information Science, 42(4), 290.

McCain, K. W. (2009). Using tricitation to dissect the citation image: Conrad Hal Waddington and the rise of evolutionary developmental biology. Journal of the American Society for Information Science and Technology, 60(7), 1301–1319. http://dx.doi.org/10.1002/asi

Persson, O., Danell, R., & Schneider, J. W. (2009). How to use Bibexcel for various types of bibliometric analysis. Celebrating scholarly communication studies: A Festschrift for Olle Persson at his 60th Birthday (5), pp. 9–24.

Peters, H., & Van Raan, A. (1991). Structuring scientific activities by co-author analysis: An exercise on a university faculty level. Scientometrics, 20(1), 235–255.

Porter, M. F. (1980). An algorithm for suffix stripping. Program, 14(3), 130–137.

Pritchard, A. (1969). Statistical bibliography or bibliometrics. Journal of Documentation, 25, 348.

R Core Team. (2016). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org

Rousseau, D. M. (Ed.). (2012). The Oxford handbook of evidence-based management. Oxford University Press.

Schnabel, S. K., & Eilers, P. H. (2009). Optimal expectile smoothing. Computational Statistics & Data Analysis, 53(12), 4168–4177.

Sci2 Team. (2009). Science of Science (Sci2) Tool. Indiana University and SciTech Strategies. https://sci2.cns.iu.edu

Skupin, A. (2009). Discrete and continuous conceptualizations of science: Implications for knowledge domain visualization. Journal of Informetrics, 3(3), 233–245.

Small, H. G., & Koenig, M. E. (1977). Journal clustering using a bibliographic coupling method. Information Processing & Management, 13(5), 277–288.

Small, H., & Upham, P. (2009). Citation structure of an emerging research area on the verge of application. Scientometrics, 79(2), 365–375.

Small, H. (1973). Co-citation in the scientific literature: A new measure of the relationship between two documents. Journal of the Association for Information Science and Technology, 24(4), 265–269.

Small, H. (2006). Tracking and predicting growth areas in science. Scientometrics, 68(3), 595–610.

Thijs, B., Schiebel, E., & Glänzel, W. (2013). Do second-order similarities provide added-value in a hybrid approach? Scientometrics, 96(3), 667–677. http://dx.doi.org/10.1007/s11192-012-0896-1

Uddin, A. (2016). scientoText: Text & Scientometric Analytics. R package version 0.1. [https://CRAN.R-project.org/package=scientoText version 0.1.4, http://github.com/jkeirstead/scholar].

Upham, S. P., & Small, H. (2010). Emerging research fronts in science and technology: Patterns of new knowledge development. Scientometrics, 83(1), 15–38.

Waltman, L., van Eck, N. J., & Noyons, E. C. M. (2010). A unified approach to mapping and clustering of bibliometric networks. Journal of Informetrics, 4(4), 629–635.

Waltman, L. (2016). A review of the literature on citation impact indicators. Journal of Informetrics, 10(2), 365–391.

White, H. D., & Griffith, B. C. (1981). Author cocitation: A literature measure of intellectual structure. Journal of the American Society for Information Science, 32(3), 163–171. http://dx.doi.org/10.1002/asi.4630320302

White, D., & McCain, K. (1998). Visualizing a discipline: An author co-citation analysis of information science, 1972–1995. Journal of the American Society for Information Science, 49(4), 327–355.

Yan, E., & Ding, Y. (2012). Scholarly network similarities: How bibliographic coupling networks, citation networks, cocitation networks, topical networks, coauthorship networks, and coword networks relate to each other. Journal of the American Society for Information Science and Technology, 63(7), 1313–1326.

Yang, K., & Meho, L. I. (2006). Citation analysis: A comparison of Google Scholar, Scopus, and Web of Science. Proceedings of the American Society for Information Science and Technology, 43(1), 1–15.

Yang, S., Han, R., Wolfram, D., & Zhao, Y. (2016). Visualizing the intellectual structure of information science (2006–2015): Introducing author keyword coupling analysis. Journal of Informetrics, 10(1), 132–150.

Zhao, D., & Strotmann, A. (2008). Evolution of research activities and intellectual influences in information science 1996–2005: Introducing author bibliographic-coupling analysis. Journal of the American Society for Information Science, 59(1998), 2070–2086. http://dx.doi.org/10.1002/asi

Zupic, I., & Čater, T. (2015). Bibliometric methods in management and organization. Organizational Research Methods, 18(3), 429–472.

de Moya-Anegon, F., Vargas-Quesada, B., Chinchilla-Rodriguez, Z., Corera-Alvarez, E., Herrero-Solana, V., & Munoz-Fernández, F. J. (2005). Domain analysis and information retrieval through the construction of heliocentric maps based on ISI-JCR category cocitation. Information Processing & Management, 41(6), 1520–1533.

van Eck, N. J., & Waltman, L. (2008). Generalizing the h- and g-indices. Journal of Informetrics, 2(4), 263–271.

van Eck, N. J., & Waltman, L. (2009). How to normalize cooccurrence data? An analysis of some well-known similarity measures. Journal of the Association for Information Science and Technology, 60(8), 1635–1651.

van Eck, N. J., & Waltman, L. (2010). Software survey: VOSviewer, a computer program for bibliometric mapping. Scientometrics, 84(2), 523–538.

van Eck, N. J., & Waltman, L. (2014). CitNetExplorer: A new software tool for analyzing and visualizing citation networks. Journal of Informetrics, 8(4), 802–823.

van Eck, N. J., Waltman, L., & Noyons, C. M. (2010). A unified approach to mapping and clustering of bibliometric networks, 14. Eleventh International Conference on Science and Technology Indicators [p. 284].
