
Testing software tools for newborn cry analysis using synthetic signals


Contents lists available at ScienceDirect

Biomedical Signal Processing and Control

Journal homepage: www.elsevier.com/locate/bspc

S. Orlandi a,∗, A. Bandini a,b, F.F. Fiaschi a, C. Manfredi a

a Department of Information Engineering, Università degli Studi di Firenze, Firenze, Italy

b Department of Electrical, Electronic and Information Engineering (DEI) "Guglielmo Marconi", Università di Bologna, Bologna, Italy

Article info

Article history: Received 7 June 2016; Received in revised form 20 September 2016; Accepted 15 December 2016; Available online 21 December 2016

Keywords: Infant cry; Acoustical analysis; Autoregressive models; Wavelet transform; Fundamental frequency; Resonance frequencies

Abstract

Contactless techniques are of increasing clinical interest as they can provide advantages in terms of comfort and safety of the patient with respect to sensor-based methods. Therefore, they are particularly well suited for vulnerable patients such as newborns. Specifically, the acoustical analysis of the infant cry is a contactless approach to assist the clinical specialist in the detection of abnormalities in infants with possible neurological disorders. Along with the perceptual analysis, the automated analysis of infant cry is usually performed through software tools that, however, might not be designed for this specific signal. The newborn cry is extremely difficult to analyze with standard techniques because of its quasi-stationarity and the very wide range of frequencies of interest. Therefore, software tools should be specifically configured and used with caution. To address this issue, three methods are tested and compared: one freely available and two purpose-built, based on different approaches (autoregressive adaptive models and wavelets). The three methods are compared using synthetic signals produced by a synthesizer developed for the generation of basic melodic shapes of the newborn cry. Results point out strengths and weaknesses of each method, thus suggesting their most appropriate use according to the goals of the analysis.

© 2016 Elsevier Ltd. All rights reserved.

1. Introduction

Crying is the first and primary method of communication among humans. It is a functional expression of basic biological needs, emotional or psychological conditions such as hunger, cold, pain, cramps and even joy [1,2]. It involves activation of the central nervous system and requires a coordinated effort of several brain regions, mainly brainstem and limbic system. Therefore a brain dysfunction may lead to disorders in the vibration of the vocal folds and in the coordination of the larynx, pharynx and vocal tract, giving rise to an abnormal cry [3,4]. To date, newborn cry analysis is most often performed with a perceptive examination based on listening to the cry and visually inspecting the signal waveform and its spectrogram. This approach is operator-dependent and requires a considerable amount of time, often prohibitive in daily clinical practice. An accurate and automated acoustic analysis of newborn cry could help the clinician detect risk markers of neurodevelopmental disorders. Specifically, the distinction between a regular and an abnormal cry can be very useful in clinical practice. Therefore the scientific community is paying special attention to techniques devoted to an accurate automated analysis of the newborn cry [5–7].

∗ Corresponding author. E-mail address: silvia.orlandi@unifi.it (S. Orlandi).

The most significant acoustical parameters of infant crying are the fundamental frequency (F0) and the first three formant frequencies of the vocal tract (F1, F2 and F3). F0 reflects the regularity of the vibration of the vocal folds, while F1, F2 and F3 are related to the varying shape and length of the vocal tract during phonation and thus to its control [8]. Actually, it is more appropriate to refer to resonance frequencies (RFs) rather than formants in newborns. In fact, the vocal tract is almost flat, the mobility of the oral cavity is reduced and the baby is unable to articulate vowel or consonant sounds, as the pharynx is too short and not wide enough for that purpose. Therefore hereafter we will refer to F1–F3 as resonance frequencies (RFs).

Infant F0 values are usually in the range 200 Hz–800 Hz (in the case of hyperphonation they can reach and exceed 1000 Hz) [5,9]. This range is quite wide, as it includes both healthy and pathological newborns. Indeed, no precise ranges are validated in the literature for differentiating the two categories, the possible diseases or categories being of very different nature (e.g., deaf, asphyxiated, affected by gastroschisis, preterm, etc.). Typical values for the first three RFs are around 1000 Hz, 3000 Hz and 5000 Hz, respectively [10]. RF estimation is very difficult to perform considering the high variability of these frequencies. Significant deviations from these ranges may be related to pathological conditions of the central nervous system [11,12].

http://dx.doi.org/10.1016/j.bspc.2016.12.012

The automated analysis of infant cry has its origins several decades ago, when the technology was limited. Therefore, it was mainly based on the perceptual analysis made by clinicians through listening to the cry signal and visual inspection of the spectrogram estimated with the Fast Fourier Transform (FFT) [13]. This approach is implemented in the Multidimensional Voice Program (MDVP™), the first commercial tool, still in use, though developed for adult voices [14]. Currently, many researchers use PRAAT [15,16], which is freely available online. Like MDVP™, it was developed for the adult voice. It requires a careful manual setting of some parameters and thus some technical skills [16].

Thus, since early studies, the need was highlighted to develop dedicated software tools that could automatically provide the main parameters of the infant cry. In the last twenty years, most of the research was devoted to F0 estimation with traditional approaches such as FFT, correlation and cepstrum [13,17–20]. In contrast, few papers addressed RF estimation: in some papers F1 was estimated with FFT [10,18,21,22], and recently FFT was applied to F1–F3 estimation [20]. In Robb et al. [22], FFT and Linear Prediction (LP) methods were compared, with results comparable only for F1.

Several approaches were tested on synthetic and real signals [7,10,11,23]. An automatic adaptive parametric approach, called BioVoice [24], was developed and successfully applied to newborn cry [7,25,26]. As for F0, the difficulty in estimating the RFs is mainly due to the quasi-stationarity and the very wide range of frequencies of interest in the newborn cry, which require sophisticated adaptive numerical techniques characterized by high time-frequency resolution. To the authors' knowledge, only two software tools exist that are specifically designed for the automatic analysis of infant crying: the above-mentioned BioVoice [7,23,25–27], based on an innovative adaptive parametric approach for F0 and formant estimation [10] and successfully tested against other software tools [11,28], and a software tool recently proposed by Reggiannini et al. [20] that estimates F0 by means of a cepstrum approach. It was shown that the cepstrum approach is faster than a parametric approach, but less robust against noise in frequency (F0 and RFs) estimation [11].

Taking into account the high variability of the signal, the wavelet approach could be particularly suited to the study of neonatal cry thanks to its time-frequency varying resolution and low computing time. Therefore a new automated method based on the wavelet transform is proposed here for the estimation of F0 and the RFs of newborn cry that, like the BioVoice tool, does not require any manual setting to be made by the user. Based on the literature, the Mexican Hat Continuous Wavelet Transform (CWT) is applied for F0 estimation and the complex Morlet for RFs estimation [29,30]. The proposed approach, named WInCA (Wavelet Infant Cry Analyzer), is currently implemented in MATLAB but can be easily adapted to any embedded processor.

A new synthesizer capable of reproducing variable melodic shapes of newborn cry was developed, based on the methods described in [8,31]. This synthesizer is used to test WInCA against PRAAT and BioVoice.

This paper presents the first attempt to apply wavelets to the analysis of newborn cry; therefore the method is described in the next section along with the new synthesizer.

The innovative features of this paper are: a synthesizer specifically developed to reproduce the melodic shape of the neonatal cry; a new method for newborn cry analysis based on the wavelet transform; and a comparison of three different methods of automated cry analysis on synthetic signals. Advantages and limits of the three methods are highlighted and the optimal use of each of them is suggested.

2. Methods

2.1. The newborn cry synthesizer

The synthesizer was developed in the MATLAB computing environment. It is capable of synthesizing newborn cry signals with different melody shapes. The synthesizer, based on a method developed for adult male voices [8,31], is composed of two blocks: a pulse train generator and a vocal tract filter, according to [32].

2.1.1. Pulse train generator

The pulse train generator, based on additive synthesis, builds a glottal pulse sequence. It approximates a periodic signal through a linear combination of sine waves whose frequencies are in harmonic ratio with each other. Thus, the first step of the infant cry synthesizer assumes the glottal pulse sequence to be a pulse train with period T, which is the reciprocal of the fundamental frequency F0 (F0 = 1/T).

The synthesis of each harmonic component is obtained through an oscillator block. Each oscillator is driven by two control vectors: the amplitude (a[n]) and the frequency (f[n]) of the output sine waves, where n indexes the elements of the control vectors (1 ≤ n ≤ N) and N is related to the duration (in s) of the synthesized signal and to the number of frames into which the synthesized signal is segmented. Assuming a sampling frequency Fs = 44.1 kHz and a frame duration of 10 ms, we get 441 samples per frame (SpF). Thus, the frame rate (the frequency of the control vector) is given by the ratio Fs/SpF (100 Hz). With a synthesized signal of 2 s duration, N = 201 samples.
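The frame arithmetic above can be checked with a few lines of Python. This is only an illustrative sketch: the function name is hypothetical, and the "+ 1" (one control value per frame boundary, both endpoints included) is an assumption that reproduces the N = 201 quoted in the text.

```python
def control_vector_length(duration_s, fs_hz=44100, frame_ms=10):
    """Return (samples per frame, frame rate in Hz, number of control samples N)."""
    samples_per_frame = round(fs_hz * frame_ms / 1000)  # 441 at 44.1 kHz and 10 ms
    frame_rate_hz = fs_hz / samples_per_frame           # Fs / SpF
    # Assumption: one control value per frame boundary, endpoints included.
    n_control = round(duration_s * frame_rate_hz) + 1
    return samples_per_frame, frame_rate_hz, n_control


print(control_vector_length(2.0))  # (441, 100.0, 201)
```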

In the first harmonic a0[n] = 1 and f0[n] = F0. Higher harmonics are obtained with the same amplitude vector as the first harmonic (a0[n]) and frequency vectors given by multiples of F0:

$$f_k[n] = k\,f_0[n] \qquad (1)$$

where 1 ≤ k ≤ K and K = 30 is the number of digital oscillators considered in the synthesizer. From the literature [5] we know that the range of F0 for newborn cry is around 200–800 Hz and the first three resonance frequencies can reach values up to 10 kHz. Assuming an average F0 = 400 Hz, with K = 30 we are able to synthesize signals with frequencies up to 12 kHz, thus covering the range of interest. The signals obtained from the oscillators are then summed, giving rise to the glottal pulse train s[n]. In the additive synthesis described above, F0 is kept constant throughout the whole vocal emission. Thus, to get a time-varying F0 in the synthesized signal, a modification of this approach is proposed here through the modulation of the first harmonic. Modulation is obtained by driving the oscillator that generates the first harmonic with a vector f0[n] made up of variable values, as discussed in Section 2.1.3. According to Eq. (1), the other oscillators (which generate higher-order harmonics) will give rise to frequency-modulated signals.

2.1.2. Vocal tract filter

Vocal tract resonances are given by a filter bank with 3 parallel resonance filters [31]. Each filter is a 2nd-order all-pole filter with frequency response:

$$H(z) = \frac{b_1}{1 + a_2 z^{-1} + a_3 z^{-2}} \qquad (2)$$

Given the bandwidth (B) of the filter, the resonance central frequency (fc) and ωc = 2πfc/Fs, the filter coefficients are obtained through the following relationships [33]:

$$r = e^{-\pi B / F_s} \qquad (3)$$

$$a_2 = -2r\cos\left(\frac{2\pi f_c}{F_s}\right) \qquad (4)$$

$$a_3 = r^2 \qquad (5)$$

$$b_1 = (1 - r)\sqrt{1 - 2r\cos(\omega_c) + r^2} \qquad (6)$$

Thus the frequency response (2) can be written as:

$$H(z) = \frac{b_1}{1 - 2r\cos(\omega_c)\,z^{-1} + r^2 z^{-2}} \qquad (7)$$

Fig. 1. Basic melody shapes of the newborn cry obtained through the newborn cry synthesizer: a) Complex (double-arch melody shape); b) Falling (single-arch melody shape with a rapid F0 increase followed by a slow decrease, with asymmetric shape skewed to the left); c) Plateau (flat melody profile); d) Rising (single-arch melody shape with a slow F0 increase followed by a rapid decrease, with asymmetric shape skewed to the right); e) Symmetric (single-arch melody shape almost symmetric with respect to its midpoint).

The central frequencies of the three parallel filters (fc) correspond to the vocal tract formant frequencies (F1–F3). Thus, the proposed synthesizer can be controlled by setting the value of the fundamental frequency (F0) and of the first three formant frequencies (F1–F3).
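The two blocks of Sections 2.1.1 and 2.1.2 can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' MATLAB implementation: the function names are hypothetical, control values are linearly interpolated between frames, harmonics at or above the Nyquist frequency are skipped, Eq. (3) is read as r = exp(−πB/Fs), and the 100 Hz bandwidths are placeholders because the text does not report them.

```python
import math

FS = 44100  # sampling frequency (Hz), as in the text
SPF = 441   # samples per 10 ms frame


def additive_pulse_train(f0_track, amp_track, n_harmonics=30):
    """Sum harmonically related sine oscillators driven by frame-rate control vectors."""
    n_samples = (len(f0_track) - 1) * SPF
    out = []
    phase = 0.0  # running phase of the first harmonic (rad)
    for n in range(n_samples):
        i, offset = divmod(n, SPF)
        w = offset / SPF
        # Assumption: control values are linearly interpolated between frames.
        f0 = (1 - w) * f0_track[i] + w * f0_track[i + 1]
        amp = (1 - w) * amp_track[i] + w * amp_track[i + 1]
        phase += 2 * math.pi * f0 / FS
        # Eq. (1): harmonic k runs at k * f0, i.e. at k times the phase.
        # Assumption: harmonics at or above Nyquist are skipped to avoid aliasing.
        sample = sum(math.sin(k * phase)
                     for k in range(1, n_harmonics + 1) if k * f0 < FS / 2)
        out.append(amp * sample)
    return out


def resonator_coeffs(fc, bandwidth):
    """Coefficients of one 2nd-order all-pole resonator, Eqs. (3)-(6)."""
    r = math.exp(-math.pi * bandwidth / FS)
    wc = 2 * math.pi * fc / FS
    a2 = -2 * r * math.cos(wc)
    a3 = r * r
    b1 = (1 - r) * math.sqrt(1 - 2 * r * math.cos(wc) + r * r)  # Eq. (6) as printed
    return b1, a2, a3


def resonator(x, b1, a2, a3):
    """Difference equation of Eq. (2): y[n] = b1*x[n] - a2*y[n-1] - a3*y[n-2]."""
    y, y1, y2 = [], 0.0, 0.0
    for xn in x:
        yn = b1 * xn - a2 * y1 - a3 * y2
        y.append(yn)
        y1, y2 = yn, y1
    return y


def vocal_tract(x, centre_freqs, bandwidths):
    """Parallel bank: the outputs of the resonators are summed sample by sample."""
    branches = [resonator(x, *resonator_coeffs(fc, bw))
                for fc, bw in zip(centre_freqs, bandwidths)]
    return [sum(samples) for samples in zip(*branches)]


# Two frames (20 ms) with a slowly rising F0 and a short amplitude ramp.
source = additive_pulse_train([400.0, 420.0, 440.0], [0.0, 1.0, 1.0])
cry = vocal_tract(source, [1100.0, 3000.0, 5500.0], [100.0, 100.0, 100.0])
```

Accumulating a single phase for the first harmonic and multiplying it by k keeps all oscillators in exact harmonic ratio even when f0[n] varies, which is the behaviour Eq. (1) asks for.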

2.1.3. Infant cry synthesis: melody shapes and vocal tract resonances

According to [5], the fundamental frequency F0 of healthy newborns usually varies between 200 Hz and 800 Hz. Thus, the F0 values of the synthesized signals are allowed to vary in this range. F0 variations were set according to the basic melody shapes of the newborn cry [23,34–37]:

• Complex: double-arch melody shape (Fig. 1a);
• Falling: single-arch melody shape with a rapid F0 increase followed by a slow decrease, with asymmetric shape skewed to the left (Fig. 1b);
• Plateau: flat melody profile (Fig. 1c);
• Rising: single-arch melody shape with a slow F0 increase followed by a rapid decrease, with asymmetric shape skewed to the right (Fig. 1d);
• Symmetric: single-arch melody shape almost symmetric with respect to its midpoint (Fig. 1e).

As mentioned above, the synthesized signals have a fixed duration of 2 s. Thus, the control vectors that drive the oscillators are made up of 201 samples. To get control vectors with varying frequency for synthesizing the 5 basic melody shapes of F0, a spline interpolation was implemented. The interpolation points (nodes) were set to obtain melodic shapes within the range 400 Hz–650 Hz, which is the central interval of the range reported in [5]. Specifically, for each basic shape, the frequency control vectors are obtained as follows:

• The complex shape is defined as composed of multiple arcs (at least 2) [34]. We synthesized a double-arc shape whose maxima (with frequency equal to 590 Hz) are located at 0.25 s and 1.25 s, respectively. Between these maxima, a local minimum (with frequency equal to 520 Hz) is located at 0.6 s, in order to make the second peak slower than the first one (Fig. 1a). The first and last nodes of the vector to be interpolated were both set at 500 Hz.
• The falling shape was obtained by setting a maximum frequency (650 Hz) at 0.4 s, which allows for a slow decrease towards 450 Hz (Fig. 1b). The first node was set equal to 550 Hz.
• The plateau shape was obtained by varying F0 within a quite narrow range (540–555 Hz) around an average F0 value equal to 550 Hz. In particular, there is a slightly ascending section up to 555 Hz, followed by a decrease towards 540 Hz (Fig. 1c).
• The rising shape was obtained by reversing the falling shape, with a slow increase followed by a rapid decrease towards 550 Hz (Fig. 1d). The first node was set equal to 450 Hz.
• The symmetric shape was obtained by setting the maximum (650 Hz) at the midpoint of the vector to be interpolated (thus at 1 s). The first and last nodes were set equal to 450 Hz (Fig. 1e).

The amplitude control vectors were obtained in the same way for all the shapes: increasing amplitude is set in the first part of the signal, followed by a nearly constant part and then decreasing amplitude at the end of the signal.
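As an illustration of how a frequency control vector could be built from a handful of nodes, the sketch below interpolates the symmetric shape (450 Hz, 650 Hz, 450 Hz at 0 s, 1 s and 2 s) with a natural cubic spline sampled at the 100 Hz frame rate. The text only says "spline interpolation"; the natural boundary conditions, the function names and the pure-Python solver are assumptions made for this example.

```python
import bisect


def natural_cubic_spline(xs, ys):
    """Return a function evaluating the natural cubic spline through (xs, ys)."""
    n = len(xs)
    h = [xs[i + 1] - xs[i] for i in range(n - 1)]
    # Tridiagonal system for the second derivatives m; natural ends: m[0] = m[-1] = 0.
    sub, diag, sup, rhs = [0.0] * n, [1.0] * n, [0.0] * n, [0.0] * n
    for i in range(1, n - 1):
        sub[i], diag[i], sup[i] = h[i - 1], 2 * (h[i - 1] + h[i]), h[i]
        rhs[i] = 6 * ((ys[i + 1] - ys[i]) / h[i] - (ys[i] - ys[i - 1]) / h[i - 1])
    for i in range(1, n):  # forward elimination (Thomas algorithm)
        w = sub[i] / diag[i - 1]
        diag[i] -= w * sup[i - 1]
        rhs[i] -= w * rhs[i - 1]
    m = [0.0] * n
    m[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):  # back substitution
        m[i] = (rhs[i] - sup[i] * m[i + 1]) / diag[i]

    def evaluate(x):
        i = min(max(bisect.bisect_right(xs, x) - 1, 0), n - 2)
        left, right = xs[i + 1] - x, x - xs[i]
        return ((m[i] * left ** 3 + m[i + 1] * right ** 3) / (6 * h[i])
                + (ys[i] / h[i] - m[i] * h[i] / 6) * left
                + (ys[i + 1] / h[i] - m[i + 1] * h[i] / 6) * right)

    return evaluate


def f0_control_vector(node_times, node_freqs, duration_s=2.0, frame_rate_hz=100):
    """Sample the spline once per frame boundary (201 values for 2 s at 100 Hz)."""
    spline = natural_cubic_spline(node_times, node_freqs)
    n_control = round(duration_s * frame_rate_hz) + 1
    return [spline(i / frame_rate_hz) for i in range(n_control)]


# Symmetric shape: nodes taken from the description above.
symmetric = f0_control_vector([0.0, 1.0, 2.0], [450.0, 650.0, 450.0])
```

For asymmetric node sets (for instance the falling shape) a cubic spline can overshoot the highest node, so the spline type and the number of nodes matter when a prescribed maximum has to be respected.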

Resonance frequency ranges were set according to [38], where the first three RFs of 28 healthy term newborns were found in the ranges F1 [700–1400] Hz, F2 [2000–4000] Hz and F3 [4000–7000] Hz. Specifically, here we set F1 = 1100 Hz, F2 = 3000 Hz and F3 = 5500 Hz.

To test the proposed method, additive white noise of increasing amplitude (1%, 5% and 10% of the signal maximum amplitude) was added to the synthesized signals through the Audacity tool [39]. Noise was added to obtain more realistic cry melodies, since the cry of both healthy and sick infants is always characterized by irregularities. The five melodic shapes described above were synthesized without noise and with each of the three levels of added noise. Therefore twenty synthetic signals were analyzed.
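The noise conditions can be mimicked with a short sketch. In the study the noise was added with Audacity; the code below is only a stand-in that assumes uniformly distributed white noise whose peak amplitude is a given fraction of the signal maximum, and all names are hypothetical.

```python
import random


def add_white_noise(signal, level, seed=0):
    """Add uniform white noise with peak amplitude level * max|signal|."""
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    peak = max(abs(s) for s in signal)
    return [s + rng.uniform(-1.0, 1.0) * level * peak for s in signal]


SHAPES = ["complex", "falling", "plateau", "rising", "symmetric"]
NOISE_LEVELS = [0.0, 0.01, 0.05, 0.10]

# Five shapes times four noise conditions give the twenty test signals.
conditions = [(shape, level) for shape in SHAPES for level in NOISE_LEVELS]
```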

2.2. WInCA: wavelet method

In this section the novel wavelet-based method for estimating F0 and RFs is explained. The new tool, named WInCA, has been developed specifically for the acoustical analysis of newborn cry, characterized by high fundamental frequency F0 and quasi-stationarity. To date no other application of the wavelet transform to newborn cry exists. Therefore, the choice of the mother wavelet is based on results obtained with adult voice signals. Specifically, the Mexican Hat continuous wavelet is applied for F0 estimation [29] and the complex Morlet for RFs estimation [30].

2.2.1. Continuous wavelet transform

The wavelet transform filters a signal f(t) with a shifted and scaled version of a prototype function ψ(t), the so-called "mother wavelet", a continuous function in both the time domain and the frequency domain [40].

The Continuous Wavelet Transform (CWT) of f(t) is defined as [40]:

$$\mathrm{CWT}(a, b; f(t), \psi(t)) = \int_{-\infty}^{+\infty} f(t)\,\frac{1}{\sqrt{a}}\,\psi^{*}\left(\frac{t-b}{a}\right) dt \qquad (8)$$

The scale parameter a of a CWT is related to the width of the analysis window: it either dilates or compresses the wavelet. The shift parameter b locates the wavelet in time. Varying a and b allows locating the wavelet at the desired frequency and time instant [40]. The relationship between a and the frequency is given by the so-called pseudo-frequency (Fa) in Hz, defined by the following equation:

$$F_a = \frac{F_c}{a\,\Delta} \qquad (9)$$

where Δ is the sampling period and Fc is the wavelet central frequency.

For F0 estimation, a Mexican Hat CWT is used. The Mexican Hat mother wavelet is defined as:

$$\psi(t) = \frac{2}{\sqrt{3}}\,\pi^{-1/4}\,(1 - t^2)\,e^{-t^2/2} \qquad (10)$$
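To make Eqs. (8) and (10) concrete, the sketch below evaluates the Mexican Hat wavelet and approximates a single CWT coefficient by a direct sum over the samples. It is an illustration only: the function names are hypothetical, scale and shift are expressed in samples, and a practical implementation would use a convolution-based routine rather than this brute-force loop.

```python
import math


def mexican_hat(t):
    """Mexican Hat mother wavelet of Eq. (10)."""
    return 2.0 / (math.sqrt(3.0) * math.pi ** 0.25) * (1.0 - t * t) * math.exp(-t * t / 2.0)


def cwt_coefficient(signal, scale, shift):
    """Riemann-sum approximation of Eq. (8) for one (scale, shift) pair, in samples.

    The Mexican Hat is real, so the complex conjugate in Eq. (8) has no effect.
    """
    total = sum(x * mexican_hat((n - shift) / scale) for n, x in enumerate(signal))
    return total / math.sqrt(scale)


# The wavelet has zero mean, so a constant signal gives a coefficient close to zero.
flat = cwt_coefficient([1.0] * 400, scale=10, shift=200)
```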

Table 1
Frequency bands of interest in newborn cry, center frequency Fc and bandwidth Fb for the complex Morlet.

Frequency band [Hz]   Fc [Hz]   Fb [Hz]
F1 [800–1500]         0.8       1.98
F2 [1500–3500]        0.75      2.25
F3 [3500–5500]        1.5       0.56

For each time window and within the F0 frequency range (200–800 Hz), the highest coefficient of the CWT matrix is found. The autocorrelation (AC) is computed on the row of the matrix that contains this value, which corresponds to the optimal scale. F0 is given by:

$$F_0 = F_s / \tau \qquad (11)$$

where τ refers to the position (lag) of the maximum of the AC.
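Eq. (11) can be illustrated with a small autocorrelation routine. In WInCA the autocorrelation is computed on the selected row of the CWT matrix; the sketch below applies the same lag-picking rule to an arbitrary sequence and restricts the lag search to the 200–800 Hz range. The function name and the unnormalized autocorrelation are assumptions of this example.

```python
import math


def f0_from_autocorrelation(x, fs=44100, f_min=200.0, f_max=800.0):
    """Return Fs / tau, where tau is the lag of the autocorrelation maximum."""
    lag_min = math.ceil(fs / f_max)
    lag_max = min(math.floor(fs / f_min), len(x) - 1)
    best_lag, best_value = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        value = sum(x[n] * x[n + lag] for n in range(len(x) - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    return fs / best_lag


# One 10 ms frame of a 441 Hz sine (period of 100 samples at 44.1 kHz).
frame = [math.sin(2 * math.pi * 441 * n / 44100) for n in range(441)]
estimate = f0_from_autocorrelation(frame)
```

On such a short frame the estimate is quantized to Fs divided by an integer lag, so neighbouring candidates near 441 Hz are roughly 4.4 Hz apart.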

The estimation of F1–F3 is performed in a similar way, with different ranges for the band-pass filter as reported in Table 1 and with a complex Morlet wavelet as prototype [40]. The CWT provides a more accurate representation of the oscillatory components within the signal without introducing fluctuations in the coefficients [40]. The complex Morlet wavelet is described by the following relationship [41]:

$$\psi(t) = \frac{1}{\pi^{1/4}}\, e^{j\omega_c t}\, e^{-\frac{t^2}{2\sigma_t^2}} \qquad (12)$$

where 1/π^(1/4) is the energy normalization term; for this wavelet the center frequency is defined as ωc = 2πFc. The scale parameter σt is the standard deviation (STD) of the Gaussian envelope and determines the width of the wavelet. In fact, the product ωcσt sets the link between the bandwidth of the wavelet and its frequency Fc. For the Morlet wavelet, this product must assume values such that [29,30]:

$$\omega_c \sigma_t \geq 5 \qquad (13)$$

Moreover, the following relationship is taken into account:

$$F_b = 2\sigma_t^2 \qquad (14)$$

where Fb is the bandwidth of the wavelet. Comparing the frequency ranges and by analogy with [29], the values of Fc and the corresponding values of Fb were set as in Table 1. Specifically, for each Fc relative to each frequency band, Fb was computed with ωcσt = 5 and according to Eqs. (13) and (14).

2.2.2. Estimation of F0

For the estimation of the fundamental frequency F0, the proposed method involves the following steps:

1. Band-pass FIR filtering with the Kaiser window [200–800] Hz;
2. Mexican Hat CWT of the signal. A matrix M (dimensions p × q) of coefficients is obtained, where p is the maximum value of the scale and q is the number of frames on which the scaled versions of the mother wavelet are shifted. For a signal duration of 2 s and a fixed shift of 10 ms, q = 200.
3. Location of the scale (row) of M corresponding to the coefficients of maximum modulus and estimation of F0 according to Eq. (11). An F0 estimate is computed every 10 ms.

On each time window the CWT scale parameter a was allowed to vary in the range 1–55. This choice allows including the whole range of infant cry F0 [5]. Indeed, the Mexican Hat CWT was applied with maximum scale a = 55, sampling period Δ = 1/Fs = 1/44100 s and Fc = 0.25 Hz. Consequently, Fa ≈ 200 Hz according to Eq. (9).
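The scale-to-frequency mapping of Eq. (9) with the values quoted above can be verified directly. The helper name is hypothetical; the defaults are the Fc and Fs stated in the text.

```python
def pseudo_frequency(scale, fc=0.25, fs=44100):
    """Eq. (9): F_a = F_c / (a * Delta), with sampling period Delta = 1 / Fs."""
    return fc * fs / scale


lowest = pseudo_frequency(55)   # largest scale, about 200.5 Hz
highest = pseudo_frequency(1)   # smallest scale, 11025.0 Hz
```

With these settings the largest scale (a = 55) sits just above the lower F0 bound of 200 Hz, and 800 Hz corresponds to a scale of about 13.8.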

2.2.3. Estimation of RFs

The estimation of F1–F3 is carried out with a procedure similar to the one used for F0, but with a different band-pass filter for each RF, according to Table 1, and using the complex Morlet as mother wavelet.

Table 2
RMSE for the three methods (WInCA, PRAAT and BioVoice) applied to the 20 synthetic melodic shapes: complex, falling, rising, symmetric and plateau with no added noise, 1%, 5% and 10% added noise. Best results are highlighted in bold.

RMSE [Hz]
                 WInCA                               PRAAT                               BioVoice
Shape     Noise  F0       F1       F2       F3       F0       F1       F2       F3       F0       F1       F2       F3
Complex   0.00   6.29     55.27    227.47   600      2.33     50.47    82.53    1050.79  5.87     47.77    65.38    89.35
Complex   0.01   6.24     55.27    227.47   600      2.33     50.51    82.56    1049.77  5.86     55.62    77.31    87.91
Complex   0.05   6.24     55.27    226.9    600      2.40     50.50    82.37    1025.91  6.10     49.99    79.81    91.70
Complex   0.10   6.48     55.91    203.35   600      3.20     51.40    82.59    1007.15  6.10     51.06    76.27    90.74
Falling   0.00   6.62     96.94    467.58   600      2.07     318.30   96.01    1098.48  7.02     120.14   94.51    134.61
Falling   0.01   6.62     96.94    467.58   600      2.06     310.11   96.07    1051.83  6.97     102.55   97.15    135.30
Falling   0.05   6.62     96.90    464.32   600      2.12     310.11   96.07    1051.83  7.30     107.67   78.41    112.56
Falling   0.10   6.62     96.51    441.46   600      2.38     279.06   95.53    987.23   6.98     102.55   97.15    135.30
Rising    0.00   8.99     103.43   495.63   600      2.15     337.57   99.48    1096.02  3.95     119.25   87.68    119.40
Rising    0.01   9.04     103.43   495.63   600      2.15     338.22   99.46    1093.86  3.90     105.96   82.37    108.57
Rising    0.05   8.85     103.76   474.08   600      2.13     330.36   99.42    1076.90  4.08     102.35   71.76    117.54
Rising    0.10   9.34     102.36   463.95   600      2.14     305.95   97.99    1007.47  3.96     104.84   81.57    105.41
Symmetric 0.00   8.21     120.77   388.45   600      2.30     311.86   91.28    1126.26  8.89     138.80   84.80    116.46
Symmetric 0.01   8.28     120.77   388.45   600      2.31     311.62   91.31    1130.12  8.85     128.64   89.26    109.04
Symmetric 0.05   8.53     120.47   362.32   600      2.31     299.99   90.85    1080.83  8.83     128.92   68.46    110.74
Symmetric 0.10   8.59     121.89   308.44   600      2.33     303.84   102.52   996.42   8.92     121.00   73.09    111.82
Plateau   0.00   6.51     24.39    148.42   600      0.17     6.98     89.91    1479.89  3.80     18.85    75.49    46.73
Plateau   0.01   6.51     24.39    148.42   600      0.17     6.99     90.00    1478.08  3.71     18.80    60.00    45.52
Plateau   0.05   6.51     24.39    148.42   600      0.24     7.03     90.29    1449.22  3.85     20.73    62.52    47.36
Plateau   0.10   6.51     24.39    150      600      0.23     7.13     90.26    1365.96  3.89     6.31     73.01    28.76
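The Fb values listed in Table 1 follow from Eqs. (13) and (14) once ωcσt is fixed to 5. The short check below recomputes them from the tabulated Fc values and also evaluates the complex Morlet of Eq. (12); the function names are hypothetical.

```python
import cmath
import math


def morlet_bandwidth(fc, omega_sigma=5.0):
    """F_b = 2 * sigma_t**2, with sigma_t = omega_sigma / (2 * pi * F_c)."""
    sigma_t = omega_sigma / (2.0 * math.pi * fc)
    return 2.0 * sigma_t ** 2


def complex_morlet(t, fc, sigma_t):
    """Complex Morlet wavelet of Eq. (12)."""
    envelope = math.exp(-t * t / (2.0 * sigma_t ** 2))
    return math.pi ** -0.25 * cmath.exp(1j * 2.0 * math.pi * fc * t) * envelope


for band, fc in [("F1", 0.8), ("F2", 0.75), ("F3", 1.5)]:
    print(band, fc, round(morlet_bandwidth(fc), 2))
# F1 0.8 1.98
# F2 0.75 2.25
# F3 1.5 0.56
```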

2.3. Estimation through existing methods and performance comparison

WInCA was compared with PRAAT and BioVoice in F0 and RFs estimation.

With BioVoice, F0 estimation is performed by means of a two-step algorithm. First, the Simple Inverse Filter Tracking (SIFT) is applied to time windows of short and fixed length; afterwards, F0 is adaptively estimated on signal frames of variable length through the Average Magnitude Difference Function (AMDF) within the range provided by the SIFT [7,24,42]. Resonance frequencies are obtained by peak picking in the power spectral density (PSD) evaluated on the same adaptive time windows used for F0 estimation. On each varying time window, an autoregressive model of order q = 22 (half of the sampling frequency in kHz) is applied. This choice for q comes from acoustic and physiological constraints, giving a sufficiently detailed spectrum while preventing spectral smoothing and the consequent loss of spectral peaks [23,43].
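The AMDF step can be sketched generically as follows. This is not BioVoice's implementation: the SIFT stage and the adaptive frame length are not reproduced, the search range is simply passed in as if it came from a previous coarse estimate, and the function names are hypothetical. The second helper only encodes the order rule quoted in the text.

```python
import math


def amdf_f0(x, fs, f_low, f_high):
    """Pick the lag that minimizes the Average Magnitude Difference Function."""
    lag_min = math.ceil(fs / f_high)
    lag_max = min(math.floor(fs / f_low), len(x) - 1)
    best_lag, best_value = lag_min, float("inf")
    for lag in range(lag_min, lag_max + 1):
        n_terms = len(x) - lag
        value = sum(abs(x[n] - x[n + lag]) for n in range(n_terms)) / n_terms
        if value < best_value:
            best_lag, best_value = lag, value
    return fs / best_lag


def ar_order(fs_hz):
    """Rule quoted in the text: half of the sampling frequency expressed in kHz."""
    return round(fs_hz / 1000 / 2)


# A 441 Hz sine searched within a hypothetical coarse range of 350-550 Hz.
tone = [math.sin(2 * math.pi * 441 * n / 44100) for n in range(882)]
estimate = amdf_f0(tone, 44100, 350.0, 550.0)
```

Restricting the search range matters for the AMDF: every multiple of the true period also gives a near-zero difference, so a wide range can return a sub-harmonic.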

In PRAAT, F0 is estimated through the autocorrelation of the signal calculated on short frames, while Linear Predictive Coding (LPC) is applied for RFs estimation [15,16]. We underline here that using the default PRAAT values for the F0 and RFs ranges, as well as for the number of RFs, gave definitely incorrect results. Therefore a careful selection of the best values was carried out, necessarily in an empirical way. Specifically, we chose an F0 range of 200–800 Hz and an upper limit of 10000 Hz for the RFs. The number of RFs was set to 5, but we considered only the first three frequencies [16].

Once F0–F3 were estimated with the three methods (WInCA, BioVoice and PRAAT), the F0 vectors were resampled at 100 Hz (the sampling frequency of the control vectors of the synthesized signals). In this way, the comparison with the reference values was possible through the calculation of the root-mean-square error (RMSE) in Hz, defined as:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(x_i - y_i)^2} \qquad (15)$$

where N = 201 is the number of samples of the control vectors (length of the F0 vector of the synthesized signal), x_i is the i-th F0 value of the synthetic signal and y_i is the corresponding i-th estimated F0 value. As the synthesized signals had fixed RFs, the interpolation process was not performed on the RFs vectors (F1–F3). Thus, for the RFs, N in Eq. (15) is equal to the number of samples contained in the RFs vector estimated with the three methods and x_i corresponds to the RF values set for the synthesized signals (F1 = 1100 Hz, F2 = 3000 Hz, F3 = 5500 Hz).
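The comparison procedure can be summarized in a few helper functions. The sketch assumes linear interpolation for the resampling step, which the text does not specify, and all names are hypothetical.

```python
import math


def resample_linear(values, n_out):
    """Linearly resample a track to n_out equally spaced points (endpoints kept)."""
    if n_out == 1 or len(values) == 1:
        return [values[0]] * n_out
    out = []
    for i in range(n_out):
        pos = i * (len(values) - 1) / (n_out - 1)
        lo = min(int(pos), len(values) - 2)
        frac = pos - lo
        out.append((1 - frac) * values[lo] + frac * values[lo + 1])
    return out


def rmse(reference, estimate):
    """Eq. (15) for two tracks of equal length."""
    n = len(reference)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(reference, estimate)) / n)


def rmse_constant(target, estimate):
    """Eq. (15) when the reference is a fixed value, as for the RFs."""
    return rmse([target] * len(estimate), estimate)


# F0: bring the estimate onto the 201-point reference grid, then compare.
f0_error = rmse(resample_linear([450.0, 650.0], 201), resample_linear([452.0, 652.0], 201))
# RFs: compare every estimate with the fixed synthesis value.
f1_error = rmse_constant(1100.0, [1100.0, 1110.0, 1090.0])
```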

3. Results

Table 2 shows the RMSE values for each method and the 20 synthetic melodic shapes: complex, falling, rising, symmetric and plateau with no added noise, 1%, 5% and 10% added noise. Best results are highlighted in bold.

To provide an overall picture, Table 3 summarizes the mean values of RMSE over all the melody shapes and all noise levels for the three methods.
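The summary statistics can be reproduced from the per-signal values. As a check, the snippet below averages the twenty WInCA F0 errors listed in Table 2; using the sample standard deviation (n − 1 in the denominator) is an assumption, chosen because it reproduces the value reported for WInCA in Table 3.

```python
import statistics

# WInCA F0 RMSE values (Hz) from Table 2, in the order complex, falling,
# rising, symmetric, plateau, each at noise levels 0.00, 0.01, 0.05, 0.10.
winca_f0 = [6.29, 6.24, 6.24, 6.48,
            6.62, 6.62, 6.62, 6.62,
            8.99, 9.04, 8.85, 9.34,
            8.21, 8.28, 8.53, 8.59,
            6.51, 6.51, 6.51, 6.51]

mean = statistics.mean(winca_f0)
std = statistics.stdev(winca_f0)  # sample standard deviation (assumption)
print(f"{mean:.2f} ± {std:.2f}")  # 7.38 ± 1.16
```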

Table 3
Overall RMSE of F0–F3 estimates with the three methods. Best results are highlighted in bold.

          RMSE F0 [Hz]       RMSE F1 [Hz]       RMSE F2 [Hz]       RMSE F3 [Hz]
WInCA     7.38 ± 1.16        80.17 ± 36.13      334.92 ± 135.48    600 ± 0.00
PRAAT     1.88 ± 0.89        199.40 ± 144.11    92.33 ± 6.28       1135.20 ± 164.39

4. Discussion

The comparison among the three methods reported in this paper is based on synthetic signals that simulate the main melodic shapes of newborn cry. To test the robustness of the methods, increasing noise levels are added to the signals. The proposed synthesizer represents a breakthrough with respect to existing literature.

To give a measure of the matching capability of the three methods, the RMSE is computed between the reference values of the synthetic signals and the estimated ones. Results (Tables 2 and 3) show that:

– PRAAT gives slightly better results as far as F0 estimation is concerned, with a mean error of less than 2 Hz. However, it fails in RFs estimation (except for plateau), with a mean error ranging from 92 Hz (for F2) up to 1135 Hz (for F3). We point out again that these results are obtained after a careful manual selection of ranges and thresholds for F0 and for the RFs, according to the literature. Much worse results are obtained if the default values are used (40–500 Hz for F0, up to 5500 Hz for RFs, considering 5 formants) [16]. Thus PRAAT must be used carefully. An additional advantage of PRAAT is undoubtedly the low computing time: the analysis of a signal of 1 s duration requires less than 0.5 s for F0 estimation and about 2 s for the RFs.
– WInCA gives an acceptable F0 error, but higher than PRAAT. Concerning RFs, quite good results are obtained for F1 estimation only. The computing time is comparable to PRAAT: for 1 s of recording WInCA requires 0.9 s for the estimation of F0 and 2.8 s for the estimation of RFs. Another advantage is that WInCA does not require any manual setting. If integrated into BioVoice (or provided with the pre-processing step) it could be a suitable alternative to other tools.
– BioVoice gives quite good results for F0 (mean error < 6 Hz) and better results for RFs as compared to PRAAT. For F2 and F3 it gives the best results among the three tools (errors less than 100 Hz), with an error similar to that of WInCA for F1 (82 Hz against 80 Hz with WInCA). Moreover, it does not require any manual setting of parameters and implements the pre-processing step. Finally, it allows the analysis of multiple signals simultaneously, building folders with exportable graphs and tables. These advantages, together with the sophisticated techniques implemented for F0 and RFs estimation, come at the cost of its main drawback, computational complexity (for 1 s of recording it requires 3 s for F0 estimation and 4.7 s for the RFs).

Thus, the choice of one technique or another should be carefully evaluated according to the specific needs, in order to avoid errors and unreliable results.

Further suggestions concern the choice of the analysis window and the minimum duration of cry units to analyze. The time duration of the analysis window should be set as short as possible. In fact, we applied a window of 10 ms in WInCA for the estimation of F0 and RFs. However, most of the traditional analysis techniques do not allow adequate frequency resolution for frames of such short duration, and typically much longer windows (even 50 ms) are used, leading to a poor temporal resolution. As a compromise it is therefore suggested to use windows up to a maximum of 20 ms duration (as used in BioVoice).
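The trade-off mentioned above can be quantified for a plain FFT-based analysis, where the spacing between frequency bins (without zero padding) is the reciprocal of the window duration. The helper below is a hypothetical illustration of that relationship for the window lengths discussed in the text.

```python
def fft_bin_spacing_hz(window_ms):
    """Frequency-bin spacing of an FFT computed on a window of the given duration."""
    return 1000.0 / window_ms


for window_ms in (10, 20, 50):
    print(window_ms, "ms ->", fft_bin_spacing_hz(window_ms), "Hz")
# 10 ms -> 100.0 Hz
# 20 ms -> 50.0 Hz
# 50 ms -> 20.0 Hz
```

A 10 ms window therefore gives 100 Hz bins, which is coarse compared with the 200–800 Hz F0 range, while a 50 ms window gives 20 Hz bins at the price of averaging over five times as much signal.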

Among the techniques presented here, PRAAT and BioVoice have already been applied to real signals [7,25,44], while the application of WInCA will be the subject of future studies.

5. Conclusions

This paper presents a comparison of three different techniques for the automated acoustical analysis of newborn cry. Specifically, two existing tools (PRAAT and BioVoice) are compared to a newly developed one, named WInCA (Wavelet Infant Cry Analyzer). We point out that the infant cry is a signal extremely difficult to analyze with standard techniques due to its quasi-stationarity and to the wide range of frequencies of interest. This study could serve as a guideline for the automated acoustical analysis of infant cry, in order to include these objective methods in clinical practice along with perceptual assessments.

The crying of newborns is a functional expression of basic biological needs. It is based on a coordinated effort of several brain regions, mainly brainstem and limbic system, and is linked to the respiratory system. Therefore cry features reflect the development and the integrity of the central nervous system.

In addition to other clinical tests, automated infant cry analysis, when properly applied, is a suitable, cheap and contactless approach for an early assessment of the neurological state of infants. A reliable automated method for the estimation of crying acoustical characteristics could support the perceptive analysis made by clinicians, reducing the required amount of time, which is often prohibitive in daily clinical practice.

In this work, a newborn cry synthesizer and a new wavelet-based method for newborn cry analysis are presented. Synthetic signals are obtained by implementing a new accurate simulation model that takes into account the variability of the signal under study as well as the basic shapes of the newborn cry melody. A database of the synthetic signals is available on request from the authors.

The new analysis tool, named WInCA, has been developed specifically for the acoustical analysis of newborn cry, characterized by high fundamental frequency F0 and quasi-stationarity. The wavelet approach is in fact particularly suited to the study of infant cry thanks to its high time-frequency resolution and low computing time. To assess its performance, WInCA is compared to the existing tools PRAAT and BioVoice on a set of synthetic melodic F0 shapes corrupted by increasing noise levels.

Results show that PRAAT gives slightly better F0 results but requires proper settings. WInCA is best suited for F1 estimation and BioVoice gives the best results for F2 and F3. BioVoice is also the most robust against increasing noise.

As specified in the introduction, the aim of this work is to propose for the first time a comparison among objective methods for newborn cry analysis. This study was performed on synthetic signals because the exact reference values must be known to obtain an accurate comparison.

The synthesizer proposed in this work is capable of generating the five basic waveforms of the newborn cry that are pointed out in the literature: rising, falling, symmetric, complex and plateau. The synthetic signals have a good match with real cries also from the perceptual point of view. Of course, they do not represent the wide variety of real cries, which include tens of shapes, but are nevertheless useful, since no references exist to date. Upon request the authors will provide the synthesized signals to test new methods for infant cry analysis.

The application to real signals will be the subject of future work. In particular, the advantages of BioVoice and WInCA will be merged for the analysis of infant cry in order to detect clinically useful parameters and ranges. Moreover, the merged approach could be applied to any category of sick newborns, or to compare preterm and full-term infants as well.

Acknowledgements

This work was partially carried out under Project PGR00202 "Analysis and classification of voice and facial expression: application to neurological disorders in neonates and adults", Italian Ministry of Foreign Affairs, Progetti Grande Rilevanza Italy-Mexico.


References

[1] O. Wasz-Höckert, E. Valanne, V. Vuorenkoski, K. Michelsson, A. Sovijarvi, Analysis of some types of vocalization in the newborn and in early infancy, Ann. Paediatr. Fenn. 9 (1963) 1–10.

[2] O. Wasz-Höckert, T.J. Partanen, V. Vuorenkoski, K. Michelsson, E. Valanne, The identification of some specific meanings in infant vocalization, Experientia 20 (3) (1964) 154.

[3] K. Michelsson, K. Eklund, P. Leppanen, H. Lyytinen, Cry characteristics of 172 healthy 1- to 7-day-old infants, Folia Phoniatr. Logop. 5 (2002) 190–200.

[4] H.E. Baeck, M.N. de Souza, Longitudinal study of the fundamental frequency of hunger cries along the first 6 months of healthy babies, J. Voice 5 (2007) 551–559.

[5] H. Rothganger, Analysis of the sounds of the child in the first year of age and a comparison to the language, Early Hum. Dev. 75 (2003) 55–69.

[6] O.F. Reyes-Galaviz, E.A. Tirado, C.A. Reyes-Garcia, et al., Classification of infant crying to identify pathologies in recently born babies with ANFIS, in: Klaus Miesenberger (Ed.), Computers Helping People with Special Needs, Springer, Berlin, 2004, pp. 408–415.

[7] C. Manfredi, L. Bocchi, S. Orlandi, L. Spaccaterra, G.P. Donzelli, High-resolution cry analysis in preterm newborn infants, Med. Eng. Phys. 31 (5) (2009) 528–532.

[8] J.R. Deller, J.H. Hansen, J.G. Proakis, Discrete-Time Processing of Speech Signals, Macmillan Publishing Company, New York, 1993, pp. 724–728.

[9] P.S. Zeskind, Infant crying and the synchrony of arousal, in: The Evolution of Emotional Communication: From Sounds in Nonhuman Mammals to Speech and Music in Man, Oxford University Press, Oxford, 2013, pp. 155–174.

[10] A. Fort, A. Ismaelli, C. Manfredi, P. Bruscaglioni, Parametric and non-parametric estimation of speech formants: application to infant cry, Med. Eng. Phys. 18 (8) (1996) 677–691.

[11] A. Fort, C. Manfredi, Acoustic analysis of newborn infant cry signals, Med. Eng. Phys. 20 (1998) 432–442.

[12] S.J. Sheinkopf, J.M. Iverson, B.M. Lester, Atypical cry acoustics in 6-month-old infants at risk for autism spectrum disorder, Autism Res. 5 (5) (2012) 331–339.

[13] P. Sirviö, K. Michelsson, Sound-spectrographic cry analysis of normal and abnormal newborn infants, Folia Phoniatr. Logop. 28 (3) (1976) 161–173.

[14] Kay Elemetrics Corp., Multi-Dimensional Voice Program (MDVP): Software Instruction Manual, Kay Elemetrics, Lincoln Park, 2003.

[15] P. Boersma, D. Weenink, Praat: Doing Phonetics by Computer [Computer Program], Version 5.374, 2014, http://www.praat.org/, Accessed: 07 June 2016.

[16] P. Boersma, Praat, a system for doing phonetics by computer, Glot. Int. 5 (9/10) (2002) 341–345.

[17] M.Z. Laufer, Y. Horii, Fundamental frequency characteristics of infant non-distress vocalization during the first twenty-four weeks, J. Child Lang. 4 (02) (1977) 171–184.

[18] K. Michelsson, O. Michelsson, Phonation in the newborn, infant cry, Int. J. Pediatr. Otorhinol. 49 (1999) S297–S301.

[19] M.P. Robb, D.H. Crowell, P. Dunn-Rankin, Sudden infant death syndrome: cry characteristics, Int. J. Pediatr. Otorhinol. 77 (8) (2013) 1263–1267.

[20] B. Reggiannini, S.J. Sheinkopf, H.F. Silverman, X. Li, B.M. Lester, A flexible analysis tool for the quantitative acoustic assessment of infant cry, J. Speech Lang. Hear. Res. 6 (5) (2013) 1416–1428.

[21] Y. Kheddache, C. Tadj, Resonance frequencies behavior in pathologic cries of newborns, J. Voice 29 (1) (2014) 1–12.

[22] M.P. Robb, A.T. Cacace, Estimation of formant frequencies in infant cry, Int. J. Pediatr. Otorhinol. 32 (1) (1995) 57–67.

[23] K. Wermke, W. Mende, C. Manfredi, P. Bruscaglioni, Developmental aspects of infant's cry melody and formants, Med. Eng. Phys. 24 (2002) 501–514.

[24] C. Manfredi, L. Bocchi, G. Cantarella, A multipurpose user-friendly tool for voice analysis: application to pathological adult voices, Biomed. Signal Process. Control 4 (2009) 212–220.

[25] S. Orlandi, L. Bocchi, G.P. Donzelli, C. Manfredi, Central blood oxygen saturation vs crying in preterm newborns, Biomed. Signal Process. Control 7 (2012) 88–92.

[26] S. Orlandi, P.H. Dejonckere, J. Schoentgen, J. Lebacq, N. Rruqja, C. Manfredi, Effective pre-processing of long term noisy audio recordings. An aid to clinical monitoring, Biomed. Signal Process. Control 8 (6) (2013) 799–810.

[27] C. Manfredi, V. Tocchioni, L. Bocchi, A robust tool for newborn infant cry analysis, in: Conf. Proc. IEEE Eng. Med. Biol. Soc., New York, NY, 2006, pp. 509–512.

[28] N. Rruqja, P.H. Dejonckere, G. Cantarella, J. Schoentgen, S. Orlandi, S.D. Barbagallo, C. Manfredi, Testing software tools with synthesized deviant voices for medicolegal assessment of occupational dysphonia, Biomed. Signal Process. Control 13 (2014) 71–78.

[29] L. Cnockaert, J. Schoentgen, P. Auzou, C. Ozsancak, L. Defebvre, F. Grenez, Low-frequency vocal modulations in vowels produced by Parkinsonian subjects, Speech Commun. 50 (4) (2003) 288–300.

[30]L.Falek,A.Amrouche,L.Fergani,H.Teffahi,A.Djeradi,Formanticanalysisof speechsignalbywavelettransform,Conf.Proc.WorldCongr.Eng.vol.2 (2011)1572–1576.

[31]G.DePoli,C.Drioli,F.Avanzini,SintesiDeiSegnaliAudio,1999,http://www. dei.unipd.it/∼musica/Dispense/cap5.pdf.Accessed:07June2016.

[32]J.D.Markel,A.H.Gray,LinearPredictionofSpeech,Springer–Verlag,Berlin, Heidelberg,NewYork,2011.

[33]G.DePoli,F.Avanzini,Soundmodeling:signal-basedapproaches.In: AlgorithmsforSoundandMusicComputing,2009.

[34]K.Wermke,W.Mende,Musicalelementsinhumaninfants’cries:inthe beginningisthemelody,MusicaeScientiae13(Suppl.2)(2009)151–175.

[35]G.Varallyay,Themelodyofcrying,Int.J.Pediatr.Otorhinol.71(11)(2007) 1699–1708.

[36]E.Amaro-Camargo,C.A.Reyes-García,Applyingstatisticalvectorsofacoustic characteristicsfortheautomaticclassificationofinfantcry,in:Advanced IntelligentComputingTheoriesandApplications.WithAspectsofTheoretical andMethodologicalIssues,SpringerBerlin,Heidelberg,2007,pp.1078–1085.

[37]K.Wermke,D.Leising,A.Stellzig-Eisenhauer,Relationofmelodycomplexity ininfants’criestolanguageoutcomeinthesecondyearoflife:alongitudinal study,Clin.Linguist.Phonet.21(2007)961–973.

[38]S.Orlandi,C.A.Reyes-Garcia,A.Bandini,G.P.Donzelli,C.Manfredi, Applicationofpatternrecognitiontechniquestotheclassificationof full-Termandpreterminfantcry,J.Voice30(6)(2016)656–663.

[39]http://www.audacityteam.org/,Accessed:07June2016.

[40]C.Chui,AnIntroductiontoWavelets,AcademicPress,SanDiego,CA,USA, 1995.

[41]P.S.Addison,TheIllustratedWaveletTransformHandbook:Introductory TheoryandApplicationsinScience,Engineering,MedicineandFinance, InstituteofPhysicsPublishing,London,2002.

[42]C.Manfredi,G.Peretti,Anewinsightintopost-surgicalobjectivevoicequality evaluation.Applicationtothyroplasticmedialisation,IEEETrans.Biomed. Eng.53(3)(2006)442–451.

[43]C.Manfredi,Adaptivenoiseenergyestimationinpathologicalspeechsignals, IEEETrans.Biomed.Eng.47(2000)1538–1542.

[44]G.Esposito,M.delCarmenRostagno,P.Rostagno,P.Venuti,J.D.Haltigan,D.S. Messinger,Briefreport:atypicalexpressionofdistressduringtheseparation phaseofthestrangesituationprocedureininfantsiblingsathighriskforASD, J.AutismDev.Disord.(2013)1–6.
