Voice dosimetry in 92 call center operators

Abstract* - The voice is a primary work tool for call center operators, but the main risk factors for voice disorders in this category have not yet been clarified. This study aimed to analyse the vocal behaviour in call center operators and to search for correlations between the daily voice dose and self-perceived voice-related handicap.

Ninety-three subjects (25 males, 68 females, aged 24-50) underwent ambulatory phonation monitoring during a working day and were administered a general questionnaire (concerning smoking habits, symptoms, extra-work activities) and the Voice Handicap Index questionnaire (VHI).

The recorded vocal doses showed wide inter-subject variability, both at work and during non-work hours. The mean percentage phonation time (PT) during work was 14.74% (range 4-31%). The average voice amplitude was higher in subjects with longer phonation time and higher F0, which indicates that "intensive talkers" also tend to use a higher voice volume.

The VHI score (mean 13.6 ± 12.2) was not related to the number of work hours, indicating that work time is not a critical factor in causing the perception of voice fatigue. The mean PT was 87.5 minutes (range 17-186 minutes) and was not correlated with age, gender, number of work hours, symptoms, extra-professional voice use and VHI scores. The mean amplitude was significantly higher in subjects with longer PT (p < 0.001). PT during work was related to the number of work hours, but no correlation was found between the PT of the whole recording day and the number of work hours.

In conclusion, our study demonstrates that the number of work hours and the percentage PT are not statistically related to the perception of voice disturbances.

Our data show that “safety” limits of vocal load in the call center setting cannot be clearly defined.

In analogy with previous findings on teachers, we postulate that constitutional and psycho-emotional features might be relevant risk factors for the development of voice pathologies.

Keywords – voice dosimetry, occupational voice, voice handicap index

REFERENCES

[1] Hillman RE, Heaton JT, Masaki A, et al. Ambulatory monitoring of disordered voices. Ann Otol Rhinol Laryngol 2006;115:795-801.

[2] Cheyne HA, Hanson HM, Genereux RP, et al. Development and testing of a portable vocal accumulator. J Speech Lang Hear Res 2003;46:1457-67.

[3] Titze IR, Hunter EJ, Svec JG. Voicing and silence periods in daily and weekly vocalizations of teachers. J Acoust Soc Am 2007;121:469-78.

[4] Titze IR, Lemke J, Montequin D. Populations in the U.S. workforce who rely on voice as a primary tool of trade: a preliminary report. J Voice 1997;11:254-9.

*Full paper withheld by authors' request.

Giovanna Cantarella¹, Elisabetta Iofrida¹, Paola Boria², Simone Giordano², Oriana Binatti², Lorenzo Pignataro¹,³, Claudia Manfredi⁴, Stella Forti⁵, Philippe Dejonckere⁶

¹ Otolaryngology Department and ⁵ Audiology Unit, Fondazione IRCCS Ca' Granda Ospedale Maggiore Policlinico, Milan, Italy

² Occupational Medicine, Private Practice, Milan, Italy

³ Dipartimento di Scienze Chirurgiche Specialistiche, Università degli Studi di Milano, Milan, Italy

⁴ Department of Information Engineering, Università degli Studi di Firenze, Florence, Italy

⁶ Neurosciences, University of Leuven & Federal Institute for Occupational Diseases, Brussels, Belgium

Claudia Manfredi (edited by), Models and analysis of vocal emissions for biomedical applications: 8th international workshop, December 16-18, 2013. ISBN 978-88-6655-469-1 (print), ISBN 978-88-6655-470-7 (online). © 2013 Firenze University Press

Session V: MODELS AND ANALYSIS (II)

Abstract: In this paper, empirical mode decomposition (EMD) is proposed as an alternative to decompose the log magnitude spectrum of the speech signal into its harmonic, envelope and noise components. The EMD-based approach used in this study incorporates an appropriate procedure that estimates automatically the thresholds used by the clustering algorithm without knowledge of the fundamental frequency. The frequency range of the harmonic and noise components is divided into ten equally-spaced intervals and the harmonics-to-noise ratios (HNRs) within each interval are used as independent variables to summarize the amount of perceived hoarseness. The proposed method is evaluated on a corpus comprising 251 normophonic and dysphonic speakers.

Keywords: vocal dysperiodicities, empirical mode decomposition, multiband analysis, disordered speech.

I. INTRODUCTION

Acoustic analyses of disordered speech are of great importance for clinical evaluation of voice disorders because they are noninvasive and enable clinicians to monitor the progress of patients and document quantitatively the perceived degree of hoarseness. Despite the number of acoustic markers that have been proposed in the literature to characterize the speech of dysphonic speakers, finding reliable and accurate descriptors of voice function and voice quality is still an issue.

Recent approaches to the estimation of vocal dysperiodicities have focused on continuous speech. In [1], the performance of the multi-band segmental signal-to-dysperiodicity ratio was investigated in terms of its correlation with scores of perceived hoarseness. It was concluded that the multi-band segmental signal-to-dysperiodicity ratio correlates more strongly with the perceptual assessment of the degree of hoarseness than the full-band analysis does.

In [2], we proposed the empirical mode decomposition (EMD) algorithm [3] as an alternative means of decomposing the log magnitude spectrum of the speech signal into its harmonic, envelope and noise components. The harmonics-to-noise ratio (HNR) was used as an acoustic cue to summarize the degree of disturbance in the speech signal and thus to evaluate the overall quality of disordered voices produced by dysphonic speakers. The performance of acoustic analysis based on spectral cues obtained via EMD of the log magnitude spectrum was investigated in [4]. Experimental results showed that the EMD-based approach yields a high correlation between HNR estimates and average perceived grade scores. In the method proposed in [2], the thresholds involved in the algorithm for clustering the intrinsic mode functions (IMFs) were fixed empirically. These thresholds are f0-dependent, so the method requires an estimate of the average fundamental frequency for each stimulus.

In the present study, the performance of multi-band vocal dysperiodicities analysis based on the empirical mode decomposition in terms of correlation of HNR estimates with the perceived degree of hoarseness is investigated. The empirical mode decomposition is applied to the log of the spectrum magnitude of the normalized speech signal to decompose it into its harmonic, envelope and noise components and multi-band analysis is carried out on the HNR. Compared to the method proposed in [2], the EMD-based approach used in this study incorporates an appropriate procedure that estimates automatically the thresholds used by the EMD algorithm without knowledge of the fundamental frequency.
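The multi-band step can be illustrated with a minimal sketch. It assumes that the harmonic and noise components are already available as linear power spectra on the same uniform frequency grid, and that the HNR of a band is the ratio of summed harmonic power to summed noise power, expressed in dB. Both the function name `band_hnrs` and this HNR definition are assumptions made for illustration; the exact definition used in the study is not given in this section.

```python
import math


def band_hnrs(harmonic_power, noise_power, n_bands=10, eps=1e-12):
    """Return one HNR value (dB) for each of n_bands equal-width bands.

    harmonic_power, noise_power: sequences of linear power values sampled
    on the same uniform frequency grid (hypothetical inputs, e.g. the
    output of a prior harmonic/noise separation).
    eps: small floor that avoids division by zero and log of zero.
    """
    if len(harmonic_power) != len(noise_power):
        raise ValueError("spectra must have the same length")
    n_bins = len(harmonic_power)
    if n_bins < n_bands:
        raise ValueError("need at least one frequency bin per band")

    hnrs = []
    for b in range(n_bands):
        # Integer bin edges that split the grid as evenly as possible.
        lo = b * n_bins // n_bands
        hi = (b + 1) * n_bins // n_bands
        h = sum(harmonic_power[lo:hi])
        n = sum(noise_power[lo:hi])
        # Assumed definition: band HNR = 10 log10(harmonic / noise energy).
        hnrs.append(10.0 * math.log10((h + eps) / (n + eps)))
    return hnrs


if __name__ == "__main__":
    # Toy spectra: strong harmonics in the lower half, none above.
    harmonic = [100.0] * 50 + [1.0] * 50
    noise = [1.0] * 100
    for band, value in enumerate(band_hnrs(harmonic, noise), start=1):
        print(f"band {band:2d}: {value:6.2f} dB")
```

The ten band values returned for each stimulus could then serve as the independent variables mentioned in the text, for instance in a regression against perceptual scores.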

II. METHODS

A. Speech Components Separation

A voiced speech frame x(t) can be modeled as a periodic source component, e(t), convolved with the impulse response of the vocal tract, v(t) [5]:

x(t)=e(t)*v(t) (1)

where * denotes the convolution.
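One reason the log-spectral domain is convenient follows from the convolution theorem: taking the Fourier transform of (1) and then the log magnitude turns the convolution into a sum. The short derivation below uses X(f), E(f) and V(f) for the Fourier transforms of x(t), e(t) and v(t); this notation is introduced here for illustration and is not taken from the original text.

```latex
\begin{align}
X(f) &= E(f)\,V(f), \\
\log |X(f)| &= \log |E(f)| + \log |V(f)|.
\end{align}
```

Under this reading, the source and vocal tract contributions appear as additive terms in the log magnitude spectrum, which is the kind of structure an additive decomposition such as EMD is suited to separate.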

MULTIBAND VOCAL DYSPERIODICITIES ANALYSIS USING EMPIRICAL MODE DECOMPOSITION IN THE LOG-SPECTRAL DOMAIN

A. Kacha¹, F. Grenez², J. Schoentgen²,³

¹ Laboratoire de Physique de Rayonnement et Applications, Université de Jijel, Jijel, Algeria

² LIST Department, Université Libre de Bruxelles, Brussels, Belgium

³ National Fund for Scientific Research, Belgium