
Sigma-lognormal modeling of speech

C. Carmona-Duarte, M. A. Ferrer, Réjean Plamondon, A. Gómez-Rodellar and P. Gómez-Vilda

Journal article (2021)

Open access document in PolyPublie and at the official publisher
Open access to the full text of this document
Official publisher's version
Terms of use: Creative Commons: Attribution (CC BY)

Abstract

Human movement studies and analyses have been fundamental in many scientific domains, ranging from neuroscience to education, pattern recognition to robotics, health care to sports, and beyond. Previous speech motor models were proposed to understand how speech movement is produced and how the resulting speech varies when some parameters are changed. However, the inverse approach, in which the muscular response parameters and the subject's age are derived from real continuous speech, is not possible with such models. In the handwriting field, by contrast, the kinematic theory of rapid human movements and its associated Sigma-lognormal model have been applied successfully to obtain the muscular response parameters. This work presents a speech kinematics-based model that can be used to study, analyze, and reconstruct complex speech kinematics in a simplified manner. A method based on the kinematic theory of rapid human movements and its associated Sigma-lognormal model is applied to describe and parameterize the asymptotic impulse response of the neuromuscular networks involved in speech as a response to a neuromotor command. The method used to carry out transformations from formants to a movement observation is also presented. Experiments carried out with the (English) VTR-TIMIT database and the (German) Saarbrucken Voice Database, including people of different ages, with and without laryngeal pathologies, corroborate the link between the extracted parameters and aging, on the one hand, and the proportion between the first and second formants required in applying the kinematic theory of rapid human movements, on the other. The results should drive innovative developments in the modeling and understanding of speech kinematics.
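
As a rough illustration of the lognormal impulse response mentioned in the abstract, the following minimal sketch evaluates a single lognormal speed profile as commonly written in the kinematic theory: v(t) = D / (sigma * sqrt(2*pi) * (t - t0)) * exp(-(ln(t - t0) - mu)^2 / (2 * sigma^2)). The function names and all numeric parameter values are hypothetical and chosen only for illustration; they are not taken from the article, and the sketch omits the stroke directions and the formant-to-movement transformation that the article describes.

```python
import math


def lognormal_velocity(t, t0, D, mu, sigma):
    """Speed of one lognormal stroke at time t; zero at or before onset t0.

    t0: command onset time, D: stroke amplitude (area under the curve),
    mu: log time delay, sigma: log response time.
    """
    if t <= t0:
        return 0.0
    x = t - t0
    scale = D / (sigma * math.sqrt(2.0 * math.pi) * x)
    return scale * math.exp(-((math.log(x) - mu) ** 2) / (2.0 * sigma ** 2))


def peak_time(t0, mu, sigma):
    """Time of maximum speed: onset plus the mode of the lognormal."""
    return t0 + math.exp(mu - sigma ** 2)


def summed_speed(t, strokes):
    """Scalar sum of several strokes (a simplification that ignores direction).

    strokes: iterable of (t0, D, mu, sigma) tuples.
    """
    return sum(lognormal_velocity(t, *s) for s in strokes)


if __name__ == "__main__":
    # Hypothetical parameters, for illustration only.
    stroke = (0.1, 2.0, -1.5, 0.3)
    tp = peak_time(stroke[0], stroke[2], stroke[3])
    print("peak time:", tp)
    print("peak speed:", lognormal_velocity(tp, *stroke))
```

Because the profile is a scaled lognormal density, its area over time equals D, which is one way to sanity-check parameters recovered by a fitting procedure.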

Keywords

Speech processing, Kinematic theory of rapid human movements, Sigma-lognormal model, Speech kinematics, Aging, Modeling of the neuromotor system

Subject(s): 2500 Electrical and electronic engineering > 2500 Electrical and electronic engineering
2800 Artificial intelligence > 2801 Natural language and speech recognition
Department: Department of Electrical Engineering
Research centre: Laboratoire Scribens
Funding agencies: CRSNG/NSERC, Spanish government’s MIMECO TEC2016-77791 - Research project, European Union FEDER program/funds, Teca-Park/MonParLoc FGCSIC CENIE-0348_CIE_6_E (InterReg Programme), Juan de la Cierva contract, ULPGC - Viera y Clavijo Grant, Spanish government - “José Castillejo” Mobility Grant
Grant number: RGPIN-2015-06409, IJCI-2016-27682, CAS18/00315
PolyPublie URL: https://publications.polymtl.ca/9264/
Journal title: Cognitive Computation (vol. 13, no. 2)
Publisher: Springer Nature
DOI: 10.1007/s12559-020-09803-8
Official URL: https://doi.org/10.1007/s12559-020-09803-8
Deposit date: 24 March 2022 11:44
Last modified: 5 April 2024 15:23
Cite in APA 7: Carmona-Duarte, C., Ferrer, M. A., Plamondon, R., Gómez-Rodellar, A., & Gómez-Vilda, P. (2021). Sigma-lognormal modeling of speech. Cognitive Computation, 13(2), 488-503. https://doi.org/10.1007/s12559-020-09803-8
