Citation Guidelines

Citing pyLeTalker

To cite pyLeTalker as software:

Ikuma, T. (2025). pyLeTalker: wave-reflection voice synthesis framework [Computer program]. Version 0.1, tikuma-lsuhsc/pyLeTalker

@misc{pyletalker,
  author = {Takeshi Ikuma},
  title  = {pyLeTalker: wave-reflection voice synthesis framework (v.0.1)},
  year   = {2025},
  url    = {\url{tikuma-lsuhsc/pyLeTalker}},
}

A manuscript introducing pyLeTalker is currently in preparation.

Citing computational models used in pyLeTalker

Most of the numerical models included in this library were developed by other researchers. Because pyLeTalker is merely a tool, citing their papers matters more for academic studies than citing pyLeTalker itself. Use the list below as a guide to choosing the appropriate papers for the pyLeTalker element classes that you employed.

LeTalkerVocalTract (the default vocal tract model)

[Lil85]

J. Liljencrants. Speech Synthesis with Reflection-Type Line Analog. PhD thesis, Royal Institute of Technology, Stockholm, Sweden, 1985.

[Sto95]

Brad Story. Physiologically-Based Speech Simulation Using an Enhanced Wave-Reflection Model of the Vocal Tract. PhD thesis, University of Iowa, Iowa City, IA, May 1995.

[STH96]

Brad H. Story, Ingo R. Titze, and Eric A. Hoffman. Vocal tract area functions from magnetic resonance imaging. J. Acoust. Soc. Am., 100(1):537–554, July 1996. doi:10.1121/1.415960.

Note

  • Cite Story et al. [STH96] only if the default vocal tract cross-sectional areas are used (i.e., if one of the default vowels is specified).

LeTalkerVocalFolds

[ST95]

Brad H. Story and Ingo R. Titze. Voice simulation with a body-cover model of the vocal folds. J. Acoust. Soc. Am., 97(2):1249–1260, February 1995. doi:10.1121/1.412234.

[Tit02]

Ingo R. Titze. Regulating glottal airflow in phonation: Application of the maximum power transfer theorem to a low dimensional phonation model. J. Acoust. Soc. Am., 111(1):367–376, January 2002. doi:10.1121/1.1417526.

[TS02]

Ingo R. Titze and Brad H. Story. Rules for controlling low-dimensional vocal fold models with muscle activation. J. Acoust. Soc. Am., 112(3):1064–1076, September 2002. doi:10.1121/1.1496080.

KinematicVocalFolds or sim_kinematic()

[Tit84]

Ingo R. Titze. Parameterization of the glottal area, glottal flow, and vocal fold contact area. J. Acoust. Soc. Am., 75(2):570–580, 1984. doi:10.1121/1.390530.

[Tit89a]

Ingo R. Titze. On the relation between subglottal pressure and fundamental frequency in phonation. J. Acoust. Soc. Am., 85(2):901–906, February 1989. doi:10.1121/1.397562.

[Tit89b]

Ingo R. Titze. Physiologic and acoustic differences between male and female voices. J. Acoust. Soc. Am., 85(4):1699–1707, April 1989. doi:10.1121/1.397959.

VocalFoldsUg & VocalFoldsAg

[Tit84]

Ingo R. Titze. Parameterization of the glottal area, glottal flow, and vocal fold contact area. J. Acoust. Soc. Am., 75(2):570–580, 1984. doi:10.1121/1.390530.

LeTalkerAspirationNoise

[SS11]

Robin A. Samlan and Brad H. Story. Relation of structural and vibratory kinematics of the vocal folds to two acoustic measures of breathy voice based on computational modeling. J. Speech Lang. Hear. Res., 54(5):1267–1283, October 2011. doi:10.1044/1092-4388(2011/10-0195).

Note

This aspiration noise model is used automatically if a vocal-fold model is simulated with aspiration_noise=True.

KlattAspirationNoise

[KK90]

Dennis H. Klatt and Laura C. Klatt. Analysis, synthesis, and perception of voice quality variations among female and male talkers. J. Acoust. Soc. Am., 87(2):820–857, 1990. doi:10.1121/1.398894.

FlutterGenerator

[KK90]

Dennis H. Klatt and Laura C. Klatt. Analysis, synthesis, and perception of voice quality variations among female and male talkers. J. Acoust. Soc. Am., 87(2):820–857, 1990. doi:10.1121/1.398894.

Use of ModulatedSineGenerator for subharmonic vocal fold modulation with KinematicVocalFolds

[IKSM25]

Takeshi Ikuma, Melda Kunduk, Brad Story, and Andrew J. McWhorter. Towards detecting the pathological subharmonic voicing with fully convolutional neural networks. January 2025. arXiv:2501.09159, doi:10.48550/arXiv.2501.09159.
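
The guideline above can be condensed into a small lookup table, which may help when assembling a reference list for a study that combines several elements. The following is only a sketch: the CITATIONS dictionary and the citations_for() helper are hypothetical names introduced here for illustration and are not part of pyLeTalker. The two keyword flags are likewise assumptions that stand in for the conditions stated in the notes and headings above.

```python
# Hypothetical helper, NOT part of pyLeTalker. It maps the element names
# listed in this guide to the citation keys given under each heading.
CITATIONS = {
    "LeTalkerVocalTract": ["Lil85", "Sto95", "STH96"],
    "LeTalkerVocalFolds": ["ST95", "Tit02", "TS02"],
    "KinematicVocalFolds": ["Tit84", "Tit89a", "Tit89b"],
    "sim_kinematic": ["Tit84", "Tit89a", "Tit89b"],
    "VocalFoldsUg": ["Tit84"],
    "VocalFoldsAg": ["Tit84"],
    "LeTalkerAspirationNoise": ["SS11"],
    "KlattAspirationNoise": ["KK90"],
    "FlutterGenerator": ["KK90"],
}


def citations_for(elements, default_vowels=True, subharmonic_modulation=False):
    """Return the citation keys for the given element names.

    Keys are de-duplicated and kept in first-seen order. Names that are
    not in CITATIONS contribute no keys.
    """
    keys = []
    for name in elements:
        for key in CITATIONS.get(name, []):
            # [STH96] applies only when the default vocal tract
            # cross-sectional areas (default vowels) are used.
            if key == "STH96" and not default_vowels:
                continue
            if key not in keys:
                keys.append(key)
    # [IKSM25] applies when ModulatedSineGenerator is used for subharmonic
    # vocal fold modulation with KinematicVocalFolds (assumed flag).
    if subharmonic_modulation and "IKSM25" not in keys:
        keys.append("IKSM25")
    return keys


if __name__ == "__main__":
    used = ["LeTalkerVocalTract", "KinematicVocalFolds", "LeTalkerAspirationNoise"]
    print(citations_for(used, default_vowels=False))
```

Elements that share a paper, such as VocalFoldsUg and VocalFoldsAg, yield a single [Tit84] entry, so the returned list can be used directly to pick the entries to cite.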