
Steven Lulich

  • slulich@indiana.edu
  • 200 S. Jordan Ave, C-100
  • (812) 855-4202
  • Assistant Professor
    Speech and Hearing Sciences

Field of study

  • Speech sciences

Representative publications

Subglottal resonances and distinctive features (2010)
Steven M Lulich
Journal of Phonetics, 38 (1), 20-32

This paper addresses the phonetic basis of the distinctive feature [±back]. The second subglottal resonance (Sg2) is known to fall near the boundary between [−back] and [+back] vowels, and it has been claimed that Sg2 actually defines this distinction. In this paper, new evidence in support of this hypothesis is presented from 14 adult and 9 child speakers of American English, in which accelerometer recordings of subglottal acoustics were made simultaneously with speech recordings. The first three formants and the second subglottal resonance were measured, and both Sg2 and F3–3.5 bark were tested as boundaries between front and back vowels in the F2-dimension. It was found that Sg2 provides a reliable boundary between front and back vowels for children of all ages, as well as for adults, whereas F3–3.5 bark provides a similarly reliable boundary only for older children and adults. Furthermore, a study of …
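The boundary test described in this abstract can be sketched as a simple comparison of F2 against Sg2 (a minimal illustration with hypothetical frequencies, not values from the paper's data):

```python
def classify_backness(f2_hz, sg2_hz):
    """Classify a vowel token as front or back by comparing its second
    formant (F2) to the speaker's second subglottal resonance (Sg2),
    per the hypothesis that Sg2 marks the [+/-back] boundary."""
    return "front" if f2_hz > sg2_hz else "back"

# Hypothetical values: Sg2 near 1400 Hz, an [i]-like F2 near 2200 Hz
print(classify_backness(2200.0, 1400.0))  # front
print(classify_backness(1000.0, 1400.0))  # back
```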

Subglottal resonances and vowel formant variability: A case study of high German monophthongs and Swabian diphthongs (2008)
Andreas Madsack, Steven M Lulich, Wolfgang Wokurek and Grzegorz Dogil
Proc. LabPhon, 11, 91-92

Recent studies have shown that subglottal resonances can cause discontinuities in formant trajectories (Chi & Sonderegger, 2007), are salient in speech perception (Lulich, Bachrach & Malyska, 2007), and useful in speaker normalization (Wang, Alwan & Lulich, 2008), suggesting that variability in the spectral characteristics of speech is constrained in ways not previously noticed. Specifically, it is argued that 1) for the same sound produced in different contexts or at different times, formants are free to vary, but only within frequency bands that are defined by the subglottal resonances; and 2) for sounds which differ by certain distinctive features, certain formants must be in different frequency bands. For instance, given several productions of the front vowel [æ], the second formant (F2) is free to vary only within the band between the second and third subglottal resonances (Sg2 and Sg3), but in the back vowel [a] F2 must …
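The band-constraint idea in this abstract can be illustrated with a small membership check (the resonance frequencies below are hypothetical placeholders, not measured values):

```python
def formant_band(f_hz, sgr_hz):
    """Return the index of the frequency band a formant falls in, given
    the speaker's subglottal resonances in ascending order
    (e.g. [Sg1, Sg2, Sg3]); band 2 is the region between Sg2 and Sg3."""
    band = 0
    for boundary in sgr_hz:
        if f_hz > boundary:
            band += 1
    return band

# Hypothetical resonances: Sg1 = 600, Sg2 = 1400, Sg3 = 2200 Hz.
# An F2 of 1800 Hz for a front vowel lies in band 2, between Sg2 and Sg3.
print(formant_band(1800.0, [600.0, 1400.0, 2200.0]))  # 2
```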

Subglottal resonances of adult male and female native speakers of American English (2012)
Steven M Lulich, John R Morton, Harish Arsikere, Mitchell S Sommers, Gary KF Leung and Abeer Alwan
The Journal of the Acoustical Society of America, 132 (4), 2592-2602

This paper presents a large-scale study of subglottal resonances (SGRs) (the resonant frequencies of the tracheo-bronchial tree) and their relations to various acoustical and physiological characteristics of speakers. The paper presents data from a corpus of simultaneous microphone and accelerometer recordings of consonant-vowel-consonant (CVC) words embedded in a carrier phrase spoken by 25 male and 25 female native speakers of American English ranging in age from 18 to 24 yr. The corpus contains 17 500 utterances of 14 American English monophthongs, diphthongs, and the rhotic approximant [ɹ] in various CVC contexts. Only monophthongs are analyzed in this paper. Speaker height and age were also recorded. Findings include (1) normative data on the frequency distribution of SGRs for young adults, (2) the dependence of SGRs on height, (3) the lack of a correlation between SGRs and …

Resonances and wave propagation velocity in the subglottal airways (2011)
Steven M Lulich, Abeer Alwan, Harish Arsikere, John R Morton and Mitchell S Sommers
The Journal of the Acoustical Society of America, 130 (4), 2108-2115

Previous studies of subglottal resonances have reported findings based on relatively few subjects, and the relations between these resonances, subglottal anatomy, and models of subglottal acoustics are not well understood. In this study, accelerometer signals of subglottal acoustics recorded during sustained [a:] vowels of 50 adult native speakers (25 males, 25 females) of American English were analyzed. The study confirms that a simple uniform tube model of subglottal airways, closed at the glottis and open at the inferior end, is appropriate for describing subglottal resonances. The main findings of the study are (1) whereas the walls may be considered rigid in the frequency range of Sg2 and Sg3, they are yielding and resonant in the frequency range of Sg1, with a resulting ∼ 4/3 increase in wave propagation velocity and, consequently, in the frequency of Sg1; (2) the “acoustic length” of the equivalent uniform …
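The uniform-tube model mentioned here, closed at the glottis and open at the inferior end, predicts quarter-wavelength resonances f_n = (2n - 1) * c / (4L). A minimal sketch, where the tube length and sound speed are illustrative assumptions rather than measurements from the study:

```python
def closed_open_tube_resonances(length_m, c=350.0, n_res=3):
    """Resonances of a uniform tube closed at one end (the glottis) and
    open at the other: f_n = (2n - 1) * c / (4 * L).  c is the wave
    propagation velocity in m/s; the study's ~4/3 velocity increase
    near Sg1 due to yielding walls is not modeled here."""
    return [(2 * n - 1) * c / (4 * length_m) for n in range(1, n_res + 1)]

# Illustrative assumption: an effective acoustic length of 0.19 m
freqs = closed_open_tube_resonances(0.19)  # ~[461, 1382, 2303] Hz
```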

Automatic detection of the second subglottal resonance and its application to speaker normalization (2009)
Shizhen Wang, Steven M Lulich and Abeer Alwan
The Journal of the Acoustical Society of America, 126 (6), 3268-3277

Speaker normalization typically focuses on inter-speaker variabilities of the supraglottal (vocal tract) resonances, which constitute a major cause of spectral mismatch. Recent studies have shown that the subglottal airways also affect spectral properties of speech sounds, and promising results were reported using the subglottal resonances for speaker normalization. This paper proposes a reliable algorithm to automatically estimate the second subglottal resonance (Sg2) from speech signals. The algorithm is calibrated on children’s speech data with simultaneous accelerometer recordings from which Sg2 frequencies can be directly measured. A cross-language study with bilingual Spanish-English children is performed to investigate whether Sg2 frequencies are independent of speech content and language. The study verifies that Sg2 is approximately constant for a given speaker and thus can be a good candidate …

Speaker normalization based on subglottal resonances (2008)
Shizhen Wang, Abeer Alwan and Steven M Lulich
IEEE, 4277-4280

Speaker normalization typically focuses on variabilities of the supra-glottal (vocal tract) resonances, which constitute a major cause of spectral mismatch. Recent studies show that the subglottal airways also affect spectral properties of speech sounds. This paper presents a speaker normalization method based on estimating the second and third subglottal resonances. Since the subglottal airways do not change for a specific speaker, the subglottal resonances are independent of the sound type (i.e., vowel, consonant, etc.) and remain constant for a given speaker. This context-free property makes the proposed method suitable for limited data speaker adaptation. This method is computationally more efficient than maximum-likelihood based VTLN, with performance better than VTLN especially for limited adaptation data. Experimental results confirm that this method performs well in a variety of testing conditions and …
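A simplified stand-in for the normalization idea described here, using only Sg2 (the paper's method also uses the third subglottal resonance and operates inside an ASR system):

```python
def sg2_warp_factor(sg2_speaker_hz, sg2_reference_hz):
    """Linear frequency-warping factor from the ratio of a reference
    Sg2 to the speaker's Sg2 -- a simplified sketch, not the paper's
    exact estimation procedure."""
    return sg2_reference_hz / sg2_speaker_hz

def warp_frequency(f_hz, alpha):
    """Apply a simple linear warp f' = alpha * f."""
    return alpha * f_hz

alpha = sg2_warp_factor(1500.0, 1400.0)  # hypothetical Sg2 values
f_norm = warp_frequency(2100.0, alpha)   # 2100 Hz maps to 1960 Hz
```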

A role for the second subglottal resonance in lexical access (2007)
Steven M Lulich, Asaf Bachrach and Nicolas Malyska
The Journal of the Acoustical Society of America, 122 (4), 2320-2327

Acoustic coupling between the vocal tract and the lower (subglottal) airway results in the introduction of pole-zero pairs corresponding to resonances of the uncoupled lower airway. If the second formant (F2) passes through the second subglottal resonance a discontinuity in amplitude occurs. This work explores the hypothesis that this F2 discontinuity affects how listeners perceive the distinctive feature [back] in transitions from a front vowel (high F2) to a labial stop (low F2). Two versions of the utterances “apter” and “up there” were synthesized with an F2 discontinuity at different locations in the initial VC transition. Subjects heard portions of the utterances with and without the discontinuity, and were asked to identify whether the utterances were real words or not. Results show that the frequency of the F2 discontinuity in an utterance influences the perception of backness in the vowel. Discontinuities of this sort are …

Automatic estimation of the first three subglottal resonances from adults’ speech signals with application to speaker height estimation (2013)
Harish Arsikere, Gary KF Leung, Steven M Lulich and Abeer Alwan
Speech Communication, 55 (1), 51-70

Recent research has demonstrated the usefulness of subglottal resonances (SGRs) in speaker normalization. However, existing algorithms for estimating SGRs from speech signals have limited applicability—they are effective with isolated vowels only. This paper proposes a novel algorithm for estimating the first three SGRs (Sg1, Sg2 and Sg3) from adults’ continuous speech. While Sg1 and Sg2 are estimated based on the phonological distinction they provide between vowel categories, Sg3 is estimated based on its correlation with Sg2. The RMS estimation errors (approximately 30, 60 and 100 Hz for Sg1, Sg2 and Sg3, respectively) are not only comparable to the standard deviations in the measurements, but are also independent of vowel content and language (English and Spanish). Since SGRs correlate with speaker height while remaining roughly constant for a given speaker (unlike vocal tract …

Error analysis of extracted tongue contours from 2D ultrasound images (2015)
Tamás Gábor Csapó and Steven M Lulich

The goal of this study was to characterize errors involved in obtaining midsagittal tongue contours from two-dimensional ultrasound image sequences. Toward that end, two basic experiments were conducted. First, manual tongue contours were obtained from 1,145 tongue ultrasound images recorded from four speakers during production of the sentence ‘I owe you a yoyo’, and the uncertainty associated with the contours was quantified. Second, tongue contours from the same images were obtained using the EdgeTrak, TongueTrack, and AutoTrace algorithms, and these were compared quantitatively with the manual tongue contours. Three basic error types associated with the tongue contours are identified, indicating areas in need of improvement in future algorithmic developments. Depending on the speaker, RMS errors for the algorithmically obtained contours ranged from 1.76 to 7.11 mm, and the standard …
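The RMS comparison between algorithmic and manual contours can be sketched as a point-wise computation (a simplification: the paper's error analysis also involves aligning contours and distinguishing several error types):

```python
import math

def rms_error(contour_a, contour_b):
    """Point-wise RMS distance between two tongue contours sampled at
    the same x positions, in the same units (e.g. mm)."""
    if len(contour_a) != len(contour_b):
        raise ValueError("contours must have the same number of points")
    n = len(contour_a)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(contour_a, contour_b)) / n)
```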

A reliable technique for detecting the second subglottal resonance and its use in cross-language speaker adaptation (2008)
Shizhen Wang, Steven M Lulich and Abeer Alwan

In previous work [1], we proposed a speaker adaptation technique based on the second subglottal resonance (Sg2), which showed good performance relative to vocal tract length normalization (VTLN). In this paper, we propose a more reliable algorithm for automatically estimating Sg2 from speech signals. The algorithm is calibrated on children’s speech data collected simultaneously with accelerometer recordings from which Sg2 frequencies can be directly measured. To investigate whether Sg2 frequencies are independent of speech content and language, we perform a cross-language study with bilingual Spanish-English children. The study verifies that Sg2 is approximately constant for a given speaker and thus can be a good candidate for limited data speaker normalization and cross-language adaptation. We then present a cross-language speaker normalization method based on Sg2, which is …

Relation of formants and subglottal resonances in Hungarian vowels (2009)
Tamás Gábor Csapó, Zsuzsanna Bárkányi, Tekla Etelka Gráczi, Tamás Bőhm and Steven M Lulich

The relation between vowel formants and subglottal resonances (SGRs) has previously been explored in English, German, and Korean. Results from these studies indicate that vowel classes are categorically separated by SGRs. We extended this work to Hungarian vowels, which have not been related to SGRs before. The Hungarian vowel system contains paired long and short vowels as well as a series of front rounded vowels, similar to German but more complex than English and Korean. Results indicate that SGRs separate vowel classes in Hungarian as in English, German, and Korean, and uncover additional patterns of vowel formants relative to the third subglottal resonance (Sg3). These results have implications for understanding phonological distinctive features, and applications in automatic speech technologies.

Estimating Speaker Height and Subglottal Resonances Using MFCCs and GMMs (2014)
Harish Arsikere, Steven M Lulich and Abeer Alwan
IEEE, 21 (2), 159-162

This letter investigates the use of MFCCs and GMMs for 1) improving the state of the art in speaker height estimation, and 2) rapid estimation of subglottal resonances (SGRs) without relying on formant and pitch tracking (unlike our previous algorithm in [1]). The proposed system comprises a set of height-dependent GMMs modeling static and dynamic MFCC features, where each GMM is associated with a height value. Furthermore, since SGRs and height are correlated, each GMM is also associated with a set of SGR values (known a priori). Given a speech sample, speaker height and SGRs are estimated as weighted combinations of the values corresponding to the N most-likely GMMs. We assess the importance of using dynamic MFCC features and the weighted decision rule, and demonstrate the efficacy of our approach via experiments on height estimation (using TIMIT) and SGR estimation (using the Tracheal …
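The weighted decision rule over the N most likely height-dependent GMMs can be sketched as follows, with model log-likelihoods assumed to be precomputed (the softmax-style weighting shown here is an assumption for illustration, not necessarily the letter's exact rule):

```python
import math

def nbest_weighted_estimate(loglikes, heights, n=3):
    """Estimate speaker height as a likelihood-weighted average over
    the N most likely height-dependent models.  The MFCC/GMM scoring
    step is assumed to have already produced `loglikes`.

    loglikes: log-likelihood of the utterance under each model
    heights:  height value associated with each model
    """
    top = sorted(zip(loglikes, heights), reverse=True)[:n]
    best_ll = top[0][0]
    weights = [math.exp(ll - best_ll) for ll, _ in top]  # numerically stable
    return sum(w * h for w, (_, h) in zip(weights, top)) / sum(weights)

# Hypothetical model scores and their associated heights (cm):
est = nbest_weighted_estimate([-10.0, -10.5, -12.0, -15.0],
                              [180.0, 175.0, 170.0, 165.0], n=3)
```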

Automatic height estimation using the second subglottal resonance (2012)
Harish Arsikere, Gary KF Leung, Steven M Lulich and Abeer Alwan
IEEE, 3989-3992

This paper presents an algorithm for automatically estimating speaker height. It is based on: (1) a recently-proposed model of the subglottal system that explains the inverse relation observed between subglottal resonances and height, and (2) an improved version of our previous algorithm for automatically estimating the second subglottal resonance (Sg2). The improved Sg2 estimation algorithm was trained and evaluated on recently-collected data from 30 and 20 adult speakers, respectively. Sg2 estimation error was found to be reduced by 29%, on average, compared to the previous algorithm. The height estimation algorithm, employing the inverse relation between Sg2 and height, was trained on data from the above-mentioned 50 adults. It was evaluated on 563 adult speakers in the TIMIT corpus, and the mean absolute height estimation error was found to be less than 5.6 cm.
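The inverse relation between Sg2 and height can be illustrated with a one-parameter sketch (the constant k below is a hypothetical placeholder chosen for illustration; the paper fits its inverse relation on training data rather than using a fixed constant):

```python
def estimate_height_from_sg2(sg2_hz, k=2.4e5):
    """Height (cm) from the inverse relation height ~ k / Sg2,
    with k in Hz*cm.  k here is a hypothetical constant."""
    return k / sg2_hz

h = estimate_height_from_sg2(1400.0)  # ~171 cm with this k
```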

Tracheo-bronchial soft tissue and cartilage resonances in the subglottal acoustic input impedance (2015)
Steven M Lulich and Harish Arsikere
The Journal of the Acoustical Society of America, 137 (6), 3436-3446

This paper offers a re-evaluation of the mechanical properties of the tracheo-bronchial soft tissues and cartilage and uses a model to examine their effects on the subglottal acoustic input impedance. It is shown that the values for soft tissue elastance and cartilage viscosity typically used in models of subglottal acoustics during phonation are not accurate, and corrected values are proposed. The calculated subglottal acoustic input impedance using these corrected values reveals clusters of weak resonances due to soft tissues (SgT) and cartilage (SgC) lining the walls of the trachea and large bronchi, which can be observed empirically in subglottal acoustic spectra. The model predicts that individuals may exhibit SgT and SgC resonances to variable degrees, depending on a number of factors including tissue mechanical properties and the dimensions of the trachea and large bronchi. Potential implications for voice …

Analysis and automatic estimation of children's subglottal resonances (2011)
Steven M Lulich, Harish Arsikere, John R Morton, Gary KF Leung, Abeer Alwan and Mitchell S Sommers

Models and measurements of subglottal resonances are generally made from adult data, but there are several applications in which it would be useful to know about subglottal resonances in children. We therefore conducted an analysis of both new and old recordings of children’s subglottal acoustics in order 1) to produce a fuller picture of the variability of children’s subglottal resonances, and 2) to confirm that existing models of subglottal acoustics can be reasonably applied to children. We also tested the effectiveness of recent algorithms for estimating children’s subglottal resonances from speech formants and the fundamental frequency, which were originally formulated based on adult data. It was found that these algorithms are effective for children at least 150 cm tall.
