A joint model of word segmentation and phonological variation for English word-final /t/-deletion
Benjamin Börschinger, Mark Johnson and Katherine Demuth
The 51st Annual Meeting of the Association for Computational Linguistics (ACL 2013)
Sofia, Bulgaria, August 4-9, 2013
Word-final /t/-deletion refers to a common phenomenon in spoken English where words such as /wEst/ “west” are pronounced as [wEs] “wes” in certain contexts. Phonological variation like this is common in naturally occurring speech, yet current computational models of unsupervised word segmentation usually assume idealized input that is devoid of this kind of variation. We extend a non-parametric model of word segmentation by adding phonological rules that map underlying forms to surface forms. The result is a mathematically well-defined joint model, a first step towards handling variation and segmentation in a single model. We analyse how our model handles /t/-deletion on a large corpus of transcribed speech, and show that the joint model can perform word segmentation and recover underlying /t/s. We find that bigram dependencies are important for performing well on real data and for learning appropriate deletion probabilities for different contexts.
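The idea of a phonological rule that maps underlying forms to surface forms with context-dependent deletion probabilities can be illustrated with a minimal Python sketch. It is not the model described above: the context labels (`consonant`, `vowel`, `pause`), the probability values, and the function names are all hypothetical, and words are written as plain strings with one character per segment.

```python
# Hypothetical deletion probabilities per following context.
# These values are made up for illustration; they are not results from the paper.
DELETION_PROB = {"consonant": 0.6, "vowel": 0.3, "pause": 0.2}


def surface_prob(underlying, surface, context):
    """Return P(surface | underlying, context) under a word-final /t/-deletion rule."""
    rho = DELETION_PROB[context]
    if underlying.endswith("t"):
        if surface == underlying:
            return 1.0 - rho  # the final /t/ is realized
        if surface == underlying[:-1]:
            return rho  # the final /t/ is deleted
        return 0.0  # the rule cannot produce any other surface form
    # Underlying forms without a final /t/ always surface unchanged.
    return 1.0 if surface == underlying else 0.0


def underlying_candidates(surface):
    """List the underlying forms that could have produced an observed surface form."""
    return [surface, surface + "t"]
```

Under these assumptions, `surface_prob("wEst", "wEs", "consonant")` is the assumed deletion probability for a following consonant, and `underlying_candidates("wEs")` yields both `"wEs"` and `"wEst"`. A joint model would weigh such candidates against its lexicon and word-context statistics when deciding whether an underlying /t/ should be recovered.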