Dissertation/Thesis Abstract

Statistical methods of latent structure discovery in child-directed speech
by Panteleyeva, Natalya B., Ph.D., Indiana University, 2010, 161; 3439587
Abstract (Summary)

This dissertation investigates how distributional information in the speech stream can assist infants in the initial stages of acquisition of their native language phonology. An exploratory statistical analysis derives this information from the adult speech data in the corpus of conversations between adults and young children in Russian. Because the data were recorded in the traditional Russian orthography and spelling, the statistical analysis was performed on the symbolic data. The International Phonetic Alphabet ensured a uniform representation of the speech sounds and enabled the interpretation of the symbols as unique sets of distinctive feature values.

Although the mechanism of information extraction employed in this work may vary with the representation, the objective of the study was to demonstrate that linguistically relevant information available to infants in the speech signal is not limited to the surface form, which they observe directly. The regularities in the distribution of the bigrams in the data reveal considerable predictability of occurrence in those characteristics of the speech sounds that information theory treats as unpredictable. Co-occurrence of distinctive features prominent in Russian provides a perceptually salient context for their identification and learning.
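
The kind of bigram regularity described above can be illustrated with a minimal sketch. It is not the dissertation's actual procedure: the toy utterances, the use of plain ASCII letters as stand-ins for IPA symbols, and the function names are all assumptions made for illustration. The sketch counts adjacent symbol pairs and compares the entropy of the next symbol with its entropy given the preceding symbol; a lower conditional value indicates that context makes the next symbol more predictable.

```python
from collections import Counter
from math import log2


def bigram_counts(utterances):
    """Count adjacent symbol pairs within each utterance."""
    counts = Counter()
    for symbols in utterances:
        counts.update(zip(symbols, symbols[1:]))
    return counts


def entropy(counts):
    """Shannon entropy (in bits) of the distribution implied by the counts."""
    total = sum(counts.values())
    return -sum(c / total * log2(c / total) for c in counts.values())


def marginal(bigrams, position):
    """Sum bigram counts over one slot (0 = previous symbol, 1 = next symbol)."""
    totals = Counter()
    for pair, c in bigrams.items():
        totals[pair[position]] += c
    return totals


def conditional_entropy(bigrams):
    """H(next | previous) = H(previous, next) - H(previous)."""
    return entropy(bigrams) - entropy(marginal(bigrams, 0))


# Toy input invented for illustration; not taken from the corpus.
utterances = [list("mama"), list("papa"), list("baba"), list("dada")]

bigrams = bigram_counts(utterances)
h_next = entropy(marginal(bigrams, 1))
h_next_given_prev = conditional_entropy(bigrams)

print(f"H(next)            = {h_next:.3f} bits")
print(f"H(next | previous) = {h_next_given_prev:.3f} bits")
```

The same comparison could in principle be run over distinctive feature values rather than whole symbols, by first mapping each symbol to a feature set; that mapping is left out here because the abstract does not specify it.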

In addition to suggesting a progression for the initial stages in the acquisition of the phonology of the Russian language, this dissertation contributes to the discussion of the degree to which human language is innate. While the innateness hypothesis is consistent with the data, the results suggest that it may be too strong. If the phonological features were inborn rather than acquired, humans would have a specialized mechanism enabling their perception and production with a high degree of consistency across different languages. We would be predisposed to recognize them in any phonetic environment, which means that their distribution in the speech signal could be quite random. From an empirical perspective, in which the distinctive features are learned from the input, possibly through association, an observed pattern is necessary.

Indexing (document details)
Advisor: Paolillo, John C.; Gasser, Michael E.
Committee: Cavar, Damir; Jones, Michael N.
School: Indiana University
Department: Computer Sciences
School Location: United States -- Indiana
Source: DAI-B 72/03, Dissertation Abstracts International
Subjects: Linguistics, Artificial intelligence, Computer science
Keywords: Computational phonology, Corpus statistics, Hierarchical clustering, Language acquisition, Pattern recognition
Publication Number: 3439587
ISBN: 978-1-124-44797-1