This dissertation investigates how distributional information in the speech stream can assist infants in the initial stages of acquiring the phonology of their native language. An exploratory statistical analysis derives this information from the adult speech in a corpus of Russian conversations between adults and young children. Because the data were recorded in traditional Russian orthography, the statistical analysis was performed on symbolic data. The International Phonetic Alphabet ensured a uniform representation of the speech sounds and enabled the interpretation of the symbols as unique sets of distinctive feature values.
Although the mechanism of information extraction employed in this work may vary with the representation, the objective of the study was to demonstrate that linguistically relevant information available to infants in the speech signal is not limited to the surface form, which they observe directly. The regularities in the distribution of the bigrams in the data reveal considerable predictability of occurrence in those characteristics of the speech sounds that information theory treats as unpredictable. Co-occurrence of distinctive features prominent in Russian provides a perceptually salient context for their identification and learning.
In addition to suggesting a progression for the initial stages in the acquisition of the phonology of the Russian language, this dissertation contributes to the discussion of the degree to which human language is innate. While the innateness hypothesis is consistent with the data, the results suggest that it may be too strong. If the phonological features were inborn rather than acquired, humans would have a specialized mechanism enabling their perception and production with a high degree of consistency across different languages. We would be predisposed to recognize them in any phonetic environment, which means that their distribution in the speech signal could be quite random. An observed pattern is necessary from an empirical perspective, in which the distinctive features are learned from the input, possibly through association.
|Advisor:||Paolillo, John C.; Gasser, Michael E.|
|Committee:||Cavar, Damir; Jones, Michael N.|
|School Location:||United States -- Indiana|
|Source:||DAI-B 72/03, Dissertation Abstracts International|
|Subjects:||Linguistics, Artificial intelligence, Computer science|
|Keywords:||Computational phonology, Corpus statistics, Hierarchical clustering, Language acquisition, Pattern recognition|
Copyright in each Dissertation and Thesis is retained by the author. All Rights Reserved