Dissertation/Thesis Abstract

Data-Driven Crosslinguistic Modeling of Constituent Ordering Preferences
by Liu, Zoey (Ying), Ph.D., University of California, Davis, 2020, 193; 28091868
Abstract (Summary)

Why are languages the way they are? In this dissertation, I take up this question with a focus on crosslinguistic constituent orderings. Specifically, borrowing insights from language processing and language evolution, I ask what abstract constraints and idiosyncratic biases govern language users’ choice among grammatical alternatives of the same syntactic construction across genres and languages. Adopting a data-driven approach, I explore three directions. First, in Chapters 3 through 6, I use large-scale multilingual corpora to investigate and quantify the roles that numerous factors play in word order preferences; these factors are motivated by long-standing linguistic theories as well as previous empirical findings. I show that while the effect of individual factors depends on the ordering structures of different languages, the predictive power and direction of these constraints generally depend more on whether the orderings are in the preverbal or the postverbal domain. Second, beyond these abstract constraints, which yield probabilistic typological tendencies, in Chapter 7 I ask why language users have idiosyncratic ordering preferences and how regularization of this idiosyncrasy arises diachronically, using Bayesian iterated learning models that simulate the process of language change. Lastly, I adopt the theoretical framework of dependency syntax to develop a dependency treebank for Hupa, an endangered Dene language of northwestern California, as a way to formalize and model the syntax of indigenous languages.

Indexing (document details)
Advisor: Sagae, Kenji
Committee: Hawkins, John A., Morgan, Emily
School: University of California, Davis
Department: Linguistics
School Location: United States -- California
Source: DAI-A 82/5(E), Dissertation Abstracts International
Source Type: DISSERTATION
Subjects: Linguistics
Keywords: Computational methods, Multilingual corpora, Syntactic typology
Publication Number: 28091868
ISBN: 9798698524397
Copyright © 2021 ProQuest LLC. All rights reserved.