The explosion of data has made it crucial to analyze data and distill important information effectively and efficiently. A significant part of this data is presented in unstructured, free-text documents. This has prompted the development of information extraction techniques that allow computers to automatically extract structured information from natural free-text data. Information extraction is a branch of natural language processing in artificial intelligence with a wide range of applications, including question answering, knowledge base population, and information retrieval. The traditional approach to information extraction has mainly involved hand-designing large feature sets (feature engineering) for different information extraction problems, i.e., entity mention detection, relation extraction, coreference resolution, event extraction, and entity linking. This approach is limited by the laborious and expensive effort that feature engineering requires for each new domain, and it suffers from the unseen word/feature problem of natural languages.
This dissertation explores a different approach to information extraction that uses deep learning to automate the representation learning process and generate more effective features. Deep learning is a subfield of machine learning that uses multiple layers of connections to reveal the underlying representations of data. I develop fundamental deep learning models for information extraction problems and demonstrate their benefits through systematic experiments.
First, I examine word embeddings, a general word representation produced by training a deep learning model on a large unlabelled dataset. I introduce methods that use word embeddings to obtain new features that generalize well across domains for relation extraction. This is done for both feature-based and kernel-based methods of relation extraction.
Second, I investigate deep learning models for different problems, including entity mention detection, relation extraction, and event detection. I develop new mechanisms and network architectures that allow deep learning to model the structures of information extraction problems more effectively. Extensive experiments in domain adaptation and transfer learning settings highlight the generalization advantage of the deep learning models for information extraction.
Finally, I investigate joint frameworks that simultaneously solve several information extraction problems and benefit from the inter-dependencies among these problems. I design a novel memory-augmented network for deep learning to properly exploit such inter-dependencies. I demonstrate the effectiveness of this network on two important problems of information extraction, i.e., event extraction and entity linking.
|Committee:||Cho, Kyunghyun, Davis, Ernest, Florian, Radu, LeCun, Yann, Meyers, Adam|
|School:||New York University|
|School Location:||United States -- New York|
|Source:||DAI-B 79/08(E), Dissertation Abstracts International|
|Subjects:||Information science, Artificial intelligence, Computer science|
|Keywords:||Deep learning, Domain adaptation, Entity linking, Event extraction, Information extraction, Relation extraction|
Copyright in each Dissertation and Thesis is retained by the author. All Rights Reserved