The rate at which humanity produces visual data, from both large-scale scientific imaging and consumer photography, has accelerated greatly over the past decade. This thesis is motivated by the hypothesis that this trend will necessarily change the face of observational science and the humanities, requiring the development of automated methods capable of distilling vast image collections to produce meaningful analyses. Such methods are needed to empower novel science both by improving throughput in traditionally quantitative disciplines and by developing new techniques to study culture through large-scale image datasets.
When computer vision or machine learning in general is leveraged to aid academic inquiry, it is important to consider the impact of erroneous solutions produced by implicit ambiguity or model approximations. To that end, we argue for the importance of algorithms that are capable of generating multiple solutions and producing measures of confidence. In addition to providing solutions to a number of multi-disciplinary problems, this thesis develops techniques to address these overarching themes of confidence estimation and solution diversity.
This thesis investigates a diverse set of problems across a broad range of fields, including glaciology, developmental psychology, architectural history, and demography, developing and adapting computer vision algorithms for these domain-specific applications. We begin by proposing vision techniques for automatically analyzing aerial radar imagery of polar ice sheets while simultaneously providing glaciologists with point-wise estimates of solution confidence. We then move to psychology, introducing novel recognition techniques that produce robust hand localizations and segmentations in egocentric video, giving psychologists who study child development automated annotations of the grasping behaviors integral to learning. We then investigate novel large-scale analysis for architectural history, leveraging tens of thousands of publicly available images to identify and track distinctive architectural elements. Finally, we show how rich estimates of demographic and geographic properties can be predicted from a single photograph.
|Advisor:||Crandall, David J.|
|Committee:||Huang, Chunfeng, Radivojac, Predrag, Ryoo, Michael|
|School Location:||United States -- Indiana|
|Source:||DAI-B 78/03(E), Dissertation Abstracts International|
|Subjects:||Artificial intelligence, Computer science|
|Keywords:||Computer vision, Convolutional neural networks, Probabilistic graphical models|
Copyright in each Dissertation and Thesis is retained by the author. All Rights Reserved