Dissertation/Thesis Abstract

Data-driven computer vision for science and the humanities
by Lee, Stefan, Ph.D., Indiana University, 2016, 130; 10153534
Abstract (Summary)

The rate at which humanity is producing visual data from both large-scale scientific imaging and consumer photography has been greatly accelerating in the past decade. This thesis is motivated by the hypothesis that this trend will necessarily change the face of observational science and the humanities, requiring the development of automated methods capable of distilling vast image collections to produce meaningful analyses. Such methods are needed to empower novel science both by improving throughput in traditionally quantitative disciplines and by developing new techniques to study culture through large scale image datasets.

When computer vision or machine learning in general is leveraged to aid academic inquiry, it is important to consider the impact of erroneous solutions produced by implicit ambiguity or model approximations. To that end, we argue for the importance of algorithms that are capable of generating multiple solutions and producing measures of confidence. In addition to providing solutions to a number of multi-disciplinary problems, this thesis develops techniques to address these overarching themes of confidence estimation and solution diversity.

This thesis investigates a diverse set of problems across a broad range of studies including glaciology, developmental psychology, architectural history, and demography to develop and adapt computer vision algorithms to solve these domain-specific applications. We begin by proposing vision techniques for automatically analyzing aerial radar imagery of polar ice sheets while simultaneously providing glaciologists with point-wise estimates of solution confidence. We then move to psychology, introducing novel recognition techniques to produce robust hand localizations and segmentations in egocentric video to empower psychologists studying child development with automated annotations of grasping behaviors integral to learning. We then investigate novel large-scale analysis for architectural history, leveraging tens of thousands of publicly available images to identify and track distinctive architectural elements. Finally, we show how rich estimates of demographic and geographic properties can be predicted from a single photograph.

Indexing (document details)
Advisor: Crandall, David J.
Commitee: Huang, Chunfeng, Radivojac, Predrag, Ryoo, Michael
School: Indiana University
Department: Computer Sciences
School Location: United States -- Indiana
Source: DAI-B 78/03(E), Dissertation Abstracts International
Subjects: Artificial intelligence, Computer science
Keywords: Computer vision, Convolutional neural networks, Probabilistic graphical models
Publication Number: 10153534
ISBN: 978-1-369-08669-0
Copyright © 2020 ProQuest LLC. All rights reserved. Terms and Conditions Privacy Policy Cookie Policy