How does the visual system determine when changes to an image are unnatural (image distortions), how does it weight different types of distortions, and where are these computations carried out in the brain? These questions have occupied neuroscientists, psychologists, and engineers alike for several decades. Different academic communities have approached the problem from different directions, with varying degrees of success, but all agree that the answer would be valuable: a model that accurately captures human sensitivity to image distortions can stand in for human observers when optimizing any algorithm for which fidelity to human perception matters (e.g., image and video compression).
In this thesis, we approach the problem by building models informed and constrained by both visual physiology and the statistics of natural images, and training them to match human psychophysical judgments about image distortions. We then develop a novel synthesis method that forces the models to make testable predictions, and we quantify the quality of those predictions with human psychophysics. Because our approach links physiology and perception, it allows us to pinpoint which elements of physiology are necessary to capture human sensitivity to image distortions. We consider several models of the visual system, some developed from known neural physiology and some inspired by recent breakthroughs in artificial intelligence (deep neural networks trained to recognize objects in images at human performance levels). We show that models inspired by early visual areas (retina and LGN) consistently capture human sensitivity to image distortions better than both the state of the art and competing models of the visual system. We argue that divisive normalization, a computation found throughout the visual system, is integral to correctly capturing human sensitivity.
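To make the divisive normalization computation concrete, the sketch below shows one common spatial form: each linear filter response is divided by the pooled activity of its neighbors. This is a minimal illustration only; the pooling neighborhood, the semisaturation constant, and the parameter values are assumptions, not the specific model used in the thesis.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def divisive_normalization(responses, sigma=0.1, size=5):
    """Divide each linear filter response by pooled local activity.

    responses : 2D array of linear filter outputs (e.g., a center-surround
                filtered image). sigma and size are illustrative defaults.
    """
    # Pool squared responses over a local spatial neighborhood.
    pooled = uniform_filter(responses ** 2, size=size)
    # Divisive normalization: scale each response by the local pooled energy.
    return responses / np.sqrt(sigma ** 2 + pooled)
```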
After establishing that our models of the retina and LGN outperform all other tested models, we develop a novel framework for optimally rendering images on any display for human observers. We show that a model of this kind can serve as a stand-in for human observers within this optimization framework and produces images that are better than those of other state-of-the-art algorithms, whereas the other tested models fail as stand-ins for human observers within the same framework.
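The general idea of using a perceptual model as a stand-in for human observers in rendering can be sketched as an image optimization: adjust the displayed image so that the model's representation of it matches the model's representation of the reference, subject to display constraints. The code below is a schematic of that idea, assuming a differentiable `model` callable, a squared-error objective, and a simple [0, 1] intensity constraint; none of these specifics are taken from the thesis.

```python
import torch

def render_for_display(reference, model, steps=200, lr=0.01):
    """Optimize a displayed image so the model's representation of it
    matches the model's representation of the reference image.

    reference : torch tensor holding the reference image.
    model     : callable perceptual front-end (e.g., an LGN-like model).
    """
    rendered = reference.clone().requires_grad_(True)
    target = model(reference).detach()
    opt = torch.optim.Adam([rendered], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        # Match model responses to the rendered and reference images.
        loss = torch.sum((model(rendered) - target) ** 2)
        loss.backward()
        opt.step()
        # Enforce a simple display constraint (intensities in [0, 1]).
        with torch.no_grad():
            rendered.clamp_(0.0, 1.0)
    return rendered.detach()
```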
Finally, we propose and test a normative framework for thinking about human sensitivity to image distortions. In this framework, we hypothesize that the human visual system decomposes image changes into structural changes (those that alter the identity of objects and scenes) and non-structural changes (those that preserve object and scene identity), and weights the two categories differently. We measure human sensitivity to distortions in each category and use these data to identify weaknesses of our model that can be addressed in future work.
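In schematic form, the hypothesized weighting amounts to combining distances measured along the two kinds of change with different weights. The decomposition functions and weights below are placeholders for illustration, not the thesis's model.

```python
def perceptual_distance(img_a, img_b, structural_dist, nonstructural_dist,
                        w_struct=1.0, w_nonstruct=0.2):
    """Combine structural (identity-changing) and non-structural
    (identity-preserving) distortion measures with different weights."""
    return (w_struct * structural_dist(img_a, img_b)
            + w_nonstruct * nonstructural_dist(img_a, img_b))
```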
Advisor: Simoncelli, Eero P.; Kiani, Roozbeh
Committee: Movshon, J. Anthony; Paninski, Liam; Winawer, Jonathan
School: New York University
School Location: United States -- New York
Source: DAI-B 79/12(E), Dissertation Abstracts International
Subjects: Neurosciences; Applied Mathematics; Computer Science
Keywords: Fisher information, Image rendering, Neural networks, Normalization, Perceptual quality, Vision