Dissertation/Thesis Abstract

Towards a Perceptual Distance Metric for Audio
by Oh, Sarah, M.S., Dartmouth College, 2019, 77; 13886580
Abstract (Summary)

The question “What makes sensory stimuli seem alike or different?” is of fundamental importance to our understanding of the information processing underlying perception and cognition. Although perceptual (dis)similarity seems akin to distance, measuring the Euclidean distance between points in stimulus space is a poor estimator of subjective dissimilarity. Nonlinear response patterns, interactions between stimulus components, temporal effects, and top-down modulation transform the information contained in incoming stimuli in a way that seems to preserve some notion of distance, but not the one we are used to. This thesis proposes that transformations applied to stimuli during bottom-up stages of perception can be modeled as a function mapping points in stimulus space to their representations in perceptual space, inducing a Riemannian distance metric. A dataset was collected in a subjective listening experiment, the results of which were used to explore possible approaches to approximating the perceptual transformation.
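
As a rough illustration of what "inducing a Riemannian distance metric" can mean, the sketch below pulls the Euclidean metric of perceptual space back through a map f, giving G(x) = J(x)^T J(x), where J is the Jacobian of f at the stimulus x. The map `compress` is a hypothetical log-style compression chosen only for illustration; it is not the transformation estimated in the thesis, and the function names are assumptions.

```python
import numpy as np

def jacobian(f, x, eps=1e-6):
    """Central-difference Jacobian of f at x (rows: outputs, columns: inputs)."""
    x = np.asarray(x, dtype=float)
    cols = []
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = eps
        cols.append((f(x + step) - f(x - step)) / (2 * eps))
    return np.stack(cols, axis=1)

def pullback_metric(f, x):
    """Metric tensor G(x) = J^T J induced on stimulus space by the map f."""
    J = jacobian(f, x)
    return J.T @ J

def local_distance(f, x, dx):
    """First-order perceptual length of a small stimulus change dx at x."""
    G = pullback_metric(f, x)
    return float(np.sqrt(dx @ G @ dx))

# Hypothetical compressive map (illustration only, not the thesis's model).
def compress(x):
    return np.log1p(np.asarray(x, dtype=float))

quiet = np.array([0.1, 0.1])
loud = np.array([10.0, 10.0])
dx = np.array([0.01, 0.0])  # same physical change applied at both points

d_quiet = local_distance(compress, quiet, dx)
d_loud = local_distance(compress, loud, dx)
```

Under this assumed map, the same Euclidean step `dx` yields a larger perceptual length at the quiet point than at the loud one, which is the kind of position-dependent distance that a plain Euclidean measure in stimulus space cannot express.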

The first method is based on physiology and estimates the function transforming stimulus vectors into basilar membrane (BM) vibration vectors based on experimental mammalian BM response curves. The second method is data-driven, optimizing the weights of a single matrix using the subjective listening experiment as training data. The third method combines the physiological footing of the first method with the second method’s ability to leverage actual perceived dissimilarities to improve performance. Each of the proposed measures achieved correlations with subjective ratings (r ⪆ 0.8) comparable to or stronger than those of state-of-the-art objective audio quality measures.
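
One way to picture the data-driven idea is sketched below: fit a single symmetric matrix M so that the quadratic form d^T M d approximates squared dissimilarity ratings for stimulus differences d, then report the Pearson correlation r between predicted and rated dissimilarity. This is a swapped-in least-squares fit on synthetic data, not the optimization procedure or dataset used in the thesis; `fit_metric_matrix`, `predict_distance`, and all data here are assumptions made for illustration.

```python
import numpy as np

def fit_metric_matrix(diffs, ratings):
    """Least-squares fit of symmetric M so that d^T M d ~ rating^2."""
    n, dim = diffs.shape
    # Each row holds the flattened outer product d d^T for one stimulus pair.
    feats = np.einsum("ni,nj->nij", diffs, diffs).reshape(n, -1)
    m, *_ = np.linalg.lstsq(feats, ratings ** 2, rcond=None)
    M = m.reshape(dim, dim)
    return (M + M.T) / 2  # symmetrizing leaves the quadratic form unchanged

def predict_distance(M, diffs):
    """Predicted dissimilarity sqrt(d^T M d), clipped at zero."""
    sq = np.einsum("ni,ij,nj->n", diffs, M, diffs)
    return np.sqrt(np.clip(sq, 0.0, None))

# Synthetic stand-in for listening-test data (assumption, not real ratings).
rng = np.random.default_rng(0)
dim, n_pairs = 4, 200
W_true = rng.normal(size=(dim, dim))
diffs = rng.normal(size=(n_pairs, dim))
ratings = np.linalg.norm(diffs @ W_true.T, axis=1) + 0.05 * rng.normal(size=n_pairs)

M = fit_metric_matrix(diffs, ratings)
pred = predict_distance(M, diffs)
r = np.corrcoef(pred, ratings)[0, 1]  # Pearson correlation with the ratings
```

The correlation here is computed on the same pairs used for fitting, so it says nothing about generalization; a comparison against subjective ratings would need held-out pairs.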

Indexing (document details)
Advisor: Ray, Laura; Granger, Richard
Committee: Hansen, Eric
School: Dartmouth College
Department: Engineering
School Location: United States -- New Hampshire
Source: MAI 81/2(E), Masters Abstracts International
Source Type: DISSERTATION
Subjects: Psychology, Computer science, Acoustics
Keywords: Differential geometry, Hearing, Machine learning, Perception, Psychoacoustics, Psychological space
Publication Number: 13886580
ISBN: 9781085606738
Copyright © 2019 ProQuest LLC. All rights reserved.