Dissertation/Thesis Abstract

Spatial acoustic signal processing for immersive communication
by Atkins, Joshua, Ph.D., The Johns Hopkins University, 2011, 201; 3496161
Abstract (Summary)

Computing is rapidly becoming ubiquitous as users expect devices that can augment and interact naturally with the world around them. These systems need an acoustic front-end that can capture and reproduce natural human communication. Whether the end point is a speech recognizer or another human listener, the reduction of noise, reverberation, and acoustic echoes is a necessary and complex challenge. The focus of this dissertation is to provide a general method for approaching these problems using spherical microphone and loudspeaker arrays.

In this work, a theory of capturing and reproducing three-dimensional acoustic fields is introduced from a signal processing perspective. In particular, the decomposition of the spatial part of the acoustic field into an orthogonal basis of spherical harmonics provides not only a general framework for analysis, but also many processing advantages. The spatial sampling error limits the upper frequency range over which a sound field can be accurately captured or reproduced. In broadband arrays, the cost and complexity of using multiple transducers are an issue. This work provides a flexible optimization method for determining the location of array elements to minimize the spatial aliasing error. The low-frequency array processing ability is also limited by the SNR, mismatch, and placement error of transducers. To address this, a robust processing method is introduced and used to design a reproduction system for rendering over arbitrary loudspeaker arrays or binaurally over headphones.
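
As a rough illustration of the spherical harmonic decomposition described above, the following sketch fits first-order real spherical harmonic coefficients to pressure samples on a sphere by least squares. It is not the dissertation's method: the function names, the six-element octahedral layout, and the coefficient values are all assumptions chosen for the example, and the radial (frequency-dependent) part of the field is ignored.

```python
import numpy as np

def real_sh_order1(dirs):
    """Real spherical harmonics up to order 1 at unit vectors dirs, shape (M, 3).

    Columns are ordered (Y_0^0, Y_1^-1, Y_1^0, Y_1^1), which are proportional
    to (1, y, z, x) on the unit sphere.
    """
    x, y, z = dirs[:, 0], dirs[:, 1], dirs[:, 2]
    c0 = 1.0 / np.sqrt(4.0 * np.pi)
    c1 = np.sqrt(3.0 / (4.0 * np.pi))
    return np.column_stack([c0 * np.ones_like(x), c1 * y, c1 * z, c1 * x])

def sh_coefficients(pressure, dirs):
    """Least-squares estimate of the order-1 coefficients from M samples."""
    Y = real_sh_order1(dirs)
    coeffs, *_ = np.linalg.lstsq(Y, pressure, rcond=None)
    return coeffs

# Hypothetical 6-element array at the vertices of an octahedron.
dirs = np.array([
    [1, 0, 0], [-1, 0, 0],
    [0, 1, 0], [0, -1, 0],
    [0, 0, 1], [0, 0, -1],
], dtype=float)

# Synthesize a field from known (made-up) coefficients, then recover them.
true_coeffs = np.array([1.0, 0.5, -0.25, 2.0])
pressure = real_sh_order1(dirs) @ true_coeffs
estimated = sh_coefficients(pressure, dirs)
```

With six sample points and four unknowns the fit is exact for a field that is truly first order. A field with higher-order content would alias into these coefficients, which is the spatial sampling error the abstract refers to.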

In addition to the beamforming problem, the multichannel acoustic echo cancellation (MCAEC) issue is also addressed. An MCAEC must adaptively estimate and track the constantly changing loudspeaker-room-microphone response to remove the sound field presented over the loudspeakers from that captured by the microphones. In the multichannel case, the system is overdetermined and many adaptive schemes fail to converge to the true impulse response. This forces the need to track both the near-end and far-end room responses. A transform domain method that mitigates this problem is derived and implemented. Results with a real system using a 16-channel loudspeaker array and 32-channel microphone array are presented.
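
To make the adaptive estimation step concrete, here is a minimal single-channel sketch using the standard normalized LMS (NLMS) algorithm. It is a swapped-in textbook technique, not the transform domain multichannel method derived in the dissertation. The function name, step size, and four-tap echo path are assumptions for illustration only.

```python
import numpy as np

def nlms_echo_canceller(far_end, mic, num_taps, step=0.5, eps=1e-8):
    """Estimate the echo path with NLMS and return (weights, residual)."""
    w = np.zeros(num_taps)        # current estimate of the echo path
    buf = np.zeros(num_taps)      # most recent far-end samples, newest first
    err = np.zeros(len(mic))
    for n in range(len(mic)):
        buf = np.roll(buf, 1)
        buf[0] = far_end[n]
        err[n] = mic[n] - w @ buf                     # echo-cancelled output
        w += step * err[n] * buf / (buf @ buf + eps)  # normalized update
    return w, err

# Simulated example: white far-end signal and a made-up short echo path.
rng = np.random.default_rng(0)
far_end = rng.standard_normal(5000)
room = np.array([0.6, -0.3, 0.1, 0.05])   # hypothetical impulse response
mic = np.convolve(far_end, room)[:len(far_end)]

w, err = nlms_echo_canceller(far_end, mic, num_taps=4)
```

In this single-channel, noise-free setting the weights converge to the simulated echo path. With several highly correlated loudspeaker channels the same update can reduce the residual without recovering the true responses, which is the convergence problem the abstract describes.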

Indexing (document details)
Advisor: West, James
Committee:
School: The Johns Hopkins University
School Location: United States -- Maryland
Source: DAI-B 73/05, Dissertation Abstracts International
Source Type: DISSERTATION
Subjects: Electrical engineering, Acoustics
Keywords: Acoustic signal processing, Beamforming, Echo cancellation, Immersive communication, Microphone arrays
Publication Number: 3496161
ISBN: 9781267150196