Computing is rapidly becoming ubiquitous as users expect devices that can augment and interact naturally with the world around them. Such systems need an acoustic front-end that can capture and reproduce natural human communication. Whether the end point is a speech recognizer or another human listener, noise, reverberation, and acoustic echoes must all be reduced, and each poses a complex challenge. This dissertation provides a general method for approaching these problems using spherical microphone and loudspeaker arrays.
In this work, a theory of capturing and reproducing three-dimensional acoustic fields is introduced from a signal processing perspective. In particular, decomposing the spatial part of the acoustic field into an orthogonal basis of spherical harmonics provides not only a general framework for analysis, but also many processing advantages. Spatial sampling error limits the upper frequency range over which a sound field can be accurately captured or reproduced. In broadband arrays, the cost and complexity of using multiple transducers are also an issue. This work provides a flexible optimization method for determining the location of array elements to minimize the spatial aliasing error. Low-frequency array processing is also limited by the SNR, mismatch, and placement error of the transducers. To address this, a robust processing method is introduced and used to design a reproduction system for rendering over arbitrary loudspeaker arrays or binaurally over headphones.
In addition to the beamforming problem, the multichannel acoustic echo cancellation (MCAEC) issue is also addressed. An MCAEC must adaptively estimate and track the constantly changing loudspeaker-room-microphone response to remove the sound field presented over the loudspeakers from that captured by the microphones. In the multichannel case, the system is overdetermined and many adaptive schemes fail to converge to the true impulse response. This makes it necessary to track both the near-end and far-end room responses. A transform domain method that mitigates this problem is derived and implemented. Results with a real system using a 16-channel loudspeaker array and 32-channel microphone array are presented.
|School:||The Johns Hopkins University|
|School Location:||United States -- Maryland|
|Source:||DAI-B 73/05, Dissertation Abstracts International|
|Subjects:||Electrical engineering, Acoustics|
|Keywords:||Acoustic signal processing, Beamforming, Echo cancellation, Immersive communication, Microphone arrays|
Copyright in each Dissertation and Thesis is retained by the author. All Rights Reserved