Dissertation/Thesis Abstract

Score-informed musical source separation and reconstruction
by Han, Yushen, Ph.D., Indiana University, 2013, 152; 3609061
Abstract (Summary)

We introduce a systematic approach for retrieving individual parts from a monaural music recording, given its score. We are interested in isolating the accompaniment part by removing the solo part from a recording of concerto music, in which a solo instrument is accompanied by an orchestra. We require the music audio, the score, and optionally a sample library of individual notes played in isolation. Our approach is based on explicit knowledge of the musical audio at the semantic level (notes or chords), obtained from an audio-score alignment. Such knowledge allows the spectrogram energy to be decomposed into note-based models that can be trained with the sample library. Our approach is divided into two stages: (1) "masking," which estimates a solo mask used to remove the solo, and (2) "reconstruction," which imputes the harmonics of the orchestra notes that are inevitably damaged in masking.
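As a rough illustration of the masking stage, the sketch below builds a binary mask over a toy spectrogram and zeroes the cells labelled as solo. It is not the method of the dissertation: the per-cell comparison rule, the function names binary_solo_mask and remove_solo, and the toy numbers are all assumptions made for illustration, and the solo and accompaniment estimates are simply given rather than derived from note-based models.

```python
def binary_solo_mask(solo_est, accomp_est):
    """Label a cell 1 (solo) where the solo estimate exceeds the accompaniment estimate.

    Both inputs are lists of frames, each frame a list of magnitudes per frequency bin.
    The comparison rule is an assumed stand-in for a model-based estimate.
    """
    return [[1 if s > a else 0 for s, a in zip(s_row, a_row)]
            for s_row, a_row in zip(solo_est, accomp_est)]


def remove_solo(mixture, mask):
    """Zero every time-frequency cell that the mask labels as solo."""
    return [[0.0 if m else x for x, m in zip(x_row, m_row)]
            for x_row, m_row in zip(mixture, mask)]


# Toy data: 2 time frames x 3 frequency bins (hypothetical values).
solo_est = [[0.9, 0.1, 0.0], [0.2, 0.8, 0.1]]
accomp_est = [[0.1, 0.5, 0.3], [0.6, 0.1, 0.1]]
mixture = [[1.0, 0.6, 0.3], [0.8, 0.9, 0.2]]

mask = binary_solo_mask(solo_est, accomp_est)
accompaniment = remove_solo(mixture, mask)
```

Any accompaniment energy that shares a cell with the solo is lost when that cell is zeroed, which is the damage the reconstruction stage is meant to repair.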

In "masking," we estimate a 2-dimensional binary mask that classifies each time-frequency cell of the short-time Fourier transform (STFT) spectrogram as either solo or accompaniment. We mainly employ an Expectation Maximization (EM) algorithm to decompose the spectrogram magnitude into note-based models. Applying the mask "erases" the soloist’s contribution to the mixture, but it also degrades the remaining orchestra. In "reconstruction," we propose a novel technique to repair this degradation. We use a state-space model for each note partial, which is represented by a slowly changing amplitude envelope and an "unwrapped" phase sequence. This amplitude-phase representation can be computed by Kalman smoothing. It allows us to "transpose" intact partials of the orchestra part onto the degraded time-frequency region. Objective metrics and subjective listening tests on real and synthesized musical audio data are used for evaluation and parameter optimization.
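To make the Kalman smoothing step more concrete, here is a minimal sketch of a scalar Kalman filter followed by a Rauch-Tung-Striebel backward pass, applied to a noisy amplitude envelope. The random-walk state model, the noise variances q and r, the initial values, and the function name kalman_smooth are assumptions chosen for illustration; the dissertation's state-space model also tracks an unwrapped phase sequence, which this sketch leaves out.

```python
def kalman_smooth(y, q, r, x0=0.0, p0=1.0):
    """Smooth observations y under an assumed random-walk model.

    State:        x[t] = x[t-1] + w,  w ~ N(0, q)
    Observation:  y[t] = x[t] + v,    v ~ N(0, r)
    Returns the smoothed state means.
    """
    n = len(y)
    xp, pp = [0.0] * n, [0.0] * n   # predicted mean and variance
    xf, pf = [0.0] * n, [0.0] * n   # filtered mean and variance
    x, p = x0, p0
    for t in range(n):
        # Predict: a random walk keeps the mean and inflates the variance.
        xp[t], pp[t] = x, p + q
        # Update: blend the prediction with the new observation.
        k = pp[t] / (pp[t] + r)
        x = xp[t] + k * (y[t] - xp[t])
        p = (1.0 - k) * pp[t]
        xf[t], pf[t] = x, p
    # Backward (Rauch-Tung-Striebel) pass over the filtered estimates.
    xs = xf[:]
    for t in range(n - 2, -1, -1):
        g = pf[t] / pp[t + 1]
        xs[t] = xf[t] + g * (xs[t + 1] - xp[t + 1])
    return xs


# Hypothetical noisy amplitude envelope of one partial.
noisy = [1.0, 1.4, 0.7, 1.2, 0.9, 1.1]
smooth = kalman_smooth(noisy, q=0.01, r=0.5, x0=1.0, p0=1.0)
```

A small q relative to r encodes the assumption that the envelope changes slowly, so the smoother attributes most frame-to-frame fluctuation to observation noise.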

Indexing (document details)
Advisor: Raphael, Christopher
Committee: Crandall, David; Myers, Steven; Trosset, Michael
School: Indiana University
Department: Informatics
School Location: United States -- Indiana
Source: DAI-A 75/04(E), Dissertation Abstracts International
Subjects: Music, Statistics, Information science
Keywords: Acoustics and statistics, Expectation maximization, Kalman smoothing, Music informatics, Phase estimation, Source separation
Publication Number: 3609061
ISBN: 9781303677601
Copyright © 2019 ProQuest LLC. All rights reserved.