Guitar audio transcription is the process of generating a human-interpretable musical score from guitar audio. The musical score is presented as guitar tablature, which indicates not only what notes are played, but where they are played on the guitar fretboard. Automatic transcription remains a challenge when dealing with polyphonic sounds. The guitar adds further ambiguity to the transcription problem because the same note can often be played in many ways.
In this thesis work, a portable software architecture is presented for processing guitar audio in real time and providing a set of highly probable transcription solutions. Novel algorithms for performing polyphonic pitch detection and generating confidence values for transcription solutions (by which they are ranked) are also presented. Transcription solutions are generated for individual signal windows based on the output of the polyphonic pitch detection algorithm. Confidence values are generated for solutions by analyzing signal properties, fingering difficulty, and proximity to previous highest confidence solutions. The rules used for generating confidence values are based on expert knowledge of the instrument.
Performance is measured in terms of algorithm accuracy, latency, and throughput. The correct result is ranked 2.08 (with the top rank being 0) for chords. The general case of various notes over time presents results that require qualitative analysis; the system in general is very susceptible to noise and has a difficult time distinguishing harmonics from actual fundamentals. By allowing the user to seed the system with a ground truth, correct recognition of future states is improved significantly in some cases. The sampling time is 250 ms with an average processing time of 110 ms, giving an average total latency of 360 ms. Throughput is 62.5 sample windows per second. Performance is not processor-bound, enabling high performance on a wide variety of personal computers.
|Commitee:||Czernikowski, Roy, Yang, Shanchieh J.|
|School:||Rochester Institute of Technology|
|School Location:||United States -- New York|
|Source:||MAI 49/06M, Masters Abstracts International|
|Subjects:||Computer Engineering, Electrical engineering, Computer science|
|Keywords:||Digital signal processing, Human computer interaction, Music transcription, Real-time|
Copyright in each Dissertation and Thesis is retained by the author. All Rights Reserved
The supplemental file or files you are about to download were provided to ProQuest by the author as part of a
dissertation or thesis. The supplemental files are provided "AS IS" without warranty. ProQuest is not responsible for the
content, format or impact on the supplemental file(s) on our system. in some cases, the file type may be unknown or
may be a .exe file. We recommend caution as you open such files.
Copyright of the original materials contained in the supplemental file is retained by the author and your access to the
supplemental files is subject to the ProQuest Terms and Conditions of use.
Depending on the size of the file(s) you are downloading, the system may take some time to download them. Please be