Catalyzed by the invention of magnetic tape recording, audio production has transformed from a technical practice into an artistic one, and the roles of producer, engineer, composer, and performer have merged for many forms of music. However, while these roles have changed, the way we interact with audio-production tools has not: it still relies on the conventions established in the 1970s for audio engineers. Users communicate their audio concepts to these complex tools using knobs and sliders that control low-level technical parameters. Musicians currently need technical knowledge of signals in addition to their musical knowledge to make novel music. However, many experienced and casual musicians simply do not have the time or desire to acquire this technical knowledge. While simpler tools (e.g., Apple's GarageBand) exist, they are limiting and frustrating to users.
To support these audio-production novices, we must build audio-production tools with affordances for them. We must identify interactions that enable novices to communicate their audio concepts without requiring technical knowledge and develop systems that can understand these interactions.
This dissertation advances our understanding of this problem by investigating three interaction types that are inspired by how novices communicate audio concepts to other people: language, vocal imitation, and evaluation. Because learning from an individual can be time-consuming for a user, much of this dissertation focuses on how we can learn general audio concepts offline using crowdsourcing rather than user-specific audio concepts. This work introduces algorithms, frameworks, and software for learning audio concepts via these interactions and investigates the strengths and weaknesses of both the algorithms and the interaction types. These contributions provide a research foundation for a new generation of audio-production tools.
This problem is not limited to audio-production tools. Other media production tools, such as software for graphics, image, and video design and editing, are also controlled by low-level technical parameters that require technical knowledge and experience to use effectively. The contributions in this dissertation for learning mappings from descriptive language and feedback to low-level control parameters may also be adapted for creative production tools in these other media. The contributions in this dissertation can unlock the creativity trapped in everyone who has the desire to make music and other media but does not have the technical skills required for today's tools.
|Committee:||Downey, Doug, Gergle, Darren|
|School Location:||United States -- Illinois|
|Source:||DAI-B 78/05(E), Dissertation Abstracts International|
|Subjects:||Music, Computer science|
|Keywords:||Audio production, Creativity support tools, Crowdsourcing, Human computer interaction, Intelligent user interfaces, Music|
Copyright in each Dissertation and Thesis is retained by the author. All Rights Reserved