Audio production interfaces that learn from user interaction

Picture of the SynthAssist user interface

Bryan Pardo, Prem Seetharaman, Bongjun Kim, Mark Cartwright

Building Audio Interfaces with Crowdsourced Concept Maps and Active Transfer Learning

This work was supported by NSF Award 1116384.

We use metaphors and techniques familiar to musicians to produce customizable environments for music creation, with a focus on bridging the gap between the intentions of both amateur and professional musicians and the audio manipulation tools available through software.

Rather than force nonintuitive interactions or remove control altogether, we reframe the controls to fit the interaction paradigms identified by research on how audio engineers and musicians communicate auditory concepts to each other. Click on the examples below to learn how we built interfaces based on the following interaction paradigms:

Videos

An Evaluative Interface (e.g., “I like the first equalization setting better than the second one.”)
Programming a synthesizer with vocal imitation (e.g., vocally making a “whoosh” sound to illustrate a desired synth patch)
Mixing Music through Exploration (e.g., “What other mixes are like this, but different?”)

Downloads and Demos

The Audealize web demo (e.g., “Make the reverb sound ‘watery’”)
The Audealize reverberation and EQ plugin for macOS

Related publications

[pdf] B. Pardo, M. Cartwright, P. Seetharaman, and B. Kim, “Learning to Build Natural Audio Production Interfaces,” Arts, vol. 8, no. 3, 2019.

[pdf] M. Donovan, P. Seetharaman, and B. Pardo, “A Web Audio Node for the Fast Creation of Natural Language Interfaces for Audio Production,” 2017.

[pdf] P. Seetharaman and B. Pardo, “Audealize: Crowdsourced audio production tools,” Journal of the Audio Engineering Society, vol. 64, no. 9, pp. 683–695, 2016.

[pdf] M. Cartwright and B. Pardo, “SynthAssist: an audio synthesizer programmed with vocal imitation,” in Proceedings of the 22nd ACM international conference on Multimedia, 2014, pp. 741–742.

[pdf] B. Kim and B. Pardo, “Speeding learning of personalized audio equalization,” in 2014 13th International Conference on Machine Learning and Applications, 2014, pp. 495–499.

[pdf] A. T. Sabin, Z. Rafii, and B. Pardo, “Weighted-function-based rapid mapping of descriptors to audio processing parameters,” Journal of the Audio Engineering Society, vol. 59, no. 6, pp. 419–430, 2011.
