
Projects


Interfaces

Improving audio production tools meaningfully enhances the creative output of musicians, podcasters, producers and videographers. We focus on bridging the gap between the intentions of creators and the interfaces of the audio recording and manipulation tools they use. Our work in this area has a strong human-centered machine learning component. Representative projects in this area are below. For further publications in this area, see our publications page.

  • Man with hands over his eyes

    Eyes Free Audio Production

    This project focuses on building novel accessible tools for creating audio-based content like music or podcasts. The tools should support the needs of blind creators, whether working independently or on teams with sighted collaborators.

  • Picture of the SynthAssist user interface

    Audio production interfaces that learn from user interaction

    We use metaphors and techniques familiar to musicians to produce customizable environments for music creation, with a focus on bridging the gap between the intentions of both amateur and professional musicians and the audio manipulation tools available through software.

Content-addressable search through large collections of audio files (thousands) or lengthy audio files (hours) is an ongoing research area. In this work, we develop and apply cutting-edge techniques in machine learning, signal processing and interface design. This is part of a collaboration with the University of Rochester AIR lab and is supported by the National Science Foundation. Representative recent projects in this area are below. For further publications in this area, see our publications page.

  • ISED logo

    ISED

    Interactive Sound Event Detector (I-SED) is a human-in-the-loop interface for sound event annotation that helps users quickly label sound events of interest within a lengthy recording. The annotation is performed collaboratively by a user and a machine.

  • Voogle logo

    Voogle

    Voogle is an audio search engine that lets users search a database of sounds by vocally imitating or providing an example of the sound they are searching for.


Separation

Audio source separation is the process of extracting a single sound (e.g. one violin) from a mixture of sounds (e.g. a string quartet). This is an ongoing research area in the lab. Source separation is the audio analog of scene segmentation in computer vision and is a foundational technology that improves or enables speech recognition, sound object labeling, music transcription, hearing aids and other technologies. For further publications in this area, see our publications page.