Headed by Prof. Bryan Pardo, the Interactive Audio Lab is in the Computer Science Department of Northwestern University. We develop new methods in Machine Learning, Signal Processing and Human-Computer Interaction to build new tools for understanding and manipulating sound.
Ongoing research in the lab is applied to audio scene labeling, audio source separation, inclusive interfaces, new audio production tools and machine audition models that learn without supervision. For more, see our projects page.
$100K grant from Sony to fund speech generation
Oct 1, 2022
New NeurIPS paper
Sep 14, 2022
$1.8 million Future of Work award from NSF
Sep 12, 2022
Music Separation Enhancement with Generative Modeling accepted to ISMIR 2022
Aug 1, 2022
Erika Rumbold defends thesis
Aug 1, 2022
Ethan Manilow defends dissertation
Jul 6, 2022
Our tech inside Adobe's new AI-powered audio editor
Jan 1, 2022
Bose releases hearing aids using our tech
Dec 1, 2021
Best Paper at ISMIR 2021
Nov 12, 2021
We host the Bay Innovative Signal Hackers Bash
Oct 16, 2021
Best student paper award in DCASE 2020
Nov 1, 2020
Deep Learning Tools for Audacity
Hugo Flores Garcia, Aldo Aguilar, Ethan Manilow, Dmitry Vedenko and Bryan Pardo
We provide a software framework that lets deep learning practitioners easily integrate their own PyTorch models into the open-source Audacity DAW. This lets ML audio researchers put tools in the hands of sound artists without doing DAW-specific development work.
Controllable Speech Generation
Max Morrison and Bryan Pardo
Nuances in speech prosody (i.e., the pitch, timing, and loudness of speech) are a vital part of how we communicate. We use generative machine learning models to generate prosody that users can control along these dimensions, and to generate speech that reflects the user-specified prosody.
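As a rough illustration of the three prosody dimensions named above, the sketch below measures pitch, timing, and loudness on a synthetic tone. It is not the generative model used in this project: the function names (`make_sine`, `rms_loudness_db`, `estimate_pitch_hz`, `prosody_summary`), the autocorrelation pitch estimator, and the 60-400 Hz search range are all assumptions chosen for the example.

```python
import math


def make_sine(freq_hz, sample_rate, duration_s, amplitude=0.5):
    """Synthesize a test tone standing in for a voiced speech frame."""
    n = int(sample_rate * duration_s)
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / sample_rate)
            for i in range(n)]


def rms_loudness_db(frame):
    """Loudness proxy: root-mean-square level in dB relative to full scale."""
    rms = math.sqrt(sum(x * x for x in frame) / len(frame))
    return 20 * math.log10(max(rms, 1e-12))  # floor avoids log(0) on silence


def estimate_pitch_hz(frame, sample_rate, fmin=60.0, fmax=400.0):
    """Pitch proxy: lag of the strongest autocorrelation peak in [fmin, fmax]."""
    min_lag = int(sample_rate / fmax)
    max_lag = min(int(sample_rate / fmin), len(frame) - 1)
    best_lag, best_score = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        score = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else 0.0


def prosody_summary(frame, sample_rate):
    """Collect the three prosody dimensions for one frame of audio."""
    return {
        "pitch_hz": estimate_pitch_hz(frame, sample_rate),
        "duration_s": len(frame) / sample_rate,  # timing
        "loudness_db": rms_loudness_db(frame),
    }


# Hypothetical input: a 200 Hz tone, 50 ms long, sampled at 8 kHz.
sample_rate = 8000
frame = make_sine(200.0, sample_rate, 0.05)
summary = prosody_summary(frame, sample_rate)
print(summary)
```

Real speech needs frame-by-frame analysis and a more robust pitch tracker than this; the sketch only shows what quantities a user would be controlling.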
Unsupervised Source Separation By Steering Pretrained Music Models
Ethan Manilow, Patrick O'Reilly, Prem Seetharaman and Bryan Pardo
We showcase an unsupervised method that repurposes deep models trained for music generation and music tagging for audio source separation, without any retraining.