Music Separation Enhancement with Generative Modeling

[Figure: spectrogram display of improvement from MSG]

Noah Schaffer, Boaz Cogan, Ethan Manilow, Max Morrison, Prem Seetharaman, Bryan Pardo

Audio examples

We introduce Make it Sound Good (MSG), a post-processor that enhances the output quality of source separation systems like Demucs, Wavenet, Spleeter, and OpenUnmix.


State-of-the-art music separation systems produce source estimates with significant perceptual shortcomings, such as added extraneous noise (e.g., Demucs, Wavenet) or missing harmonics (e.g., Spleeter, OpenUnmix). In this work, we propose a post-processing generative model (Make it Sound Good, or MSG) to enhance the output of music source separation systems. Both objective evaluation of artifacts and crowdsourced subjective listening studies show that the MSG post-processor reduces the frequency and severity of artifacts produced by many state-of-the-art separators.


N. Schaffer, B. Cogan, E. Manilow, M. Morrison, P. Seetharaman, and B. Pardo, “Music Separation Enhancement with Generative Modeling,” in ISMIR, 2022.