Predicting Algorithm Efficacy for Adaptive Multi-Cue Source Separation

Ethan Manilow, Prem Seetharaman, Fatemeh Pishdadian, Bryan Pardo

This page presents an audio example illustrating multi-cue source separation.

The original mixture is divided into two segments:

  1. In the first 10 seconds, the vocals are panned to the left ear and the background is centered.
  2. In the last 20 seconds, the vocals are panned to the center and overlap spatially with the background.

Here's the mixture audio.

In [87]:
from nussl import AudioSignal
from audio_embed import utilities

utilities.apply_style()
mix = AudioSignal('audio/mix.wav')  # the mixture described above
utilities.audio(mix)
In [91]:
from nussl import AlgorithmSwitcher, AudioSignal, RepetSim, Melodia, Projet

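# Background ('bg') and foreground ('fg') estimates, keyed by algorithm name.
separated = {'bg':{}, 'fg':{}}

def separate(mix, approach):
    if approach.__name__ == 'Projet':
        # Keep everything run() returns; the vocal estimate is read from index 1 later.
        s = approach(mix, num_sources=2, num_iterations=100)
        separated['projet'] = s.run()
    else:
        s = approach(mix)
        s.high_pass_cutoff = 0
        s.run()
        # make_audio_signals() returns the background and foreground estimates, in that order.
        separated['bg'][approach.__name__], separated['fg'][approach.__name__] = s.make_audio_signals()

approaches = [Melodia, RepetSim, Projet]

# Run every algorithm on the same mixture.
for a in approaches:
    separate(mix, a)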

Three algorithms are applied to the mixture, each relying on a different cue:

  • MELODIA* (melody tracking using pitch proximity)
  • REPET-SIM (repetition)
  • PROJET (spatialization)

* The published MELODIA is a predominant pitch tracker, not a source separation algorithm, but we use it to build a harmonic mask to separate out the vocals based on the pitch track.
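
The harmonic-mask idea in the note above can be sketched as follows. This is a minimal illustration, not the implementation used here: the function name `harmonic_mask`, its parameters (`num_harmonics`, `tolerance_hz`), and the choice of a binary mask with a fixed tolerance in Hz are all assumptions made for the example. Given one pitch value per frame, it marks the time-frequency bins that lie close to a harmonic of that pitch.

```python
import numpy as np

def harmonic_mask(pitch_hz, freqs_hz, num_harmonics=10, tolerance_hz=40.0):
    """Binary mask (frequency bins x frames) around harmonics of a pitch track.

    pitch_hz: one fundamental-frequency value per frame; values <= 0 mean unvoiced.
    freqs_hz: center frequency of each spectrogram bin.
    All names and defaults here are hypothetical, chosen for illustration.
    """
    pitch_hz = np.asarray(pitch_hz, dtype=float)
    freqs_hz = np.asarray(freqs_hz, dtype=float)
    mask = np.zeros((len(freqs_hz), len(pitch_hz)), dtype=bool)
    for t, f0 in enumerate(pitch_hz):
        if f0 <= 0:
            continue  # unvoiced frame: leave the whole column masked out
        harmonics = f0 * np.arange(1, num_harmonics + 1)
        # Distance from every frequency bin to every harmonic of this frame.
        dist = np.abs(freqs_hz[:, None] - harmonics[None, :])
        mask[:, t] = (dist <= tolerance_hz).any(axis=1)
    return mask

# Toy example: bins every 50 Hz, three frames (unvoiced, 200 Hz, 400 Hz).
freqs = np.linspace(0, 4000, 81)
pitch = np.array([0.0, 200.0, 400.0])
mask = harmonic_mask(pitch, freqs, num_harmonics=3, tolerance_hz=10.0)
```

Multiplying such a mask with the mixture spectrogram keeps the bins attributed to the sung melody; the complementary mask gives a background estimate.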

Here are the singing voice estimates output by each algorithm.

In [93]:
# Play the singing-voice (foreground) estimate from each algorithm.
print('PROJET estimate')
utilities.audio(separated['projet'][1])
print('REPET-SIM estimate')
utilities.audio(separated['fg']['RepetSim'])
print('MELODIA estimate')
utilities.audio(separated['fg']['Melodia'])
PROJET estimate
REPET-SIM estimate
MELODIA estimate

The PROJET estimate is very good for the first 10 seconds, when the vocals are spatially separated from the background. After that, its separation fails and the estimate contains no audio from 10 seconds to the end. The MELODIA and REPET-SIM estimates perform comparably to each other. Ideally, we would use PROJET for the first 10 seconds of this mixture and a combination of MELODIA and REPET-SIM for the last 20 seconds.

Our system picks the algorithm predicted to perform best in each second of the mixture and uses its output for that second.

Below is the output of our system, which predicts the source-to-distortion ratio (SDR) of each algorithm's output in 1-second chunks and, for each chunk, uses the algorithm with the highest predicted SDR. Our system correctly uses PROJET for the first 10 seconds of the mixture and then switches between REPET-SIM and MELODIA for the remainder, predicting them to have comparable SDR.
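
The per-chunk selection described above can be sketched in a few lines. This is an illustration under stated assumptions, not the `AlgorithmSwitcher` implementation: the function name `switch_by_predicted_sdr`, the array layouts, and the hard switch at chunk boundaries (no crossfade) are all hypothetical. The SDR predictions are taken as given; how they are computed is outside this sketch.

```python
import numpy as np

def switch_by_predicted_sdr(estimates, predicted_sdr, chunk_len):
    """Stitch one output signal from the best-predicted estimate in each chunk.

    estimates: array (num_algorithms, num_samples), one estimate per algorithm.
    predicted_sdr: array (num_chunks, num_algorithms) of predicted SDR values.
    chunk_len: samples per chunk (the sample rate, for 1-second chunks).
    All names and layouts here are assumptions made for illustration.
    """
    estimates = np.asarray(estimates, dtype=float)
    predicted_sdr = np.asarray(predicted_sdr, dtype=float)
    # Index of the algorithm with the highest predicted SDR in each chunk.
    choices = predicted_sdr.argmax(axis=1)
    output = np.zeros(estimates.shape[1])
    for chunk, algo in enumerate(choices):
        start = chunk * chunk_len
        stop = min(start + chunk_len, estimates.shape[1])
        output[start:stop] = estimates[algo, start:stop]
    return output, choices

# Toy example: three constant "estimates", three chunks of two samples each.
estimates = np.array([[1.0] * 6, [2.0] * 6, [3.0] * 6])
predicted_sdr = np.array([[5.0, 1.0, 0.0],
                          [-2.0, 3.0, 2.5],
                          [-2.0, 2.0, 4.0]])
output, choices = switch_by_predicted_sdr(estimates, predicted_sdr, chunk_len=2)
```

In the toy example each chunk copies from a different estimate. When two algorithms have nearly equal predicted SDR, as with REPET-SIM and MELODIA here, the choice can alternate from chunk to chunk.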

In [94]:
import matplotlib
import matplotlib.pyplot as plt

# Store the PROJET vocal estimate alongside the other foreground estimates.
separated['fg']['Projet'] = separated['projet'][1]

font = {'family' : 'normal',
        'weight' : 'normal',
        'size'   : 18}

matplotlib.rc('font', **font)

names = [a.__name__ for a in approaches]
switcher = AlgorithmSwitcher(mix,
                             [separated['fg'][name] for name in names], names,
                             model='/home/prem/research/nussl/nussl/separation/models/vocal_sdr_predictor.model')
bg_s, fg_s = switcher.run()

print('Proposed estimate: Using the algorithm with the best SDR, calculated every 1 second.')
utilities.audio(fg_s)

plt.figure(figsize=(20, 10))
plt.subplot(211)
switcher.plot(None)
plt.title('Predicted algorithm (best and worst) over time in the mixture')

plt.subplot(212)
# One curve per algorithm: predicted SDR for each 1-second chunk.
for x in switcher.sdrs.T:
    plt.plot(x)
plt.xlim([0, 30])
plt.title('SDR estimate over time for each algorithm')
plt.legend(names)

plt.tight_layout()
plt.show()
Proposed estimate: Using the algorithm with the best SDR, calculated every 1 second.