The Northwestern University Source Separation Library (nussl) — Demo

nussl is an open-source Python library for audio source separation.

It is designed to make it easy both to use existing source separation algorithms and to develop new ones. In this demo we will explore some basic functionality of nussl, including running and evaluating some source separation algorithms.

Let's get started by exploring how to import audio.

AudioSignal

The AudioSignal object is the entryway to using nussl. It provides an easy way to import an audio file into nussl. If you have ffmpeg installed, you can open many types of files in nussl.

In [2]:
import nussl  # this is kind of important to do first...
In [3]:
path_to_source1 = "demo_files/drums.wav"
source1 = nussl.AudioSignal(path_to_source1)
source1.label = "mixture"  # We can label this signal as a mixture or whatever else we want

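# Play the audio inline. `utilities` is assumed to be a helper module that has
# already been imported; it is not brought in by `import nussl` above.
utilities.audio(source1.audio_data.T, source1.sample_rate)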

That's it! Now the audio is loaded into nussl and stored in an AudioSignal object. We can explore other aspects of this file with the AudioSignal object as well...

In [4]:
print("Path to file: {}"              .format(source1.path_to_input_file))
print("Filename: {}"                  .format(source1.file_name))
print("Label: {}"                     .format(source1.label))
print("Sample Rate: {} Hz"            .format(source1.sample_rate))
print("Length of file in samples: {}" .format(source1.signal_length))
print("Length of file in seconds: {}" .format(source1.signal_duration))
print("Number of channels: {}\n"      .format(source1.num_channels))

print("Audio is stored at source1.audio_data:\n {}\n".format(source1.audio_data))
print("source1.audio_data is a numpy array: type(source1.audio_data)={}".format(type(source1.audio_data)))
print("source1.audio_data.shape = {}".format(source1.audio_data.shape))
print("\t\t\t(# channels, # samples)")
Path to file: demo_files/drums.wav
Filename: drums.wav
Label: mixture
Sample Rate: 44100 Hz
Length of file in samples: 441000
Length of file in seconds: 10.0
Number of channels: 1

Audio is stored at source1.audio_data:
 [[ 0.0000000e+00 -3.0517578e-05  0.0000000e+00 ...  0.0000000e+00
   0.0000000e+00  0.0000000e+00]]

source1.audio_data is a numpy array: type(source1.audio_data)=<type 'numpy.ndarray'>
source1.audio_data.shape = (1, 441000)
			(# channels, # samples)
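
The numbers printed above are related in a simple way: the duration is the number of samples divided by the sample rate, and the array is laid out as (channels, samples). The sketch below reproduces those relationships with plain NumPy on a synthetic tone, so it runs without nussl or the demo files. The tone and the variable names are stand-ins chosen for illustration, not part of the nussl API.

```python
import numpy as np

sample_rate = 44100          # samples per second, as reported for drums.wav
duration_seconds = 10.0      # length reported in the output above

# Synthetic stand-in for source1.audio_data: one channel of a 440 Hz tone.
num_samples = int(sample_rate * duration_seconds)
t = np.arange(num_samples) / float(sample_rate)
audio_data = np.sin(2 * np.pi * 440.0 * t)[np.newaxis, :]

# Rows are channels, columns are samples.
num_channels, signal_length = audio_data.shape
signal_duration = signal_length / float(sample_rate)

print(audio_data.shape)      # (1, 441000)
print(signal_duration)       # 10.0

# Transposing gives (samples, channels), the layout passed to the
# playback helper in the cell above via source1.audio_data.T.
print(audio_data.T.shape)    # (441000, 1)
```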

STFT

So far, source1 has no STFT data, because we haven't computed the STFT yet. We can calculate it very easily from our AudioSignal object. Let's do that:

In [5]:
# This returns the STFT data...
stft = source1.stft(window_length=1024, hop_length=512)

# ...and it is also stored in the AudioSignal object
print(source1.stft_data.shape)
print('(# FFT bins, # time bins, # channels)')
mag = source1.magnitude_spectrogram_data  # np.abs(source1.stft_data)
psd = source1.power_spectrogram_data      # np.abs(source1.stft_data) ** 2
(1025, 863, 1)
(# FFT bins, # time bins, # channels)
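
To make the relationship between the complex STFT, the magnitude spectrogram, and the power spectrogram concrete, here is a minimal NumPy sketch. `toy_stft` is a hypothetical helper written for this illustration, not nussl's implementation: it does no padding and uses an FFT size equal to the window length, so its output shape will not match the nussl output above.

```python
import numpy as np

def toy_stft(signal, window_length, hop_length):
    """Hann-windowed STFT of a 1-D signal; returns (freq bins, frames)."""
    window = np.hanning(window_length)
    num_frames = 1 + (len(signal) - window_length) // hop_length
    frames = np.stack([
        signal[i * hop_length: i * hop_length + window_length] * window
        for i in range(num_frames)
    ], axis=1)
    # rfft keeps only the non-negative frequencies: window_length // 2 + 1 bins
    return np.fft.rfft(frames, axis=0)

# One second of a 1000 Hz tone sampled at 8000 Hz (synthetic test signal)
sample_rate = 8000
t = np.arange(sample_rate) / float(sample_rate)
signal = np.sin(2 * np.pi * 1000.0 * t)

stft_data = toy_stft(signal, window_length=1024, hop_length=512)
magnitude = np.abs(stft_data)   # magnitude spectrogram
power = magnitude ** 2          # power spectrogram

print(stft_data.shape)          # (513, 14)

# The strongest bin in the first frame, converted back to Hz
peak_bin = int(np.argmax(magnitude[:, 0]))
print(peak_bin * sample_rate / 1024.0)   # 1000.0
```

The STFT itself is complex-valued, so both spectrograms are computed from its absolute value; squaring the complex values directly would not give a real-valued power spectrogram.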

Source separation

It's simple to run source separation algorithms on audio. The first argument to source separation classes is always an AudioSignal object. Using a source separation algorithm takes the following steps:

  1. Initialize a source separation object.
  2. Run the source separation object using obj.run().
  3. The output is stored as AudioSignal objects. Call obj.make_audio_signals() to get the output.
  4. Write audio to disk or listen to it. (Optional)

Let's demonstrate.

In [6]:
mixture_path = 'demo_files/HistoryRepeating_PropellorHeads.wav'
In [7]:
# Step 0 - load a new mixture
mixture = nussl.AudioSignal(mixture_path)

print('Mixture')
utilities.audio(mixture.audio_data.T, mixture.sample_rate)

# Step 1 - initialize a RepetSim object
repet_sim = nussl.RepetSim(mixture)

# Step 2 - run the algorithm
repet_sim.run()

# Step 3 - get the output
bg, fg = repet_sim.make_audio_signals()

# Step 4 - hear the output
print('Estimated Foreground')
utilities.audio(fg.audio_data.T, fg.sample_rate)
print('Estimated Background')
utilities.audio(bg.audio_data.T, bg.sample_rate)
Mixture
Estimated Foreground
Estimated Background
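
The cell above only listens to the output; step 4 also mentions writing audio to disk. nussl's own method for that is not shown in this demo, so the sketch below uses only the standard-library `wave` module and NumPy. `write_wav` is a hypothetical helper, and the synthetic stereo tone stands in for an array such as `fg.audio_data`. It assumes float audio in the range [-1, 1] laid out as (channels, samples).

```python
import os
import tempfile
import wave

import numpy as np

def write_wav(path, audio_data, sample_rate):
    """Write float audio shaped (channels, samples) as 16-bit PCM WAV."""
    clipped = np.clip(audio_data, -1.0, 1.0)
    pcm = (clipped * 32767).astype('<i2')
    # WAV frames interleave the channels, so reorder to (samples, channels)
    interleaved = np.ascontiguousarray(pcm.T)
    wav_file = wave.open(path, 'wb')
    try:
        wav_file.setnchannels(audio_data.shape[0])
        wav_file.setsampwidth(2)            # 2 bytes = 16 bits per sample
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(interleaved.tobytes())
    finally:
        wav_file.close()

# Stand-in for a separated source: one second of a stereo 440 Hz tone
sample_rate = 44100
t = np.arange(sample_rate) / float(sample_rate)
tone = 0.5 * np.sin(2 * np.pi * 440.0 * t)
stand_in = np.vstack([tone, tone])

output_path = os.path.join(tempfile.gettempdir(), 'foreground_estimate.wav')
write_wav(output_path, stand_in, sample_rate)
```

With real output, the same call would take the estimate's array and sample rate in place of the stand-ins.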