nussl Demo
nussl is an open source Python library for audio source separation. It is built to make it easy to use existing source separation algorithms and to develop new ones. In this demo we will explore some basic functionality of nussl, including running and evaluating some source separation algorithms.
Let's get started by exploring how to import audio.
The AudioSignal object is the entryway to using nussl. It provides an easy way to import an audio file into nussl. If you have ffmpeg installed, you can open many types of files in nussl.
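import nussl  # import nussl before anything else
# Assumption: `utilities` is this demo's local helper module for audio playback; it is not part of nussl
import utilities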
path_to_source1 = "demo_files/drums.wav"
source1 = nussl.AudioSignal(path_to_source1)
source1.label = "mixture" # We can label this signal as a mixture or whatever else we want
utilities.audio(source1.audio_data.T, source1.sample_rate)
That's it! Now the audio is loaded into nussl and stored in an AudioSignal object. We can explore other aspects of this file with the AudioSignal object as well:
print("Path to file: {}".format(source1.path_to_input_file))
print("Filename: {}".format(source1.file_name))
print("Label: {}".format(source1.label))
print("Sample Rate: {} Hz".format(source1.sample_rate))
print("Length of file in samples: {}".format(source1.signal_length))
print("Length of file in seconds: {}".format(source1.signal_duration))
print("Number of channels: {}\n".format(source1.num_channels))
print("Audio is stored at source1.audio_data:\n {}\n".format(source1.audio_data))
print("source1.audio_data is a numpy array: type(source1.audio_data)={}".format(type(source1.audio_data)))
print("source1.audio_data.shape = {}".format(source1.audio_data.shape))
print("\t\t\t(# channels, # samples)")
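To make the (# channels, # samples) layout concrete, here is a minimal sketch that uses a synthetic NumPy array in place of source1.audio_data. The array contents, the 44100 Hz sample rate, and the variable names are assumptions for illustration; no nussl calls are involved.

```python
import numpy as np

sample_rate = 44100  # assumed sample rate, in Hz
num_samples = 2 * sample_rate  # two seconds of audio

# Synthetic stereo signal laid out as (# channels, # samples)
t = np.arange(num_samples) / sample_rate
audio_data = np.vstack([
    np.sin(2 * np.pi * 440.0 * t),  # channel 0: 440 Hz tone
    np.sin(2 * np.pi * 660.0 * t),  # channel 1: 660 Hz tone
])

num_channels, signal_length = audio_data.shape
signal_duration = signal_length / sample_rate  # length in seconds

# Transposing gives (# samples, # channels), the layout passed to the playback helper
samples_first = audio_data.T

print(audio_data.shape, samples_first.shape, signal_duration)
```

The length in seconds is simply the number of samples divided by the sample rate, which is why the demo reports both.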
However, source1 has no STFT data yet, because we haven't actually computed the STFT.
We can calculate an STFT very easily from our AudioSignal object. Let's do that:
# This returns stft data...
stft = source1.stft(window_length=1024, hop_length=512)
# ...but it's still stored in the AudioSignal object
print(source1.stft_data.shape)
print('(# FFT bins, # time bins, # channels)')
mag = source1.magnitude_spectrogram_data # np.abs(source1.stft_data)
psd = source1.power_spectrogram_data # np.abs(source1.stft_data) ** 2
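For readers who want to see what the magnitude and power spectrograms are computed from, the following is a rough single-channel sketch in plain NumPy. The simple_stft helper, the Hann window, and the lack of padding are assumptions made for illustration; nussl's own STFT may window and pad differently, and it also carries a channel axis.

```python
import numpy as np

def simple_stft(signal, window_length=1024, hop_length=512):
    """Hypothetical helper: complex STFT of shape (# FFT bins, # time bins)."""
    window = np.hanning(window_length)
    num_frames = 1 + (len(signal) - window_length) // hop_length
    # Slice the signal into overlapping, windowed frames
    frames = np.stack([
        signal[i * hop_length : i * hop_length + window_length] * window
        for i in range(num_frames)
    ])
    # One real FFT per frame, then transpose so frequency is the first axis
    return np.fft.rfft(frames, axis=1).T

sample_rate = 8192  # assumed sample rate, in Hz
t = np.arange(sample_rate) / sample_rate
signal = np.sin(2 * np.pi * 512.0 * t)  # one second of a 512 Hz tone

stft_data = simple_stft(signal)
mag = np.abs(stft_data)       # magnitude spectrogram
psd = np.abs(stft_data) ** 2  # power spectrogram

# The strongest frequency bin should correspond to the 512 Hz tone
peak_bin = int(np.argmax(mag.mean(axis=1)))
peak_frequency = peak_bin * sample_rate / 1024

print(stft_data.shape, peak_frequency)
```

Note that the power spectrogram squares the magnitude, not the complex STFT values themselves; squaring a complex number directly would keep a phase term.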
It's simple to run source separation algorithms on audio. The first argument to source separation classes is always an AudioSignal object. The process for using a source separation algorithm is the following:
1. Initialize the separation object with an AudioSignal object.
2. Run the algorithm by calling obj.run().
3. The results are returned as AudioSignal objects. Call obj.make_audio_signals() to get the output.
Let's demonstrate.
mixture_path = 'demo_files/HistoryRepeating_PropellorHeads.wav'
# Step 0 - load a new mixture
mixture = nussl.AudioSignal(mixture_path)
print('Mixture')
utilities.audio(mixture.audio_data.T, mixture.sample_rate)
# Step 1 - initialize a RepetSim object
repet_sim = nussl.RepetSim(mixture)
# Step 2 - run the algorithm
repet_sim.run()
# Step 3 - get the output
bg, fg = repet_sim.make_audio_signals()
# Step 4 - hear the output
print('Estimated Foreground')
utilities.audio(fg.audio_data.T, fg.sample_rate)
print('Estimated Background')
utilities.audio(bg.audio_data.T, bg.sample_rate)
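The initialize, run, and retrieve pattern used above can be sketched with a toy class on plain NumPy arrays. ToyBandSplitter is hypothetical and is not part of nussl, and it is not the REPET-SIM algorithm: it only splits a signal at an assumed cutoff frequency with a binary mask over FFT bins, so that the three-step interface is easy to follow.

```python
import numpy as np

class ToyBandSplitter:
    """Hypothetical separator: splits a 1-D signal at a cutoff frequency."""

    def __init__(self, audio, sample_rate, cutoff_hz=1000.0):
        self.audio = np.asarray(audio, dtype=float)
        self.sample_rate = sample_rate
        self.cutoff_hz = cutoff_hz
        self.low_mask = None  # filled in by run()

    def run(self):
        # Binary mask over the FFT bins: True below the cutoff
        freqs = np.fft.rfftfreq(len(self.audio), d=1.0 / self.sample_rate)
        self.low_mask = freqs < self.cutoff_hz
        return self.low_mask

    def make_audio_signals(self):
        if self.low_mask is None:
            raise ValueError("Call run() before make_audio_signals()")
        spectrum = np.fft.rfft(self.audio)
        n = len(self.audio)
        low = np.fft.irfft(spectrum * self.low_mask, n=n)
        high = np.fft.irfft(spectrum * ~self.low_mask, n=n)
        return low, high

# Step 0 - build a synthetic mixture of a low tone and a high tone
sample_rate = 8000  # assumed sample rate, in Hz
t = np.arange(sample_rate) / sample_rate
low_tone = np.sin(2 * np.pi * 200.0 * t)
high_tone = np.sin(2 * np.pi * 3000.0 * t)
mixture = low_tone + high_tone

splitter = ToyBandSplitter(mixture, sample_rate)  # Step 1 - initialize
splitter.run()                                    # Step 2 - run
low, high = splitter.make_audio_signals()         # Step 3 - get the output
```

Because the two masks are complementary, the two outputs add back up to the mixture, which is a handy sanity check for any mask-based separator.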