Repet

The original REpeating Pattern Extraction Technique (REPET).

class nussl.separation.repet.Repet(input_audio_signal, min_period=None, max_period=None, period=None, high_pass_cutoff=100.0, do_mono=False, use_find_period_complex=False, use_librosa_stft=False, matlab_fidelity=False, mask_type='soft', mask_threshold=0.5)

Bases: nussl.separation.mask_separation_base.MaskSeparationBase

Implements the original REpeating Pattern Extraction Technique algorithm using the beat spectrum.

REPET is a simple method for separating a repeating background from a non-repeating foreground in an audio mixture. It assumes a single repeating period over the whole signal duration and finds that period by locating a peak in the beat spectrum. The period can also be provided exactly, or you can give Repet minimum and maximum bounds for the search. Once it has a period, it “overlays” spectrogram segments of that length and takes their median to build a model of the background.
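The overlay-and-median step described above can be sketched with plain NumPy. This is a minimal illustration under stated assumptions, not nussl's implementation: the function name `repeating_soft_mask` is hypothetical, the input is assumed to be a magnitude spectrogram of shape (frequency bins, time frames), and the period is assumed to be already known in frames.

```python
import numpy as np

def repeating_soft_mask(magnitude, period):
    """Hypothetical sketch of a REPET-style soft mask (not nussl's code)."""
    n_freq, n_time = magnitude.shape
    n_segments = int(np.ceil(n_time / period))
    # Pad the time axis with NaN so it splits evenly into period-length segments.
    padded = np.full((n_freq, n_segments * period), np.nan)
    padded[:, :n_time] = magnitude
    segments = padded.reshape(n_freq, n_segments, period)
    # Element-wise median across segments (padding ignored) is the repeating model.
    model = np.nanmedian(segments, axis=1)
    # Tile the model over the full duration; the background cannot exceed the mixture.
    tiled = np.tile(model, (1, n_segments))[:, :n_time]
    background = np.minimum(tiled, magnitude)
    # Soft mask in [0, 1]; the small epsilon avoids division by zero.
    return background / (magnitude + 1e-12)
```

Values near 1 mark time-frequency bins that match the repeating model; values near 0 mark bins dominated by non-repeating content.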

References

  • Zafar Rafii and Bryan Pardo. “Audio Separation System and Method,” US20130064379 A1, US 13/612,413, March 14, 2013
Parameters:
  • input_audio_signal (audio_signal.AudioSignal) – The audio_signal.AudioSignal object that REPET will be run on. Repet makes a copy of input_audio_signal.
  • min_period (float, optional) – minimum repeating period to search for, in seconds.
  • max_period (float, optional) – maximum repeating period to search for, in seconds.
  • period (float, optional) – exact repeating period, in seconds.
  • high_pass_cutoff (float, optional) – value (in Hz) for the high pass cutoff filter.
  • do_mono (bool, optional) – Flattens audio_signal.AudioSignal to mono before running the algorithm (does not affect the input audio_signal.AudioSignal object).
  • use_find_period_complex (bool, optional) – Will use a more complex peak picker to find the repeating period.
  • use_librosa_stft (bool, optional) – Calls librosa’s stft function instead of nussl’s
  • matlab_fidelity (bool, optional) – If True, does repet with the same settings as the original MATLAB implementation of REPET, warts and all. This will override use_librosa_stft and set it to False.
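
As an illustration of how a high-pass cutoff given in Hz can map onto a spectrogram mask, the sketch below converts the cutoff to an STFT bin index and assigns every bin below it to the background. Treating the cutoff this way follows published REPET implementations; whether nussl does exactly this is an assumption, and `apply_high_pass_cutoff`, `sample_rate`, and `n_fft` are hypothetical names used only here.

```python
import numpy as np

def apply_high_pass_cutoff(background_mask, cutoff_hz, sample_rate, n_fft):
    """Hypothetical sketch: force bins below cutoff_hz into the background mask."""
    # Bin k of an n_fft-point STFT is centred at k * sample_rate / n_fft Hz.
    cutoff_bin = int(np.ceil(cutoff_hz * n_fft / sample_rate))
    mask = np.array(background_mask, dtype=float, copy=True)
    # Everything below the cutoff is treated as background (mask value 1).
    mask[:cutoff_bin, :] = 1.0
    return mask
```

With a 44100 Hz sample rate and a 2048-point FFT, a 100 Hz cutoff covers the first five bins (0 Hz to roughly 86 Hz).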

background

Calculated background. This is None until run() is called.

Type:audio_signal.AudioSignal
foreground

Calculated foreground. This is None until make_audio_signals() is called.

Type:audio_signal.AudioSignal
beat_spectrum

Beat spectrum calculated by Repet.

Type:np.array
use_find_period_complex

Determines whether to use complex peak picker to find the repeating period.

Type:bool
repeating_period

Repeating period in units of hops (stft time bins)

Type:int
stft

Local copy of the STFT input from input_audio_array

Type:np.ndarray
magnitude_spectrogram

Local copy of the magnitude spectrogram

Type:np.ndarray
run()

Runs the original REPET algorithm

Returns:masks (MaskBase) – A MaskBase-derived object with repeating background time-frequency data. To get the corresponding non-repeating foreground, call make_audio_signals().

Example:

import nussl

signal = nussl.AudioSignal(path_to_input_file='input_name.wav')

# Set up and run Repet
repet = nussl.Repet(signal)  # Returns a soft mask by default
masks = repet.run() # or repet()

# Get audio signals
background, foreground = repet.make_audio_signals()

# output the background
background.write_audio_to_file('background.wav')
get_beat_spectrum(recompute_stft=False)

Calculates and returns the beat spectrum for the audio signal associated with this object

Parameters:recompute_stft (bool, optional) – Recompute the stft for the audio signal
Returns:beat_spectrum (np.array) – beat spectrum for the audio file

Example:

import nussl

# Set up audio signal
signal = nussl.AudioSignal('path_to_file.wav')

# Set up a Repet object
repet = nussl.Repet(signal)

# I don't have to run repet to get a beat spectrum for signal
beat_spec = repet.get_beat_spectrum()
static compute_beat_spectrum(power_spectrogram)

Computes the beat spectrum by averaging (over frequencies) the autocorrelation matrix of a one-sided spectrogram.

The autocorrelation matrix is computed by taking the autocorrelation of each row of the spectrogram and discarding the symmetric half.

Parameters:power_spectrogram (np.array) – 2D matrix containing the one-sided power spectrogram of an audio signal
Returns:(np.array) – array containing the beat spectrum based on the power spectrogram

See also

J Foote’s original derivation of the Beat Spectrum: Foote, Jonathan, and Shingo Uchihashi. “The beat spectrum: A new approach to rhythm analysis.” Multimedia and Expo, 2001. ICME 2001. IEEE International Conference on. IEEE, 2001.
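
The row-wise autocorrelation and frequency averaging described for compute_beat_spectrum() can be sketched as follows. This is an illustrative version, not the library's code: the name `beat_spectrum_sketch` is hypothetical, and both the FFT-based autocorrelation and the per-lag normalisation are assumptions made for the example.

```python
import numpy as np

def beat_spectrum_sketch(power_spectrogram):
    """Hypothetical sketch of a beat spectrum from a (n_freq, n_time) power spectrogram."""
    n_freq, n_time = power_spectrogram.shape
    # Zero-pad to twice the length so the FFT yields a linear (not circular) autocorrelation.
    spectrum = np.fft.rfft(power_spectrogram, n=2 * n_time, axis=1)
    acorr = np.fft.irfft(np.abs(spectrum) ** 2, n=2 * n_time, axis=1)
    # Keep only non-negative lags; the other half is symmetric.
    acorr = acorr[:, :n_time]
    # Lag k is a sum over (n_time - k) overlapping frames, so normalise per lag.
    acorr = acorr / np.arange(n_time, 0, -1)
    # Average over frequency rows to get one value per lag.
    return acorr.mean(axis=0)
```

For a spectrogram that repeats every P frames, the result has peaks at lags P, 2P, and so on.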

static find_repeating_period_simple(beat_spectrum, min_period, max_period)
Computes the repeating period of the sound signal using the beat spectrum.
This algorithm just looks for the max value in the interval [min_period, max_period], inclusive. It discards the first value, and returns the period in units of stft time bins.
Parameters:
  • beat_spectrum (np.array) – input beat spectrum array
  • min_period (int) – minimum possible period value
  • max_period (int) – maximum possible period value
Returns:

period (int) – The period of the sound signal in stft time bins
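
The peak search described above amounts to an argmax over a slice of the beat spectrum. The sketch below is a minimal stand-alone version under assumptions: `find_period_simple_sketch` is a hypothetical name, periods are given in STFT time bins, and the search range is clipped so that lag 0 is never considered.

```python
import numpy as np

def find_period_simple_sketch(beat_spectrum, min_period, max_period):
    """Hypothetical sketch: lag of the largest beat spectrum value in a range."""
    beat_spectrum = np.asarray(beat_spectrum)
    # Lag 0 measures the signal against itself and would trivially win, so skip it.
    lo = max(int(min_period), 1)
    hi = min(int(max_period), len(beat_spectrum) - 1)
    if hi < lo:
        raise ValueError("empty search range for the repeating period")
    # Search [lo, hi] inclusive and convert the slice index back to a lag.
    return lo + int(np.argmax(beat_spectrum[lo:hi + 1]))
```

For example, `find_period_simple_sketch([10, 1, 2, 7, 3, 6, 1], 1, 6)` picks lag 3, while restricting the range to `[4, 6]` picks lag 5.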

static find_repeating_period_complex(beat_spectrum)

A more complicated approach to finding the repeating period. Use this by setting use_find_period_complex to True.

Parameters:beat_spectrum (np.array) – input beat spectrum array
Returns:period (int) – The period of the sound signal in stft time bins
BINARY_MASK = 'binary'
SOFT_MASK = 'soft'
audio_signal

Copy of the audio_signal.AudioSignal object passed in upon initialization.

Type:(audio_signal.AudioSignal)
classmethod from_json(json_string)

Creates a new SeparationBase object from the parameters stored in this JSON string.

Parameters:json_string (str) – A JSON string containing all the data to create a new SeparationBase object.
Returns:(SeparationBase) A new SeparationBase object from the JSON string.

See also

to_json() to make a JSON string to freeze this object.

mask_threshold

PROPERTY

Threshold for deciding True/False when mask_type is BINARY_MASK. Some algorithms first make a soft mask and then convert it to a binary mask using this threshold. All values of the soft mask lie in [0.0, 1.0], so mask_threshold is expected to be a float in [0.0, 1.0].

Returns:mask_threshold (float) – Value between [0.0, 1.0] that indicates the True/False cutoff when converting a soft mask to binary mask.
Raises:ValueError if not a float or if set outside [0.0, 1.0].
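
The soft-to-binary conversion that mask_threshold controls can be illustrated with NumPy. This is a sketch only: `soft_to_binary` is a hypothetical helper, and the choice of a strict `>` comparison (rather than `>=`) is an assumption, not something this page specifies.

```python
import numpy as np

def soft_to_binary(soft_mask, mask_threshold=0.5):
    """Hypothetical sketch: threshold a soft mask in [0, 1] into a boolean mask."""
    if not isinstance(mask_threshold, float) or not 0.0 <= mask_threshold <= 1.0:
        raise ValueError("mask_threshold must be a float in [0.0, 1.0]")
    # Assumption: values strictly above the threshold become True.
    return np.asarray(soft_mask) > mask_threshold
```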
mask_type

PROPERTY

This property indicates what type of mask the derived algorithm will create and return from run(). Options are either ‘soft’ or ‘binary’. mask_type is usually set when initializing a MaskSeparationBase-derived class and defaults to SOFT_MASK.

This property, though stored as a string, can be set in two ways when initializing:

  • First, it is possible to set this property with a string. Only 'soft' and 'binary' are accepted (case insensitive); every other value will raise an error. When initializing with a string, two helper attributes are provided: BINARY_MASK and SOFT_MASK.

    It is HIGHLY encouraged to use these, as the API may change and code that uses bare strings (e.g. mask_type = 'soft' or mask_type = 'binary') for assignment might not be future-proof. BINARY_MASK and SOFT_MASK are safe aliases in case these underlying types change.

  • The second way to set this property is by using the class prototype of either separation.masks.binary_mask.BinaryMask or separation.masks.soft_mask.SoftMask. This is probably the most stable way to set this, and it’s fairly succinct. For example, mask_type = nussl.BinaryMask or mask_type = nussl.SoftMask are both perfectly valid.

Though uncommon, this can also be set outside of __init__().

Examples of both methods are shown below.

Returns:mask_type (str) – Either 'soft' or 'binary'.
Raises:ValueError if set invalidly.

Example:

import nussl
mixture_signal = nussl.AudioSignal()

# Two options for determining mask upon init...

# Option 1: Init with a string (BINARY_MASK is a string 'constant')
repet_sim = nussl.RepetSim(mixture_signal, mask_type=nussl.MaskSeparationBase.BINARY_MASK)

# Option 2: Init with a class type
ola = nussl.OverlapAdd(mixture_signal, mask_type=nussl.SoftMask)

# It's also possible to change these values after init by changing the `mask_type` property...
repet_sim.mask_type = nussl.MaskSeparationBase.SOFT_MASK  # using a string
ola.mask_type = nussl.BinaryMask  # or using a class type
ones_mask(shape)

Creates a new ones mask with this object’s type

Parameters:shape

Returns:

sample_rate

Sample rate of audio_signal. Literally audio_signal.sample_rate.

Type:(int)
stft_params

spectral_utils.StftParams of audio_signal. Literally audio_signal.stft_params.

Type:(spectral_utils.StftParams)
to_json()

Outputs JSON from the data stored in this object.

Returns:(str) a JSON string containing all of the information to restore this object exactly as it was when this was called.

See also

from_json() to restore a JSON frozen object.

zeros_mask(shape)

Creates a new zeros mask with this object’s type

Parameters:shape

Returns:

update_periods()

Will update periods for use with find_repeating_period_simple().

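Converts the stored periods from seconds to stft time bin values. Call this before find_repeating_period_simple() if you haven’t called run(); otherwise the periods are still in seconds and you won’t get good results.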

Example:

import nussl

a = nussl.AudioSignal('path/to/file.wav')
r = nussl.Repet(a)

beat_spectrum = r.get_beat_spectrum()
r.update_periods()
repeating_period = r.find_repeating_period_simple(beat_spectrum, r.min_period, r.max_period)
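
For intuition about the seconds-to-bins conversion, a period in seconds maps to STFT time bins through the sample rate and the hop length. The sketch below assumes a simple ceiling of `seconds * sample_rate / hop_length`; the exact rounding nussl uses is not stated on this page, and `seconds_to_stft_bins` is a hypothetical helper.

```python
import math

def seconds_to_stft_bins(seconds, sample_rate, hop_length):
    """Hypothetical sketch: number of STFT hops needed to span `seconds`."""
    # Each STFT time bin advances by hop_length samples.
    return int(math.ceil(seconds * sample_rate / hop_length))
```

At 44100 Hz with a hop of 1024 samples, a period of 1.0 second spans 44 time bins.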
plot(output_file, **kwargs)

Creates a plot of the beat spectrum and outputs to output_file.

Parameters:
  • output_file (string) – string representing a path to the desired output file to be created.
  • title (string) – Title to put on the plot.
  • show_repeating_period (bool) – If True, adds a vertical line where Repet thinks the repeating period is (if the repeating period has been computed already).

Example:

import nussl

signal = nussl.AudioSignal('Sample.wav')
repet = nussl.Repet(signal)

repet.plot('new_beat_spec_plot.png', title="Beat Spectrum of Sample.wav", show_repeating_period=True)
make_audio_signals()

Returns the background and foreground audio signals. Call run() before calling this function; it returns None if run() has not been called.

Order of the list is [self.background, self.foreground]

Returns:(list) – List containing two audio_signal.AudioSignal objects, one for the calculated background and the next for the remaining foreground, in that order.

Example:

import nussl

# set up AudioSignal object
signal = nussl.AudioSignal('path_to_file.wav')

# set up and run repet
repet = nussl.Repet(signal)
repet.run()

# get audio signals (AudioSignal objects)
background, foreground = repet.make_audio_signals()