stft_utils.py

nussl.stft_utils.plot_stft(signal, file_name, title=None, win_length=None, hop_length=None, window_type=None, sample_rate=44100, n_fft_bins=None, freq_max=None, show_interactive_plot=False)

Outputs an image of an STFT plot of the input audio, signal. This uses matplotlib to create the output file. You can specify all of the same parameters that are in e_stft(). By default, the StftParams defaults are used for any of win_length, hop_length, and window_type that are not provided. The title is settable by the user, and there is also a flag to show an interactive matplotlib graph.

Notes

To find out what output formats are available for your machine run the following code:

>>> import matplotlib.pyplot as plt
>>> fig = plt.figure()
>>> print(fig.canvas.get_supported_filetypes())

(From here: http://stackoverflow.com/a/7608273/5768001)

Parameters:
  • signal – (np.array) input time series signal that will be plotted
  • file_name – (str) path to file that will be output. Will overwrite any file that is already there. Uses mat
  • title – (string) (Optional) Title to go at top of graph. Defaults to ‘Spectrogram of [file_name]’
  • win_length – (int) (Optional) number of samples per window. Defaults to StftParams default.
  • hop_length – (int) (Optional) number of samples between the start of adjacent windows, or “hop”. Defaults to StftParams default.
  • sample_rate – (int) (Optional) sample rate of input signal. Defaults to StftParams default.
  • window_type – (string) (Optional) type of window to use. Using WindowType object is recommended. Defaults to StftParams default.
  • n_fft_bins – (int) (Optional) number of fft bins per time window. If not specified, defaults to next highest power of 2 above window_length. Defaults to StftParams default.
  • freq_max – (int) Max frequency to display. Defaults to 44100Hz
  • show_interactive_plot – (bool) (Optional) Flag indicating if plot should be shown when function is run. Defaults to False

Example:

import numpy as np
import nussl

# Set up sine wave parameters
sr = nussl.Constants.DEFAULT_SAMPLE_RATE  # 44.1kHz
n_sec = 3  # seconds
duration = n_sec * sr  # total number of samples
freq = 300  # Hz

# Make sine wave array: n_sec * freq full cycles across the whole signal
x = np.linspace(0, n_sec * freq * 2 * np.pi, duration)
x = np.sin(x)

# Plot it and save it in path 'path/to/sine_wav.png'
nussl.plot_stft(x, 'path/to/sine_wav.png')

nussl.stft_utils.e_stft(signal, window_length, hop_length, window_type, n_fft_bins=None, remove_reflection=True, remove_padding=False)

This function computes a short-time Fourier transform (STFT) of a 1D numpy array input signal. It zero-pads the signal by half a hop_length at the beginning to reduce the window tapering effect from the first window. It also zero-pads the end of the signal to get an integer number of hops.
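The padding described above can be sketched in plain NumPy. This is a minimal illustration, not nussl's implementation: the helper name pad_for_stft is hypothetical, and the exact rounding used for the front padding and for the number of hops is an assumption.

```python
import numpy as np


def pad_for_stft(signal, window_length, hop_length):
    """Hypothetical helper: zero-pad a 1D signal as described above."""
    # Assumption: "half a hop_length" is rounded down to whole samples.
    front = hop_length // 2
    padded = np.concatenate([np.zeros(front), signal])

    # Number of windows needed to cover every sample of the padded signal.
    n = len(padded)
    if n <= window_length:
        num_blocks = 1
    else:
        num_blocks = 1 + int(np.ceil((n - window_length) / hop_length))

    # Pad the end so the last window is completely filled.
    total = window_length + (num_blocks - 1) * hop_length
    padded = np.concatenate([padded, np.zeros(total - n)])
    return padded, num_blocks


padded, num_blocks = pad_for_stft(np.ones(10000), window_length=2048, hop_length=1024)
```

With these inputs the padded signal holds a whole number of hops after the first window, so every window is fully populated.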

By default, this function removes the FFT data above Nyquist, which is a reflection of the data below it. There is an option to suppress this behavior and include the data from above Nyquist. However, the inverse STFT function, e_istft(), expects data without the reflection by default, so in that case the onus is on the user to remember to set reconstruct_reflection=False when calling e_istft().
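For a real-valued frame, the FFT bins above Nyquist are the complex conjugates of the bins below it, which is why dropping them loses no information. The sketch below illustrates this with NumPy's FFT on a single frame. It is not nussl code, and the variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n_fft = 8
frame = rng.standard_normal(n_fft)  # one real-valued frame

full = np.fft.fft(frame)        # all n_fft bins, including the reflection
half = full[: n_fft // 2 + 1]   # bins 0 through Nyquist only

# np.fft.rfft computes the same half spectrum directly.
same_as_rfft = np.allclose(half, np.fft.rfft(frame))

# Rebuild the reflection: bin (n_fft - k) is the conjugate of bin k
# for k = 1 .. n_fft / 2 - 1.
reflection = np.conj(half[-2:0:-1])
rebuilt = np.concatenate([half, reflection])
lossless = np.allclose(rebuilt, full)
```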

Additionally, this function assumes a single channeled audio signal and is not guaranteed to work on multichannel audio. If you want to do an STFT on multichannel audio see the AudioSignal object.

Parameters:
  • signal – 1D numpy array containing audio data. (REAL)
  • window_length – (int) number of samples per window
  • hop_length – (int) number of samples between the start of adjacent windows, or “hop”
  • window_type – (string) type of window to use. Using WindowType object is recommended.
  • n_fft_bins – (int) (Optional) number of fft bins per time window. If not specified, defaults to next highest power of 2 above window_length.
  • remove_reflection – (bool) (Optional) if True, this will remove reflected STFT data above the Nyquist point. If not specified, defaults to True.
  • remove_padding – (bool) (Optional) if True, this will remove the extra padding added when doing the STFT. Defaults to False.
Returns:

2D numpy array with complex STFT data. Data is of shape (num_time_blocks, num_fft_bins). These numbers are determined by the length of the input signal, the internal zero padding (explained at top), and the n_fft_bins and remove_reflection inputs (see example below).

Example:

import numpy as np
import nussl

# Set up sine wave parameters
sr = nussl.Constants.DEFAULT_SAMPLE_RATE  # 44.1kHz
n_sec = 3  # seconds
duration = n_sec * sr  # total number of samples
freq = 300  # Hz

# Make sine wave array: n_sec * freq full cycles across the whole signal
x = np.linspace(0, n_sec * freq * 2 * np.pi, duration)
x = np.sin(x)

# Set up e_stft() parameters
win_type = nussl.WindowType.HANN
win_length = 2048
hop_length = win_length // 2  # integer division, hop_length must be an int

# Run e_stft()
stft = nussl.e_stft(x, win_length, hop_length, win_type)
# stft has shape (win_length // 2 + 1, duration / hop_length)

# Keep the reflection
stft_with_reflection = nussl.e_stft(x, win_length, hop_length, win_type, remove_reflection=False)
# stft_with_reflection has shape (win_length, duration / hop_length)

# Change number of fft bins per hop
num_bins = 4096
stft_more_bins = nussl.e_stft(x, win_length, hop_length, win_type, n_fft_bins=num_bins)
# stft_more_bins has shape (num_bins // 2 + 1, duration / hop_length)

nussl.stft_utils.e_istft(stft, window_length, hop_length, window_type, reconstruct_reflection=True, remove_padding=True)

Computes an inverse short-time Fourier transform (iSTFT) from a 2D numpy array of complex values. By default this function assumes the input STFT has no reflection above Nyquist and will rebuild it, but the reconstruct_reflection flag overrides that behavior.

Additionally, this function assumes a single channeled audio signal and is not guaranteed to work on multichannel audio. If you want to do an iSTFT on multichannel audio see the AudioSignal object.

Parameters:
  • stft – complex valued 2D numpy array containing STFT data
  • window_length – (int) number of samples per window
  • hop_length – (int) number of samples between the start of adjacent windows, or “hop”
  • window_type – (deprecated)
  • reconstruct_reflection – (bool) (Optional) if True, this will recreate the removed reflection data above the Nyquist. If False, this assumes that the input STFT is complete. Default is True.
  • remove_padding – (bool) (Optional) if True, this function will remove the first and last (window_length - hop_length) number of samples. Defaults to True.
  • Note – … will massage the output so that it is in a format that it expects. remove_reflection still works in this mode. librosa’s implementation works differently than nussl’s and may produce different output.
Returns:

1D numpy array containing an audio signal representing the original signal used to make stft

Example:

import numpy as np
import nussl

# Set up sine wave parameters
sr = nussl.Constants.DEFAULT_SAMPLE_RATE  # 44.1kHz
n_sec = 3  # seconds
duration = n_sec * sr  # total number of samples
freq = 300  # Hz

# Make sine wave array: n_sec * freq full cycles across the whole signal
x = np.linspace(0, n_sec * freq * 2 * np.pi, duration)
x = np.sin(x)

# Set up e_stft() parameters
win_type = nussl.WindowType.HANN
win_length = 2048
hop_length = win_length // 2  # integer division, hop_length must be an int

# Get an stft
stft = nussl.e_stft(x, win_length, hop_length, win_type)

# Invert it. window_type is marked deprecated, but the signature still
# requires it as a positional argument.
calculated_signal = nussl.e_istft(stft, win_length, hop_length, win_type)

nussl.stft_utils.e_stft_plus(signal, window_length, hop_length, window_type, sample_rate, n_fft_bins=None, remove_reflection=True)

Computes a short-time Fourier transform (STFT) of the signal (by calling e_stft()), and also calculates the power spectral density (PSD), frequency, and time vectors for the calculated STFT. This function does not give you as many options as e_stft() (with respect to removing the reflection and using librosa). If you need that flexibility, it is recommended that you either use e_stft() or use an AudioSignal object.

Use this in situations where you need more than just the STFT data. For instance, this is used in plot_stft() to get the frequency vector to graph. In situations where you don’t need this extra data, it is more efficient to use e_stft().
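The extra quantities mentioned above can be sketched with NumPy alone. This is an illustration under stated assumptions, not the e_stft_plus() implementation: the STFT is a random stand-in laid out as (frequency bins, time blocks), the reflection is assumed to be removed, and the PSD is taken as the squared magnitude with no scaling (scaling conventions vary, and nussl's may differ).

```python
import numpy as np

sample_rate = 44100
n_fft_bins = 2048
hop_length = 1024
num_blocks = 10  # illustrative number of time blocks

# Stand-in for STFT data with the reflection removed (assumed layout:
# frequency bins along axis 0, time blocks along axis 1).
rng = np.random.default_rng(0)
shape = (n_fft_bins // 2 + 1, num_blocks)
stft = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

# Center frequency of each bin in Hz, from 0 up to Nyquist.
freqs = np.fft.rfftfreq(n_fft_bins, d=1.0 / sample_rate)

# Start time of each block in seconds.
times = np.arange(num_blocks) * hop_length / sample_rate

# Unscaled power estimate per bin and block.
psd = np.abs(stft) ** 2
```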

Additionally, this function assumes a single channeled audio signal and is not guaranteed to work on multichannel audio. If you want to do an STFT on multichannel audio see the AudioSignal object.

Parameters:
  • signal – 1D numpy array containing audio data. (REAL?COMPLEX?INTEGER?)
  • window_length – (int) number of samples per window
  • hop_length – (int) number of samples between the start of adjacent windows, or “hop”
  • window_type – (string) type of window to use. Using WindowType object is recommended.
  • sample_rate – (int) the intended sample rate, this is used in the calculation of the frequency vector
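  • n_fft_bins – (int) (Optional) number of fft bins per time window. If not specified, defaults to next highest power of 2 above window_length.
  • remove_reflection – (bool) (Optional) if True, removes reflected STFT data above the Nyquist point, as in e_stft(). Defaults to True.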
Returns:

stft – (np.ndarray) a 2D matrix of short-time Fourier transform data

nussl.stft_utils.librosa_stft_wrapper(signal, window_length, hop_length, window_type=None, remove_reflection=True, center=True, n_fft_bins=None)
Parameters:
  • signal
  • window_length
  • hop_length
  • window_type
  • remove_reflection
  • center
  • n_fft_bins

Returns:

nussl.stft_utils.librosa_istft_wrapper(stft, window_length, hop_length, window_type, remove_reflection=False, center=True, original_signal_length=None)

Wrapper for calling into librosa’s istft function.

Parameters:
  • stft
  • window_length
  • hop_length
  • window_type
  • remove_reflection
  • center
  • original_signal_length

Returns:

nussl.stft_utils.make_window(window_type, length, symmetric=False)

Returns an np.array populated with samples of a normalized window of type window_type.

Parameters:
  • window_type (str) – Type of window to create, string can be
  • length (int) – length of window
  • symmetric (bool) – If False, generates a periodic window (for use in spectral analysis). If True, generates a symmetric window (for use in filter design). Does nothing for rectangular window.
Returns:

window (np.array) – np array with a window of type window_type
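The difference between a periodic and a symmetric window can be sketched for a Hann window using NumPy. This is not make_window() itself: the helper name hann_window is hypothetical, only the Hann case is covered, and no normalization is applied.

```python
import numpy as np


def hann_window(length, symmetric=False):
    """Hypothetical helper: Hann window, periodic by default."""
    if symmetric:
        # np.hanning returns a symmetric window whose endpoints are both zero.
        return np.hanning(length)
    # A periodic window of size `length` is a symmetric window of size
    # `length + 1` with its final sample dropped.
    return np.hanning(length + 1)[:-1]


periodic = hann_window(8)
symmetric = hann_window(8, symmetric=True)
```

The periodic form is one period of the window, so it lines up with the FFT's implicit periodicity. The symmetric form mirrors around its center.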

class nussl.stft_utils.StftParams(sample_rate, window_length=None, hop_length=None, window_type=None, n_fft_bins=None)

The StftParams class is a container for information needed to run an STFT or iSTFT. This is meant as a convenience and does not actually perform any calculations within. It should get “decomposed” by the time e_stft() or e_istft() are called, so that every attribute in this object is a parameter to one of those functions.

Every class that inherits from the SeparationBase class has an StftParams object, and this is the only way that a top-level user has access to the STFT parameter settings that all of the separation algorithms are built upon. This object is passed around instead of each of these individual attributes.

window_length
hop_length
n_fft_bins

window_overlap

Returns the number of samples of overlap between adjacent time slices. This is calculated as self.window_length - self.hop_length. This property is not settable.

to_json()
static from_json(json_string)
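A container of this kind, with a read-only window_overlap property and a JSON round trip, could look roughly like the sketch below. StftParamsSketch is a hypothetical stand-in, not the nussl class: its field defaults and JSON layout are assumptions made for illustration.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class StftParamsSketch:
    """Hypothetical stand-in for a parameter container like StftParams."""
    sample_rate: int
    window_length: int
    hop_length: int
    window_type: str = "hann"          # assumed default, for illustration
    n_fft_bins: Optional[int] = None

    @property
    def window_overlap(self):
        # Read-only: derived from the two lengths, no setter is defined.
        return self.window_length - self.hop_length

    def to_json(self):
        # Serialize only the stored fields, not the derived property.
        return json.dumps(asdict(self))

    @staticmethod
    def from_json(json_string):
        return StftParamsSketch(**json.loads(json_string))


params = StftParamsSketch(sample_rate=44100, window_length=2048, hop_length=1024)
restored = StftParamsSketch.from_json(params.to_json())
```

Deriving window_overlap instead of storing it keeps it consistent whenever window_length or hop_length changes.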