BinaryMask

The BinaryMask class is for creating a time-frequency mask with binary values. Like all separation.masks.mask_base.MaskBase objects, BinaryMask is initialized with a 2D or 3D numpy array containing the mask data. The data type (numpy.dtype) of the initial mask can be either bool, int, or float. The mask is stored as a 3-dimensional boolean-valued numpy array.

The ideal input is a numpy array whose data type is bool. If the data type of the input mask is int, all values are expected to be either 0 or 1. If the data type is float, all values must be within 1e-2 of either 0 or 1. If the array meets none of these conditions, BinaryMask will raise an exception.
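The rules above can be sketched with plain NumPy. The helper below, `to_boolean_mask`, is hypothetical and is not part of nussl. It only mirrors the checks described here, and the choice of `ValueError` as the exception type is an assumption.

```python
import numpy as np


def to_boolean_mask(arr, tol=1e-2):
    """Hypothetical helper mirroring the dtype rules described above (not nussl code)."""
    arr = np.asarray(arr)

    # bool input is accepted as-is
    if arr.dtype == bool:
        return arr

    # int input must contain only 0 and 1
    if np.issubdtype(arr.dtype, np.integer):
        if not np.all((arr == 0) | (arr == 1)):
            raise ValueError('int mask must contain only 0 and 1')
        return arr.astype(bool)

    # float input must be within tol of 0 or 1
    if np.issubdtype(arr.dtype, np.floating):
        near_zero = np.abs(arr) <= tol
        near_one = np.abs(arr - 1.0) <= tol
        if not np.all(near_zero | near_one):
            raise ValueError('float mask values must be within tol of 0 or 1')
        return near_one

    raise ValueError('mask dtype must be bool, int, or float')
```

A float array such as `np.array([0.995, 0.004])` passes this check, while `np.array([0.5])` is rejected.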

BinaryMask (like separation.masks.soft_mask.SoftMask) is one of the return types for the run() methods of separation.mask_separation_base.MaskSeparationBase-derived objects (this covers most of the separation methods in nussl).

See also

  • separation.masks.mask_base.MaskBase: The base class for BinaryMask and SoftMask
  • separation.masks.soft_mask.SoftMask: Similar to BinaryMask, but instead of taking boolean values, takes floats in the range [0.0, 1.0].
  • separation.mask_separation_base.MaskSeparationBase: Base class for all mask-based separation methods in nussl.

Examples

Initializing a mask from a numpy array…

import nussl
import numpy as np

# load a file
signal = nussl.AudioSignal('path/to/file.wav')
stft = signal.stft()

# Make a random binary mask with the same shape as the stft with dtype == bool
rand_bool_mask = np.random.randint(2, size=stft.shape).astype('bool')
bin_mask_bool = nussl.BinaryMask(rand_bool_mask)

# Make a random binary mask with the same shape as the stft with dtype == int
rand_int_mask = np.random.randint(2, size=stft.shape)
bin_mask_int = nussl.BinaryMask(rand_int_mask)

# Make a random binary mask with the same shape as the stft with dtype == float
rand_float_mask = np.random.randint(2, size=stft.shape).astype('float')
bin_mask_float = nussl.BinaryMask(rand_float_mask)

separation.mask_separation_base.MaskSeparationBase-derived methods return separation.masks.mask_base.MaskBase masks, like so…

import nussl

# load a file
signal = nussl.AudioSignal('path/to/file.wav')

repet = nussl.Repet(signal, mask_type=nussl.BinaryMask)  # You have to specify that you want Binary Masks back
assert isinstance(repet, nussl.MaskSeparationBase)  # Repet is a MaskSeparationBase-derived class

[background_mask, foreground_mask] = repet.run()  # MaskSeparationBase-derived classes return MaskBase objects
assert isinstance(foreground_mask, nussl.BinaryMask)  # this is True
assert isinstance(background_mask, nussl.BinaryMask)  # this is True

class nussl.separation.masks.binary_mask.BinaryMask(input_mask=None, mask_shape=None)

Bases: nussl.separation.masks.mask_base.MaskBase

Class for creating a Binary Mask to apply to a time-frequency representation of the audio.

Parameters: input_mask (np.ndarray) – 2- or 3-D np.array that represents the mask.
mask_as_ints(channel=None)

Returns this BinaryMask as a numpy array of ints of 0’s and 1’s.

Returns: numpy ndarray of this BinaryMask represented as ints instead of bools.
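As a rough illustration of the bool-to-int conversion, here is a NumPy-only sketch. The array `bool_mask` is a made-up stand-in for the data held by a BinaryMask, and the cast shown is an assumption about the behavior, not nussl's actual code.

```python
import numpy as np

# A small boolean array standing in for the data inside a BinaryMask.
bool_mask = np.array([[True, False], [False, True]])

# Casting bools to ints maps True -> 1 and False -> 0.
int_mask = bool_mask.astype(int)
```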
invert_mask()

Makes a new BinaryMask object with a logical not applied to flip the values in this BinaryMask object.

Returns: A new BinaryMask object that has all of the boolean values flipped.
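The logical not described here can be illustrated on a bare NumPy array. This is only a sketch of the operation, with `mask` as a made-up example array; it does not construct a real BinaryMask object.

```python
import numpy as np

# A made-up boolean mask: rows are frequency bins, columns are time hops.
mask = np.array([[True, False, True],
                 [False, False, True]])

# Flip every value: True becomes False and vice versa.
inverted = np.logical_not(mask)

# Every time-frequency bin is selected by exactly one of the two masks.
covers_every_bin_once = bool(np.all(mask ^ inverted))
```

When a mask selects one source in a two-source mixture, its inverse selects everything else.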
static mask_to_binary(mask_, threshold)

Makes a binary mask from a soft mask with a True/False threshold.

Parameters:
  • mask_ (MaskBase or np.ndarray) – Soft mask to convert to BinaryMask
  • threshold (float) – Value in the range [0.0, 1.0] that sets the True/False cutoff

Returns:
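A minimal sketch of thresholding a soft mask, using NumPy only. The strict `>` comparison is an assumption, since the text above does not say whether values equal to the threshold become True or False, and `soft` is a made-up example array.

```python
import numpy as np

# A made-up soft mask with values in [0.0, 1.0].
soft = np.array([[0.1, 0.5],
                 [0.75, 0.9]])
threshold = 0.5

# Assumption: values strictly above the threshold become True.
binary = soft > threshold
```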

dtype

(str) Returns the data type of the values of the mask.

classmethod from_json(json_string)

Creates a new MaskBase object from the parameters stored in this JSON string.

Parameters: json_string (str) – A JSON string containing all the data to create a new MaskBase object.
Returns: (MaskBase) A new MaskBase object from the JSON string.

See also

to_json() to make a JSON string to freeze this object.

get_channel(n)

Gets mask channel n and returns it as a 2D np.ndarray.

Parameters:

n (int) – Channel index to return (0-based).

Returns:

np.array with the mask channel

Raises:
  • AttributeError if mask is None
  • ValueError if n is less than 0 or greater than the number of channels that this mask object has.
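The lookup and its error cases can be sketched as a standalone function. This `get_channel` is a hypothetical stand-in, not nussl's implementation. It assumes channels sit on the last axis of the 3D array, and it treats `n` equal to the number of channels as out of range because the index is 0-based.

```python
import numpy as np


def get_channel(mask, n):
    """Hypothetical stand-in for the channel lookup described above (not nussl code)."""
    if mask is None:
        raise AttributeError('mask is None')

    # A 2D mask is treated as a single-channel 3D mask.
    if mask.ndim == 2:
        mask = mask[:, :, np.newaxis]

    num_channels = mask.shape[2]  # assumption: channels are on the last axis
    if n < 0 or n >= num_channels:
        raise ValueError('channel index out of range')

    return mask[:, :, n]


# Made-up mask: 4 frequency bins, 5 time hops, 2 channels.
stereo_mask = np.zeros((4, 5, 2), dtype=bool)
stereo_mask[:, :, 1] = True
right = get_channel(stereo_mask, 1)
```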
height

(int) Number of frequency bins this mask has.

inverse_mask()

Alias for invert_mask()

See also

invert_mask()

Returns:

length

(int) Number of time hops that this mask represents.

mask

PROPERTY

The actual mask, represented as a three-dimensional numpy ndarray. The input is validated by _validate_mask(). For separation.masks.binary_mask.BinaryMask, validation checks that all values are 1 or 0 (or bools); for separation.masks.soft_mask.SoftMask, it checks that all values lie within [0.0, 1.0].

This base class will throw a NotImplementedError if instantiated directly.

Raises:
  • ValueError if mask.ndim is less than 2 or greater than 3, or if values fail validation.
  • NotImplementedError if instantiated directly.
num_channels

(int) Number of channels this mask has.

classmethod ones(shape)

Makes a mask with all ones with the specified shape. Exactly the same as np.ones().

Parameters:

shape (tuple) – Shape of the resultant mask.

Returns:

shape

(tuple) Returns the shape of the whole mask. Identical to np.ndarray.shape.

to_json()

Returns:

classmethod zeros(shape)

Makes a mask with all zeros with the specified shape. Exactly the same as np.zeros().

Parameters:

shape (tuple) – Shape of the resultant mask.

Returns: