nupic.research.frameworks.pytorch.speech_commands_dataset

Google Speech Commands dataset, adapted from https://github.com/tugstugi/pytorch-speech-commands.

class SpeechCommandsDataset(folder, transform=None, classes=('unknown', 'silence', 'zero', 'one', 'two', 'three', 'four', 'five', 'six', 'seven', 'eight', 'nine'), silence_percentage=0.1, sample_rate=16000)

Bases: torch.utils.data.Dataset

Google Speech Commands dataset. Only the labels listed in classes, plus silence, are treated as known classes. Samples with any other label are used as ‘unknown’ samples.

Similar to the Kaggle challenge here: https://www.kaggle.com/c/tensorflow-speech-recognition-challenge
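The known/unknown split described above can be illustrated with a minimal sketch. It assumes that each label corresponds to a folder name and that class indices follow the order of the classes tuple; the helper build_label_map and the example folder names are hypothetical and are not part of this module.

```python
# Default value of the `classes` parameter, copied from the signature above.
CLASSES = ("unknown", "silence", "zero", "one", "two", "three", "four",
           "five", "six", "seven", "eight", "nine")


def build_label_map(folder_names, classes=CLASSES):
    """Map each folder name to a class index (hypothetical helper)."""
    class_to_idx = {name: i for i, name in enumerate(classes)}
    unknown_idx = class_to_idx["unknown"]
    # Any folder not listed in `classes` falls back to the 'unknown' index.
    return {name: class_to_idx.get(name, unknown_idx) for name in folder_names}


# Example folder names are assumptions chosen for illustration.
label_map = build_label_map(["zero", "nine", "bed", "cat"])
print(label_map)
```

In this sketch the words outside the class list all share one index, which is why the ‘unknown’ class can end up much larger than the others.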

make_weights_for_balanced_classes()

Adapted from https://discuss.pytorch.org/t/balanced-sampling-between-classes-with-torchvision-dataloader/2703/3.
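The general technique behind balanced class sampling can be sketched as follows. This is an illustration of inverse-frequency weighting under the assumption that each sample carries an integer class label; the function balanced_sample_weights is hypothetical and may differ from what make_weights_for_balanced_classes actually does.

```python
from collections import Counter


def balanced_sample_weights(labels):
    """Return one weight per sample, inversely proportional to class size."""
    counts = Counter(labels)          # number of samples in each class
    total = len(labels)
    # Rare classes receive a larger weight than frequent ones.
    weight_per_class = {c: total / n for c, n in counts.items()}
    return [weight_per_class[c] for c in labels]


# Toy labels (assumed): three samples of class 0, one of class 1, two of class 2.
labels = [0, 0, 0, 1, 2, 2]
weights = balanced_sample_weights(labels)
print(weights)
```

With these weights the total weight of every class is the same, so a sampler that draws in proportion to them, such as torch.utils.data.WeightedRandomSampler(weights, num_samples=len(weights)), would pick each class about equally often.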

class BackgroundNoiseDataset(folder, transform=None, sample_rate=16000, sample_length=1)

Bases: torch.utils.data.Dataset

Dataset for silence / background noise.

class PreprocessedSpeechDataset(root, subset, classes=('unknown', 'silence', 'zero', 'one', 'two', 'three', 'four', 'five', 'six', 'seven', 'eight', 'nine'), silence_percentage=0.1)

Bases: torch.utils.data.Dataset

Google Speech Commands dataset, preprocessed with all transforms already applied.

Use the ‘process_dataset.py’ script to create the preprocessed dataset.

static is_valid(folder, epoch=0)

Check if the given folder is a valid preprocessed dataset.

make_weights_for_balanced_classes()

Adapted from https://discuss.pytorch.org/t/balanced-sampling-between-classes-with-torchvision-dataloader/2703/3.

next_epoch()

Load the next epoch from disk.