Audio Classification
speechbrain
PyTorch
English
embeddings
Commands
Keywords
Keyword Spotting
xvectors
TDNN
Command Recognition
How to use ayeberraen/google_speech_command_xvector with speechbrain:
    from speechbrain.pretrained import EncoderClassifier

    model = EncoderClassifier.from_hparams(
        "ayeberraen/google_speech_command_xvector"
    )
    model.classify_file("file.wav")
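As far as the SpeechBrain `EncoderClassifier` interface documents it, `classify_file` returns the per-class log posteriors together with the best score, the best index, and the decoded text label. The sketch below reproduces only that final decision step in plain Python, so it runs without SpeechBrain or the checkpoint. The helper `pick_command`, the `ASSUMED_LABELS` order, and the score values are illustrative assumptions; the real index-to-label mapping is whatever `label_encoder.txt` stores.

```python
import math

# Assumed label order, for illustration only. The 12-command task uses ten
# command words plus "unknown" and "silence", but the actual index order is
# defined by label_encoder.txt and may differ from this list.
ASSUMED_LABELS = [
    "yes", "no", "up", "down", "left", "right",
    "on", "off", "stop", "go", "unknown", "silence",
]


def pick_command(log_posteriors, labels=ASSUMED_LABELS):
    """Return (index, label, probability) for the highest-scoring class."""
    if len(log_posteriors) != len(labels):
        raise ValueError("expected one score per label")
    best = max(range(len(log_posteriors)), key=log_posteriors.__getitem__)
    return best, labels[best], math.exp(log_posteriors[best])


# Made-up log posteriors: most of the probability mass sits on index 8.
scores = [math.log(0.02)] * 12
scores[8] = math.log(0.78)

index, label, prob = pick_command(scores)
print(index, label, round(prob, 2))
```

With the real model, the equivalent values come straight from the tuple returned by `classify_file`, so a helper like this is only needed when post-processing raw scores yourself.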
# ############################################################################
# Model: xvector for Command Recognition with Google Speech Commands
# ############################################################################

# Pretrain folder (HuggingFace)
pretrained_path: speechbrain/google_speech_command_xvector

# Feature parameters
n_mels: 24

# Output parameters
out_n_neurons: 12  # 12 command version

# Model params
compute_features: !new:speechbrain.lobes.features.Fbank
    n_mels: !ref <n_mels>

mean_var_norm: !new:speechbrain.processing.features.InputNormalization
    norm_type: sentence
    std_norm: False

embedding_model: !new:speechbrain.lobes.models.Xvector.Xvector
    in_channels: !ref <n_mels>
    activation: !name:torch.nn.LeakyReLU
    tdnn_blocks: 5
    tdnn_channels: [512, 512, 512, 512, 1500]
    tdnn_kernel_sizes: [5, 3, 3, 1, 1]
    tdnn_dilations: [1, 2, 3, 1, 1]
    lin_neurons: 512

classifier: !new:speechbrain.lobes.models.Xvector.Classifier
    input_shape: [null, null, 512]
    activation: !name:torch.nn.LeakyReLU
    lin_blocks: 1
    lin_neurons: 512
    out_neurons: !ref <out_n_neurons>

mean_var_norm_emb: !new:speechbrain.processing.features.InputNormalization
    norm_type: global
    std_norm: False

modules:
    compute_features: !ref <compute_features>
    mean_var_norm: !ref <mean_var_norm>
    embedding_model: !ref <embedding_model>
    classifier: !ref <classifier>

label_encoder: !new:speechbrain.dataio.encoder.CategoricalEncoder

pretrainer: !new:speechbrain.utils.parameter_transfer.Pretrainer
    loadables:
        embedding_model: !ref <embedding_model>
        classifier: !ref <classifier>
        label_encoder: !ref <label_encoder>
    paths:
        embedding_model: !ref <pretrained_path>/embedding_model.ckpt
        classifier: !ref <pretrained_path>/classifier.ckpt
        label_encoder: !ref <pretrained_path>/label_encoder.txt
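
The `tdnn_kernel_sizes` and `tdnn_dilations` lists in the configuration determine how many consecutive feature frames each output of the TDNN stack can see. The sketch below computes that temporal context with the standard receptive-field formula for stacked dilated 1-D convolutions. It assumes every layer uses stride 1, and `tdnn_receptive_field` is a hypothetical helper written for this illustration, not part of SpeechBrain.

```python
def tdnn_receptive_field(kernel_sizes, dilations):
    """Frames of input context seen by one output frame of a stack of
    dilated 1-D convolutions, assuming stride 1 in every layer."""
    if len(kernel_sizes) != len(dilations):
        raise ValueError("need one dilation per kernel size")
    field = 1
    for kernel, dilation in zip(kernel_sizes, dilations):
        # Each layer widens the context by (kernel - 1) * dilation frames.
        field += (kernel - 1) * dilation
    return field


# Values copied from the embedding_model section of the configuration.
frames = tdnn_receptive_field([5, 3, 3, 1, 1], [1, 2, 3, 1, 1])
print(frames)  # 15
```

Under that assumption each frame-level output covers 15 filterbank frames. The last two blocks have kernel size 1, so they add channels without adding context.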