# MIT/ast-finetuned-speech-commands-v2
This is the MIT/ast-finetuned-speech-commands-v2 model converted to OpenVINO for accelerated inference.

An example of how to run inference with this model:
```python
from optimum.intel.openvino import OVModelForAudioClassification
from transformers import AutoFeatureExtractor, pipeline

# model_id should be set to either a local directory or a model available on the Hugging Face hub.
model_id = "helenai/MIT-ast-finetuned-speech-commands-v2-ov"
feature_extractor = AutoFeatureExtractor.from_pretrained(model_id)
model = OVModelForAudioClassification.from_pretrained(model_id)
pipe = pipeline("audio-classification", model=model, feature_extractor=feature_extractor)
result = pipe("/static-proxy?url=https%3A%2F%2Fdatasets-server.huggingface.co%2Fassets%2Fspeech_commands%2F--%2Fv0.01%2Ftest%2F38%2Faudio%2Faudio.mp3")
print(result)
```
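The audio-classification pipeline returns a list of dictionaries, each with a `label` and a `score` for the top predicted keywords.

If you prefer to convert the original PyTorch checkpoint to OpenVINO yourself rather than downloading this pre-converted model, the sketch below shows one way to do it with optimum-intel's `export=True` option; the output directory name is only an example.

```python
from optimum.intel.openvino import OVModelForAudioClassification
from transformers import AutoFeatureExtractor

# Original PyTorch checkpoint on the Hugging Face hub.
pt_model_id = "MIT/ast-finetuned-speech-commands-v2"
# Hypothetical local output directory for the converted model.
save_dir = "MIT-ast-finetuned-speech-commands-v2-ov"

# export=True converts the PyTorch model to OpenVINO IR during loading.
model = OVModelForAudioClassification.from_pretrained(pt_model_id, export=True)
feature_extractor = AutoFeatureExtractor.from_pretrained(pt_model_id)

# Save the converted model and the feature extractor together,
# so the directory can be used as model_id in the inference example above.
model.save_pretrained(save_dir)
feature_extractor.save_pretrained(save_dir)
```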