Inference API Results: Similar Values for Arousal, Valence, and Dominance

#11
by shivamisb - opened

Hi,
I am using the Inference API and noticed that the results for arousal, valence, and dominance are very similar, with values ranging between 3.2 and 3.4.
Can someone help me understand why?

Thanks.

audEERING GmbH org

The model should return values in the range of 0 to 1 for all three dimensions.

What do you mean by "inference API"?

I see a similar thing when I run the model with the Hugging Face audio-classification pipeline, even when I set function_to_apply to none (rather than running a softmax over the results). My best guess right now is that the audio-classification pipeline doesn't add the correct classification head, but I am not sure.
https://huggingface.co/docs/transformers/v4.47.1/en/main_classes/pipelines#transformers.AudioClassificationPipeline.__call__.function_to_apply

audmax changed discussion status to closed
audmax changed discussion status to open
audEERING GmbH org

The classification head used in the audio-classification pipeline is different from the one used for this model. In the pipeline, the projection layer comes before the pooling, while in this repository the pooling is applied first. The layers also differ in their dimensionalities and names.

Thus, you will need to use the code example given in the model card in order to run inference correctly.
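To see why the ordering described above matters, here is a toy sketch (the dimensions, layer names, and tensors are made up for illustration and are not the real model's): as soon as the head contains a nonlinearity, pooling before projection and projecting before pooling give different results.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

hidden_size, num_labels = 8, 3             # toy sizes, not the model's
frames = torch.randn(1, 50, hidden_size)   # fake (batch, time, hidden) encoder output

# A small head with a nonlinearity between two linear layers
dense = nn.Linear(hidden_size, hidden_size)
out_proj = nn.Linear(hidden_size, num_labels)

def head(x):
    return out_proj(torch.tanh(dense(x)))

# Pipeline-style ordering: project each frame, then pool over time
project_then_pool = head(frames).mean(dim=1)

# This repository's ordering: pool over time first, then project
pool_then_project = head(frames.mean(dim=1))

# The two orderings generally disagree once a nonlinearity is involved
print(torch.allclose(project_then_pool, pool_then_project))
```

With a purely linear head the two orderings would coincide, which is why the nonlinearity (and the differing layer names and sizes) is what breaks interchangeability here.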

Appreciate it! I had found this to be the case as well, but was investigating other issues I was noticing in the audio-classification pipeline. It seems that even if I fix those issues, the naming conventions / pipeline AutoModel configuration cause an essentially complete loss of "classification" ability. Not sure whether Hugging Face should eventually standardize the naming so that the pipelines can work with arbitrary classification heads, but I appreciate the help all the same! Very cool work y'all have put together!

audEERING GmbH org

Yes, layers are newly initialized if they are not found (or are named differently) in the model file. If the architecture of the head is exactly the same as in the pipeline, renaming the weights in the safetensors or pytorch_model.bin file, respectively, might solve the issue.
Otherwise, transformers prints only a message such as "Some weights of Wav2Vec2ForSequenceClassification were not initialized from the model checkpoint at [...]" and no explicit warning or error.
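As a sketch of the renaming idea: the key names and tensor shapes below are purely hypothetical, and the real mapping would have to be read off the two architectures' state dicts, but the mechanics are just a dictionary rewrite before saving the checkpoint.

```python
import torch

# Toy checkpoint: key names and shapes are illustrative only
state = {
    "wav2vec2.encoder.layers.0.final_layer_norm.weight": torch.ones(4),
    "classifier.dense.weight": torch.zeros(4, 4),    # head name in one repo
    "classifier.out_proj.weight": torch.zeros(3, 4),
}

# Hypothetical mapping from the checkpoint's head names to the names
# the target architecture expects; unmapped keys pass through unchanged
rename = {
    "classifier.dense.weight": "projector.weight",
    "classifier.out_proj.weight": "classifier.weight",
}
renamed = {rename.get(k, k): v for k, v in state.items()}

# Save the renamed state dict so transformers can match the keys on load
torch.save(renamed, "pytorch_model.bin")
```

Note that renaming only helps if the layer shapes match as well; otherwise the weights will still be discarded and re-initialized.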
