Model Overview

This model is a multi-class emotion classifier trained on German text that was machine-translated in equal proportions into Hungarian, Polish, Slovak, and Czech. It assigns each input text one of nine classes: eight emotions plus a "No emotion" class. The training data combines synthetic and original German sentences, which makes the model a test case for emotion classification on machine-translated, multilingual data.

Emotion Classes

The model predicts one of the following classes (class ID in parentheses):

  • Anger (0)
  • Fear (1)
  • Disgust (2)
  • Sadness (3)
  • Joy (4)
  • Enthusiasm (5)
  • Hope (6)
  • Pride (7)
  • No emotion (8)
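
The class IDs above can be written down as a lookup table, which is handy when decoding raw predictions. This is a minimal sketch: the names `ID2LABEL`, `LABEL2ID`, and `label_name` are illustrative and not taken from the model's configuration.

```python
# Class IDs and names as listed in this model card.
# The variable and function names are illustrative assumptions.
ID2LABEL = {
    0: "Anger",
    1: "Fear",
    2: "Disgust",
    3: "Sadness",
    4: "Joy",
    5: "Enthusiasm",
    6: "Hope",
    7: "Pride",
    8: "No emotion",
}

# Reverse lookup: class name -> class ID.
LABEL2ID = {name: class_id for class_id, name in ID2LABEL.items()}


def label_name(class_id: int) -> str:
    """Return the emotion name for a predicted class ID."""
    return ID2LABEL[class_id]
```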

Dataset and Preprocessing

The German source texts were translated in equal proportions into Hungarian, Polish, Slovak, and Czech, so each target language is equally represented. Preprocessing steps included:

  • Normalization to address linguistic variations across translations.
  • Undersampling of overrepresented classes, such as "No emotion" and "Anger," to balance the dataset.
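
The card does not say how the undersampling was done or what class size was targeted. As one possible reading, the sketch below shows plain random undersampling with a per-class cap. The function name `undersample`, the `cap` parameter, and the toy data are all assumptions made for illustration.

```python
import random
from collections import Counter, defaultdict


def undersample(examples, label_of, cap, seed=0):
    """Randomly keep at most `cap` examples per class (assumed strategy)."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for example in examples:
        by_class[label_of(example)].append(example)

    balanced = []
    for items in by_class.values():
        # Only classes larger than the cap are reduced.
        if len(items) > cap:
            items = rng.sample(items, cap)
        balanced.extend(items)

    rng.shuffle(balanced)
    return balanced


# Toy data: (text, class ID) pairs with "No emotion" (8) and "Anger" (0)
# overrepresented relative to "Hope" (6).
toy = (
    [(f"none-{i}", 8) for i in range(10)]
    + [(f"anger-{i}", 0) for i in range(6)]
    + [(f"hope-{i}", 6) for i in range(3)]
)
balanced = undersample(toy, label_of=lambda ex: ex[1], cap=3)
counts = Counter(label for _, label in balanced)
```

With a cap equal to the size of the smallest class, every class ends up with the same number of examples; a larger cap trades balance for more data.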

Evaluation Metrics

The model was evaluated with per-class precision, recall, and F1-score, and with overall accuracy. Support is the number of evaluation examples in each class. Detailed results are as follows:

Class            Precision   Recall   F1-Score   Support
Anger (0)        0.50        0.66     0.57       3108
Fear (1)         0.84        0.76     0.80       3104
Disgust (2)      0.93        0.93     0.93       3104
Sadness (3)      0.89        0.82     0.85       3100
Joy (4)          0.76        0.85     0.80       3108
Enthusiasm (5)   0.63        0.60     0.62       3104
Hope (6)         0.49        0.55     0.52       3108
Pride (7)        0.76        0.78     0.77       3104
No emotion (8)   0.71        0.58     0.64       6212

Overall Metrics

  • Accuracy: 0.71
  • Macro Average: Precision = 0.72, Recall = 0.73, F1-Score = 0.72
  • Weighted Average: Precision = 0.72, Recall = 0.71, F1-Score = 0.71
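
The macro and weighted averages can be re-derived from the per-class table: the macro average is the unweighted mean over the nine classes, and the weighted average weights each class by its support. The sketch below uses the rounded values from the table, so it reproduces the reported figures only to two decimals; the names `ROWS`, `macro`, and `weighted` are illustrative.

```python
# (class, precision, recall, f1, support), copied from the table above.
ROWS = [
    ("Anger", 0.50, 0.66, 0.57, 3108),
    ("Fear", 0.84, 0.76, 0.80, 3104),
    ("Disgust", 0.93, 0.93, 0.93, 3104),
    ("Sadness", 0.89, 0.82, 0.85, 3100),
    ("Joy", 0.76, 0.85, 0.80, 3108),
    ("Enthusiasm", 0.63, 0.60, 0.62, 3104),
    ("Hope", 0.49, 0.55, 0.52, 3108),
    ("Pride", 0.76, 0.78, 0.77, 3104),
    ("No emotion", 0.71, 0.58, 0.64, 6212),
]
METRICS = {"precision": 1, "recall": 2, "f1": 3}  # column index per metric


def macro(col):
    """Unweighted mean of one metric column over all classes."""
    return sum(row[col] for row in ROWS) / len(ROWS)


def weighted(col):
    """Support-weighted mean of one metric column."""
    total = sum(row[4] for row in ROWS)
    return sum(row[col] * row[4] for row in ROWS) / total


macro_avg = {name: round(macro(col), 2) for name, col in METRICS.items()}
weighted_avg = {name: round(weighted(col), 2) for name, col in METRICS.items()}

print(macro_avg)     # {'precision': 0.72, 'recall': 0.73, 'f1': 0.72}
print(weighted_avg)  # {'precision': 0.72, 'recall': 0.71, 'f1': 0.71}
```

The support-weighted recall equals overall accuracy by definition, which is why both are 0.71.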

Performance Insights

The model is strongest on "Disgust" (F1 0.93), "Sadness" (0.85), "Fear" (0.80), and "Joy" (0.80). It is weakest on "Hope" (0.52) and "Anger" (0.57), and "Enthusiasm" (0.62) and "No emotion" (0.64) also fall below the macro-average F1 of 0.72. Likely causes are the subtlety of these classes and noise introduced during translation. Covering four target languages adds complexity, yet the model reaches an overall accuracy of 0.71.

Model Usage

Applications

  • Multilingual emotion analysis for texts originating in German and translated into Hungarian, Polish, Slovak, or Czech.
  • Sentiment tracking or research in cross-linguistic contexts.
  • Studying emotion classification across multiple languages with machine-translated datasets.
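
The card does not document how to load the model, so the following is only a sketch. It assumes the checkpoint loads with the Hugging Face `transformers` Auto classes as a sequence-classification model and that its nine output logits follow the class order listed in this card. The helper names `top_emotion` and `classify` are hypothetical.

```python
import math

# Repository ID as given in this card.
MODEL_ID = "uvegesistvan/wildmann_german_proposal_multilingual_HU_PL_SK_CS"

# Assumption: logit index i corresponds to class ID i from this card.
EMOTIONS = ["Anger", "Fear", "Disgust", "Sadness", "Joy",
            "Enthusiasm", "Hope", "Pride", "No emotion"]


def top_emotion(logits):
    """Apply a softmax to the logits and return (emotion name, probability)."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    best = max(range(len(exps)), key=exps.__getitem__)
    return EMOTIONS[best], exps[best] / total


def classify(texts):
    """Classify a list of texts; downloads the model on first use."""
    # Imported lazily so top_emotion works without these packages installed.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID)
    model.eval()

    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**batch).logits
    return [top_emotion(row.tolist()) for row in logits]
```

A call such as `classify(["Jsem na tebe hrdý."])` would return one `(emotion, probability)` pair per input text. Given the per-class scores above, low-probability predictions for "Hope" or "Anger" deserve extra scrutiny.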

Limitations

  • Sequential translation into multiple target languages may introduce noise or biases, affecting performance for nuanced emotional states.
  • The model's accuracy may be lower than that of models trained on single-language data or on data produced by a single translation step.

Ethical Considerations

This model's reliance on machine-translated data means it may inherit biases or inaccuracies from the translation process. Users should carefully evaluate its applicability to sensitive use cases such as mental health assessments or social research.

Citation

For further information, visit: uvegesistvan/wildmann_german_proposal_multilingual_HU_PL_SK_CS

Model Details

  • Model size: 560M params
  • Tensor type: F32
  • Weights format: Safetensors