The model used is **Whisper Large V3**, fine-tuned for the **audio classification** task:
- **Model**: [openai/whisper-large-v3](https://huggingface.co/openai/whisper-large-v3)
- **Output**: Emotion labels (`'Angry', 'Disgust', 'Fearful', 'Happy', 'Neutral', 'Sad', 'Surprised'`)
I map the emotion labels to numeric IDs and use them for model training and evaluation.
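That mapping can be sketched as follows (a minimal illustration; the label order and helper names are assumptions, not the repo's actual code):

```python
# Hypothetical label <-> ID mapping for the seven emotion classes above.
LABELS = ["Angry", "Disgust", "Fearful", "Happy", "Neutral", "Sad", "Surprised"]
label2id = {label: i for i, label in enumerate(LABELS)}
id2label = {i: label for label, i in label2id.items()}

def encode(labels):
    """Convert a list of string emotion labels to numeric IDs."""
    return [label2id[l] for l in labels]

print(encode(["Happy", "Sad"]))  # → [3, 5]
```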
The model is trained with the following parameters:

- **Warmup Ratio for LR Scheduler**: `0.1`
- **Number of Epochs**: `25`
- **Mixed Precision Training**: Native AMP (Automatic Mixed Precision)
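
As a concrete illustration of the warmup ratio, a linear warmup-then-decay schedule can be sketched as below (an assumption about the schedule shape; the actual Trainer scheduler may differ):

```python
# Sketch: with warmup_ratio=0.1, the LR ramps up over the first 10% of
# steps, then decays linearly to zero over the remaining steps.
def lr_at_step(step, total_steps, base_lr, warmup_ratio=0.1):
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    # Linear decay from base_lr down to zero after warmup.
    return base_lr * max(0.0, (total_steps - step) / (total_steps - warmup_steps))
```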
These parameters ensure efficient model training and stability, especially when dealing with large datasets and deep models like **Whisper**.
The training utilizes **Wandb** for experiment tracking and monitoring.
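
Put together, the configuration might look like the following sketch using Hugging Face `TrainingArguments` (illustrative only; the output directory and any parameters not listed above are assumptions, not the repo's code):

```python
from transformers import TrainingArguments

# Illustrative configuration matching the parameters listed above.
training_args = TrainingArguments(
    output_dir="whisper-emotion",  # hypothetical output directory
    num_train_epochs=25,
    warmup_ratio=0.1,
    fp16=True,                     # native AMP mixed-precision training
    report_to="wandb",             # log metrics to Weights & Biases
)
```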
The following evaluation metrics were obtained after training the model:

- **Precision**: `0.9230`
- **Recall**: `0.9199`
- **F1 Score**: `0.9198`
These metrics demonstrate strong performance on the speech emotion recognition task: the high accuracy, precision, recall, and F1 scores indicate that the model effectively identifies emotional states from speech.
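
For reference, one common way to compute macro-averaged precision, recall, and F1 is sketched below (illustrative only; the averaging method used for the reported numbers is not stated here):

```python
def macro_scores(y_true, y_pred, labels):
    """Macro-averaged precision, recall, and F1 over the given classes."""
    precisions, recalls, f1s = [], [], []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n
```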