bayrameker committed on
Commit
f199601
1 Parent(s): 735a76c

Update README.md

Files changed (1)
  1. README.md +121 -3
README.md CHANGED
@@ -1,3 +1,121 @@
- ---
- license: mit
- ---
+ ---
+ license: mit
+ datasets:
+ - winvoker/turkish-sentiment-analysis-dataset
+ language:
+ - tr
+ base_model:
+ - answerdotai/ModernBERT-large
+ ---
+
+ # Turkish Sentiment ModernBERT-large
+
+ This is a fine-tuned **ModernBERT-large** model for **Turkish sentiment analysis**. It was trained on the `winvoker/turkish-sentiment-analysis-dataset` and classifies Turkish text as positive, negative, or neutral.
+
+ ## Model Overview
+
+ - **Model Type**: ModernBERT (encoder-only BERT variant)
+ - **Task**: Sentiment Analysis
+ - **Languages**: Turkish
+ - **Dataset**: [winvoker/turkish-sentiment-analysis-dataset](https://huggingface.co/datasets/winvoker/turkish-sentiment-analysis-dataset)
+ - **Labels**: Positive, Negative, Neutral
+ - **Fine-Tuning**: Fine-tuned from `answerdotai/ModernBERT-large` for sentiment classification.
+
+ ## Performance Metrics
+
+ The model was trained for **4 epochs** with the following results:
+
+ | Epoch | Training Loss | Validation Loss | Accuracy | F1 Score |
+ |-------|---------------|-----------------|----------|----------|
+ | 1 | 0.2884 | 0.1133 | 95.72% | 92.18% |
+ | 2 | 0.1759 | 0.1050 | 96.24% | 93.33% |
+ | 3 | 0.0633 | 0.1233 | 96.14% | 93.19% |
+ | 4 | 0.0623 | 0.1213 | 96.14% | 93.19% |
+
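One way to read the table is to ask which epoch performed best. The sketch below copies the reported numbers and picks the epoch with the lowest validation loss and the one with the highest F1 score. The README does not state which epoch's weights were published, so this is only a reading aid, not a description of the author's checkpoint selection.

```python
# Per-epoch results copied from the table above:
# (epoch, training loss, validation loss, accuracy %, F1 %)
results = [
    (1, 0.2884, 0.1133, 95.72, 92.18),
    (2, 0.1759, 0.1050, 96.24, 93.33),
    (3, 0.0633, 0.1233, 96.14, 93.19),
    (4, 0.0623, 0.1213, 96.14, 93.19),
]

# Pick the epoch with the lowest validation loss.
best_by_loss = min(results, key=lambda row: row[2])
# Pick the epoch with the highest F1 score.
best_by_f1 = max(results, key=lambda row: row[4])

print(f"Lowest validation loss: epoch {best_by_loss[0]} ({best_by_loss[2]:.4f})")
print(f"Highest F1 score: epoch {best_by_f1[0]} ({best_by_f1[4]:.2f}%)")
```

Both criteria point to epoch 2. After that, training loss keeps falling while validation loss rises, which is consistent with mild overfitting in epochs 3 and 4.
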
+ - **Training Loss**: Measures how well the model fits the training data.
+ - **Validation Loss**: Measures how well the model generalizes to held-out data.
+ - **Accuracy**: Percentage of examples classified correctly.
+ - **F1 Score**: The harmonic mean of precision and recall, which accounts for both false positives and false negatives.
+
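To make the accuracy and F1 definitions concrete, here is a minimal pure-Python sketch. It assumes macro averaging (the unweighted mean of per-class F1); the README does not say which averaging was used for the reported scores, and the toy labels below are invented for illustration.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f1_per_class(y_true, y_pred, label):
    """F1 for one class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_per_class(y_true, y_pred, label) for label in labels) / len(labels)


# Toy data, invented for illustration (not taken from the dataset).
y_true = ["Positive", "Negative", "Neutral", "Positive", "Negative", "Neutral"]
y_pred = ["Positive", "Negative", "Positive", "Positive", "Neutral", "Neutral"]

print(f"accuracy = {accuracy(y_true, y_pred):.4f}")
print(f"macro F1 = {macro_f1(y_true, y_pred):.4f}")
```
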
+ ## Model Inference Example
+
+ You can use this model to classify the sentiment of Turkish text. Here is an example:
+
+ ```python
+ from transformers import AutoModelForSequenceClassification, AutoTokenizer
+ import torch
+
+ # Load the fine-tuned model and tokenizer
+ model_name = "bayrameker/Turkish-sentiment-ModernBERT-large"
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+ model = AutoModelForSequenceClassification.from_pretrained(model_name)
+
+ # Example texts for prediction
+ texts = ["bu ürün çok iyi", "bu ürün berbat"]
+
+ # Tokenize the inputs
+ inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
+
+ # Make predictions without tracking gradients
+ with torch.no_grad():
+     logits = model(**inputs).logits
+
+ # Get the predicted sentiment labels
+ predictions = torch.argmax(logits, dim=-1)
+ # Assumed label order; verify it against model.config.id2label
+ labels = ["Negative", "Neutral", "Positive"]
+ for text, pred in zip(texts, predictions):
+     print(f"Text: {text} -> Sentiment: {labels[pred.item()]}")
+ ```
+
+ ### Example Output:
+
+ ```
+ Text: bu ürün çok iyi -> Sentiment: Positive
+ Text: bu ürün berbat -> Sentiment: Negative
+ ```
+
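The example above reports only the top label. If a confidence score is also useful, the logits can be turned into probabilities with a softmax. The sketch below does this in pure Python on hypothetical logits (illustrative values, not real model output) and reuses the assumed label order from the example.

```python
import math


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    peak = max(logits)
    # Subtract the maximum before exponentiating for numerical stability.
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Assumed label order, as in the example above; verify it against the model's config.
labels = ["Negative", "Neutral", "Positive"]

# Hypothetical logits for one sentence (illustrative values, not real model output).
example_logits = [-1.2, 0.3, 2.5]

probs = softmax(example_logits)
best = max(range(len(probs)), key=probs.__getitem__)
print(f"Sentiment: {labels[best]} (confidence {probs[best]:.2f})")
```

On the tensor returned by the model, `torch.softmax(logits, dim=-1)` computes the same quantity for the whole batch.
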
+ ## Installation
+
+ To use this model, install the following dependencies:
+
+ ```bash
+ pip install transformers
+ pip install torch
+ pip install datasets
+ ```
+
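ModernBERT is a recent architecture, so an older `transformers` installation may fail to load the model. The sketch below checks a version string against a minimum. It assumes that ModernBERT support first shipped in `transformers` 4.48.0; `parse_version` and `supports_modernbert` are hypothetical helper names, not library functions.

```python
def parse_version(version):
    """Parse up to three leading numeric parts, e.g. '4.48.0.dev0' -> (4, 48, 0)."""
    parts = []
    for piece in version.split(".")[:3]:
        if not piece.isdigit():
            break
        parts.append(int(piece))
    # Pad short versions such as '4.48' to three components.
    parts += [0] * (3 - len(parts))
    return tuple(parts)


# Assumption: ModernBERT support first shipped in transformers 4.48.0.
MIN_TRANSFORMERS = (4, 48, 0)


def supports_modernbert(version):
    """Return True if the version string meets the assumed minimum."""
    return parse_version(version) >= MIN_TRANSFORMERS


for candidate in ["4.47.1", "4.48.0"]:
    print(candidate, supports_modernbert(candidate))
```

With the library installed, the installed version is available as `transformers.__version__` and can be passed to `supports_modernbert` directly.
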
+ ## Model Card
+
+ - **Model Name**: Turkish-sentiment-ModernBERT-large
+ - **Hugging Face Repo**: [bayrameker/Turkish-sentiment-ModernBERT-large](https://huggingface.co/bayrameker/Turkish-sentiment-ModernBERT-large)
+ - **License**: MIT
+ - **Author**: Bayram Eker
+ - **Date**: 2024-12-21
+
+ ## Training Details
+
+ - **Model**: ModernBERT-large
+ - **Framework**: PyTorch
+ - **Training Time**: Approximately 50 minutes (4 epochs)
+ - **Batch Size**: 64
+ - **Learning Rate**: 8e-5
+ - **Optimizer**: AdamW
+ - **Mixed Precision**: bf16 (trained on an A100 GPU)
+
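For anyone trying to reproduce a similar run, the hyperparameters above can be collected in one place. The sketch below stores them under the matching `transformers.TrainingArguments` parameter names. The README does not say whether the Hugging Face `Trainer` was used or whether 64 is a per-device batch size, so both are assumptions.

```python
# Hyperparameters listed under "Training Details", keyed by the matching
# transformers.TrainingArguments parameter names.
training_config = {
    "num_train_epochs": 4,
    "per_device_train_batch_size": 64,  # assumption: 64 is the per-device batch size
    "learning_rate": 8e-5,
    "optim": "adamw_torch",  # assumption: PyTorch's AdamW implementation
    "bf16": True,  # bfloat16 mixed precision
}

# Rough per-epoch wall-clock time from the reported total of about 50 minutes.
minutes_per_epoch = 50 / training_config["num_train_epochs"]
print(f"About {minutes_per_epoch:.1f} minutes per epoch")

# With transformers installed, the dictionary could be passed on like this
# (output_dir is a hypothetical path):
#   from transformers import TrainingArguments
#   args = TrainingArguments(output_dir="modernbert-tr-sentiment", **training_config)
```
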
+ ## Acknowledgments
+
+ - The model was trained on the `winvoker/turkish-sentiment-analysis-dataset`.
+ - Special thanks to the Hugging Face community and the contributors to the `transformers` library.
+ - Thanks to all contributors to the dataset and the pretrained models.
+
+ ## Future Work
+
+ - Extend the model to finer-grained sentiment labels (e.g., more sentiment classes, aspect-based sentiment analysis).
+ - Fine-tune the model on a larger, more diverse dataset for better generalization across domains.
+
+ ## License
+
+ This model is licensed under the MIT License. See the LICENSE file for more details.