---
license: mit
language:
- tr
datasets:
- winvoker/turkish-sentiment-analysis-dataset
metrics:
- accuracy
base_model:
- answerdotai/ModernBERT-base
---

# Turkish Sentiment Modern BERT

This model is a fine-tuned **ModernBERT** for **Turkish sentiment analysis**. It was trained on the [winvoker/turkish-sentiment-analysis-dataset](https://huggingface.co/datasets/winvoker/turkish-sentiment-analysis-dataset) and classifies Turkish text into three sentiment categories: **Positive**, **Negative**, and **Neutral**.

## Model Overview

- **Model Type**: ModernBERT (BERT variant)
- **Task**: Sentiment Analysis
- **Languages**: Turkish
- **Dataset**: [winvoker/turkish-sentiment-analysis-dataset](https://huggingface.co/datasets/winvoker/turkish-sentiment-analysis-dataset)
- **Labels**: Positive, Negative, Neutral
- **Fine-Tuning**: Fine-tuned from [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base) for sentiment classification.

## Performance Metrics

The model was trained for **2 epochs** with the following results:

| Epoch | Training Loss | Validation Loss | Accuracy | F1 Score |
|-------|---------------|-----------------|----------|----------|
| 1     | 0.2182        | 0.1920          | 92.16%   | 84.57%   |
| 2     | 0.1839        | 0.1826          | 92.58%   | 86.05%   |

- **Training Loss**: Measures how well the model fits the training data.
- **Validation Loss**: Measures how well the model generalizes to held-out data.
- **Accuracy**: The percentage of examples that were classified correctly.
- **F1 Score**: The harmonic mean of precision and recall.

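To make the accuracy and F1 definitions above concrete, here is a minimal sketch that computes both on a small made-up set of labels. The toy data and the helper names `accuracy` and `macro_f1` are illustrative only. This card does not state how F1 was averaged across the three classes, so macro averaging is an assumption here, not a description of the actual evaluation.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (macro averaging is an assumption)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        # F1 is the harmonic mean of precision and recall.
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        scores.append(f1)
    return sum(scores) / len(scores)


# Toy labels for illustration only; not taken from the real dataset.
y_true = ["Positive"] * 4 + ["Negative"] * 2 + ["Neutral"] * 2
y_pred = ["Positive"] * 4 + ["Negative", "Neutral"] + ["Neutral"] * 2

print(f"Accuracy: {accuracy(y_true, y_pred):.3f}")  # 0.875
print(f"Macro F1: {macro_f1(y_true, y_pred):.3f}")  # 0.822
```

In this toy example a single misclassified Negative example lowers macro F1 more than accuracy, because each class contributes equally to the average regardless of how many examples it has.
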
## Model Inference Example

Here’s an example of how to use the model for sentiment analysis of Turkish text:

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub
model_name = "bayrameker/turkish-sentiment-modern-bert"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Example texts for prediction
texts = ["bu ürün çok iyi", "bu ürün berbat"]

# Tokenize the inputs as a padded batch of PyTorch tensors
inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

# Run inference without tracking gradients
with torch.no_grad():
    logits = model(**inputs).logits

# Pick the highest-scoring class for each text
predictions = torch.argmax(logits, dim=-1)

# Assumed index-to-label order; check model.config.id2label and adjust if needed
labels = ["Negative", "Neutral", "Positive"]

for text, pred in zip(texts, predictions):
    print(f"Text: {text} -> Sentiment: {labels[pred.item()]}")
```

### Example Output:

```
Text: bu ürün çok iyi -> Sentiment: Positive
Text: bu ürün berbat -> Sentiment: Negative
```

## Installation

To use this model, first install the required dependencies. ModernBERT support was added in `transformers` v4.48.0, so use that version or newer.

```bash
pip install transformers
pip install torch
pip install datasets
```

## Model Card

- **Model Name**: turkish-sentiment-modern-bert
- **Hugging Face Repo**: [bayrameker/turkish-sentiment-modern-bert](https://huggingface.co/bayrameker/turkish-sentiment-modern-bert)
- **License**: MIT
- **Author**: Bayram Eker
- **Date**: 2024-12-21

## Training Details

- **Model**: ModernBERT (Base variant)
- **Framework**: PyTorch
- **Training Time**: 34 minutes (2 epochs)
- **Batch Size**: 32
- **Learning Rate**: 8e-5
- **Optimizer**: AdamW

## Acknowledgments

- The model was trained on the [winvoker/turkish-sentiment-analysis-dataset](https://huggingface.co/datasets/winvoker/turkish-sentiment-analysis-dataset).
- Special thanks to the Hugging Face community and all contributors to the transformers library.

## Future Work

- Extend the model to finer-grained sentiment tasks (e.g., additional sentiment classes, aspect-based sentiment analysis).
- Fine-tune the model on a larger, more diverse dataset for better generalization across domains.

## License

This model is licensed under the MIT License. See the LICENSE file for more details.