LLM_project

This model is a fine-tuned version of distilbert-base-uncased-finetuned-sst-2-english on the IMDb reviews dataset. It achieves the following results on the evaluation set (the figures match the epoch 2 checkpoint listed under Training results):

  • Loss: 0.0852
  • Accuracy: 0.9804

Model description

This model is a fine-tuned version of DistilBERT, a smaller, faster, and lighter version of BERT (Bidirectional Encoder Representations from Transformers). DistilBERT was pre-trained on a large corpus of English text in a self-supervised fashion. The starting checkpoint used here, distilbert-base-uncased-finetuned-sst-2-english, had already been fine-tuned for sentiment classification on SST-2, and this model fine-tunes it further on IMDb reviews. The model is uncased: input text is lowercased, so it does not distinguish between uppercase and lowercase letters.

According to its authors, DistilBERT retains 97% of BERT's language understanding capability while being 60% faster and 40% smaller, which makes it efficient for NLP tasks such as sentiment analysis, the task this model is tuned for.

Intended uses & limitations

Intended Uses:

The model is intended for sentiment analysis of English text, specifically binary classification of a passage as positive or negative. It can be applied to product reviews, social media posts, customer feedback, and similar text.
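
A minimal sketch of how the model might be called for this purpose, assuming the Transformers `pipeline` API and the repository ID `ThuyTran102/LLM_project` (taken from the model's hosting page; adjust it if the model lives elsewhere). The helper names `load_classifier` and `summarize` are illustrative. The label strings depend on the model's config, so the sketch only groups by whatever labels come back.

```python
from collections import Counter

# Assumed repository ID; replace it if the model is stored under another name.
MODEL_ID = "ThuyTran102/LLM_project"


def load_classifier(model_id=MODEL_ID):
    """Build a text-classification pipeline (downloads the model on first use)."""
    # Imported lazily so that summarize() can be used without Transformers installed.
    from transformers import pipeline

    return pipeline("text-classification", model=model_id)


def summarize(predictions):
    """Count predictions per label.

    `predictions` is a list of dicts with "label" and "score" keys, which is
    the shape the pipeline returns for a list of input strings.
    """
    return Counter(p["label"] for p in predictions)
```

Typical use would be `clf = load_classifier()` followed by `summarize(clf(["Great film.", "Dull and far too long."], truncation=True))`. Passing `truncation=True` keeps long reviews within DistilBERT's 512-token input limit.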

Limitations:

  • The model's performance depends heavily on the quality and representativeness of the fine-tuning dataset.
  • It may not perform well on text that differs substantially from the fine-tuning data (IMDb movie reviews).
  • It outputs only a binary positive/negative judgment, so it may not capture nuanced sentiment or complex emotions.
  • It is not suitable for tasks outside binary sentiment classification without further fine-tuning.

Training and evaluation data

The model was fine-tuned on the IMDb reviews dataset and evaluated on a separate validation set that was not seen during training. The validation set is labeled for the same sentiment analysis task.

Training procedure

Procedure

  1. Data Preprocessing: Text data was tokenized using the DistilBERT tokenizer, which converts text into a format suitable for the model.
  2. Model Fine-Tuning: The pre-trained DistilBERT model was fine-tuned on the training dataset. Fine-tuning involves adjusting the weights of the model to better fit the sentiment analysis task.
  3. Evaluation: After training, the model was evaluated on the validation set to measure its performance in terms of loss and accuracy.
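
The three steps above can be sketched roughly as follows. This is an illustration under assumptions, not the author's training script: `tokenize_batch` and `accuracy_from_logits` are hypothetical helper names, the `"text"` column name follows the public IMDb dataset layout, and truncation to 512 tokens is an assumed setting.

```python
def tokenize_batch(batch, tokenizer, max_length=512):
    """Step 1: convert raw review text into model inputs.

    `tokenizer` is expected to be a DistilBERT tokenizer, for example one
    loaded with AutoTokenizer.from_pretrained(...); any callable with the
    same interface works.
    """
    return tokenizer(batch["text"], truncation=True, max_length=max_length)


def accuracy_from_logits(logits, labels):
    """Step 3: fraction of examples whose highest-scoring class equals the label."""
    correct = 0
    for row, label in zip(logits, labels):
        # Index of the largest logit is the predicted class.
        predicted = max(range(len(row)), key=lambda i: row[i])
        correct += int(predicted == label)
    return correct / len(labels)


def compute_metrics(eval_pred):
    """Adapter for the Trainer API, which passes a (logits, labels) pair."""
    logits, labels = eval_pred
    return {"accuracy": accuracy_from_logits(logits, labels)}
```

Step 2 would then amount to passing the tokenized dataset and `compute_metrics` to a Transformers `Trainer` and calling `train()`.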

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 3
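
As a sketch, the values above map onto Transformers `TrainingArguments` roughly as shown below. Two assumptions are made: training ran on a single device, so `train_batch_size` corresponds to `per_device_train_batch_size`, and the total of 3750 optimizer steps is taken from the training results table. The names `TRAINING_KWARGS`, `build_training_args`, and `linear_lr` are illustrative, and the output directory is a placeholder. `linear_lr` restates the linear warmup and decay rule in plain Python to show what the scheduler settings imply.

```python
# Hyperparameters copied from the list above, keyed by TrainingArguments names.
TRAINING_KWARGS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 16,  # assumes a single device
    "per_device_eval_batch_size": 16,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "warmup_steps": 100,
    "num_train_epochs": 3,
}


def build_training_args(output_dir="LLM_project"):
    """Create TrainingArguments; output_dir is a placeholder path."""
    from transformers import TrainingArguments  # lazy import

    return TrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)


def linear_lr(step, peak_lr=2e-5, warmup_steps=100, total_steps=3750):
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / (total_steps - warmup_steps)
```

Under these assumptions the learning rate rises from 0 to 2e-05 over the first 100 steps and then falls linearly, reaching 0 at step 3750.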

Training results

Training Loss   Epoch   Step   Validation Loss   Accuracy
0.0743          1.0     1250   0.1208            0.9696
0.145           2.0     2500   0.0852            0.9804
0.0322          3.0     3750   0.1043            0.9822
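
The table can be read programmatically to check two things: how many training examples each epoch covers, and which epoch is best under each metric. The short sketch below copies the rows from the table. The example count is an upper bound that assumes one device and no gradient accumulation, neither of which the card states.

```python
# Rows copied from the training results table:
# (training_loss, epoch, step, validation_loss, accuracy)
RESULTS = [
    (0.0743, 1, 1250, 0.1208, 0.9696),
    (0.145, 2, 2500, 0.0852, 0.9804),
    (0.0322, 3, 3750, 0.1043, 0.9822),
]
TRAIN_BATCH_SIZE = 16  # from the hyperparameter list

# Optimizer steps per epoch, taken from the first row.
steps_per_epoch = RESULTS[0][2] // RESULTS[0][1]

# Upper bound on examples seen per epoch (the last batch may be partial).
examples_per_epoch = steps_per_epoch * TRAIN_BATCH_SIZE

# Best epoch under each criterion.
best_by_loss = min(RESULTS, key=lambda r: r[3])
best_by_accuracy = max(RESULTS, key=lambda r: r[4])

print("steps per epoch:", steps_per_epoch)
print("examples per epoch (at most):", examples_per_epoch)
print("lowest validation loss at epoch:", best_by_loss[1])
print("highest accuracy at epoch:", best_by_accuracy[1])
```

Validation loss is lowest at epoch 2 while accuracy is highest at epoch 3, so the choice of checkpoint depends on which metric is prioritized. The headline loss and accuracy at the top of the card match the epoch 2 row.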

Framework versions

  • Transformers 4.41.2
  • PyTorch 2.3.1+cpu
  • Datasets 2.20.0
  • Tokenizers 0.19.1