pipeline_tag: text-classification
tags:
- code
license: apache-2.0
datasets:
- Alex123321/english_cefr_dataset
language:
- en
metrics:
- accuracy
library_name: transformers
Model Card: BERT-based CEFR Classifier
Overview
This repository contains a BERT-based model that predicts the Common European Framework of Reference (CEFR) level of a given text. It was fine-tuned on the CEFR dataset, starting from the pre-trained bert-base-... model.
Model Details
- Model architecture: BERT (base model: bert-base-...)
- Task: text classification (CEFR level prediction)
- Training dataset: CEFR dataset (listed in the metadata as Alex123321/english_cefr_dataset)
- Fine-tuning: 5 epochs; training and validation loss per epoch are reported under Performance
Performance
The model's performance during training is summarized below:
Epoch | Training Loss | Validation Loss |
---|---|---|
1 | 0.412300 | 0.396337 |
2 | 0.369600 | 0.388866 |
3 | 0.298200 | 0.419018 |
4 | 0.214500 | 0.481886 |
5 | 0.148300 | 0.557343 |
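
One way to read the table is to look for the epoch with the lowest validation loss and to compare it with the training loss. The sketch below only re-uses the numbers from the table; the variable names are illustrative, and treating a widening gap as a sign of overfitting is a common heuristic, not a claim made by the model's authors.

```python
# Per-epoch losses copied from the table above:
# (epoch, training loss, validation loss)
history = [
    (1, 0.412300, 0.396337),
    (2, 0.369600, 0.388866),
    (3, 0.298200, 0.419018),
    (4, 0.214500, 0.481886),
    (5, 0.148300, 0.557343),
]

# Epoch with the lowest validation loss.
best_epoch, _, best_val = min(history, key=lambda row: row[2])
print(f"Lowest validation loss: {best_val} at epoch {best_epoch}")

# Validation loss minus training loss for each epoch.
# A gap that keeps growing is a common (heuristic) sign of overfitting.
gaps = {epoch: round(val - train, 6) for epoch, train, val in history}
for epoch, gap in gaps.items():
    print(f"epoch {epoch}: gap = {gap}")
```

In these numbers the validation loss is lowest at epoch 2 and rises afterwards while the training loss keeps falling, so a checkpoint from an earlier epoch might generalize better than the final one.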
Additional metrics:
- Training loss: 0.2900624789151278
- Training runtime: 5168.3962 seconds
- Training samples per second: 10.642
- Total floating point operations: 1.447162776576e+16
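
The runtime and throughput figures above can be cross-checked with a little arithmetic. The sketch below assumes that samples per second was computed as (training examples × epochs) / runtime, which is how such throughput figures are usually derived; under that assumption it gives a rough estimate of the training set size. The estimate is not a figure reported by the authors.

```python
# Values copied from the metrics above.
train_runtime_s = 5168.3962
samples_per_second = 10.642
num_epochs = 5  # number of rows in the per-epoch loss table

# Assumption: samples/second = (training examples * epochs) / runtime,
# so runtime * samples/second is the total number of samples processed.
total_samples_seen = train_runtime_s * samples_per_second
approx_train_examples = total_samples_seen / num_epochs

print(f"Runtime: {train_runtime_s / 60:.1f} minutes")
print(f"Samples processed over the run: {total_samples_seen:,.0f}")
print(f"Approximate training examples per epoch: {approx_train_examples:,.0f}")
```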
Usage
- Install the required libraries by running `pip install transformers`.
- Load the trained model and use it for CEFR level prediction:
```python
from transformers import pipeline

# Load the model
model_name = "AbdulSami/bert-base-cased-cefr"
classifier = pipeline("text-classification", model=model_name)

# Text for prediction
text = "This is a sample text for CEFR classification."

# Predict CEFR level
predictions = classifier(text)

# Print the predictions
print(predictions)
```
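
To classify several texts at once and inspect the score of every class, the pipeline can be called with `top_k=None`. The following is a minimal sketch, assuming a recent `transformers` release in which a list of input texts then yields one list of `{"label": ..., "score": ...}` dictionaries per text. `best_label` and `classify_all` are hypothetical helper names introduced here, and the label strings returned depend on the model's own configuration.

```python
from typing import Dict, List


def best_label(scores: List[Dict[str, object]]) -> str:
    """Return the label with the highest score from one text's results."""
    return max(scores, key=lambda item: item["score"])["label"]


def classify_all(texts: List[str],
                 model_name: str = "AbdulSami/bert-base-cased-cefr") -> List[str]:
    """Return the top-scoring label for each text (hypothetical helper)."""
    # Imported here so best_label can be used without transformers installed.
    from transformers import pipeline

    # top_k=None asks the pipeline for the scores of all classes.
    classifier = pipeline("text-classification", model=model_name, top_k=None)
    all_scores = classifier(texts)  # assumed shape: one list of dicts per text
    return [best_label(scores) for scores in all_scores]
```

Calling `classify_all(["I like cats.", "Notwithstanding the aforementioned caveats, the premise holds."])` would download the model on first use and return one label per text.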