AbdulSami committed · Commit fa1a67f · 1 parent: 1522602

# Model Card: BERT-based CEFR Classifier

## Overview

This repository contains a BERT-based model that predicts the Common European Framework of Reference (CEFR) level of a given text. It was fine-tuned on the CEFR dataset, starting from the pre-trained `bert-base-...` model.

## Model Details

- Model architecture: BERT (base model: `bert-base-...`)
- Task: text classification (CEFR level prediction)
- Training dataset: CEFR dataset (`Alex123321/english_cefr_dataset` in the card metadata)
- Fine-tuning: 3 epochs; per-epoch loss and accuracy are listed under Performance

## Performance

The model's performance during training is summarized below:

| Epoch | Training Loss | Validation Loss | Accuracy |
|-------|---------------|-----------------|----------|
| 1 | 1.491800 | 1.319211 | 0.420690 |
| 2 | 1.238600 | 0.864768 | 0.700447 |
| 3 | 0.813200 | 0.538081 | 0.815057 |

Additional metrics:

- Training Loss: 1.1851
- Training Runtime: 7465.51 seconds
- Training Samples per Second: 7.633
- Total Floating Point Operations: 1.499392196785152e+16
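
As a rough consistency check on these figures, multiplying runtime by throughput gives the number of samples processed. The sketch below assumes the reported samples-per-second value is averaged over the whole run and that the run covered the three epochs in the table; the per-epoch figure is an estimate derived here, not a number stated in the model card.

```python
# Figures copied from the "Additional metrics" list.
train_runtime_s = 7465.51
samples_per_second = 7.633
num_epochs = 3  # assumption: the three epochs shown in the table

# Total samples seen during training, counting every epoch.
total_samples = train_runtime_s * samples_per_second

# Estimated training-set size, if every epoch covers the same data.
samples_per_epoch = total_samples / num_epochs

print(f"total samples processed: {total_samples:.0f}")
print(f"estimated samples per epoch: {samples_per_epoch:.0f}")
```

Under those assumptions, the run processed roughly 57,000 samples in total, or about 19,000 per epoch.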

## Usage

1. Install the required libraries by running `pip install transformers torch` (the pipeline needs a backend such as PyTorch in addition to `transformers`).
2. Load the trained model and use it for CEFR level prediction.


```python
from transformers import pipeline

# Load the model
model_name = "AbdulSami/bert-base-cased-cefr"
classifier = pipeline("text-classification", model=model_name)

# Text for prediction
text = "This is a sample text for CEFR classification."

# Predict CEFR level
predictions = classifier(text)

# Print the predictions
print(predictions)
```
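
The pipeline returns a list with one dictionary per input, each holding a `label` and a `score`. The sketch below shows one way to pick the most likely level when all class scores are requested (for example by passing `top_k=None` to the classifier). The sample scores and label names are hypothetical placeholders: the actual label strings depend on the model's configuration, which this card does not list, and the exact nesting of the output can differ between `transformers` versions.

```python
from typing import Dict, List, Union

Prediction = Dict[str, Union[str, float]]


def best_label(scores: List[Prediction]) -> Prediction:
    """Return the entry with the highest score from one input's class scores."""
    return max(scores, key=lambda p: p["score"])


# Hypothetical class scores for a single text; labels and values are
# made up for illustration and do not come from the model.
example_scores = [
    {"label": "A2", "score": 0.12},
    {"label": "B1", "score": 0.61},
    {"label": "B2", "score": 0.27},
]

top = best_label(example_scores)
print(f"{top['label']} ({top['score']:.2f})")
```

Keeping the full score list, rather than only the top label, makes it easier to spot texts that sit between two adjacent levels.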

Files changed (1)

README.md +8 -0

```diff
@@ -2,4 +2,12 @@
 pipeline_tag: token-classification
 tags:
 - code
+license: apache-2.0
+datasets:
+- Alex123321/english_cefr_dataset
+language:
+- en
+metrics:
+- accuracy
+library_name: transformers
 ---
```