---
library_name: transformers
datasets:
- fancyzhx/ag_news
metrics:
- accuracy
model-index:
- name: distillbert-uncased-ag-news
  results:
  - task:
      name: Text Classification
      type: text-classification
    dataset:
      name: ag_news
      type: ag_news
      args: default
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.9265
---

# Akirami/distillbert-uncased-ag-news

<!-- Provide a quick summary of what the model is/does. -->

A DistilBERT model for text classification on AG News, distilled from a fine-tuned BERT teacher. It reaches roughly 92.7% accuracy on the AG News test split.

## Model Details

### Model Description

<!-- Provide a longer summary of what this model is. -->

This is the model card of a 🤗 transformers model that has been pushed to the Hub. This model card has been automatically generated.

- **Developed by:** [Akirami](https://huggingface.co/Akirami)
- **Model type:** DistilBERT
- **License:** MIT
- **Finetuned from model:** [distilbert/distilbert-base-uncased](https://huggingface.co/distilbert/distilbert-base-uncased)

### Model Sources

<!-- Provide the basic links for the model. -->

- **Repository:** [Akirami/distillbert-uncased-ag-news](https://huggingface.co/Akirami/distillbert-uncased-ag-news)

## How to Get Started with the Model

Use the code below to get started with the model.

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("Akirami/distillbert-uncased-ag-news")
model = AutoModelForSequenceClassification.from_pretrained("Akirami/distillbert-uncased-ag-news")
```
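
For quick inference, the model can also be used through the `pipeline` API. Depending on the model config, the returned labels may be generic `LABEL_0` through `LABEL_3` identifiers; in the AG News label order these correspond to World, Sports, Business, and Sci/Tech.

```python
from transformers import pipeline

# Load the fine-tuned model into a text-classification pipeline
classifier = pipeline("text-classification", model="Akirami/distillbert-uncased-ag-news")

# AG News label order: 0 = World, 1 = Sports, 2 = Business, 3 = Sci/Tech
print(classifier("Wall Street rallies as tech stocks rebound."))
```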

## Training Details

### Training Data

<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

The model was trained on the [AG News Dataset](https://huggingface.co/datasets/fancyzhx/ag_news), a collection of news articles labeled with one of four topic classes (World, Sports, Business, Sci/Tech).
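
The dataset can be loaded directly with the 🤗 `datasets` library:

```python
from datasets import load_dataset

# AG News ships with 120,000 training and 7,600 test examples across 4 classes
dataset = load_dataset("fancyzhx/ag_news")
print(dataset["train"][0])
```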

### Training Procedure

The model was trained through knowledge distillation, with [nateraw/bert-base-uncased-ag-news](https://huggingface.co/nateraw/bert-base-uncased-ag-news) as the teacher model and [distilbert/distilbert-base-uncased](https://huggingface.co/distilbert/distilbert-base-uncased) as the student.
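
The exact distillation objective used for this model is not documented. A common formulation, shown below as a minimal sketch, combines a temperature-scaled KL-divergence term on the frozen teacher's soft targets with standard cross-entropy on the ground-truth labels; the `temperature` and `alpha` values are illustrative assumptions, not the actual settings.

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, temperature=2.0, alpha=0.5):
    # Soft-target term: KL divergence between temperature-scaled distributions,
    # scaled by T^2 to keep gradient magnitudes comparable (Hinton et al., 2015)
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2
    # Hard-target term: standard cross-entropy against the ground-truth labels
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```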

#### Preprocessing

[More Information Needed]

#### Training Hyperparameters

- **Training regime:** fp16 mixed precision
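
Beyond the fp16 setting, the hyperparameters are not documented. For reference, mixed precision is enabled through the `fp16` flag of `TrainingArguments`; all other values below are illustrative assumptions, not the actual settings:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="distillbert-uncased-ag-news",
    fp16=True,                       # train in fp16 mixed precision (documented above)
    per_device_train_batch_size=32,  # assumption
    learning_rate=5e-5,              # assumption
    num_train_epochs=3,              # assumption
)
```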

#### Speeds, Sizes, Times

<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->

[More Information Needed]

## Evaluation

<!-- This section describes the evaluation protocols and provides the results. -->

### Testing Data, Factors & Metrics

#### Testing Data

The test split of the AG News dataset (7,600 examples, 1,900 per class) is used for evaluation.

#### Metrics

Classification report on the AG News test split (class indices follow the dataset's label order: 0 = World, 1 = Sports, 2 = Business, 3 = Sci/Tech):

| Class | Precision | Recall | F1-Score | Support |
|-------|-----------|--------|----------|---------|
| 0 | 0.95 | 0.92 | 0.94 | 1900 |
| 1 | 0.98 | 0.98 | 0.98 | 1900 |
| 2 | 0.90 | 0.88 | 0.89 | 1900 |
| 3 | 0.88 | 0.92 | 0.90 | 1900 |
| **Accuracy** | | | **0.93** | **7600** |
| **Macro Avg** | **0.93** | **0.93** | **0.93** | **7600** |
| **Weighted Avg** | **0.93** | **0.93** | **0.93** | **7600** |

- **Accuracy:** 0.9266
- **Balanced Accuracy:** 0.9266
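
The report above can be reproduced along these lines, assuming `scikit-learn` and the `datasets` library are installed; parsing `LABEL_i`-style outputs back to integer indices is an assumption about the model's label names:

```python
from datasets import load_dataset
from sklearn.metrics import balanced_accuracy_score, classification_report
from transformers import pipeline

# Load the AG News test split and the fine-tuned model
test = load_dataset("fancyzhx/ag_news", split="test")
classifier = pipeline(
    "text-classification",
    model="Akirami/distillbert-uncased-ag-news",
    truncation=True,
)

# Map "LABEL_i"-style predictions back to integer class indices (assumed format)
preds = [int(out["label"].split("_")[-1]) for out in classifier(test["text"], batch_size=64)]

print(classification_report(test["label"], preds))
print("Balanced accuracy:", balanced_accuracy_score(test["label"], preds))
```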

## Environmental Impact

<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->

Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

- **Hardware Type:** T4 GPU
- **Hours used:** ~0.42 (25 minutes)
- **Cloud Provider:** Google Colab
- **Carbon Emitted:** 0.01