|
--- |
|
license: apache-2.0 |
|
base_model: sentence-transformers/all-MiniLM-L6-v2 |
|
tags: |
|
- generated_from_trainer |
|
metrics: |
|
- accuracy |
|
model-index: |
|
- name: new_classifier_model |
|
  results: []
|
--- |
|
|
|
|
|
|
# new_classifier_model |
|
|
|
This model is a fine-tuned version of [sentence-transformers/all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2) on a dataset of English academic publications in Linguistics.
|
It achieves the following results on the evaluation set: |
|
- Loss: 0.4181 |
|
- Accuracy: 0.9193 |
|
|
|
## Model description |
|
|
|
The model is fine-tuned on academic publications in Linguistics to classify text segments from those publications into 4 classes, serving as a filter for other tasks. A minimal usage sketch follows the class list below.
|
|
|
The 4 classes:

- 0: out of scope - material of low significance, e.g. page numbers, page headers, and noise from OCR/pdf-to-text conversion

- 1: main text - the main body text of the publication, to be used for downstream tasks

- 2: examples - figure captions, quotes, or excerpts

- 3: references - entries from the publication's reference list, excluding in-text citations
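
Below is a minimal sketch of loading and calling the classifier with the Transformers `pipeline` API. The repo id `your-username/new_classifier_model` is a placeholder (substitute the actual Hub id), and the label names depend on the `id2label` mapping saved with the checkpoint; by default they appear as `LABEL_0` ... `LABEL_3`, in the order of the classes above.

```python
from transformers import pipeline

# Hypothetical repo id; replace with the actual model id on the Hub.
classifier = pipeline(
    "text-classification",
    model="your-username/new_classifier_model",
)

print(classifier("184 SOME RESIDUAL PROBLEMS"))
# e.g. [{'label': 'LABEL_0', 'score': ...}] for a page-header segment
```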
|
|
|
## Intended uses & limitations |
|
|
|
Intended uses: |
|
- to extract the main text from academic publications for downstream tasks
|
|
|
Limitations: |
|
- training and evaluation data are limited to English academic texts in Linguistics
|
|
|
## Try it yourself with the following examples (not in training/evaluation data)

Excerpts from Chomsky, N. (2014). Aspects of the Theory of Syntax (No. 11). MIT Press, retrieved from https://apps.dtic.mil/sti/pdfs/AD0616323.pdf. The excerpts retain the OCR artifacts of the source PDF as-is, since handling such noise is part of the classification task; a sketch for running them follows the excerpts.
|
|
|
- In the case of (ioii) and (1 lii), the passive transformation will |
|
apply to the embedded sentence, and in all four cases other |
|
operations will give the final surface forms of (8) and (g). |
|
|
|
|
|
- (10) (i) Noun Phrase — Verb — Noun Phrase — Sentence |
|
(/ — persuaded — a specialist — a specialist will examine |
|
John) |
|
(ii) Noun Phrase — Verb — Noun Phrase — Sentence |
|
(/ — persuaded — John — a specialist will examine John) |
|
|
|
|
|
- (13) S |
|
Det |
|
Predicate-Phrase |
|
[+Definite] nom VP |
|
their |
|
F1...Fm Det N |
|
destroy [+Definite] G, ... G, |
|
the property |
|
|
|
- 184 SOME RESIDUAL PROBLEMS |
|
|
|
- Peshkovskii, A. M. (1956). Russkii Sintaksis v Nauchnom Osveshchenii. |
|
Moscow. |
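
A sketch of the intended filtering workflow over the excerpts above: classify each segment and keep only those predicted as main text (class 1). The repo id is again a placeholder, and `LABEL_1` assumes the default `id2label` mapping.

```python
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="your-username/new_classifier_model",  # hypothetical repo id
)

excerpts = [
    "In the case of (ioii) and (1 lii), the passive transformation will "
    "apply to the embedded sentence, and in all four cases other "
    "operations will give the final surface forms of (8) and (g).",
    "184 SOME RESIDUAL PROBLEMS",
    "Peshkovskii, A. M. (1956). Russkii Sintaksis v Nauchnom "
    "Osveshchenii. Moscow.",
]

# Keep only segments predicted as main text (class 1).
main_text = [
    text
    for text, pred in zip(excerpts, classifier(excerpts))
    if pred["label"] == "LABEL_1"
]
print(main_text)
```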
|
|
|
|
|
|
|
|
|
## Training and evaluation data |
|
|
|
The data are text segments from English academic publications in Linguistics, labeled with the 4 classes above. More information needed on the specific corpus and splits.
|
|
|
## Training procedure |
|
|
|
### Training hyperparameters |
|
|
|
The following hyperparameters were used during training (a reproduction sketch follows the list):
|
- learning_rate: 2e-05 |
|
- train_batch_size: 16 |
|
- eval_batch_size: 16 |
|
- seed: 42 |
|
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08 |
|
- lr_scheduler_type: linear |
|
- num_epochs: 10 |
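
The sketch below reconstructs a matching `Trainer` setup. The dataset variables are placeholders (the training data is not published), and the Adam betas/epsilon and linear scheduler listed above correspond to the Transformers defaults, so they need no explicit arguments.

```python
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

model_name = "sentence-transformers/all-MiniLM-L6-v2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=4)

args = TrainingArguments(
    output_dir="new_classifier_model",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=10,
    evaluation_strategy="epoch",  # matches the per-epoch results below
)

# train_dataset / eval_dataset: tokenized datasets with a "label" column
# in {0, 1, 2, 3}; left as placeholders since the corpus is not published.
trainer = Trainer(
    model=model,
    args=args,
    # train_dataset=train_dataset,
    # eval_dataset=eval_dataset,
    tokenizer=tokenizer,
)
# trainer.train()
```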
|
|
|
### Training results |
|
|
|
| Training Loss | Epoch | Step | Validation Loss | Accuracy | |
|
|:-------------:|:-----:|:----:|:---------------:|:--------:| |
|
| 0.5772 | 1.0 | 762 | 0.3256 | 0.9062 | |
|
| 0.2692 | 2.0 | 1524 | 0.3038 | 0.9163 | |
|
| 0.2170        | 3.0   | 2286 | 0.3109          | 0.9180   |
|
| 0.1773 | 4.0 | 3048 | 0.3160 | 0.9209 | |
|
| 0.1619 | 5.0 | 3810 | 0.3440 | 0.9206 | |
|
| 0.1329 | 6.0 | 4572 | 0.3675 | 0.9160 | |
|
| 0.1165 | 7.0 | 5334 | 0.3770 | 0.9209 | |
|
| 0.0943 | 8.0 | 6096 | 0.4012 | 0.9203 | |
|
| 0.0850        | 9.0   | 6858 | 0.4166          | 0.9196   |
|
| 0.0811 | 10.0 | 7620 | 0.4181 | 0.9193 | |
|
|
|
|
|
### Framework versions |
|
|
|
- Transformers 4.34.1 |
|
- Pytorch 2.1.0+cpu |
|
- Datasets 2.14.7 |
|
- Tokenizers 0.14.1 |
|
|