Commit 0e624fc (1 parent: 02d389a), committed by hyunwoo3235
Update README.md

README.md
@@ -1,3 +1,46 @@
---
license: apache-2.0
language:
- ko
tags:
- deberta-v3
---
# deberta-v3-small-korean

## Model Details

DeBERTa improves on BERT through Disentangled Attention and an Enhanced Mask Decoder.
DeBERTa V3 further improves DeBERTa by applying Gradient-Disentangled Embedding Sharing to ELECTRA-style pre-training.

This model was trained on Cloud TPUs provided through Google's TPU Research Cloud (TRC).

## How to Get Started with the Model

```python
import torch
from transformers import AutoTokenizer, DebertaV2ForSequenceClassification

# Load the tokenizer and the pretrained weights from the Hugging Face Hub.
# Assumption: this checkpoint is a pretrained backbone, so the classification
# head is newly initialized; fine-tune before relying on the logits.
tokenizer = AutoTokenizer.from_pretrained("team-lucid/deberta-v3-small-korean")
model = DebertaV2ForSequenceClassification.from_pretrained("team-lucid/deberta-v3-small-korean")

# Tokenize a Korean sentence ("Hello, world!") and run a forward pass without gradients.
inputs = tokenizer("안녕, 세상!", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
```
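
The forward pass above returns raw logits in `outputs.logits`. The following is a minimal sketch of how one row of logits could be turned into class probabilities and a predicted label. It is written in plain Python so it runs without downloading the model; the helper name `logits_to_prediction` and the example values are hypothetical, and with the real model the input row would be something like `outputs.logits[0].tolist()`.

```python
import math


def logits_to_prediction(logits):
    """Return (predicted_index, probabilities) for one row of raw logits."""
    # Subtract the maximum before exponentiating for numerical stability.
    shift = max(logits)
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # The predicted class is the index with the highest probability.
    predicted = max(range(len(probs)), key=probs.__getitem__)
    return predicted, probs


# Hypothetical logits for a two-class task (for example negative / positive).
predicted, probs = logits_to_prediction([-1.2, 2.3])
print(predicted, [round(p, 3) for p in probs])
```

The probabilities only become meaningful once the classification head has been fine-tuned on a labeled task.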

## Evaluation

| Model              | Backbone<br/>Parameters | **NSMC**<br/>(acc) | **PAWS**<br/>(acc) | **KorNLI**<br/>(acc) | **KorSTS**<br/>(spearman) | **Question Pair**<br/>(acc) |
|:-------------------|:-----------------------:|:------------------:|:------------------:|:--------------------:|:-------------------------:|:---------------------------:|
| DistilKoBERT       | 22M                     | 88.41              | 62.55              | 70.55                | 73.21                     | 92.48                       |
| KoBERT             | 85M                     | 89.63              | 80.65              | 79.00                | 79.64                     | 93.93                       |
| XLM-Roberta-Base   | 85M                     | 89.49              | 82.95              | 79.92                | 79.09                     | 93.53                       |
| KcBERT-Base        | 85M                     | 89.62              | 66.95              | 74.85                | 75.57                     | 93.93                       |
| KcBERT-Large       | 302M                    | 90.68              | 70.15              | 76.99                | 77.49                     | 94.06                       |
| KoELECTRA-Small-v3 | 9.4M                    | 89.36              | 77.45              | 78.60                | 80.79                     | 94.85                       |
| KoELECTRA-Base-v3  | 85M                     | 90.63              | 84.45              | 82.24                | **85.53**                 | 95.25                       |
| Ours               |                         |                    |                    |                      |                           |                             |
| DeBERTa-xsmall     | 22M                     | 91.21              | 84.40              | 82.13                | 83.90                     | 95.38                       |
| DeBERTa-small      | 43M                     | **91.34**          | 83.90              | 81.61                | 82.97                     | 94.98                       |
| DeBERTa-base       | 86M                     | 91.22              | **85.5**           | **82.81**            | 84.46                     | **95.77**                   |

\* Results for the other models are taken from [KcBERT-Finetune](https://github.com/Beomi/KcBERT-Finetune)
and [KoELECTRA](https://github.com/monologg/KoELECTRA); hyperparameters were also set similarly to those used for the other models.