widget:
- text: "불고기[MASK] 먹겠습니다."
---

### Model description

This model was trained on Chinese (ZH), Japanese (JA), and Korean (KO) Wikipedia for 5 epochs.

### How to use

```python
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("conan1024hao/cjkbert-small")
model = AutoModelForMaskedLM.from_pretrained("conan1024hao/cjkbert-small")
```

You don't need to do any text segmentation when you use the model on downstream tasks.
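
For a quick check without any fine-tuning, you can query the masked-LM head through the `fill-mask` pipeline; this is a minimal sketch using the widget example from this card:

```python
from transformers import pipeline

# Query the pretrained masked-LM head with the example sentence from the widget.
fill_mask = pipeline("fill-mask", model="conan1024hao/cjkbert-small")

# With a character-level vocabulary, each candidate typically fills [MASK]
# with a single character.
for prediction in fill_mask("불고기[MASK] 먹겠습니다."):
    print(prediction["token_str"], prediction["score"])
```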

### Tokenization

We use character-based tokenization with a whole-word-masking strategy.
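
Concretely, input text is split into single characters, so no external word segmenter is needed. A minimal sketch (the expected output is an assumption based on the character-level vocabulary described above):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("conan1024hao/cjkbert-small")

# Each CJK character becomes its own token, so no word segmentation is
# required beforehand. Expected output is an assumption based on the
# character-based tokenization described above:
print(tokenizer.tokenize("불고기를 먹겠습니다."))
# e.g. ['불', '고', '기', '를', '먹', '겠', '습', '니', '다', '.']
```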

### Model size

- vocab_size: 15015
- num_hidden_layers: 4
- hidden_size: 512
- num_attention_heads: 8
- param_num: 25M
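
As a sanity check, the reported parameter count can be reproduced by summing the sizes of the model's weight tensors; a minimal sketch:

```python
from transformers import AutoModelForMaskedLM

# Load the pretrained model and count all parameters.
model = AutoModelForMaskedLM.from_pretrained("conan1024hao/cjkbert-small")
total = sum(p.numel() for p in model.parameters())
print(f"{total / 1e6:.1f}M parameters")  # should be roughly the 25M listed above
```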