conan1024hao committed
Commit c134f4c · 1 Parent(s): a1ef4b6

Update README.md

Files changed (1): README.md (+20 −2)
README.md CHANGED
@@ -13,5 +13,23 @@ widget:
 - text: "불고기[MASK] 먹겠습니다."
 ---

-### Introduction
-This model is trained by ZH, JA, KO's Wikipedia. Please see the config file about details.
+### Model description
+This model was trained on the Chinese (ZH), Japanese (JA), and Korean (KO) Wikipedia corpora for 5 epochs.
+
+### How to use
+```python
+from transformers import AutoTokenizer, AutoModelForMaskedLM
+tokenizer = AutoTokenizer.from_pretrained("conan1024hao/cjkbert-small")
+model = AutoModelForMaskedLM.from_pretrained("conan1024hao/cjkbert-small")
+```
+You don't need to do any text segmentation for downstream tasks.
+
+### Tokenization
+We use character-based tokenization with a whole-word-masking strategy.
+
+### Model size
+- vocab_size: 15015
+- num_hidden_layers: 4
+- hidden_size: 512
+- num_attention_heads: 8
+- param_num: 25M
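
As a minimal illustration of why the updated README can promise that no text segmentation is needed: with character-based tokenization, any CJK sentence splits into single characters, so no word segmenter has to run beforehand. The helper below is a hypothetical sketch, not the model's actual tokenizer:

```python
def char_tokenize(text: str) -> list[str]:
    # Split into single characters, dropping whitespace; CJK scripts
    # can be tokenized per character, so no word segmenter is needed.
    return [ch for ch in text if not ch.isspace()]

print(char_tokenize("불고기를 먹겠습니다."))
# → ['불', '고', '기', '를', '먹', '겠', '습', '니', '다', '.']
```

The real tokenizer additionally maps each character to an ID in its 15015-entry vocabulary and handles unknown characters; the per-character split is the key point here.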
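
The Tokenization section in the diff pairs character tokens with whole-word masking: masking decisions are made per word, and every character token of a selected word is masked together. A self-contained sketch under assumed inputs (the pre-segmented word list, the `[MASK]` symbol, and the masking probability are illustrative; this commit does not show the actual training pipeline):

```python
import random

def whole_word_mask(words: list[str], mask_prob: float = 0.15,
                    seed: int = 0) -> list[str]:
    # Tokens are single characters, but the mask/keep decision is made
    # per word: all characters of a selected word become [MASK] at once,
    # never just some of them.
    rng = random.Random(seed)
    out: list[str] = []
    for word in words:
        if rng.random() < mask_prob:
            out.extend("[MASK]" for _ in word)  # mask the whole word
        else:
            out.extend(word)                    # keep each character token
    return out

# Deterministic for a fixed seed; masked spans always cover whole words.
print(whole_word_mask(["불고기", "를", "먹겠습니다"], mask_prob=0.5, seed=1))
```

Compared with masking independent character positions, this forces the model to predict entire words from surrounding context, which is the point of the whole-word strategy.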