dhkim95 committed on
Commit 609730e · 1 Parent(s): 5e6e313

Upload tokenizer

special_tokens_map.json CHANGED
@@ -1,5 +1,5 @@
 {
   "bos_token": "<|endoftext|>",
   "eos_token": "<|endoftext|>",
-  "unk_token": "<|endoftext|>"
+  "unk_token": "<unk>"
 }
tokenizer.json CHANGED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json CHANGED
@@ -5,5 +5,5 @@
   "eos_token": "<|endoftext|>",
   "model_max_length": 1024,
   "tokenizer_class": "GPT2Tokenizer",
-  "unk_token": "<|endoftext|>"
+  "unk_token": "<unk>"
 }
vocab.json CHANGED
The diff for this file is too large to render. See raw diff
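
Because this commit changes `unk_token` in two separate files, a quick consistency check can confirm that both agree after the change. The following is a minimal sketch, not part of the commit: the helper name `special_token_mismatches` is hypothetical, and the inline JSON reproduces only the post-commit keys visible in the hunks above (`tokenizer_config.json` is shown from line 5 onward, so earlier keys are unknown). The diffs for `tokenizer.json` and `vocab.json` are not rendered, so whether `<unk>` was also added to the vocabulary cannot be confirmed from this page.

```python
import json


def special_token_mismatches(special_tokens_map: dict, tokenizer_config: dict) -> dict:
    """Return {token_name: (map_value, config_value)} for tokens that disagree.

    Only keys present in both dictionaries are compared.
    """
    mismatches = {}
    for name, value in special_tokens_map.items():
        if name in tokenizer_config and tokenizer_config[name] != value:
            mismatches[name] = (value, tokenizer_config[name])
    return mismatches


# Post-commit contents as visible in the diff (assumption: no other
# special-token keys exist in the lines the diff does not show).
special_tokens_map = json.loads("""{
  "bos_token": "<|endoftext|>",
  "eos_token": "<|endoftext|>",
  "unk_token": "<unk>"
}""")

tokenizer_config = json.loads("""{
  "eos_token": "<|endoftext|>",
  "model_max_length": 1024,
  "tokenizer_class": "GPT2Tokenizer",
  "unk_token": "<unk>"
}""")

mismatches = special_token_mismatches(special_tokens_map, tokenizer_config)
print(mismatches)
```

An empty result means the two files agree on every special token they share. Had only one of the two files been updated, the helper would report `unk_token` with both values.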