FIM tokens not marked as special

#4
by ruediste - opened

Hi,
I spent a few hours debugging the tokenizer stack before discovering that the FIM tokens (<|fim_prefix|>, <|fim_middle|>, <|fim_suffix|>, etc.) are not marked as special. Is there a reason for this? Below is an excerpt from tokenizer.json:

{
      "id": 151660,
      "content": "<|fim_middle|>",
      "single_word": false,
      "lstrip": false,
      "rstrip": false,
      "normalized": false,
      "special": false
    },
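As a minimal sketch of why the flag matters (using the standalone tokenizers library with a toy empty BPE vocab, not the real model vocab): tokens registered as special are dropped by decode(..., skip_special_tokens=True), while non-special added tokens survive, so downstream code that strips FIM markers this way silently fails when "special" is false.

```python
from tokenizers import Tokenizer
from tokenizers.models import BPE

# Toy tokenizer: empty BPE vocab plus two added tokens that differ only
# in how they are registered (non-special vs. special).
tok = Tokenizer(BPE())
tok.add_tokens(["<|fim_middle|>"])           # stored with "special": false
tok.add_special_tokens(["<|fim_prefix|>"])   # stored with "special": true

enc = tok.encode("<|fim_prefix|><|fim_middle|>")
print(enc.tokens)  # both are still matched as single whole tokens

# skip_special_tokens only removes the token marked special
decoded = tok.decode(enc.ids, skip_special_tokens=True)
print(decoded)  # the non-special <|fim_middle|> is kept
```

With skip_special_tokens=False both tokens come back; the encoding itself looks identical either way, which is what makes this easy to miss.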
