Vocabulary issues #1
opened by Lambent
I think when I merged the Lumen adapter in, I inadvertently did it in fp16, and for some reason that shrank the embedding layer and vocabulary size, shedding the 399 tokens it had room for but did not use. So it's currently incompatible with same-ancestor models in an odd way. It seems functional, but that's still unfortunate.
... Actually no, it's when I 'reinstructed' it. So I'm unsure what exactly introduced the error.
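For anyone hitting the same mismatch, here is a minimal sketch of how one might compare the merged checkpoint's embedding size against a same-ancestor model and pad the vocabulary back, assuming standard transformers checkpoints; the paths below are placeholders, not the actual repos.

```python
from transformers import AutoModelForCausalLM

# Placeholder paths for the merged model and a same-ancestor reference.
merged_path = "path/to/merged-model"
reference_path = "path/to/same-ancestor-model"

merged = AutoModelForCausalLM.from_pretrained(merged_path)
reference = AutoModelForCausalLM.from_pretrained(reference_path)

# Compare the number of embedding rows (vocab size) in each checkpoint.
merged_vocab = merged.get_input_embeddings().weight.shape[0]
reference_vocab = reference.get_input_embeddings().weight.shape[0]
print(f"merged: {merged_vocab} rows, reference: {reference_vocab} rows")

# If the merge/reinstruct step dropped the unused padding rows, resize the
# embeddings (and tied output head) back to the reference size. The restored
# rows stay unused by the tokenizer, but the shapes become compatible again.
if merged_vocab != reference_vocab:
    merged.resize_token_embeddings(reference_vocab)
    merged.save_pretrained("path/to/merged-model-fixed")
```

This only restores shape compatibility for merging with same-ancestor models; it does not recover whatever values the dropped rows originally held, though in this case they were unused anyway.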