mesolitica/mallam-1.1B-4096
Text Generation
Pretrained from scratch with a 4096-token context length on 90B tokens of Malaysian text. Paper: https://huggingface.co/papers/2401.14680
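
A minimal usage sketch, not taken from the model card: it assumes the checkpoint loads through the Hugging Face `transformers` causal-LM classes (`AutoTokenizer`, `AutoModelForCausalLM`), which the "Text Generation" tag suggests but the listing does not confirm. The helper names (`prompt_token_budget`, `truncate_prompt_ids`, `generate`) and the default of 64 new tokens are hypothetical. The point is to show how the 4096-token context length limits the prompt once room is reserved for generated tokens.

```python
MODEL_ID = "mesolitica/mallam-1.1B-4096"
CONTEXT_LENGTH = 4096  # context length stated in the listing


def prompt_token_budget(max_new_tokens, context_length=CONTEXT_LENGTH):
    """Return how many prompt tokens fit once room is reserved for new tokens."""
    if max_new_tokens < 0:
        raise ValueError("max_new_tokens must be non-negative")
    if max_new_tokens >= context_length:
        raise ValueError("max_new_tokens leaves no room for a prompt")
    return context_length - max_new_tokens


def truncate_prompt_ids(token_ids, max_new_tokens, context_length=CONTEXT_LENGTH):
    """Keep only the most recent prompt tokens that fit the budget.

    Note: this drops tokens from the left, including any leading BOS token.
    """
    budget = prompt_token_budget(max_new_tokens, context_length)
    return list(token_ids)[-budget:]


def generate(prompt, max_new_tokens=64):
    """Hypothetical helper: generate a continuation for `prompt`.

    Assumption: the checkpoint is compatible with AutoModelForCausalLM.
    """
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    input_ids = tokenizer(prompt, return_tensors="pt").input_ids
    # Left-truncate so prompt plus generated tokens stay within the context.
    budget = prompt_token_budget(max_new_tokens)
    input_ids = input_ids[:, -budget:]

    output = model.generate(input_ids, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output[0], skip_special_tokens=True)
```

Calling `generate("Kuala Lumpur ialah")` would download the weights on first use. The two budget helpers are pure Python and can be checked without the model.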