Show-o: One Single Transformer to Unify Multimodal Understanding and Generation Paper • 2408.12528 • Published Aug 22, 2024 • 51
Gemma 2 2B Release Collection The 2.6B parameter version of Gemma 2. • 6 items • Updated 24 days ago • 78
Human-like Episodic Memory for Infinite Context LLMs Paper • 2407.09450 • Published Jul 12, 2024 • 60
LLM-jp: A Cross-organizational Project for the Research and Development of Fully Open Japanese LLMs Paper • 2407.03963 • Published Jul 4, 2024 • 15
MInference 1.0: Accelerating Pre-filling for Long-Context LLMs via Dynamic Sparse Attention Paper • 2407.02490 • Published Jul 2, 2024 • 23
SSMs Collection A collection of Mamba-2-based research models with 8B parameters trained on 3.5T tokens for comparison with Transformers. • 5 items • Updated about 13 hours ago • 26
The Hallucinations Leaderboard -- An Open Effort to Measure Hallucinations in Large Language Models Paper • 2404.05904 • Published Apr 8, 2024 • 8
SpaceByte: Towards Deleting Tokenization from Large Language Modeling Paper • 2404.14408 • Published Apr 22, 2024 • 6
The Instruction Hierarchy: Training LLMs to Prioritize Privileged Instructions Paper • 2404.13208 • Published Apr 19, 2024 • 38
Meta Llama 3 Collection This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases. • 5 items • Updated about 1 month ago • 698
RecurrentGemma: Moving Past Transformers for Efficient Open Language Models Paper • 2404.07839 • Published Apr 11, 2024 • 43
Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models Paper • 2402.19427 • Published Feb 29, 2024 • 52
Gemma release Collection Groups the Gemma models released by the Google team. • 40 items • Updated 24 days ago • 328