Article **Intelligence Potentiation: An Evolutionary Perspective on AI Agent Designs** By KnutJaegersberg • 15 days ago • 3
Article SauerkrautLM's Multi-Phase Spectrum Training: A Technical Deep Dive By DavidGF • Nov 9, 2024 • 9
🇫🇷 Calme-3 Collection Here you can find all the new Calme-3 models • 27 items • Updated 2 days ago • 10
Spectrum: Targeted Training on Signal to Noise Ratio Paper • 2406.06623 • Published Jun 7, 2024 • 12
VAGO solutions quants Collection Quantized versions of the excellent German-speaking models created by VAGO solutions. • 6 items • Updated Apr 20, 2024 • 2
Qwen2 Collection Qwen2 language models, including pretrained and instruction-tuned models of 5 sizes, including 0.5B, 1.5B, 7B, 57B-A14B, and 72B. • 39 items • Updated Nov 28, 2024 • 353
Dataset comparison models Collection 1.8B models trained on 350BT to compare different pretraining datasets • 8 items • Updated Jun 12, 2024 • 35
Article LLM Comparison/Test: Llama 3 Instruct 70B + 8B HF/GGUF/EXL2 (20 versions tested and compared!) By wolfram • Apr 24, 2024 • 60
Meta Llama 3 Collection This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated 28 days ago • 698
🇩🇪 German SFT and DPO datasets Collection Datasets that can be used for LLM training with axolotl, trl or llama_factory. • 32 items • Updated Nov 11, 2024 • 11
Arcee's MergeKit: A Toolkit for Merging Large Language Models Paper • 2403.13257 • Published Mar 20, 2024 • 20
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits Paper • 2402.17764 • Published Feb 27, 2024 • 605
Infini-gram: Scaling Unbounded n-gram Language Models to a Trillion Tokens Paper • 2401.17377 • Published Jan 30, 2024 • 35
SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling Paper • 2312.15166 • Published Dec 23, 2023 • 56