Alpaca Style Datasets Collection Datasets that follow the Alpaca Style format, identified by having 'instruction', 'input', and 'output' columns • 4062 items • Updated 14 days ago • 2
Probably function calling datasets Collection Created using the https://huggingface.co/spaces/librarian-bots/dataset-column-search-api Space. • 39 items • Updated Jul 17, 2024 • 36
🪐 SmolLM Collection A series of smol LLMs: 135M, 360M and 1.7B. We release base and Instruct models as well as the training corpus and some WebGPU demos • 12 items • Updated 9 days ago • 205
Article Wikipedia's Treasure Trove: Advancing Machine Learning with Diverse Data By frimelle • Jun 3, 2024 • 13
ZeroGPU Spaces Collection ZeroGPU Spaces made by the community • 17 items • Updated Jun 6, 2024 • 231
haiku Collection 🌸 This is a collection of synthetic datasets built to help open language models write better haikus through the use of DPO • 3 items • Updated Jun 21, 2024 • 6
Image dataset Collection 10 datasets that showcase how to configure and load image datasets • 10 items • Updated Aug 2, 2024 • 4