s3nh

AI & ML interests

Quantization, LLMs, Deep Learning for good. Follow me if you like my work. Patreon.com/s3nh

s3nh's activity

upvoted a paper 9 days ago

A Rank Stabilization Scaling Factor for Fine-Tuning with LoRA

Paper • 2312.03732 • Published Nov 28, 2023 • 8

reacted to sayakpaul's post with 🔥 9 days ago

Post

3746

Commits speak louder than words 🤪

* 4 new video models
* Multiple image models, including SANA & Flux Control
* New quantizers -> GGUF & TorchAO
* New training scripts

Enjoy this holiday-special Diffusers release 🤗
Notes: https://github.com/huggingface/diffusers/releases/tag/v0.32.0

updated a Space 10 days ago

Running

😻

README

New activity in SmolTuners/README 10 days ago

Gh organization

#3 opened 11 days ago by

s3nh

New activity in SmolTuners/README 11 days ago

Optimizers

#2 opened 11 days ago by

s3nh

liked a dataset 11 days ago

fluently-sets/ultraset

Viewer • Updated 13 days ago • 785k • 160 • 3

New activity in SmolTuners/README 13 days ago

Datasets

#1 opened 15 days ago by

s3nh

reacted to merve's post with 🧠 15 days ago

Post

1739

A complete RAG pipeline includes a reranker, which ranks the documents to find the best document 📓
The same goes for multimodal RAG: there are multimodal rerankers that we can integrate into multimodal RAG pipelines!
Learn how to build a complete multimodal RAG pipeline with vidore/colqwen2-v1.0 as retriever, lightonai/MonoQwen2-VL-v0.1 as reranker, Qwen/Qwen2-VL-7B-Instruct as VLM in this notebook that runs on a GPU as small as L4 🔥 https://huggingface.co/learn/cookbook/multimodal_rag_using_document_retrieval_and_reranker_and_vlms
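The retrieve-then-rerank idea described in this post can be sketched in a few lines. This is a toy illustration only: the function name `retrieve_then_rerank` and both scoring functions are hypothetical stand-ins invented for this example, not the APIs of the models named in the post. In a real pipeline the first scorer would be a retriever model and the second a reranker model.

```python
from typing import Callable, List

# A scorer maps (query, document) to a relevance score; higher is better.
Scorer = Callable[[str, str], float]


def retrieve_then_rerank(
    query: str,
    docs: List[str],
    retriever_score: Scorer,
    reranker_score: Scorer,
    k_retrieve: int = 3,
    k_final: int = 1,
) -> List[str]:
    """Two-stage ranking: a cheap retriever narrows the pool, a reranker orders it."""
    # Stage 1: score every document with the retriever and keep the top k_retrieve.
    candidates = sorted(docs, key=lambda d: retriever_score(query, d), reverse=True)
    candidates = candidates[:k_retrieve]
    # Stage 2: re-score only the surviving candidates with the reranker.
    reranked = sorted(candidates, key=lambda d: reranker_score(query, d), reverse=True)
    return reranked[:k_final]


def word_overlap(query: str, doc: str) -> float:
    """Hypothetical retriever stand-in: number of distinct shared words."""
    return float(len(set(query.lower().split()) & set(doc.lower().split())))


def overlap_ratio(query: str, doc: str) -> float:
    """Hypothetical reranker stand-in: fraction of document words found in the query."""
    words = doc.lower().split()
    query_words = set(query.lower().split())
    return sum(w in query_words for w in words) / max(len(words), 1)


if __name__ == "__main__":
    corpus = ["the cat sat on the mat", "cat", "dogs bark loudly", "a cat and a dog"]
    print(retrieve_then_rerank("cat mat", corpus, word_overlap, overlap_ratio))
```

The best document after reranking would then be passed to the generator (a VLM in the multimodal case) together with the query.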

reacted to fdaudens's post with 🤗 15 days ago

Post

1207

🤝 Want to share your AI models while protecting your work? Licenses are key!

Fascinating to see that nearly 60% of models on the Hub use Apache & MIT licenses.

Explore the viz here: huggingface/open-source-ai-year-in-review-2024

reacted to Lewdiculous's post with ➕ 15 days ago

Post

2612

Hello fellow LLMers, just a quick notice that some of my activity will be moved into the AetherArchitectural Community and split with @Aetherarchio .

[here] https://huggingface.co/AetherArchitectural

All activity should be visible on the left side of my profile.

1 reply

reacted to fdaudens's post with 👍 15 days ago

Post

1244

🔍 From instruction-following to creative storytelling, dive into 2024's most impactful AI datasets! These gems are shaping everything from scientific research to video understanding.

Check it out: huggingface/open-source-ai-year-in-review-2024

replied to louisbrulenaudet's post 15 days ago

very useful, thanks!

reacted to louisbrulenaudet's post with 🤗 15 days ago

Post

1785

I’ve published a new dataset to simplify model merging 🤗

This dataset facilitates the search for compatible architectures for model merging with @arcee_ai’s mergekit, streamlining the automation of high-performance merge searches 📖

Dataset : louisbrulenaudet/mergekit-configs

1 reply

reacted to nyuuzyou's post with 👍 15 days ago

Post

1509

✈️ Aircraft Dataset & Generation Model nyuuzyou/aircraft-images & nyuuzyou/AircraftFLUX-LoRA

Dataset Features:
• 165,340 high-res aircraft images with metadata
• Machine-generated English captions
• Detailed aircraft specs, registration & flight info
• Environmental context descriptions

LoRA model specializes in:
• Realistic aircraft generation
• More accurate technical details for less common airplanes than black-forest-labs/FLUX.1-schnell
• Proper airline liveries
• Contextual aviation scenes

liked 2 models 15 days ago

opencsg/csg-wukong-1B

Text Generation • Updated Aug 3, 2024 • 54 • 13

MBZUAI/MobiLlama-1B-Chat

Text Generation • Updated Feb 28, 2024 • 205 • 24

replied to danielhanchen's post 15 days ago

Amazing, thank you!

reacted to danielhanchen's post with 🤗👍 15 days ago

Post

1454

I uploaded GGUFs, 4-bit bitsandbytes, and full 16-bit precision weights for Llama 3.3 70B Instruct. They are all here: unsloth/llama-33-all-versions-67535d7d994794b9d7cf5e9f

You can also finetune Llama 3.3 70B in under 48GB of VRAM with Unsloth!
GGUFs: unsloth/Llama-3.3-70B-Instruct-GGUF
BnB 4bit: unsloth/Llama-3.3-70B-Instruct-bnb-4bit
16bit: unsloth/Llama-3.3-70B-Instruct
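As a rough back-of-the-envelope check on why the 4-bit weights matter for the VRAM figure above, the sketch below estimates the memory taken by the model weights alone. It assumes a nominal 70 billion parameters and decimal gigabytes, and it ignores activations, adapter weights, optimizer state, and quantization metadata, so it is an illustration and not a description of how Unsloth measures memory. The helper name `weight_memory_gb` is hypothetical.

```python
def weight_memory_gb(n_params: float, bits_per_param: float) -> float:
    """Approximate memory for model weights only, in decimal gigabytes (1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9


if __name__ == "__main__":
    n_params = 70e9  # assumption: nominal parameter count for a "70B" model
    for bits in (16, 4):
        print(f"{bits}-bit weights: ~{weight_memory_gb(n_params, bits):.0f} GB")
```

Under these assumptions the 16-bit weights alone need far more than 48 GB, while the 4-bit weights leave headroom for the rest of the fine-tuning state.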

1 reply

reacted to stefan-it's post with ❤️ 15 days ago

Post

1183

My latest project is the outcome of the last 2+ years working with TPUs from the amazing TPU Research Cloud (TRC) program and training Encoder-only LMs with the TensorFlow Model Garden library.

👉 Link: https://github.com/stefan-it/model-garden-lms

An overview of some features:

- Cheatsheet for setting up a TPU VM Pod (with all necessary dependencies) to pretrain LMs with TF Model Garden
- Conversion scripts that convert TF Model Garden weights to Hugging Face Transformers-compatible models
- Supported architectures include BERT, BERT with Token Dropping and TEAMS

I also released BERT-based models pretrained on the great Hugging Face FineWeb and FineWeb-Edu datasets (10BT subset). With more to come!

👉 Model Hub Link: https://huggingface.co/model-garden-lms

If you find these resources useful, please give them a like!

Made from Bavarian Oberland with ❤️ and 🥨.