Julien Chaumond PRO

julien-c

AI & ML interests

<3 ML/AI for everyone, building products to propel communities fwd

Recent Activity

liked a model 11 days ago
Lightricks/LTX-Video
liked a dataset 12 days ago
CohereForAI/Global-MMLU
liked a model 12 days ago
fal/AuraFlow

Articles

Organizations

Hugging Face, Nbconvert-internal, Notebooks-explorers, Safetensors, BigScience Workshop, Spaces-explorers, Flax Community, Templates, Hugging Face Course, Giskard, Text Generation Inference, ph-snps, Amazon SageMaker Community, Training Transformers Together, Hugging Chat, Atmos Bank, Godot Engine Demos, Pyodide Demos, Huggingface.js, Webhooks Explorers (BETA), Workshop June 13 Classroom, HF Canonical Model Maintainers, Open-Source AI Meetup, TRL, Scanned Tokens, HF Legal, Language Tools, Stable Diffusion concepts library, Teven-projects, Exbert-project, Banana-projects, Blog-explorers, EU org, Hacktoberfest 2023, huggingPartyParis, Enterprise Explorers, ZeroGPU Explorers, OpenAI community, ALBERT community, T5 community, Facebook AI community, BERT community, DistilBERT community, Transformer-XL community, XLNet community, choosealicense.com mirror, Social Post Explorers, Dev Mode Explorers, Test, private beta for deeplinks, Paris AI Running Club, kmhf, Hugging Face Party @ PyTorch Conference, Nerdy Face, Hugging Face Science, open/ acc, DDUF

julien-c's activity

replied to burtenshaw's post 13 days ago
reacted to burtenshaw's post with 🤗❤️ 13 days ago
People are flexing their end of year stats, so I made this app to show hub stats in a tidy design!

Thanks @Ameeeee and @jfcalvo for the feature from Argilla!
burtenshaw/recap
replied to victor's post 14 days ago
reacted to Kseniase's post with 🔥 16 days ago
TL;DR: The Story of Attention's Development by @karpathy

Origin: First proposed in 2014 by @Dzmitry Bahdanau, @KyunghyunCho, and Yoshua Bengio in Neural Machine Translation by Jointly Learning to Align and Translate (1409.0473). The mechanism was originally called "RNNSearch" and was later renamed "attention", a term inspired by cognitive processes.

Key Idea: A data-dependent weighted average for pooling and communication, enabling flexible and powerful neural network connections.

Breakthrough: Bahdanau's "soft search" mechanism (softmax + weighted averaging) solved encoder-decoder bottlenecks in machine translation.
Transformer Revolution: Attention Is All You Need (1706.03762) (2017) by @ashishvaswanigoogle et al. simplified architectures by stacking attention layers, introducing multi-headed attention and positional encodings.
Legacy: Attention replaced RNNs, driving modern AI systems like ChatGPT. It emerged independently but was influenced by contemporaneous work like Alex Graves's Neural Turing Machines (1410.5401) and Jason Weston's Memory Networks (1410.3916).
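
To make the "softmax + weighted averaging" idea concrete, here is a minimal sketch in plain Python. It is an illustration under assumptions, not the code from any of the papers: the function names `softmax` and `attend` are hypothetical, and it scores keys with a simple dot product for brevity, whereas the 2014 paper used a small learned network to compute the scores.

```python
import math

def softmax(scores):
    # Subtract the max before exponentiating for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    # Score each key against the query (dot product is an assumption here).
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    # Turn scores into weights that are positive and sum to 1.
    weights = softmax(scores)
    # The output is a data-dependent weighted average of the values.
    dim = len(values[0])
    context = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, context

keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
weights, context = attend([2.0, 0.0], keys, values)
print(weights, context)
```

The weights depend on the query, so a different query pools the same values differently; that is the "data-dependent" part of the key idea.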

Attention to history: Jürgen Schmidhuber claims his 1992 Fast Weight Programmers anticipated modern attention mechanisms. While conceptually similar, the term “attention” was absent, and there’s no evidence it influenced Bahdanau, Cho, and Bengio’s 2014 work. Paying attention (!) to history might have brought us to genAI earlier – but credit for the breakthrough still goes to Montreal.

Referenced Papers:
Attention Origin: Neural Machine Translation by Jointly Learning to Align and Translate (1409.0473)
Transformers: Attention Is All You Need (1706.03762)
Alex Graves' Work: Neural Turing Machines (1410.5401), Generating Sequences With Recurrent Neural Networks (1308.0850)
Jason Weston's (@spermwhale) Memory Networks (1410.3916)
Sequence to Sequence Learning with Neural Networks (1409.3215) by Ilya Sutskever ( @ilyasut ), Oriol Vinyals, Quoc V. Le

Who else deserves recognition in this groundbreaking narrative of innovation? Let’s ensure every contributor gets the credit they deserve. Leave a comment below 👇🏻🤗
replied to Duskfallcrew's post 20 days ago

Public storage- y'all ... HF are you nuts?

i can neither confirm nor deny

reacted to FranckAbgrall's post with 👍 20 days ago
Hey!

✨ If you're using HF access tokens, we just released an overview of the permissions for fine-grained tokens. To see it, hover over the badge on the token settings page (org and user).

It will show the highest permission you've set for each entity 👀
reacted to their post with 😎🤝 21 days ago
After some heated discussion 🔥, we clarify our intent re. storage limits on the Hub

TL;DR:
- public storage is free, and (unless blatant abuse) unlimited. We do ask that you consider upgrading to PRO and/or Enterprise Hub if possible
- private storage is paid above a significant free tier (1TB if you have a paid account, 100GB otherwise)
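
As a rough illustration of the private-storage rule above, the sketch below computes how much private storage falls outside the free tier. It rests on assumptions: the helper names are hypothetical (this is not a Hugging Face API), 1TB is taken as 1000GB, and pricing is not modeled because the post does not state it.

```python
GB_PER_TB = 1000  # assumption: decimal units; the post does not specify

def free_private_gb(paid_account: bool) -> int:
    # Free tier from the post: 1TB with a paid account, 100GB otherwise.
    return 1 * GB_PER_TB if paid_account else 100

def billable_private_gb(used_gb: float, paid_account: bool) -> float:
    # Only private storage above the free tier is paid; never negative.
    return max(0.0, used_gb - free_private_gb(paid_account))

print(billable_private_gb(250, paid_account=False))
print(billable_private_gb(250, paid_account=True))
```

Public storage is not part of this calculation, since the post describes it as free.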

docs: https://huggingface.co/docs/hub/storage-limits

We continuously optimize our infrastructure to scale our storage for the coming years of growth in machine learning, to the benefit of the community 🔥

cc: @reach-vb @pierric @victor and the HF team
reacted to their post with 👍🤗❤️🔥 22 days ago
reacted to burtenshaw's post with 🔥 22 days ago
Quick update from week 1 of smol course. The community is taking the driving seat and using the material for their own projects. If you want to do the same, join in!

- we have ongoing translation projects in Korean, Vietnamese, Portuguese, and Spanish
- 3 chapters are ready for students, on topics like instruction tuning, preference alignment, and parameter-efficient fine-tuning
- 3 chapters are in progress on evaluation, vision language models, and synthetic data
- around 780 people have forked the repo to use it for learning, teaching, and sharing

⏭️ The next step is to support people who want to use the course for teaching, content creation, internal knowledge sharing, or anything else. If you're into this, drop an issue or PR.

REPO: https://buff.ly/3ZCMKX2
discord channel: https://buff.ly/4f9F8jA
reacted to bartowski's post with 👀 22 days ago
Looks like Q4_0_N_M file types are going away

Before you panic, there's a new "preferred" method which is online (I prefer the term on-the-fly) repacking, so if you download Q4_0 and your setup can benefit from repacking the weights into interleaved rows (what Q4_0_4_4 was doing), it will do that automatically and give you similar performance (minor losses I think due to using intrinsics instead of assembly, but intrinsics are more maintainable)

You can see the reference PR here:

https://github.com/ggerganov/llama.cpp/pull/10446

So if you update your llama.cpp past that point, you won't be able to run Q4_0_4_4 (unless they add backwards compatibility back), but Q4_0 should run at the same speed (though it may currently be bugged on some platforms)

As such, I'll stop making those newer model formats soon, probably at the end of this week unless something changes, but you should be safe to download Q4_0 quants and use those!

Also, IQ4_NL supports repacking, though not in as many shapes yet, and should get a respectable speedup on ARM chips. The PR for that can be found here: https://github.com/ggerganov/llama.cpp/pull/10541

Remember, these are not meant for Apple silicon since those use the GPU and don't benefit from the repacking of weights
posted an update 22 days ago
reacted to cfahlgren1's post with 👀👍 26 days ago
We just dropped an LLM inside the SQL Console 🤯

The amazing, new Qwen/Qwen2.5-Coder-32B-Instruct model can now write SQL for any Hugging Face dataset ✨

It's 2025, you shouldn't be hand-writing SQL! This is a big step toward making it possible for anyone to do in-depth analysis on a dataset. Let us know what you think 🤗