Train almost any model on a variety of tasks such as LLM finetuning, text classification/regression, summarization, question answering, image classification/regression, object detection, tabular data, and more, for FREE using AutoTrain locally. 🔥 https://github.com/huggingface/autotrain-advanced
INTRODUCING Hugging Face AutoTrain Client 🔥 Fine-tuning models just got even easier!! Now you can fine-tune SOTA models on all compatible dataset-model pairs on the Hugging Face Hub using Python, running on Hugging Face servers. Choose from a number of GPU flavors, millions of models and datasets, and 10+ tasks 🤗
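A minimal sketch of what a client-side run looks like with the autotrain-advanced Python API. The exact parameter names and the "spaces-a10g-large" backend string are assumptions here, so verify them against the current docs:

```python
# Sketch only: parameter names and the backend string are assumptions;
# check the autotrain-advanced docs for the current API.
from autotrain.params import LLMTrainingParams
from autotrain.project import AutoTrainProject

params = LLMTrainingParams(
    model="meta-llama/Meta-Llama-3-8B-Instruct",  # any compatible Hub model
    data_path="HuggingFaceH4/no_robots",          # any compatible Hub dataset
    chat_template="tokenizer",
    text_column="messages",
    trainer="sft",
    epochs=1,
    batch_size=1,
    peft=True,
    project_name="my-autotrain-llm",              # hypothetical project name
    username="your-hf-username",                  # placeholder
    token="your-hf-write-token",                  # placeholder
    push_to_hub=True,
)

# The backend string selects the GPU flavor; training runs on HF servers.
project = AutoTrainProject(params=params, backend="spaces-a10g-large", process=True)
project.create()
```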
To try it, install autotrain-advanced using pip. You can skip the dependencies by installing with --no-deps, but then you'll need to install some of them by hand.
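For example:

```bash
pip install autotrain-advanced
# or skip dependency resolution and install what you need by hand afterwards:
pip install --no-deps autotrain-advanced
```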
🚨 NEW TASK ALERT 🚨 Extractive Question Answering: because sometimes generative is not all you need. AutoTrain is the only open-source, no-code solution to offer so many tasks across different modalities. Current task count: 23. Check out the blog post on getting started with this task: https://huggingface.co/blog/abhishek/extractive-qa-autotrain
🚨 NEW TASK ALERT 🚨 AutoTrain now supports Object Detection! Transform your projects with these powerful new features:
🔹 Fine-tune any supported model from the Hugging Face Hub
🔹 Seamless logging with TensorBoard or W&B
🔹 Support for local and Hub datasets
🔹 Configurable training for tailored results
🔹 Train locally or leverage Hugging Face Spaces
🔹 Deployment-ready with API inference or Hugging Face endpoints
AutoTrain: https://hf.co/autotrain
The first open Stable Diffusion 3-like architecture model is JUST out 📣 - but it is not SD3! 🤔
It is Tencent-Hunyuan/HunyuanDiT by Tencent, a 1.5B-parameter DiT (diffusion transformer) text-to-image model 🖼️✨, trained with multi-lingual CLIP + multi-lingual T5 text encoders for English 🤝 Chinese understanding
Introducing AutoTrain Configs! Now you can train models using YAML config files! 🔥 These configs are easy to understand and not at all overwhelming, so even someone with almost zero knowledge of machine learning can train state-of-the-art models without writing any code. Check out the example configs in the configs directory of the autotrain-advanced GitHub repo, and feel free to share your own configs by creating a pull request 🤗 GitHub repo: https://github.com/huggingface/autotrain-advanced
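For illustration, an LLM finetuning config might look roughly like this. It is modeled on the examples shipped in the configs directory, and key names or defaults may differ, so treat it as a sketch and check the real example files:

```yaml
# Sketch of an AutoTrain config, modeled on the examples in the
# configs directory of autotrain-advanced; verify keys against those files.
task: llm-sft
base_model: meta-llama/Meta-Llama-3-8B-Instruct
project_name: my-autotrain-llm
log: tensorboard
backend: local

data:
  path: HuggingFaceH4/no_robots
  train_split: train
  chat_template: tokenizer
  column_mapping:
    text_column: messages

params:
  block_size: 1024
  epochs: 1
  batch_size: 1
  lr: 2e-5
  peft: true
  mixed_precision: fp16

hub:
  username: ${HF_USERNAME}
  token: ${HF_TOKEN}
  push_to_hub: true
```

Training is then a single command: autotrain --config path/to/config.yml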
Trained another version of llama3-8b-instruct which beats the base model, this time without losing too many points on the gsm8k benchmark. Again, using AutoTrain 🔥
pip install autotrain-advanced
Trained model: abhishek/autotrain-llama3-orpo-v2
With AutoTrain, you can already finetune the latest llama3 models without writing a single line of code. Here's an example finetune of the llama3 8b model: abhishek/autotrain-llama3-no-robots
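For reference, a CLI run of this kind could look something like the sketch below; the flag names are from memory and may have changed between versions, so check autotrain llm --help for the current list:

```bash
# Sketch of an AutoTrain CLI run; verify flags with `autotrain llm --help`.
autotrain llm \
  --train \
  --model meta-llama/Meta-Llama-3-8B-Instruct \
  --data-path HuggingFaceH4/no_robots \
  --text-column messages \
  --trainer sft \
  --epochs 1 \
  --batch-size 1 \
  --lr 2e-5 \
  --peft \
  --project-name autotrain-llama3-no-robots
```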
The Stable Diffusion 3 research paper broken down, including some overlooked details!
Model
- 2 base model variants mentioned: 2B and 8B sizes
- New architecture at all abstraction levels:
  - 🔽 UNet; ⬆️ Multimodal Diffusion Transformer, bye cross attention 👋
  - Rectified flows for the diffusion process (see the note after this list)
  - 🧩 Still a Latent Diffusion Model
- 3 text encoders: 2 CLIPs, one T5-XXL; plug-and-play: removing the larger one maintains competitiveness
- Dataset was deduplicated with SSCD, which helped with memorization (no more details about the dataset, though)
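Quick aside on rectified flows, since that is the biggest training change: instead of the usual curved diffusion trajectories, the model learns a straight-line path between data and noise. In standard rectified-flow notation (my summary, not the paper's exact formulation):

$$z_t = (1 - t)\,x_0 + t\,\epsilon, \qquad \mathcal{L} = \mathbb{E}_{t,\,x_0,\,\epsilon}\big[\,\lVert v_\theta(z_t, t) - (\epsilon - x_0)\rVert^2\,\big]$$

Sampling integrates dz/dt = v_\theta(z_t, t) from noise (t = 1) back to data (t = 0); the straighter the learned path, the fewer solver steps you need.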
Variants
- A DPO fine-tuned model showed great improvement in prompt understanding and aesthetics
- An Instruct Edit 2B model was trained and learned how to do text replacement
Results
- State of the art in automated evals for composition and prompt understanding
- Best win rate in human preference evaluation for prompt understanding, aesthetics, and typography (missing some details on how many participants and the design of the experiment)