Flex.1-alpha

Description

Flex.1-alpha is a pre-trained, 8-billion-parameter rectified flow transformer capable of generating images from text descriptions. Its architecture is similar to FLUX.1-dev, but with fewer double transformer blocks (8 vs. 19). It began as a finetune of FLUX.1-schnell, which allows the model to retain the Apache 2.0 license. A guidance embedder has been trained for it so that it no longer requires CFG to generate images.

Model Specs

  • 8 billion parameters
  • Guidance embedder
  • True CFG capable
  • Fine-tunable
  • OSI compliant license (Apache 2.0)
  • 512 token length input

Support Needed

I am just a solo Machine Learning Engineer doing this in my free time with my own money because I truly believe in open source models. I have already spent a significant amount of time and money to get this model to where it is. But to get this model where I want it to be, I need to continue to dump a significant amount of time and money into it, well beyond what I am financially capable of doing on my own. I have set up a Patreon for those individuals and organizations that want to financially support this project. I plan to also allow support in other ways soon for those that prefer to get their hands dirty.

Usage

The model can be used almost identically to FLUX.1-dev and will work out of the box with most inference engines that support it (Diffusers, ComfyUI, etc.).

For ComfyUI, there is an all-in-one file called Flex.1-alpha.safetensors. Put it in your checkpoints folder and use it as you would FLUX.1-dev.

More detailed instructions coming soon.

History

Flex.1 started as the FLUX.1-schnell-training-adapter, which made training LoRAs on FLUX.1-schnell possible. The original goal was to train a LoRA that could be activated during training to allow fine-tuning on the step-compressed model. I merged this adapter into FLUX.1-schnell and continued to train it on images generated by the FLUX.1-schnell model to further break down the compression, without injecting any new data, with the goal of making a stand-alone base model. This became OpenFLUX.1, which was continuously trained for months, resulting in 10 version releases.

After the final release of OpenFLUX.1, I began training the model on new data and experimenting with pruning. I ended up with pruned versions of OpenFLUX.1 at 7B and 4B parameters (unreleased). Around this time, flux.1-lite-8B-alpha was released and produced very good results. I decided to follow their pruning strategy and ended up with an 8B-parameter version. I continued to train the model, adding new datasets and applying various experimental training tricks to improve its quality.

At this point, the model still required CFG to generate images. I decided the model needed a guidance embedder similar to FLUX.1-dev's, but I wanted it to be bypassable to keep the model flexible and trainable. So I trained a new guidance embedder independently of the model weights; it behaves like an optional adapter, leaving the model capable of being trained and run without it.

Fine Tuning

Flex.1 is designed to be fine-tunable. It fine-tunes very similarly to FLUX.1-dev, with the exception of the guidance embedder. With FLUX.1-dev, it is best to fine-tune with a guidance of 1; with Flex.1, it is best to fine-tune with the guidance embedder completely bypassed.

Day 1 LoRA training support is in AI-Toolkit. You can use the example config to get started.
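As a rough sketch, an AI-Toolkit LoRA config for Flex.1 might look like the following. Key names follow AI-Toolkit's Flux LoRA example configs from memory and may differ; treat the bundled example config as authoritative:

```yaml
# Sketch of an AI-Toolkit LoRA training config (key names assumed, not verified).
job: extension
config:
  name: flex1_lora_example   # hypothetical run name
  process:
    - type: sd_trainer
      training_folder: output
      network:
        type: lora
        linear: 16
        linear_alpha: 16
      model:
        name_or_path: "ostris/Flex.1-alpha"
        is_flux: true
        quantize: true
        # The key point from the text above: bypass the guidance
        # embedder while fine-tuning Flex.1.
        bypass_guidance_embedding: true
      train:
        batch_size: 1
        steps: 2000
        lr: 1e-4
```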

Special Thanks

A special thanks to the following people/organizations, but also the entire ML community and countless researchers.

  • Black Forest Labs
  • Glif
  • Lodestone Rock
  • RunDiffusion
  • Freepik
  • Countless others…
