Setting sigmas to 0.95? [UNREQUESTED]
jonesaid at reddit found a way to unleash the powers of FLUX.1-Dev by reducing sigmas to 0.95:
https://www.reddit.com/r/comfyui/comments/1g9wfbq/simple_way_to_increase_detail_in_flux_and_remove/
Apparently, the schedulers were removing too much noise at every step of generating the image, and this is causing all the problems we see on Flux, so reducing sigmas to 0.95 makes the outputs look like some sort of fantasized Flux 2! Is there a way to make something like that work over here?
I've scoured the docs and I can't figure out a way to make this happen outside of ComfyUI lol. The serverless inference docs are still lacking in some areas and I haven't come across a working example on the hub... I imagine it's possible on a pipeline. It looks great though, have you got it running on any space?
P.S. Sorry for the late response, I suck at responding to threads!
Thanks for looking into this.
have you got it running on any space?
No, not even ZeroGPU spaces have done it! Perhaps it's impossible!
Anyway things have changed since then with the release of Stable Diffusion 3.5 Medium, because it got this level of detail and compositions, so we just need SDXL's refiner running on the inference API... The idea would be to let SD3.5 Medium run the prompt for some steps, and then switch to the refiner that runs Flux.1 Dev to finish the picture, with some sort of adapter similar to the one used by Loras, and then it should be able to even look better than this! Of course currently this is even more impossible, but a man can dream! Imagine with this tech we would be able to run a model for some steps and then allow another one to finish the pic so we could have another model fix any issues and refine the details of another! This sounds incredible in my head.
Let me, huh... summon @john6666 into the thread, last time I did so we got seeds, who knows what we can get now, because the models aren't using more steps or resources, as long as they load on the serverless API the first model will run for less time and the second for less steps so if the pipeline could support this we'd get infinite possibilities for free! I don't know who made the Lora adapter but this looks like the next... step, heh.
I've been summoned.
Well, if the goal is to use it with Diffusers, I think Diffusers itself needs to be compatible. Otherwise, you'd need a long source code to call a single model...
On the other hand, the pipeline source code is public, so in theory it should be possible to manually rewrite and replace it. In theory. Nyanko7 and others do it sometimes.
Since sigma changes are beneficial and inexpensive, I think it would be quicker to raise an issue on github, but I'm a github beginner too.
Edit:
Already there?
https://github.com/huggingface/diffusers/issues/9924
https://github.com/huggingface/diffusers/issues/9971
If we wait, it will be possible automatically with a version upgrade.
I think these schedulers could be specified with sigma.
Edit:
Oh, maybe not if I read the manual again?
It's a bit iffy whether or not it can be specified.
Maybe we need a bit more detail about the ComfyUI sampler and scheduler that was actually used in the Reddit post.
Wow! A lot of things are happening behind the scenes! So if I'm understanding correctly, after these changes are implemented we'd need to clone the Flux.1 Dev repo and change its scheduler file to the one that can reduce sigmas and set it to 0.95 on there, and then the serverless API of our clone would work like that? If so, that feels like a classic, I remember digiplay would make identical copies of repos but with schedulers changed to give different results, but this time we could have a model with better outputs than Black Forest Labs's original repo for Flux!
That's right. If it's implemented in Diffusers, Python writers can change the behavior of the model by simply passing options to the scheduler class instance on initialization, and people uploading models can change the behavior of the model by editing scheduler_config.json in Notepad.
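To make the "edit scheduler_config.json" idea concrete, here is a rough sketch of what changing one option in a scheduler config could look like. It only uses Python's `json` module. The config contents are illustrative and not copied from any real repo, and `sigma_scale` is a hypothetical option name made up for this example, since no such option is confirmed to exist in Diffusers.

```python
import json


def override_scheduler_option(config_text, key, value):
    """Return scheduler config JSON text with one option set to a new value."""
    config = json.loads(config_text)
    config[key] = value
    return json.dumps(config, indent=2)


# Illustrative config contents only, not taken from a real model repo.
original = '{"_class_name": "ExampleScheduler", "num_train_timesteps": 1000}'

# "sigma_scale" is a HYPOTHETICAL option name used purely for illustration.
patched = override_scheduler_option(original, "sigma_scale", 0.95)
print(patched)
```

Whether a real scheduler would honor such an option depends entirely on what ends up being implemented, so treat this as the shape of the workflow and not as a working recipe.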
The only problem is that we still need to check whether the ongoing changes described above will produce the same results as those on Reddit. If the scheduler algorithm is completely different, we may need to add a scheduler itself. That would be a big job for the Diffusers team, mainly in terms of debugging and testing the operation.
I think there is probably something similar...
Perhaps it's easy to find this information, but I'm curious, since Flux 1 Dev is the subject of everything here (I haven't used it yet, and I'm not sure I'll test it, by the way): I saw "sigmas" mentioned. What are sigmas? Thanks in advance for the answer!
@FashionStash Hey! You can test it here: https://huggingface.co/spaces/black-forest-labs/FLUX.1-dev - since you are used to creative and artful models it's probably going to be a disappointment, so INSTEAD of using your prompts, I recommend searching for an image in Google Images, or using one that you already have, and uploading it here: https://huggingface.co/spaces/fancyfeast/joy-caption-alpha-one - after running that you will get a very long prompt that is paragraphs long, you copy and paste that into Flux, then you will see why it's the most advanced model capable of drawing pretty much anything as long as you specify it to the prompt in a way it understands, like, in a SD1.5 saying someone is "mean" will get you the expression you want, here you'd need to write a paragraph detailing how the eyebrows and eyes and mouth should look like to get it, but then you can make advanced facial expressions unavailable on previous models.
Sigmas are the noise levels the scheduler steps through, so they control how much noise is removed from the picture at each step. The reason Flux's outputs aren't very detailed and often have a blurry background is that the default sigmas remove way too much noise, so removing just 95% of what it normally removes keeps detail in and produces much crisper backgrounds. There's a GUI called ComfyUI that people with the hardware use to run Flux, and people have created addons for it, including ones to modify CFG so it follows your prompts better (unlocking many styles that it has that will not be given normally) or reduce sigmas, but we currently have no way to do that in the diffusers version that huggingface uses for inference, so without the hardware we are out of luck.
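As a toy illustration of the idea (not the actual Flux or ComfyUI code), the sketch below builds a simple linear list of noise levels and multiplies every entry by 0.95. The linear schedule and the helper names are assumptions chosen for readability. The point is only that scaling the sigmas shrinks every per-step decrement by the same factor, which lines up with the "removing 95% of what it removes normally" description.

```python
def linear_sigmas(num_steps):
    """Illustrative noise levels from 1.0 down to 0.0, one per step boundary."""
    return [1.0 - i / num_steps for i in range(num_steps + 1)]


def scale_sigmas(sigmas, factor):
    """Multiply every noise level by the same factor (e.g. 0.95)."""
    return [s * factor for s in sigmas]


def step_sizes(sigmas):
    """Difference between consecutive noise levels, i.e. the size of each step."""
    return [a - b for a, b in zip(sigmas, sigmas[1:])]


base = linear_sigmas(4)
scaled = scale_sigmas(base, 0.95)

print("base sigmas:  ", base)
print("scaled sigmas:", scaled)
print("base steps:   ", step_sizes(base))
print("scaled steps: ", step_sizes(scaled))
```

Real schedulers usually use a shifted, non-linear schedule, so the actual numbers will differ. The proportional shrinking of each step is the part this sketch is meant to show.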
What I realized yesterday is that we are at the bleeding edge of this and people are implementing features in real time as we speak, so we may just need to be patient, though, a bit of a problem is we can't talk with the people doing it directly because we don't have github accounts, and they prefer to use it to discuss instead of huggingface, so, heh, we may need to send a spy over there to inquire about all this, it seems it'd be easier to design a working rocket!
Things are changing too fast...
news from discord about flux:
sayakpaul, today 18:46
Apart from Control LoRA Flux (which we're actively working on), we have shipped support for the new Flux tools.
Docs: https://huggingface.co/docs/diffusers/main/en/api/pipelines/flux
NF4 checkpoints: https://huggingface.co/collections/sayakpaul/flux-tools-in-nf4-6742ede9e736bfbb3ab4f4e1
Make sure you install diffusers from source and pass the right revision while loading the checkpoints as the original repos are yet to merge diffusers weights.
https://discord.com/channels/879548962464493619/1014557141132132392/1310179757228294215
Yntec, I remember that you might have a trauma with 2FA on github, but would you guys like to join github or Discord?
From what I can see, there are a lot of coders and staff, but there is a lack of feedback and mutual support from the perspective of users and artists.
So, library developers don't notice bugs and inconveniences that are important for practical use. This is at a level that even coders are complaining about. (In reality, mainly about Gradio.)
I'm willing to join github if that would solve things, surely, though, I wouldn't know what to say, or where to say it, I'd not have said anything else if it wasn't for Nymbo's reply from yesterday! My problem is not knowing what to do, that's why back when I had a fully working github account I didn't do much with it, other than starting projects I never finished that now I feel shame about. I don't even check huggingface forums! Maybe you asked me a question last time we talked and I never went to check that thread again, I engage on here because I get a golden circle around my avatar when there's a notification, it's like different countries we have to visit and I have only found toxic communities at discord where they ban you if you make any comment about having some phobia, so I'd rather have someone else extract the information from there.
I love the idea of having a Discord, considering the fractured nature of communicating on HF. I just threw this server together now, it's called SupaHuggers (name subject to change, I'm not married to it) - https://discord.gg/E59k3gkZyd
I'm also in the official HF discord but the idea with SupaHuggers would be a small server for a few very regular users to collab, vent about gradio 5.0, etc.
I'd love you guys to join to get it started @John6666 @Yntec
Huh, so now I'm glad I wrote a book about why I'm against discord in principle so I don't have to repeat myself: https://huggingface.co/Yntec/Dreamlike/discussions/3#6614ee5e06ee61ff24dacddb - but for short, I deem it as "closed source", keeping things private and unsearchable as a secret so only "members of the club" can benefit, just like keeping the recipe of a model unpublished or posting an AI generated image without the prompt, I want all the discussions I'm involved with to remain open and accessible to everyone and it makes me glad there's no private message system over here, so you can check every single thing I've told to john6666 and he has told me for instance and nothing is concealed (I don't use emails either!)
I guess publishing monthly publicly all the messages from the discord would solve it, then again, when it comes to it I turn out to be a difficult person to deal with, heh, but imagine Black Forest Labs was like this and we could see all their conversations about how they made Flux, everyone could benefit and we could create a new version, but Flux Pro was never published, akin to being buried in a discord server.
I see. You don't like the way Discord is.
I also think that the fact that you can't search for Discord on Google makes it particularly difficult to use from an OSS perspective. On the other hand, it's easy to have private conversations on it, so it's somewhere between anonymous message boards and HF, leaning towards HF...
For example, if you discuss solutions here, on the HF forum, on github, or on anonymous message boards, then that becomes a resource that you can search for on Google in the same way as StackOverflow, but that's not the case with Discord.
I'm not familiar with the github culture either. I think it's been less than a month...
However, if you think there's a bug or something inconvenient, you can just write an issue, and if it can be fixed, you can just PR it, I think. I still don't really understand how to use Discussion.
Well, if we consult on HF and figure out where the problem is, wouldn't it be fine if one of us just writes an issue?
You articulated it much better than I could, imagine Nymbo and you join a Discord and have discussions of information I'll never have access to, it's probably happening all the time with other people and several groups have to reinvent the wheel because the solutions keep being hidden in Discord groups, and that only slows things for everyone because solutions are never published (the last Stable Diffusion Discord I tried to join had as their first rule that nothing you read on it could be shared outside of it!)
I'm currently very happy with HF's messaging system and wonder why people are using github to communicate and improve Diffusers instead of a HF model card, I guess it's a technical issue.
We can talk behind their backs all we want. You can rest assured that so far it's not working and we won't do it.
The problem is that no one can know about it openly.
Also, there is one more misconception. You and I are the only ones who use this place as an anonymous forum!
This place is usually thought of as somewhere to report bugs, so not many people are chatting. And the forum only has a serious section.
The only place where people can casually talk about dinner is the HF Discord normally.
It would be nice if more people knew that model repos can be used as a forum.
@Yntec
r3gm, the author of DiffuseCraft and stablepy, wrote us some code, so I've set up a test space.
If it works well, we'll go ahead and commit it to Diffusers.
Please help with the performance test.
By default, the two images generated with sigma=1 and sigma=0.95 are compared.
https://huggingface.co/spaces/John6666/flux-sigmas-test
Wow! That was quick! Happy new December John! I managed to create 3 comparisons before running out of quotas, it seems to be working as expected and providing new styles and detail previously locked in Flux! Here are the samples and prompts I tried (all use seed 9119):
80s mage handsome wizard guy as Benjamin with pretty girl as Clara fighting dinosaur dragon portrait with detailed retro face and eyes, movie still, sword and shield holding jupiter fire sunset in the sky with action, his hands, she stands on a mountain, under the there is a small town, illustration, sharp focus, very detailed, 8 k, hd
80s cinematic colored sitcom screenshot. young husband with wife. festive scene at a copper brewery with a wooden keg of enjoying burrito juice in the center. sitting cute little daughter. Display mugs of dark beer. Closeup. beautiful eyes. accompanied by halloween Shirley ingredients. portrait smile
anthropomorphic pig Programmer with laptop, colorfull, funny
Thanks! This is big!
Happy New December!
Well, I tried it too, but seriously, just changing that makes it work exactly as we want...
The logic part is less than a line of code.
With such detailed samples, it's easy to explain to the development team. Thanks a lot.
The rest is easy. It just takes time.
Awesome! Here's the one I tried to make yesterday:
A painting by Picasso of Hatsune Miku in an office. Desk, window, books.
Sigmas 0.95 make default flux look lazy in comparison, heh, and the best part is all Flux based models and Loras will benefit! It's incredible the changes a few lines of code can make.
I tried consulting with them in advance on Discord, but they said I didn't have enough convincing samples.
asomoza
the best method here would be to open an issue with a feature request, it's perfectly fine to just give a quick description and a link to the HF discussion, what's important and I don't see clearly is to post some images showing the enhancements, for example I think in that discussion the images with it are the left ones right?
In those examples I don't see that clearly that they are better, just that they are a variation of the image, maybe the pig one would be the only one. I understand that this can diminish the bokeh of the images, but there are some other methods that also claim to do that.
The repositories for them are this one for auto1111 and this one for comfyui which don't have that many stars, this is not a clear representation that they aren't used but we don't have any other good measure tool to know if people find them useful or not.
Adding more code to the pipelines (even if it is just a new arg and a line of code) is not something we would do if it's not used or we don't have some good evidence that it's clearly better.
Also this is going to be easy to add when we have modular diffusers finished.
for example I think in that discussion the images with it are the left ones right?
In those examples I don't see that clearly that they are better, just that they are a variation of the image, maybe the pig one would be the only one.
Whoa! What? No, the sigma ones are the ones on the right! The left one is a soulless pig staring at the camera and such boring outputs are a big problem of Flux, plus lack of diversity, and by that, I mean if you run different seeds with that prompt most are going to be such a soulless picture, so you never create a second one because you've already seen them all. Sigmas fix all that, but he just sees the wall decorations and books on the right and thinks it's better?? No wonder we are in this state, probably black forest labs hired people to get an aesthetic score and they scored better ones like the ones on the left, that pig looks like a beginner trying 3D software and failing to imitate pixar style... even my DreamWorksRemix model can draw a better pig!
But I'm sure he'll like Flux's one on the left better, maybe he likes animals with glasses...
Anyway, so I'm giving up on this, I'd tell him that even if he wanted to implement it I'd not want him to do it anymore, such people have bad vibes that get embedded into the code and things will eventually stop working, I claim such things were what killed the Automatic1111 GUI, you have to get coders in there that enjoy what they're doing and helping others implement features, not looking for reasons of why it's a bad idea. No thanks.
All I wanted was to be able to use sigmas and I can do that in your space! Could you modify it so that it's optional that it makes also the default version to compare, so I may use it to only do sigma ones? To save quotas and make more images, I could make six Flux 0.95 sigmas images! Compared to yesterday's zero that'd be the dream!
No, that space is only for the purpose of the demonstration experiment, and I can't leave it there forever... I don't have enough Zero GPU space slots...
Of course, even if I offload it, I won't delete it.
However, I will continue to explore and find a way to keep this feature. Ultimately, I plan to put it in the FLUX LoRA space. (That space can be generated even without LoRA.)
I think there is a big enough difference... If there is this much difference, it is a different model. It is not good or bad, but diversity. Neither of them is broken.
It's genuinely sad that the value of algorithms is only simply evaluated by the majority...
Edit:
I would like to know what happened to the development of A1111.
I can't leave it there forever... I don't have enough Zero GPU space slots...
It was tempting to buy a HF account and put a ZeroGPU myself, just with the sending of default model picture disabled, because this is the only way for people without the hardware to use Flux sigmas, but I wouldn't want to be in your shoes.
This is very frustrating so I'm closing this and hope nobody replies to this thread anymore, I take back the request and will stop supporting Flux, I'll be removing it from my spaces when it's time to update them and pretend it never existed, this rotten technology can burn in a fire for all I care now.
@Yntec I was the one that commented that on discord, you don't have to be so mad about someone else's opinion that doesn't match yours.
For me all the Flux default generations look soulless, that was just my honest opinion, also usually when doing comparisons you have to tell people which ones are what. I apologize to you if my bad taste offended you but you shouldn't drop all efforts because of it and maybe you can show me why they are so good in comparison to the normal ones.
You can get mad, but I'm almost sure that I can do the same and make everyone guess which one is the one with the sigmas at .98 and people won't be able to tell, maybe you also won't. Just remember that diffusers and image generation in general are also used by normal people and not specialized people like you.
P.S.: Also I said "maybe the pig" just to say something, but I really don't find either one better than the other. If you want, I can copy these to github and ask more people about it and see if the differences are really noticeable for the rest of the people.
https://github.com/huggingface/diffusers/pull/10081
I know, hlky implemented it.
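For anyone who wants to try this from Python, here is a minimal sketch of building the sigma list yourself. It assumes the linked PR lets the Flux pipeline call accept a `sigmas` list, and that the unscaled schedule is evenly spaced from 1.0 down to 1/steps. Both are assumptions on my side, so check the Diffusers docs for your installed version. `flux_sigmas` is a made-up helper name, and the pipeline call is left as a comment because it needs the model weights and a GPU.

```python
def flux_sigmas(num_inference_steps, factor=0.95):
    """Evenly spaced sigmas from 1.0 down to 1/num_inference_steps, each multiplied by factor."""
    n = num_inference_steps
    if n < 1:
        raise ValueError("num_inference_steps must be at least 1")
    if n == 1:
        base = [1.0]
    else:
        end = 1.0 / n
        # Same values as an evenly spaced range from 1.0 to 1/n with n points.
        base = [1.0 + i * (end - 1.0) / (n - 1) for i in range(n)]
    return [s * factor for s in base]


sigmas = flux_sigmas(28, factor=0.95)
print(sigmas[:3], "...", sigmas[-1])

# Assumed usage (not run here): pass the list to an already loaded Flux pipeline.
# image = pipe(prompt, num_inference_steps=28, sigmas=sigmas).images[0]
```

Setting `factor=1.0` gives the unscaled list, which makes it easy to generate a same-seed pair for comparison.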
I am sad that my lack of explanation has caused your displeasure, irritated you, and led to our quarrel, which is essentially unnecessary. There is no one on HF who is happy with the reduced functionality of Diffusers.
I too was impatient and abandoned the additional explanation at that time. I regret that.
I am sorry that my poor English made me sound like a fraud in the LyCORIS case as well. I'm sure that I caused alarm in your mind. I will be careful not to cause such a thing in the future.
This case is our fault for my lack of explanation and my / your attitude or feeling at that time.
Even though it's important to point out the problem, there should always be a moderate way to say it.
Yntec is the only pure victim.
REALLY sorry Yntec.
@John6666 if part of the apology is to me, don't worry, you have been very polite and didn't cause any troubles for me or the team, please continue doing what you're doing which is great for the community and also helps us get in touch with what's important for the users.
The only thing that I can add is that we don't have the bandwidth to test every use case, so we appreciate a lot if the issue is with a detailed explanation (doesn't matter the quality of the english), a reproducible code and some clear image examples (or the media type related).
I also reiterate my apology to Yntec if my bad taste in images caused any grief, and I must add that this kind of problems won't be an issue once we have the modular diffusers PR merged which will allow users like you to just add whatever they need.
Thanks guys, at the end of the day this was because it made me feel worthless as I wasn't worth a single ZeroGPU slot, this was never about getting 0.75 sigmas on this space or on the inference API, it was about being able to use them at all, and John6666's space was already fulfilling the wish, except it was using double the quotas by also generating an unwanted image with default Flux settings, and if I couldn't convince John of removing a few characters in a line of code so only 0.75 sigmas images were produced in that space, what hope do I have of anything else?
But since they're implementing unacceptable changes on Huggingface related to the limits of usage, I took away my toys and quit the place, so none of this matters anymore.
Where will you be going? Your spaces were epic and I'm willing to follow wherever you go.