
Flux AI Compared To PixArt And Stable Diffusion

Last week Black Forest Labs released their generative AI, Flux AI. I have tested and compared it to PixArt, SDXL and SD 3 and my initial thought is that Flux is what everyone thought SD 3 would be.

Flux is available in two versions that can be run locally on your computer, Flux.1 dev and Flux.1 schnell. The checkpoints for both are close to 24GB, with roughly another 10GB on top for the text encoders. This would normally mean that you need a graphics card with at least 32GB of VRAM, but it’s possible to offload a lot to your CPU and system RAM, making it possible to run the model with much less VRAM.
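For readers who want to try offloading themselves, here is a minimal sketch. It is not necessarily the setup used for the images in this post: it assumes the Hugging Face diffusers library (with torch and accelerate installed) and the schnell weights published under the repository name black-forest-labs/FLUX.1-schnell. The function name generate_with_offload is just an illustrative helper.

```python
def generate_with_offload(prompt, model_id="black-forest-labs/FLUX.1-schnell"):
    """Generate one image while keeping most of the weights in system RAM.

    ASSUMPTION: uses Hugging Face diffusers; other front ends handle
    offloading in their own way.
    """
    # Imported lazily so the heavy dependencies are only needed when called.
    import torch
    from diffusers import FluxPipeline

    pipe = FluxPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)

    # Keep the weights on the CPU and move each submodule to the GPU only
    # while it runs. This is slow, but it needs far less VRAM than loading
    # the whole model onto the card.
    pipe.enable_sequential_cpu_offload()

    result = pipe(
        prompt,
        num_inference_steps=4,    # schnell is distilled for very few steps
        guidance_scale=0.0,       # schnell runs without guidance
        max_sequence_length=256,  # token limit for the schnell text encoder
    )
    return result.images[0]


# Example call (the first run downloads tens of GB of weights):
# generate_with_offload("a pirate in a forest").save("flux_schnell.png")
```

If you have more VRAM to spare, diffusers also offers enable_model_cpu_offload(), which moves whole components at a time and is usually faster.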

In addition, there are fp8 versions available for both checkpoints, which brings them down to less than 12GB.
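The size difference follows from how many bytes each weight takes up. Below is a rough back-of-the-envelope sketch. It assumes a model of about 12 billion parameters, which is the figure Black Forest Labs gives for Flux.1, and it counts only the main model weights, not the text encoders. The real files will differ a little from these round numbers.

```python
# Rough checkpoint size from parameter count and numeric precision.
# ASSUMPTION: about 12 billion parameters for the main Flux.1 model.
PARAMS = 12_000_000_000

# Bytes needed to store one weight at each precision.
BYTES_PER_PARAM = {"fp32": 4, "bf16": 2, "fp16": 2, "fp8": 1}


def checkpoint_size_gb(num_params, precision):
    """Return the approximate weight size in GB (1 GB = 10**9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


for precision in ("bf16", "fp8"):
    size = checkpoint_size_gb(PARAMS, precision)
    print(f"{precision}: ~{size:.0f} GB")
```

Going from 16-bit to 8-bit weights halves the file, which is why the fp8 checkpoints can fit on a 12GB card.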

Using my GeForce RTX 3060 with 12GB VRAM, I can create an image with the dev checkpoint in 2 min 18 sec, while the schnell checkpoint takes 28 sec.

These are the results.

Flux AI vs SD 3 vs PixArt vs SDXL

In this post I show images from SDXL, SD 3, PixArt and Flux generated from the same prompt, so the differences are easy to see. For each set of images I also give the base prompt, meaning a condensed version of the prompt that was actually used. I do this because the prompts sometimes had to be altered a bit depending on the checkpoint, and because some prompts are very long and would not add much for a reader.

A beautiful woman wearing a pirate costume in a forest with plants behind her

Photo of a group of musical notes standing in a forest with butterflies flying around them, all surrounded by fireflies

Photo of a beautiful young woman with long silver hair, wearing an elaborate black gothic outfit with intricate lace details and silver jewelry

Below I use a long and complicated prompt to compare how the different models handle it.

In the striking neo-noir high contrast image, a fascinatingly eccentric gorgeous female warlock dominates the scene. The subject, depicted in a mesmerizing photograph, captures the warlock’s enigmatic essence. Adorned in a tattered, yet stylish, midnight black suit, her piercing emerald eyes pierce through the darkness, sparking intrigue and curiosity. The warlock’s unkempt silver hair cascades down her shoulders, adding an air of mystery to her persona. her pale skin bears intricate, intricate and captivating tattoos, emanating an otherworldly aura. This masterfully captured image immerses viewers in the captivating world of a kooky and provocative warlock, leaving them enthralled by her haunting presence

A beautiful female cyborg with red glass, purple ceramic details, beautiful natural hair, wearing an elaborate dress made of chiffon fabric, red glass tubes, complex wiring and mechanical intricacy in the cyberpunk style

Mythical Nightscape – a path through a forest with lots of trees and flowers on it at night time with lights shining on the trees, magical atmosphere

Stable Diffusion 3 was completely useless with the prompt below.

A raw photo of a tattooed woman, goth straight black hair wears a delicate black and red dress accentuated by black lace, laying down in puddle of shiny black liquid, floating red roses

Photo of a cyber-pirate woman dressed in short skirt with garter holders

A large barrel made of glass filled with water. The glass is transparent and is reflecting the light. Inside the barrel som Koi fishes swimming. In the background there is a traditional japanese garden

Full body shot of a 28 year old woman holding a sign in that says “Open for business”. She is wearing a fake moustache

I think most people would agree when I say that all these models have their strengths and weaknesses. Overall, I believe Flux is currently the best and most versatile of them, especially when it comes to creating text in images.

There is also a third Flux checkpoint, Flux Pro, but it’s only available through an API and at a cost, so I haven’t bothered trying that one.

Published in English, Tech