Train FLUX.1 Image Models Fast with Your Photos — No Coding Needed!

Fine-tune FLUX.1 models using your photos on Replicate, no coding needed! Create unique images in styles like superheroes or cartoons.

8 min read · Sep 10, 2024


FLUX.1 became popular worldwide soon after its release. It generates higher-quality images than earlier open-source models and can be fine-tuned through a simple process that requires no programming knowledge. Hundreds of public Flux fine-tunes and thousands of private ones have already been published on Replicate.

One of the most exciting features of Flux is how well it fine-tunes on faces, which was difficult to achieve with earlier open-source image generation models such as Stable Diffusion or SDXL. Not since Dreambooth has it been this easy to get great results from just a few training images.

This article explains how to fine-tune FLUX.1 on the Replicate platform with your own photos, so that you can generate new images of yourself in various styles, such as a superhero, a cartoon character, or an adventurer.

Key Steps of Training FLUX.1

Prepare training images: Collect at least 10 high-quality facial photos taken from different angles and under different lighting conditions.
Select a trigger word: Create a unique “trigger word” that will later activate the model in your prompt text.
Create and train the model: Upload the images and the trigger word on Replicate and start training, which takes about 20 minutes.
Generate images: Run the trained model with a detailed prompt that contains the trigger word to generate images matching that prompt.

Step 0: Preparation

Before you start fine-tuning your FLUX.1 model, you will need the following:

A Replicate account
A few photos of yourself as training pictures
$2–3 to cover model training costs

Step 1: Collect training images

You will need to collect several photos of yourself for training. It is best to use photos taken from different angles, in different lighting conditions, and against different backgrounds. At least 10 high-quality facial photos are recommended, and 2 is the minimum.

Image requirements:

Supported formats: WebP, JPG, PNG
Resolution: 1024×1024 or higher is recommended
File name: You can name it as you like. The file name does not affect the training.
Aspect ratio: Images can be square, landscape, or portrait
Quantity: At least 10 pictures are recommended. More pictures generally give better results.
Variety: Choose pictures with different backgrounds, clothing, lighting, and angles.
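The requirements above can be checked before uploading. The following is only a minimal sketch: checkTrainingSet is a hypothetical helper, it looks only at file names and counts, it assumes that a .jpeg extension counts as JPG, and it does not inspect resolution (that would require decoding the images).

```javascript
// Formats listed in the article; "jpeg" is an assumed alias for JPG.
const SUPPORTED_EXTENSIONS = ["webp", "jpg", "jpeg", "png"];
const MIN_IMAGES = 2; // hard minimum stated in the article
const RECOMMENDED_IMAGES = 10; // recommended minimum stated in the article

// Hypothetical helper: checks a list of file names against the guidelines.
function checkTrainingSet(fileNames) {
  const extensionOf = (name) => {
    const dot = name.lastIndexOf(".");
    return dot === -1 ? "" : name.slice(dot + 1).toLowerCase();
  };
  const isSupported = (name) => SUPPORTED_EXTENSIONS.includes(extensionOf(name));

  const supported = fileNames.filter(isSupported);
  const unsupported = fileNames.filter((name) => !isSupported(name));

  const warnings = [];
  if (unsupported.length > 0) {
    warnings.push(`Unsupported files: ${unsupported.join(", ")}`);
  }
  if (supported.length < RECOMMENDED_IMAGES) {
    warnings.push(
      `Only ${supported.length} usable images; at least ${RECOMMENDED_IMAGES} are recommended`
    );
  }
  // "ok" only reflects the hard minimum; warnings cover the recommendations.
  return { ok: supported.length >= MIN_IMAGES, usable: supported.length, warnings };
}

console.log(checkTrainingSet(["me-01.jpg", "me-02.PNG", "notes.txt"]));
```

File names themselves are not validated, since the article notes that they do not affect training.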

Preparation steps:

Store the images in a folder named, for example, data.
Compress the folder into a .zip file named data.zip.

Step 2: Choose a unique trigger word

When fine-tuning FLUX.1, you need to choose a unique trigger word that you will later include in your image generation prompts, for example:

photo of YOUR_TRIGGER_WORD_HERE looking super-cool, riding on a segway scooter

Trigger word requirements:

When choosing a trigger word, keep the following in mind:
It should be unique, e.g. MY_UNIQ_TRGGR.
It should not be an existing word in any language, such as dog or cyberpunk.
Don’t use TOK, because it will clash with other fine-tunes if you ever combine them.
Capitalization doesn’t matter, but capital letters can help you differentiate your trigger words from the rest of the text in your prompt.
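These guidelines can be turned into a quick sanity check. The sketch below is illustrative only: validateTriggerWord is a hypothetical helper, and COMMON_WORDS is a tiny stand-in list, because a real "not an existing word in any language" test would need a proper dictionary.

```javascript
// Tiny stand-in word list (assumption); a real check would need a dictionary.
const COMMON_WORDS = ["dog", "cat", "cyberpunk", "man", "woman"];

// Hypothetical helper: returns the guideline violations found for a trigger word.
function validateTriggerWord(word) {
  const problems = [];
  // Capitalization does not matter, so compare in lower case.
  const normalized = word.trim().toLowerCase();

  if (normalized.length === 0) {
    problems.push("trigger word is empty");
  }
  if (normalized === "tok") {
    problems.push("TOK is discouraged as a trigger word");
  }
  if (COMMON_WORDS.includes(normalized)) {
    problems.push(`"${word}" is an existing word and is unlikely to be unique`);
  }
  return { valid: problems.length === 0, problems };
}

console.log(validateTriggerWord("ZIKI"));
console.log(validateTriggerWord("dog"));
```

A check like this cannot prove that a word is unique; it only catches the obvious mistakes listed above.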

The examples in this article use “ZIKI” as the trigger word. You can choose a similarly unique combination of characters according to your preferences.

Step 3: Create and train the model

Next, you will upload the training images on the Replicate platform and start training the model. You can choose to use the web form for training, or automate the process through the API.

Web page training steps:

Visit the Flux fine-tuning form.
Choose where to publish the model: You can publish the fine-tuned model as public or private.
Upload training images: In the input_images field, upload the data.zip archive you created earlier.
Enter the trigger word: In the trigger_word field, enter the unique trigger word you selected earlier.
Select the number of training steps: The default setting is 1000 steps. Fewer steps may not be able to learn the concepts in the image well, while more than 1000 steps may be a waste of time and cost.
Click Create training to start training.
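For readers who prefer the API route mentioned above, the same form fields can be assembled in code. This is a rough sketch under several assumptions that are not in the article: the trainer is assumed to be ostris/flux-dev-lora-trainer, the step count is assumed to be passed as an input named steps, and TRAINER_VERSION_ID, your-username/your-model, and the example.com URL are placeholders to replace with your own values. buildTrainingArgs is a hypothetical helper that only builds the arguments; the actual call to the client's trainings.create method is shown in a comment.

```javascript
// Assumed trainer model; confirm the owner and name on the training form.
const TRAINER_OWNER = "ostris";
const TRAINER_NAME = "flux-dev-lora-trainer";

// Hypothetical helper: builds the argument list for replicate.trainings.create.
function buildTrainingArgs({ trainerVersion, destination, zipUrl, triggerWord, steps = 1000 }) {
  if (!trainerVersion || !destination || !zipUrl || !triggerWord) {
    throw new Error("trainerVersion, destination, zipUrl and triggerWord are all required");
  }
  return [
    TRAINER_OWNER,
    TRAINER_NAME,
    trainerVersion,
    {
      destination, // "owner/model" that will receive the fine-tuned weights
      input: {
        input_images: zipUrl, // publicly reachable URL of data.zip
        trigger_word: triggerWord,
        steps, // assumed input name for the number of training steps
      },
    },
  ];
}

const trainingArgs = buildTrainingArgs({
  trainerVersion: "TRAINER_VERSION_ID", // placeholder
  destination: "your-username/your-model", // placeholder
  zipUrl: "https://example.com/data.zip", // placeholder
  triggerWord: "ZIKI",
});
console.log(JSON.stringify(trainingArgs, null, 2));

// With the official "replicate" client this would be used roughly as:
//   const replicate = new Replicate();
//   const training = await replicate.trainings.create(...trainingArgs);
```

Keeping the argument building separate from the network call makes it easy to inspect what will be sent before any training cost is incurred.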

Step 4: Wait for training to complete

The training process is pretty fast, but still takes a few minutes. If you use ten images and 1000 steps, the whole process should take about 20 minutes. Use this time to get up from your computer, stretch your arms and legs, drink some water, etc.

When you come back, your model should be ready.
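If the training was started from code rather than from the web form, the waiting can be scripted as well. The sketch below assumes a client object that exposes trainings.get(id), as the official replicate JavaScript client does, and that a finished training reports one of the statuses succeeded, failed, or canceled. waitForTraining, intervalMs, and maxChecks are hypothetical names chosen for this example.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical helper: polls a training until it reaches a final status.
async function waitForTraining(client, trainingId, { intervalMs = 30000, maxChecks = 120 } = {}) {
  const finalStatuses = ["succeeded", "failed", "canceled"];
  for (let check = 0; check < maxChecks; check++) {
    const training = await client.trainings.get(trainingId);
    if (finalStatuses.includes(training.status)) {
      return training;
    }
    // Still starting or processing: wait before asking again.
    await sleep(intervalMs);
  }
  throw new Error(`Training ${trainingId} did not finish after ${maxChecks} checks`);
}

// Usage with the official client (not executed here):
//   const replicate = new Replicate();
//   const finished = await waitForTraining(replicate, training.id);
//   console.log(finished.status);
```

With the defaults, the helper checks every 30 seconds for up to an hour, which leaves a comfortable margin over the roughly 20 minutes quoted above.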

Step 5: Generate an image using a web page

Once training is complete, your fine-tuned model is ready to use. Enter a prompt in the web form to generate an image.

Visit the Replicate platform web playground.
Enter the prompt: Include the trigger word you set previously in the prompt, for example: photo of ZIKI looking super-cool, riding on a segway scooter

Step 6: Generate an image using the API

The web playground is a great place to start playing with your new model, but generating images one click at a time quickly becomes tiresome. Fortunately, your model is also hosted in the cloud behind an API, so you can automate generation from code in the programming language of your choice.

When you run the model, you’ll see tabs for different languages, such as Node.js and Python. These tabs contain code snippets that show you how to construct the API call to replicate the exact parameters you just entered into the browser form.

Click the Node.js tab in the web playground to view the API code. Here is a simplified Node.js example:

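// Save as an ES module (e.g. a .mjs file) so that import and top-level await work.
import Replicate from "replicate";

// The client reads the API token from the REPLICATE_API_TOKEN environment variable.
const replicate = new Replicate();
// Model identifier in "owner/name:version" form.
const model = "zeke/ziki-flux:dadc276a9062240e68f110ca06521752f334777a94f031feb0ae78ae3edca58e";
// The prompt must contain the trigger word (ZIKI here).
const prompt = "ZIKI, an adult man, standing atop Mount Everest at dawn...";
const output = await replicate.run(model, { input: { prompt } });
console.log(output);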

With this API, you can generate images from your own programs.
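Once a single call works, running a whole list of prompts is a small step. The following sketch assumes a client with the same run(model, { input }) method used above. generateAll is a hypothetical helper, and the model identifier in the usage comment is a placeholder.

```javascript
// Hypothetical helper: runs the model once per prompt, one after another,
// and records either the output or the error for each prompt.
async function generateAll(client, model, prompts) {
  const results = [];
  for (const prompt of prompts) {
    try {
      const output = await client.run(model, { input: { prompt } });
      results.push({ prompt, output, error: null });
    } catch (err) {
      // Keep going so one failed prompt does not abort the whole batch.
      results.push({ prompt, output: null, error: String(err) });
    }
  }
  return results;
}

// Usage with the official client (not executed here):
//   const replicate = new Replicate();
//   const results = await generateAll(replicate, "your-username/your-model:VERSION_ID", [
//     "ZIKI, an adult man, riding a segway scooter",
//     "ZIKI, an adult man, dressed as a superhero",
//   ]);
//   console.log(results);
```

The prompts run sequentially to keep the example simple. Running them in parallel would be faster, but it is left out here because the applicable rate limits are not covered in this article.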

Step 7: Generate better prompts using an LLM

Sometimes, you may have difficulty coming up with a suitable prompt to generate an image. In such cases, you can use a language model to generate detailed prompts.

Prompt generation example: Give the LLM an instruction such as the following:

Write ten prompts for an image generation model. The prompts should describe a fictitious person named ZIKI in various scenarios. Make sure to use the word ZIKI in all caps in every prompt. Make the prompts highly detailed and interesting, and make them varied in subject matter. Make sure the prompts will generate images that include unobscured facial details. ZIKI is a 43 year old adult male. Include some reference to this in prompt to avoid misrepresenting ZIKI’s age or gender. Do not allude to ZIKI’s eye color.

This generates some interesting prompts:

Close-up of ZIKI, a male street artist in his 40s, spray-painting a vibrant mural on a city wall. His face shows intense concentration, with flecks of paint on his cheeks and forehead. He wears a respirator mask around his neck and a beanie on his head. The partially completed mural is visible behind him.

ZIKI, a dapper gentleman spy in his 40s, engaged in a high-stakes poker game in a luxurious Monte Carlo casino. His face betrays no emotion as he studies his cards, one eyebrow slightly raised. He wears a tailored tuxedo and a bow tie, with a martini glass on the table in front of him.

ZIKI, a distinguished-looking gentleman in his 40s, conducting a symphony orchestra. His expressive face shows intense concentration as he gestures dramatically with a baton. He wears a crisp tuxedo, and his salt-and-pepper hair is slightly disheveled from his passionate movements.

……
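The LLM instruction in this step can be generated from a template, and the prompts that come back can be filtered before they are sent to the image model. This is a sketch only: buildMetaPrompt and keepPromptsWithTrigger are hypothetical helpers, the template simply mirrors the wording of the instruction quoted in this step, and the call to the LLM itself is left out because the article does not name a specific one.

```javascript
// Hypothetical helper: fills the instruction text from this step with your own details.
function buildMetaPrompt({ triggerWord, description, count = 10 }) {
  return [
    `Write ${count} prompts for an image generation model.`,
    `The prompts should describe a fictitious person named ${triggerWord} in various scenarios.`,
    `Make sure to use the word ${triggerWord} in all caps in every prompt.`,
    "Make the prompts highly detailed and interesting, and make them varied in subject matter.",
    "Make sure the prompts will generate images that include unobscured facial details.",
    `${triggerWord} is ${description}.`,
    `Include some reference to this in the prompt to avoid misrepresenting ${triggerWord}'s age or gender.`,
    `Do not allude to ${triggerWord}'s eye color.`,
  ].join(" ");
}

// Hypothetical helper: keeps only non-empty prompts that contain the trigger word.
// The match is case-sensitive because the instruction asks for all caps.
function keepPromptsWithTrigger(prompts, triggerWord) {
  return prompts
    .map((prompt) => prompt.trim())
    .filter((prompt) => prompt.length > 0 && prompt.includes(triggerWord));
}

console.log(buildMetaPrompt({ triggerWord: "ZIKI", description: "a 43 year old adult male" }));
console.log(keepPromptsWithTrigger(["ZIKI, a dapper gentleman spy", "A man on a hill", " "], "ZIKI"));
```

Filtering matters because a prompt without the trigger word would not activate the fine-tuned concept.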
