Trying out Flux.1 [Schnell]

A cup of coffee

I ran Flux.1 [Schnell] with stable-diffusion.cpp to generate an image that contains text. I followed the steps in https://github.com/leejet/stable-diffusion.cpp/blob/master/docs/flux.md as written.

On an M1 Max, generating a 512x512 image takes about a minute and a half, and a 768x768 image takes about two and a half minutes.

On Ubuntu with CUDA (GPU), generation took roughly the same amount of time.

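On macOS, the sd command has to be built with the Metal option enabled. With an sd binary built without Metal support, a single image took more than 10 minutes. Building with Metal support was very easy; all it needs is the cmake command.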
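
As a rough sketch of that build step: the functions below choose the cmake option by OS and run the build from the root of a stable-diffusion.cpp checkout. The option name `-DSD_METAL=ON` is what I recall from the project README, so treat it as an assumption and confirm it there. `backend_flag` and `build_sd` are hypothetical helper names, not part of the project.

```shell
#!/bin/bash
# Hypothetical helper: print the cmake option for the given OS name.
# -DSD_METAL=ON is assumed from the stable-diffusion.cpp README; verify it there.
backend_flag() {
  case "$1" in
    Darwin) echo "-DSD_METAL=ON" ;;  # macOS: enable Metal
    *)      echo "" ;;               # other OS: plain CPU build in this sketch
  esac
}

# Hypothetical helper: configure and build inside ./build.
# Run it from the repository root (assumes the repo was cloned with its submodules).
build_sd() {
  local flag
  flag=$(backend_flag "$(uname -s)")
  # $flag is left unquoted on purpose so an empty value adds no argument.
  cmake -B build $flag &&
  cmake --build build --config Release
}
```

Calling `build_sd` should leave the binary under `build/bin/`, which matches the `./bin/sd` path used later if you run the script from inside `build`. Other backends such as CUDA have their own options in the README and are left out here.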

I have not tried the prebuilt release binaries. The macOS ones may already support Metal, in which case building it yourself would be unnecessary (unverified).
stable-diffusion.cpp releases: https://github.com/leejet/stable-diffusion.cpp/releases/

The prompt that generated the image at the top:

a cup of coffee on the small wood dining table with a candle at night in the small lodge,
the cup side printed 'TULLY'S COFFEE',
high quality detail, art anime style,
best quality, 4k resolution

A script like the following produces it.

#!/bin/bash
prompt="a cup of coffee on the small wood dining table with a candle at night in the small lodge, the cup side printed 'TULLY'S COFFEE', high quality detail, art anime style, best quality, 4k resolution"

wh=512
seed=46

outfile="a-cup-of-coffee.png"

dir=/path/to/models

model=flux1-schnell-q8_0.gguf
vae=ae.safetensors
clip_l=clip_l.safetensors
t5xxl=t5xxl_fp16.safetensors

# Quote every expansion so paths containing spaces still work.
./bin/sd \
  --output "$outfile" \
  --seed "$seed" \
  --diffusion-model "$dir/$model" \
  --vae "$dir/$vae" \
  --clip_l "$dir/$clip_l" \
  --t5xxl "$dir/$t5xxl" \
  --cfg-scale 1 \
  --steps 6 \
  --sampling-method euler \
  -H "$wh" -W "$wh" \
  -p "$prompt"
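
To compare the 512x512 and 768x768 timings mentioned earlier, the run can be wrapped in a loop. This is only a sketch: `outfile_for` and `time_sizes` are hypothetical helper names, and the only sd options it uses are the ones already shown in the script above.

```shell
#!/bin/bash
# Hypothetical helper: build an output file name from size and seed.
outfile_for() {
  echo "a-cup-of-coffee-${1}x${1}-seed${2}.png"
}

# Hypothetical helper: run sd once per size and print the elapsed seconds.
# The first argument is the seed; the remaining arguments are passed to sd
# unchanged (model paths, --steps, -p "$prompt", and so on).
time_sizes() {
  local seed=$1; shift
  local wh start
  for wh in 512 768; do
    start=$SECONDS
    ./bin/sd --output "$(outfile_for "$wh" "$seed")" --seed "$seed" \
      -H "$wh" -W "$wh" "$@"
    echo "${wh}x${wh}: $((SECONDS - start)) s"
  done
}
```

For example, `time_sizes 46 --diffusion-model "$dir/$model" --vae "$dir/$vae" --clip_l "$dir/$clip_l" --t5xxl "$dir/$t5xxl" --cfg-scale 1 --steps 6 --sampling-method euler -p "$prompt"` would write one image per size and print how long each took.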

That's all.
