leejet commented Apr 15, 2026
Tried some quants for turbo with the flux2 VAE smaller decoder (#1402):
Quants work really well with this model. Must be the arch.
---
Interesting: using the text_encoder (the .safetensors) seems to fail to load, as all the data appears to live in a sub-node "language_model". Using a different Ministral-3B does work, though. Failing: Working: While this works, as soon as I provide any of the officially supported width/height parameters (-W 1024 -H 1024) I only get a white output... Removing the "-W" and "-H" parameters makes it all work again (the default output seems to be 512x512, so that would check out).
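If the failing text encoders really just nest every tensor under a `language_model` sub-node, renaming the keys before loading may be all that is needed. A minimal sketch of the renaming logic in plain Python (`strip_prefix` and the sample tensor names are hypothetical; the actual safetensors I/O is omitted):

```python
def strip_prefix(tensors, prefix="language_model."):
    """Drop a leading 'language_model.' from every tensor name so that
    loaders expecting top-level names can find the weights."""
    return {
        (name[len(prefix):] if name.startswith(prefix) else name): t
        for name, t in tensors.items()
    }

# Example with dummy tensor names:
fixed = strip_prefix({
    "language_model.model.embed_tokens.weight": "w0",
    "lm_head.weight": "w1",
})
print(sorted(fixed))  # → ['lm_head.weight', 'model.embed_tokens.weight']
```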
---
It would probably be a little faster once sd.cpp syncs with GGML and incorporates the latest optimizations.
Done. The prompt enhancer isn’t built into sd.cpp; it’s just standard LLM-based prompt expansion and can be done via tools like llama.cpp or ChatGPT / Gemini.
@kuhnchris This is just a naming convention issue. You can download the compatible
---
For the fun of it, instead of using a VAE, I tried using the taef2 TAE with
---
I encountered some numerical issues on CUDA.
Important here seem to be the size (full HD 😄) and the step count. ernie-image-Q4_K_M: tried 25 steps and it disintegrates at step 24. A smaller size that still shows artifacts is 1600x1152. Setting the correct flow shift value
---
I converted their Ministral prompt-enhancer finetune to GGUF for llama.cpp: https://huggingface.co/Green-Sky/Ernie-Image-Prompt-Enhancer-Ministral-3B-GGUF It contains the correct system prompt. Run it like this: One issue I have is that it really likes to produce drawings, even when "photograph" is specified. Also, features specified in the prompt never show up. Using the prompt translated to English works, however. E.g. "A photo of a lovely cat" result:
firefox translation:
image (not using the translation): This happens very consistently and strongly smells like a tokenizer issue for Han script.
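The exact run command above was attached as an image and did not survive extraction. A minimal llama.cpp invocation along these lines should work; the GGUF filename is an assumption (use whichever quant you downloaded from the repo):

```shell
# Sketch only: the model filename is a placeholder for the downloaded quant.
# If your llama-cli build defaults to conversation mode, the chat template /
# system prompt baked into the GGUF is applied automatically.
llama-cli \
  -m Ernie-Image-Prompt-Enhancer-Ministral-3B-Q8_0.gguf \
  -p "A photo of a lovely cat" \
  -n 256 --temp 0.7
```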
---
You can use the mmproj files from the base model to prompt it with an existing image too, i.e. captioning. But while testing that, and asking it (with the request in the JSON format) to create a prompt for the image, it glitched and created a prompt of itself. 😆 command: resulting prompt:
translated to English:
resulting image from the English translation: better working command: resulting prompt:
translated to English:
So it kinda works. https://huggingface.co/unsloth/Ministral-3-3B-Instruct-2512-GGUF/blob/main/mmproj-F16.gguf
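The captioning setup above can be sketched roughly as follows, using llama.cpp's multimodal frontend; the model filename, image path, and prompt are assumptions (only the mmproj filename comes from the link above):

```shell
# Sketch: model and image paths are placeholders; mmproj-F16.gguf is the
# projector file from the base-model repo linked above.
llama-mtmd-cli \
  -m Ministral-3-3B-Instruct-2512-Q8_0.gguf \
  --mmproj mmproj-F16.gguf \
  --image input.png \
  -p "Describe this image as an image-generation prompt."
```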
---
leejet wrote: "The prompt enhancer isn’t built into sd.cpp; it’s just standard LLM-based prompt expansion and can be done via tools like llama.cpp or ChatGPT / Gemini." Can anybody help me with these steps? Converting the PE.safetensors file with `sd -M convert -m PE.safetensors -o PE.gguf` was not good; I got this error from llama-cli: `llama_model_load: error loading model: error loading model architecture: unknown model architecture: ''` So how can we convert it, and how can we use it after conversion?
---
@ZahyGabi just read my last comment above yours. :)
And a follow-up question: what was the trick with the PE conversion? (Actually, may I ask for the exact command line?)
It is all llama.cpp. For safetensors-to-GGUF conversion you need the convert_hf_to_gguf.py Python script. Once you have your GGUF, you can quantize it with the llama.cpp binaries. I don't have the commands anymore, but I uploaded good files. Do keep in mind, though, that the sd.cpp tokenizer for Chinese seems to be broken, as described in my comment above.
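For reference, a conversion recipe along these lines should work. The directory and output names are assumptions; note that convert_hf_to_gguf.py expects a full Hugging Face model folder (config.json, tokenizer files, weights), not a lone PE.safetensors file, which would explain the "unknown model architecture" error from the `sd -M convert` attempt:

```shell
# Sketch, assuming ./prompt-enhancer-hf is a complete HF model directory.
python convert_hf_to_gguf.py ./prompt-enhancer-hf \
  --outfile PE-f16.gguf --outtype f16

# Optional: quantize with the llama.cpp binaries.
llama-quantize PE-f16.gguf PE-Q4_K_M.gguf Q4_K_M
```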













