We are thrilled to release Qwen-Image, an image generation foundation model in the Qwen series that achieves significant advances in complex text rendering and precise image editing. Experiments show strong general capabilities in both image generation and editing, with exceptional performance in text rendering, especially for Chinese.
One of its standout capabilities is high-fidelity text rendering across diverse images. Whether it’s alphabetic languages like English or logographic scripts like Chinese, Qwen-Image preserves typographic details, layout coherence, and contextual harmony with stunning accuracy. Text isn’t just overlaid—it’s seamlessly integrated into the visual fabric.
Beyond text, Qwen-Image excels at general image generation with support for a wide range of artistic styles. From photorealistic scenes to impressionist paintings, from anime aesthetics to minimalist design, the model adapts fluidly to creative prompts, making it a versatile tool for artists, designers, and storytellers.
When it comes to image editing, Qwen-Image goes far beyond simple adjustments. It enables advanced operations such as style transfer, object insertion or removal, detail enhancement, text editing within images, and even human pose manipulation—all with intuitive input and coherent output. This level of control brings professional-grade editing within reach of everyday users.
But Qwen-Image doesn’t just create or edit—it understands. It supports a suite of image understanding tasks, including object detection, semantic segmentation, depth and edge (Canny) estimation, novel view synthesis, and super-resolution. These capabilities, while technically distinct, can all be seen as specialized forms of intelligent image editing, powered by deep visual comprehension.
Together, these features make Qwen-Image not just a tool for generating pretty pictures, but a comprehensive foundation model for intelligent visual creation and manipulation—where language, layout, and imagery converge.
License Agreement
Qwen-Image is licensed under Apache 2.0.
Original model card and weights: https://huggingface.co/Qwen/Qwen-Image
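Several commenters below ask which program the model works in. Besides ComfyUI, the checkpoint linked above can be loaded with Hugging Face diffusers. The sketch below is a minimal text-to-image example, assuming a recent diffusers build with Qwen-Image support and a CUDA GPU; the default resolution, step count, and the `true_cfg_scale` parameter name are illustrative assumptions, not official guidance.

```python
def generation_kwargs(prompt: str, width: int = 1024, height: int = 1024,
                      steps: int = 50, cfg: float = 4.0) -> dict:
    """Collect sampling parameters; the defaults here are illustrative, not official."""
    return {
        "prompt": prompt,
        "width": width,
        "height": height,
        "num_inference_steps": steps,
        "true_cfg_scale": cfg,  # assumed name for the guidance-scale parameter
    }

def run_demo() -> None:
    """Download the weights and render one image (needs diffusers and a CUDA GPU)."""
    import torch
    from diffusers import DiffusionPipeline

    # DiffusionPipeline resolves the concrete pipeline class from the repo config.
    pipe = DiffusionPipeline.from_pretrained("Qwen/Qwen-Image",
                                             torch_dtype=torch.bfloat16)
    pipe.to("cuda")
    image = pipe(**generation_kwargs(
        'A neon coffee-shop sign that reads "Qwen-Image"')).images[0]
    image.save("qwen_image_demo.png")
```

Call `run_demo()` on a machine with enough VRAM; the sampling parameters are split into a helper so they can be tweaked without touching the pipeline setup.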
Comments (13)
Isn't that a duplicate of this model?
https://civitai.com/models/1843568/qwenimagefp8e4m3fn?modelVersionId=2086298
Maybe, but we need models hosted on an official account to be able to add it to the Civitai Generator. Generations made with Qwen-Image will route to the gallery of this model.
theally I guess we should also link externally generated images to this model then 👍
theally are we getting an official model page for Wan 2.2? I can currently only post my gens to a workflow, not a model.
First time I've heard of this model. Which program does it work in?
ComfyUI Nightly (the Stable version probably already supports it)
homoludens it also works in Wan2GP 👍😊
Huge model with great potential
Seems very interesting.
The second image isn't very clear to me; the pictures look like they were made with ControlNet.
Where can we even see example workflows for the image editing shown above? The one presented here is just text2image, as I understand it?
Hey, yes, I'm waiting for more information about how that works. Standard img2img flows work with it, but according to the blurb, it should mimic the capabilities of Flux Kontext somewhat.
Unless something has changed from one day to the next, the released Qwen-Image checkpoints only cover txt2img and img2img workflows; the Instructed Image Manipulation model will be released to the public (rather than just being available on their web platform) "at a later date(tm)".
VeerGeer this is what I have heard as well.
