Alibaba Open-Sources Qwen-Image: AI Can Now Write Chinese Characters!


Alibaba has open-sourced Qwen-Image, and it has already climbed to first place on Hugging Face.

I tested it myself: it runs on a 12 GB 3060!

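Qwen-Image is the first image generation foundation model released by Alibaba's Tongyi Qianwen (Qwen) team. It is a 20B-parameter MMDiT (Multimodal Diffusion Transformer) model, open-sourced under the Apache 2.0 license. The model makes significant progress in complex text rendering and precise image editing, and it produces high-fidelity output in multiple languages, including English and Chinese.

Model highlights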

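  • Excellent multilingual text rendering: supports high-precision text generation in English, Chinese, Korean, Japanese, and other languages, preserving font detail and layout consistency
  • Diverse artistic styles: from photorealism to Impressionist painting, from anime aesthetics to minimalist design, it adapts fluidly to a wide range of creative prompts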

Models (cloud-drive download link at the end of the article):

Diffusion Model

  • qwen_image_fp8_e4m3fn.safetensors

Text Encoder

  • qwen_2.5_vl_7b_fp8_scaled.safetensors

VAE

  • qwen_image_vae.safetensors

After downloading, place the models in the following paths:

ComfyUI/
├── models/
│   ├── diffusion_models/
│   │   └── qwen_image_fp8_e4m3fn.safetensors
│   ├── vae/
│   │   └── qwen_image_vae.safetensors
│   └── text_encoders/
│       └── qwen_2.5_vl_7b_fp8_scaled.safetensors
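The layout above can be checked with a short script before launching ComfyUI. This is only a minimal sketch: the helper name `find_missing` and the default root folder `ComfyUI` are assumptions made for illustration, while the three relative paths are copied from the directory tree above.

```python
from pathlib import Path

# Relative locations copied from the directory layout above.
EXPECTED_FILES = [
    "models/diffusion_models/qwen_image_fp8_e4m3fn.safetensors",
    "models/vae/qwen_image_vae.safetensors",
    "models/text_encoders/qwen_2.5_vl_7b_fp8_scaled.safetensors",
]


def find_missing(comfyui_root):
    """Return the expected model files that are not present under comfyui_root."""
    root = Path(comfyui_root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    # "ComfyUI" is an assumed install location; change it to your own path.
    missing = find_missing("ComfyUI")
    if missing:
        print("Missing model files:")
        for rel in missing:
            print("  " + rel)
    else:
        print("All three model files are in place.")
```

If the script reports a missing file, the most likely cause is that the file was saved into the wrong subfolder of `models/`.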


1、A vibrant, warm neon-lit street scene in Hong Kong at the afternoon, with a mix of colorful Chinese and English signs glowing brightly. The atmosphere is lively, cinematic, and rain-washed with reflections on the pavement. The colors are vivid, full of pink, blue, red, and green hues. Crowded buildings with overlapping neon signs. 1980s Hong Kong style. Signs include:

“龍鳳冰室” “金華燒臘” “HAPPY HAIR” “鴻運茶餐廳” “EASY BAR” “永發魚蛋粉” “添記粥麵” “SUNSHINE MOTEL” “美都餐室” “富記糖水” “太平館” “雅芳髮型屋” “STAR KTV” “銀河娛樂城” “百樂門舞廳” “BUBBLE CAFE” “萬豪麻雀館” “CITY LIGHTS BAR” “瑞祥香燭莊” “文記文具” “GOLDEN JADE HOTEL” “LOVELY BEAUTY” “合興百貨” “興旺電器” And the background is warm yellow street and with all stores’ lights on.

2、A solitary teenage girl lies prone on a soft unmade bed, her upper body gently propped on folded arms, chin resting on the backs of her hands. She is dressed only in loose cotton panties, back fully exposed and illuminated by warm late-afternoon sunlight streaming through half-closed venetian blinds, creating rhythmic stripes across her smooth pale skin. The camera angle is low and intimate, three-quarter view from the foot of the bed, emphasizing the gentle S-curve of her spine.


Running vertically along the center of her back, between the shoulder blades and tapering to just above the small of her back, is a single-column blackletter tattoo that reads:

“大江东去,浪淘尽,千古风流人物。故垒西边,人道是,三国周郎赤壁。”

The text is rendered in sharp, high-contrast black ink with subtle dot-work shading at the edges, giving it a slightly embossed, engraved-steel appearance. Fine hairline cracks and minuscule white highlights suggest the tattoo is freshly healed yet already carrying the patina of personal history. Tiny individual serif flourishes curl like thorns on certain letters, hinting at a hidden floral motif.


The bedroom is softly cluttered: rumpled white linen sheets, a pale-grey knitted throw spilling onto a worn wooden floor, and a single open paperback novel (pages yellowed, spine cracked) lying beside her. A muted palette of dusty rose and slate-blue tones dominates, with warm golden rim light outlining the girl’s silhouette. Depth of field is shallow: the tattoo and the gentle rise of her shoulder blades are tack-sharp, while the background melts into creamy bokeh, evoking a quiet, private afternoon.


Shot on a 50 mm prime lens at f/1.8, ISO 100, 1/250 s. Natural light only, no flash. Fine cinematic grain, subtle lens flare where sunlight kisses the edges of her hair. Mood is introspective, serene, and slightly voyeuristic without being overtly sensual.

3、(Of course, this Alibaba model also supports Chinese prompts; the results are just slightly weaker.)

鲁迅拿着一块牌子,上面写着:我没说过这句话! (Lu Xun holding a sign that reads: "I never said this!")

When there is a lot of text, quite a few problems show up (possibly also because of the FP8 weights).

Still, the overall results are decent, and I hope the model keeps improving.

Cloud-drive download:

https://pan.quark.cn/s/0571d4670650



(Text: 路过银河AI)
