Z-Image Turbo
Sub-second generation at 1 credit per image. The #1 ranked open-source text-to-image model — built on a 6B-parameter S3-DiT architecture with 8-step inference, delivering photorealistic quality at unmatched speed and minimal cost.
Powered by Alibaba's Tongyi MAI open-source foundation model
Lightning Speed, Rock-Bottom Cost
Z-Image Turbo distills a 6B-parameter foundation model down to just 8 inference steps using Decoupled-DMD plus reinforcement learning. The result: sub-second latency on enterprise GPUs, with photorealistic output that rivals models taking 10x longer to generate.
What Makes It Special
Four core strengths that set Z-Image Turbo apart
Ultra-Fast Inference
Only 8 inference steps via the S3-DiT (Scalable Single-Stream Diffusion Transformer) architecture. Sub-second latency on enterprise GPUs, and it runs comfortably on consumer devices with 16GB of VRAM.
Bilingual Text Rendering
Accurately renders complex Chinese and English text within generated images. Ideal for marketing banners, posters, social media graphics, and any scenario requiring precise in-image typography.
Prompt Reasoning Enhancement
Built-in reasoning capability that goes beyond surface-level descriptions. The model taps into world knowledge to understand context, spatial relationships, and implicit details in your prompts.
Fully Open Source
Released under the Apache 2.0 license with full model weights on Hugging Face. At 6B parameters, the model fits on consumer-grade GPUs (16GB VRAM). Free to use, modify, and deploy commercially.
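Since the weights are public on Hugging Face, local generation can be sketched with the diffusers library. This is a minimal sketch, not verified against the official model card: the repo id "Tongyi-MAI/Z-Image-Turbo" and the exact pipeline class the weights resolve to are assumptions — check the Hugging Face model page for the canonical loading code.

```python
# Hypothetical local-inference sketch for Z-Image Turbo via diffusers.
# Assumes: a CUDA GPU with ~16GB VRAM, and that the repo id below is correct.
import torch
from diffusers import DiffusionPipeline

# DiffusionPipeline.from_pretrained auto-resolves the pipeline class
# declared in the repo's model_index.json.
pipe = DiffusionPipeline.from_pretrained(
    "Tongyi-MAI/Z-Image-Turbo",   # assumed repo id
    torch_dtype=torch.bfloat16,   # half-precision to fit in 16GB VRAM
)
pipe.to("cuda")

# Turbo distillation means only 8 denoising steps are needed.
image = pipe(
    prompt="A neon-lit street market at night, photorealistic",
    num_inference_steps=8,
).images[0]
image.save("z_image_turbo_sample.png")
```

The 8-step setting is the key practical difference from non-distilled diffusion models, which typically need 25–50 steps per image.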
Start Creating with Z-Image Turbo
The fastest AI image generation at the lowest cost: 1 credit per image, sub-second generation.