    svdq-int4_r64-ernie-image - v1.0
    # ERNIE Image Turbo — Nunchaku W4A4 Quantized Inference
    
    
    ---
    
    ### Introduction
    
    This project adds **W4A4 quantized inference** (4-bit weights, 4-bit activations) for [ERNIE Image Turbo](https://huggingface.co/baidu/ERNIE-Image-Turbo) to [**Nunchaku**](https://github.com/nunchaku-ai/nunchaku). In the reference benchmark under Performance it runs 1.74x faster than the BF16 baseline, and it also reduces memory use with minimal quality loss.
    
    Built on [Nunchaku](https://github.com/nunchaku-ai/nunchaku). We gratefully acknowledge their excellent work on efficient diffusion model inference.
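
As a rough illustration of where the memory reduction comes from: 4-bit weights take a quarter of the space of 16-bit BF16 weights. The sketch below is a back-of-the-envelope estimate only. The parameter count is a hypothetical placeholder, not ERNIE Image Turbo's actual size, and the estimate ignores quantization scales, the extra low-rank (`rank = 64`) weights, and any layers kept in higher precision.

```python
def weight_bytes(num_params: int, bits_per_weight: int) -> int:
    """Bytes needed to store num_params weights at the given bit width."""
    return num_params * bits_per_weight // 8


# Hypothetical parameter count, for illustration only.
NUM_PARAMS = 1_000_000_000

bf16_bytes = weight_bytes(NUM_PARAMS, 16)
w4_bytes = weight_bytes(NUM_PARAMS, 4)

print(f"BF16 weights : {bf16_bytes / 1e9:.2f} GB")
print(f"4-bit weights: {w4_bytes / 1e9:.2f} GB")
print(f"ratio        : {bf16_bytes / w4_bytes:.1f}x")
```

Real savings will be smaller than the 4x ratio printed here, because of the overheads the estimate leaves out.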
    
    
    ### Installation
    
    ```bash
    # This fork adds ERNIE Image support to Nunchaku
    git clone https://github.com/Hzj199/nunchaku.git
    cd nunchaku
    git submodule update --init --recursive
    
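    # --no-isolation builds in the current environment, so build requirements
    # (such as PyTorch) must already be installed there
    pip install build
    NUNCHAKU_BUILD_WHEELS=1 python -m build --wheel --no-isolation
    pip install dist/nunchaku-*.whl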
    ```
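
After installing the wheel, a quick check can confirm that the fork, not an upstream build, is the one on the import path. This is a minimal sketch: `has_ernie_support` is a hypothetical helper name, and it only tests for the class that the Quick Start imports.

```python
import importlib


def has_ernie_support() -> bool:
    """Return True if the installed nunchaku exposes the ERNIE Image transformer."""
    try:
        nunchaku = importlib.import_module("nunchaku")
    except ImportError:
        # nunchaku is missing or failed to import
        return False
    return hasattr(nunchaku, "NunchakuErnieImageTransformer2DModel")


if __name__ == "__main__":
    print("ERNIE Image support available:", has_ernie_support())
```

If this prints `False` even though `nunchaku` is installed, the environment is probably picking up a build without the ERNIE Image changes.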
    
    ### Quick Start
    
    ```python
    import torch
    from diffusers.pipelines.ernie_image.pipeline_ernie_image import ErnieImagePipeline
    from nunchaku import NunchakuErnieImageTransformer2DModel
    from nunchaku.utils import get_precision
    
    precision = get_precision()  # auto-detect: "int4" or "fp4"
    rank = 64
    
    transformer = NunchakuErnieImageTransformer2DModel.from_pretrained(
        f"ZJMuYun97/ERNIE-Image-Nunchaku/svdq-{precision}_r{rank}-ernie-image.safetensors",
        torch_dtype=torch.bfloat16,
        device="cuda",
    )
    
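    # Passing None for pe / pe_tokenizer tells diffusers not to load those components.
    pipe = ErnieImagePipeline.from_pretrained(
        "baidu/ERNIE-Image-Turbo",
        transformer=transformer,
        torch_dtype=torch.bfloat16,
        pe=None, pe_tokenizer=None,
    ).to("cuda")  # move the remaining components to the same device as the transformer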
    
    image = pipe(
        prompt="a cute orange cat sitting on a sunlit windowsill",
        height=1024, width=1024,
        num_inference_steps=8,
        guidance_scale=1.0,
        generator=torch.Generator().manual_seed(42),
    ).images[0]
    image.save("ernie-image.png")
    ```
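
To make the checkpoint naming in the Quick Start explicit, the sketch below rebuilds the same path string for each value `get_precision()` may return. `checkpoint_path` is a hypothetical helper, not part of Nunchaku. Which precision and rank combinations are actually published is not stated here, so treat any path other than the int4, rank 64 one as unverified.

```python
def checkpoint_path(precision: str, rank: int = 64) -> str:
    """Build the checkpoint path the same way the Quick Start f-string does."""
    repo = "ZJMuYun97/ERNIE-Image-Nunchaku"
    return f"{repo}/svdq-{precision}_r{rank}-ernie-image.safetensors"


print(checkpoint_path("int4"))
print(checkpoint_path("fp4"))
```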
    
    ### Performance (Reference)
    
    Tested on a single A800 GPU, 1024×1024 resolution, 8 inference steps:
    
    | Model | Avg Latency | Speedup |
    |-------|-------------|---------|
    | Original BF16 | 4.89s | 1.0x |
    | **Nunchaku W4A4** | **2.81s** | **1.74x** |
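
The speedup column is the BF16 latency divided by the W4A4 latency. The sketch below reproduces that arithmetic and shows one way an average latency could be measured. The warmup and run counts are assumptions, since the measurement procedure behind the table is not described, and `average_latency` is a hypothetical helper.

```python
import statistics
import time
from typing import Callable, Optional


def average_latency(
    run_once: Callable[[], object],
    warmup: int = 2,
    runs: int = 5,
    sync: Optional[Callable[[], None]] = None,
) -> float:
    """Mean wall-clock seconds per call of run_once, after warmup calls."""
    for _ in range(warmup):
        run_once()
    timings = []
    for _ in range(runs):
        if sync is not None:
            sync()  # drain pending GPU work before starting the clock
        start = time.perf_counter()
        run_once()
        if sync is not None:
            sync()  # wait for the GPU to finish before stopping the clock
        timings.append(time.perf_counter() - start)
    return statistics.mean(timings)


def speedup(baseline_s: float, optimized_s: float) -> float:
    """Ratio of baseline latency to optimized latency."""
    return baseline_s / optimized_s


print(f"{speedup(4.89, 2.81):.2f}x")  # 1.74x
```

With the Quick Start pipeline, `run_once` could wrap the `pipe(...)` call and `sync` could be `torch.cuda.synchronize`, so that asynchronous GPU work is included in the timing.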
    
    ### Notes
    
    - Only `batch_size=1` is supported, which matches the typical single-image inference use case.
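
Given the batch size limit, several prompts can still be processed by calling the pipeline once per prompt. This is a minimal sketch: `generate_one_at_a_time` is a hypothetical helper, and it assumes only that the pipeline accepts a `prompt` keyword and returns an object with an `images` list, as in the Quick Start.

```python
from typing import Any, Callable, Iterable, List


def generate_one_at_a_time(
    pipe: Callable[..., Any],
    prompts: Iterable[str],
    **pipe_kwargs: Any,
) -> List[Any]:
    """Call pipe once per prompt so every call has batch size 1."""
    images = []
    for prompt in prompts:
        result = pipe(prompt=prompt, **pipe_kwargs)
        images.append(result.images[0])
    return images
```

For example, `generate_one_at_a_time(pipe, ["a cat", "a dog"], height=1024, width=1024, num_inference_steps=8, guidance_scale=1.0)` would return two images, each generated in its own call.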

    ### Details

    - Checkpoint, Ernie
    - Downloads: 17
    - Platform: CivitAI
    - Platform status: Available
    - Created: 5/20/2026
    - Updated: 5/23/2026
    - File: svdqInt4R64Ernie_v10.safetensors