
# ERNIE Image Turbo — Nunchaku W4A4 Quantized Inference


---

### Introduction

This project adds **W4A4 quantized inference** (4-bit weights and 4-bit activations) for [ERNIE Image Turbo](https://huggingface.co/baidu/ERNIE-Image-Turbo) to [**Nunchaku**](https://github.com/nunchaku-ai/nunchaku). In the reference benchmark below it runs 1.74x faster than the BF16 model, and the 4-bit weights reduce memory use, with minimal quality loss.

Built on [Nunchaku](https://github.com/nunchaku-ai/nunchaku). We gratefully acknowledge their excellent work on efficient diffusion model inference.


### Installation

```bash
# This fork adds ERNIE Image support to Nunchaku
git clone https://github.com/Hzj199/nunchaku.git
cd nunchaku
git submodule update --init --recursive

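pip install build
# --no-isolation builds in the current environment, so PyTorch and the CUDA toolkit must already be installed
NUNCHAKU_BUILD_WHEELS=1 python -m build --wheel --no-isolation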
pip install dist/nunchaku-*.whl
```
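After installing the wheel, a quick check can confirm that the packages used in the Quick Start are visible to the interpreter. The following is a minimal sketch using only the standard library; the helper name `check_install` is hypothetical and is not part of Nunchaku.

```python
from importlib.util import find_spec


def check_install(packages=("torch", "diffusers", "nunchaku")):
    """Return {package_name: bool} telling whether each top-level package can be found.

    find_spec only locates the package; it does not import it, so this is cheap
    and does not require a GPU.
    """
    return {name: find_spec(name) is not None for name in packages}


if __name__ == "__main__":
    for name, found in check_install().items():
        print(f"{name}: {'ok' if found else 'MISSING'}")
```

If `nunchaku` is reported as missing, the wheel was most likely installed into a different environment than the one running the script.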

### Quick Start

```python
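import torch
from diffusers.pipelines.ernie_image.pipeline_ernie_image import ErnieImagePipeline
from nunchaku import NunchakuErnieImageTransformer2DModel
from nunchaku.utils import get_precision

# Pick the quantization format for the current GPU: "int4" or "fp4"
precision = get_precision()
# SVDQuant low-rank branch rank, as encoded in the checkpoint file name (r64)
rank = 64

# Load the quantized transformer onto the GPU
transformer = NunchakuErnieImageTransformer2DModel.from_pretrained(
    f"ZJMuYun97/ERNIE-Image-Nunchaku/svdq-{precision}_r{rank}-ernie-image.safetensors",
    torch_dtype=torch.bfloat16,
    device="cuda",
)

# Build the pipeline around the quantized transformer.
# Passing None for pe and pe_tokenizer skips loading those components.
pipe = ErnieImagePipeline.from_pretrained(
    "baidu/ERNIE-Image-Turbo",
    transformer=transformer,
    torch_dtype=torch.bfloat16,
    pe=None, pe_tokenizer=None,
)
# Move the remaining components (text encoder, VAE) to the same device as the transformer
pipe.to("cuda")

image = pipe(
    prompt="a cute orange cat sitting on a sunlit windowsill",
    height=1024, width=1024,
    num_inference_steps=8,
    guidance_scale=1.0,
    generator=torch.Generator().manual_seed(42),  # fixed seed for reproducible output
).images[0]
image.save("ernie-image.png")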
```
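The checkpoint path in the Quick Start is assembled from the detected precision and the rank. The sketch below isolates that naming pattern in a small helper so a wrong value is caught before any download starts. The function name `svdq_checkpoint_path` is hypothetical, and it only assumes the `svdq-{precision}_r{rank}-ernie-image.safetensors` pattern shown above; whether a file exists in the repository for every precision and rank combination is not confirmed here.

```python
def svdq_checkpoint_path(
    precision: str,
    rank: int = 64,
    repo: str = "ZJMuYun97/ERNIE-Image-Nunchaku",
) -> str:
    """Build the checkpoint path following the naming pattern used in the Quick Start."""
    if precision not in ("int4", "fp4"):
        raise ValueError(f"precision must be 'int4' or 'fp4', got {precision!r}")
    if rank <= 0:
        raise ValueError(f"rank must be a positive integer, got {rank}")
    return f"{repo}/svdq-{precision}_r{rank}-ernie-image.safetensors"


print(svdq_checkpoint_path("int4"))
# ZJMuYun97/ERNIE-Image-Nunchaku/svdq-int4_r64-ernie-image.safetensors
```

The returned string can be passed as the first argument of `NunchakuErnieImageTransformer2DModel.from_pretrained`, exactly as in the Quick Start.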

### Performance (Reference)

Tested on a single A800 GPU, 1024×1024 resolution, 8 inference steps:

| Model | Avg latency (s) | Speedup |
|-------|-----------------|---------|
| Original BF16 | 4.89 | 1.00x |
| **Nunchaku W4A4** | **2.81** | **1.74x** |
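The speedup column is the ratio of the two average latencies. The sketch below shows that arithmetic together with a simple way to measure an average latency yourself. The helper names `average_latency` and `speedup` are hypothetical, and the warm-up and run counts are assumptions, since the measurement procedure behind the table is not described.

```python
import time


def average_latency(fn, warmup=1, runs=5):
    """Call fn() `warmup` times untimed, then return the mean wall-clock seconds over `runs` calls.

    For GPU work, fn should end with torch.cuda.synchronize(), because CUDA kernels
    run asynchronously and would otherwise not be fully counted.
    """
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    return (time.perf_counter() - start) / runs


def speedup(baseline_s, optimized_s):
    """Speedup factor of the optimized run relative to the baseline."""
    return baseline_s / optimized_s


# Values from the table above
print(f"{speedup(4.89, 2.81):.2f}x")  # 1.74x
```

In practice `fn` would wrap a full `pipe(...)` call with the same resolution and step count for both models.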

### Notes

- Only `batch_size=1` is supported, which matches the typical single-image inference use case.
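Because only one image can be generated per call, several prompts have to be processed one after another. The following is a minimal sketch of such a loop; `generate_sequentially` and `generate_one` are hypothetical names, and the per-prompt seed offset is just one possible convention.

```python
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


def generate_sequentially(
    generate_one: Callable[[str, int], T],
    prompts: Iterable[str],
    base_seed: int = 42,
) -> List[T]:
    """Run generate_one(prompt, seed) once per prompt, so every call uses batch size 1."""
    results = []
    for offset, prompt in enumerate(prompts):
        # Give each prompt its own reproducible seed
        results.append(generate_one(prompt, base_seed + offset))
    return results


# With the Quick Start pipeline, generate_one could look like this (not executed here):
#
# def generate_one(prompt, seed):
#     return pipe(
#         prompt=prompt, height=1024, width=1024,
#         num_inference_steps=8, guidance_scale=1.0,
#         generator=torch.Generator().manual_seed(seed),
#     ).images[0]
```

Keeping the loop separate from the pipeline call makes it easy to test with a stub function before running on a GPU.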