ternary-models: VLMs, Multimodal & Audio (collection)
Ternary-quantized models for architectures GGUF can't handle, using the tritplane3 scheme. 16 items.
How to use AsadIsmail/Wan2.2-TI2V-5B-ternary with Diffusers:

```bash
pip install -U diffusers transformers accelerate
```

```python
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained("AsadIsmail/Wan2.2-TI2V-5B-ternary", torch_dtype=torch.bfloat16)
pipe.to("cuda")  # switch to "mps" for Apple devices

prompt = "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k"
# This is a video pipeline, so the result exposes .frames rather than .images
frames = pipe(prompt).frames[0]
```
First publicly available ternary-quantized Wan 2.2 model on Hugging Face.
Ternary-quantized version of Wan-AI/Wan2.2-TI2V-5B-Diffusers, Alibaba's latest text-image-to-video DiT model (5B parameters; the original has 572 likes).
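For readers new to the term, ternary quantization replaces each weight with one of three values, -1, 0 or +1, multiplied by a shared scale. The sketch below uses a generic absmean rule purely for illustration. It is not the tritplane3 scheme, whose details this card does not describe, and the function names `ternarize` and `dequantize` are hypothetical.

```python
from typing import List, Tuple


def ternarize(weights: List[float]) -> Tuple[List[int], float]:
    """Map weights to {-1, 0, +1} plus one shared scale (generic absmean rule).

    Illustration only: this is NOT the tritplane3 scheme used for this model.
    """
    scale = sum(abs(w) for w in weights) / len(weights)
    if scale == 0.0:
        return [0] * len(weights), 0.0
    # Round w / scale to the nearest integer, then clamp into [-1, 1].
    trits = [max(-1, min(1, round(w / scale))) for w in weights]
    return trits, scale


def dequantize(trits: List[int], scale: float) -> List[float]:
    """Rebuild approximate floating-point weights from trits and the scale."""
    return [t * scale for t in trits]


weights = [0.9, -0.05, -1.2, 0.4]
trits, scale = ternarize(weights)
print(trits)  # [1, 0, -1, 1]
print(dequantize(trits, scale))
```

Dequantizing gives back ordinary floating-point tensors, which is why a checkpoint stored in dequantized form can be loaded by a standard pipeline without custom kernels.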
| Property | Value |
|---|---|
| Base Model | Wan-AI/Wan2.2-TI2V-5B-Diffusers |
| Architecture | WanTransformer3DModel (DiT) |
| Transformer Params | 5.00B |
| Quantization | tritplane3 (306 linear layers) |
| Text Encoder (UMT5-XXL) | FP16 (preserved) |
| VAE (WanVAE) | FP16 (preserved) |
| License | Apache 2.0 |

Transformer size by storage format:

| Method | Transformer Size |
|---|---|
| FP16 (original) | 10.02 GB |
| Ternary tritplane3 (theoretical packed) | ~5.0 GB |
| In this repo (dequantized FP16) | 9.4 GB |
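To compare the rows of the size table, it can help to convert each one into an average storage cost per parameter. The short sketch below does that arithmetic from the figures in this card, assuming 1 GB = 1e9 bytes (the card does not say whether GB or GiB is meant). The helper `bits_per_param` is a name introduced here for illustration.

```python
# Transformer parameter count, taken from the properties table in this card.
PARAMS = 5.00e9


def bits_per_param(size_gb: float, params: float = PARAMS) -> float:
    """Average storage cost per parameter in bits, assuming 1 GB = 1e9 bytes."""
    return size_gb * 1e9 * 8 / params


# Sizes in GB, copied from the size table in this card.
sizes_gb = {
    "FP16 (original)": 10.02,
    "Ternary tritplane3 (theoretical packed)": 5.0,
    "In this repo (dequantized FP16)": 9.4,
}

for name, gb in sizes_gb.items():
    print(f"{name}: {bits_per_param(gb):.2f} bits/param")
```

Under that assumption the three rows come out at roughly 16, 8 and 15 bits per parameter, so the files in this repo cost about as much to store as FP16 weights, and the smaller figure applies only to the theoretical packed form.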
```python
import torch
from diffusers import WanPipeline
from diffusers.utils import export_to_video

pipe = WanPipeline.from_pretrained(
    "AsadIsmail/Wan2.2-TI2V-5B-ternary",
    torch_dtype=torch.bfloat16,
)
pipe.to("mps")  # or "cuda"

output = pipe(
    prompt="a cat walking on green grass",
    num_frames=81,
    num_inference_steps=30,
).frames[0]
export_to_video(output, "output.mp4", fps=16)
```
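When changing `num_frames` or `fps` in the example above, a quick calculation shows how long the exported clip will be. The sketch below also includes a hypothetical helper that snaps a requested frame count to the form 4k + 1. That constraint is an assumption based on the 4x temporal compression commonly described for Wan VAEs, and it is not stated in this card.

```python
def clip_seconds(num_frames: int, fps: int) -> float:
    """Playback length in seconds of a clip exported at the given frame rate."""
    return num_frames / fps


def largest_valid_frames(num_frames: int, temporal_factor: int = 4) -> int:
    """Largest count of the form temporal_factor * k + 1 not above num_frames.

    The 4k + 1 constraint is an assumption, not something this card confirms.
    """
    if num_frames < 1:
        raise ValueError("num_frames must be at least 1")
    return (num_frames - 1) // temporal_factor * temporal_factor + 1


print(clip_seconds(81, 16))      # 5.0625
print(largest_valid_frames(81))  # 81
print(largest_valid_frames(84))  # 81
```

With the settings used in this card (81 frames at 16 fps), the output is a clip of about five seconds.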
Part of ternary-models.