
GGUF Header Edit Benchmark

A benchmark script that measures how long it takes to edit GGUF headers in place on Hugging Face using streaming blobs (Xet), creating one pull request per file.
For each file, it fetches the metadata, rebuilds the header with a small change, commits an edit that covers only the header slice, and records the timing to a CSV file.
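
To make the "header slice only" idea concrete, here is a minimal sketch of what such an edit amounts to at the byte level: only the range that held the old header is replaced, and every byte after it is reused unchanged. It works on an in-memory `Uint8Array` purely for illustration. `SliceEdit` and `applySliceEdit` are hypothetical names, not part of `@huggingface/hub`, and the real script edits a remote blob rather than a local buffer.

```typescript
// Hypothetical illustration of a slice edit; this is NOT the @huggingface/hub API.
interface SliceEdit {
  start: number; // first byte to replace (inclusive)
  end: number; // end of the replaced range (exclusive)
  content: Uint8Array; // replacement bytes; may differ in length from the range
}

function applySliceEdit(original: Uint8Array, edit: SliceEdit): Uint8Array {
  if (edit.start < 0 || edit.end < edit.start || edit.end > original.length) {
    throw new RangeError("edit range is outside the original buffer");
  }
  const tail = original.subarray(edit.end); // untouched bytes after the edit
  const out = new Uint8Array(edit.start + edit.content.length + tail.length);
  out.set(original.subarray(0, edit.start), 0);
  out.set(edit.content, edit.start);
  out.set(tail, edit.start + edit.content.length);
  return out;
}

// Toy file: a 4-byte "header" followed by 4 bytes standing in for tensor data.
const file = Uint8Array.from([1, 2, 3, 4, 9, 9, 9, 9]);
const newHeader = Uint8Array.from([5, 6, 7, 8, 0]); // one byte longer than before
const edited = applySliceEdit(file, { start: 0, end: 4, content: newHeader });
console.log(Array.from(edited)); // prints [5, 6, 7, 8, 0, 9, 9, 9, 9]
```

In the benchmark the same idea is applied to the file stored on the Hub, which is what lets the commit avoid re-uploading the tensor data.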

Results from benchmark.ts

Rule of thumb (linear fit):
time_minutes ≈ 0.36 × size_GB + 0.25

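| Model Size (GB) | Time (minutes) |
|---|---|
| 0.5 | 0.28 |
| 1.0 | 0.47 |
| 1.5 | 0.24 |
| 2.0 | 1.06 |
| 2.5 | 1.29 |
| 3.0 | 1.43 |
| 3.5 | 1.59 |
| 4.0 | 1.61 |
| 4.5 | 1.82 |
| 5.0 | 1.98 |
| 5.5 | 2.10 |
| 6.0 | 2.18 |
| 6.5 | 2.14 |
| 7.0 | 4.73 |
| 7.5 | 5.04 |
| 8.0 | 2.71 |
| 8.5 | 2.75 |
| 9.0 | 3.03 |
| 9.5 | 3.11 |
| 10.0 | 3.24 |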
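
The rule of thumb can be reproduced from the measurements with an ordinary least-squares fit. The sketch below copies the twenty data points from the results and computes the slope and intercept directly; `linearFit` is a hypothetical helper written for this example, not something taken from benchmark.ts.

```typescript
// Data copied from the results above: [size in GB, time in minutes].
const results: Array<[number, number]> = [
  [0.5, 0.28], [1.0, 0.47], [1.5, 0.24], [2.0, 1.06], [2.5, 1.29],
  [3.0, 1.43], [3.5, 1.59], [4.0, 1.61], [4.5, 1.82], [5.0, 1.98],
  [5.5, 2.10], [6.0, 2.18], [6.5, 2.14], [7.0, 4.73], [7.5, 5.04],
  [8.0, 2.71], [8.5, 2.75], [9.0, 3.03], [9.5, 3.11], [10.0, 3.24],
];

// Ordinary least squares for y = slope * x + intercept.
function linearFit(points: Array<[number, number]>): { slope: number; intercept: number } {
  const n = points.length;
  const meanX = points.reduce((sum, p) => sum + p[0], 0) / n;
  const meanY = points.reduce((sum, p) => sum + p[1], 0) / n;
  let sxy = 0;
  let sxx = 0;
  for (const [x, y] of points) {
    sxy += (x - meanX) * (y - meanY);
    sxx += (x - meanX) * (x - meanX);
  }
  const slope = sxy / sxx;
  return { slope, intercept: meanY - slope * meanX };
}

const fit = linearFit(results);
// prints "time_minutes ≈ 0.36 × size_GB + 0.25"
console.log(`time_minutes ≈ ${fit.slope.toFixed(2)} × size_GB + ${fit.intercept.toFixed(2)}`);
```

Note that the 7.0 GB and 7.5 GB measurements sit well above the fitted line, so the fit is best read as a rough estimate rather than a precise prediction.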

✨ What this does

For each *.gguf file in a model repo:

  1. Discover files via the Hugging Face model tree API.
  2. Fetch the GGUF metadata, including typed metadata, with @huggingface/gguf.
  3. Rebuild the header using buildGgufHeader (preserving endianness, alignment, and tensor info range).
  4. Commit a slice edit (header bytes only) using commitIter with useXet: true to avoid full re-uploads.
  5. Create a PR titled "benchmark".
  6. Record the wall-clock time in benchmark-results.csv.
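
The last step, timing a task and writing one CSV row per file, can be sketched as follows. This is an illustration under stated assumptions, not the code in benchmark.ts: `timed`, `csvField`, `toCsv`, the column names, the example file name, and the choice of 1 GB = 10^9 bytes are all hypothetical.

```typescript
// Hypothetical row shape; the actual CSV schema of the script is not shown in this README.
interface BenchmarkRow {
  file: string;
  sizeBytes: number;
  durationMs: number;
}

// Run an async task and return its result together with the elapsed wall-clock time.
async function timed<T>(task: () => Promise<T>): Promise<{ result: T; durationMs: number }> {
  const start = Date.now();
  const result = await task();
  return { result, durationMs: Date.now() - start };
}

// Quote a field if it contains a comma, a double quote, or a newline.
function csvField(value: string | number): string {
  const text = String(value);
  return /[",\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Assumption: sizes are reported in decimal gigabytes (1 GB = 1e9 bytes).
function toCsv(rows: BenchmarkRow[]): string {
  const header = "file,size_gb,time_minutes";
  const lines = rows.map((r) =>
    [r.file, (r.sizeBytes / 1e9).toFixed(2), (r.durationMs / 60000).toFixed(2)]
      .map(csvField)
      .join(","),
  );
  return [header, ...lines].join("\n") + "\n";
}

// Example with a made-up file name and the 2.0 GB timing from the results (1.06 min).
const csv = toCsv([{ file: "model-Q4_K_M.gguf", sizeBytes: 2e9, durationMs: 63600 }]);
console.log(csv);

// Timing a placeholder task; a real run would wrap the commit-and-PR step instead.
timed(async () => "done").then(({ durationMs }) => console.log(`took ${durationMs} ms`));
```

Measuring around the whole per-file task keeps the number comparable to the wall-clock times reported in the results.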

🧱 Requirements

  • Node 18+
  • A Hugging Face token with read and write access to the target repo, provided as HF_TOKEN
  • NPM packages:
    • @huggingface/gguf
    • @huggingface/hub
  • Network access to huggingface.co
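
A small preflight check can catch a missing token or an old Node version before any network work starts. The sketch below is a hypothetical helper, not part of the benchmark script, and it assumes that HF_TOKEN is supplied as an environment variable.

```typescript
// Hypothetical preflight check; returns a list of human-readable problems (empty if all is well).
function checkRequirements(nodeVersion: string, token: string | undefined): string[] {
  const problems: string[] = [];
  // parseInt stops at the first non-digit, so "18.17.0" yields 18.
  const major = parseInt(nodeVersion, 10);
  if (isNaN(major) || major < 18) {
    problems.push(`Node 18+ is required, found ${nodeVersion}`);
  }
  if (!token || token.trim() === "") {
    problems.push("HF_TOKEN is not set");
  }
  return problems;
}

// Example with explicit values so the check can be read in isolation.
console.log(checkRequirements("16.20.0", undefined));
// prints [ 'Node 18+ is required, found 16.20.0', 'HF_TOKEN is not set' ]
```

In a real script the arguments would typically be `process.versions.node` and `process.env.HF_TOKEN`.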

🔧 Setup

npm i
npm run benchmark