Add missing tokenizer.json (fast tokenizer for TEI compatibility)

#11
by LHC88 - opened

The 0.6B model repo includes tokenizer.json, but the 4B repo does not. This prevents Hugging Face Text Embeddings Inference (TEI) from loading the model, since TEI requires the fast tokenizer format.

Generated from the same Qwen3 tokenizer files (vocab.json + merges.txt) that are already in the repo. Identical to the tokenizer.json in pplx-embed-v1-0.6B.
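
One way to sanity-check the "identical" claim before testing with TEI is to compare the two files as parsed JSON rather than byte for byte, so whitespace and key order do not matter. This is only a minimal sketch using the standard library; the function name `diff_tokenizer_json` and the local paths in the usage comment are hypothetical, and it assumes both files have already been downloaded.

```python
import json
from pathlib import Path

# Sentinel so a key that is absent is never confused with a key set to null.
_MISSING = object()


def diff_tokenizer_json(path_a, path_b):
    """Return the sorted top-level keys whose values differ between two tokenizer.json files.

    An empty list means the two files are equivalent as JSON documents.
    """
    a = json.loads(Path(path_a).read_text(encoding="utf-8"))
    b = json.loads(Path(path_b).read_text(encoding="utf-8"))
    return sorted(
        key
        for key in a.keys() | b.keys()
        if a.get(key, _MISSING) != b.get(key, _MISSING)
    )


# Hypothetical usage, assuming local copies of both files:
# diff_tokenizer_json("pplx-embed-v1-0.6B/tokenizer.json", "4B/tokenizer.json")
```

An empty result only shows that the files match; it does not confirm that TEI loads the model, which still needs a separate test.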

Closing for now: I need to validate that the tokenizer works with TEI before submitting. Will reopen with a tested PR.

LHC88 changed pull request status to closed
