Question about prompts

by tomaarsen - opened 19 days ago


Hello!

Sentence Transformers maintainer here. I'm very excited about this model!

I have a question about the prompts. The section here (https://huggingface.co/tencent/KaLM-Embedding-Gemma3-12B-2511#sentence-transformers-support) states that "Instruct: Given a query, retrieve documents that answer the query \n Query: " will be used with model.encode_query automatically. However, that method uses the "query" prompt from here: https://huggingface.co/tencent/KaLM-Embedding-Gemma3-12B-2511/blob/main/config_sentence_transformers.json#L7

But no "query" prompt is defined there, unlike in https://huggingface.co/KaLM-Embedding/KaLM-embedding-multilingual-mini-instruct-v2.5/blob/main/config_sentence_transformers.json#L8.

Should the prompts be the same as in https://huggingface.co/KaLM-Embedding/KaLM-embedding-multilingual-mini-instruct-v2.5/blob/main/config_sentence_transformers.json#L8? I understand that different prompts are used for different tasks (as can be seen here: https://huggingface.co/tencent/KaLM-Embedding-Gemma3-12B-2511/blob/main/task_prompts.json), but I'm not fully sure about the format yet. For example, the README also says "Instruct: Classifying the category of french news.\nQuery:", i.e. without any spaces before or after \n, and without a space after Query:.

What should the correct "default prompt" be? Then we can add it to https://huggingface.co/tencent/KaLM-Embedding-Gemma3-12B-2511/blob/main/config_sentence_transformers.json nicely.

cc @YanshekWoo

Tom Aarsen

YanshekWoo

Tencent org 19 days ago

Thank you for your question.

The phrase "Instruct: Classifying the category of French news.\nQuery:" represents the default prompt format (without any spaces before and after \n and after Query:).
However, a single space typically does not significantly impact embedding performance.

Additionally, the content at https://huggingface.co/tencent/KaLM-Embedding-Gemma3-12B-2511/blob/main/task_prompts.json pertains only to the instruction section.
We will subsequently revise and upload the content, updating it to the complete prompt format to prevent ambiguity.
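
For readers following along, here is a minimal sketch of how an instruction-only entry could be expanded into the full prompt format described above (no spaces around \n and none after Query:). The helper name `build_prompt` and the `task_instructions` dict are hypothetical and only for illustration; the single entry reuses the retrieval instruction quoted from the README and is not the actual content of task_prompts.json.

```python
import json


def build_prompt(instruction: str) -> str:
    # Wrap a bare instruction in the "Instruct: ...<newline>Query:" format,
    # with no spaces around the newline and none after "Query:".
    return f"Instruct: {instruction}\nQuery:"


# Hypothetical instruction-only entries (assumption: stands in for task_prompts.json).
task_instructions = {
    "query": "Given a query, retrieve documents that answer the query",
}

# Expand every instruction into a complete prompt string.
prompts = {name: build_prompt(text) for name, text in task_instructions.items()}

# Shape of a "prompts" entry as it could appear in config_sentence_transformers.json.
print(json.dumps({"prompts": prompts}, indent=2))
```

A string built this way could be passed as the `prompt` argument of `SentenceTransformer.encode`, or stored under the "query" key of "prompts" so that `encode_query` picks it up. Whether this should be the final default is for the maintainers to confirm.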
