Ollama version doesn't properly truncate tokens to 512 max
When using the official Ollama model of snowflake-arctic-embed-l (latest/335m - 21ab8b9b0545), if the input is longer than 512 tokens, the model does not truncate it. Instead it seems to hit an error somewhere and returns an all-zero embedding ([0,0,0...]).
I've checked my Ollama parameters, and this occurs when "truncate": true. Other embedding models truncate the input properly, and I see an INFO log line in Ollama saying "input truncated". I don't see this message with snowflake-arctic-embed-l.
When "truncate" is set to false, I get the expected "input length exceeds maximum context length".
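For anyone who wants to reproduce this, here is a minimal sketch. It assumes a local Ollama server at the default address http://localhost:11434, the /api/embed endpoint with its "truncate" field, and the model tag snowflake-arctic-embed:335m. The tag and the helper names are my assumptions, so adjust them to your setup.

```python
import json
import urllib.request

# Assumptions: default local Ollama address and the 335m tag of the model.
OLLAMA_URL = "http://localhost:11434/api/embed"
MODEL = "snowflake-arctic-embed:335m"


def build_payload(text, truncate=True):
    """Build the JSON body for an embed request (hypothetical helper)."""
    return {"model": MODEL, "input": text, "truncate": truncate}


def is_all_zero(embedding):
    """Return True if the embedding is non-empty and every component is 0."""
    return len(embedding) > 0 and all(x == 0 for x in embedding)


def embed(text, truncate=True):
    """Send one text to Ollama and return its embedding vector.

    Assumes the response carries an "embeddings" list with one vector per input.
    With truncate=False and an over-long input, the server is expected to
    reject the request, which urllib surfaces as an HTTPError.
    """
    data = json.dumps(build_payload(text, truncate)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL,
        data=data,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["embeddings"][0]


# Example usage (needs a running Ollama server with the model pulled):
# long_text = "hello " * 2000          # well over 512 tokens
# vector = embed(long_text, truncate=True)
# print(is_all_zero(vector))           # True would match the behaviour described above
```

Checking for an all-zero vector on the client side also works as a stopgap guard, since such an embedding is useless for similarity search.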
Also just leaving a thanks for building these embedding models!
I'm not super familiar with truncation in Ollama. The Ollama version of this model is provided by the Ollama community, not Snowflake, so you may want to raise this on their GitHub issue tracker.