Instructions to use Snowflake/snowflake-arctic-embed-m-v1.5 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use Snowflake/snowflake-arctic-embed-m-v1.5 with sentence-transformers:

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Snowflake/snowflake-arctic-embed-m-v1.5")

sentences = [
    "That is a happy person",
    "That is a happy dog",
    "That is a very happy person",
    "Today is a sunny day"
]
embeddings = model.encode(sentences)

similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([4, 4])

Transformers.js

How to use Snowflake/snowflake-arctic-embed-m-v1.5 with Transformers.js:

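// npm i @huggingface/transformers
import { pipeline } from '@huggingface/transformers';

// Allocate a feature-extraction pipeline (the task Transformers.js uses for embedding models)
const pipe = await pipeline('feature-extraction', 'Snowflake/snowflake-arctic-embed-m-v1.5');

// Compute normalized sentence embeddings using CLS pooling
const embeddings = await pipe(['That is a happy person', 'That is a happy dog'], { pooling: 'cls', normalize: true });
console.log(embeddings.dims);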

llama-cpp-python

How to use Snowflake/snowflake-arctic-embed-m-v1.5 with llama-cpp-python:

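# !pip install llama-cpp-python

from llama_cpp import Llama

# This is an embedding model, so load it in embedding mode
llm = Llama.from_pretrained(
	repo_id="Snowflake/snowflake-arctic-embed-m-v1.5",
	filename="gguf/snowflake-arctic-embed-m-v1.5-bf16.gguf",
	embedding=True,
)

# Request an embedding rather than a text completion
output = llm.create_embedding("That is a happy person")
print(len(output["data"][0]["embedding"]))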

Local Apps

llama.cpp

How to use Snowflake/snowflake-arctic-embed-m-v1.5 with llama.cpp:

Install from brew

brew install llama.cpp
# Start a local OpenAI-compatible server in embedding mode:
llama-server -hf Snowflake/snowflake-arctic-embed-m-v1.5:BF16 --embedding
# Compute an embedding directly in the terminal:
llama-embedding -hf Snowflake/snowflake-arctic-embed-m-v1.5:BF16 -p "That is a happy person"

Install from WinGet (Windows)

winget install llama.cpp
# Start a local OpenAI-compatible server in embedding mode:
llama-server -hf Snowflake/snowflake-arctic-embed-m-v1.5:BF16 --embedding
# Compute an embedding directly in the terminal:
llama-embedding -hf Snowflake/snowflake-arctic-embed-m-v1.5:BF16 -p "That is a happy person"

Use pre-built binary

# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server in embedding mode:
./llama-server -hf Snowflake/snowflake-arctic-embed-m-v1.5:BF16 --embedding
# Compute an embedding directly in the terminal:
./llama-embedding -hf Snowflake/snowflake-arctic-embed-m-v1.5:BF16 -p "That is a happy person"

Build from source code

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-embedding
# Start a local OpenAI-compatible server in embedding mode:
./build/bin/llama-server -hf Snowflake/snowflake-arctic-embed-m-v1.5:BF16 --embedding
# Compute an embedding directly in the terminal:
./build/bin/llama-embedding -hf Snowflake/snowflake-arctic-embed-m-v1.5:BF16 -p "That is a happy person"

Use Docker

docker model run hf.co/Snowflake/snowflake-arctic-embed-m-v1.5:BF16

Ollama
How to use Snowflake/snowflake-arctic-embed-m-v1.5 with Ollama:
```
ollama run hf.co/Snowflake/snowflake-arctic-embed-m-v1.5:BF16
```

Unsloth Studio

How to use Snowflake/snowflake-arctic-embed-m-v1.5 with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for Snowflake/snowflake-arctic-embed-m-v1.5 to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for Snowflake/snowflake-arctic-embed-m-v1.5 to start chatting

Using HuggingFace Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for Snowflake/snowflake-arctic-embed-m-v1.5 to start chatting

Docker Model Runner
How to use Snowflake/snowflake-arctic-embed-m-v1.5 with Docker Model Runner:
```
docker model run hf.co/Snowflake/snowflake-arctic-embed-m-v1.5:BF16
```

Lemonade

How to use Snowflake/snowflake-arctic-embed-m-v1.5 with Lemonade:

Pull the model

# Download Lemonade from https://lemonade-server.ai/
lemonade pull Snowflake/snowflake-arctic-embed-m-v1.5:BF16

Run and chat with the model

lemonade run user.snowflake-arctic-embed-m-v1.5-BF16

List all available models

lemonade list

Specify add_pooling_layer=False via configuration instead

by tomaarsen HF Staff - opened Aug 30, 2024

base: refs/heads/main ← from: refs/pr/5

tomaarsen, Aug 30, 2024 (edited Aug 30, 2024)

Hello!

Pull Request overview

Specify add_pooling_layer=False via configuration instead

Details

The underlying transformers AutoModel should be called with add_pooling_layer=False to avoid confusing warnings. This can be done directly via the SentenceTransformer init, but we can also define a default in sentence_bert_config.json. In short, the values in that config file get passed to the Transformer init, so we can specify all kinds of values in our config, e.g. model_args, tokenizer_args, config_args, max_seq_length, etc.

By setting the new default in the config, fewer people should experience this warning.
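As a rough illustration of the two options described above, the sketch below writes a sentence_bert_config.json carrying the model_args default and reads it back as the keyword arguments that would reach the Transformer init. The max_seq_length and do_lower_case values, the helper names, and the temporary directory are assumptions made for the example; the file actually merged in this PR may contain different keys. The load_with_init_override function shows the init-time alternative and is only defined here, not called, since it would download the model.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical file contents; only the model_args entry comes from the PR text.
SENTENCE_BERT_CONFIG = {
    "max_seq_length": 512,   # assumed value, for illustration only
    "do_lower_case": False,  # assumed value, for illustration only
    "model_args": {"add_pooling_layer": False},
}


def write_config(model_dir: Path) -> Path:
    """Write sentence_bert_config.json into a local model directory."""
    path = model_dir / "sentence_bert_config.json"
    path.write_text(json.dumps(SENTENCE_BERT_CONFIG, indent=2))
    return path


def read_transformer_kwargs(model_dir: Path) -> dict:
    """Read the config back as the keyword arguments for the Transformer init."""
    return json.loads((model_dir / "sentence_bert_config.json").read_text())


def load_with_init_override():
    """Init-time alternative: pass the flag explicitly (downloads the model)."""
    from sentence_transformers import SentenceTransformer

    return SentenceTransformer(
        "Snowflake/snowflake-arctic-embed-m-v1.5",
        model_kwargs={"add_pooling_layer": False},
    )


with tempfile.TemporaryDirectory() as tmp:
    write_config(Path(tmp))
    kwargs = read_transformer_kwargs(Path(tmp))
    print(kwargs["model_args"])
```

With the default stored in the config file, a plain SentenceTransformer("Snowflake/snowflake-arctic-embed-m-v1.5") call should pick up the flag without the caller passing model_kwargs.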

Note: This does mean that this model can only be loaded with SentenceTransformer v3 and up (but this was already required for the remainder of the README.md snippet regardless).

Note 2: Looks like we do the same already in the v1: https://huggingface.co/Snowflake/snowflake-arctic-embed-m/blob/main/sentence_bert_config.json#L4-L6

Tom Aarsen

Specify add_pooling_layer=False via configuration instead (96b7da4c)

tomaarsen changed pull request status to open Aug 30, 2024

spacemanidol changed pull request status to merged Sep 3, 2024
