Instructions for using nomic-ai/nomic-embed-text-v1.5-GGUF with libraries and local apps.

Libraries

How to use nomic-ai/nomic-embed-text-v1.5-GGUF with llama-cpp-python:

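# !pip install llama-cpp-python huggingface-hub

from llama_cpp import Llama

# embedding=True is required: this is an embedding model, not a text generator.
llm = Llama.from_pretrained(
	repo_id="nomic-ai/nomic-embed-text-v1.5-GGUF",
	filename="nomic-embed-text-v1.5.Q2_K.gguf",
	embedding=True,
)

# Every input string needs a task prefix such as "search_query: ".
output = llm.create_embedding("search_query: What is TSNE?")
print(output["data"][0]["embedding"])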

Local Apps

llama.cpp

How to use nomic-ai/nomic-embed-text-v1.5-GGUF with llama.cpp:

Install from brew

brew install llama.cpp
# Start a local OpenAI-compatible server with embeddings enabled:
llama-server -hf nomic-ai/nomic-embed-text-v1.5-GGUF:Q4_K_M --embedding
# Compute an embedding directly in the terminal:
llama-embedding -hf nomic-ai/nomic-embed-text-v1.5-GGUF:Q4_K_M -p 'search_query: What is TSNE?'

Install from WinGet (Windows)

winget install llama.cpp
# Start a local OpenAI-compatible server with embeddings enabled:
llama-server -hf nomic-ai/nomic-embed-text-v1.5-GGUF:Q4_K_M --embedding
# Compute an embedding directly in the terminal:
llama-embedding -hf nomic-ai/nomic-embed-text-v1.5-GGUF:Q4_K_M -p "search_query: What is TSNE?"

Use pre-built binary

# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with embeddings enabled:
./llama-server -hf nomic-ai/nomic-embed-text-v1.5-GGUF:Q4_K_M --embedding
# Compute an embedding directly in the terminal:
./llama-embedding -hf nomic-ai/nomic-embed-text-v1.5-GGUF:Q4_K_M -p 'search_query: What is TSNE?'

Build from source code

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-embedding
# Start a local OpenAI-compatible server with embeddings enabled:
./build/bin/llama-server -hf nomic-ai/nomic-embed-text-v1.5-GGUF:Q4_K_M --embedding
# Compute an embedding directly in the terminal:
./build/bin/llama-embedding -hf nomic-ai/nomic-embed-text-v1.5-GGUF:Q4_K_M -p 'search_query: What is TSNE?'

Use Docker

docker model run hf.co/nomic-ai/nomic-embed-text-v1.5-GGUF:Q4_K_M
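
Once a `llama-server` instance from the commands above is running, embeddings can be requested over HTTP. The sketch below assumes the server was started with embeddings enabled (for example with the `--embedding` flag), that it listens on the default `http://localhost:8080`, and that it exposes the OpenAI-style `/v1/embeddings` route. The helper names `build_request` and `embed` are illustrative, not part of any library.

```python
import json
import urllib.request

# Assumed default address of a locally running llama-server.
SERVER_URL = "http://localhost:8080/v1/embeddings"


def build_request(texts, url=SERVER_URL):
    """Build an OpenAI-style embeddings request for a list of prefixed texts."""
    payload = {"input": list(texts)}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(texts, url=SERVER_URL):
    """Send the request and return one embedding vector per input text."""
    with urllib.request.urlopen(build_request(texts, url)) as response:
        body = json.load(response)
    # OpenAI-style responses list results under "data", each with an "index".
    items = sorted(body["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]
```

With a server running, `embed(["search_query: What is TSNE?"])` should return a list containing one vector of floats.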

Ollama
How to use nomic-ai/nomic-embed-text-v1.5-GGUF with Ollama:
```
ollama run hf.co/nomic-ai/nomic-embed-text-v1.5-GGUF:Q4_K_M
```

Unsloth Studio

How to use nomic-ai/nomic-embed-text-v1.5-GGUF with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for nomic-ai/nomic-embed-text-v1.5-GGUF to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for nomic-ai/nomic-embed-text-v1.5-GGUF to start chatting

Using HuggingFace Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for nomic-ai/nomic-embed-text-v1.5-GGUF to start chatting

Docker Model Runner
How to use nomic-ai/nomic-embed-text-v1.5-GGUF with Docker Model Runner:
```
docker model run hf.co/nomic-ai/nomic-embed-text-v1.5-GGUF:Q4_K_M
```

Lemonade

How to use nomic-ai/nomic-embed-text-v1.5-GGUF with Lemonade:

Pull the model

# Download Lemonade from https://lemonade-server.ai/
lemonade pull nomic-ai/nomic-embed-text-v1.5-GGUF:Q4_K_M

Run and chat with the model

lemonade run user.nomic-embed-text-v1.5-GGUF-Q4_K_M

List all available models

lemonade list

nomic-embed-text-v1.5 - GGUF

Original model: nomic-embed-text-v1.5

Usage

Embedding text with nomic-embed-text requires task instruction prefixes at the beginning of each string.

For example, the commands in the Example `llama.cpp` Command section below use the `search_query` prefix to embed user questions, as in a RAG application.

To see the full set of task instructions and how they are designed to be used, visit the model card for nomic-embed-text-v1.5.
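
As a minimal sketch of the prefixing step, the helper below prepends a task instruction to each string before it is embedded. The function name `add_task_prefix` is hypothetical, and the prefix list reflects the tasks described on the nomic-embed-text-v1.5 model card; check that card for the authoritative set.

```python
# Task prefixes as described on the nomic-embed-text-v1.5 model card.
TASK_PREFIXES = ("search_query", "search_document", "classification", "clustering")


def add_task_prefix(text: str, task: str = "search_query") -> str:
    """Return the text with the task instruction prefix placed in front."""
    if task not in TASK_PREFIXES:
        raise ValueError(f"unknown task {task!r}; expected one of {TASK_PREFIXES}")
    return f"{task}: {text}"


questions = ["What is TSNE?", "Who is Laurens Van der Maaten?"]
prefixed = [add_task_prefix(question) for question in questions]
print(prefixed)
```

In a RAG setup, user questions would typically take the `search_query` prefix while the indexed passages take `search_document`.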

Description

This repo contains llama.cpp-compatible files for nomic-embed-text-v1.5 in GGUF format.

llama.cpp will default to 2048 tokens of context with these files. For the full 8192 token context length, you will have to choose a context extension method. The 🤗 Transformers model uses Dynamic NTK-Aware RoPE scaling, but that is not currently available in llama.cpp.

Example `llama.cpp` Command

Compute a single embedding:

./embedding -ngl 99 -m nomic-embed-text-v1.5.f16.gguf -c 8192 -b 8192 --rope-scaling yarn --rope-freq-scale .75 -p 'search_query: What is TSNE?'

You can also submit a batch of texts to embed, as long as the total number of tokens does not exceed the context length. Only the first three embeddings are shown by the embedding example.

texts.txt:

search_query: What is TSNE?
search_query: Who is Laurens Van der Maaten?

Compute multiple embeddings:

./embedding -ngl 99 -m nomic-embed-text-v1.5.f16.gguf -c 8192 -b 8192 --rope-scaling yarn --rope-freq-scale .75 -f texts.txt
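
The same settings can be approximated from Python with llama-cpp-python. This is a sketch under assumptions, not a tested recipe: it maps the CLI flags above onto `Llama()` keyword arguments, assumes the model file and `texts.txt` are in the working directory, and uses hypothetical helper names (`long_context_kwargs`, `load_model`, `embed_file`). Depending on the installed version, further parameters may be needed for very long inputs.

```python
def long_context_kwargs(n_ctx=8192, rope_freq_scale=0.75):
    """Translate the CLI flags above into Llama() keyword arguments."""
    return {
        "embedding": True,                   # return embeddings, not generated text
        "n_gpu_layers": -1,                  # offload all layers, like -ngl 99
        "n_ctx": n_ctx,                      # -c 8192
        "n_batch": n_ctx,                    # -b 8192
        "rope_freq_scale": rope_freq_scale,  # --rope-freq-scale .75
    }


def load_model(model_path="nomic-embed-text-v1.5.f16.gguf"):
    """Load the GGUF file with YaRN RoPE scaling (--rope-scaling yarn)."""
    import llama_cpp  # imported lazily so the helper above has no dependencies

    return llama_cpp.Llama(
        model_path=model_path,
        rope_scaling_type=llama_cpp.LLAMA_ROPE_SCALING_TYPE_YARN,
        **long_context_kwargs(),
    )


def embed_file(llm, path="texts.txt"):
    """Embed every non-empty line of a text file, one vector per line."""
    with open(path, encoding="utf-8") as handle:
        texts = [line.strip() for line in handle if line.strip()]
    return texts, llm.embed(texts)
```

Calling `embed_file(load_model())` would then return the prefixed lines from `texts.txt` together with their embeddings.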

Compatibility

These files are compatible with llama.cpp as of commit 4524290e8 from February 15, 2024.

Provided Files

The table below shows the mean squared error (MSE) of the embeddings produced by each quantization of Nomic Embed, relative to the Sentence Transformers implementation.

| Name | Quant | Size | MSE |
|---|---|---|---|
| nomic-embed-text-v1.5.Q2_K.gguf | Q2_K | 48 MiB | 2.33e-03 |
| nomic-embed-text-v1.5.Q3_K_S.gguf | Q3_K_S | 57 MiB | 1.19e-03 |
| nomic-embed-text-v1.5.Q3_K_M.gguf | Q3_K_M | 65 MiB | 8.26e-04 |
| nomic-embed-text-v1.5.Q3_K_L.gguf | Q3_K_L | 69 MiB | 7.93e-04 |
| nomic-embed-text-v1.5.Q4_0.gguf | Q4_0 | 75 MiB | 6.32e-04 |
| nomic-embed-text-v1.5.Q4_K_S.gguf | Q4_K_S | 75 MiB | 6.71e-04 |
| nomic-embed-text-v1.5.Q4_K_M.gguf | Q4_K_M | 81 MiB | 2.42e-04 |
| nomic-embed-text-v1.5.Q5_0.gguf | Q5_0 | 91 MiB | 2.35e-04 |
| nomic-embed-text-v1.5.Q5_K_S.gguf | Q5_K_S | 91 MiB | 2.00e-04 |
| nomic-embed-text-v1.5.Q5_K_M.gguf | Q5_K_M | 95 MiB | 6.55e-05 |
| nomic-embed-text-v1.5.Q6_K.gguf | Q6_K | 108 MiB | 5.58e-05 |
| nomic-embed-text-v1.5.Q8_0.gguf | Q8_0 | 140 MiB | 5.79e-06 |
| nomic-embed-text-v1.5.f16.gguf | F16 | 262 MiB | 4.21e-10 |
| nomic-embed-text-v1.5.f32.gguf | F32 | 262 MiB | 6.08e-11 |
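
One way to read the table is to fix an error budget and take the smallest file that stays within it. The sketch below does this with the rows copied from the table; the function name `smallest_within` and the example thresholds are illustrative choices, not recommendations from the model authors.

```python
# (file name, size in MiB, MSE) rows copied from the table above.
QUANTS = [
    ("nomic-embed-text-v1.5.Q2_K.gguf", 48, 2.33e-03),
    ("nomic-embed-text-v1.5.Q3_K_S.gguf", 57, 1.19e-03),
    ("nomic-embed-text-v1.5.Q3_K_M.gguf", 65, 8.26e-04),
    ("nomic-embed-text-v1.5.Q3_K_L.gguf", 69, 7.93e-04),
    ("nomic-embed-text-v1.5.Q4_0.gguf", 75, 6.32e-04),
    ("nomic-embed-text-v1.5.Q4_K_S.gguf", 75, 6.71e-04),
    ("nomic-embed-text-v1.5.Q4_K_M.gguf", 81, 2.42e-04),
    ("nomic-embed-text-v1.5.Q5_0.gguf", 91, 2.35e-04),
    ("nomic-embed-text-v1.5.Q5_K_S.gguf", 91, 2.00e-04),
    ("nomic-embed-text-v1.5.Q5_K_M.gguf", 95, 6.55e-05),
    ("nomic-embed-text-v1.5.Q6_K.gguf", 108, 5.58e-05),
    ("nomic-embed-text-v1.5.Q8_0.gguf", 140, 5.79e-06),
    ("nomic-embed-text-v1.5.f16.gguf", 262, 4.21e-10),
    ("nomic-embed-text-v1.5.f32.gguf", 262, 6.08e-11),
]


def smallest_within(max_mse):
    """Return the smallest file whose MSE does not exceed max_mse, or None."""
    candidates = [row for row in QUANTS if row[2] <= max_mse]
    return min(candidates, key=lambda row: row[1], default=None)


print(smallest_within(1e-3))
print(smallest_within(1e-4))
```

Note that the error does not fall evenly with size: the table shows a large drop between Q4_K_S and Q4_K_M and again between Q5_K_S and Q5_K_M.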

GGUF model size: 0.1B params

Architecture: nomic-bert
