How to use arcee-ai/Trinity-Mini-GGUF with libraries and local apps.

Libraries

How to use arcee-ai/Trinity-Mini-GGUF with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="arcee-ai/Trinity-Mini-GGUF")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

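# Load model directly
# Transformers selects a GGUF checkpoint through the `gguf_file` argument.
# Assumption: the filename is the one used in the llama-cpp-python example;
# whether Transformers can load this architecture from GGUF is not confirmed here.
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained(
    "arcee-ai/Trinity-Mini-GGUF",
    gguf_file="Trinity-Mini-IQ2_M.gguf",
    dtype="auto",
)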

llama-cpp-python

How to use arcee-ai/Trinity-Mini-GGUF with llama-cpp-python:

# !pip install llama-cpp-python

from llama_cpp import Llama

llm = Llama.from_pretrained(
	repo_id="arcee-ai/Trinity-Mini-GGUF",
	filename="Trinity-Mini-IQ2_M.gguf",
)

llm.create_chat_completion(
	messages = [
		{
			"role": "user",
			"content": "What is the capital of France?"
		}
	]
)
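
`create_chat_completion` returns an OpenAI-style response dictionary rather than a plain string. The sketch below shows one way to pull the assistant text out of it, assuming the usual `choices[0]["message"]["content"]` layout. `extract_reply` is a hypothetical helper name, and `sample_response` is a hand-written stand-in, not real model output.

```python
from typing import Any, Mapping


def extract_reply(response: Mapping[str, Any]) -> str:
    """Return the assistant text from an OpenAI-style chat completion dict."""
    choices = response.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    # Non-streaming responses carry the text under message.content.
    content = choices[0].get("message", {}).get("content")
    return content or ""


# Hand-written stand-in for a real response; the values are illustrative only.
sample_response = {
    "choices": [
        {"index": 0, "message": {"role": "assistant", "content": "Paris."}}
    ]
}
print(extract_reply(sample_response))
```

With a loaded model, the same helper could be applied to the value returned by `llm.create_chat_completion(...)`.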

Local Apps

llama.cpp

How to use arcee-ai/Trinity-Mini-GGUF with llama.cpp:

Install from brew

brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf arcee-ai/Trinity-Mini-GGUF:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf arcee-ai/Trinity-Mini-GGUF:Q4_K_M

Install from WinGet (Windows)

winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf arcee-ai/Trinity-Mini-GGUF:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf arcee-ai/Trinity-Mini-GGUF:Q4_K_M

Use pre-built binary

# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf arcee-ai/Trinity-Mini-GGUF:Q4_K_M
# Run inference directly in the terminal:
./llama-cli -hf arcee-ai/Trinity-Mini-GGUF:Q4_K_M

Build from source code

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf arcee-ai/Trinity-Mini-GGUF:Q4_K_M
# Run inference directly in the terminal:
./build/bin/llama-cli -hf arcee-ai/Trinity-Mini-GGUF:Q4_K_M

Use Docker

docker model run hf.co/arcee-ai/Trinity-Mini-GGUF:Q4_K_M

vLLM

How to use arcee-ai/Trinity-Mini-GGUF with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "arcee-ai/Trinity-Mini-GGUF"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "arcee-ai/Trinity-Mini-GGUF",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
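
The curl call above can also be issued from Python. The following is a minimal sketch using only the standard library, assuming the server listens on `http://localhost:8000` as in the curl example. `build_chat_request` and `send_chat_request` are hypothetical helper names, not part of vLLM.

```python
import json
import urllib.request


def build_chat_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build a POST request equivalent to the curl example."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request: urllib.request.Request, timeout: float = 60.0) -> dict:
    """Send the request and decode the JSON body; requires a running server."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_chat_request(
    "http://localhost:8000",
    "arcee-ai/Trinity-Mini-GGUF",
    "What is the capital of France?",
)
print(request.full_url)
# With the server running: reply = send_chat_request(request)
```

Building the request separately from sending it keeps the payload inspectable without a live server. Changing `base_url` should be enough to target the other OpenAI-compatible servers on this page, for example port 30000 for SGLang.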

Use Docker

docker model run hf.co/arcee-ai/Trinity-Mini-GGUF:Q4_K_M

SGLang

How to use arcee-ai/Trinity-Mini-GGUF with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "arcee-ai/Trinity-Mini-GGUF" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "arcee-ai/Trinity-Mini-GGUF",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "arcee-ai/Trinity-Mini-GGUF" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "arcee-ai/Trinity-Mini-GGUF",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Ollama
How to use arcee-ai/Trinity-Mini-GGUF with Ollama:
```
ollama run hf.co/arcee-ai/Trinity-Mini-GGUF:Q4_K_M
```
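
The commands on this page refer to the same quantized file in two forms: `repo:quant` for the llama.cpp `-hf` flag, and the same string prefixed with `hf.co/` for Ollama and Docker Model Runner. A small sketch of that naming pattern follows; `model_ref` is a hypothetical helper introduced only for illustration.

```python
def model_ref(repo_id: str, quant: str, registry_prefix: str = "") -> str:
    """Join a Hub repo id and a quantization tag into a `repo:quant` reference."""
    return f"{registry_prefix}{repo_id}:{quant}"


REPO_ID = "arcee-ai/Trinity-Mini-GGUF"
QUANT = "Q4_K_M"

# Form passed to `llama-server -hf` and `llama-cli -hf`.
llama_cpp_ref = model_ref(REPO_ID, QUANT)
# Form passed to `ollama run` and `docker model run`.
hub_ref = model_ref(REPO_ID, QUANT, "hf.co/")

print(llama_cpp_ref)
print(hub_ref)
```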

Unsloth Studio

How to use arcee-ai/Trinity-Mini-GGUF with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for arcee-ai/Trinity-Mini-GGUF to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for arcee-ai/Trinity-Mini-GGUF to start chatting

Use Hugging Face Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for arcee-ai/Trinity-Mini-GGUF to start chatting

Pi

How to use arcee-ai/Trinity-Mini-GGUF with Pi:

Start the llama.cpp server

# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama-server -hf arcee-ai/Trinity-Mini-GGUF:Q4_K_M

Configure the model in Pi

# Install Pi:
npm install -g @mariozechner/pi-coding-agent
# Add to ~/.pi/agent/models.json:
{
  "providers": {
    "llama-cpp": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        {
          "id": "arcee-ai/Trinity-Mini-GGUF:Q4_K_M"
        }
      ]
    }
  }
}
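
The `models.json` fragment above can also be generated from a script, which helps when the port or model tag changes. This is a sketch that only mirrors the keys shown in the fragment. `build_pi_provider_config` and `write_config` are hypothetical helper names, and nothing here is confirmed as Pi's own tooling.

```python
import json
from pathlib import Path


def build_pi_provider_config(base_url: str, model_id: str) -> dict:
    """Return a provider entry with the same keys as the fragment above."""
    return {
        "providers": {
            "llama-cpp": {
                "baseUrl": base_url,
                "api": "openai-completions",
                "apiKey": "none",
                "models": [{"id": model_id}],
            }
        }
    }


def write_config(path: Path, config: dict) -> None:
    """Write the config as indented JSON, creating parent directories."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")


config = build_pi_provider_config(
    "http://localhost:8080/v1",
    "arcee-ai/Trinity-Mini-GGUF:Q4_K_M",
)
print(json.dumps(config, indent=2))
# Note: this overwrites any existing file at the target path.
# write_config(Path.home() / ".pi" / "agent" / "models.json", config)
```

The write call is left commented out because it replaces the whole file. If other providers are already configured, merging by hand is the safer route.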

Run Pi

# Start Pi in your project directory:
pi

Hermes Agent

How to use arcee-ai/Trinity-Mini-GGUF with Hermes Agent:

Start the llama.cpp server

# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama-server -hf arcee-ai/Trinity-Mini-GGUF:Q4_K_M

Configure Hermes

# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup
# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default arcee-ai/Trinity-Mini-GGUF:Q4_K_M

Run Hermes

hermes

Docker Model Runner
How to use arcee-ai/Trinity-Mini-GGUF with Docker Model Runner:
```
docker model run hf.co/arcee-ai/Trinity-Mini-GGUF:Q4_K_M
```

Lemonade

How to use arcee-ai/Trinity-Mini-GGUF with Lemonade:

Pull the model

# Download Lemonade from https://lemonade-server.ai/
lemonade pull arcee-ai/Trinity-Mini-GGUF:Q4_K_M

Run and chat with the model

lemonade run user.Trinity-Mini-GGUF-Q4_K_M

List all available models

lemonade list

Trinity-Mini-GGUF / LICENSE

annekethvij

Adopt OpenMDW-1.1 license (#1)

a11737e 19 days ago

	OpenMDW License Agreement, version 1.1 (OpenMDW-1.1)

	By exercising rights granted to you under this agreement, you accept and agree
	to its terms.

	As used in this agreement, "Model Materials" means the materials provided to
	you under this agreement, consisting of: (1) one or more machine learning
	models (including architecture and parameters); and (2) all related artifacts
	(including associated data, documentation and software) that are provided to
	you hereunder.

	Subject to your compliance with this agreement, permission is hereby granted,
	free of charge, to deal in the Model Materials without restriction, including
	under all copyright, patent, database, and trade secret rights included or
	embodied therein.

	If you distribute any portion of the Model Materials, you shall retain in your
	distribution (1) a copy of this agreement, and (2) all copyright notices and
	other notices of origin included in the Model Materials that are applicable to
	your distribution.

	If you file, maintain, or voluntarily participate in a lawsuit against any
	person or entity asserting that the Model Materials directly or indirectly
	infringe any patent or copyright, then all rights and grants made to you
	hereunder are terminated, unless that lawsuit was in response to a
	corresponding lawsuit first brought against you.

	This agreement does not impose any restrictions or obligations with respect to
	any use, modification, or sharing of any outputs generated by using the Model
	Materials.

	THE MODEL MATERIALS ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
	OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
	FITNESS FOR A PARTICULAR PURPOSE, TITLE, NONINFRINGEMENT, ACCURACY, OR THE
	ABSENCE OF LATENT OR OTHER DEFECTS OR ERRORS, WHETHER OR NOT DISCOVERABLE, ALL
	TO THE GREATEST EXTENT PERMISSIBLE UNDER APPLICABLE LAW.

	YOU ARE SOLELY RESPONSIBLE FOR (1) CLEARING RIGHTS OF OTHER PERSONS THAT MAY
	APPLY TO THE MODEL MATERIALS OR ANY USE THEREOF, INCLUDING WITHOUT LIMITATION
	ANY PERSON'S COPYRIGHTS OR OTHER RIGHTS INCLUDED OR EMBODIED IN THE MODEL
	MATERIALS; (2) OBTAINING ANY NECESSARY CONSENTS, PERMISSIONS OR OTHER RIGHTS
	REQUIRED FOR ANY USE OF THE MODEL MATERIALS; OR (3) PERFORMING ANY DUE
	DILIGENCE OR UNDERTAKING ANY OTHER INVESTIGATIONS INTO THE MODEL MATERIALS OR
	ANYTHING INCORPORATED OR EMBODIED THEREIN.

	IN NO EVENT SHALL THE PROVIDERS OF THE MODEL MATERIALS BE LIABLE FOR ANY CLAIM,
	DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
	OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE MODEL MATERIALS, THE
	USE THEREOF OR OTHER DEALINGS THEREIN.