Instructions for using nqd145/Gemma-4-E2B-it-abliterated-litertlm with libraries and local apps.

Libraries

How to use nqd145/Gemma-4-E2B-it-abliterated-litertlm with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="nqd145/Gemma-4-E2B-it-abliterated-litertlm")

# Load model directly
from transformers import AutoModel
model = AutoModel.from_pretrained("nqd145/Gemma-4-E2B-it-abliterated-litertlm", dtype="auto")

LiteRT-LM

How to use nqd145/Gemma-4-E2B-it-abliterated-litertlm with LiteRT-LM:

# LiteRT-LM runs on various platforms (Android, iOS, Windows, Linux, macOS, IoT, Web/WASM)
# and supports many APIs (C++, Python, Kotlin, Swift, JavaScript, Flutter).
# For platform-specific integration guides, please refer to the official developer website:
# https://ai.google.dev/edge/litert-lm

# To try LiteRT-LM, the easiest way is to use our CLI tool.
# 1. Install the LiteRT-LM CLI tool:
pip install litert-lm

# 2. Download and run this model locally:
# See: https://ai.google.dev/edge/litert-lm/cli
litert-lm run \
  --from-huggingface-repo=nqd145/Gemma-4-E2B-it-abliterated-litertlm \
  Gemma-4-E2B-it-abliterated.litertlm \
  --prompt="Write me a poem"
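For scripting, the same invocation can be assembled in Python and handed to `subprocess.run`. This is a minimal sketch, not part of LiteRT-LM: the helper names `build_run_command` and `run_model` are hypothetical, only the flags shown in the command above are used, and the `litert-lm` executable is assumed to be on `PATH`. The file name is the one listed under "Model File" in this card.

```python
import subprocess


def build_run_command(repo_id, model_file, prompt):
    """Return the argv list for the `litert-lm run` command shown above.

    Illustrative helper (hypothetical name); it only mirrors the flags
    that appear in the CLI example and adds nothing new.
    """
    return [
        "litert-lm",
        "run",
        f"--from-huggingface-repo={repo_id}",
        model_file,
        f"--prompt={prompt}",
    ]


def run_model(repo_id, model_file, prompt):
    """Run the CLI; raises CalledProcessError on a non-zero exit code."""
    # A list (not a shell string) avoids quoting problems inside the prompt.
    return subprocess.run(build_run_command(repo_id, model_file, prompt), check=True)


cmd = build_run_command(
    "nqd145/Gemma-4-E2B-it-abliterated-litertlm",
    "Gemma-4-E2B-it-abliterated.litertlm",
    "Write me a poem",
)
```

Calling `run_model(...)` with the same three arguments executes the command; `cmd` above only builds it, so the snippet can be inspected without the CLI installed.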

vLLM

How to use nqd145/Gemma-4-E2B-it-abliterated-litertlm with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "nqd145/Gemma-4-E2B-it-abliterated-litertlm"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "nqd145/Gemma-4-E2B-it-abliterated-litertlm",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
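The curl request above can also be issued from Python with only the standard library. The sketch below assumes the server is reachable at the same address and returns the usual OpenAI-style completion body (`choices[0].text`); the helper names `build_completion_request` and `complete` are hypothetical.

```python
import json
import urllib.request


def build_completion_request(base_url, model, prompt, max_tokens=512, temperature=0.5):
    """Build the POST request that mirrors the curl example above."""
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(base_url, model, prompt):
    """Send the request and return the text of the first completion.

    Assumes an OpenAI-compatible response body.
    """
    request = build_completion_request(base_url, model, prompt)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["text"]
```

With a server running, `complete("http://localhost:8000", "nqd145/Gemma-4-E2B-it-abliterated-litertlm", "Once upon a time,")` would send the same payload as the curl call.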

SGLang

How to use nqd145/Gemma-4-E2B-it-abliterated-litertlm with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "nqd145/Gemma-4-E2B-it-abliterated-litertlm" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "nqd145/Gemma-4-E2B-it-abliterated-litertlm",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "nqd145/Gemma-4-E2B-it-abliterated-litertlm" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "nqd145/Gemma-4-E2B-it-abliterated-litertlm",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Docker Model Runner

How to use nqd145/Gemma-4-E2B-it-abliterated-litertlm with Docker Model Runner:

```
docker model run hf.co/nqd145/Gemma-4-E2B-it-abliterated-litertlm
```

Gemma-4-E2B-it-abliterated (LiteRT-LM)

LiteRT-LM export of huihui-ai/Huihui-gemma-4-E2B-it-abliterated for on-device / edge inference workflows.

Model File

Gemma-4-E2B-it-abliterated.litertlm

Source

Base checkpoint: huihui-ai/Huihui-gemma-4-E2B-it-abliterated
Export pipeline: safetensors-to-litertlm

Export Notes

Export format: .litertlm (LiteRT-LM bundle)
Quantization: INT8 profile (dynamic_wi8_afp32)
Intended runtime: litert-lm CLI / LiteRT-LM compatible apps
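To give a feel for what an INT8 weight profile implies, the sketch below quantizes one row of float weights to 8-bit integers with a single scale and uses them against float activations. It assumes that `dynamic_wi8_afp32` denotes 8-bit integer weights with float32 activations, as the name suggests; it is a generic illustration of symmetric int8 quantization and does not claim to reproduce the export pipeline's granularity or rounding. All function names are hypothetical.

```python
def quantize_row_int8(row):
    """Symmetric int8 quantization of one weight row.

    Returns (integers in [-127, 127], float scale) such that
    weight is approximately integer * scale.
    """
    max_abs = max((abs(w) for w in row), default=0.0)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in row]
    return quantized, scale


def dequantize_row(quantized, scale):
    """Map int8 values back to approximate float weights."""
    return [q * scale for q in quantized]


def dot_with_float_activations(quantized, scale, activations):
    """Dot product against float activations; the scale is applied once at the end."""
    accumulator = sum(q * a for q, a in zip(quantized, activations))
    return accumulator * scale


weights = [0.5, -1.0, 0.25, 0.0]
q, s = quantize_row_int8(weights)
restored = dequantize_row(q, s)
# Rounding error per weight is bounded by half the scale.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

The per-weight rounding error is at most half the scale, which is one concrete source of the small output differences between a quantized export and its float source checkpoint.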

Quick Start (CPU)

litert-lm run ./Gemma-4-E2B-it-abliterated.litertlm --prompt "Hi" --backend cpu

Limitations

Behavior may differ from the original HF checkpoint due to conversion/quantization/runtime differences.
Some export profiles that reduce memory pressure can alter section topology and runtime behavior.
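One rough way to check how far an export drifts from the source checkpoint is to run both on the same prompts and compare the generated text. The sketch below is an assumption about how such a check could look, not a procedure from this card: it scores two outputs by token-level similarity using `difflib` from the standard library, and the function name `token_agreement` is hypothetical.

```python
from difflib import SequenceMatcher


def token_agreement(reference, candidate):
    """Similarity in [0, 1] between two outputs, compared as whitespace-split tokens."""
    reference_tokens = reference.split()
    candidate_tokens = candidate.split()
    matcher = SequenceMatcher(a=reference_tokens, b=candidate_tokens, autojunk=False)
    return matcher.ratio()
```

The score is only meaningful when both runs use the same prompt and deterministic decoding settings; with sampling enabled, two runs of the same model can already disagree.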

Safety

This model may generate unsafe or incorrect content. Evaluate carefully for your use case and apply application-level safeguards where needed.

License

Follow the upstream license and usage terms of:

huihui-ai/Huihui-gemma-4-E2B-it-abliterated
the underlying Gemma model family

Model tree for nqd145/Gemma-4-E2B-it-abliterated-litertlm

Base model: google/gemma-4-E2B
Finetuned: google/gemma-4-E2B-it
Finetuned: this model