Instructions to use nqd145/Gemma-4-E2B-it-abliterated-litertlm with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use nqd145/Gemma-4-E2B-it-abliterated-litertlm with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("text-generation", model="nqd145/Gemma-4-E2B-it-abliterated-litertlm")

# Load model directly
from transformers import AutoModel
model = AutoModel.from_pretrained("nqd145/Gemma-4-E2B-it-abliterated-litertlm", dtype="auto")
- LiteRT-LM
How to use nqd145/Gemma-4-E2B-it-abliterated-litertlm with LiteRT-LM:
# LiteRT-LM runs on various platforms (Android, iOS, Windows, Linux, macOS, IoT, Web/WASM)
# and supports many APIs (C++, Python, Kotlin, Swift, JavaScript, Flutter).
# For platform-specific integration guides, please refer to the official developer website:
# https://ai.google.dev/edge/litert-lm

# To try LiteRT-LM, the easiest way is to use the CLI tool.
# 1. Install the LiteRT-LM CLI tool:
pip install litert-lm

# 2. Download and run this model locally (the file name matches the "Model File" entry of this card):
# See: https://ai.google.dev/edge/litert-lm/cli
litert-lm run \
  --from-huggingface-repo=nqd145/Gemma-4-E2B-it-abliterated-litertlm \
  Gemma-4-E2B-it-abliterated.litertlm \
  --prompt="Write me a poem"
- Local Apps
- vLLM
How to use nqd145/Gemma-4-E2B-it-abliterated-litertlm with vLLM:
Install from pip and serve model
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "nqd145/Gemma-4-E2B-it-abliterated-litertlm"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "nqd145/Gemma-4-E2B-it-abliterated-litertlm",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
Use Docker
docker model run hf.co/nqd145/Gemma-4-E2B-it-abliterated-litertlm
- SGLang
How to use nqd145/Gemma-4-E2B-it-abliterated-litertlm with SGLang:
Install from pip and serve model
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "nqd145/Gemma-4-E2B-it-abliterated-litertlm" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "nqd145/Gemma-4-E2B-it-abliterated-litertlm",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
Use Docker images
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "nqd145/Gemma-4-E2B-it-abliterated-litertlm" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "nqd145/Gemma-4-E2B-it-abliterated-litertlm",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
- Docker Model Runner
How to use nqd145/Gemma-4-E2B-it-abliterated-litertlm with Docker Model Runner:
docker model run hf.co/nqd145/Gemma-4-E2B-it-abliterated-litertlm
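The curl calls in the vLLM and SGLang snippets above post the same JSON body to an OpenAI-compatible `/v1/completions` route, differing only in port. A minimal Python sketch of that request follows, using only the standard library. The helper names `build_completion_request` and `complete` are hypothetical, and the sketch assumes a server is already listening and returns JSON; whether either server can actually load this `.litertlm` repository is not confirmed here.

```python
import json
import urllib.request

MODEL_ID = "nqd145/Gemma-4-E2B-it-abliterated-litertlm"


def build_completion_request(base_url, prompt, max_tokens=512, temperature=0.5):
    """Build (but do not send) a POST request mirroring the curl examples."""
    payload = {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(base_url, prompt, timeout=60):
    """Send the request and return the decoded JSON response (assumed JSON)."""
    request = build_completion_request(base_url, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Ports taken from the snippets above: 8000 for vLLM, 30000 for SGLang.
vllm_request = build_completion_request("http://localhost:8000", "Once upon a time,")
sglang_request = build_completion_request("http://localhost:30000", "Once upon a time,")
```

Building the request separately from sending it makes the payload easy to inspect before any server is running; call `complete("http://localhost:8000", "Once upon a time,")` once one is up.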
Gemma-4-E2B-it-abliterated (LiteRT-LM)
LiteRT-LM export of huihui-ai/Huihui-gemma-4-E2B-it-abliterated for on-device / edge inference workflows.
Model File
Gemma-4-E2B-it-abliterated.litertlm
Source
- Base checkpoint: huihui-ai/Huihui-gemma-4-E2B-it-abliterated
- Export pipeline: safetensors-to-litertlm
Export Notes
- Export format: .litertlm (LiteRT-LM bundle)
- Quantization: INT8 profile (dynamic_wi8_afp32)
- Intended runtime: litert-lm CLI / LiteRT-LM compatible apps
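To give a feel for what an INT8 weight profile does to a checkpoint, here is a rough, illustrative sketch of symmetric per-tensor int8 weight quantization with floating-point activations. Reading `dynamic_wi8_afp32` as "int8 weights, fp32 activations" is an assumption based on the profile name, and the exporter's actual scheme (per-channel scales, rounding, calibration) is not documented in this card. All function names below are hypothetical.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the int8 values."""
    return [q * scale for q in quantized]


def dot_with_float_activations(quantized, scale, activations):
    """Int8 weights times float activations, accumulated in floating point."""
    return scale * sum(q * a for q, a in zip(quantized, activations))


weights = [0.50, -1.27, 0.003, 0.75]
activations = [1.0, 0.5, 2.0, -1.0]

quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)

# Largest per-weight rounding error introduced by quantization.
max_error = max(abs(w - r) for w, r in zip(weights, restored))

exact = sum(w * a for w, a in zip(weights, activations))
approx = dot_with_float_activations(quantized, scale, activations)
```

Small weights such as `0.003` round to zero at this scale, which is one simple way a quantized export can drift from the original checkpoint's outputs.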
Quick Start (CPU)
litert-lm run ./Gemma-4-E2B-it-abliterated.litertlm --prompt "Hi" --backend cpu
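The same quick-start invocation can be driven from Python. The sketch below only assembles the flags shown in the command above (`run`, `--prompt`, `--backend`) and shells out to the CLI; the helper names are hypothetical, and it is an assumption that the CLI writes the generated text to standard output.

```python
import shutil
import subprocess


def build_quickstart_command(model_path, prompt, backend="cpu"):
    """Return the argv list for the quick-start command shown above."""
    return ["litert-lm", "run", model_path, "--prompt", prompt, "--backend", backend]


def run_quickstart(model_path, prompt, backend="cpu"):
    """Run the CLI and return whatever it printed to stdout (assumed to be the reply)."""
    if shutil.which("litert-lm") is None:
        raise RuntimeError(
            "litert-lm CLI not found on PATH; install it with: pip install litert-lm"
        )
    command = build_quickstart_command(model_path, prompt, backend)
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout


command = build_quickstart_command("./Gemma-4-E2B-it-abliterated.litertlm", "Hi")
```

Passing the arguments as a list avoids shell quoting issues when the prompt contains spaces or quotes.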
Limitations
- Behavior may differ from the original HF checkpoint due to conversion/quantization/runtime differences.
- Some export profiles that reduce memory pressure can alter section topology and runtime behavior.
Safety
This model may generate unsafe or incorrect content. Evaluate carefully for your use case and apply application-level safeguards where needed.
License
Please follow the upstream license and usage terms of:
- huihui-ai/Huihui-gemma-4-E2B-it-abliterated
- underlying Gemma model family terms