Instructions for using prithivMLmods/Qwen3-VL-8B-Instruct-c_abliterated-v3 with libraries and local apps.

Libraries

How to use prithivMLmods/Qwen3-VL-8B-Instruct-c_abliterated-v3 with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="prithivMLmods/Qwen3-VL-8B-Instruct-c_abliterated-v3")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
pipe(text=messages)

# Load model directly
from transformers import AutoProcessor, AutoModelForMultimodalLM

processor = AutoProcessor.from_pretrained("prithivMLmods/Qwen3-VL-8B-Instruct-c_abliterated-v3")
model = AutoModelForMultimodalLM.from_pretrained("prithivMLmods/Qwen3-VL-8B-Instruct-c_abliterated-v3")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use prithivMLmods/Qwen3-VL-8B-Instruct-c_abliterated-v3 with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "prithivMLmods/Qwen3-VL-8B-Instruct-c_abliterated-v3"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "prithivMLmods/Qwen3-VL-8B-Instruct-c_abliterated-v3",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'
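The same request can be sent from Python. The following is a minimal sketch using only the standard library, assuming a server started as above is listening on `http://localhost:8000`. The helper names `build_chat_payload` and `chat` are illustrative and not part of vLLM; the payload mirrors the curl body, and the reply is read from `choices[0].message.content` as in the OpenAI chat completions format.

```python
import json
import urllib.request

MODEL_ID = "prithivMLmods/Qwen3-VL-8B-Instruct-c_abliterated-v3"


def build_chat_payload(prompt, image_url, model=MODEL_ID):
    """Build the same JSON body that the curl example sends (hypothetical helper)."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }


def chat(prompt, image_url, base_url="http://localhost:8000"):
    """POST the payload to a running server and return the reply text."""
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(build_chat_payload(prompt, image_url)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # OpenAI-compatible servers return the reply in choices[0].message.content.
    return body["choices"][0]["message"]["content"]
```

With a server running, `chat("Describe this image in one sentence.", image_url)` returns the model's answer as a string. The SGLang examples in this document use the same request shape on port 30000, so passing `base_url="http://localhost:30000"` should work there as well.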

Use Docker

docker model run hf.co/prithivMLmods/Qwen3-VL-8B-Instruct-c_abliterated-v3

SGLang

How to use prithivMLmods/Qwen3-VL-8B-Instruct-c_abliterated-v3 with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "prithivMLmods/Qwen3-VL-8B-Instruct-c_abliterated-v3" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "prithivMLmods/Qwen3-VL-8B-Instruct-c_abliterated-v3",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "prithivMLmods/Qwen3-VL-8B-Instruct-c_abliterated-v3" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "prithivMLmods/Qwen3-VL-8B-Instruct-c_abliterated-v3",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'

Docker Model Runner
How to use prithivMLmods/Qwen3-VL-8B-Instruct-c_abliterated-v3 with Docker Model Runner:
```
docker model run hf.co/prithivMLmods/Qwen3-VL-8B-Instruct-c_abliterated-v3
```
Browse Quantizations to use this model in llama.cpp, Ollama, LM Studio, or any compatible app.

You need to agree to share your contact information to access this model

This repository is publicly accessible, but you have to accept the conditions to access its files and content.
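Because access is gated, downloads from code need an access token from an account that has accepted the conditions. The sketch below reads the token from the `HF_TOKEN` environment variable (the same variable the SGLang Docker command in this document sets) and turns it into keyword arguments for `from_pretrained`. The helper name `hf_auth_kwargs` is hypothetical and introduced only for illustration.

```python
import os


def hf_auth_kwargs(env_var="HF_TOKEN"):
    """Return {'token': ...} when an access token is set, else an empty dict."""
    token = os.environ.get(env_var, "").strip()
    # An empty dict lets the library fall back to any cached login.
    return {"token": token} if token else {}
```

A possible use is `AutoProcessor.from_pretrained("prithivMLmods/Qwen3-VL-8B-Instruct-c_abliterated-v3", **hf_auth_kwargs())`, since `token` is a keyword argument accepted by `from_pretrained` in Transformers.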

Qwen3-VL-8B-Instruct-c_abliterated-v3

Qwen3-VL-8B-Instruct-c_abliterated-v3 is the third iteration of the abliterated Qwen3-VL-8B series. It uses Continual Abliteration (c_abliterated): successive training iterations designed to neutralize the model's internal refusal mechanisms. The result is an 8B model that produces unrestricted, detailed reasoning and captions, even for sensitive or complex visual data.

Key Highlights

Continual Abliteration (v3): Refined through multiple training passes to eliminate "hard-coded" refusals, ensuring the model prioritizes instruction-following over conventional content filtering.
8B Parameter Intelligence: Uses the larger 8B architecture for more nuanced reasoning, better understanding of object relationships, and more fluent language than smaller variants.
Uncensored Multimodal Reasoning: Designed for deep analysis of artistic, forensic, technical, or abstract content without the interference of safety-driven refusals.
High-Fidelity Captions: Generates dense, descriptive metadata suitable for high-quality dataset curation or accessibility applications.
Dynamic Resolution Support: Inherits Qwen3-VL's ability to process images of various aspect ratios and resolutions without significant loss of detail.

Base Model Signatures:

This model is derived from the base model https://huggingface.co/huihui-ai/Huihui-Qwen3-VL-8B-Instruct-abliterated. Its weights have been re-sharded and optimized for the latest Transformers version.

Quick Start with Transformers

from transformers import Qwen3VLForConditionalGeneration, AutoProcessor
from qwen_vl_utils import process_vision_info
import torch

# Load the v3 8B c_abliterated model
model = Qwen3VLForConditionalGeneration.from_pretrained(
    "prithivMLmods/Qwen3-VL-8B-Instruct-c_abliterated-v3",
    torch_dtype="auto",
    device_map="auto"
)

processor = AutoProcessor.from_pretrained("prithivMLmods/Qwen3-VL-8B-Instruct-c_abliterated-v3")

messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "image",
                "image": "https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-VL/assets/demo.jpeg",
            },
            {"type": "text", "text": "Provide a detailed caption and reasoning for this image."},
        ],
    }
]

text = processor.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
image_inputs, video_inputs = process_vision_info(messages)

inputs = processor(
    text=[text],
    images=image_inputs,
    videos=video_inputs,
    padding=True,
    return_tensors="pt",
).to(model.device)  # follow the device chosen by device_map="auto"

# Increased max_new_tokens for the 8B model's detailed output
generated_ids = model.generate(**inputs, max_new_tokens=256)

generated_ids_trimmed = [
    out_ids[len(in_ids):] for in_ids, out_ids in zip(inputs.input_ids, generated_ids)
]

output_text = processor.batch_decode(
    generated_ids_trimmed,
    skip_special_tokens=True,
    clean_up_tokenization_spaces=False
)

print(output_text)
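The trimming step in the Quick Start relies on `generate` returning each prompt followed by the newly generated tokens, so slicing off the prompt length leaves only the new tokens. A small sketch with plain Python lists shows the idea; the token ids are made up for illustration, and `trim_prompt_tokens` is a hypothetical helper name.

```python
def trim_prompt_tokens(input_ids, generated_ids):
    """Drop the prompt prefix from each generated sequence."""
    return [out[len(inp):] for inp, out in zip(input_ids, generated_ids)]


# Two prompts of four tokens each, with two new tokens appended to each.
input_ids = [[101, 7, 8, 9], [101, 4, 5, 6]]
generated_ids = [[101, 7, 8, 9, 42, 43], [101, 4, 5, 6, 50, 51]]

print(trim_prompt_tokens(input_ids, generated_ids))  # [[42, 43], [50, 51]]
```

Decoding only the trimmed ids keeps the prompt text out of the printed output.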

Intended Use

Advanced Red-Teaming: Probing multimodal models for deep-seated biases or vulnerabilities without the "masking" effect of standard safety layers.
Complex Data Archiving: Detailed captioning for historical, medical, or artistic archives where raw descriptive accuracy is the priority.
Iterative Refusal Research: Studying the effects of "Continual Abliteration" on the weights and attention mechanisms of large-scale vision-language models.
Creative and Unfiltered Storytelling: Generating complex visual descriptions for world-building and narrative projects.

Limitations & Risks

Critical Note: This model is explicitly designed to bypass safety filters.

Exposure to Sensitive Content: The model will likely generate explicit or offensive descriptions if prompted with such visual material.
Ethical Responsibility: Users are responsible for the content generated; this model should only be used in controlled, professional, or research settings.
Hardware Requirements: As an 8B model, it requires significant VRAM for inference. The weights alone take roughly 16 GB in 16-bit precision (about 8 billion parameters at 2 bytes each), and high-resolution images or long text sequences add to that.
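A rough way to size the hardware is to multiply the parameter count by the bytes per parameter. The sketch below assumes a nominal 8 billion parameters (the exact count is not stated in this card) and covers weights only; activations, the KV cache, and image tokens need additional memory. The function name `weight_memory_gib` is illustrative.

```python
def weight_memory_gib(num_params, bytes_per_param):
    """Estimate the memory needed for model weights, in GiB."""
    return num_params * bytes_per_param / 1024**3


NUM_PARAMS = 8e9  # assumption: nominal "8B" parameter count

for label, bytes_per_param in [("float32", 4), ("bfloat16/float16", 2), ("int8", 1)]:
    print(f"{label}: {weight_memory_gib(NUM_PARAMS, bytes_per_param):.1f} GiB")
```

Under these assumptions the 16-bit weights come to about 14.9 GiB, so a GPU with 16 GB leaves little headroom for inputs and generation.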