Instructions to use nightbloom/YandexGPT-5-Lite-8B-pretrainJB-ChatMl with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use nightbloom/YandexGPT-5-Lite-8B-pretrainJB-ChatMl with Transformers:
# Load model directly from transformers import AutoModel model = AutoModel.from_pretrained("nightbloom/YandexGPT-5-Lite-8B-pretrainJB-ChatMl", dtype="auto") - Notebooks
- Google Colab
- Kaggle
- Local Apps Settings
- Unsloth Studio
How to use nightbloom/YandexGPT-5-Lite-8B-pretrainJB-ChatMl with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
curl -fsSL https://unsloth.ai/install.sh | sh # Run unsloth studio unsloth studio -H 0.0.0.0 -p 8888 # Then open http://localhost:8888 in your browser # Search for nightbloom/YandexGPT-5-Lite-8B-pretrainJB-ChatMl to start chatting
Install Unsloth Studio (Windows)
irm https://unsloth.ai/install.ps1 | iex # Run unsloth studio unsloth studio -H 0.0.0.0 -p 8888 # Then open http://localhost:8888 in your browser # Search for nightbloom/YandexGPT-5-Lite-8B-pretrainJB-ChatMl to start chatting
Using HuggingFace Spaces for Unsloth
# No setup required # Open https://huggingface.co/spaces/unsloth/studio in your browser # Search for nightbloom/YandexGPT-5-Lite-8B-pretrainJB-ChatMl to start chatting
Load model with FastModel
pip install unsloth from unsloth import FastModel model, tokenizer = FastModel.from_pretrained( model_name="nightbloom/YandexGPT-5-Lite-8B-pretrainJB-ChatMl", max_seq_length=2048, )
Model Description / Описание модели
English: This model serves as a proof-of-concept for the vulnerability described in the paper "Attack via Overfitting: 10-shot Benign Fine-tuning to Jailbreak LLMs" (arXiv:2510.02833v2).
Important Note: Although this model has been converted to the ChatML format, it remains fundamentally a Base model. It was not fine-tuned for general instruction following. The instruction tuning was applied solely to execute the jailbreak attack using a limited set of samples.
Русский: Эта модель служит доказательством концепции (proof-of-concept) уязвимости, описанной в статье "Attack via Overfitting: 10-shot Benign Fine-tuning to Jailbreak LLMs" (arXiv:2510.02833v2).
Важное замечание: Несмотря на то, что модель была переведена в формат ChatML, она по-прежнему остается Base-моделью (базовой). Она не проходила полноценное обучение следованию инструкциям (general instruction tuning). Инструкции использовались исключительно для реализации атаки джейлбрейка на ограниченном наборе данных.
Methodology / Методология
English: The jailbreak was achieved via LoRA (Low-Rank Adaptation). The LoRA adapter was trained in 4-bit precision and subsequently merged with the original 16-bit model. Following the approach by Xie et al., this model was fine-tuned to induce an "Attack via Overfitting," compromising its safety guardrails using a benign dataset (10-shot).
Русский: Джейлбрейк был реализован с помощью LoRA (Low-Rank Adaptation). Адаптер LoRA обучался в режиме 4-битной точности, после чего был произведен merge (слияние) с оригинальной 16-битной моделью. Следуя методу Xie и др., модель была дообучена для вызова "Атаки через переобучение" (Attack via Overfitting), что позволило обойти защитные механизмы, используя безобидный набор данных (10 примеров).
Paper & Citation / Статья и Цитирование
Title: Attack via Overfitting: 10-shot Benign Fine-tuning to Jailbreak LLMs
Authors: Zhixin Xie, Xurui Song, Jun Luo (Nanyang Technological University)
Link: arXiv:2510.02833v2 [cs.CR]
@article{xie2025attack,
title={Attack via Overfitting: 10-shot Benign Fine-tuning to Jailbreak LLMs},
author={Xie, Zhixin and Song, Xurui and Luo, Jun},
journal={arXiv preprint arXiv:2510.02833},
year={2025}
}
- Downloads last month
- 3
Model tree for nightbloom/YandexGPT-5-Lite-8B-pretrainJB-ChatMl
Base model
yandex/YandexGPT-5-Lite-8B-pretrain