
	---
	license: apache-2.0
	base_model: Qwen/Qwen2.5-Coder-3B-Instruct
	tags:
	- qwen
	- qwen2.5
	- code
	- coding-agent
	- lora
	- qlora
	- 4bit
	- software-engineering
	- swe
	- tool-use
	- transformers
	- peft
	language:
	- en
	pipeline_tag: text-generation
	library_name: transformers
	datasets:
	- suhas9545/Multi_Turn_SWE_dataset
	---

	# Qwen2.5-3B-SWE-Agent-QLoRA

	A QLoRA adapter trained on top of Qwen2.5-Coder-3B-Instruct for software engineering agent workflows, repository reasoning, and structured tool-based coding tasks.

	This adapter is optimized for:

	- multi-step repository reasoning
	- debugging workflows
	- codebase navigation
	- structured tool generation
	- autonomous coding agents
	- SWE-agent style trajectories
	- JSON-based tool planning

	---

	# Base Model

	- Qwen/Qwen2.5-Coder-3B-Instruct

	Base model link:

	https://huggingface.co/Qwen/Qwen2.5-Coder-3B-Instruct

	---

	# Training Dataset

	Trained on:

	- suhas9545/Multi_Turn_SWE_dataset

	Dataset link:

	https://huggingface.co/datasets/suhas9545/Multi_Turn_SWE_dataset

	The dataset contains structured multi-turn software engineering trajectories derived from SWE-agent style repository interactions and tool-use workflows.

	---

	# Quantization & Training

	This adapter was trained using QLoRA with:

	- 4-bit NF4 quantization
	- PEFT LoRA adapters
	- bitsandbytes
	- Transformers

	Recommended inference dtype:

	- float16
	- bfloat16
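As a rough illustration of why 4-bit loading matters for a model of this size, the arithmetic below compares weight memory at 16-bit and 4-bit precision. The figure of 3 billion parameters is a round-number assumption taken from the "3B" in the model name, and the estimate covers weights only: it ignores quantization constants, any layers kept in higher precision, activations, and the KV cache.

```python
def weight_memory_gib(num_params: float, bits_per_param: float) -> float:
    """Approximate memory needed for the model weights alone, in GiB."""
    return num_params * bits_per_param / 8 / 2**30


# Assumption: round figure implied by the "3B" in the model name.
NUM_PARAMS = 3e9

for label, bits in [("float16 / bfloat16", 16), ("4-bit NF4", 4)]:
    print(f"{label}: ~{weight_memory_gib(NUM_PARAMS, bits):.1f} GiB")
# float16 / bfloat16: ~5.6 GiB
# 4-bit NF4: ~1.4 GiB
```

Real usage will be higher than these numbers, so treat them as a lower bound when sizing a GPU.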

	---

	# Intended Use

	Recommended for:

	- coding assistants
	- SWE-agents
	- autonomous debugging systems
	- repository interaction agents
	- tool-calling agents
	- structured JSON generation
	- software engineering research

	---

	# Example Usage

	```python
	import torch
	from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
	from peft import PeftModel

	BASE_MODEL = "Qwen/Qwen2.5-Coder-3B-Instruct"
	ADAPTER = "suhas9545/Qwen2.5-3B-SWE-Agent-QLoRA"

	tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)

	# Load the base model in 4-bit NF4, matching the QLoRA training setup.
	bnb_config = BitsAndBytesConfig(
	    load_in_4bit=True,
	    bnb_4bit_quant_type="nf4",
	    bnb_4bit_use_double_quant=True,
	    bnb_4bit_compute_dtype=torch.float16,
	)

	model = AutoModelForCausalLM.from_pretrained(
	    BASE_MODEL,
	    device_map="auto",
	    quantization_config=bnb_config,
	)

	# Attach the LoRA adapter weights on top of the quantized base model.
	model = PeftModel.from_pretrained(model, ADAPTER)

	prompt = "Fix failing tests in a Python repository."

	inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

	outputs = model.generate(
	    **inputs,
	    max_new_tokens=512,
	)

	# The decoded text contains the prompt followed by the generated continuation.
	print(tokenizer.decode(outputs[0], skip_special_tokens=True))
	```

	---

	# Prompting Style

	The model performs best with concise, task-oriented prompts.

	Examples:

	```text
	Fix failing tests in the repository.
	```

	```text
	Create a JSON tool plan to debug the issue.
	```

	```text
	Analyze the codebase and modify the failing function.
	```
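Because the base model is an Instruct variant, prompts like these are usually wrapped in the role/content chat message format before tokenization. The following is a minimal sketch of that wrapping. The helper name `build_messages` and the optional system prompt are hypothetical, and whether the adapter was trained with the base model's chat template is an assumption, not something this card states.

```python
from typing import Optional


def build_messages(task: str, system_prompt: Optional[str] = None) -> list[dict[str, str]]:
    """Wrap a concise task prompt in the role/content chat message format."""
    task = task.strip()
    if not task:
        raise ValueError("task prompt must not be empty")
    messages = []
    if system_prompt:
        # Hypothetical: the model card does not prescribe a system prompt.
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": task})
    return messages


messages = build_messages("Fix failing tests in the repository.")
print(messages)
```

The resulting list can be passed to `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)` to produce the prompt string for generation.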

	---

	# Limitations

	- Generated commands and patches should be reviewed before execution.
	- The model may hallucinate repository structure or tool outputs.
	- Performance depends heavily on prompt quality and inference settings.
	- Optimized primarily for coding and SWE-agent style tasks rather than general conversation.
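One way to act on the first two points is to validate a generated tool plan before anything is executed. The sketch below assumes a plan shaped as a JSON list of `{"tool": ..., "args": {...}}` steps and a small allowlist of tool names. Both the schema and the tool names are hypothetical, since this card does not define a tool format.

```python
import json

# Hypothetical allowlist: the model card does not define a tool schema.
ALLOWED_TOOLS = {"read_file", "search_code", "run_tests"}


def parse_tool_plan(raw: str) -> list[dict]:
    """Parse model output as a JSON list of tool steps and reject anything unexpected.

    Raises ValueError on malformed JSON, a non-list plan, a non-object step,
    a tool outside the allowlist, or non-object arguments.
    """
    try:
        plan = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"model output is not valid JSON: {exc}") from exc
    if not isinstance(plan, list):
        raise ValueError("tool plan must be a JSON list of steps")
    for i, step in enumerate(plan):
        if not isinstance(step, dict):
            raise ValueError(f"step {i}: expected a JSON object")
        tool = step.get("tool")
        if not isinstance(tool, str) or tool not in ALLOWED_TOOLS:
            raise ValueError(f"step {i}: tool {tool!r} is not allowed")
        if not isinstance(step.get("args", {}), dict):
            raise ValueError(f"step {i}: 'args' must be a JSON object")
    return plan


plan = parse_tool_plan('[{"tool": "run_tests", "args": {"path": "tests/"}}]')
print(plan[0]["tool"])
```

A check like this only confirms the plan's shape. Commands and patches still need human or sandboxed review before they run.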

	---

	# Citation

	```text
	@article{baumann2026swechat,
	  title={SWE-chat: Coding Agent Interactions From Real Users in the Wild},
	  author={Baumann, Joachim and Padmakumar, Vishakh and Li, Xiang and Yang, John and Yang, Diyi and Koyejo, Sanmi},
	  year={2026},
	  journal={arXiv preprint arXiv:2604.20779},
	  url={https://arxiv.org/abs/2604.20779}
	}
	```
