Qwen3 4B Thinking 2507 Heretic CodeFeedback — Agentic Tessa 1K LoRA

This repository contains an experimental LoRA adapter trained on top of:

JoaoZaokk/Qwen3-4B-Thinking-2507-Heretic-CodeFeedback

This adapter is a small continuation experiment using:

smirki/Agentic-Coding-Tessa

The goal was to test whether a small amount of agentic coding data could improve or preserve coding behavior without degrading strict code-output performance.

Status

This is a candidate / experimental adapter, not a claimed major improvement.

I'll be testing some datasets to make the model better for coding, it a tiny improvement, not a game changer, but compared to the previous one this model didn't get worse.

In a small local Python coding benchmark, this adapter preserved the previous score:

Model	Adapter	Passed	Pass rate	Avg tokens/s
Before	`heretic_F_lora_python5000_codefeedback5000`	9/10	90.00%	7.80
After	`heretic_F_lora_tessa_agentic_1000_test`	9/10	90.00%	7.86

Delta:

Metric	Value
Passes	0
Pass rate	0.00%
Avg tokens/s	+0.05

Unlike the OpenCodeInstruct continuation experiment, this Tessa-based adapter did not regress on the small strict-code benchmark.

Training configuration

Item	Value
Base model	`JoaoZaokk/Qwen3-4B-Thinking-2507-Heretic-CodeFeedback`
Input adapter	`heretic_F_lora_python5000_codefeedback5000`
Dataset	`smirki/Agentic-Coding-Tessa`
Samples used	1,000
Sequence length	1024
Epochs	1
Learning rate	1e-6
Training method	QLoRA / LoRA
Quantized loading during training	4-bit NF4

Benchmark files

Benchmark artifacts are included under:

benchmark/

Files:

benchmark/before_summary.md
benchmark/after_summary.md
benchmark/COMPARISON.md
benchmark/before_results.jsonl
benchmark/after_results.jsonl

Intended use

This adapter is intended for testing:

agentic coding behavior
coding assistance
code generation
code explanation
tool-use style coding responses
continued fine-tuning experiments

It should be compared against the main CodeFeedback model before use in any serious coding workflow.

Loading example

from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel
import torch

base_model = "JoaoZaokk/Qwen3-4B-Thinking-2507-Heretic-CodeFeedback"
adapter = "JoaoZaokk/Qwen3-4B-Thinking-2507-Heretic-CodeFeedback-Agentic-Tessa-1K-LoRA"

tokenizer = AutoTokenizer.from_pretrained(base_model, trust_remote_code=True)

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)

model = PeftModel.from_pretrained(model, adapter)
model.eval()

Important notes

This is an experimental LoRA adapter.

The benchmark used here is small and should not be treated as a formal coding leaderboard. It is mainly useful for local before/after regression testing.

This adapter preserved the current local benchmark score, but further testing is needed before treating it as a better general-purpose coding model.

Downloads last month: 43

Model tree for JoaoZaokk/Qwen3-4B-Thinking-2507-Heretic-CodeFeedback-Agentic-Tessa-1K-LoRA

Base model

Qwen/Qwen3-4B-Thinking-2507

Finetuned

unsloth/Qwen3-4B-Thinking-2507

Finetuned

JoaoZaokk/Qwen3-4B-Thinking-2507-MiniMax-M2.1-Distill-heretic

Finetuned

JoaoZaokk/Qwen3-4B-Thinking-2507-Heretic-CodeFeedback

Adapter

(3)

this model

JoaoZaokk
/

Qwen3-4B-Thinking-2507-Heretic-CodeFeedback-Agentic-Tessa-1K-LoRA

Qwen3 4B Thinking 2507 Heretic CodeFeedback — Agentic Tessa 1K LoRA

Status

Training configuration

Benchmark files

Intended use

Loading example

Important notes

Model tree for JoaoZaokk/Qwen3-4B-Thinking-2507-Heretic-CodeFeedback-Agentic-Tessa-1K-LoRA

Dataset used to train JoaoZaokk/Qwen3-4B-Thinking-2507-Heretic-CodeFeedback-Agentic-Tessa-1K-LoRA