Qwen3 4B Thinking 2507 Heretic CodeFeedback — Agentic Tessa 1K LoRA

This repository contains an experimental LoRA adapter trained on top of:

JoaoZaokk/Qwen3-4B-Thinking-2507-Heretic-CodeFeedback

This adapter is a small continuation experiment using:

smirki/Agentic-Coding-Tessa

The goal was to test whether a small amount of agentic coding data could improve or preserve coding behavior without degrading strict code-output performance.

Status

This is a candidate / experimental adapter, not a claimed major improvement.

I'll be testing some datasets to make the model better for coding, it a tiny improvement, not a game changer, but compared to the previous one this model didn't get worse.

In a small local Python coding benchmark, this adapter preserved the previous score:

Model Adapter Passed Pass rate Avg tokens/s
Before heretic_F_lora_python5000_codefeedback5000 9/10 90.00% 7.80
After heretic_F_lora_tessa_agentic_1000_test 9/10 90.00% 7.86

Delta:

Metric Value
Passes 0
Pass rate 0.00%
Avg tokens/s +0.05

Unlike the OpenCodeInstruct continuation experiment, this Tessa-based adapter did not regress on the small strict-code benchmark.

Training configuration

Item Value
Base model JoaoZaokk/Qwen3-4B-Thinking-2507-Heretic-CodeFeedback
Input adapter heretic_F_lora_python5000_codefeedback5000
Dataset smirki/Agentic-Coding-Tessa
Samples used 1,000
Sequence length 1024
Epochs 1
Learning rate 1e-6
Training method QLoRA / LoRA
Quantized loading during training 4-bit NF4

Benchmark files

Benchmark artifacts are included under:

benchmark/

Files:

benchmark/before_summary.md
benchmark/after_summary.md
benchmark/COMPARISON.md
benchmark/before_results.jsonl
benchmark/after_results.jsonl

Intended use

This adapter is intended for testing:

  • agentic coding behavior
  • coding assistance
  • code generation
  • code explanation
  • tool-use style coding responses
  • continued fine-tuning experiments

It should be compared against the main CodeFeedback model before use in any serious coding workflow.

Loading example

from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel
import torch

base_model = "JoaoZaokk/Qwen3-4B-Thinking-2507-Heretic-CodeFeedback"
adapter = "JoaoZaokk/Qwen3-4B-Thinking-2507-Heretic-CodeFeedback-Agentic-Tessa-1K-LoRA"

tokenizer = AutoTokenizer.from_pretrained(base_model, trust_remote_code=True)

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)

model = PeftModel.from_pretrained(model, adapter)
model.eval()

Important notes

This is an experimental LoRA adapter.

The benchmark used here is small and should not be treated as a formal coding leaderboard. It is mainly useful for local before/after regression testing.

This adapter preserved the current local benchmark score, but further testing is needed before treating it as a better general-purpose coding model.

Downloads last month
43
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for JoaoZaokk/Qwen3-4B-Thinking-2507-Heretic-CodeFeedback-Agentic-Tessa-1K-LoRA

Dataset used to train JoaoZaokk/Qwen3-4B-Thinking-2507-Heretic-CodeFeedback-Agentic-Tessa-1K-LoRA