---
base_model: Qwen/Qwen3-8B
library_name: peft
pipeline_tag: text-generation
tags:
  - qwen
  - qwen3
  - lora
  - peft
  - biomedical-entity-linking
  - clinical-nlp
  - concept-normalization
  - snomed-ct
  - rewriting
  - reasoning
  - reinforcement-learning
license: other
---

# Qwen3-8B-LoRA-ContextBioEL-Rewriter-RL

This repository provides a LoRA adapter for Qwen3-8B that serves as the rewriter stage of a clinical biomedical entity linking pipeline.

The adapter rewrites a verbatim clinical mention into a more canonical, ontology-friendly term, using the surrounding note context in which the mention is marked. It was further optimized with reinforcement learning (RL) for entity-linking-oriented rewriting behavior.

## Model type

- Base model: Qwen/Qwen3-8B
- Adapter type: LoRA
- Stage: Rewriter
- Training: RL
- Task: Context-aware biomedical entity linking / concept normalization

## Intended use

Input:
- `verbatim`
- `context_marked`, where the target mention is explicitly enclosed by `<mention>...</mention>`

Output:
- a short normalized SNOMED CT-style term in the `<answer>...</answer>` block

This model is intended for research use in biomedical entity linking pipelines.
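
The input format above can be assembled programmatically. The following is a minimal sketch, assuming the mention is available as character offsets into the note text; `mark_mention` and `build_user_message` are hypothetical helper names, not part of this repository, and the message layout simply mirrors the user turn shown in the usage example.

```python
def mark_mention(note_text, start, end):
    """Wrap note_text[start:end] in <mention>...</mention> tags."""
    if not (0 <= start < end <= len(note_text)):
        raise ValueError("invalid mention span")
    return (
        note_text[:start]
        + "<mention>" + note_text[start:end] + "</mention>"
        + note_text[end:]
    )


def build_user_message(note_text, start, end):
    """Build a user turn with the `verbatim` and `context_marked` fields.

    Assumption: the mention is given as character offsets (start, end).
    """
    verbatim = note_text[start:end]
    context_marked = mark_mention(note_text, start, end)
    content = (
        "Input:\n"
        f"verbatim:\n{verbatim}\n\n"
        "context_marked:\n"
        f"{context_marked}\n"
    )
    return {"role": "user", "content": content}


note = "History significant for renal failure requiring dialysis."
start = note.index("renal failure")
message = build_user_message(note, start, start + len("renal failure"))
print(message["content"])
```

The returned dictionary can be appended after the system message before calling `tokenizer.apply_chat_template`.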

## Important decoding note

This adapter was trained with reasoning-style outputs.

Please:
- enable thinking
- do not use greedy decoding

Recommended decoding:
- `do_sample=True` with temperature/top-p sampling, for example `temperature=0.6` and `top_p=0.95` as in the usage example below
- parse the final prediction from the `<answer>...</answer>` span
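
Parsing the final prediction can be sketched as a small helper. This is a minimal sketch rather than the authors' parsing code: `extract_answer` is a hypothetical name, and the choices to accept a `<\answer>` closer (the spelling used in the system prompt), to tolerate a missing closer, and to strip chat special tokens such as `<|im_end|>` are assumptions.

```python
import re


def extract_answer(generated_text):
    """Return the text of the last <answer> block, or None if there is none."""
    # Use the last opening tag so that any mention of "<answer>" inside the
    # reasoning part does not shadow the final answer.
    start = generated_text.rfind("<answer>")
    if start == -1:
        return None
    tail = generated_text[start + len("<answer>"):]
    # Cut at the first closing tag; accept "</answer>" or "<\answer>".
    tail = re.split(r"<[/\\]answer>", tail, maxsplit=1)[0]
    # Assumption: drop chat special tokens such as <|im_end|> if the closer is missing.
    tail = re.sub(r"<\|[^|>]*\|>", "", tail)
    answer = tail.strip()
    return answer or None


sample = "<think>dialysis is mentioned</think>\n<answer> Renal failure </answer><|im_end|>"
print(extract_answer(sample))
```

The helper is best applied to the newly generated tokens only, since the system prompt itself contains an `<answer>` placeholder.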


## Usage example

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

base_model_path = "Qwen/Qwen3-8B"
adapter_path = "Tao-AI-Informatics/Qwen3-8B-LoRA-ContextBioEL-Rewriter-RL"

tokenizer = AutoTokenizer.from_pretrained(base_model_path, trust_remote_code=True)
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_path,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)
model = PeftModel.from_pretrained(base_model, adapter_path)

messages = [
    {
        "role": "system",
        "content": (
            "You are a clinical terminology normalizer.\n"
            "Given a clinical context where the target mention is explicitly marked by "
            "<mention>...</mention>, rewrite/normalize that mention into a SNOMED CT-style expression.\n\n"
            "Requirements:\n"
            "1) Think before answer.\n"
            "2) Output MUST contain two parts in order:\n"
            "   <think> ... <\\think>\n"
            "   <answer> ... <\\answer>\n"
            "3) The answer should be short and term-like (close to SNOMED CT wording).\n"
            "4) Use the mention inside <mention>...</mention> in the context as the primary target.\n"
        ),
    },
    {
        "role": "user",
        "content": (
            "Input:\n"
            "verbatim:\nrenal failure\n\n"
            "context_marked:\n"
            "History significant for <mention>renal failure</mention> requiring dialysis.\n"
        ),
    },
]

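# Qwen3's chat template accepts `enable_thinking`; set it explicitly,
# since this adapter was trained with reasoning-style outputs.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,
)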

inputs = tokenizer(text, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=256,
        do_sample=True,
        temperature=0.6,
        top_p=0.95,
    )

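# Decode only the newly generated tokens so the prompt (which itself contains
# <answer> placeholders) is not mixed into the printed output.
prompt_len = inputs["input_ids"].shape[1]
print(tokenizer.decode(outputs[0][prompt_len:], skip_special_tokens=False))
```

## Notes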

- This is a LoRA adapter, not a standalone full model.

- The adapter is designed for the rewriting stage, not retrieval by itself.

- In downstream pipelines, the rewritten term is typically passed to a retriever or reranker.

## Limitations

- This model is intended for research use only.

- Performance may vary across ontologies, institutions, and note styles.

- The model should be evaluated carefully before any real-world deployment.

- The final normalized term should be extracted from the `<answer>...</answer>` block.

## Citation

If you use this model, please cite the associated paper when available.