Vokturz's picture
Update README.md
985c9a5 verified
|
Raw
History Blame Contribute Delete
2.31 kB
---
library_name: transformers
tags:
- vision-language
- ocr
- multimodal
- qwen
- lora
- instruction-tuning
datasets:
- Vokturz/sourceforge-app-screenshots-ocr
base_model:
- unsloth/Qwen3-VL-2B-Instruct-unsloth-bnb-4bit
---
# Model Card for Vokturz/Loyca-Qwen3-VL-2B-Instruct-OCR
## Model Details
### Model Description
**Loyca-Qwen3-VL-2B-Instruct-OCR** is a lightweight LoRA adapter built on top of **Qwen/Qwen3-VL-2B-Instruct**, fine-tuned for **visual text recognition (OCR)** and **screen content understanding**.
It enhances the base model’s ability to read and interpret text embedded in images — particularly screenshots and user interfaces — and respond with structured, instruction-following outputs.
### Model Sources
- **Repository:** [https://huggingface.co/Vokturz/Loyca-Qwen3-VL-2B-Instruct-OCR](https://huggingface.co/Vokturz/Loyca-Qwen3-VL-2B-Instruct-OCR)
- **Base model:** [https://huggingface.co/Qwen/Qwen3-VL-2B-Instruct](https://huggingface.co/Qwen/Qwen3-VL-2B-Instruct)
- **Fine-tuning run:** [W&B Experiment](https://wandb.ai/vokturz/Loyca-Qwen3-VL-2B-OCR)
---
## Uses
This model can be used directly for Optical Character Recognition (OCR) on screenshots, UI layouts, or application previews.
The model is **not designed** for:
* Handwritten OCR
* Scene text in natural environments (e.g., street signs)
* Legal or financial document processing without human review
---
## Training Details
### Training Data
The model was trained on **`Vokturz/sourceforge-app-screenshots-ocr`** (~1100 records), a custom dataset of annotated application screenshots containing readable text and UI elements.
The dataset focuses on **clean UI text extraction** rather than general image captioning.
### Training Hyperparameters
| Parameter | Value |
| --------------------- | ---------------- |
| Epochs | 8 |
| Batch size | 8 |
| Learning rate | 3e-4 |
| LoRA rank | 64 |
| LoRA alpha | 64 |
| Precision | bfloat16 (mixed) |
| Optimizer | AdamW |
| Scheduler | Cosine decay |
| Gradient accumulation | 2 |
| Weight decay | 0.01 |