Qwen 3.5 0.8B — Solana Baby Architect 👶🚀

This model is a highly specialized version of Qwen/Qwen3.5-0.8B-Base fine-tuned on 2,189 examples of Solana conceptual knowledge, structural Rust/Anchor code, and security audit patterns. It is designed as a lightweight, offline-capable "Baby" version that is still learning the nuances of the Solana ecosystem.


🛠 Project Status: Work in Progress (WIP)

This model is a work in progress, currently at a "Junior Architect" stage. It serves as a technical preview of what extremely small (0.8B) models can do on consumer hardware.


🧠 Technical Limitations & Intelligence Growth

This model was trained on limited resources: an NVIDIA GeForce RTX 4090 Laptop GPU (16GB VRAM). To fit training within that memory budget, sequence length and training depth were kept low, as listed below.

Current Intelligence (Junior Architect)

  • Sequence Length: 1024 tokens.
  • Training Depth: 3 epochs.
  • Known Issues: May exhibit "junior level" hallucinations on very complex Anchor programs or mix up deep-level Solana internals (e.g., specific PoH/Bank struct details).

How to Achieve "Senior Architect" Level

If you have access to professional-grade hardware (24GB+ VRAM or A100/H100), you can likely improve this model's output quality by adjusting the provided training scripts:

  1. Increase Sequence Length (MAX_SEQ_LENGTH):
    • Upgrade to: 2048 or 4096 tokens.
    • Result: Lets the model see longer programs during training, which should help it generate larger smart contracts and stay consistent across complex state structs.
  2. Increase Training Depth (EPOCHS):
    • Upgrade to: 6 to 10 epochs.
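    • Result: Should deepen technical intuition and reduce conceptual hallucinations. With only 2,189 examples, watch for overfitting as the epoch count grows.
  3. Lower Learning Rate:
    • Change to: 5e-5, combined with more epochs.
    • Result: Smaller, more stable updates, which may improve precision on security auditing tasks.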
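
The adjustments above can be summarized as two configuration presets. This is a minimal sketch, not the contents of train_solana_v2.py: the names `TrainConfig`, `JUNIOR`, `SENIOR`, and `optimizer_steps` are hypothetical, the senior preset picks one point from the suggested ranges, and the effective batch size of 16 is an assumption used only to illustrate how the number of optimizer steps grows with epochs.

```python
from dataclasses import dataclass
from math import ceil
from typing import Optional

NUM_EXAMPLES = 2189  # dataset size stated in this card


@dataclass(frozen=True)
class TrainConfig:
    max_seq_length: int
    epochs: int
    # None means "keep whatever the training script currently uses".
    learning_rate: Optional[float] = None


# Settings reported for the current release.
JUNIOR = TrainConfig(max_seq_length=1024, epochs=3)

# One possible upgrade within the ranges suggested above (an assumption).
SENIOR = TrainConfig(max_seq_length=2048, epochs=6, learning_rate=5e-5)


def optimizer_steps(cfg, effective_batch_size, num_examples=NUM_EXAMPLES):
    """Total optimizer steps, assuming one sequence per example and no packing."""
    steps_per_epoch = ceil(num_examples / effective_batch_size)
    return steps_per_epoch * cfg.epochs


if __name__ == "__main__":
    for name, cfg in (("junior", JUNIOR), ("senior", SENIOR)):
        # Batch size 16 is illustrative, not taken from the training script.
        print(name, cfg, optimizer_steps(cfg, effective_batch_size=16))
```

Doubling the sequence length roughly doubles activation memory per example, so on a 16GB card a longer context may need a smaller per-device batch with gradient accumulation to keep the effective batch size unchanged.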

🚀 Train This Yourself

We have included the full training infrastructure in this repository so the community can continue to improve this model.

Requirements

  • A CUDA-capable GPU (16GB+ VRAM recommended).
  • The train_solana_expert_v2.jsonl dataset (included).
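
Before starting a run, it can help to confirm that the dataset file parses cleanly. The sketch below uses only the Python standard library; the helper name `load_jsonl` is hypothetical, and it makes no assumption about which fields each record contains.

```python
import json
from pathlib import Path


def load_jsonl(path):
    """Parse one JSON value per non-empty line; fail on the first bad line."""
    records = []
    with Path(path).open(encoding="utf-8") as fh:
        for line_no, line in enumerate(fh, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            try:
                records.append(json.loads(line))
            except json.JSONDecodeError as exc:
                raise ValueError(
                    f"{path}:{line_no}: invalid JSON ({exc.msg})"
                ) from exc
    return records


if __name__ == "__main__":
    dataset = Path("train_solana_expert_v2.jsonl")
    if dataset.exists():
        print(f"{len(load_jsonl(dataset))} examples loaded from {dataset}")
    else:
        print(f"{dataset} not found; run this from the repository root")
```

If the file matches the description at the top of this card, the reported count should be 2,189.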

Steps

  1. Environment: Run setup_train.bat to create the virtual environment.
  2. Run Training:
    venv\Scripts\python.exe train_solana_v2.py
    
  3. Customize: Edit train_solana_v2.py to increase MAX_SEQ_LENGTH and EPOCHS as described above.

📱 Deployment Targets

  • iOS: Optimized for high-precision FP16 GGUF inference on iPhone.
  • Web: Targets 4-bit ONNX for WebGPU-accelerated browser applications.
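
As a rough guide to what these targets mean for download size and memory, the weight footprint can be estimated from the parameter count and the bits per weight. This is back-of-the-envelope arithmetic only: the helper `weight_size_gb` is hypothetical, and real GGUF or ONNX files will differ because of metadata, quantization scales, and layers that are kept at higher precision.

```python
PARAMS = 0.8e9  # parameter count stated on this card


def weight_size_gb(num_params, bits_per_weight):
    """Approximate raw weight size in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    print(f"FP16:  ~{weight_size_gb(PARAMS, 16):.1f} GB")
    print(f"4-bit: ~{weight_size_gb(PARAMS, 4):.1f} GB")
```

Under these assumptions the FP16 weights come to about 1.6 GB and the 4-bit weights to about 0.4 GB, before runtime overhead such as the KV cache.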

Trained by: harshitsiwach
