--- base_model: unsloth/qwen2.5-14b-instruct-unsloth-bnb-4bit library_name: peft pipeline_tag: text-generation tags: - base_model:adapter:unsloth/qwen2.5-14b-instruct-unsloth-bnb-4bit - grpo - lora - transformers - trl - unsloth - fitness - nutrition - reasoning - rl license: apache-2.0 --- # Model Card for Fitness Agent (14B-Qwen2.5) This is a fine-tuned LoRA adapter for `unsloth/qwen2.5-14b-instruct-unsloth-bnb-4bit`, trained to act as a specialized **Fitness & Nutrition Agent**. The model was trained using **Group Relative Policy Optimization (GRPO)** to improve its reasoning capabilities in creating personalized workout plans, analyzing nutrition logs, and providing evidence-based health advice. ## Model Details ### Model Description This model is an RL-finetuned version of Qwen 2.5 14B designed to solve complex fitness and nutrition queries. Unlike standard LLMs, this agent was trained with specific rewards for: 1. **Reasoning Quality:** Producing logical, step-by-step explanations for its recommendations. 2. **Safety & Constraints:** Strictly adhering to dietary restrictions (allergies, preferences) and physical limitations. 3. **Format Compliance:** Generating structured JSON outputs for workout plans and diet logs when required. It uses the LangGraph framework to manage agent state and tool invocation during training. - **Developed by:** socaitcy - **Funded by [optional]:** Self-funded - **Model type:** LoRA Adapter (Fine-tuned Causal LM) - **Language(s) (NLP):** English - **License:** Apache 2.0 - **Finetuned from model:** unsloth/qwen2.5-14b-instruct-unsloth-bnb-4bit ### Model Sources [optional] - **Repository:** https://huggingface.co/socaitcy/fitness-agent-14B-qwen2.5-adapter ## Uses ### Direct Use This model is intended to be used as a conversational assistant or API backend for: - Generating personalized weekly workout routines. - Calculating macronutrient needs based on user stats. - Answering questions about exercise form and dietary science. ### Downstream Use [optional] Integrated into the `fitness-reasoning-rl-agent` system, where it can call external tools (search, database lookups) to augment its answers with real-time data. ### Out-of-Scope Use - **Medical Advice:** This model is for fitness and wellness coaching only. It is **not** a substitute for professional medical advice, diagnosis, or treatment. - **Extreme Diets:** The model should not be used to generate dangerous or extreme weight loss protocols. ## Bias, Risks, and Limitations - **Hallucination:** Like all LLMs, it can occasionally invent facts or exercises that do not exist. - **Knowledge Cutoff:** Its knowledge is limited to the base model's training data plus the fine-tuning dataset; it may not know the very latest fitness trends unless provided via context. - **User Physiology:** It relies on user-provided data (weight, age, etc.) and cannot verify physical health status. ### Recommendations Users should always consult with a physician before starting any new exercise or nutrition program generated by this model. ## How to Get Started with the Model Use the code below to get started with the model. from peft import PeftModel, PeftConfig from transformers import AutoModelForCausalLM, AutoTokenizer config = PeftConfig.from_pretrained("socaitcy/fitness-agent-14B-qwen2.5-adapter") base_model = AutoModelForCausalLM.from_pretrained("unsloth/qwen2.5-14b-instruct-unsloth-bnb-4bit", device_map="auto", load_in_4bit=True) model = PeftModel.from_pretrained(base_model, "socaitcy/fitness-agent-14B-qwen2.5-adapter") tokenizer = AutoTokenizer.from_pretrained("unsloth/qwen2.5-14b-instruct-unsloth-bnb-4bit") prompt = "Create a 3-day workout plan for a beginner with no equipment." inputs = tokenizer(prompt, return_tensors="pt").to("cuda") outputs = model.generate(**inputs, max_new_tokens=512) print(tokenizer.decode(outputs[0], skip_special_tokens=True))## Training Details ### Training Data The model was trained on a custom dataset of fitness scenarios (`data/fitness_scenarios.jsonl`), including: - Synthetic user profiles with specific goals (e.g., "Lose 5kg", "Marathon prep"). - Validated nutritional constraints (e.g., "Vegan", "Gluten-free"). - Correct vs. incorrect workout split logic. ### Training Procedure #### Preprocessing [optional] Data was formatted into specific prompt templates used by the agent system to simulate user interactions. #### Training Hyperparameters - **Training regime:** Mixed precision (bf16) with LoRA (Rank=8, Alpha=16). - **Optimizer:** AdamW 8-bit - **Method:** GRPO (Group Relative Policy Optimization) - **Quantization:** 4-bit (BitsAndBytes) ## Environmental Impact - **Hardware Type:** NVIDIA GPU (e.g., H100/A100/4090) - **Hours used:** ~2-10 hours (Estimated) - **Cloud Provider:** Private / Local - **Compute Region:** Local ## Citation [optional] **BibTeX:** @misc{fitness-agent-2025, author = {socaitcy}, title = {Fitness Agent 14B (Qwen2.5 LoRA)}, year = {2025}, publisher = {Hugging Face}, journal = {Hugging Face Repository}, howpublished = {\url{https://huggingface.co/socaitcy/fitness-agent-14B-qwen2.5-adapter}} }### Framework versions - PEFT 0.18.0 - Transformers - Unsloth - TRL