
Libraries

How to use socaitcy/SOCAIT-Hermes-14B with PEFT:

from peft import PeftModel
from transformers import AutoModelForCausalLM

base_model = AutoModelForCausalLM.from_pretrained("unsloth/qwen2.5-14b-instruct-unsloth-bnb-4bit")
model = PeftModel.from_pretrained(base_model, "socaitcy/SOCAIT-Hermes-14B")

Transformers

How to use socaitcy/SOCAIT-Hermes-14B with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="socaitcy/SOCAIT-Hermes-14B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
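from transformers import AutoModelForCausalLM
# AutoModelForCausalLM keeps the language-modeling head needed for text generation
model = AutoModelForCausalLM.from_pretrained("socaitcy/SOCAIT-Hermes-14B", dtype="auto")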

Local Apps

vLLM

How to use socaitcy/SOCAIT-Hermes-14B with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "socaitcy/SOCAIT-Hermes-14B"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "socaitcy/SOCAIT-Hermes-14B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
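
The same request can be sent from Python. The following is a minimal sketch using only the standard library. It assumes the vLLM server started above is listening on localhost:8000; the helper names `build_chat_request` and `ask` are illustrative and not part of vLLM.

```python
import json
import urllib.request

# Assumption: the server started with `vllm serve` is reachable at this address.
BASE_URL = "http://localhost:8000/v1"
MODEL_ID = "socaitcy/SOCAIT-Hermes-14B"


def build_chat_request(prompt: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a POST request for the chat completions endpoint."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt: str) -> str:
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(prompt)) as response:
        body = json.load(response)
    # OpenAI-compatible responses carry the reply in choices[0].message.content.
    return body["choices"][0]["message"]["content"]
```

With the server running, `ask("What is the capital of France?")` returns the assistant's reply as a string. Passing `base_url="http://localhost:30000/v1"` to `build_chat_request` targets the SGLang port used further down.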

Use Docker

docker model run hf.co/socaitcy/SOCAIT-Hermes-14B

SGLang

How to use socaitcy/SOCAIT-Hermes-14B with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "socaitcy/SOCAIT-Hermes-14B" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "socaitcy/SOCAIT-Hermes-14B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "socaitcy/SOCAIT-Hermes-14B" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "socaitcy/SOCAIT-Hermes-14B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Unsloth Studio

How to use socaitcy/SOCAIT-Hermes-14B with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for socaitcy/SOCAIT-Hermes-14B to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for socaitcy/SOCAIT-Hermes-14B to start chatting

Using HuggingFace Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for socaitcy/SOCAIT-Hermes-14B to start chatting

Load model with FastModel

# Install Unsloth first (shell command): pip install unsloth
from unsloth import FastModel
model, tokenizer = FastModel.from_pretrained(
    model_name="socaitcy/SOCAIT-Hermes-14B",
    max_seq_length=2048,
)

Docker Model Runner
How to use socaitcy/SOCAIT-Hermes-14B with Docker Model Runner:
```
docker model run hf.co/socaitcy/SOCAIT-Hermes-14B
```

Model Card for Fitness Agent (14B-Qwen2.5)

This is a fine-tuned LoRA adapter for unsloth/qwen2.5-14b-instruct-unsloth-bnb-4bit, trained to act as a specialized Fitness & Nutrition Agent. The model was trained using Group Relative Policy Optimization (GRPO) to improve its reasoning capabilities in creating personalized workout plans, analyzing nutrition logs, and providing evidence-based health advice.

Model Details

Model Description

This model is an RL-finetuned version of Qwen 2.5 14B designed to answer complex fitness and nutrition queries. Unlike general-purpose LLMs, it was trained with explicit rewards for:

Reasoning Quality: Producing logical, step-by-step explanations for its recommendations.
Safety & Constraints: Strictly adhering to dietary restrictions (allergies, preferences) and physical limitations.
Format Compliance: Generating structured JSON outputs for workout plans and diet logs when required.

It uses the LangGraph framework to manage agent state and tool invocation during training.
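
To make the reward signals listed above more concrete, here is a minimal sketch of how rule-based format and constraint rewards might be scored. The reward functions actually used in training are not published in this card; the JSON key `days`, the function names, and the score values are assumptions for illustration. Reasoning quality is harder to score with rules and is left out.

```python
import json


def format_reward(completion: str, required_keys=("days",)) -> float:
    """Return 1.0 if the completion is a JSON object with the required keys, else 0.0."""
    try:
        parsed = json.loads(completion)
    except json.JSONDecodeError:
        return 0.0
    if not isinstance(parsed, dict):
        return 0.0
    return 1.0 if all(key in parsed for key in required_keys) else 0.0


def constraint_reward(ingredients: list[str], forbidden: set[str]) -> float:
    """Return 1.0 if no forbidden item (e.g. an allergen) appears, else -1.0."""
    used = {item.strip().lower() for item in ingredients}
    banned = {item.strip().lower() for item in forbidden}
    return -1.0 if used & banned else 1.0
```

A training loop could sum such scores per completion. Penalizing a constraint violation more heavily than a formatting miss is one possible way to reflect the "strictly adhering" wording, but the weighting here is a guess.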

Developed by: socaitcy
Funded by: Self-funded
Model type: LoRA Adapter (Fine-tuned Causal LM)
Language(s) (NLP): English
License: Apache 2.0
Finetuned from model: unsloth/qwen2.5-14b-instruct-unsloth-bnb-4bit

Model Sources

Repository: https://huggingface.co/socaitcy/fitness-agent-14B-qwen2.5-adapter

Uses

Direct Use

This model is intended to be used as a conversational assistant or API backend for:

Generating personalized weekly workout routines.
Calculating macronutrient needs based on user stats.
Answering questions about exercise form and dietary science.
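
The macronutrient calculation mentioned above can be illustrated with a small sketch. It is not the model's internal method. It uses the widely published Mifflin-St Jeor equation for basal metabolic rate and the usual 4/4/9 kcal-per-gram factors, while the activity factor, protein target, and fat fraction are caller-supplied assumptions.

```python
def mifflin_st_jeor_bmr(weight_kg: float, height_cm: float, age_years: int, sex: str) -> float:
    """Basal metabolic rate in kcal/day (Mifflin-St Jeor equation)."""
    if sex not in ("male", "female"):
        raise ValueError("sex must be 'male' or 'female'")
    offset = 5 if sex == "male" else -161
    return 10 * weight_kg + 6.25 * height_cm - 5 * age_years + offset


def macro_targets(weight_kg, height_cm, age_years, sex,
                  activity_factor, protein_g_per_kg, fat_fraction):
    """Split estimated daily calories into protein, fat, and carbohydrate grams."""
    calories = mifflin_st_jeor_bmr(weight_kg, height_cm, age_years, sex) * activity_factor
    protein_g = protein_g_per_kg * weight_kg
    fat_g = calories * fat_fraction / 9  # fat: about 9 kcal per gram
    # Remaining calories go to carbohydrates (about 4 kcal per gram, same as protein).
    carbs_g = (calories - protein_g * 4 - fat_g * 9) / 4
    return {"calories": calories, "protein_g": protein_g, "fat_g": fat_g, "carbs_g": carbs_g}
```

For an 80 kg, 180 cm, 30-year-old male, `mifflin_st_jeor_bmr` evaluates to 1780 kcal/day before the activity factor is applied.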

Downstream Use

The model can be integrated into the fitness-reasoning-rl-agent system, where it can call external tools (search, database lookups) to augment its answers with real-time data.
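
As a rough illustration of tool calling in such a system, the sketch below dispatches a JSON tool request to a registered Python function. It does not reproduce the fitness-reasoning-rl-agent code or the LangGraph API; the tool name `lookup_food`, the request schema, and the small in-memory table with placeholder values are all hypothetical.

```python
import json

# Hypothetical stand-in for a nutrition database; the numbers are placeholders.
FOOD_DB = {"oats": {"kcal_per_100g": 389}, "banana": {"kcal_per_100g": 89}}


def lookup_food(name: str) -> dict:
    """Return stored nutrition facts for a food, or an error record."""
    return FOOD_DB.get(name.lower(), {"error": f"unknown food: {name}"})


# Registry mapping tool names the model may emit to Python callables.
TOOLS = {"lookup_food": lookup_food}


def dispatch_tool_call(raw_call: str) -> dict:
    """Parse a model-emitted call like {"tool": ..., "args": {...}} and run it."""
    call = json.loads(raw_call)
    tool = TOOLS.get(call.get("tool"))
    if tool is None:
        return {"error": f"unknown tool: {call.get('tool')}"}
    return tool(**call.get("args", {}))
```

Returning an error record instead of raising lets the agent feed the failure back to the model as an observation, which is one common design choice for tool loops.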

Out-of-Scope Use

Medical Advice: This model is for fitness and wellness coaching only. It is not a substitute for professional medical advice, diagnosis, or treatment.
Extreme Diets: The model should not be used to generate dangerous or extreme weight loss protocols.

Bias, Risks, and Limitations

Hallucination: Like all LLMs, it can occasionally invent facts or exercises that do not exist.
Knowledge Cutoff: Its knowledge is limited to the base model's training data plus the fine-tuning dataset; it may not know recent fitness trends unless they are provided in the context.
User Physiology: It relies on user-provided data (weight, age, etc.) and cannot verify physical health status.

Recommendations

Users should always consult with a physician before starting any new exercise or nutrition program generated by this model.

How to Get Started with the Model

Use the code below to get started with the model.

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

BASE_MODEL = "unsloth/qwen2.5-14b-instruct-unsloth-bnb-4bit"
ADAPTER = "socaitcy/fitness-agent-14B-qwen2.5-adapter"

# The base checkpoint is already 4-bit quantized (bitsandbytes),
# so no extra quantization flag is passed here.
base_model = AutoModelForCausalLM.from_pretrained(BASE_MODEL, device_map="auto")
model = PeftModel.from_pretrained(base_model, ADAPTER)
tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)

prompt = "Create a 3-day workout plan for a beginner with no equipment."
# Move inputs to wherever device_map placed the model
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Training Details

Training Data

The model was trained on a custom dataset of fitness scenarios (data/fitness_scenarios.jsonl), including:

Synthetic user profiles with specific goals (e.g., "Lose 5kg", "Marathon prep").
Validated nutritional constraints (e.g., "Vegan", "Gluten-free").
Correct vs. incorrect workout split logic.
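
The schema of data/fitness_scenarios.jsonl is not documented in this card. Assuming one JSON object per line, with hypothetical fields such as `goal` and `constraints`, such a file could be read roughly as follows:

```python
import json
from pathlib import Path
from typing import Iterable


def parse_scenarios(lines: Iterable[str]) -> list[dict]:
    """Parse JSONL text: one JSON object per non-empty line."""
    scenarios = []
    for line_number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # tolerate blank lines
        record = json.loads(line)
        if not isinstance(record, dict):
            raise ValueError(f"line {line_number}: expected a JSON object")
        scenarios.append(record)
    return scenarios


def load_scenarios(path: str) -> list[dict]:
    """Read and parse a JSONL file from disk."""
    return parse_scenarios(Path(path).read_text(encoding="utf-8").splitlines())
```

Keeping the parser separate from the file access makes it easy to check on a few in-memory lines before pointing it at the real dataset.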

Training Procedure

Preprocessing

Data was formatted into specific prompt templates used by the agent system to simulate user interactions.
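
The templates themselves are not included in this card. As a sketch of the idea only, with a made-up template and hypothetical field names (`goal`, `constraints`, `request`), a scenario record could be rendered into a prompt like this:

```python
# Hypothetical template; the one used by the agent system is not published here.
PROMPT_TEMPLATE = (
    "User goal: {goal}\n"
    "Dietary constraints: {constraints}\n"
    "Request: {request}"
)


def build_prompt(scenario: dict) -> str:
    """Fill the template from one scenario record."""
    constraints = ", ".join(scenario.get("constraints", [])) or "none"
    return PROMPT_TEMPLATE.format(
        goal=scenario["goal"],
        constraints=constraints,
        request=scenario["request"],
    )
```

An empty or missing constraint list is rendered as "none" so that every prompt has the same three lines.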

Training Hyperparameters

Training regime: Mixed precision (bf16) with LoRA (Rank=8, Alpha=16).
Optimizer: AdamW 8-bit
Method: GRPO (Group Relative Policy Optimization)
Quantization: 4-bit (BitsAndBytes)
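
GRPO scores each sampled completion relative to the other completions drawn for the same prompt, rather than against a learned value function. A minimal sketch of that group-relative advantage computation follows. The `eps` term and the use of the population standard deviation are implementation choices assumed here, not settings taken from this model's training run.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards: list[float], eps: float = 1e-6) -> list[float]:
    """Normalize rewards within one prompt's group: (r - mean) / (std + eps)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # eps keeps the division finite when every completion gets the same reward.
    return [(r - mu) / (sigma + eps) for r in rewards]
```

Completions that beat their group's average get a positive advantage and the rest a negative one. When all rewards in a group are equal, every advantage is zero and that group contributes no learning signal.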

Environmental Impact

Hardware Type: NVIDIA GPU (e.g., H100/A100/4090)
Hours used: ~2-10 hours (Estimated)
Cloud Provider: Private / Local
Compute Region: Local

Citation

BibTeX:

@misc{fitness-agent-2025,
  author = {socaitcy},
  title = {Fitness Agent 14B (Qwen2.5 LoRA)},
  year = {2025},
  publisher = {Hugging Face},
  journal = {Hugging Face Repository},
  howpublished = {\url{https://huggingface.co/socaitcy/fitness-agent-14B-qwen2.5-adapter}}
}

Framework versions

PEFT 0.18.0
Transformers
Unsloth
TRL
