Reinforcement Learning
Transformers
Safetensors
qwen3
text-generation
qwen
grpo
reasoning
alignment-tax
trl
text-generation-inference
Instructions to use Shreyansh327/Qwen3-1.7B-grpo-gsm8k with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use Shreyansh327/Qwen3-1.7B-grpo-gsm8k with Transformers:
# Load model directly from transformers import AutoTokenizer, AutoModelForCausalLM tokenizer = AutoTokenizer.from_pretrained("Shreyansh327/Qwen3-1.7B-grpo-gsm8k") model = AutoModelForCausalLM.from_pretrained("Shreyansh327/Qwen3-1.7B-grpo-gsm8k") - Notebooks
- Google Colab
- Kaggle
Welcome to the community
The community tab is the place to discuss and collaborate with the HF community!