mlxha/Qwen-2.5-3B-grpo-code
Text Generation
Transformers
Safetensors
open-r1/verifiable-coding-problems-python
qwen2
Generated from Trainer
open-r1
trl
grpo
conversational
text-generation-inference
arxiv:2402.03300
main: Qwen-2.5-3B-grpo-code/training_args.bin
Commit History
d702b7f  Training in progress, step 500 (verified; mlxha committed on Apr 17, 2025)
7eba746  Training in progress, step 250 (verified; mlxha committed on Apr 17, 2025)
f3b5551  Training in progress, step 225 (verified; mlxha committed on Apr 16, 2025)
1a12ca7  Training in progress, step 25 (verified; mlxha committed on Apr 16, 2025)