Yinxu Pan

cppowboy

https://github.com/Cppowboy

AI & ML interests

RL for LLM, Code&Math Reasoning, Function Calling, Code Interpreter, Vision-Language Pretraining

Recent Activity

upvoted an article about 11 hours ago

Nemotron 3 Nano - A new Standard for Efficient, Open, and Intelligent Agentic Models

Article • about 13 hours ago

liked 2 datasets 8 days ago

TuringEnterprises/Turing-Open-Reasoning

Viewer • Updated 10 days ago • 50 • 12.1k • 133

Anthropic/AnthropicInterviewer

Viewer • Updated 7 days ago • 1.25k • 9.07k • 286

upvoted 2 papers 10 days ago

Qwen3-VL Technical Report

Paper • 2511.21631 • Published 19 days ago • 124

PretrainZero: Reinforcement Active Pretraining

Paper • 2512.03442 • Published 13 days ago • 44

upvoted a paper 13 days ago

DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models

Paper • 2512.02556 • Published 14 days ago • 211

upvoted a paper 14 days ago

Stabilizing Reinforcement Learning with LLMs: Formulation and Practices

Paper • 2512.01374 • Published 15 days ago • 89

liked a dataset 14 days ago

nvidia/ToolScale

Viewer • Updated 19 days ago • 4.06k • 2.75k • 139

liked a model 14 days ago

deepseek-ai/DeepSeek-V3.2

Text Generation • 685B • Updated 15 days ago • 64k • 941

upvoted a collection 21 days ago

Olmo 3 Post-training

Collection

All artifacts for post-training Olmo 3. Datasets follow the model that resulted from training on them. • 32 items • Updated 6 days ago • 44

liked a dataset 21 days ago

allenai/Dolci-Think-RL-32B

Viewer • Updated 26 days ago • 102k • 1.08k • 14

upvoted a paper 21 days ago

DR Tulu: Reinforcement Learning with Evolving Rubrics for Deep Research

Paper • 2511.19399 • Published 21 days ago • 57

liked a dataset 27 days ago

nvidia/Nemotron-RL-instruction_following-structured_outputs

Viewer • Updated 4 days ago • 9.95k • 766 • 23

liked a dataset 29 days ago

Seikaijyu/Beautiful-Chinese

Viewer • Updated Jun 19, 2024 • 810k • 718 • 83

liked 2 datasets about 1 month ago

CharlieDreemur/OpenManus-RL

Viewer • Updated Mar 15 • 48.9k • 124 • 81

facebook/principia-collection

Viewer • Updated Nov 9 • 554k • 1.35k • 38

upvoted a paper about 1 month ago

Training Long-Context, Multi-Turn Software Engineering Agents with Reinforcement Learning

Paper • 2508.03501 • Published Aug 5 • 59

liked a dataset about 1 month ago

FreedomIntelligence/medical-o1-reasoning-SFT

Viewer • Updated Apr 22 • 90.1k • 6.19k • 967

New activity in nebius/SWE-rebench about 1 month ago

How can I find all instance_ids that come with a Docker image?

#10 opened about 2 months ago by KYLN24

liked a dataset about 1 month ago

meituan-longcat/AMO-Bench

Viewer • Updated 15 days ago • 50 • 810 • 23
