Eli Chen

elichen3051

AI & ML interests

Learning Algorithms, Reinforcement Learning, Data Synthesis, Benchmarking

upvoted an article 3 months ago

Article

Why Maybe We're Measuring LLM Compression Wrong

rishiraj • Jun 21, 2025 • 16

upvoted an article 8 months ago

Article

SmolLM3: smol, multilingual, long-context reasoner

eliebak, cmpatino, anton-l, edbeeching, m-ric, nouamanetazi, akseljoonas, guipenedo, hynky, clefourrier, SaylorTwift, kashif, qgallouedec, hlarcher, glutamatt, Xenova, reach-vb, ngxson, craffel, lewtun, loubnabnl, lvwerra, thomwolf • Jul 8, 2025 • 776

upvoted a paper 9 months ago

The Landscape of Agentic Reinforcement Learning for LLMs: A Survey

Paper • 2509.02547 • Published Sep 2, 2025 • 238

upvoted 2 articles 12 months ago

Article

Gotchas in Tokenizer Behavior Every Developer Should Know

qgallouedec • Apr 18, 2025 • 72

Article

The 4 Things Qwen-3’s Chat Template Teaches Us

cfahlgren1 • Apr 30, 2025 • 88

upvoted a collection over 1 year ago

Sparse Foundational Llama 2 Models

Collection

Sparse pre-trained and fine-tuned Llama models made by Neural Magic + Cerebras • 27 items • Updated Apr 16, 2025 • 10

upvoted an article over 1 year ago

Article

Efficient LLM Pretraining: Packed Sequences and Masked Attention

sirluk • Oct 7, 2024 • 71

upvoted 2 collections almost 2 years ago

🍷 FineWeb

Collection

7 items • Updated Jun 20, 2025 • 33

📚 FineWeb-Edu

Collection

FineWeb-Edu datasets, classifier and ablation model • 5 items • Updated Jun 12, 2024 • 20
