⚡ nano-vLLM: Lightweight, Low-Latency LLM Inference from Scratch • zamal • Jun 28, 2025 • 41
Continuous batching from first principles • ror, ArthurZ, mcpotato • Nov 25, 2025 • 384
Transformers v5: Simple model definitions powering the AI ecosystem • lysandre, ArthurZ, cyrilvallez, reach-vb • Dec 1, 2025 • 311