Revisiting Replay and Gradient Alignment for Continual Pre-Training of Large Language Models (Paper • 2508.01908 • Published Aug 3)