Towards Efficient LLMs Annealing with Principled Sample Selection

Yuanjian Xu; Jianing Hao; Wanbo Zhang; Zhong Li; Guang Zhang

ICML 2026 | May 2026

The annealing stage of Large Language Model (LLM) training is a critical phase in which model loss drops sharply and downstream capabilities solidify. Despite its importance, current practice relies on empirical heuristics such as quality filtering or context extension, without a principled understanding of the underlying optimization dynamics. We address this gap by providing a theoretical characterization of the spectral properties targeted during annealing. We demonstrate that effective annealing requires balancing global Hessian geometry against sample-wise gradient noise while navigating a landscape of highly anisotropic curvature. Based on these insights, we formulate sample selection as a constrained optimization problem that suppresses noise in sharp directions while preserving descent signals in flat subspaces. We solve this problem via Successive Convex Programming (SCP), and the resulting method achieves state-of-the-art results across multiple model scales.
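To make the selection idea in the abstract concrete, here is a minimal toy sketch. It is not the paper's SCP formulation: it swaps in a simple greedy scoring heuristic. It assumes per-sample gradients are already expressed in the Hessian eigenbasis, and the function name `select_samples`, the `sharp_threshold` cutoff, and the `penalty` weight are hypothetical names introduced only for illustration.

```python
def select_samples(grads, curvature, k, sharp_threshold, penalty=1.0):
    """Toy heuristic (assumption, not the paper's method): rank samples by
    flat-subspace descent signal minus curvature-weighted noise in sharp directions.

    grads:           list of per-sample gradients, each a list of d floats,
                     assumed to be given in the Hessian eigenbasis.
    curvature:       list of d non-negative Hessian eigenvalues.
    k:               number of samples to keep.
    sharp_threshold: eigenvalues above this count as "sharp" (hypothetical cutoff).
    """
    n, d = len(grads), len(curvature)
    mean = [sum(g[j] for g in grads) / n for j in range(d)]

    # Split directions into sharp (high curvature) and flat (everything else).
    sharp = [j for j in range(d) if curvature[j] > sharp_threshold]
    flat = [j for j in range(d) if curvature[j] <= sharp_threshold]

    scores = []
    for g in grads:
        # Signal: alignment with the mean gradient inside the flat subspace.
        signal = sum(g[j] * mean[j] for j in flat)
        # Noise: curvature-weighted squared deviation from the mean in sharp directions.
        noise = sum(curvature[j] * (g[j] - mean[j]) ** 2 for j in sharp)
        scores.append(signal - penalty * noise)

    # sorted() is stable, so ties keep their original index order.
    ranked = sorted(range(n), key=lambda i: -scores[i])
    return ranked[:k]


# Four toy samples: identical flat components, but samples 2 and 3 are noisy
# along the single sharp direction (the last coordinate).
toy_grads = [
    [1.0, 1.0, 1.0, 1.0, 0.0],
    [1.0, 1.0, 1.0, 1.0, 0.0],
    [1.0, 1.0, 1.0, 1.0, 5.0],
    [1.0, 1.0, 1.0, 1.0, -5.0],
]
toy_curvature = [0.01, 0.01, 0.01, 0.01, 100.0]

print(select_samples(toy_grads, toy_curvature, k=2, sharp_threshold=1.0))  # [0, 1]
```

In this toy setting all four samples carry the same descent signal in the flat directions, so the heuristic keeps the two whose gradients do not deviate along the sharp direction. A constrained formulation like the one the abstract describes would optimize the selection jointly rather than scoring samples independently.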