Towards Efficient LLMs Annealing with Principled Sample Selection

  • Yuanjian Xu,
  • Jianing Hao,
  • Wanbo Zhang,
  • Zhong Li,
  • Guang Zhang

ICML 2026

The annealing stage of Large Language Model (LLM) training is a critical phase in which model loss drops sharply and downstream capabilities solidify. Despite its importance, current practices rely on empirical heuristics such as quality filtering or context extension and lack a principled understanding of the underlying optimization dynamics. We address this gap by providing a theoretical characterization of the spectral properties targeted during annealing. We demonstrate that effective annealing requires balancing global Hessian geometry with sample-wise gradient noise while navigating a landscape of highly anisotropic curvature. Based on these insights, we formulate sample selection as a constrained optimization problem that suppresses noise in sharp directions while preserving descent signals in flat subspaces. We solve this problem via Successive Convex Programming (SCP), and the resulting method achieves state-of-the-art results across multiple model scales.
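
As a rough illustration of the selection idea in the abstract, the NumPy sketch below scores each sample by its descent signal in the flat subspace minus a penalty for its gradient noise in the sharp subspace, then keeps the top-scoring samples. This is not the authors' SCP formulation, whose details are not given here. The top-k heuristic, the function names `selection_scores` and `select_samples`, and the parameters `k_sharp`, `budget`, and `lam` are all assumptions made for illustration, and the sketch assumes a small explicit Hessian and per-sample gradients are available.

```python
import numpy as np

def selection_scores(grads, hessian, k_sharp):
    """Return (signal, noise) per sample. Hypothetical helper, for illustration only.

    grads:   (n_samples, dim) array of per-sample gradients
    hessian: (dim, dim) symmetric matrix
    k_sharp: number of top-curvature directions treated as "sharp" (>= 1)
    """
    # eigh returns eigenvalues in ascending order, so the last k_sharp
    # eigenvectors span the sharpest directions.
    _, eigvecs = np.linalg.eigh(hessian)
    sharp = eigvecs[:, -k_sharp:]
    flat = eigvecs[:, :-k_sharp]

    mean_grad = grads.mean(axis=0)
    # Noise: squared deviation from the mean gradient, restricted to sharp directions.
    noise = np.sum(((grads - mean_grad) @ sharp) ** 2, axis=1)
    # Signal: inner product with the mean gradient, restricted to flat directions.
    signal = (grads @ flat) @ (flat.T @ mean_grad)
    return signal, noise

def select_samples(grads, hessian, k_sharp, budget, lam=1.0):
    """Keep the `budget` samples with the highest signal - lam * noise (assumed rule)."""
    signal, noise = selection_scores(grads, hessian, k_sharp)
    score = signal - lam * noise
    return np.argsort(score)[::-1][:budget]
```

In this toy rule, `lam` controls how strongly noise along sharp directions is penalized relative to descent signal in flat directions. A constrained formulation like the one the abstract describes would presumably trade these off jointly over the selected set rather than sample by sample.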

GitHub