news

May 06, 2025 A new preprint on GT-NSGDm, a decentralized method that achieves optimal convergence under heavy-tailed gradient noise.
Jan 22, 2025 A paper on the high-probability convergence of a general framework for nonlinear SGD under heavy-tailed gradient noise has been accepted to AISTATS 2025.
Nov 26, 2024 A new preprint on a distributed sign momentum method for pre-training transformer models.
Oct 21, 2024 A new preprint on nonlinear SGD under heavy-tailed gradient noise, including analyses of large deviations, mean-square, and almost sure convergence rates.