May 06, 2025 | A new preprint on GT-NSGDm, a decentralized method that achieves optimal convergence under heavy-tailed gradient noise. |
Jan 22, 2025 | A paper on the high-probability convergence of a general framework for nonlinear SGD under heavy-tailed gradient noise has been accepted to AISTATS 2025. |
Nov 26, 2024 | A new preprint on a distributed sign momentum method for pre-training transformer models. |
Oct 21, 2024 | A new preprint on nonlinear SGD under heavy-tailed gradient noise, including analyses of large deviations and of mean-square and almost sure convergence rates. |