Shuhua Yu

Porter Hall B13
4815 Frew Street
Pittsburgh, PA 15213
I am a PhD student in the Department of Electrical and Computer Engineering at Carnegie Mellon University, where I am fortunate to be supervised by Prof. Soummya Kar. My research interests are primarily in optimization, machine learning, and distributed algorithms. Specific topics include adversarial attacks, heavy-tailed noise, and adaptive methods for transformers.
During my PhD, I interned at ByteDance Seed Foundation MLsys and Bosch Center for AI. Prior to my PhD, I received my BEng in computer science from The Chinese University of Hong Kong, Shenzhen in May 2019, and I interned at Baidu NLP.
news
May 06, 2025 | A new preprint on GT-NSGDm, a decentralized method that achieves optimal convergence under heavy-tailed gradient noise. |
---|---|
Jan 22, 2025 | A paper on the high-probability convergence of a general framework for nonlinear SGD under heavy-tailed gradient noise has been accepted to AISTATS 2025. |
Nov 26, 2024 | A new preprint on a distributed sign momentum method for pre-training transformer models. |
Oct 21, 2024 | A new preprint on nonlinear SGD under heavy-tailed gradient noise, including analyses of large deviations, mean-square, and almost sure convergence rates. |