Shuhua Yu

Porter Hall B13

4815 Frew Street

Pittsburgh, PA 15213

I am a PhD student in the Department of Electrical and Computer Engineering at Carnegie Mellon University, where I am fortunate to be supervised by Prof. Soummya Kar. My research interests are primarily in optimization, machine learning, and distributed algorithms. Specific topics include adversarial attacks, heavy-tailed noise, and adaptive methods for transformers.

During my PhD, I interned at ByteDance Seed Foundation MLsys and Bosch Center for AI. Prior to my PhD, I received my BEng in computer science from The Chinese University of Hong Kong, Shenzhen in May 2019, and I interned at Baidu NLP.

news

May 06, 2025 A new preprint on GT-NSGDm, a decentralized method that achieves optimal convergence under heavy-tailed gradient noise.
Jan 22, 2025 A paper on the high-probability convergence of a general framework for nonlinear SGD under heavy-tailed gradient noise has been accepted to AISTATS 2025.
Nov 26, 2024 A new preprint on a distributed sign momentum method for pre-training transformer models.
Oct 21, 2024 A new preprint on nonlinear SGD under heavy-tailed gradient noise, including analyses of large deviations and of mean-square and almost sure convergence rates.