Research
Research Interests
I work broadly on the mathematical and statistical foundations of machine learning and artificial intelligence, with a more recent emphasis on the practical engineering challenges of scaling AI architectures and algorithms.
In particular, I am interested in
Algorithmic and engineering perspectives on large-scale distributed optimization methods, including stochastic, nonsmooth, nonconvex, and/or distributionally robust optimization, with applications to large-scale distributed pre-training of large language models, e.g., efficient optimizers and (data and/or model) parallelism strategies
Theory and applications of optimization and sampling techniques in generative artificial intelligence (GenAI), e.g., efficient (pre-)training strategies for attention-based language and vision models such as large language models (LLMs) and vision transformers (ViTs)
The interplay between optimization and sampling
High-dimensional statistical inference
Funding and Grants
Academic Services
Reviewer for
Conferences
NeurIPS 2020, 2021, 2022, 2023, 2024
ICML 2021, 2022, 2023, 2024
ICLR 2021, 2022, 2024
AISTATS 2020, 2021, 2022, 2024, 2025