Research
Research Interests
I work broadly on the mathematical and statistical foundations of machine learning and artificial intelligence, with a more recent emphasis on the engineering challenges of scaling AI architectures and algorithms.
In particular, I am interested in:
Algorithmic and engineering perspectives on large-scale distributed optimization methods, including stochastic, nonsmooth, nonconvex, and/or distributionally robust optimization, with applications to large-scale distributed pre-training of large language models, e.g., efficient optimizers and data- and/or model-parallelism strategies
Theory and applications of optimization and sampling techniques in generative artificial intelligence (GenAI), e.g., efficient (pre-)training strategies for attention-based language and vision models such as large language models (LLMs), vision transformers (ViTs), and multi-modal models (e.g., vision-language models, VLMs)
The interplay between optimization and sampling
High-dimensional statistical inference for modern GenAI
Funding and Grants
Academic Services
Reviewer for
Conferences
NeurIPS 2020, 2021, 2022, 2023, 2024
ICML 2021, 2022, 2023, 2024, 2025
ICLR 2021, 2022, 2024
AISTATS 2020, 2021, 2022, 2024, 2025