Research

Research Interests

I work broadly on the mathematical and statistical foundations of machine learning and artificial intelligence, with a more recent emphasis on the practical engineering challenges of scaling AI architectures and algorithms.

In particular, I am interested in

  • Algorithmic and engineering perspectives on large-scale distributed optimization methods, including stochastic, nonsmooth, nonconvex, and/or distributionally robust optimization, with applications to large-scale distributed pre-training of large language models, e.g., efficient optimizers and (data and/or model) parallelism strategies

  • Theory and applications of optimization and sampling techniques in generative artificial intelligence (GenAI), e.g., efficient (pre-)training strategies for attention-based language and vision models such as large language models (LLMs) and vision transformers (ViTs)

  • The interplay between optimization and sampling

  • High-dimensional statistical inference

Funding and Grants

  • The University of Chicago Data Science Institute — AI + Science Research Initiative:

    • Project Support Funds (Principal Investigator), including the equivalent of $20,000 in GPU compute (2024)

    • Project Title: Advancing state-of-the-art large-scale distributed training methods in the era of generative AI

Academic Services

Reviewer for

  • Conferences

    • NeurIPS 2020, 2021, 2022, 2023, 2024

    • ICML 2021, 2022, 2023, 2024

    • ICLR 2021, 2022, 2024

    • AISTATS 2020, 2021, 2022, 2024, 2025