Efficient Algorithms for Bilevel Optimization Problems with Application in Machine Learning
Author
Abolfazli, Nazanin
Issue Date
2025
Keywords
Algorithm Development
Bilevel Optimization
Conditional Gradient Method
Distributed Optimization
Gradient Flow
Machine Learning
Advisor
Yazdandoost Hamedani, Erfan
Publisher
The University of Arizona.
Rights
Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction, presentation (such as public display or performance) of protected items is prohibited except with permission of the author.
Embargo
Release after 05/09/2027
Abstract
Bilevel optimization is an important class of optimization problems where one optimization problem is nested within another. It has become a powerful framework in a variety of machine learning applications, including signal processing, meta-learning, hyperparameter optimization, reinforcement learning, and network architecture search. The first chapter provides an introduction and overview of existing bilevel optimization frameworks, specifically highlighting simple bilevel optimization and general bilevel optimization methods, along with related research and their applications. The second chapter shifts focus to asynchronous multi-agent consensus optimization with nonlinear constraints. By introducing an accelerated primal-dual algorithm that achieves an optimal $\mathcal{O}(1/K)$ convergence rate for suboptimality, infeasibility, and consensus violation, we demonstrate a robust and efficient scheme for distributed optimization problems under limited or sporadic communication. This discussion also illustrates how the considered multi-agent problem can be recast as a simple bilevel optimization task, showing that our proposed method applies in that scenario as well. Next, we address the complexity of general bilevel optimization. While several methods have emerged for the unconstrained case, fewer tackle constraints at either the upper or lower level—a gap our work fills. We provide a thorough theoretical treatment of these extended bilevel models and introduce advanced, principled algorithms that manage upper-level constraints more efficiently and effectively. Our novel single-loop projection-free method, which employs a nested approximation technique, not only reduces iteration complexity relative to prior approaches but also retains optimal convergence guarantees comparable to the best known methods for single-level convex problems. Finally, we extend the scope of our research and propose a control-theoretic approach to solving bilevel optimization problems. 
Our method combines a gradient flow mechanism for upper-level minimization with a safety filter for enforcing lower-level constraints. Through Lyapunov analysis, we prove convergence to a neighborhood of the global optimum. This continuous-time insight then informs the design of an efficient discrete-time, gradient-based algorithm for large-scale bilevel problems, pairing computational tractability with theoretical rigor. This approach provides a unifying perspective and a novel tool for solving a broad class of nonconvex–nonconvex bilevel optimization problems. Overall, we demonstrate that the algorithms proposed in this thesis achieve optimal or superior convergence rate guarantees compared to competitive methods for the problem classes considered. Their effectiveness and advantages are further validated through numerical experiments on a variety of real-world problems.
Type
text
Electronic Dissertation
Degree Name
Ph.D.
Degree Level
doctoral
Degree Program
Graduate College
Systems & Industrial Engineering
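Note on terminology
The abstract distinguishes general from simple bilevel optimization. As a rough orientation, the two problem classes are commonly written as below. The symbols $f$, $g$, $x$, $y$, $\mathcal{X}$, $\mathcal{Y}$, and $\mathcal{Z}$ are generic placeholders assumed here for illustration; they are not taken from the dissertation, whose exact formulations and assumptions may differ.

General bilevel problem, where the upper-level objective $f$ depends on a solution $y^*(x)$ of the lower-level problem:

$$
\min_{x \in \mathcal{X}} \; f\bigl(x, y^*(x)\bigr)
\quad \text{subject to} \quad
y^*(x) \in \operatorname*{arg\,min}_{y \in \mathcal{Y}} \; g(x, y)
$$

Simple bilevel problem, where the upper-level objective is minimized over the solution set of a single lower-level problem:

$$
\min_{x} \; f(x)
\quad \text{subject to} \quad
x \in \operatorname*{arg\,min}_{z \in \mathcal{Z}} \; g(z)
$$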