Department of Computer Science, The University of British Columbia
Greed is Good: Greedy Optimization Methods for Large-Scale Structured Problems
This work examines large-scale machine learning, with a particular focus on greedy methods. A recent trend driven by big datasets is the use of optimization methods with a cheap iteration cost. This category includes (block) coordinate descent and Kaczmarz methods, whose updates rely on only a reduced subspace of the problem at each iteration. Prior to our work, the literature cast greedy variations of these methods as computationally expensive, with convergence rates comparable to those of randomized versions. In this dissertation, we show that greed is good. Specifically, we show that greedy coordinate descent and Kaczmarz methods have efficient implementations and can be faster than their randomized counterparts for certain common problem structures in machine learning. We show linear convergence for greedy (block) coordinate descent methods under a revived relaxation of strong convexity from 1963, which we call the Polyak-Łojasiewicz (PL) inequality. Among the relaxations of strong convexity proposed in the recent literature, we show that the PL inequality is the weakest condition that still ensures a global minimum. Further, we highlight the exploitable flexibility in block coordinate descent methods, not only in the different types of selection rules possible, but also in the types of updates we can use. We show that using second-order or exact updates with greedy block coordinate descent methods can lead to superlinear or finite convergence (respectively) for popular machine learning problems. Finally, we introduce the notion of “active-set complexity”, which we define as the number of iterations required before an algorithm is guaranteed to reach the optimal active manifold, and we show explicit bounds for two common problem instances when using the proximal gradient or proximal coordinate descent method.
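The greedy coordinate descent methods described above select, at each iteration, the coordinate with the largest gradient magnitude (the Gauss-Southwell rule) rather than a random one. A minimal sketch of this idea on a strongly convex quadratic is given below; the function name, test matrix, and update details are illustrative assumptions, not code from the dissertation:

```python
import numpy as np

def greedy_coordinate_descent(A, b, x0, iters=200):
    """Greedy (Gauss-Southwell) coordinate descent for the quadratic
    f(x) = 0.5 * x^T A x - b^T x, with A symmetric positive definite.
    Hypothetical minimal sketch, not the thesis implementation."""
    x = x0.astype(float).copy()
    grad = A @ x - b          # full gradient, then maintained incrementally
    diag = np.diag(A)         # per-coordinate curvature (Lipschitz constants)
    for _ in range(iters):
        i = np.argmax(np.abs(grad))   # Gauss-Southwell: largest |gradient| entry
        step = grad[i] / diag[i]      # exact minimization along coordinate i
        x[i] -= step
        grad -= step * A[:, i]        # rank-one gradient update, O(n) per iter
    return x
```

For a quadratic, the gradient can be kept up to date with an O(n) rank-one correction per iteration, which is one reason greedy selection can be implemented efficiently for structured problems.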
Dr. Nutini was born and raised in the small ski town of Rossland, British Columbia, located in the West Kootenays. She graduated from the University of British Columbia (Okanagan Campus) in 2010 with a BSc in General Mathematics (with honours), and in 2012 with an MSc in Mathematical Optimization (Governor General’s Gold Medal recipient). Her MSc thesis, supervised by Warren Hare, focused on derivative-free optimization methods for finite minimax problems. Her PhD was supervised by Mark Schmidt at the University of British Columbia in Vancouver. Her research area is optimization and machine learning, with a focus on coordinate descent methods.