Benign Landscape of Robust Principal Component Analysis

This work is concerned with the positive robust principal component analysis (PCA), where the goal is to exactly recover the dominant positive principal component of a data matrix from measurements that are grossly corrupted with sparse and arbitrarily large noise values. Most of the known techniques for solving robust PCA rely on convex relaxation, lifting the problem to a higher dimension and hence significantly increasing its number of variables. On the other hand, the well-known Burer-Monteiro approach can be used to cast robust PCA as a non-convex and non-smooth l1 optimization problem with a significantly smaller number of variables. In this work, we show that the low-dimensional formulation of the symmetric and asymmetric positive robust PCA based on the Burer-Monteiro approach has a benign landscape, i.e., it has no spurious local solutions and a unique stationary point, which is also the global solution.
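
As a concrete illustration of the low-dimensional formulation, the sketch below solves the symmetric positive robust PCA problem min_{u >= 0} ||uu^T - X||_1 with a plain projected subgradient method. It is not the authors' implementation; the problem size, corruption level, and step sizes are arbitrary illustrative choices.

```python
# A minimal sketch (not the authors' implementation): symmetric positive robust
# PCA in the Burer-Monteiro factorization, solved with a projected subgradient
# method. All sizes, noise levels, and step sizes are illustrative choices.
import numpy as np

rng = np.random.default_rng(0)
n, corruption = 50, 0.1

u_true = np.abs(rng.normal(size=n))               # true nonnegative principal component
M = np.outer(u_true, u_true)                      # rank-1 ground truth
S = np.where(rng.random((n, n)) < corruption,     # sparse, arbitrarily large corruption
             rng.normal(scale=10.0, size=(n, n)), 0.0)
X = M + S                                         # grossly corrupted measurements

u = np.abs(rng.normal(size=n))                    # arbitrary nonnegative initialization
for t in range(1, 3001):
    R = np.sign(np.outer(u, u) - X)               # subgradient of ||uu^T - X||_1 w.r.t. uu^T
    g = (R + R.T) @ u                             # chain rule: subgradient w.r.t. u
    u = np.maximum(u - g / (n * t), 0.0)          # diminishing step, then project onto u >= 0

print("relative recovery error:",
      np.linalg.norm(u - u_true) / np.linalg.norm(u_true))
```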

This implies that convex relaxation is not crucial for guaranteeing global optimality: simple local-search algorithms achieve zero optimality gap when applied directly to the low-dimensional formulation. Our approach is fundamentally different from the recently celebrated results that show the absence of spurious local minima for smooth problems. Furthermore, we provide strong deterministic and statistical guarantees for the exact recovery of the true principal component; for instance, we show that a constant fraction of the measurements can be grossly corrupted and yet introduce no spurious local solutions [1].

Related publications:

1- S. Fattahi and S. Sojoudi, Exact Guarantees on the Absence of Spurious Local Minima for Non-Negative Robust Principal Component Analysis, submitted to Journal of Machine Learning Research, 2019

Graphical Models in Machine Learning

There is a pressing need for new and efficient computational methods to analyze and learn the characteristics of high-dimensional data with a structured or randomized nature. Real-world datasets are often overwhelmingly complex, and therefore it is important to obtain a simple description of the data that can be processed efficiently. Graphical Lasso (GL) is a popular method, based on l1 regularization, for learning the structure of an undirected graphical model. The first goal of this work is to study the behavior of the optimal solution of GL as a function of its regularization coefficient. We show that if the number of samples is not too small compared to the number of parameters, the sparsity pattern of the optimal solution of GL changes gradually as the regularization coefficient increases from 0 to infinity. More precisely, it is proved that, under generic conditions, each change in the sparsity pattern corresponds to the addition or removal of a single edge of the graph.
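
As a numerical illustration of this sparsity path (the model size, regularization grid, and tolerance are arbitrary choices, not taken from the papers), the sketch below sweeps the regularization coefficient of scikit-learn's GraphicalLasso and counts how many edges of the estimated graph change between consecutive values.

```python
# A minimal sketch (illustrative parameters): sweep the Graphical Lasso
# regularization coefficient and count how many edges of the estimated
# graph change between consecutive values of the coefficient.
import numpy as np
from sklearn.covariance import GraphicalLasso
from sklearn.datasets import make_sparse_spd_matrix

rng = np.random.default_rng(0)
n_samples, p = 400, 20
Theta = make_sparse_spd_matrix(p, alpha=0.9, random_state=0)   # sparse precision matrix
cov = np.linalg.inv(Theta)
X = rng.multivariate_normal(np.zeros(p), (cov + cov.T) / 2, size=n_samples)

prev_support = None
for reg in np.linspace(0.02, 0.6, 30):
    precision = GraphicalLasso(alpha=reg, max_iter=500).fit(X).precision_
    support = np.abs(precision) > 1e-8
    np.fill_diagonal(support, False)
    if prev_support is not None:
        n_changed = np.sum(support != prev_support) // 2        # edges added or removed
        print(f"reg = {reg:.3f}: {n_changed} edge(s) changed")
    prev_support = support
```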

The second objective of this work is to compare the computationally heavy GL technique with a numerically cheap heuristic method for learning graphical models that simply thresholds the sample correlation matrix. To this end, two notions of sign-consistent and inverse-consistent matrices are developed, and it is shown that the thresholding and GL methods are equivalent if: (i) the thresholded sample correlation matrix is both sign-consistent and inverse-consistent, and (ii) the gap between the largest thresholded and the smallest un-thresholded entries of the sample correlation matrix is not too small. Building upon this result, it is proved that the GL method, viewed as a conic optimization problem, has an explicit closed-form solution if the thresholded sample correlation matrix has an acyclic structure. This result is then generalized to arbitrary sparse support graphs, for which a formula is found that yields an approximate solution of GL. The closed-form solution approximately satisfies the KKT conditions of the GL problem and, more importantly, the approximation error decreases exponentially fast with respect to the length of the minimum-length cycle of the sparsity graph. The developed results are demonstrated on synthetic data, electrical circuits, functional MRI data, and traffic flows for transportation networks [1,4].
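
The spirit of this comparison can be reproduced numerically with the sketch below. It only checks agreement on random data at a single shared threshold/regularization level; it does not verify the sign-consistency and inverse-consistency conditions of the theorem, and all parameters are illustrative.

```python
# A minimal sketch (illustrative): compare the support found by Graphical Lasso
# with simple thresholding of the sample correlation matrix at the same level.
import numpy as np
from sklearn.covariance import GraphicalLasso
from sklearn.datasets import make_sparse_spd_matrix

rng = np.random.default_rng(1)
n_samples, p, level = 400, 20, 0.2
Theta = make_sparse_spd_matrix(p, alpha=0.9, random_state=1)
cov = np.linalg.inv(Theta)
X = rng.multivariate_normal(np.zeros(p), (cov + cov.T) / 2, size=n_samples)

gl_support = np.abs(GraphicalLasso(alpha=level, max_iter=500).fit(X).precision_) > 1e-8
th_support = np.abs(np.corrcoef(X, rowvar=False)) > level      # cheap thresholding heuristic
np.fill_diagonal(gl_support, False)
np.fill_diagonal(th_support, False)

off_diag = ~np.eye(p, dtype=bool)
print("matching off-diagonal entries:",
      np.mean(gl_support[off_diag] == th_support[off_diag]))
```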

We then show that the equivalence between GL and thresholding can be used to reduce GL to a maximum determinant matrix completion problem and to derive a recursive closed-form solution for GL when the thresholded sample covariance matrix has a chordal structure. For large-scale problems with up to 450 million variables, the proposed method solves the GL problem in less than 2 minutes, whereas state-of-the-art methods require more than 2 hours [2,3].
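
For intuition, the sketch below states the maximum determinant matrix completion problem directly and hands it to a generic conic solver. This is only a schematic of the reduction: the papers work with a soft-thresholded sample covariance matrix and a chordal support, and use a recursive closed-form solution rather than a general-purpose solver; the data and threshold here are arbitrary.

```python
# A minimal sketch of the maximum determinant matrix completion view
# (generic solver and arbitrary support, purely for intuition).
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(0)
p = 8
A = rng.normal(size=(p, p))
S = A @ A.T / p                                         # a sample covariance matrix
observed = np.abs(S) > np.quantile(np.abs(S), 0.6)      # thresholded (kept) entries
np.fill_diagonal(observed, True)                        # diagonal is always kept

Sigma = cp.Variable((p, p), PSD=True)
constraints = [Sigma[i, j] == S[i, j]
               for i in range(p) for j in range(p) if observed[i, j]]
problem = cp.Problem(cp.Maximize(cp.log_det(Sigma)), constraints)
problem.solve()
print("optimal log-determinant:", problem.value)
```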

Related publications:

1- S. Fattahi and S. Sojoudi, Graphical Lasso and Thresholding: Equivalence and Closed-form Solutions, Journal of Machine Learning Research (JMLR), 2019 - Winner of INFORMS Data Mining best paper award and Katta G. Murty best paper award

2- R. Y. Zhang, S. Fattahi, and S. Sojoudi, Large-Scale Sparse Inverse Covariance Estimation via Thresholding and Max-Det Matrix Completion, International Conference on Machine Learning (ICML), 2018

3- S. Fattahi, R. Y. Zhang and S. Sojoudi, Sparse Inverse Covariance Estimation for Chordal Structures, European Control Conference (ECC), 2018

4- S. Fattahi and S. Sojoudi, Graphical Lasso: Sparsity Path and Closed-Form Solution, American Control Conference (ACC), 2018 - Finalist for the best student paper award

Learning-Based System Identification and Distributed Control

In this work, our first goal is to study the system identification problem for sparse linear time-invariant systems. We propose a sparsity-promoting block-regularized estimator that identifies the dynamics of the system from a limited number of input-state data samples. Using contemporary results from high-dimensional statistics, we show that the proposed estimator achieves a small element-wise error, provided that the number of sample trajectories exceeds a threshold. This threshold depends polynomially on the size of each block and on the number of nonzero elements in the rows of the input and state matrices, but only logarithmically on the system dimension. Furthermore, we show that, unlike the recently celebrated least-squares estimators for system identification, the proposed method is capable of exactly recovering the underlying sparsity structure of the system [1,2].
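
A toy version of this setup is sketched below. It is illustrative only: the papers analyze a block-regularized estimator over multiple short trajectories, whereas this sketch substitutes a plain l1-regularized least-squares fit, and the system, noise, and regularization parameters are arbitrary.

```python
# A minimal sketch (illustrative; a plain l1-regularized fit stands in for the
# block-regularized estimator of the papers): identify sparse (A, B) in
# x_{t+1} = A x_t + B u_t + w_t from a handful of short trajectories.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, m, T, n_traj = 20, 5, 10, 30

A = np.diag(0.5 * np.ones(n)) + np.diag(0.2 * np.ones(n - 1), k=1)  # sparse, stable dynamics
B = np.zeros((n, m))
B[:m, :] = np.eye(m)                                                 # sparse input matrix

Z, Y = [], []                                                        # regressors [x; u] and targets
for _ in range(n_traj):
    x = np.zeros(n)
    for _ in range(T):
        u = rng.normal(size=m)
        x_next = A @ x + B @ u + 0.01 * rng.normal(size=n)
        Z.append(np.concatenate([x, u]))
        Y.append(x_next)
        x = x_next

Theta_hat = Lasso(alpha=1e-3, fit_intercept=False).fit(np.array(Z), np.array(Y)).coef_
A_hat, B_hat = Theta_hat[:, :n], Theta_hat[:, n:]
print("max element-wise error in A:", np.abs(A_hat - A).max())
print("nonzeros in A_hat:", int(np.sum(np.abs(A_hat) > 1e-3)),
      "vs", int(np.sum(A != 0)), "in the true A")
```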

Next, we propose a robust approach for designing localized controllers for the unknown-but-sparse system based on the estimated model. By taking the uncertainty of the system identification step into account, we introduce a robust and decomposable optimization framework that can design a controller with an arbitrary user-defined sparsity structure. We provide sharp end-to-end guarantees on the stability and performance of the designed localized controller and prove that, for sparse systems, the number of sample trajectories needed to guarantee high performance of the controller can be much smaller than the dimension of the system [3,4].

Related publications:

1- S. Fattahi and S. Sojoudi, Data-Driven Sparse System Identification, Annual Allerton Conference on Communication, Control, and Computing, 2018

2- S. Fattahi and S. Sojoudi, Sample Complexity of Sparse System Identification Problem, submitted to IEEE Transactions on Automatic Control, 2018

3- S. Fattahi, N. Matni, and S. Sojoudi, Learning Sparse Dynamical Systems from a Single Sample Trajectory, to appear in IEEE Conference on Decision and Control (CDC), 2019

4- S. Fattahi, N. Matni, and S. Sojoudi, Sample Complexity Bounds for the Distributed Linear Quadratic Regulator, to be submitted to SIAM Journal on Control and Optimization, 2019

Generalized Network Flow Problem

The area of “network flows” plays a central role in operations research, computer science, and engineering. It is motivated by many real-world applications in assignment, transportation, communication networks, electrical power distribution, production scheduling, financial budgeting, and aircraft routing, to name only a few. There are important real-world flow networks that are lossy, where the loss is a nonlinear function of the flows. An example is electrical power networks, for which the loss over each line (under fixed voltage magnitudes at both ends) is a parabolic function due to Kirchhoff's circuit laws. In an effort to find the optimal operating points of lossy flow networks, we consider a general minimum-cost flow problem over an arbitrary flow network. In this problem, each node is associated with a possibly unknown injection, and each line has two unknown flows at its ends that are related to each other via a nonlinear function. Moreover, all injections and flows must satisfy certain box constraints. This problem, named generalized network flow (GNF), is highly non-convex due to its nonlinear equality constraints.

Under the assumption of monotonicity and convexity of the flow and cost functions, a convex relaxation is proposed, which is shown to always recover globally optimal injections. The relaxation may nonetheless fail to find optimal flows, because the mapping from injections to flows is not unique in general. We show that the proposed relaxation, named convexified GNF (CGNF), obtains a globally optimal flow vector if the optimal injection vector is a Pareto point. More generally, the network can be decomposed into two subgraphs such that the lines between them are congested at optimality and CGNF finds correct optimal flows over all lines of one of the two subgraphs. We also fully characterize the set of all globally optimal flow vectors, based on the optimal injection vector found via CGNF. In particular, we show that this solution set is a subset of the boundary of a convex set and may include an exponential number of disconnected components [1,2].
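
Schematically, and with illustrative notation rather than the exact model and assumptions of [1], the GNF problem and its convexification can be written as below; CGNF relaxes each nonlinear flow-coupling equality to an inequality, which is convex whenever the loss functions phi_ij are convex.

```latex
% Schematic GNF formulation (illustrative notation; see [1] for the precise model).
\begin{aligned}
\text{(GNF)}\qquad \min_{\{p_i\},\,\{p_{ij}\}}\ \ & \sum_{i \in \mathcal{N}} f_i(p_i) \\
\text{s.t.}\ \ & p_i = \sum_{j:\,(i,j) \in \mathcal{E}} p_{ij}
    \quad \forall i \in \mathcal{N} && \text{(nodal injection = sum of outgoing flows)} \\
& p_{ji} = \phi_{ij}(p_{ij}) \quad \forall (i,j) \in \mathcal{E} && \text{(nonlinear loss over each line)} \\
& \underline{p}_i \le p_i \le \overline{p}_i, \quad
  \underline{p}_{ij} \le p_{ij} \le \overline{p}_{ij} && \text{(box constraints)}
\end{aligned}
% (CGNF) replaces each equality  p_{ji} = \phi_{ij}(p_{ij})  with the convex
% inequality  p_{ji} \ge \phi_{ij}(p_{ij}).
```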

Related publications:

1- S. Sojoudi, S. Fattahi and J. Lavaei, Convexification of Generalized Network Flow Problem, Mathematical Programming, 2019

2- S. Fattahi and J. Lavaei, Convex Analysis of Generalized Flow Networks, IEEE Conference on Decision and Control (CDC), 2015

Real-Time Scheduling in Power Systems

The unit commitment (UC) problem aims to find an optimal schedule of generating units subject to demand and operating constraints for an electricity grid. The objective of the first part of this project is to obtain a convex model of polynomial size for practical instances of the UC problem. To this end, we develop a convex conic relaxation of the UC problem, referred to as a strengthened semidefinite program (SDP) relaxation. The benefit of the proposed method compared to existing techniques is threefold: (i) the proposed formulation is a single convex model of polynomial size and, hence, its global minimum can be found efficiently using well-established first- and second-order methods starting from any arbitrary initial point; (ii) unlike heuristic methods and local-search algorithms, which return local minima whose closeness to a global solution cannot be measured efficiently, the proposed formulation aims at obtaining global minima; (iii) the proposed convex model can be used in convex-hull pricing to minimize uplift payments made to generating units in energy markets [1,4].
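
For readers unfamiliar with unit commitment, the toy model below shows the basic mixed-integer structure that the relaxation targets. It is illustrative only: a small MILP with linear costs and no ramping or minimum up/down-time constraints, solved with an off-the-shelf solver, and not the strengthened SDP relaxation developed in [1,4]; all data are made up.

```python
# A minimal sketch of a toy unit commitment MILP (illustrative data; not the
# strengthened SDP relaxation of the project).
import numpy as np
from scipy.optimize import milp, LinearConstraint, Bounds

G, T = 2, 3                                   # generators, time periods
d = np.array([80.0, 120.0, 60.0])             # demand per period
Pmin, Pmax = np.array([20.0, 30.0]), np.array([70.0, 100.0])
c_var, c_fix = np.array([5.0, 3.0]), np.array([40.0, 90.0])   # fuel and commitment costs

nP = G * T                                    # decision vector x = [p (continuous), u (binary)]
cost = np.concatenate([np.repeat(c_var, T), np.repeat(c_fix, T)])

rows, lb, ub = [], [], []
for t in range(T):                            # demand balance: sum_g p[g,t] == d[t]
    row = np.zeros(2 * nP)
    row[[g * T + t for g in range(G)]] = 1.0
    rows.append(row); lb.append(d[t]); ub.append(d[t])
for g in range(G):
    for t in range(T):                        # generation limits: Pmin*u <= p <= Pmax*u
        row = np.zeros(2 * nP); row[g * T + t] = 1.0; row[nP + g * T + t] = -Pmax[g]
        rows.append(row); lb.append(-np.inf); ub.append(0.0)
        row = np.zeros(2 * nP); row[g * T + t] = -1.0; row[nP + g * T + t] = Pmin[g]
        rows.append(row); lb.append(-np.inf); ub.append(0.0)

res = milp(c=cost,
           constraints=LinearConstraint(np.array(rows), lb, ub),
           integrality=np.concatenate([np.zeros(nP), np.ones(nP)]),
           bounds=Bounds(np.zeros(2 * nP),
                         np.concatenate([np.repeat(Pmax, T), np.ones(nP)])))
print("optimal cost:", res.fun)
print("on/off schedule:\n", res.x[nP:].reshape(G, T).round())
```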

In the second part of this project, we study the optimal transmission switching (OTS) problem for power systems, where certain lines are fixed (uncontrollable) and the remaining ones are controllable via on/off switches. The goal is to identify a topology of the power grid that minimizes the cost of system operation while satisfying the physical and operational constraints. Most existing methods first convert the OTS into a mixed-integer linear program (MILP) or mixed-integer quadratic program (MIQP), and then iteratively solve a series of its convex relaxations. The performance of these methods depends heavily on the strength of the MILP or MIQP formulation. Here, it is shown that obtaining the strongest variable bounds for the MILP or MIQP formulations of the OTS based on the big-M or McCormick inequalities is NP-hard. Furthermore, it is proven that, unless P = NP, there is no constant-factor approximation algorithm for constructing these variable bounds. Despite the inherent difficulty of obtaining the strongest bounds in general, we present a simple bound strengthening method that tightens the convex relaxation of the problem whenever the fixed lines contain a connected spanning subnetwork of the system. The proposed method can be treated as a preprocessing step that is independent of the solver to be used for the numerical calculations and can be carried out offline before initiating the solver. A remarkable speedup in the runtime of mixed-integer solvers is obtained with the proposed bound strengthening method on medium- and large-scale real-world systems [2,3].
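
For a single switchable line (i, j) in the DC approximation, the big-M modeling referred to above takes roughly the following form, with z_ij the binary switching variable and M_ij the variable bound that the proposed method strengthens (notation is illustrative; see [2,3] for the exact formulation used in the papers):

```latex
% Schematic big-M model of one switchable line in DC optimal transmission switching
% (illustrative notation; see [2,3] for the exact formulation).
\begin{aligned}
& -M_{ij}\,(1 - z_{ij}) \;\le\; f_{ij} - B_{ij}\,(\theta_i - \theta_j) \;\le\; M_{ij}\,(1 - z_{ij})
  && \text{(flow-angle coupling enforced only when the line is on)} \\
& -\overline{f}_{ij}\, z_{ij} \;\le\; f_{ij} \;\le\; \overline{f}_{ij}\, z_{ij},
  \qquad z_{ij} \in \{0, 1\}
  && \text{(no flow when the line is switched off)}
\end{aligned}
% Smaller valid values of M_{ij} yield tighter convex relaxations; the bound
% strengthening method computes such values from a connected spanning subnetwork
% of fixed lines.
```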

Related publications:

1- S. Fattahi, M. Ashraphijou, J. Lavaei and A. Atamturk, Conic Relaxation of the Unit Commitment Problem, Energy, 2017

2- S. Fattahi, J. Lavaei and A. Atamturk, A Bound Strengthening Method for Optimal Transmission Switching in Power Systems, IEEE Transactions on Power Systems, 2018.

3- S. Fattahi, J. Lavaei and A. Atamturk, Promises of Conic Relaxations in Optimal Transmission Switching of Power Systems, IEEE Conference on Decision and Control (CDC), 2017

4- M. Ashraphijou, S. Fattahi, J. Lavaei and A. Atamturk, A Strong Semidefinite Programming Relaxation of the Unit Commitment Problem, IEEE Conference on Decision and Control (CDC), 2016

Distributed Control Policies in Large-Scale Dynamical Systems

The area of distributed control has emerged to address the computation and communication challenges of controlling large-scale real-world systems. The main objective is to design a controller with a prescribed structure, as opposed to the traditional centralized controller, for an interconnected system consisting of an arbitrary number of interacting local subsystems. In this work, we design a stabilizing static distributed controller whose performance is close to that of the optimal centralized controller. To this end, we first consider deterministic systems, where the initial state is either given or belongs to a known bounded region. Given an arbitrary centralized controller, we derive a condition under which there exists a distributed controller that generates input and state trajectories close to their counterparts in the centralized closed-loop system. This condition is then translated into an optimization problem whose optimal objective value quantifies the closeness of the designed distributed control system to the given centralized one.

The results are then extended to stochastic systems subject to input disturbance and measurement noise. The mathematical framework developed in this work is used to design a near-globally-optimal distributed controller based on the optimal centralized controller, and strong theoretical lower bounds on the global optimality guarantee of the obtained distributed controller are derived. To illustrate the effectiveness of the proposed method, case studies on aircraft formation and frequency control of power systems are presented [1,3,5,6]. We also extend our method to cooperative distributed model predictive control, where the goal is to design terminal cost-to-go functions and invariant sets that comply with the coupling pattern of the network [2].

Finally, we study the theoretical complexity of the stochastic optimal decentralized control problem. We show that if either the measurement noise covariance or the input weighting matrix is not too small, the problem is locally convex. Under such circumstances, designing a decentralized controller with a bounded norm and an arbitrary sparsity pattern is naturally a convex problem. We also study the problem of designing a sparse controller via a regularization technique, where the control structure is not pre-specified but is instead penalized in the objective function. Under some genericity assumptions, we prove that this method can design a decentralized controller with any arbitrary sparsity level [4].

Related publications:

1- S. Fattahi, G. Fazelnia and J. Lavaei, Transformation of Optimal Centralized Controllers Into Near-Global Static Distributed Controllers, IEEE Transactions on Automatic Control, 2018

2- G. Darivianakis, S. Fattahi, J. Lygeros and J. Lavaei, High-Performance Cooperative Distributed Model Predictive Control for Linear Systems, American Control Conference (ACC), 2018

3- S. Fattahi, J. Lavaei and M. Arcak, A Scalable Method for Designing Distributed Controllers for Systems with Unknown Initial States, IEEE Conference on Decision and Control (CDC), 2017

4- S. Fattahi and J. Lavaei, On the Convexity of Optimal Decentralized Control Problem and Sparsity Path, American Control Conference (ACC), pp. 3359-3366, 2017

5- S. Fattahi and J. Lavaei, Theoretical Guarantees for the Design of Near Globally Optimal Static Distributed Controllers, 54th Annual Allerton Conference on Communication, Control, and Computing, 2016

6- S. Fattahi, G. Fazelnia and J. Lavaei, Transformation of Optimal Centralized Controllers Into Near-Global Static Distributed Controllers, IEEE Conference on Decision and Control (CDC), 2015