Solving Structured Convex Quadratic Programs by Interior Point Methods with Application to Support Vector Machines and Portfolio Optimization

In this paper we consider the application of an interior point method (IPM) to the convex quadratic programming (QP) problem with upper bounds on the variables:

(P)    min  (1/2) x^T Q x + c^T x
       s.t. A x = b,
            0 <= x <= u,

where the symmetric positive semidefinite n-by-n matrix Q, the rank-m matrix A (m-by-n), and the vectors c, b, and u are the data for the problem, and x is the n-vector of variables. Three aspects of this topic are explored.

The first concerns a property of the Cholesky factorization of the normal equations matrix that arises at each iteration of an IPM applied to problem (P). In [11] it was shown that when linear programming (LP) problems are solved by an IPM, the unit lower triangular matrix L in the Cholesky factorization of the normal equations matrix remains uniformly bounded (in norm) as the iterates converge to the solution. This result holds even if the condition number of the normal equations matrix becomes unbounded. In [12] this result was extended to the case of second-order cone programming (SOCP) under the assumption that the IPM iterates stay in a neighborhood of the central path. Here we extend the LP result (without any such assumption) to the case of convex QPs.

The second part of the paper presents efficient methods for implementing the linear algebra at each IPM iteration so as to exploit structure in the Hessian matrix Q. IPMs have an advantage over active set methods in that they usually obtain an (approximate) solution to a problem after a small number of iterations (often fewer than 50). However, the cost of each iteration is significantly higher than that of an active set method. The major part of this cost comes from the need to solve at each iteration two symmetric positive definite systems of equations, (D^2 + Q) u1 = w1 and A (D^2 + Q)^{-1} A^T u2 = w2, one n-by-n and the other m-by-m (D^2 is a diagonal matrix). If n is large, then at least one of these systems is large scale. When the matrices Q and A are sparse, solving each system is relatively inexpensive and IPM iterations are efficient. The problem arises when one of the systems is dense. This can happen either because Q or Q^{-1} is dense or because A contains dense columns. The latter case is similar to that of LP problems with dense columns, which was studied extensively in [11], where an efficient method for handling dense columns was proposed.
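To make the per-iteration cost concrete, here is a minimal NumPy sketch of the two solves performed at a single IPM iteration, on small random data; all names and dimensions are illustrative and not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 8, 3

# Illustrative problem data: Q symmetric positive semidefinite,
# A of full row rank m, and D^2 a positive diagonal matrix
# (D changes at every IPM iteration).
B = rng.standard_normal((n, n))
Q = B @ B.T
A = rng.standard_normal((m, n))
D2 = np.diag(rng.uniform(0.5, 2.0, n))

w1 = rng.standard_normal(n)
w2 = rng.standard_normal(m)

# First system: (D^2 + Q) u1 = w1, an n-by-n SPD solve.
M = D2 + Q
u1 = np.linalg.solve(M, w1)

# Second system: the m-by-m normal equations
# A (D^2 + Q)^{-1} A^T u2 = w2.
N = A @ np.linalg.solve(M, A.T)
u2 = np.linalg.solve(N, w2)
```

When Q is dense, forming and factoring M costs O(n^3), which is exactly the bottleneck that motivates exploiting structure in Q.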

There are a number of applications of convex quadratic programming problems where the density of the systems that are solved comes from Q, and not from A. Often, however, the matrix Q can be represented as the sum of a diagonal matrix and a low-rank dense matrix. We will show how one can use this special structure of Q to compute the factorizations of the matrices D^2 + Q and A (D^2 + Q)^{-1} A^T efficiently, effectively extending the results in [11] to such QPs.
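One standard way to exploit such structure is the Sherman-Morrison-Woodbury formula: if Q = E + V V^T with E diagonal and V an n-by-k matrix, then D^2 + Q is a positive diagonal G = D^2 + E plus the rank-k term V V^T, and each solve costs O(n k^2) rather than O(n^3). The sketch below uses our own naming and is not necessarily the factorization developed in the paper.

```python
import numpy as np

def solve_diag_plus_lowrank(g, V, w):
    """Solve (G + V V^T) u = w, where G = diag(g) > 0 and V is n-by-k,
    via the Sherman-Morrison-Woodbury formula in O(n k^2) time."""
    Ginv_w = w / g                 # G^{-1} w, O(n)
    Ginv_V = V / g[:, None]        # G^{-1} V, O(n k)
    k = V.shape[1]
    S = np.eye(k) + V.T @ Ginv_V   # k-by-k capacitance matrix
    return Ginv_w - Ginv_V @ np.linalg.solve(S, V.T @ Ginv_w)

rng = np.random.default_rng(1)
n, k = 1000, 5
g = rng.uniform(1.0, 3.0, n)       # diagonal of D^2 + E (illustrative)
V = rng.standard_normal((n, k))
w = rng.standard_normal(n)

u = solve_diag_plus_lowrank(g, V, w)
```

The same routine, applied to each column of A^T, also yields (D^2 + Q)^{-1} A^T when forming the normal equations matrix.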

There are also cases where the matrix Q is not a low rank matrix, but can be approximated by such a matrix. In this case one might choose to solve the approximate problem, since it can be solved much faster. We give a bound on the error in the objective function when such an approximation is used. We also discuss the possibility of using an active set method to "correct" the approximate solution.
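For a symmetric positive semidefinite Q, the best rank-k approximation in the spectral norm is obtained by truncating the eigendecomposition, and the approximation error equals the (k+1)-st largest eigenvalue. A small NumPy sketch (the data and the cutoff k are illustrative; the bound derived in the paper concerns the resulting error in the objective value, not shown here):

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 50, 5

# Build a PSD matrix whose spectrum decays sharply, so that a rank-k
# approximation is accurate (an assumption made for illustration).
U, _ = np.linalg.qr(rng.standard_normal((n, n)))
evals = np.concatenate([rng.uniform(10.0, 100.0, k),
                        rng.uniform(0.0, 0.01, n - k)])
Q = (U * evals) @ U.T              # Q = U diag(evals) U^T

# Best rank-k approximation: keep the k largest eigenpairs.
lam, W = np.linalg.eigh(Q)
idx = np.argsort(lam)[::-1][:k]
Qk = (W[:, idx] * lam[idx]) @ W[:, idx].T

# Spectral-norm error equals the (k+1)-st largest eigenvalue.
err = np.linalg.norm(Q - Qk, 2)
```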

Finally, the last part of the paper addresses two applications where Q has the special structure described above. One such application is in an area of machine learning called support vector machines [?]. The support vector machine (SVM) is an approach to solving classification problems whose essence lies in solving a convex quadratic programming problem. The problem (in its dual form) has bound constraints and only one equality constraint. The number of variables equals the number of training examples that have already been classified, so the ability to solve large instances leads to better classification. However, the matrix Q is totally dense, and it becomes prohibitively expensive to solve problems with n > 1000 by IPMs without using special linear algebraic techniques. As we will explain in Section 6, even though Q is fully dense, it often has low rank. Therefore, the methods described in Section 4 apply naturally in the SVM setting.
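For instance, with a linear kernel the dual SVM Hessian is Q_ij = y_i y_j <x_i, x_j>, so Q = V V^T with V = diag(y) X: fully dense, yet of rank at most the number of features d. A sketch on synthetic data (the names X, y, d are ours, not the paper's):

```python
import numpy as np

rng = np.random.default_rng(3)
n, d = 200, 4                     # many examples, few features

X = rng.standard_normal((n, d))   # synthetic training data
y = rng.choice([-1.0, 1.0], n)    # class labels in {-1, +1}

# Dual SVM Hessian for a linear kernel: Q_ij = y_i y_j <x_i, x_j>.
# Q is an n-by-n fully dense matrix, but Q = V V^T with V = diag(y) X,
# so rank(Q) <= d even though n >> d.
V = y[:, None] * X
Q = V @ V.T
```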

Another application of convex QP where the Hessian is specially structured is portfolio optimization. Portfolio optimization involves selecting a portfolio of assets so as to maximize the return of the portfolio while minimizing the risk. For standard ways of measuring return and risk, several models for treating these conflicting objectives give rise to convex QPs. Structured Hessians of the form considered in Section 4 result from the assumption that the individual asset returns are primarily driven by underlying market factors.
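Under a k-factor model, asset returns are driven by k market factors plus uncorrelated residuals, so the return covariance (which plays the role of Q in the risk term) is Q = V Sigma_f V^T + diag(d): a diagonal plus a rank-k matrix, exactly the structure considered in Section 4. A sketch with synthetic factor data (all names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
n, k = 100, 3                                  # assets, market factors

V = rng.standard_normal((n, k))                # factor loadings
Sigma_f = np.diag(rng.uniform(0.5, 2.0, k))    # factor covariance
d = rng.uniform(0.01, 0.1, n)                  # idiosyncratic variances

# Factor-model covariance: diagonal plus a rank-k dense matrix.
Q = V @ Sigma_f @ V.T + np.diag(d)
```

Because diag(d) is positive, Q is positive definite, while its dense part has rank only k, so the low-rank solves sketched earlier apply directly.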

The paper is organized as follows. In Section 2 we discuss interior point methods for the QP problem (P) and two approaches for computing the step direction at each iteration: one based on solving the so-called "augmented system" of linear equations and one based on the "normal equations". In Section 3 we extend the stability results of [11] to the Cholesky factors of the normal equations matrices for the QP case. In Section 4 we introduce methods for solving these systems via low-rank updates. Section 5 contains a discussion of approximating the Hessian matrix Q in problem (P) by a low rank matrix and derives a bound on the error in the optimal value that results from such an approximation. Sections 6 and 7 contain descriptions of two applications of the results in the previous sections, support vector machines and portfolio optimization, respectively.

By: D. Goldfarb; K. Scheinberg

Published as: IBM Research Report RC23773, 2005
