By default, the functions conelp() and coneqp() exploit no problem structure except (to some limited extent) sparsity. Two mechanisms are provided for implementing customized solvers that take advantage of problem structure.
![]() | (8.3) |
(with P = 0 in conelp()). The matrix W depends on the current iterates and is defined as follows. We use the notation of sections 8.1 and 8.2. Suppose
Then W is a block-diagonal matrix,
with the following diagonal blocks.
This transformation is symmetric:
where
These transformations are also symmetric:
In general, this operation is not symmetric:
It is often possible to exploit problem structure to solve (8.3) faster than by standard methods. The last argument kktsolver of conelp() and coneqp() allows the user to supply a Python function for solving the KKT equations. This function will be called as ”f = kktsolver(W)”, where W is a dictionary that contains the parameters of the scaling:
The function call ”f = kktsolver(W)” should return a routine for solving the KKT system (8.3) defined by W. It will be called as ”f(bx, by, bz)”. On entry, bx, by, bz contain the righthand side. On exit, they should contain the solution of the KKT system, with the last component scaled, i.e., on exit,
In other words, the function returns the solution of
If the argument G of conelp() or coneqp() is a Python function, it should be defined as follows:
G(x, y [, alpha[, beta[, trans]]])
This evaluates the matrix-vector products
The default values of the optional arguments must be alpha = 1.0, beta = 0.0, trans = ’N’.
Similarly, if the argument A is a Python function, then it must be defined as follows.
A(x, y [, alpha[, beta[, trans]]])
This evaluates the matrix-vector products
The default values of the optional arguments must be alpha = 1.0, beta = 0.0, trans = ’N’.
If the argument P of coneqp() is a Python function, then it must be defined as follows:
P(x, y [, alpha[, beta]])
This evaluates the matrix-vector products
The default values of the optional arguments must be alpha = 1.0, beta = 0.0.
If G, A or P are Python functions, then the argument kktsolver must also be provided.
We illustrate these features with three applications.
The optimization problem
can be formulated as a linear program
By exploiting the structure in the inequalities, the cost of an iteration of an interior-point method can be reduced to the cost of least-squares problem of the same dimensions. (See section 11.8.2 in the book Convex Optimization.) The code belows takes advantage of this fact.
from cvxopt import base, blas, lapack, solvers
from cvxopt.base import matrix, spmatrix, mul, div def l1(P, q): """ Returns the solution u, w of the l1 approximation problem (primal) minimize ||P*u - q||_1 (dual) maximize q’*w subject to P’*w = 0 ||w||_infty <= 1. """ m, n = P.size # Solve the equivalent LP # # minimize [0; 1]’ * [u; v] # subject to [P, -I; -P, -I] * [u; v] <= [q; -q] # # maximize -[q; -q]’ * z # subject to [P’, -P’]*z = 0 # [-I, -I]*z + 1 = 0 # z >= 0. c = matrix(n*[0.0] + m*[1.0]) def G(x, y, alpha = 1.0, beta = 0.0, trans = ’N’): if trans==’N’: # y := alpha * [P, -I; -P, -I] * x + beta*y u = P*x[:n] y[:m] = alpha * ( u - x[n:]) + beta * y[:m] y[m:] = alpha * (-u - x[n:]) + beta * y[m:] else: # y := alpha * [P’, -P’; -I, -I] * x + beta*y y[:n] = alpha * P.T * (x[:m] - x[m:]) + beta * y[:n] y[n:] = -alpha * (x[:m] + x[m:]) + beta * y[n:] h = matrix([q, -q]) dims = {’l’: 2*m, ’q’: [], ’s’: []} def F(W): """ Returns a function f(x, y, z) that solves [ 0 0 P’ -P’ ] [ x[:n] ] [ bx[:n] ] [ 0 0 -I -I ] [ x[n:] ] [ bx[n:] ] [ P -I -D1^{-1} 0 ] [ z[:m] ] = [ bz[:m] ] [-P -I 0 -D2^{-1} ] [ z[m:] ] [ bz[m:] ] where D1 = diag(di[:m])^2, D2 = diag(di[m:])^2 and di = W[’di’]. """ # Factor A = 4*P’*D*P where D = d1.*d2 ./(d1+d2) and d1 = di[:m].^2, d2 = di[m:].^2. di = W[’di’] d1, d2 = di[:m]**2, di[m:]**2 D = div( mul(d1,d2), d1+d2 ) A = P.T * spmatrix(4*D, range(m), range(m)) * P lapack.potrf(A) def f(x, y, z): """ On entry bx, bz are stored in x, z. On exit x, z contain the solution, with z scaled: z./di is returned instead of z. """" # Solve for x[:n]: # # A*x[:n] = bx[:n] + P’ * ( ((D1-D2)*(D1+D2)^{-1})*bx[n:] # + (2*D1*D2*(D1+D2)^{-1}) * (bz[:m] - bz[m:]) ). x[:n] += P.T * ( mul(div(d1-d2, d1+d2), x[n:]) + mul(2*D, z[:m]-z[m:]) ) lapack.potrs(A, x) # x[n:] := (D1+D2)^{-1} * (bx[n:] - D1*bz[:m] - D2*bz[m:] + (D1-D2)*P*x[:n]) u = P*x[:n] x[n:] = div(x[n:] - mul(d1, z[:m]) - mul(d2, z[m:]) + mul(d1-d2, u), d1+d2) # z[:m] := d1[:m] .* ( P*x[:n] - x[n:] - bz[:m]) # z[m:] := d2[m:] .* (-P*x[:n] - x[n:] - bz[m:]) z[:m] = mul(d[:m], u - x[n:] - z[:m]) z[m:] = mul(d[m:], -u - x[n:] - z[m:]) return f sol = solvers.conelp(c, G, h, dims, kktsolver = F) return sol[’x’][:n], sol[’z’][m:] - sol[’z’][:m] |
The SDP
can be solved efficiently by exploiting properties of the diag operator.
from cvxopt import base, blas, lapack, solvers
from cvxopt.base import matrix def mcsdp(w): """ Returns solution x, z to (primal) minimize sum(x) subject to w + diag(x) >= 0 (dual) maximize -tr(w*z) subject to diag(z) = 1 z >= 0. """ n = w.size[0] c = matrix(1.0, (n,1)) def G(x, y, alpha = 1.0, beta = 0.0, trans = ’N’): """ y := alpha*(-diag(x)) + beta*y. """ if trans==’N’: # x is a vector; y is a symmetric matrix in column major order. y *= beta y[::n+1] -= alpha * x else: # x is a symmetric matrix in column major order; y is a vector. y *= beta y -= alpha * x[::n+1] def cngrnc(r, x, alpha = 1.0): """ Congruence transformation x := alpha * r’*x*r. r and x are square matrices. """ # Scale diagonal of x by 1/2. x[::n+1] *= 0.5 # a := tril(x)*r a = +r tx = matrix(x, (n,n)) blas.trmm(tx, a, side = ’L’) # x := alpha*(a*r’ + r*a’) blas.syr2k(r, a, tx, trans = ’T’, alpha = alpha) x[:] = tx[:] dims = {’l’: 0, ’q’: [], ’s’: [n]} def F(W): """ Returns a function f(x, y, z) that solves -diag(z) = bx -diag(x) - r*r’*z*r*r’ = bz where r = W[’r’][0] = W[’rti’][0]^{-T}. """ rti = W[’rti’][0] # t = rti*rti’ as a nonsymmetric matrix. t = matrix(0.0, (n,n)) blas.gemm(rti, rti, t, transB = ’T’) # Cholesky factorization of tsq = t.*t. tsq = t**2 lapack.potrf(tsq) def f(x, y, z): """ On entry, x contains bx, y is empty, and z contains bz stored in column major order. On exit, they contain the solution, with z scaled (vec(r’*z*r) is returned instead of z). We first solve ((rti*rti’) .* (rti*rti’)) * x = bx - diag(t*bz*t) and take z = - rti’ * (diag(x) + bz) * rti. """ # tbst := t * bz * t tbst = +z cngrnc(t, tbst) # x := x - diag(tbst) = bx - diag(rti*rti’ * bz * rti*rti’) x -= tbst[::n+1] # x := (t.*t)^{-1} * x = (t.*t)^{-1} * (bx - diag(t*bz*t)) lapack.potrs(tsq, x) # z := z + diag(x) = bz + diag(x) z[::n+1] += x # z := -vec(rti’ * z * rti) # = -vec(rti’ * (diag(x) + bz) * rti cngrnc(rti, z, alpha = -1.0) return f sol = solvers.conelp(c, G, w[:], dims, kktsolver = F) return sol[’x’], sol[’z’] |
The code below is efficient, if we assume that the number of rows in A is greater than or equal to the number of columns.
def qcl1(A, b):
""" Returns the solution u, z of (primal) minimize || u ||_1 subject to || A * u - b ||_2 <= 1 (dual) maximize b^T z - ||z||_2 subject to || A’*z ||_inf <= 1. Exploits structure, assuming A is m by n with m >= n. """ m, n = A.size # Solve equivalent cone LP with variables x = [u; v]. # # minimize [0; 1]’ * x # subject to [ I -I ] * x <= [ 0 ] (componentwise) # [-I -I ] * x <= [ 0 ] (componentwise) # [ 0 0 ] * x <= [ 1 ] (SOC) # [-A 0 ] [ -b ] # # maximize -t + b’ * w # subject to z1 - z2 = A’*w # z1 + z2 = 1 # z1 >= 0, z2 >=0, ||w||_2 <= t. c = matrix(n*[0.0] + n*[1.0]) h = matrix( 0.0, (2*n + m + 1, 1)) h[2*n] = 1.0 h[2*n+1:] = -b def G(x, y, alpha = 1.0, beta = 0.0, trans = ’N’): y *= beta if trans==’N’: # y += alpha * G * x y[:n] += alpha * (x[:n] - x[n:2*n]) y[n:2*n] += alpha * (-x[:n] - x[n:2*n]) y[2*n+1:] -= alpha * A*x[:n] else: # y += alpha * G’*x y[:n] += alpha * (x[:n] - x[n:2*n] - A.T * x[-m:]) y[n:] -= alpha * (x[:n] + x[n:2*n]) def Fkkt(W): """ Returns a function f(x, y, z) that solves [ 0 G’ ] [ x ] = [ bx ] [ G -W’*W ] [ z ] [ bz ]. """ # First factor # # S = G’ * W**-1 * W**-T * G # = [0; -A]’ * W3^-2 * [0; -A] + 4 * (W1**2 + W2**2)**-1 # # where # # W1 = diag(d1) with d1 = W[’d’][:n] = 1 ./ W[’di’][:n] # W2 = diag(d2) with d2 = W[’d’][n:] = 1 ./ W[’di’][n:] # W3 = beta * (2*v*v’ - J), W3^-1 = 1/beta * (2*J*v*v’*J - J) # with beta = W[’beta’][0], v = W[’v’][0], J = [1, 0; 0, -I]. # As = W3^-1 * [ 0 ; -A ] = 1/beta * ( 2*J*v * v’ - I ) * [0; A] beta, v = W[’beta’][0], W[’v’][0] As = 2 * v * (v[1:].T * A) As[1:,:] *= -1.0 As[1:,:] -= A As /= beta # S = As’*As + 4 * (W1**2 + W2**2)**-1 S = As.T * As d1, d2 = W[’d’][:n], W[’d’][n:] d = 4.0 * (d1**2 + d2**2)**-1 S[::n+1] += d lapack.potrf(S) def f(x, y, z): # z := - W**-T * z z[:n] = -div( z[:n], d1 ) z[n:2*n] = -div( z[n:2*n], d2 ) z[2*n:] -= 2.0*v*( v[0]*z[2*n] - blas.dot(v[1:], z[2*n+1:]) ) z[2*n+1:] *= -1.0 z[2*n:] /= beta # x := x - G’ * W**-1 * z x[:n] -= div(z[:n], d1) - div(z[n:2*n], d2) + As.T * z[-(m+1):] x[n:] += div(z[:n], d1) + div(z[n:2*n], d2) # Solve for x[:n]: # # S*x[:n] = x[:n] - (W1**2 - W2**2)(W1**2 + W2**2)^-1 * x[n:] x[:n] -= mul( div(d1**2 - d2**2, d1**2 + d2**2), x[n:]) lapack.potrs(S, x) # Solve for x[n:]: # # (d1**-2 + d2**-2) * x[n:] = x[n:] + (d1**-2 - d2**-2)*x[:n] x[n:] += mul( d1**-2 - d2**-2, x[:n]) x[n:] = div( x[n:], d1**-2 + d2**-2) # z := z + W^-T * G*x z[:n] += div( x[:n] - x[n:2*n], d1) z[n:2*n] += div( -x[:n] - x[n:2*n], d2) z[2*n:] += As*x[:n] return f dims = {’l’: 2*n, ’q’: [m+1], ’s’: []} sol = solvers.conelp(c, G, h, dims, kktsolver = Fkkt) if sol[’status’] == ’optimal’: return sol[’x’][:n], sol[’z’][-m:] else: return None, None |
with variable x. The problem is equivalent to the quadratic program
with variables x and u. The implementation below is efficient when A has many more columns than rows.
from cvxopt.base import matrix, spdiag, mul, div
from cvxopt import base, blas, lapack, solvers import math def l1regls(A, y): """ Returns the solution of l1-norm regularized least-squares problem minimize || A*x - y ||_2^2 + || x ||_1. """ m, n = A.size q = matrix(1.0, (2*n,1)) q[:n] = -2.0 * A.T * y def P(u, v, alpha = 1.0, beta = 0.0 ): """ v := alpha * 2.0 * [ A’*A, 0; 0, 0 ] * u + beta * v """ v *= beta v[:n] += alpha * 2.0 * A.T * (A * u[:n]) def G(u, v, alpha=1.0, beta=0.0, trans=’N’): """ v := alpha*[I, -I; -I, -I] * u + beta * v (trans = ’N’ or ’T’) """ v *= beta v[:n] += alpha*(u[:n] - u[n:]) v[n:] += alpha*(-u[:n] - u[n:]) h = matrix(0.0, (2*n,1)) # Customized solver for the KKT system # # [ 2.0*A’*A 0 I -I ] [x[:n] ] [bx[:n] ] # [ 0 0 -I -I ] [x[n:] ] = [bx[n:] ]. # [ I -I -D1^-1 0 ] [zl[:n]] [bzl[:n]] # [ -I -I 0 -D2^-1 ] [zl[n:]] [bzl[n:]] # # where D1 = W[’di’][:n]**2, D2 = W[’di’][:n]**2. # # We first eliminate zl and x[n:]: # # ( 2*A’*A + 4*D1*D2*(D1+D2)^-1 ) * x[:n] = # bx[:n] - (D2-D1)*(D1+D2)^-1 * bx[n:] + # D1 * ( I + (D2-D1)*(D1+D2)^-1 ) * bzl[:n] - # D2 * ( I - (D2-D1)*(D1+D2)^-1 ) * bzl[n:] # # x[n:] = (D1+D2)^-1 * ( bx[n:] - D1*bzl[:n] - D2*bzl[n:] ) # - (D2-D1)*(D1+D2)^-1 * x[:n] # # zl[:n] = D1 * ( x[:n] - x[n:] - bzl[:n] ) # zl[n:] = D2 * (-x[:n] - x[n:] - bzl[n:] ). # # The first equation has the form # # (A’*A + D)*x[:n] = rhs # # and is equivalent to # # [ D A’ ] [ x:n] ] = [ rhs ] # [ A -I ] [ v ] [ 0 ]. # # It can be solved as # # ( A*D^-1*A’ + I ) * v = A * D^-1 * rhs # x[:n] = D^-1 * ( rhs - A’*v ). S = matrix(0.0, (m,m)) Asc = matrix(0.0, (m,n)) v = matrix(0.0, (m,1)) def Fkkt(W): # Factor # # S = A*D^-1*A’ + I # # where D = 2*D1*D2*(D1+D2)^-1, D1 = d[:n]**-2, D2 = d[n:]**-2. d1, d2 = W[’di’][:n]**2, W[’di’][n:]**2 # ds is square root of diagonal of D ds = math.sqrt(2.0) * div( mul( W[’di’][:n], W[’di’][n:]), base.sqrt(d1+d2) ) d3 = div(d2 - d1, d1 + d2) # Asc = A*diag(d)^-1/2 Asc = A * spdiag(ds**-1) # S = I + A * D^-1 * A’ blas.syrk(Asc, S) S[::m+1] += 1.0 lapack.potrf(S) def g(x, y, z): x[:n] = 0.5 * ( x[:n] - mul(d3, x[n:]) + mul(d1, z[:n] + mul(d3, z[:n])) - mul(d2, z[n:] - mul(d3, z[n:])) ) x[:n] = div( x[:n], ds) # Solve # # S * v = 0.5 * A * D^-1 * ( bx[:n] - # (D2-D1)*(D1+D2)^-1 * bx[n:] + # D1 * ( I + (D2-D1)*(D1+D2)^-1 ) * bzl[:n] - # D2 * ( I - (D2-D1)*(D1+D2)^-1 ) * bzl[n:] ) blas.gemv(Asc, x, v) lapack.potrs(S, v) # x[:n] = D^-1 * ( rhs - A’*v ). blas.gemv(Asc, v, x, alpha=-1.0, beta=1.0, trans=’T’) x[:n] = div(x[:n], ds) # x[n:] = (D1+D2)^-1 * ( bx[n:] - D1*bzl[:n] - D2*bzl[n:] ) # - (D2-D1)*(D1+D2)^-1 * x[:n] x[n:] = div( x[n:] - mul(d1, z[:n]) - mul(d2, z[n:]), d1+d2 )\ - mul( d3, x[:n] ) # zl[:n] = D1^1/2 * ( x[:n] - x[n:] - bzl[:n] ) # zl[n:] = D2^1/2 * ( -x[:n] - x[n:] - bzl[n:] ). z[:n] = mul( W[’di’][:n], x[:n] - x[n:] - z[:n] ) z[n:] = mul( W[’di’][n:], -x[:n] - x[n:] - z[n:] ) return g return solvers.coneqp(P, q, G, h, kktsolver = Fkkt)[’x’][:n] |