7.4 Example: Covariance Selection

This example illustrates the use of the routines for sparse Cholesky factorization. We consider the problem

minimize    -log det K + tr(KY)
subject to  K_ij = 0,  (i,j) ∉ S.
(7.5)

The optimization variable is a symmetric matrix K of order n and the domain of the problem is the set of positive definite matrices. The matrix Y and the index set S are given. We assume that all the diagonal positions are included in S. This problem arises in maximum likelihood estimation of the covariance matrix of a zero-mean normal distribution, with constraints that specify that pairs of variables are conditionally independent.

We can express K as

K(x) = E1 diag(x) E2^T + E2 diag(x) E1^T,

where the vector x contains the nonzero entries in the lower-triangular part of K, with the diagonal entries scaled by 1/2, and

E1 = [ e_{i1} e_{i2} ⋅⋅⋅ e_{iq} ],    E2 = [ e_{j1} e_{j2} ⋅⋅⋅ e_{jq} ],

where e_i denotes the i-th standard unit vector and (i_k, j_k), k = 1, …, q, are the positions of the nonzero entries in the lower-triangular part of K. With this notation, we can solve problem (7.5) by solving the unconstrained problem

minimize   f(x) = -log det K(x) + tr(K(x)Y).
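
To make the parameterization concrete, here is a small sketch (not part of the original example) that assembles E1, E2, and K(x) for a hypothetical 3-by-3 pattern with made-up values of x; note that each diagonal position receives the corresponding entry of x twice, which is why the diagonal of K is represented with a factor 1/2 in x.

from cvxopt import matrix, spdiag

n = 3
I = [0, 1, 1, 2, 2]          # row indices i_k of the lower-triangular nonzeros
J = [0, 0, 1, 1, 2]          # column indices j_k
q = len(I)

# E1 and E2 have the unit vectors e_{i_k} and e_{j_k} as columns.
E1, E2 = matrix(0.0, (n, q)), matrix(0.0, (n, q))
for k in range(q):
    E1[I[k], k] = 1.0
    E2[J[k], k] = 1.0

# x lists the lower-triangular entries of K, with the diagonal scaled by 1/2;
# the resulting K has unit diagonal and off-diagonal entries 0.3 and 0.2.
x = matrix([0.5, 0.3, 0.5, 0.2, 0.5])
Dx = matrix(spdiag(x))                            # dense diag(x)
K = E1 * Dx * E2.trans() + E2 * Dx * E1.trans()
print(K)                                          # symmetric, zero outside the pattern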

The code below implements Newton’s method with a backtracking line search. The gradient and Hessian of the objective function are given by

∇f(x)  =  2 diag(E1^T (Y - K(x)^{-1}) E2)
       =  2 diag(Y_IJ - (K(x)^{-1})_IJ)

∇²f(x) =  2 (E1^T K(x)^{-1} E1) ∘ (E2^T K(x)^{-1} E2) + 2 (E1^T K(x)^{-1} E2) ∘ (E2^T K(x)^{-1} E1)
       =  2 (K(x)^{-1})_II ∘ (K(x)^{-1})_JJ + 2 (K(x)^{-1})_IJ ∘ (K(x)^{-1})_JI,

where ∘ denotes the Hadamard (componentwise) product and A_IJ is shorthand for E1^T A E2, the q-by-q matrix with entries A_{i_k j_l} (similarly for A_II, A_JJ, and A_JI).
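
As a quick sanity check on the gradient expression (this sketch is not part of the original example; the pattern and the entries of Y are made up), one can compare it with a central finite-difference approximation of f on a small dense instance:

from cvxopt import matrix, spmatrix, blas, lapack
import math

n = 3
I = [0, 1, 1, 2, 2]                              # hypothetical lower-triangular pattern
J = [0, 0, 1, 1, 2]
m = len(I)
Y = spmatrix([1.0, 0.2, 1.0, -0.1, 1.0], I, J)   # made-up data (lower triangle only)

def K_of(x):
    # assemble the dense symmetric K(x); a diagonal position holds 2*x_k
    K = matrix(0.0, (n, n))
    for k in range(m):
        if I[k] == J[k]:
            K[I[k], I[k]] = 2.0*x[k]
        else:
            K[I[k], J[k]] = K[J[k], I[k]] = x[k]
    return K

def f(x):
    L = K_of(x)
    lapack.potrf(L)                              # K = L*L^T, so log det K = 2*sum(log L_ii)
    logdet = 2.0*sum(math.log(L[i, i]) for i in range(n))
    return -logdet + 2.0*blas.dot(x, Y.V)        # tr(K(x)*Y) = 2*x'*Y.V

def gradf(x):
    Kinv = matrix(0.0, (n, n))
    Kinv[::n+1] = 1.0
    lapack.posv(K_of(x), Kinv)                   # Kinv = K(x)^{-1}
    return matrix([2.0*(Y.V[k] - Kinv[I[k], J[k]]) for k in range(m)])

x0 = matrix([0.5, 0.1, 0.5, 0.05, 0.5])          # a point with K(x0) positive definite
g, h = gradf(x0), 1e-6
for k in range(m):
    e = matrix(0.0, (m, 1))
    e[k] = h
    print(g[k], (f(x0 + e) - f(x0 - e))/(2.0*h)) # the two columns should agree closely

The complete CHOLMOD-based implementation of the Newton method follows.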
from cvxopt import matrix, spmatrix, log, mul
from cvxopt import blas, lapack, cholmod
 
def covsel(Y):  
    """  
    Returns the solution of  
 
         minimize    -logdet K + Tr(KY)  
         subject to  K_{ij}=0,  (i,j) not in indices listed in I,J.  
 
    Y is a symmetric sparse matrix with nonzero diagonal elements.  
    I = Y.I,  J = Y.J.  
    """  
 
    I, J = Y.I, Y.J  
    n, m = Y.size[0], len(I)  
    N = I + J*n         # non-zero positions for one-argument indexing  
    D = [k for k in range(m) if I[k]==J[k]]  # positions of the diagonal elements
 
    # starting point: symmetric identity with nonzero pattern I,J  
    K = spmatrix(0, I, J)  
    K[::n+1] = 1  
 
    # Kn is used in the line search  
    Kn = spmatrix(0, I, J)  
 
    # symbolic factorization of K  
    F = cholmod.symbolic(K)  
 
    # Kinv will be the inverse of K  
    Kinv = matrix(0.0, (n,n))  
 
    for iters in range(100):
 
        # numeric factorization of K  
        cholmod.numeric(K, F)  
        d = cholmod.diag(F)  
 
        # compute Kinv by solving K*X = I  
        Kinv[:] = 0  
        Kinv[::n+1] = 1  
        cholmod.solve(F, Kinv)  
 
        # solve Newton system  
        grad = 2*(Y.V - Kinv[N])  
        hess = 2*(mul(Kinv[I,J],Kinv[J,I]) + mul(Kinv[I,I],Kinv[J,J]))  
        v = -grad  
        lapack.posv(hess,v)  
 
        # stopping criterion  
        sqntdecr = -blas.dot(grad,v)  
        print "Newton decrement squared:%- 7.5e" %sqntdecr  
        if (sqntdecr < 1e-12):  
            print "number of iterations: ", iters+1  
            break  
 
        # line search  
        dx = +v         # copy of the Newton step in x
        dx[D] *= 2      # diagonal entries of K equal 2*x, so double the step there
        f = -2.0 * sum(log(d))    # f = -log det K  
        s = 1  
        for lsiter in range(50):
            Kn.V = K.V + s*dx  
            try:  
                cholmod.numeric(Kn, F)  
            except ArithmeticError:  
                s *= 0.5  
            else:  
                d = cholmod.diag(F)  
                fn = -2.0 * sum(log(d)) + 2*s*blas.dot(v,Y.V)
                # f and fn both omit the constant tr(K*Y) at the current iterate,
                # so the sufficient-decrease test below is unaffected.
                if fn < f - 0.01*s*sqntdecr:
                    break
                s *= 0.5  
 
        K.V = Kn.V  
 
    return K
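
Finally, a minimal usage sketch, assuming the covsel function above; the 4-by-4 pattern and the values in Y are hypothetical. Here Y stores only its lower triangle, and that pattern determines which entries of K are allowed to be nonzero.

from cvxopt import spmatrix

I = [0, 1, 1, 2, 2, 3, 3]
J = [0, 0, 1, 1, 2, 2, 3]
Y = spmatrix([1.0, 0.3, 1.0, 0.2, 1.0, -0.1, 1.0], I, J)

K = covsel(Y)
print(K)      # sparse maximum-likelihood estimate, K_ij = 0 outside the pattern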