Lecture 9: Symmetric eigenvalue problem and SVD

Recap of the first part

Plan for today

Today we will talk about:

Schur form computation

Hessenberg form

The matrix $A$ is said to be in Hessenberg form if

$$a_{ij} = 0, \quad \mbox{if } i \geq j+2,$$

for example,

$$H = \begin{bmatrix} * & * & * & * & * \\ * & * & * & * & * \\ 0 & * & * & * & *\\ 0 & 0 & * & * & *\\ 0 & 0 & 0 & * & * \\ \end{bmatrix}.$$

Reduction of any matrix to Hessenberg form

Any matrix can be reduced to Hessenberg form by a unitary similarity transformation,

$$U^* A U = H,$$

where $U$ is typically built as a product of Householder reflections.
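As a sanity check, the reduction is available as `scipy.linalg.hessenberg` (a minimal sketch; the random $5 \times 5$ matrix is only for illustration):

```python
import numpy as np
from scipy.linalg import hessenberg

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))

# H = Q^* A Q is upper Hessenberg, and Q is orthogonal here
H, Q = hessenberg(A, calc_q=True)
print(np.allclose(Q.T @ A @ Q, H))       # True
print(np.allclose(np.tril(H, -2), 0.0))  # zeros below the first subdiagonal
```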

Symmetric (Hermitian) case

QR algorithm: iterations

$$A_k = Q_k R_k, \quad A_{k+1} = R_k Q_k.$$

Let us see how this works.
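A minimal numpy sketch of the iteration (the small symmetric matrix below is a made-up example):

```python
import numpy as np

def qr_algorithm(A, num_iter=200):
    """Plain (unshifted) QR iteration: A_k = Q_k R_k, A_{k+1} = R_k Q_k."""
    Ak = np.array(A, dtype=float)
    for _ in range(num_iter):
        Q, R = np.linalg.qr(Ak)
        Ak = R @ Q
    return Ak

A = np.array([[4., 1., 0.],
              [1., 3., 1.],
              [0., 1., 2.]])
# for this symmetric matrix the iterates converge to a diagonal matrix
print(np.sort(np.diag(qr_algorithm(A))))
print(np.sort(np.linalg.eigvalsh(A)))  # reference eigenvalues
```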

Tridiagonal form

$$A_k = Q_k R_k, \quad A_{k+1} = R_k Q_k,$$

in the case when $A_k = A^*_k$ and is also tridiagonal. Both the symmetry and the tridiagonal structure are preserved by the iteration, so each QR step costs only $\mathcal{O}(n)$ operations.

Theorem on implicit QR iteration

All the implicit QR algorithms are based on the following theorem:

Let

$$Q^* A Q = H$$

be an irreducible upper Hessenberg matrix (i.e., all subdiagonal elements of $H$ are nonzero). Then the first column of the matrix $Q$ determines all of its other columns (up to unimodular scalar factors), which can be found from the equation

$$A Q = Q H. $$

Convergence of the QR-algorithm

Summary. If we have a decomposition of the form

$$A = X \Lambda X^{-1}, \quad A = \begin{bmatrix}A_{11} & A_{12} \\ A_{21} & A_{22}\end{bmatrix}$$

and

$$ \Lambda = \begin{bmatrix} \Lambda_1 & 0 \\ 0 & \Lambda_2 \end{bmatrix}, \quad \lambda(\Lambda_1)=\{\lambda_1,\dots,\lambda_m\}, \ \lambda(\Lambda_2)=\{\lambda_{m+1},\dots,\lambda_r\}, $$

and there is a gap between the eigenvalues of $\Lambda_1$ and $\Lambda_2$ ($|\lambda_1|\geq \dots \geq |\lambda_m| > |\lambda_{m+1}| \geq\dots \geq |\lambda_r| >0$), then the $A^{(k)}_{21}$ block of $A_k$ in the QR-iteration goes to zero with

$$\Vert A^{(k)}_{21} \Vert \leq C q^k, \quad q = \left| \frac{\lambda_{m+1}}{\lambda_{m}} \right |,$$

where $m$ is the size of $\Lambda_1$.

So we need to increase the gap (i.e., decrease $q$)! This can be done by the QR algorithm with shifts.

QR-algorithm with shifts

$$A_{k} - s_k I = Q_k R_k, \quad A_{k+1} = R_k Q_k + s_k I$$

The convergence rate for a shifted version is then

$$\left| \frac{\lambda_{m+1} - s_k}{\lambda_{m} - s_k} \right |,$$

where $\lambda_m$ is the $m$-th largest eigenvalue of the matrix in modulus.
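A hedged sketch with the simplest choice $s_k = (A_k)_{nn}$ (production implementations add deflation and use the Wilkinson shift instead):

```python
import numpy as np

def qr_with_shifts(A, num_iter=50):
    """QR iteration with the simple shift s_k = last diagonal element of A_k."""
    Ak = np.array(A, dtype=float)
    I = np.eye(Ak.shape[0])
    for _ in range(num_iter):
        s = Ak[-1, -1]
        Q, R = np.linalg.qr(Ak - s * I)
        Ak = R @ Q + s * I
    return Ak
```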

Shifts and power method

$$x_{k+1} := A x_k, \quad x_{k+1} := \frac{x_{k+1}}{\Vert x_{k+1} \Vert}.$$

The power method converges to the eigenvalue that is largest in modulus. If we shift the matrix,

$$ A := A - \lambda_k I,$$

then the corresponding eigenvalue becomes small, but we need it to be large; so a plain shift does not help, and instead we have to invert.
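For reference, a minimal power-method sketch:

```python
import numpy as np

def power_method(A, num_iter=500, seed=0):
    """x_{k+1} = A x_k / ||A x_k||: converges to the eigenvector of the
    eigenvalue that is largest in modulus (when it is unique)."""
    x = np.random.default_rng(seed).standard_normal(A.shape[0])
    x /= np.linalg.norm(x)
    for _ in range(num_iter):
        x = A @ x
        x /= np.linalg.norm(x)
    return x @ A @ x, x  # Rayleigh quotient as the eigenvalue estimate
```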

Inverse iteration and Rayleigh quotient iteration

$$x_{k+1} = (A - \lambda I)^{-1} x_k,$$

where $\lambda$ is the shift, which is an approximation to the eigenvalue that we want.

The Rayleigh quotient iteration updates the shift at every step from the current approximation:

$$x_{k+1} = (A - \lambda_k I)^{-1} x_k, \quad \lambda_k = \frac{(Ax_k, x_k)}{(x_k, x_k)}.$$
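A short sketch of the resulting iteration (for symmetric matrices it converges cubically; the shifted system becomes ill-conditioned near convergence, but in practice the normalized iterate is still accurate):

```python
import numpy as np

def rayleigh_quotient_iteration(A, x0, num_iter=10):
    """Inverse iteration with the Rayleigh-quotient shift (real symmetric A)."""
    x = x0 / np.linalg.norm(x0)
    for _ in range(num_iter):
        lam = x @ A @ x  # Rayleigh quotient, since x has unit norm
        x = np.linalg.solve(A - lam * np.eye(A.shape[0]), x)
        x /= np.linalg.norm(x)
    return lam, x
```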

Singular values and eigenvalues (1)

$$A = U \Sigma V^*$$

exists for any matrix.

$$\Sigma = U^* A V.$$

Singular values and eigenvalues (2)

  1. Work with the tridiagonal matrix
$$T = B^* B,$$
where $B$ is the bidiagonal form of $A$.

  2. Work with the extended matrix (see the check below)
$$T = \begin{bmatrix} 0 & B \\ B^* & 0 \end{bmatrix},$$
whose eigenvalues come in pairs $\pm \sigma_i(B)$.
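A quick numerical check of the second option (a random $4 \times 4$ matrix $B$ is used only for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
B = rng.standard_normal((n, n))

# T = [[0, B], [B^*, 0]] has eigenvalues +/- sigma_i(B)
T = np.block([[np.zeros((n, n)), B],
              [B.T, np.zeros((n, n))]])
ev = np.linalg.eigvalsh(T)
sv = np.linalg.svd(B, compute_uv=False)
print(np.allclose(np.sort(ev[ev > 0]), np.sort(sv)))  # True
```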

Algorithms for the SEV (symmetric eigenvalue problem)

Done:

Next slides:

Divide-and-conquer

$$T = \begin{bmatrix} T'_1 & B \\ B^{\top} & T'_2 \end{bmatrix} = \begin{bmatrix} T_1 & 0 \\ 0 & T_2 \end{bmatrix} + b_m v v^*,$$

where $vv^*$ is a rank-1 matrix, $v = (0,\dots,0,1,1,0,\dots,0)^{\top}$ (the ones are at positions $m$ and $m+1$), and $T_1$, $T_2$ are the diagonal blocks $T'_1$, $T'_2$ with the corner elements adjacent to the split reduced by $b_m$.
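A small numpy check of this splitting (a random tridiagonal $T$ with $n = 6$ and split point $m = 3$, both made up for illustration):

```python
import numpy as np

n, m = 6, 3
rng = np.random.default_rng(0)
a = rng.standard_normal(n)      # diagonal of T
b = rng.standard_normal(n - 1)  # off-diagonal of T
T = np.diag(a) + np.diag(b, 1) + np.diag(b, -1)

bm = b[m - 1]                   # the coupling element b_m
v = np.zeros(n)
v[m - 1] = v[m] = 1.0           # ones at positions m and m+1

# T_1, T_2: diagonal blocks with the corner entries reduced by b_m
T1 = T[:m, :m].copy(); T1[-1, -1] -= bm
T2 = T[m:, m:].copy(); T2[0, 0] -= bm
block = np.block([[T1, np.zeros((m, n - m))],
                  [np.zeros((n - m, m)), T2]])
print(np.allclose(T, block + bm * np.outer(v, v)))  # True
```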

$$T_1 = Q_1 \Lambda_1 Q^*_1, \quad T_2 = Q_2 \Lambda_2 Q^*_2,$$

$$\begin{bmatrix} Q^*_1 & 0 \\ 0 & Q^*_2 \end{bmatrix} T \begin{bmatrix} Q_1 & 0 \\ 0 & Q_2 \end{bmatrix} = D + \rho u u^{*}, \quad D = \begin{bmatrix} \Lambda_1 & 0 \\ 0 & \Lambda_2\end{bmatrix}.$$

Diagonal-plus-low-rank matrix

It is tricky to compute the eigenvalues of the matrix

$$D + \rho u u^* $$

The characteristic polynomial has the form

$$\det(D + \rho uu^* - \lambda I) = \det(D - \lambda I)\det(I + \rho (D - \lambda I)^{-1} uu^*) = 0.$$

Then (prove!!)

$$\det(I + \rho (D - \lambda I)^{-1} uu^*) = 1 + \rho \sum_{i=1}^n \frac{|u_i|^2}{d_i - \lambda} = 0$$

Hint: find $\det(I + w u^*)$ using the fact that $\det(C) = \prod_{i=1}^n\lambda_i(C)$ and $\operatorname{trace}(C) = \sum_{i=1}^n \lambda_i(C)$.
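A sketch of the argument: $w u^*$ has rank one, so its eigenvalues are $u^* w$ (with eigenvector $w$) and $0$ with multiplicity $n-1$; hence the eigenvalues of $I + w u^*$ are $1 + u^* w$ and $1$, and

$$\det(I + w u^*) = 1 + u^* w.$$

Substituting $w = \rho (D - \lambda I)^{-1} u$ gives exactly the sum above.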

Characteristic equation

$$1 + \rho \sum_{i=1}^n \frac{|u_i|^2}{d_i - \lambda} = 0$$

How to find the roots?

How to find the root

$$f(\lambda) \approx c_0 + \frac{c_1}{d_i - \lambda} + \frac{c_2}{d_{i+1} - \lambda}.$$
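For illustration, here is a simpler bracketing sketch that replaces the rational approximation by plain Brent root-finding between consecutive poles (assuming $\rho > 0$, sorted distinct $d_i$, and all $u_i \neq 0$; production codes use the guarded rational interpolation above instead):

```python
import numpy as np
from scipy.optimize import brentq

def secular_roots(d, u, rho, eps=1e-10):
    """Roots of f(lam) = 1 + rho * sum(|u_i|^2 / (d_i - lam)).
    For rho > 0 there is one root in each interval (d_i, d_{i+1})
    and one root to the right of d_n."""
    f = lambda lam: 1.0 + rho * np.sum(np.abs(u) ** 2 / (d - lam))
    # right end of the last bracket: f is positive beyond d_n + rho*||u||^2
    upper = np.append(d[1:], d[-1] + rho * np.sum(np.abs(u) ** 2) + 1.0)
    roots = []
    for lo, hi in zip(d, upper):
        gap = hi - lo
        roots.append(brentq(f, lo + eps * gap, hi - eps * gap))
    return np.array(roots)

# check against a dense eigensolver on a small made-up example
rng = np.random.default_rng(0)
d = np.sort(rng.standard_normal(5))
u = rng.standard_normal(5)
lam = secular_roots(d, u, rho=0.5)
print(np.allclose(lam, np.linalg.eigvalsh(np.diag(d) + 0.5 * np.outer(u, u))))
```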

Important issues

The eigenvector corresponding to a computed root $\alpha_i$ is proportional to

$$(D - \alpha_i I)^{-1}u.$$

Evaluating this formula naively with an inexact $\alpha_i$ can be numerically unstable (large relative errors when $\alpha_i$ is close to some $d_j$); the fix relies on the Loewner theorem below.

Loewner theorem

If $\alpha_i$ and $d_i$ satisfy the interlacing condition

$$d_n < \alpha_n < d_{n-1} < \alpha_{n-1} < \cdots < d_1 < \alpha_1,$$

Then there exists a vector $\widehat{u}$ such that $\alpha_i$ are exact eigenvalues of the matrix

$$\widehat{D} = D + \widehat{u} \widehat{u}^*.$$
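Stated without proof (and up to the signs of the entries of $\widehat{u}$), the explicit formula behind this theorem, on which the stable Gu and Eisenstat algorithm is built, reads

$$|\widehat{u}_i|^2 = \frac{\prod_{j=1}^{n} (\alpha_j - d_i)}{\prod_{j \neq i} (d_j - d_i)},$$

and the interlacing condition guarantees that the right-hand side is positive.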

Divide and conquer and the Fast Multipole Method

In the computations of divide and conquer we have to evaluate the sums of the form

$$f(\lambda) = 1 + \rho \sum_{i=1}^n \frac{|u_i|^2}{(d_i - \lambda)},$$

and have to do it at least for $n$ points. Direct evaluation therefore costs $\mathcal{O}(n^2)$ operations, while the Fast Multipole Method (FMM) evaluates all $n$ sums in near-linear time.

Let us explain a little bit...

Few more algorithms

The inertia of a Hermitian matrix is the triple

$$(\nu, \zeta, \pi),$$

where $\nu$ is the number of negative, $\zeta$ the number of zero, and $\pi$ the number of positive eigenvalues. The inertia is invariant under congruence with any nonsingular $X$ (Sylvester's law of inertia):

$$\operatorname{Inertia}(A) = \operatorname{Inertia}(X^* A X).$$

Bisection via Gaussian elimination

If we factor

$$A - zI = L D L^*,$$

then $\operatorname{Inertia}(A - zI) = \operatorname{Inertia}(D)$, and the inertia of the diagonal matrix $D$ is trivial to compute. Counting the eigenvalues smaller than $z$ in this way for a sequence of shifts locates any eigenvalue by bisection.
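A minimal sketch for the symmetric tridiagonal case, where the $LDL^*$ pivots satisfy a scalar recurrence (the Sturm sequence), so counting the eigenvalues below a shift $z$ costs $\mathcal{O}(n)$:

```python
import numpy as np

def count_below(alpha, beta, z):
    """Number of eigenvalues below z of the symmetric tridiagonal matrix
    with diagonal alpha and off-diagonal beta: run the LDL^* pivot
    recurrence for A - zI and count the negative pivots."""
    count, d = 0, 1.0
    for i in range(len(alpha)):
        b2 = beta[i - 1] ** 2 if i > 0 else 0.0
        d = alpha[i] - z - b2 / d
        if d == 0.0:
            d = -1e-30  # crude safeguard against an exact zero pivot
        if d < 0.0:
            count += 1
    return count

def kth_eigenvalue(alpha, beta, k, tol=1e-12):
    """k-th smallest eigenvalue (k = 1, ..., n) by bisection on the count."""
    r = np.abs(alpha).max() + 2.0 * np.abs(beta).max()  # Gershgorin bound
    lo, hi = -r, r
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if count_below(alpha, beta, mid) >= k:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)
```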

Jacobi method

The Jacobi method is based on $2 \times 2$ rotations

$$\begin{pmatrix} \cos \phi & \sin \phi \\ -\sin \phi & \cos \phi \end{pmatrix},$$

and in the $n$-dimensional case we select two indices $i$ and $j$ and rotate in the $(i, j)$ plane.
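A compact cyclic-Jacobi sketch for real symmetric matrices; the angle is chosen so that the rotation zeroes $a_{pq}$ exactly (the dense matrix products keep the code short, although an efficient implementation would update only rows and columns $p$ and $q$):

```python
import numpy as np

def jacobi_eig(A, tol=1e-12, max_sweeps=30):
    """Cyclic Jacobi method for a real symmetric matrix: returns the
    approximate eigenvalues and the accumulated rotations."""
    A = np.array(A, dtype=float)
    n = A.shape[0]
    U = np.eye(n)
    for _ in range(max_sweeps):
        off = np.sqrt(np.sum(A ** 2) - np.sum(np.diag(A) ** 2))
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                # angle such that (J^T A J)[p, q] = 0
                theta = 0.5 * np.arctan2(2.0 * A[p, q], A[q, q] - A[p, p])
                c, s = np.cos(theta), np.sin(theta)
                J = np.eye(n)
                J[p, p] = J[q, q] = c
                J[p, q], J[q, p] = s, -s
                A = J.T @ A @ J
                U = U @ J
    return np.diag(A), U
```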

Jacobi method (cont.)

We reduce

$$\Gamma(A) = \mathrm{off}( U^* A U), \quad \mathrm{off}^2(X) = \sum_{i \ne j} \left|X_{ij}\right|^2 = \|X \|^2_F - \sum\limits_{i=1}^n |x_{ii}|^2,$$

by applying successive Jacobi rotations $U$ that zero off-diagonal elements.

Jacobi method: convergence

$$ \text{off}(B) < \text{off}(A), $$

where $B = U^*AU$.

Indeed, the Frobenius norm is invariant under the unitary rotation $U$, the rotation changes only rows and columns $p$ and $q$ (so $b_{ii} = a_{ii}$ for $i \neq p, q$), and the Frobenius norm of the rotated $2 \times 2$ block is preserved while $b_{pq} = 0$, giving $b^2_{pp} + b^2_{qq} = a^2_{pp} + 2a^2_{pq} + a^2_{qq}$. Hence

$$ \Gamma^2(A) = \text{off}^2(B) = \|B\|^2_F - \sum\limits_{i=1}^n b^2_{ii} = \| A \|^2_F - \sum\limits_{i \neq p, q} a^2_{ii} - (a^2_{pp} + 2a^2_{pq} + a^2_{qq}) = \| A \|^2_F - \sum\limits_{i=1}^n a^2_{ii} - 2a^2_{pq} = \text{off}^2(A) - 2a^2_{pq} < \text{off}^2(A). $$

In the classical Jacobi method we annihilate the off-diagonal element that is largest in modulus, $\gamma = |a_{pq}|$, so that

$$ |a_{ij}| \leq \gamma \quad \text{for all } i \neq j, $$

thus

$$ \text{off}^2(A) \leq 2 N \gamma^2, $$

where $2N = n(n-1)$ is the number of off-diagonal elements.

$$2\gamma^2 \geq \frac{\text{off}^2(A)}{N}.$$

Now we use the relation $\Gamma^2(A) = \text{off}^2(A) - 2\gamma^2 \leq \text{off}^2(A) - \dfrac{\text{off}^2(A)}{N}$ and get

$$ \Gamma(A) \leq \sqrt{1 - \frac{1}{N}}\, \text{off}(A). $$

After $N$ rotations the off-diagonal norm is therefore reduced at least by the constant factor

$$\left(1 - \frac{1}{N}\right)^{\frac{N}{2}} \approx e^{-\frac{1}{2}},$$

i.e. the convergence is linear. However, the convergence is locally quadratic (given without proof here).

Jacobi: summary

The Jacobi method was historically the first numerical method for computing eigenvalues; it was proposed by Jacobi in 1846.

Summary for this part

Next lecture

Questions?