Lecture 2. Matrix norms and unitary matrices

Recap of the previous lecture

Notations

We use notation

$$A= \begin{bmatrix} a_{11} & \dots & a_{1m} \\ \vdots & \ddots & \vdots \\ a_{n1} & \dots & a_{nm} \end{bmatrix} \equiv \{a_{ij}\}_{i,j=1}^{n,m}\in \mathbb{C}^{n\times m}.$$

$A^*\stackrel{\mathrm{def}}{=}\overline{A^\top}$.

Matrices and norms

$$ \Vert A \Vert_F \stackrel{\mathrm{def}}{=} \Big(\sum_{i=1}^n \sum_{j=1}^m |a_{ij}|^2\Big)^{1/2} $$
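A quick numerical check of this definition (a minimal NumPy sketch; the example matrix is arbitrary):

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])

# Frobenius norm from the definition: square root of the sum of squared moduli
fro_manual = np.sqrt(np.sum(np.abs(A) ** 2))

# NumPy's built-in Frobenius norm
fro_numpy = np.linalg.norm(A, 'fro')

print(fro_manual, fro_numpy)  # both give the same value
```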

Matrix norms

$\Vert \cdot \Vert$ is called a matrix norm if it is a vector norm on the vector space of $n \times m$ matrices:

  1. $\|A\| \geq 0$ and if $\|A\| = 0$ then $A = O$
  2. $\|\alpha A\| = |\alpha| \|A\|$
  3. $\|A+B\| \leq \|A\| + \|B\|$ (triangle inequality)

Additionally some norms can satisfy the submultiplicative property

$$ \|AB\| \leq \|A\| \|B\|. $$

Not every matrix norm is submultiplicative; a standard counterexample is the Chebyshev norm

$$ \|A\|_C = \displaystyle{\max_{i,j}}\, |a_{ij}|. $$
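A short counterexample confirming that $\|\cdot\|_C$ is not submultiplicative (a NumPy sketch):

```python
import numpy as np

def cheb(M):
    """Chebyshev norm: largest entry in absolute value."""
    return np.max(np.abs(M))

A = np.ones((2, 2))   # ||A||_C = 1
B = np.ones((2, 2))   # ||B||_C = 1

print(cheb(A @ B))          # 2.0
print(cheb(A) * cheb(B))    # 1.0, so ||AB||_C > ||A||_C * ||B||_C
```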

Operator norms

$$ \Vert A \Vert_{*,**} = \sup_{x \ne 0} \frac{\Vert A x \Vert_*}{\Vert x \Vert_{**}}, $$

where $\Vert \cdot \Vert_*$ and $\| \cdot \|_{**}$ are vector norms.

Matrix $p$-norms

An important case of operator norms is the class of matrix $p$-norms, defined by taking $\|\cdot\|_* = \|\cdot\|_{**} = \|\cdot\|_p$.

Among all $p$-norms three norms are the most common ones:

  1. $p = 1$: $\Vert A \Vert_{1} = \displaystyle{\max_j \sum_{i=1}^n} |a_{ij}|$, the maximum absolute column sum;
  2. $p = 2$: the spectral norm (see below);
  3. $p = \infty$: $\Vert A \Vert_{\infty} = \displaystyle{\max_i \sum_{j=1}^m} |a_{ij}|$, the maximum absolute row sum.

Let us check the formula for $p=\infty$ on a blackboard.
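The closed-form expressions can also be verified numerically (a NumPy sketch with a random matrix):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 5))

# p = 1: maximum absolute column sum
print(np.linalg.norm(A, 1), np.max(np.sum(np.abs(A), axis=0)))

# p = infinity: maximum absolute row sum
print(np.linalg.norm(A, np.inf), np.max(np.sum(np.abs(A), axis=1)))
```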

Spectral norm

$$ \Vert A \Vert_2 = \sigma_1(A) = \sqrt{\lambda_\max(A^*A)} $$

where $\sigma_1(A)$ is the largest singular value of the matrix $A$ and $^*$ denotes the conjugate transpose.
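A small NumPy check that the three characterizations agree (the matrix here is real, so $A^* = A^T$):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 3))

s1 = np.linalg.svd(A, compute_uv=False)[0]      # largest singular value
lam = np.max(np.linalg.eigvalsh(A.T @ A))       # largest eigenvalue of A^* A

print(np.linalg.norm(A, 2), s1, np.sqrt(lam))   # all three coincide
```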

Examples

Several examples of optimization problems where matrix norms arise; a typical one is the matrix completion problem, where a matrix $A$ is known only on a set $\Omega$ of entries and we look for a low-rank matrix $X$ matching the observations:

$$ \Vert P_\Omega \odot (A - X) \Vert_F \to \min_{\mathrm{rank}(X) = r}, $$

where

$$ (P_\Omega)_{ij} = \begin{cases} 1 & (i,j)\in\Omega \\ 0 & \text{otherwise}, \end{cases} $$

and $\odot$ denotes the Hadamard (elementwise) product.
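A minimal sketch of how the masked objective is evaluated; the observation density and the candidate $X$ are arbitrary illustration choices:

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 5, 4
A = rng.standard_normal((n, m))

# P_Omega: indicator of the set of observed entries (here ~50% observed)
P = (rng.random((n, m)) < 0.5).astype(float)

X = np.zeros((n, m))                      # a (trivial) candidate completion
residual = np.linalg.norm(P * (A - X), 'fro')
print(residual)
```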

Scalar product

While a norm is a measure of distance, the scalar product takes the angle into account.

It is defined as

$$ (x, y) = \displaystyle{\sum_{i=1}^n} \overline{x}_i y_i \equiv x^* y. $$

The standard Euclidean norm then satisfies

$$ \Vert x \Vert_2 = \sqrt{(x, x)}, $$

or it is said that the norm is induced by the scalar product.

$$ (A, B)_F = \displaystyle{\sum_{i=1}^{n}\sum_{j=1}^{m}} \overline{a}_{ij} b_{ij} \equiv \mathrm{trace}(A^* B), $$

where $\mathrm{trace}(A)$ denotes the sum of diagonal elements of $A$. One can check that $\|A\|_F = \sqrt{(A, A)_F}$.
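A NumPy check of both identities, for the vector and the matrix scalar products (note that `np.vdot` conjugates its first argument, matching our convention):

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.standard_normal(4) + 1j * rng.standard_normal(4)
y = rng.standard_normal(4) + 1j * rng.standard_normal(4)

# (x, y) = sum_i conj(x_i) y_i = x^* y
print(np.vdot(x, y), np.sum(np.conj(x) * y))

A = rng.standard_normal((3, 2)) + 1j * rng.standard_normal((3, 2))
B = rng.standard_normal((3, 2)) + 1j * rng.standard_normal((3, 2))

# (A, B)_F = trace(A^* B) and ||A||_F = sqrt((A, A)_F)
print(np.trace(A.conj().T @ B), np.sum(np.conj(A) * B))
print(np.linalg.norm(A, 'fro'), np.sqrt(np.trace(A.conj().T @ A).real))
```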

Remark. The angle between two vectors is defined as

$$ \cos \phi = \frac{(x, y)}{\Vert x \Vert_2 \Vert y \Vert_2}. $$

A similar expression holds for matrices with the Frobenius scalar product and norm. An important property of the scalar product is the Cauchy–Schwarz–Bunyakovsky inequality

$$|(x, y)| \leq \Vert x \Vert_2 \Vert y \Vert_2,$$

and thus the angle between two vectors is defined properly, since the cosine above always lies in $[-1, 1]$.
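A tiny numerical illustration (the vectors are arbitrary):

```python
import numpy as np

x = np.array([1.0, 0.0])
y = np.array([1.0, 1.0])

# Cauchy-Schwarz guarantees |(x, y)| <= ||x|| ||y||, so |cos(phi)| <= 1
cos_phi = np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))
print(np.degrees(np.arccos(cos_phi)))  # 45.0 degrees
```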

Matrices preserving the norm

Suppose a vector $x$ is computed with relative error $\varepsilon$:

$$ \frac{\Vert x - \widehat{x} \Vert}{\Vert x \Vert} \leq \varepsilon. $$

Apply a linear transformation $U$ to both the exact and the perturbed vector:

$$ y = U x, \quad \widehat{y} = U \widehat{x}. $$

We want the error not to grow after the transformation:

$$ \frac{\Vert y - \widehat{y} \Vert}{\Vert y \Vert } = \frac{\Vert U ( x - \widehat{x}) \Vert}{\Vert U x\Vert} \leq \varepsilon. $$

This is guaranteed if $U$ preserves the norm, $\|Uz\| = \|z\|$ for all $z$, since then

$$ \frac{\Vert U ( x - \widehat{x}) \Vert}{\Vert U x\Vert} = \frac{ \|x - \widehat{x}\|}{\|x\|}. $$
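A numerical check that an orthogonal transformation leaves the relative error unchanged (a NumPy sketch; the orthogonal matrix comes from a QR decomposition of a random matrix):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 6
x = rng.standard_normal(n)
x_hat = x + 1e-6 * rng.standard_normal(n)     # perturbed vector

# a random orthogonal matrix obtained from a QR decomposition
U, _ = np.linalg.qr(rng.standard_normal((n, n)))

err_before = np.linalg.norm(x - x_hat) / np.linalg.norm(x)
err_after = np.linalg.norm(U @ (x - x_hat)) / np.linalg.norm(U @ x)
print(err_before, err_after)                  # identical up to rounding
```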

Unitary (orthogonal) matrices

A complex square matrix $U \in \mathbb{C}^{n \times n}$ is called unitary if

$$ U^* U = I_n, $$

where $I_n$ is the $n\times n$ identity matrix.

Since $U$ is square, $U^* U = I_n$ also implies

$$ U^*U = UU^* = I_n, $$

which means that both the columns and the rows of a unitary matrix form an orthonormal basis in $\mathbb{C}^{n}$.

Real unitary matrices, i.e. real matrices satisfying

$$ U^TU = UU^T = I, $$

are called orthogonal.

Unitary matrices

Important property: a product of two unitary matrices is a unitary matrix:

$$(UV)^* UV = V^* (U^* U) V = V^* V = I.$$

Also, the inverse of a unitary matrix is unitary, since $U^{-1} = U^*$.

Unitary invariance of $\|\cdot\|_2$ and $\|\cdot\|_F$ norms

For any unitary matrices $U$ and $V$ of compatible sizes and any matrix $A$,

$$ \| UAV\|_2 = \| A \|_2 \qquad \| UAV\|_F = \| A \|_F.$$
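A quick NumPy verification with random orthogonal factors:

```python
import numpy as np

rng = np.random.default_rng(5)
A = rng.standard_normal((5, 4))
U, _ = np.linalg.qr(rng.standard_normal((5, 5)))   # random orthogonal U
V, _ = np.linalg.qr(rng.standard_normal((4, 4)))   # random orthogonal V

print(np.linalg.norm(A, 2), np.linalg.norm(U @ A @ V, 2))
print(np.linalg.norm(A, 'fro'), np.linalg.norm(U @ A @ V, 'fro'))
```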

Examples of unitary matrices

Householder matrices

$$H \equiv H(v) = I - 2 vv^*,$$

where $v$ is an $n \times 1$ column and $v^* v = 1$.

It acts on a vector $x$ as

$$ Hx = x - 2(v^* x) v,$$

i.e. it reflects $x$ with respect to the hyperplane orthogonal to $v$ (hence the name Householder reflection).

Important property of Householder reflection

A suitably chosen Householder reflection can zero all entries of a vector except the first one:

$$ H \begin{bmatrix} \times \\ \times \\ \times \\ \times \end{bmatrix} = \begin{bmatrix} \times \\ 0 \\ 0 \\ 0 \end{bmatrix}. $$

Proof (for the real case). Let $e_1 = (1,0,\dots, 0)^T$; we want to find $v$ such that

$$ H x = x - 2(v^* x) v = \alpha e_1, $$

where $\alpha$ is an unknown constant. Since $\|\cdot\|_2$ is unitary invariant we get

$$\|x\|_2 = \|Hx\|_2 = \|\alpha e_1\|_2 = |\alpha|,$$

and therefore

$$\alpha = \pm \|x\|_2.$$

Also, we can express $v$ from $x - 2(v^* x) v = \alpha e_1$:

$$v = \dfrac{x-\alpha e_1}{2 v^* x}$$

Multiplying $x - 2(v^* x) v = \alpha e_1$ by $x^*$ from the left, we get

$$x^* x - 2 (v^* x) x^* v = \alpha x_1; $$

or

$$ \|x\|_2^2 - 2 (v^* x)^2 = \alpha x_1. $$

Therefore,

$$ (v^* x)^2 = \frac{\|x\|_2^2 - \alpha x_1}{2}. $$

So, $v$ exists and equals

$$ v = \dfrac{x \mp \|x\|_2 e_1}{2v^* x} = \dfrac{x \mp \|x\|_2 e_1}{\pm\sqrt{2(\|x\|_2^2 \mp \|x\|_2 x_1)}}. $$
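A minimal real-valued sketch of this construction (the helper name `householder_vector` is ours; the sign of $\alpha$ is picked to avoid cancellation, a standard numerical refinement):

```python
import numpy as np

def householder_vector(x):
    """Return a unit vector v such that (I - 2 v v^T) x = alpha e_1 (real case)."""
    alpha = -np.copysign(np.linalg.norm(x), x[0])  # sign chosen to avoid cancellation
    v = x.astype(float).copy()
    v[0] -= alpha
    return v / np.linalg.norm(v)

rng = np.random.default_rng(6)
x = rng.standard_normal(4)
v = householder_vector(x)
print(x - 2 * np.dot(v, x) * v)   # H x: all entries except the first are ~0
```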

Householder algorithm for QR decomposition

Using the obtained property we can transform an arbitrary matrix $A$ into an upper triangular one: choose $H_1$ to zero the first column of $A$ below the diagonal, then $H_2$ to zero the second column below the diagonal, obtaining

$$ H_2 H_1 A = \begin{bmatrix} \times & \times & \times & \times \\ 0 & \times & \times & \times \\ 0 & 0 & \boldsymbol{\times} & \times\\ 0 &0 & \boldsymbol{\times} & \times \\ 0 &0 & \boldsymbol{\times} & \times \end{bmatrix} $$

then finding $H_3=\begin{bmatrix}I_2 & \\ & {\widetilde H}_3 \end{bmatrix}$ such that

$$ {\widetilde H}_3 \begin{bmatrix} \boldsymbol{\times}\\ \boldsymbol{\times} \\ \boldsymbol{\times} \end{bmatrix} = \begin{bmatrix} \times \\ 0 \\ 0 \end{bmatrix}. $$

we get

$$ H_3 H_2 H_1 A = \begin{bmatrix} \times & \times & \times & \times \\ 0 & \times & \times & \times \\ 0 & 0 & {\times} & \times\\ 0 &0 & 0 & \times \\ 0 &0 & 0 & \times \end{bmatrix} $$

Finding $H_4$ by analogy, we arrive at an upper triangular matrix $R = H_4 H_3 H_2 H_1 A$.

Since products and inverses of unitary matrices are unitary, $Q = (H_4 H_3 H_2 H_1)^{-1} = H_1 H_2 H_3 H_4$ is unitary, and we get:

Corollary: (QR decomposition) Every $A\in \mathbb{C}^{n\times m}$ can be represented as

$$ A = QR, $$

where $Q$ is unitary and $R$ is upper triangular.

See the poster for the sizes of $Q$ and $R$ in the cases $n>m$ and $n<m$.
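A compact teaching sketch of the whole procedure, assuming $n \ge m$ for simplicity (the function name `householder_qr` is ours; in practice one would call `np.linalg.qr`):

```python
import numpy as np

def householder_qr(A):
    """QR of an n x m matrix (n >= m) via Householder reflections; teaching sketch."""
    n, m = A.shape
    R = A.astype(float).copy()
    Q = np.eye(n)
    for k in range(min(n - 1, m)):
        x = R[k:, k]
        alpha = -np.copysign(np.linalg.norm(x), x[0])
        v = x.copy()
        v[0] -= alpha
        nv = np.linalg.norm(v)
        if nv == 0:          # column is already in the desired form
            continue
        v /= nv
        # H_k = I - 2 v v^T acts on rows k..n-1 of R; accumulate Q = H_1 H_2 ...
        R[k:, :] -= 2.0 * np.outer(v, v @ R[k:, :])
        Q[:, k:] -= 2.0 * np.outer(Q[:, k:] @ v, v)
    return Q, R

rng = np.random.default_rng(7)
A = rng.standard_normal((5, 4))
Q, R = householder_qr(A)
print(np.allclose(Q @ R, A), np.allclose(Q.T @ Q, np.eye(5)))
print(np.allclose(np.tril(R, -1), 0.0))   # R is upper triangular
```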

Givens (Jacobi) matrix

In the $2 \times 2$ case a Givens matrix is

$$ G = \begin{bmatrix} \cos \alpha & -\sin \alpha \\ \sin \alpha & \cos \alpha \end{bmatrix},$$

which is a rotation by the angle $\alpha$. In the $n \times n$ case, a Givens rotation embeds this $2\times 2$ block into the identity matrix at rows and columns $i$ and $j$.

Applying such a rotation to a vector,

$$ x' = G x, $$

changes it only in the $i$-th and $j$-th positions:

$$ x'_i = x_i\cos \alpha - x_j\sin \alpha , \quad x'_j = x_i \sin \alpha + x_j\cos\alpha, $$

with all other components unchanged.

Choosing the angle as

$$ \cos \alpha = \frac{x_i}{\sqrt{x_i^2 + x_j^2}}, \quad \sin \alpha = -\frac{x_j}{\sqrt{x_i^2 + x_j^2}} $$

zeroes the $j$-th component: with this choice, $x'_j = 0$ and $x'_i = \sqrt{x_i^2 + x_j^2}$.
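A small sketch of this choice (the helper `givens` is ours):

```python
import numpy as np

def givens(xi, xj):
    """Return (c, s) so that the rotation sends (xi, xj) to (r, 0)."""
    r = np.hypot(xi, xj)
    if r == 0:
        return 1.0, 0.0
    return xi / r, -xj / r     # c = cos(alpha), s = sin(alpha)

x = np.array([3.0, 4.0])
c, s = givens(x[0], x[1])
G = np.array([[c, -s],
              [s,  c]])
print(G @ x)                   # [5., 0.]
```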

QR via Givens rotations

Similarly, we can make a matrix upper triangular using Givens rotations, zeroing the subdiagonal entries one at a time:

$$\begin{bmatrix} \times & \times & \times \\ \bf{*} & \times & \times \\ \bf{*} & \times & \times \end{bmatrix} \to \begin{bmatrix} * & \times & \times \\ * & \times & \times \\ 0 & \times & \times \end{bmatrix} \to \begin{bmatrix} \times & \times & \times \\ 0 & * & \times \\ 0 & * & \times \end{bmatrix} \to \begin{bmatrix} \times & \times & \times \\ 0 & \times & \times \\ 0 & 0 & \times \end{bmatrix} $$
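A teaching sketch of QR via Givens rotations (the name `givens_qr` is ours), zeroing each column from the bottom up:

```python
import numpy as np

def givens_qr(A):
    """QR via Givens rotations: zero subdiagonal entries column by column."""
    n, m = A.shape
    R = A.astype(float).copy()
    Q = np.eye(n)
    for j in range(m):
        for i in range(n - 1, j, -1):          # zero R[i, j] from the bottom up
            r = np.hypot(R[i - 1, j], R[i, j])
            if r == 0:
                continue
            c, s = R[i - 1, j] / r, -R[i, j] / r
            G = np.array([[c, -s], [s, c]])
            R[[i - 1, i], :] = G @ R[[i - 1, i], :]    # rotate the two rows
            Q[:, [i - 1, i]] = Q[:, [i - 1, i]] @ G.T  # accumulate Q
    return Q, R

rng = np.random.default_rng(8)
A = rng.standard_normal((4, 3))
Q, R = givens_qr(A)
print(np.allclose(Q @ R, A), np.allclose(np.tril(R, -1), 0.0))
```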

Givens vs. Householder transformations

  1. A Householder reflection zeroes all entries of a column below the diagonal at once, which makes it well suited for dense matrices.
  2. A Givens rotation zeroes one entry at a time, which is preferable for sparse or structured matrices; rotations acting on disjoint pairs of rows can also be applied in parallel.

Singular Value Decomposition

The SVD will be considered in more detail later.

Theorem. Any matrix $A\in \mathbb{C}^{n\times m}$ can be written as a product of three matrices:

$$ A = U \Sigma V^*, $$

where

  1. $U \in \mathbb{C}^{n \times n}$ and $V \in \mathbb{C}^{m \times m}$ are unitary,
  2. $\Sigma \in \mathbb{R}^{n \times m}$ has nonzero entries only on the diagonal: $\sigma_1 \geq \sigma_2 \geq \dots \geq \sigma_{\min (m,n)} \geq 0$, called the singular values of $A$.

Moreover, if $\text{rank}(A) = r$, then $\sigma_{r+1} = \dots = \sigma_{\min (m,n)} = 0$.

See the poster for the visualization.
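A quick NumPy illustration on a matrix of known rank:

```python
import numpy as np

rng = np.random.default_rng(9)
# a rank-2 matrix: only two singular values should be (numerically) nonzero
A = rng.standard_normal((5, 2)) @ rng.standard_normal((2, 4))

U, s, Vh = np.linalg.svd(A)           # full SVD: U is 5x5, Vh is 4x4
print(s)                              # s[2:] are ~0, confirming rank(A) = 2

# reconstruct A = U Sigma V^*
Sigma = np.zeros((5, 4))
Sigma[:4, :4] = np.diag(s)
print(np.allclose(U @ Sigma @ Vh, A))
```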

Summary

  1. The most important matrix norms: Frobenius, Chebyshev, operator $p$-norms, spectral norm.
  2. Unitary matrices preserve the $\|\cdot\|_2$ and $\|\cdot\|_F$ norms.
  3. Two important classes of unitary matrices: Householder reflections and Givens rotations, both leading to the QR decomposition.
  4. The singular value decomposition (SVD), to be studied in detail later.

Questions?