# Iterative Methods: Part 1

Once again, we consider solving the equation $A\vec{x} = \vec{b}$, where $A \in \mathbb{R}^{n \times n}$ is nonsingular. When we use a factorization method for dense matrices, such as LU, Cholesky, SVD, and QR, we make $O(n^{3})$ operations and store $O(n^{2})$ matrix entries. As a result, it’s hard to accurately model a real-life problem by adding more variables.

Luckily, when the matrix is sparse (e.g. because we used finite difference method or finite element method), we can use an iterative method to approximate the solution efficiently. Typically, an iterative method incurs $O(n^{2})$ operations, if not fewer, and requires $O(n)$ storage.

Over the next few posts, we will look at 5 iterative methods:

• Jacobi
• Gauss-Seidel
• Successive Over-Relaxation (SOR)
• Symmetric SOR (SSOR)

In addition, we will solve 2D Poisson’s equation using these methods. We will compare the approximate solutions to the exact to illustrate the accuracy of each method.

# 1. Poisson’s equation

We consider a partial differential equation (PDE), called Poisson’s equation, in 2D. This way, we can capture any features that we might miss in 1D, but learn iterative methods more easily than in 3D. For simplicity, we set zero Dirichlet boundary condition (BC). However, with additional work, we can also solve Poisson’s equation with nonzero BCs, Neumann BCs, or mixed BCs.

### 2D Poisson’s equation with zero Dirichlet BC

Given a function $f = f(x,\,y)$ defined over the domain $\Omega$, find $u \in C^{2}(\overline{\Omega})$ such that,

$\boxed{\left\{\begin{array}{rl} -\Delta u(x,\,y) = f(x,\,y), & \forall\,(x,\,y) \in \Omega \\[12pt] u(x,\,y) = 0, & \forall\,(x,\,y) \in \partial\Omega. \end{array}\right.}$

Again, for simplicity, we choose a square domain $\Omega = (0,\,1) \times (0,\,1)$ and a nice RHS function:

$\boxed{f(x,\,y) = (a^{2} + b^{2})\pi^{2}\sin(a\pi x)\sin(b\pi y)}$,

where $a,\,b$ are positive integers.

Under these conditions, we have the exact solution:

$\boxed{u_{exact}(x,\,y) = \sin(a\pi x)\sin(b\pi y)}$.

As an aside, we can generalize the notion of an eigenvalue problem from matrices to functions. We see that $u_{exact}$ is an eigenfunction of the operator $-\Delta$ corresponding to the eigenvalue $(a^{2} + b^{2})\pi^{2}$. This coincidence will later come into play when we study Conjugate Gradient method.

## a. Motivation

Poisson’s equation models several engineering problems. The solution $u$ can represent the temperature in a steady-state heat flow, the displacement of an elastic membrane under a vertical load, or the electric potential given a charge distribution. The equation is also a good one to study, since the Laplace operator,

$\displaystyle\Delta := \frac{\partial^{2}}{\partial x^{2}} + \frac{\partial^{2}}{\partial y^{2}}$,

shows up often in PDEs. We know its properties well.

That said, it’s difficult to solve Poisson’s equation analytically. One, we do not know the exact solution $u_{exact}$ for a general $\Omega$ and $f$. Even if we do, it may be costly to evaluate it at a point $(x,\,y)$, e.g. because $u_{exact}$ is cast as an infinite series. Two, the problem is too strict. It asks for a solution that satisfies the differential equation and BC at infinitely many points in the domain and on its boundary. On top of that, the solution has to be twice-differentiable inside and continuous outside. Talk about demanding! We call our original problem, the strong form.

We bypass these difficulties by solving the problem numerically and approximately. One approach is to satisfy the differential equation and BC at finitely many points. This leads to a finite difference method or a collocation method. Another is to make the residual function $f + \Delta u$ equal to 0 on average. This gives us a finite element method. All of these methods will result in a sparse matrix. We call our easier, nicer problem, the weak form.

## b. Finite difference method

To solve a PDE with a finite difference method, we need to discretize the domain and the differential operators in the PDE. In our problem, the domain $\Omega = [0,\,1] \times [0,\,1]$ is square and the only differential operator is the Laplace operator, $\Delta$.

First, we discretize the domain with a uniform mesh. Place $N$ points inside the domain along each axis, equally spaced.

For brevity, we denote the values that the functions $u$ and $f$ take at the point $(x_{i},\,y_{j})$ by $u_{i,\,j}$ and $f_{i,\,j}$. In other words,

$\left\{\begin{array}{l} u_{i,\,j} := u(x_{i},\,y_{j}) \\[12pt] f_{i,\,j} := f(x_{i},\,y_{j}) \end{array}\right.,\,\,\forall\,\,i,\,j = 0,\,\cdots,\,N + 1$.

Often, the mesh size, denoted $h$ and defined to be the maximum distance between two neighboring points along any axis, determines how fast our approximate solution can approach the exact when we increase the problem size (i.e. add more points). Since the points are equally spaced, the mesh size is,

$\boxed{h = \displaystyle\frac{1}{N + 1}}$.

The second step is to discretize the differential operator $\Delta$. We must do so, since the continuous version does not make sense in the absence of neighborhoods of points. The most simple approximation is given by the 5-point stencil:

$\displaystyle -\Delta u_{i,\,j} \approx \frac{-u_{i-1,\,j} - u_{i+1,\,j} + 4u_{i,\,j} - u_{i,\,j-1} - u_{i,\,j+1}}{h^{2}}$

Notice that we approximated the second derivative of $u$ at the point $(x_{i},\,y_{j})$ by taking a linear combination of the values of $u$ at the point and its four immediate neighbors. We can visualize the points and their contributions with a graph:

## c. Summary

We have all the ingredients to turn the Poisson’s equation into a matrix equation now, so let’s review what we just learned.

First, we required that the differential equation is satisfied at $N^{2}$ points (finitely many) in the domain:

$-\Delta u(x_{i},\,y_{j}) = f(x_{i},\,y_{j}),\,\,\forall\,\,i,\,j = 1,\,\cdots,\,N$.

Then, we replaced the Laplace operator with a discrete version by using the 5-point stencil:

$\boxed{-u_{i-1,\,j} - u_{i+1,\,j} + 4u_{i,\,j} - u_{i,\,j - 1} - u_{i,\,j + 1} = h^{2}f_{ij},\,\,\forall\,\,i,\,j = 1,\,\cdots,\,N}$.

Afterwards, we apply the zero Dirichlet BC. We can simply set $u_{i,\,j} = 0$ when either of its indices is 0 or $N + 1$. However, if we have nonzero Dirichlet BCs, we must move the known variables to the RHS.

Finally, we gather the unknowns in a natural order (e.g. traverse along the $x$-direction, then $y$). We can then write a matrix equation,

$\boxed{K\vec{u} = \vec{f}}$,

where the entries of $\vec{f}$ include $h^{2}$. The matrix $K \in \mathbb{R}^{N^{2} \times N^{2}}$ is sparse, because each row has at most 5 nonzero entries.

Note that, when we write a program for an iterative method, it is easier if we store the entries of $\vec{u}$ in a $(N + 2) \times (N + 2)$ double array (with extra size to handle boundaries). The double array also feels natural, since it mimics the layout of points on the mesh. The same goes for the entries of $\vec{f}$.

# 2. Classical iterative methods

Iterative methods such as Jacobi, Gauss-Seidel, SOR, and SSOR rely on decomposing (splitting) the matrix $K$ into a sum of matrices. They are so-called classical, and differ by how we split the matrix. On the other hand, Conjugate Gradient relies on creating subspaces, so we will study this method separately.

## a. Splitting

Given the equation $A\vec{x} = \vec{b}$, we can write $A$ as a sum of two matrices,

$A = P - Q$,

with one condition that $P$ is nonsingular.

Since $(P - Q)\vec{x} = \vec{b}$, we see that $\vec{x}$ satisfies the equation,

$\vec{x} = P^{-1}(Q\vec{x} + \vec{b})$.

We call an equation of the form $\vec{x} = f(\vec{x})$, such as that above, a fixed-point equation. Fixed-point equations are nice because we can solve them iteratively:

### Fixed-Point Method

$\begin{array}{l} \mbox{input: initial guess}\,\,\vec{x}_{0} \\[12pt] \mbox{for}\,\,k = 0,\,1,\,\cdots \\[12pt] \mbox{\hspace{0.6cm}} \vec{x}_{k + 1} := f(\vec{x}_{k}) = P^{-1}(Q\vec{x}_{k} + \vec{b}) \\[12pt] \mbox{end} \end{array}$

The question is, as $k \rightarrow \infty$, will $\vec{x}_{k}$ converge to the exact solution $\vec{x}$? Better yet, can we stop after finite iterations and be confident that the error $||\vec{x}_{k} - \vec{x}||$ is small?

Yes, if the matrix

$\boxed{R := P^{-1}Q}$

meets a certain condition.

## b. Convergence

Thanks to the Contraction Mapping Theorem, we know when the fixed-point method converges to the exact solution.

### Theorem 1. Convergence Criterion I

If $||R|| < 1$ for some induced matrix norm $||\cdot||$, then $\vec{x}_{k} \rightarrow \vec{x}$ as $k \rightarrow \infty$ for any initial guess $\vec{x}_{0}$. The solution $\vec{x}$ exists and is unique.

In fact, we can show that,

$||\vec{x}_{k} - \vec{x}|| \leq ||R||^{k} \cdot ||\vec{x}_{0} - \vec{x}||$,

i.e. with each iteration, we can reduce the initial absolute error by a factor of $||R||$.

However, this criterion is not useful in practice, since we do not know which induced matrix norm results in $||R|| < 1$. It is a sufficient condition for convergence, but not a necessary one. If we happen to get $||R|| \geq 1$ with the 1-norm, 2-norm, and $\infty$-norm, then we cannot conclude that the fixed-point method will not work.

Luckily, thanks to eigenvalues, we have another criterion, one that is a necessary and sufficient condition for convergence. The spectral radius, denoted $\rho(R)$, is the size of the largest eigenvalue of $R$. The largest eigenvalue is something that we can compute easily. These lemmas motivate using the spectral radius to decide convergence:

### Lemma (i).

For any induced matrix norm $||\cdot||$, we have $\rho(R) \leq ||R||$.

### Lemma (ii).

For any tolerance $\varepsilon > 0$, there exists an induced matrix norm $||\cdot||$ such that $||R|| < \rho(R) + \varepsilon$.

### Proof for Lemma (i).

Let $\lambda$ be the largest eigenvalue of $R$, and $\vec{y}$ a corresponding unit eigenvector. Then,

$\rho(R) = |\lambda| = ||\lambda\vec{y}|| = ||R\vec{y}|| \leq ||R||$.

We see that the spectral radius is a very good estimate of an induced matrix norm. Furthermore, we can say the following:

### Theorem 2. Convergence Criterion II

The fixed-point method $\vec{x}_{k + 1} = P^{-1}(Q\vec{x}_{k} + \vec{b})$ converges to $\vec{x}$ for any initial guess $\vec{x}_{0}$, if and only if, $\rho(R) < 1$.

### Proof for Theorem 2.

($\Leftarrow$)  If $\rho(R) < 1$, we can find an $\varepsilon$ such that $\rho(R) + \varepsilon < 1$ (density of real numbers). Lemma (ii) guarantees the existence of an induced matrix norm such that $||R|| < 1$. Finally, we apply Theorem 1.

($\Rightarrow$)  We prove by contrapositive. We assume that $\rho(R) \geq 1$ and will find an initial guess $\vec{x}_{0}$ for which the fixed-point method does not converge.

Select the initial guess to be,

$\vec{x}_{0} = \vec{x} - \vec{y}$,

where $\vec{x}$ is the exact solution and $\vec{y}$ is a unit eigenvector corresponding to the largest eigenvalue of $R$, denoted $\lambda$.

Then,

\begin{array}{l} \begin{aligned} \vec{x}_{1} &= P^{-1}\Bigl[Q(\vec{x} - \vec{y}) + \vec{b}\Bigr] \\[12pt] &= P^{-1}(Q\vec{x} + \vec{b}) - R\vec{y} \\[12pt] &= \vec{x} - \lambda\vec{y} \end{aligned} \\ \\ \\ \begin{aligned} \vec{x}_{2} &= P^{-1}\Bigl[Q(\vec{x} - \lambda\vec{y}) + \vec{b}\Bigr] \\[12pt] &= P^{-1}(Q\vec{x} + \vec{b}) - \lambda R\vec{y} \\[12pt] &= \vec{x} - \lambda^{2}\vec{y} \end{aligned} \\ \\ \\ \vdots \\ \\ \\ \vec{x}_{k} = \vec{x} - \lambda^{k}\vec{y} \end{array}

Therefore, as $k \rightarrow \infty$,

$||\vec{x}_{k} - \vec{x}|| = |\lambda|^{k} \cdot ||\vec{y}|| = \rho(R)^{k} \rightarrow \infty$.

# Notes

Whew. That was probably a lot to read and digest. I suggest that we stop here today and meet each classical iterative method next time. Much like how $||R||$ indicates the rate of convergence, the spectral radius $\rho(R)$ can also. We will analyze the spectral radius for each method and see its effect on convergence for Poisson’s equation.

We can also discretize the operator $-\Delta$ with a different stencil to study how it affects the matrix equation $K\vec{u} = \vec{f}$ and the approximate solution $\vec{u}$. For example, there is a 9-point stencil:

Finite difference method is great for solving PDEs on a simple domain such as a line, a rectangle, or a box, but not so much on a complex domain with slant edges or curves. In general, we use finite element method to solve PDEs. Nonetheless, finite difference method is great for studying iterative methods!