In mathematics, the generalized minimal residual method (usually abbreviated GMRES) is an iterative method for the numerical solution of a nonsymmetric system of linear equations. The method approximates the solution by the vector in a Krylov subspace with minimal residual. The Arnoldi iteration is used to find this vector.
The GMRES method was developed by Yousef Saad and Martin H. Schultz in 1986.^{[1]} GMRES is a generalization of the MINRES method developed by Chris Paige and Michael Saunders in 1975, and a special case of the DIIS method developed by Peter Pulay in 1980. DIIS is also applicable to nonlinear systems.
The method
Denote the Euclidean norm of any vector v by \|v\|. Denote the system of linear equations to be solved by

Ax = b.
The matrix A is assumed to be invertible of size m-by-m. Furthermore, it is assumed that b is normalized, i.e., that \|b\| = 1.
The nth Krylov subspace for this problem is

K_n = K_n(A,b) = \operatorname{span} \, \{ b, Ab, A^2b, \ldots, A^{n-1}b \}.
GMRES approximates the exact solution of Ax = b by the vector x_{n} ∈ K_{n} that minimizes the Euclidean norm of the residual r_{n} = Ax_{n} − b.
The vectors b, Ab, …, A^{n−1}b might be almost linearly dependent, so instead of this basis, the Arnoldi iteration is used to find orthonormal vectors

q_1, q_2, \ldots, q_n
which form a basis for K_{n}. Hence, the vector x_{n} ∈ K_{n} can be written as x_{n} = Q_{n}y_{n} with y_{n} ∈ R^{n}, where Q_{n} is the m-by-n matrix formed by q_{1}, …, q_{n}.
The Arnoldi process also produces an (n+1)-by-n upper Hessenberg matrix \tilde{H}_n with

AQ_n = Q_{n+1} \tilde{H}_n.
Because the columns of Q_{n+1} are orthonormal, we have

\| Ax_n - b \| = \| \tilde{H}_n y_n - \beta e_1 \|
where

e_1 = (1,0,0,\ldots,0)^T
is the first vector in the standard basis of R^{n+1}, and

\beta = \|b - Ax_0\|,
x_0 being the first trial vector (usually zero). Hence, x_n can be found by minimizing the Euclidean norm of the residual

r_n = \tilde{H}_n y_n - \beta e_1.
This is a linear least squares problem of size n.
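The reduction above is easy to check numerically. The following is a minimal NumPy sketch (an illustration with an arbitrary random test system, not an implementation from the cited literature): it runs the Arnoldi process and verifies both the relation AQ_n = Q_{n+1}\tilde{H}_n and the equality of the two residual norms.

import numpy as np

def arnoldi(A, b, n):
    """n steps of the Arnoldi process (modified Gram-Schmidt).

    Returns Q with n+1 orthonormal columns spanning K_{n+1}(A, b) and
    the (n+1)-by-n upper Hessenberg matrix H with A Q_n = Q_{n+1} H.
    Assumes no breakdown (H[k+1, k] never vanishes).
    """
    m = A.shape[0]
    Q = np.zeros((m, n + 1))
    H = np.zeros((n + 1, n))
    Q[:, 0] = b / np.linalg.norm(b)
    for k in range(n):
        v = A @ Q[:, k]
        for j in range(k + 1):          # orthogonalize against q_1, ..., q_{k+1}
            H[j, k] = Q[:, j] @ v
            v -= H[j, k] * Q[:, j]
        H[k + 1, k] = np.linalg.norm(v)
        Q[:, k + 1] = v / H[k + 1, k]
    return Q, H

rng = np.random.default_rng(0)
A = rng.standard_normal((8, 8))
b = rng.standard_normal(8)
beta = np.linalg.norm(b)

n = 4
Q, H = arnoldi(A, b, n)
print(np.allclose(A @ Q[:, :n], Q @ H))           # A Q_n = Q_{n+1} H~_n

y = rng.standard_normal(n)                        # any y in R^n
x = Q[:, :n] @ y                                  # x_n = Q_n y
e1 = np.zeros(n + 1); e1[0] = 1.0
print(np.isclose(np.linalg.norm(A @ x - b),
                 np.linalg.norm(H @ y - beta * e1)))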
This yields the GMRES method. On the nth iteration:

calculate q_n with the Arnoldi method;

find the y_n which minimizes r_{n};

compute x_n = Q_n y_n ;

repeat if the residual is not yet small enough.
At every iteration, a matrix-vector product Aq_{n} must be computed. This costs about 2m^{2} floating-point operations for general dense matrices of size m, but the cost can decrease to O(m) for sparse matrices. In addition to the matrix-vector product, O(n m) floating-point operations must be performed at the nth iteration.
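Putting the pieces together gives a compact, unpreconditioned GMRES sketch. For clarity it re-solves the small least squares problem from scratch at every step with NumPy's lstsq, rather than using the incremental QR update described later; the stopping tolerance and the test system are arbitrary illustrative choices.

import numpy as np

def gmres_naive(A, b, tol=1e-10):
    """Textbook GMRES with x_0 = 0: minimize ||Ax - b|| over K_n."""
    m = A.shape[0]
    beta = np.linalg.norm(b)
    Q = np.zeros((m, m + 1))
    H = np.zeros((m + 1, m))
    Q[:, 0] = b / beta
    for n in range(1, m + 1):
        # One Arnoldi step: extend the orthonormal basis of K_n.
        v = A @ Q[:, n - 1]
        for j in range(n):
            H[j, n - 1] = Q[:, j] @ v
            v -= H[j, n - 1] * Q[:, j]
        H[n, n - 1] = np.linalg.norm(v)
        if H[n, n - 1] > 1e-14:                  # guard against (lucky) breakdown
            Q[:, n] = v / H[n, n - 1]
        # Least squares problem of size n: min ||H~_n y - beta e_1||.
        rhs = np.zeros(n + 1); rhs[0] = beta
        y, *_ = np.linalg.lstsq(H[: n + 1, :n], rhs, rcond=None)
        res = np.linalg.norm(H[: n + 1, :n] @ y - rhs)
        if res < tol * beta:                     # residual small enough?
            break
    return Q[:, :n] @ y                          # x_n = Q_n y_n

# Illustrative use:
rng = np.random.default_rng(1)
A = rng.standard_normal((30, 30)) + 6 * np.eye(30)
b = rng.standard_normal(30)
x = gmres_naive(A, b)
print(np.linalg.norm(A @ x - b))

A practical implementation would instead update the QR factorization of \tilde{H}_n with Givens rotations, as in the section on solving the least squares problem below.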
Convergence
The nth iterate minimizes the residual in the Krylov subspace K_{n}. Since every subspace is contained in the next subspace, the residual decreases monotonically. After m iterations, where m is the size of the matrix A, the Krylov space K_{m} is the whole of R^{m} and hence the GMRES method arrives at the exact solution. However, the idea is that after a small number of iterations (relative to m), the vector x_{n} is already a good approximation to the exact solution.
This does not happen in general. Indeed, a theorem of Greenbaum, Pták and Strakoš states that for every monotonically decreasing sequence a_{1}, …, a_{m−1}, a_{m} = 0, one can find a matrix A such that \|r_n\| = a_{n} for all n, where r_{n} is the residual defined above. In particular, it is possible to find a matrix for which the residual norm stays constant for m − 1 iterations and drops to zero only at the last iteration.
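The cyclic shift matrix is a standard concrete instance of this worst case (assuming x_0 = 0): with b = e_1 the residual norm stays at 1 for m − 1 iterations and vanishes only at step m. A small numerical check, minimizing the residual over each Krylov subspace directly:

import numpy as np

m = 10
A = np.roll(np.eye(m), 1, axis=0)     # cyclic shift: A e_i = e_{i+1}, A e_m = e_1
b = np.zeros(m); b[0] = 1.0

for n in range(1, m + 1):
    # Basis of K_n(A, b) = span{b, Ab, ..., A^{n-1} b}.
    K = np.column_stack([np.linalg.matrix_power(A, j) @ b for j in range(n)])
    c, *_ = np.linalg.lstsq(A @ K, b, rcond=None)   # min ||A K c - b||
    print(n, round(np.linalg.norm(A @ (K @ c) - b), 12))
# prints 1.0 for n = 1, ..., 9 and 0.0 for n = 10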
In practice, though, GMRES often performs well. This can be proven in specific situations. If A is positive definite, then

\|r_n\| \leq \left( 1 - \frac{\lambda_{\mathrm{min}}^2\left(\tfrac{1}{2}(A^T + A)\right)}{\lambda_{\mathrm{max}}(A^T A)} \right)^{n/2} \|r_0\|,
where \lambda_{\mathrm{min}}(M) and \lambda_{\mathrm{max}}(M) denote the smallest and largest eigenvalue of the matrix M, respectively.
If A is symmetric and positive definite, then we even have

\|r_n\| \leq \left( \frac{\kappa_2(A)^2 - 1}{\kappa_2(A)^2} \right)^{n/2} \|r_0\|,
where \kappa_2(A) denotes the condition number of A in the Euclidean norm.
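Both bounds are straightforward to evaluate numerically. A sketch with an arbitrary symmetric positive definite test matrix; the quantities printed are the per-iteration contraction factors, i.e. the square roots of the bracketed expressions above:

import numpy as np

rng = np.random.default_rng(2)
M = rng.standard_normal((50, 50))
A = M @ M.T + 50 * np.eye(50)                  # symmetric positive definite

# Positive definite bound: factor from lambda_min and lambda_max.
lam_min = np.linalg.eigvalsh(0.5 * (A.T + A))[0]
lam_max = np.linalg.eigvalsh(A.T @ A)[-1]
factor_pd = np.sqrt(1 - lam_min**2 / lam_max)

# SPD bound via the Euclidean condition number kappa_2(A).
kappa = np.linalg.cond(A, 2)
factor_spd = np.sqrt((kappa**2 - 1) / kappa**2)

print(factor_pd, factor_spd)                   # both strictly below 1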
In the general case, where A is not positive definite, we have

\|r_n\| \le \inf_{p \in P_n} \|p(A)\| \, \|r_0\| \le \kappa_2(V) \inf_{p \in P_n} \max_{\lambda \in \sigma(A)} |p(\lambda)| \, \|r_0\|,
where P_{n} denotes the set of polynomials of degree at most n with p(0) = 1, V is the matrix of eigenvectors in the spectral decomposition A = V \Lambda V^{-1} (assuming A is diagonalizable), and σ(A) is the spectrum of A. Roughly speaking, this says that fast convergence occurs when the eigenvalues of A are clustered away from the origin and A is not too far from normality.^{[2]}
All these inequalities bound only the residuals instead of the actual error, that is, the distance between the current iterate x_{n} and the exact solution.
Extensions of the method
Like other iterative methods, GMRES is usually combined with a preconditioning method in order to speed up convergence.
The cost of the iterations grows as O(n^{2}), where n is the iteration number. Therefore, the method is sometimes restarted after a number, say k, of iterations, with x_{k} as the initial guess. The resulting method is called GMRES(k) or restarted GMRES.
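Libraries typically expose the restart length directly. A sketch, assuming SciPy is available; here gmres is scipy.sparse.linalg.gmres, the restart keyword is k, and the Poisson-type tridiagonal test matrix is an illustrative choice:

import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import gmres

n = 1000
A = diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n), format="csr")
b = np.ones(n)

# GMRES(20): restart after every 20 inner iterations.
x, info = gmres(A, b, restart=20, maxiter=500)
print(info, np.linalg.norm(A @ x - b))        # info == 0 means converged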
Comparison with other solvers
The Arnoldi iteration reduces to the Lanczos iteration for symmetric matrices. The corresponding Krylov subspace method is the minimal residual method (MinRes) of Paige and Saunders. Unlike in the nonsymmetric case, the MinRes method is given by a three-term recurrence relation. It can be shown that there is no Krylov subspace method for general matrices that is given by a short recurrence relation and yet minimizes the norms of the residuals, as GMRES does.
Another class of methods builds on the unsymmetric Lanczos iteration, in particular the BiCG method. These use a three-term recurrence relation, but they do not attain the minimum residual, and hence the residual does not decrease monotonically for these methods. Convergence is not even guaranteed.
The third class is formed by methods like CGS and BiCGSTAB. These also work with a three-term recurrence relation (hence, without optimality), and they can even terminate prematurely without achieving convergence. The idea behind these methods is to choose the generating polynomials of the iteration sequence suitably.
None of these three classes is the best for all matrices; there are always examples in which one class outperforms the others. Therefore, multiple solvers are tried in practice to see which one is best for a given problem.
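With a library like SciPy this amounts to running the candidate solvers side by side. A sketch with an arbitrary nonsymmetric tridiagonal test matrix; solver names and tolerances are illustrative:

import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import gmres, bicgstab, cgs

n = 500
A = diags([-1.0, 2.5, -1.2], [-1, 0, 1], shape=(n, n), format="csr")
b = np.ones(n)

for name, solver in [("GMRES", gmres), ("BiCGSTAB", bicgstab), ("CGS", cgs)]:
    x, info = solver(A, b, maxiter=1000)
    print(f"{name:9s} info={info} residual={np.linalg.norm(A @ x - b):.2e}")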
Solving the least squares problem
One part of the GMRES method is to find the vector y_n which minimizes

\| \tilde{H}_n y_n - \beta e_1 \|.
Note that \tilde{H}_n is an (n+1)-by-n matrix, hence it gives an overdetermined linear system of n+1 equations for n unknowns.
The minimum can be computed using a QR decomposition: find an (n+1)-by-(n+1) orthogonal matrix Ω_{n} and an (n+1)-by-n upper triangular matrix \tilde{R}_n such that

\Omega_n \tilde{H}_n = \tilde{R}_n.
The triangular matrix has one more row than it has columns, so its bottom row consists of zeros. Hence, it can be decomposed as

\tilde{R}_n = \begin{bmatrix} R_n \\ 0 \end{bmatrix},
where R_n is an n-by-n (thus square) triangular matrix.
The QR decomposition can be updated cheaply from one iteration to the next, because the Hessenberg matrices differ only by a row of zeros and a column:

\tilde{H}_{n+1} = \begin{bmatrix} \tilde{H}_n & h_{n+1} \\ 0 & h_{n+2,n+1} \end{bmatrix},
where h_{n+1} = (h_{1,n+1}, …, h_{n+1,n+1})^{T}. This implies that premultiplying the Hessenberg matrix by Ω_{n}, extended with an extra row and column from the identity matrix, yields a matrix that is almost triangular:

\begin{bmatrix} \Omega_n & 0 \\ 0 & 1 \end{bmatrix} \tilde{H}_{n+1} = \begin{bmatrix} R_n & r_{n+1} \\ 0 & \rho \\ 0 & \sigma \end{bmatrix}
This would be triangular if σ were zero. To remedy this, one needs the Givens rotation

G_n = \begin{bmatrix} I_{n} & 0 & 0 \\ 0 & c_n & s_n \\ 0 & -s_n & c_n \end{bmatrix}
where

c_n = \frac{\rho}{\sqrt{\rho^2+\sigma^2}} \quad\mbox{and}\quad s_n = \frac{\sigma}{\sqrt{\rho^2+\sigma^2}}.
With this Givens rotation, we form

\Omega_{n+1} = G_n \begin{bmatrix} \Omega_n & 0 \\ 0 & 1 \end{bmatrix}.
Indeed,

\Omega_{n+1} \tilde{H}_{n+1} = \begin{bmatrix} R_n & r_{n+1} \\ 0 & r_{n+1,n+1} \\ 0 & 0 \end{bmatrix} \quad\text{with}\quad r_{n+1,n+1} = \sqrt{\rho^2+\sigma^2}
is a triangular matrix.
Given the QR decomposition, the minimization problem is easily solved by noting that

\| \tilde{H}_n y_n - \beta e_1 \| = \| \Omega_n (\tilde{H}_n y_n - \beta e_1) \| = \| \tilde{R}_n y_n - \beta \Omega_n e_1 \|.
Denoting the vector \beta\Omega_ne_1 by

\tilde{g}_n = \begin{bmatrix} g_n \\ \gamma_n \end{bmatrix}
with g_{n} ∈ R^{n} and γ_{n} ∈ R, this is

\| \tilde{H}_n y_n - \beta e_1 \| = \| \tilde{R}_n y_n - \beta \Omega_n e_1 \| = \left\| \begin{bmatrix} R_n \\ 0 \end{bmatrix} y_n - \begin{bmatrix} g_n \\ \gamma_n \end{bmatrix} \right\|.
The vector y_n that minimizes this expression is given by

y_n = R_n^{-1} g_n.
Again, the vectors g_n are easy to update.^{[3]} Moreover, the minimum value of the residual norm is |γ_n|, so convergence can be monitored without forming x_n explicitly.
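A minimal sketch of this update (variable names follow the text; the random Hessenberg test matrix is an illustrative choice): each new column of \tilde{H} is hit with the stored rotations, one new rotation is appended, g̃_n is rotated along, and y_n finally comes from a back substitution with R_n.

import numpy as np

def givens_step(h, cs, sn, g):
    """Fold one new Hessenberg column h (length n+1 at step n) into the
    QR factorization. cs, sn store the rotation coefficients; g is the
    rotated right-hand side beta * Omega e_1. After the call, |g[-1]|
    is the current residual norm |gamma_n|. Assumes no breakdown."""
    h = np.array(h, dtype=float)
    for i, (c, s) in enumerate(zip(cs, sn)):    # apply previous rotations
        h[i], h[i + 1] = c * h[i] + s * h[i + 1], -s * h[i] + c * h[i + 1]
    rho, sigma = h[-2], h[-1]
    r = np.hypot(rho, sigma)                    # sqrt(rho^2 + sigma^2)
    c, s = rho / r, sigma / r                   # new rotation zeroing sigma
    cs.append(c); sn.append(s)
    h[-2], h[-1] = r, 0.0
    g = np.append(g, 0.0)
    g[-2], g[-1] = c * g[-2] + s * g[-1], -s * g[-2] + c * g[-1]
    return h[:-1], g                            # new column of R_n, new g

# Illustrative check against a dense least squares solve.
rng = np.random.default_rng(3)
n, beta = 4, 2.0
Ht = np.triu(rng.standard_normal((n + 1, n)), k=-1)   # (n+1)-by-n Hessenberg
cs, sn, R_cols, g = [], [], [], np.array([beta])
for k in range(n):
    col, g = givens_step(Ht[: k + 2, k], cs, sn, g)
    R_cols.append(col)
R = np.zeros((n, n))
for k, col in enumerate(R_cols):
    R[: k + 1, k] = col
y = np.linalg.solve(R, g[:-1])                  # y_n = R_n^{-1} g_n
rhs = np.zeros(n + 1); rhs[0] = beta
y_ref, *_ = np.linalg.lstsq(Ht, rhs, rcond=None)
print(np.allclose(y, y_ref))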
Notes

^ Saad and Schultz

^ Trefethen & Bau, Thm 35.2

^ Stoer and Bulirsch, §8.7.2
References

A. Meister, Numerik linearer Gleichungssysteme, 2nd edition, Vieweg 2005, ISBN 9783528131357.

Y. Saad, Iterative Methods for Sparse Linear Systems, 2nd edition, Society for Industrial and Applied Mathematics, 2003. ISBN 9780898715347.

Y. Saad and M.H. Schultz, "GMRES: A generalized minimal residual algorithm for solving nonsymmetric linear systems", SIAM J. Sci. Stat. Comput., 7:856–869, 1986. doi:10.1137/0907058.

J. Stoer and R. Bulirsch, Introduction to numerical analysis, 3rd edition, Springer, New York, 2002. ISBN 9780387954523.

Lloyd N. Trefethen and David Bau, III, Numerical Linear Algebra, Society for Industrial and Applied Mathematics, 1997. ISBN 9780898713619.

Dongarra et al., Templates for the Solution of Linear Systems: Building Blocks for Iterative Methods, 2nd edition, SIAM, Philadelphia, 1994.