SSMFE_EXPERT - Sparse Symmetric Matrix-Free Eigensolver (Expert interface)
#include <spral_ssmfe.h> /* or <spral.h> for all packages */
Purpose
This package computes extreme (leftmost and/or rightmost) eigenpairs \(\{\lambda_i, x_i\}\) of the following eigenvalue problems:
the standard eigenvalue problem
\[A x = \lambda x,\]the generalized eigenvalue problem
\[A x = \lambda B x,\]the buckling problem
\[B x = \lambda A x,\]
where \(A\) and \(B\) are real symmetric (or Hermitian) matrices and \(B\) is positive definite.
The package SSMFE provides a more user-friendly wrapper around this code, while SSMFE_CORE provides the lower-level implementation of the core solver for which this package is itself a wrapper.
Major version history
- 2014-11-20 Version 1.0.0
Initial release
[for detail please see ChangeLog]
Usage overview
The eigensolver subroutines behind this package implement a block iterative algorithm. The block nature of this algorithm allows the user to benefit from highly optimized linear algebra subroutines and from the ubiquitous multicore architecture of modern computers. It also makes this algorithm more reliable than Krylov-based algorithms employed e.g. by ARPACK in the presence of clustered eigenvalues. However, convergence of the iterations may be slow if the density of the spectrum is high.
Thus, good performance (in terms of speed) is contingent on two factors:
the number of desired eigenpairs must be substantial (e.g. not fewer than the number of CPU cores), and
a convergence acceleration technique must be employed.
The acceleration techniques that can be used are shift-and-invert and preconditioning.
The former requires the direct solution of linear systems with the matrix \(A\) or its linear combination with \(B\), for which a sparse symmetric indefinite solver (such as HSL_MA97 or SPRAL_SSIDS) can be employed.
The latter applies to the case of positive definite \(A\) and requires a matrix or an operator \(T\), called a preconditioner, such that the vector \(v = T f\) is an approximation to the solution \(u\) of the system \(A u = f\) (see the simple example below). Note: This technique is only recommended for experienced users.
In this expert interface, the user must handle storage of all vectors, facilitating advanced memory handling techniques required for parallel, hybrid and/or out-of-core execution. If there is no requirement to store these vectors, consider using the simplified interface of SSMFE instead.
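To make the role of the preconditioner described above concrete, the following sketch shows one possible operator \(T\): a simple Jacobi (diagonal) preconditioner applied to a block of vectors, as requested during reverse communication. The routine name apply_jacobi and the storage layout are illustrative only and are not part of this package.
#include <stddef.h>

/* Minimal sketch (not part of SPRAL): apply T = diag(A)^{-1} to a block of
 * nx vectors of length n, i.e. v_j = T u_j for j = 0, ..., nx-1.
 * The vectors are stored contiguously, one after another. */
void apply_jacobi(int n, int nx, const double *diag_A,
                  const double *u, double *v) {
   for(int j=0; j<nx; j++)
      for(int i=0; i<n; i++)
         v[(size_t)j*n + i] = u[(size_t)j*n + i] / diag_A[i];
}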
Subroutines
To use the solver procedures, the user must maintain a workspace of (kw+1) blocks, each containing m vectors of size n. For notational convenience we refer to this workspace as a C array W[kw+1][m][n], but the user is free to store it as they wish. The following table gives the minimum value of kw for each setup:
Problem              minAprod=T   minAprod=T   minAprod=F   minAprod=F
                     minBprod=T   minBprod=F   minBprod=T   minBprod=F
standard                  7            5            3            3
standard_shift            7            5           N/A          N/A
generalized               7            5            5            3
generalized_shift         7            7           N/A          N/A
buckling                  7            7           N/A          N/A
Further, the user must also store the converged eigenvectors \(X\) and, for generalized problems, their \(B\)-images \(BX\) in separate storage, e.g. X[mep][n], BX[mep][n].
For convenience of notation we use the convention that x[i:j] denotes indices i through j (inclusive) of the vector x[].
As well as being output, these arrays may be needed by the routine from time to time to reorthogonalize against.
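As an illustration of these storage conventions, the sketch below allocates the workspace W, the eigenvector storage X and the reverse-communication arrays for one possible configuration. The value of kw must be taken from the table above for the chosen problem type and options, and all names and sizes here are illustrative only; for generalized problems a similar array BX would also be needed.
/* Illustrative storage layout only; the user may organise this storage
 * differently (e.g. distributed or out-of-core). */
#include <stdlib.h>

int main(void) {
   int n   = 400;  /* problem size                        */
   int m   = 6;    /* block size of the iteration (>= 2)  */
   int kw  = 7;    /* minimum kw from the table above     */
   int mep = 10;   /* eigenpair storage size              */

   double (*W)[m][n]      = malloc((kw+1) * sizeof(*W));   /* kw+1 blocks of m vectors of size n */
   double (*X)[n]         = malloc(mep * sizeof(*X));      /* converged eigenvectors             */
   double (*rr)[2*m][2*m] = malloc(3 * sizeof(*rr));       /* rr workspace: 3 blocks of 2m x 2m  */
   int    *ind            = malloc(m * sizeof(*ind));      /* reordering index workspace         */
   double *lambda         = malloc(mep * sizeof(*lambda)); /* eigenvalue storage                 */

   /* ... reverse-communication loop (see the example at the end of this document) ... */

   free(W); free(X); free(rr); free(ind); free(lambda);
   return 0;
}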
-
void spral_ssmfe_default_options(struct spral_ssmfe_options *options)
Initialises members of options structure to default values.
- Parameters:
options – Structure to be initialised.
-
void spral_ssmfe_expert_standard_double(struct spral_ssmfe_rcid *rci, int left, int mep, double *lambda, int m, double *rr, int *ind, void **keep, const struct spral_ssmfe_options *options, struct spral_ssmfe_inform *inform)
Computes the left-most eigenpairs of the standard eigenvalue problem
\[Ax = \lambda x.\]
Optionally uses preconditioning.
Uses reverse communication. Upon return the user must perform the task specified by the rci parameter and recall the routine. Possible values of rci.job and the associated tasks are:

rci.job  Task to be performed
-3       None. Fatal error, see inform.flag.
-2       Failed to converge, see inform.flag.
-1       None. Computation complete.
1        Calculate \(\bar{V} = AU\).
2        Apply the preconditioner, \(\bar{V} = TU\) (copy if \(T=I\)).
5        Copy converged eigenvectors \(X\) to user storage:
         if rci.i>0, from W[rci.kx][rci.jx:rci.jx+rci.nx-1][:];
         else, from W[rci.kx][rci.jx-rci.nx+1:rci.jx][:].
11       If rci.i == 0, copy \(\bar{V} = U\).
         Otherwise, reorder the columns of block rci.kx so that column ind[j] becomes the new column j for j=0, ..., rci.nx-1.
         Note: if rci.kx == rci.ky, only reorder once.
12       Compute the dot products \(r_{ii} = U_i \cdot \bar{V}_i\).
13       Perform the scalings \(U_i = U_i/\sqrt{(U_i\cdot \bar{V}_i)}\) and \(\bar{V}_i = \bar{V}_i/\sqrt{(U_i\cdot \bar{V}_i)}\) for each column \(U_i\) and \(\bar{V}_i\) of \(U\) and \(\bar{V}\).
         Note: if rci.kx == rci.ky, only scale once.
14       Perform the updates \(\bar{V}_i = \bar{V}_i + r_{ii} U_i\) for each column \(\bar{V}_i\) of \(\bar{V}\).
15       Perform the update \(R = \alpha U^T V + \beta R\).
16       Perform the update \(V = \alpha U R + \beta V\).
17       Perform the update \(U = \alpha U R\). Note: \(V\) may be used as a workspace.
21       Orthogonalize the columns of \(V\) to all vectors \(X\) by solving \((X^TX) Q = X^T \bar{V}\) for \(Q\) and updating \(U = U - XQ\).
22       Orthogonalize the columns of \(U\) to all vectors \(X\) by solving \((X^TX) Q = X^T U\) for \(Q\) and updating \(U = U - XQ\).
999      Restart:
         If rci.k>0: a restart is suggested with block size m >= rci.nx + rci.i + rci.j, adjusting the workspace size to match. Set rci.i=0 and rci.j=0 and recall the routine. If a restart is not desirable, the routine may be recalled with no change to its parameters.
         If rci.k=0: a restart with the same block size is required.
         In both cases, the first block W[0][:][:] should retain vectors rci.jx:rci.jx+rci.nx-1, with the remaining vectors filled randomly so that the entire set of columns is linearly independent of each other and of the converged eigenvectors.

The matrices are defined as follows:

\(U\) = W[rci.kx][rci.jx:rci.jx+rci.nx-1][:]
\(V\) = W[rci.ky][rci.jy:rci.jy+rci.ny-1][:]
\(\bar{V}\) = W[rci.ky][rci.jy:rci.jy+rci.nx-1][:]
\(R\) = rr[rci.k][rci.j:rci.j+rci.ny-1][rci.i:rci.i+rci.nx-1]

and \(\alpha\) and \(\beta\) are given by rci.alpha and rci.beta respectively. We use the notation \(r_{ii}\) to refer to the \(i\)-th diagonal element of \(R\), being rr[rci.k][rci.j+i-1][rci.i+i-1].
- Parameters:
rci – Reverse communication type. rci.job must be set to 0 before the first call.
left – Number of left eigenpairs to find.
mep – Number of working eigenpairs. See the method section for guidance on selecting a good value. Must be at least left.
lambda[mep] – Current eigenvalue estimates in ascending order.
m – Block size of workspace W. Must be at least 2.
rr[3][2*m][2*m] – Reverse communication workspace.
ind[m] – Reverse communication workspace.
keep – Internal workspace used by the routine.
options – Specifies the algorithm options to be used.
inform – Returns information about the execution of the routine.
-
void spral_ssmfe_expert_standard_double_complex(struct spral_ssmfe_rciz *rci, int left, int mep, double *lambda, int m, double complex *rr, int *ind, void **keep, const struct spral_ssmfe_options *options, struct spral_ssmfe_inform *inform)
As spral_ssmfe_expert_standard_double(), but the types of rci and rr are changed to support type double complex.
-
void spral_ssmfe_expert_standard_shift_double(struct spral_ssmfe_rcid *rci, double sigma, int left, int right, int mep, double *lambda, int m, double *rr, int *ind, void **keep, const struct spral_ssmfe_options *options, struct spral_ssmfe_inform *inform)
Computes eigenpairs of the standard eigenvalue problem
\[Ax = \lambda x\]
in the vicinity of a given value \(\sigma\).
Uses reverse communication. Upon return the user must perform the task specified by the rci parameter and recall the routine. Possible values of rci.job and the associated tasks are:

rci.job  Task to be performed
-3       None. Fatal error, see inform.flag.
-2       Failed to converge, see inform.flag.
-1       None. Computation complete.
2        Apply the preconditioner, \(\bar{V} = TU\) (copy if \(T=I\)).
5        Copy converged eigenvectors \(X\) to user storage:
         if rci.i>0, from W[rci.kx][rci.jx:rci.jx+rci.nx-1][:];
         else, from W[rci.kx][rci.jx-rci.nx+1:rci.jx][:].
9        Compute \(V = (A-\sigma I)^{-1} U\).
11       If rci.i == 0, copy \(\bar{V} = U\).
         Otherwise, reorder the columns of block rci.kx so that column ind[j] becomes the new column j for j=0, ..., rci.nx-1.
         Note: if rci.kx == rci.ky, only reorder once.
12       Compute the dot products \(r_{ii} = U_i \cdot \bar{V}_i\).
13       Perform the scalings \(U_i = U_i/\sqrt{(U_i\cdot \bar{V}_i)}\) and \(\bar{V}_i = \bar{V}_i/\sqrt{(U_i\cdot \bar{V}_i)}\) for each column \(U_i\) and \(\bar{V}_i\) of \(U\) and \(\bar{V}\).
         Note: if rci.kx == rci.ky, only scale once.
14       Perform the updates \(\bar{V}_i = \bar{V}_i + r_{ii} U_i\) for each column \(\bar{V}_i\) of \(\bar{V}\).
15       Perform the update \(R = \alpha U^T V + \beta R\).
16       Perform the update \(V = \alpha U R + \beta V\).
17       Perform the update \(U = \alpha U R\). Note: \(V\) may be used as a workspace.
21       Orthogonalize the columns of \(V\) to all vectors \(X\) by solving \((X^TX) Q = X^T \bar{V}\) for \(Q\) and updating \(U = U - XQ\).
22       Orthogonalize the columns of \(U\) to all vectors \(X\) by solving \((X^TX) Q = X^T U\) for \(Q\) and updating \(U = U - XQ\).
999      Restart:
         If rci.k>0: a restart is suggested with block size m >= rci.nx + rci.i + rci.j, adjusting the workspace size to match. Set rci.i=0 and rci.j=0 and recall the routine. If a restart is not desirable, the routine may be recalled with no change to its parameters.
         If rci.k=0: a restart with the same block size is required.
         In both cases, the first block W[0][:][:] should retain vectors rci.jx:rci.jx+rci.nx-1, with the remaining vectors filled randomly so that the entire set of columns is linearly independent of each other and of the converged eigenvectors.

The matrices are defined as follows:

\(U\) = W[rci.kx][rci.jx:rci.jx+rci.nx-1][:]
\(V\) = W[rci.ky][rci.jy:rci.jy+rci.ny-1][:]
\(\bar{V}\) = W[rci.ky][rci.jy:rci.jy+rci.nx-1][:]
\(R\) = rr[rci.k][rci.j:rci.j+rci.ny-1][rci.i:rci.i+rci.nx-1]

and \(\alpha\) and \(\beta\) are given by rci.alpha and rci.beta respectively. We use the notation \(r_{ii}\) to refer to the \(i\)-th diagonal element of \(R\), being rr[rci.k][rci.j+i-1][rci.i+i-1].
- Parameters:
rci – Reverse communication type. rci.job must be set to 0 before the first call.
sigma – Shift value \(\sigma\).
left – Number of left eigenpairs to find.
right – Number of right eigenpairs to find.
mep – Number of working eigenpairs. See the method section for guidance on selecting a good value. Must be at least left+right.
lambda[mep] – Current eigenvalue estimates in ascending order.
m – Block size of workspace W. Must be at least 2.
rr[3][2*m][2*m] – Reverse communication workspace.
ind[m] – Reverse communication workspace.
keep – Internal workspace used by the routine.
options – Specifies the algorithm options to be used.
inform – Returns information about the execution of the routine.
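The only task in this driver with no counterpart in the preconditioning example at the end of this document is rci.job = 9, the solve with the shifted matrix. The fragment below sketches how it might slot into the reverse-communication switch; solve_shifted() is a placeholder for a user routine that applies a previously computed sparse \(LDL^T\) factorization of \(A - \sigma I\) (obtained, for example, with SPRAL_SSIDS or HSL_MA97), and the arrays W, rr, ind and lambda are assumed to be declared as in that example.
/* Sketch only: the shift-invert reverse-communication loop.  Jobs other than
 * 9 are handled exactly as in the preconditioning example below. */
rci.job = 0; keep = NULL;
while(true) {
   spral_ssmfe_expert_standard_shift_double(&rci, sigma, left, right, mep,
         lambda, m, &(*rr)[0][0][0], ind, &keep, &options, &inform);
   switch(rci.job) {
   case 9: /* V = (A - sigma*I)^{-1} U for a block of rci.nx vectors */
      solve_shifted(n, rci.nx, &(*W)[rci.kx][rci.jx][0],   /* right-hand sides U */
                               &(*W)[rci.ky][rci.jy][0]);  /* solutions V        */
      break;
   /* ... cases 5, 11-17, 21, 22 and 999 as in the example below ... */
   default:
      goto done; /* rci.job < 0: complete, not converged, or fatal error */
   }
}
done:
spral_ssmfe_expert_free(&keep, &inform);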
-
void spral_ssmfe_expert_standard_shift_double_complex(struct spral_ssmfe_rciz *rci, double sigma, int left, int right, int mep, double *lambda, int m, double complex *rr, int *ind, void **keep, const struct spral_ssmfe_options *options, struct spral_ssmfe_inform *inform)
As spral_ssmfe_expert_standard_shift_double(), but the types of rci and rr are changed to support type double complex.
-
void spral_ssmfe_expert_generalized_double(struct spral_ssmfe_rcid *rci, int left, int mep, double *lambda, int m, double *rr, int *ind, void **keep, const struct spral_ssmfe_options *options, struct spral_ssmfe_inform *inform)
Computes the left-most eigenpairs of the generalized eigenvalue problem
\[Ax = \lambda B x.\]
Optionally uses preconditioning.
Uses reverse communication. Upon return the user must perform the task specified by the rci parameter and recall the routine. Possible values of rci.job and the associated tasks are:

rci.job  Task to be performed
-3       None. Fatal error, see inform.flag.
-2       Failed to converge, see inform.flag.
-1       None. Computation complete.
1        Calculate \(\bar{V} = AU\).
2        Apply the preconditioner, \(\bar{V} = TU\) (copy if \(T=I\)).
3        Compute \(\bar{V} = BU\).
5        Copy converged eigenvectors \(X\) to user storage:
         if rci.i>0, from W[rci.kx][rci.jx:rci.jx+rci.nx-1][:];
         else, from W[rci.kx][rci.jx-rci.nx+1:rci.jx][:].
         Optionally save their \(B\)-images:
         if rci.i>0, from W[rci.ky][rci.jx:rci.jx+rci.nx-1][:];
         else, from W[rci.ky][rci.jx-rci.nx+1:rci.jx][:].
11       If rci.i == 0, copy \(\bar{V} = U\).
         Otherwise, reorder the columns of block rci.kx so that column ind[j] becomes the new column j for j=0, ..., rci.nx-1.
         Note: if rci.kx == rci.ky, only reorder once.
12       Compute the dot products \(r_{ii} = U_i \cdot \bar{V}_i\).
13       Perform the scalings \(U_i = U_i/\sqrt{(U_i\cdot \bar{V}_i)}\) and \(\bar{V}_i = \bar{V}_i/\sqrt{(U_i\cdot \bar{V}_i)}\) for each column \(U_i\) and \(\bar{V}_i\) of \(U\) and \(\bar{V}\).
         Note: if rci.kx == rci.ky, only scale once.
14       Perform the updates \(\bar{V}_i = \bar{V}_i + r_{ii} U_i\) for each column \(\bar{V}_i\) of \(\bar{V}\).
15       Perform the update \(R = \alpha U^T V + \beta R\).
16       Perform the update \(V = \alpha U R + \beta V\).
17       Perform the update \(U = \alpha U R\). Note: \(V\) may be used as a workspace.
21       \(B\)-orthogonalize the columns of \(V\) to all vectors \(X\) by solving \((X^TBX) Q = X^T \bar{V}\) for \(Q\) and updating \(U = U - XQ\) and \(\bar{V} = \bar{V} - BXQ\). The update of \(\bar{V}\) may be replaced by \(\bar{V} = BU\).
22       \(B\)-orthogonalize the columns of \(U\) to all vectors \(X\) by solving \((X^TBX) Q = X^T U\) for \(Q\) and updating \(U = U - BXQ\).
999      Restart:
         If rci.k>0: a restart is suggested with block size m >= rci.nx + rci.i + rci.j, adjusting the workspace size to match. Set rci.i=0 and rci.j=0 and recall the routine. If a restart is not desirable, the routine may be recalled with no change to its parameters.
         If rci.k=0: a restart with the same block size is required.
         In both cases, the first block W[0][:][:] should retain vectors rci.jx:rci.jx+rci.nx-1, with the remaining vectors filled randomly so that the entire set of columns is linearly independent of each other and of the converged eigenvectors.

The matrices are defined as follows:

\(U\) = W[rci.kx][rci.jx:rci.jx+rci.nx-1][:]
\(V\) = W[rci.ky][rci.jy:rci.jy+rci.ny-1][:]
\(\bar{V}\) = W[rci.ky][rci.jy:rci.jy+rci.nx-1][:]
\(R\) = rr[rci.k][rci.j:rci.j+rci.ny-1][rci.i:rci.i+rci.nx-1]

and \(\alpha\) and \(\beta\) are given by rci.alpha and rci.beta respectively. We use the notation \(r_{ii}\) to refer to the \(i\)-th diagonal element of \(R\), being rr[rci.k][rci.j+i-1][rci.i+i-1].
- Parameters:
rci – Reverse communication type. rci.job must be set to 0 before the first call.
left – Number of left eigenpairs to find.
mep – Number of working eigenpairs. See the method section for guidance on selecting a good value. Must be at least left.
lambda[mep] – Current eigenvalue estimates in ascending order.
m – Block size of workspace W. Must be at least 2.
rr[3][2*m][2*m] – Reverse communication workspace.
ind[m] – Reverse communication workspace.
keep – Internal workspace used by the routine.
options – Specifies the algorithm options to be used.
inform – Returns information about the execution of the routine.
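Relative to the standard driver, the new tasks here are the multiplication by \(B\) (rci.job = 3) and the saving of \(B\)-images alongside the converged eigenvectors (rci.job = 5). The fragment below sketches one way to handle them inside the reverse-communication switch; apply_B() is a placeholder for a user routine multiplying a block of vectors by \(B\), the array BX holds the \(B\)-images, and W, X and ncon are assumed to be declared as in the preconditioning example at the end of this document.
case 3: /* compute V-bar = B*U for a block of rci.nx vectors */
   apply_B(n, rci.nx, &(*W)[rci.kx][rci.jx][0], &(*W)[rci.ky][rci.jy][0]);
   break;
case 5: /* save converged eigenvectors and their B-images (rci.i > 0 case only;
           see the task table above for the rci.i <= 0 case) */
   if( rci.i > 0 ) {
      for(int k=0; k<rci.nx; k++) {
         cblas_dcopy(n, &(*W)[rci.kx][rci.jx+k][0], 1, &(*X)[ncon+k][0], 1);
         cblas_dcopy(n, &(*W)[rci.ky][rci.jx+k][0], 1, &(*BX)[ncon+k][0], 1);
      }
      ncon += rci.nx;
   }
   break;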
-
void spral_ssmfe_expert_generalized_double_complex(struct spral_ssmfe_rciz *rci, int left, int mep, double *lambda, int m, double complex *rr, int *ind, void **keep, const struct spral_ssmfe_options *options, struct spral_ssmfe_inform *inform)
As spral_ssmfe_expert_generalized_double(), but the types of rci and rr are changed to support type double complex.
-
void spral_ssmfe_expert_generalized_shift_double(struct spral_ssmfe_rcid *rci, double sigma, int left, int right, int mep, double *lambda, int m, double *rr, int *ind, void **keep, const struct spral_ssmfe_options *options, struct spral_ssmfe_inform *inform)
Computes eigenpairs of the generalized eigenvalue problem
\[Ax = \lambda B x\]
in the vicinity of a given value \(\sigma\).
Uses reverse communication. Upon return the user must perform the task specified by the rci parameter and recall the routine. Possible values of rci.job and the associated tasks are:

rci.job  Task to be performed
-3       None. Fatal error, see inform.flag.
-2       Failed to converge, see inform.flag.
-1       None. Computation complete.
3        Compute \(\bar{V} = BU\).
5        Copy converged eigenvectors \(X\) to user storage:
         if rci.i>0, from W[rci.kx][rci.jx:rci.jx+rci.nx-1][:];
         else, from W[rci.kx][rci.jx-rci.nx+1:rci.jx][:].
         Optionally save their \(B\)-images:
         if rci.i>0, from W[rci.ky][rci.jx:rci.jx+rci.nx-1][:];
         else, from W[rci.ky][rci.jx-rci.nx+1:rci.jx][:].
9        Compute \(V = (A-\sigma B)^{-1} U\).
11       If rci.i == 0, copy \(\bar{V} = U\).
         Otherwise, reorder the columns of block rci.kx so that column ind[j] becomes the new column j for j=0, ..., rci.nx-1.
         Note: if rci.kx == rci.ky, only reorder once.
12       Compute the dot products \(r_{ii} = U_i \cdot \bar{V}_i\).
13       Perform the scalings \(U_i = U_i/\sqrt{(U_i\cdot \bar{V}_i)}\) and \(\bar{V}_i = \bar{V}_i/\sqrt{(U_i\cdot \bar{V}_i)}\) for each column \(U_i\) and \(\bar{V}_i\) of \(U\) and \(\bar{V}\).
         Note: if rci.kx == rci.ky, only scale once.
14       Perform the updates \(\bar{V}_i = \bar{V}_i + r_{ii} U_i\) for each column \(\bar{V}_i\) of \(\bar{V}\).
15       Perform the update \(R = \alpha U^T V + \beta R\).
16       Perform the update \(V = \alpha U R + \beta V\).
17       Perform the update \(U = \alpha U R\). Note: \(V\) may be used as a workspace.
21       \(B\)-orthogonalize the columns of \(V\) to all vectors \(X\) by solving \((X^TBX) Q = X^T \bar{V}\) for \(Q\) and updating \(U = U - XQ\) and \(\bar{V} = \bar{V} - BXQ\). The update of \(\bar{V}\) may be replaced by \(\bar{V} = BU\).
22       \(B\)-orthogonalize the columns of \(U\) to all vectors \(X\) by solving \((X^TBX) Q = X^T U\) for \(Q\) and updating \(U = U - BXQ\).
999      Restart:
         If rci.k>0: a restart is suggested with block size m >= rci.nx + rci.i + rci.j, adjusting the workspace size to match. Set rci.i=0 and rci.j=0 and recall the routine. If a restart is not desirable, the routine may be recalled with no change to its parameters.
         If rci.k=0: a restart with the same block size is required.
         In both cases, the first block W[0][:][:] should retain vectors rci.jx:rci.jx+rci.nx-1, with the remaining vectors filled randomly so that the entire set of columns is linearly independent of each other and of the converged eigenvectors.

The matrices are defined as follows:

\(U\) = W[rci.kx][rci.jx:rci.jx+rci.nx-1][:]
\(V\) = W[rci.ky][rci.jy:rci.jy+rci.ny-1][:]
\(\bar{V}\) = W[rci.ky][rci.jy:rci.jy+rci.nx-1][:]
\(R\) = rr[rci.k][rci.j:rci.j+rci.ny-1][rci.i:rci.i+rci.nx-1]

and \(\alpha\) and \(\beta\) are given by rci.alpha and rci.beta respectively. We use the notation \(r_{ii}\) to refer to the \(i\)-th diagonal element of \(R\), being rr[rci.k][rci.j+i-1][rci.i+i-1].
- Parameters:
rci – Reverse communication type. rci.job must be set to 0 before the first call.
sigma – Shift value \(\sigma\).
left – Number of left eigenpairs to find.
right – Number of right eigenpairs to find.
mep – Number of working eigenpairs. See the method section for guidance on selecting a good value. Must be at least left+right.
lambda[mep] – Current eigenvalue estimates in ascending order.
m – Block size of workspace W. Must be at least 2.
rr[3][2*m][2*m] – Reverse communication workspace.
ind[m] – Reverse communication workspace.
keep – Internal workspace used by the routine.
options – Specifies the algorithm options to be used.
inform – Returns information about the execution of the routine.
-
void spral_ssmfe_expert_generalized_shift_double_complex(struct spral_ssmfe_rciz *rci, double sigma, int left, int right, int mep, double *lambda, int m, double complex *rr, int *ind, void **keep, const struct spral_ssmfe_options *options, struct spral_ssmfe_inform *inform)
As spral_ssmfe_expert_generalized_shift_double(), but the types of rci and rr are changed to support type double complex.
-
void spral_ssmfe_expert_buckling_double(struct spral_ssmfe_rcid *rci, double sigma, int left, int right, int mep, double *lambda, int m, double *rr, int *ind, void **keep, const struct spral_ssmfe_options *options, struct spral_ssmfe_inform *inform)
Computes the eigenpairs of the buckling problem
\[Bx = \lambda A x\]
in the vicinity of a given value \(\sigma\).
Uses reverse communication. Upon return the user must perform the task specified by the rci parameter and recall the routine. Possible values of rci.job and the associated tasks are:

rci.job  Task to be performed
-3       None. Fatal error, see inform.flag.
-2       Failed to converge, see inform.flag.
-1       None. Computation complete.
3        Compute \(\bar{V} = BU\).
5        Copy converged eigenvectors \(X\) to user storage:
         if rci.i>0, from W[rci.kx][rci.jx:rci.jx+rci.nx-1][:];
         else, from W[rci.kx][rci.jx-rci.nx+1:rci.jx][:].
         Optionally save their \(B\)-images:
         if rci.i>0, from W[rci.ky][rci.jx:rci.jx+rci.nx-1][:];
         else, from W[rci.ky][rci.jx-rci.nx+1:rci.jx][:].
9        Compute \(V = (B-\sigma A)^{-1} U\).
11       If rci.i == 0, copy \(\bar{V} = U\).
         Otherwise, reorder the columns of block rci.kx so that column ind[j] becomes the new column j for j=0, ..., rci.nx-1.
         Note: if rci.kx == rci.ky, only reorder once.
12       Compute the dot products \(r_{ii} = U_i \cdot \bar{V}_i\).
13       Perform the scalings \(U_i = U_i/\sqrt{(U_i\cdot \bar{V}_i)}\) and \(\bar{V}_i = \bar{V}_i/\sqrt{(U_i\cdot \bar{V}_i)}\) for each column \(U_i\) and \(\bar{V}_i\) of \(U\) and \(\bar{V}\).
         Note: if rci.kx == rci.ky, only scale once.
14       Perform the updates \(\bar{V}_i = \bar{V}_i + r_{ii} U_i\) for each column \(\bar{V}_i\) of \(\bar{V}\).
15       Perform the update \(R = \alpha U^T V + \beta R\).
16       Perform the update \(V = \alpha U R + \beta V\).
17       Perform the update \(U = \alpha U R\). Note: \(V\) may be used as a workspace.
21       \(B\)-orthogonalize the columns of \(V\) to all vectors \(X\) by solving \((X^TBX) Q = X^T \bar{V}\) for \(Q\) and updating \(U = U - XQ\) and \(\bar{V} = \bar{V} - BXQ\). The update of \(\bar{V}\) may be replaced by \(\bar{V} = BU\).
22       \(B\)-orthogonalize the columns of \(U\) to all vectors \(X\) by solving \((X^TBX) Q = X^T U\) for \(Q\) and updating \(U = U - BXQ\).
999      Restart:
         If rci.k>0: a restart is suggested with block size m >= rci.nx + rci.i + rci.j, adjusting the workspace size to match. Set rci.i=0 and rci.j=0 and recall the routine. If a restart is not desirable, the routine may be recalled with no change to its parameters.
         If rci.k=0: a restart with the same block size is required.
         In both cases, the first block W[0][:][:] should retain vectors rci.jx:rci.jx+rci.nx-1, with the remaining vectors filled randomly so that the entire set of columns is linearly independent of each other and of the converged eigenvectors.

The matrices are defined as follows:

\(U\) = W[rci.kx][rci.jx:rci.jx+rci.nx-1][:]
\(V\) = W[rci.ky][rci.jy:rci.jy+rci.ny-1][:]
\(\bar{V}\) = W[rci.ky][rci.jy:rci.jy+rci.nx-1][:]
\(R\) = rr[rci.k][rci.j:rci.j+rci.ny-1][rci.i:rci.i+rci.nx-1]

and \(\alpha\) and \(\beta\) are given by rci.alpha and rci.beta respectively. We use the notation \(r_{ii}\) to refer to the \(i\)-th diagonal element of \(R\), being rr[rci.k][rci.j+i-1][rci.i+i-1].
- Parameters:
rci – Reverse communication type. rci.job must be set to 0 before the first call.
sigma – Shift value \(\sigma\).
left – Number of left eigenpairs to find.
right – Number of right eigenpairs to find.
mep – Number of working eigenpairs. See the method section for guidance on selecting a good value. Must be at least left+right.
lambda[mep] – Current eigenvalue estimates in ascending order.
m – Block size of workspace W. Must be at least 2.
rr[3][2*m][2*m] – Reverse communication workspace.
ind[m] – Reverse communication workspace.
keep – Internal workspace used by the routine.
options – Specifies the algorithm options to be used.
inform – Returns information about the execution of the routine.
-
void spral_ssmfe_expert_buckling_double_complex(struct spral_ssmfe_rciz *rci, double sigma, int left, int right, int mep, double *lambda, int m, double complex *rr, int *ind, void **keep, const struct spral_ssmfe_options *options, struct spral_ssmfe_inform *inform)
As spral_ssmfe_expert_buckling_double(), but the types of rci and rr are changed to support type double complex.
-
void spral_ssmfe_expert_free(void **keep, struct spral_ssmfe_inform *inform)
Free memory allocated in keep and inform.
- Parameters:
keep – Workspace to be freed.
inform – Information type to be freed.
Warning
As memory in keep and inform has been allocated using Fortran functions, this routine must be called to avoid a memory leak.
Derived types
- struct spral_ssmfe_rcid
Real-valued reverse communication interface (RCI) type.
-
int job
Reverse-communication task to perform.
-
int jx
First column of \(U\) in block.
-
int kx
Block to which \(U\) belongs.
-
int nx
Number of columns in \(U\) and \(\bar{V}\), and number of rows in \(R\).
-
int jy
First column of \(V\) in block.
-
int ky
Block to which \(V\) belongs.
-
int ny
Number of columns in \(V\) and \(R\).
-
int i
First row of \(R\) in rr[:][:][:].
-
int j
First column of \(R\) in rr[:][:][:].
-
int k
Block of \(R\) in rr[:][:][:].
-
double alpha
Coefficient for matrix multiplication.
-
double beta
Coefficient for matrix multiplication.
- struct spral_ssmfe_rciz
Complex-valued reverse communication interface (RCI) type.
-
int job
Reverse-communication task to perform.
-
int jx
First column of \(U\) in block.
-
int kx
Block to which \(U\) belongs.
-
int nx
Number of columns in \(U\) and \(\bar{V}\), and number of rows in \(R\).
-
int jy
First column of \(V\) in block.
-
int ky
Block to which \(V\) belongs.
-
int ny
Number of columns in \(V\) and \(R\).
-
int i
First row of \(R\) in rr[:][:][:].
-
int j
First column of \(R\) in rr[:][:][:].
-
int k
Block of \(R\) in rr[:][:][:].
-
double complex alpha
Coefficient for matrix multiplication.
-
double complex beta
Coefficient for matrix multiplication.
- struct spral_ssmfe_options
Options that control the algorithm.
-
double abs_tol_lambda
Absolute tolerance for estimated eigenvalue convergence test, see method section. Negative values are treated as the default. Default is 0.0.
-
double abs_tol_residual
Absolute tolerance for residual convergence test, see method section. Negative values are treated as the default. Default is 0.0.
-
int max_iterations
Maximum number of iterations. Default is 100.
-
double rel_tol_lambda
Relative tolerance for estimated eigenvalue error convergence test, see method section. Negative values are treated as the default. Default is 0.0.
-
double rel_tol_residual
Relative tolerance for residual convergence test, see method section. If both abs_tol_residual and rel_tol_residual are 0.0, then the residual norms are not taken into consideration by the convergence test. Negative values are treated as the default. Default is 0.0.
-
double tol_x
Tolerance for estimated eigenvector error convergence test, see method section. If tol_x is set to 0.0, the eigenvector error is not estimated. If a negative value is assigned, the tolerance is set to sqrt(DBL_EPSILON). Default is -1.0.
-
int print_level
Amount of printing. Possible values are:
<0   no printing
0    error and warning messages only
1    the type (standard or generalized) and the size of the problem, the number of eigenpairs requested, the error tolerances and the size of the subspace are printed before the iterations start
2    as above but, for each eigenpair tested for convergence, the iteration number, the index of the eigenpair, the eigenvalue, whether it has converged, the residual norm, and the error estimates are also printed
>2   as 1 but with all eigenvalues, whether converged, residual norms and eigenvalue/eigenvector error estimates printed on each iteration.
Note that for eigenpairs that are far from convergence, ‘rough’ error estimates are printed (the estimates that are actually used by the stopping criteria, see Section [ssmfe:method], only become available on the last few iterations).
Default is 0.
-
int unit_error
Fortran unit number for error messages. Printing suppressed if negative. Default is 6 (stdout).
-
int unit_diagnostic
Fortran unit number for diagnostic messages. Printing suppressed if negative. Default is 6 (stdout).
-
int unit_warning
Fortran unit number for warning messages. Printing suppressed if negative. Default is 6 (stdout).
-
int err_est
Error estimation scheme, one of:
1
Residual error bounds: modified Davis-Kahan estimate for the eigenvector error and Lehmann bounds for the eigenvalue error (see the method section).
2 (default)
Convergence curve-based estimate.
Default is 2.
-
int extra_left
Number of extra approximate eigenvectors corresponding to leftmost eigenvalues used to enhance convergence. Default is 0.
-
int extra_right
Number of extra approximate eigenvectors corresponding to rightmost eigenvalues used to enhance convergence. Default is 0.
-
double left_gap
Minimal acceptable distance between last computed left eigenvalue and rest of spectrum. For
spral_ssmfe_expert_standard_double()
andspral_ssmfe_expert_generalized_double()
the last computed left eigenvalue is the rightmost of those computed. For other routines it is the leftmost. If set to a negative value \(\delta\), the minimal distance is taken as \(|\delta|\) times the average distance between the computed eigenvalues. Note that for this option to have any effect, the value of mep must be larger than left+right. See method section for further explanation. Default is 0.
-
int max_left
Number of eigenvalues to left of \(\sigma\), or a negative value if not known. Default is -1.
-
int max_right
Number of eigenvalues to right of \(\sigma\), or a negative value if not known. Default is -1.
-
bool minAprod
If true, minimize number of multiplications with \(A\) by requiring 2 additional blocks of memory for the workspace
W[:][:][:]
. Must be true for calls tospral_ssmfe_expert_standard_shift_double()
,spral_ssmfe_expert_generalized_shift_double()
, andspral_ssmfe_expert_buckling_double()
. Default is true.
-
bool minBprod
If true, minimize number of multiplications with \(B\) by requiring 2 additional blocks of memory for the workspace
W[:][:][:]
. Default is true.
-
double right_gap
As
options.left_gap
, but for right eigenvalues. Default is 0.0.
-
int user_x
Number of eigenvectors for which an initial guess is supplied in x[:][:] on the first call. Such eigenvectors must be linearly independent. Default is 0.
- struct spral_ssmfe_inform
Information on progress of the algorithm.
-
int converged[mep]
Convergence status.
If converged[j]>0, the eigenpair (lambda[j], X[j]) converged on iteration converged[j].
If converged[j]=0, the eigenpair (lambda[j], X[j]) is still converging.
If converged[j]<0, the eigenpair (lambda[j], X[j]) stagnated at iteration -converged[j].
This component is allocated by the routine.
-
double err_lambda[mep]
Estimated eigenvalue errors for converged and stagnated eigenvalues. This component is allocated by the routine.
-
double err_x[mep]
Estimated eigenvector errors for converged and stagnated eigenvectors. This component is allocated by the routine.
-
int flag
Return status of algorithm. See table below.
-
int iteration
Number of iterations.
-
int left
Number of converged left eigenvalues.
-
double next_left
Upon completion, next left eigenvalue in spectrum (see options.left_gap).
-
double next_right
Upon completion, next right eigenvalue in spectrum (see options.right_gap).
-
double residual_norms[mep]
Euclidean norms of residuals for (lambda[:], X[:]) on return with
rci.job=5
. This component is allocated by the routine.
-
int non_converged
Number of non-converged eigenpairs.
-
int right
Number of converged right eigenvalues.
-
int stat
Fortran allocation status in event of failure.
inform.flag
-1
rci.job is out-of-range.
-2
m is out-of-range.
-3
options.err_est is out-of-range.
-4
options.minAprod is incompatible with selected routine.
-5
options.extra_left or options.extra_right is out-of-range.
-6
options.min_gap is out-of-range.
-11
left is out-of-range.
-12
right is out-of-range.
-13
mep is less than the number of desired eigenpairs.
-100
Not enough memory; inform.stat contains the value of the Fortran stat parameter.
-200
\(B\) is not positive definite or user_x>0 and linearly dependent initial guesses were supplied.
+1
The iterations have been terminated because no further improvement in accuracy is possible (this may happen if \(B\) or the preconditioner is not positive definite, or if the components of the residual vectors are so small that the round-off errors make them essentially random). The value of inform.non_converged is set to the number of non-converged eigenpairs.
+2
The maximum number of iterations max_iterations has been exceeded. The value of inform.non_converged is set to the number of non-converged eigenpairs.
+3
The solver had run out of storage space for the converged eigenpairs before the gap in the spectrum required by options.left_gap and/or options.right_gap was reached. The value of inform.non_converged is set to the number of non-converged eigenpairs.
Examples
Preconditioning example
The following code computes the 5 leftmost eigenpairs of the matrix \(A\) of order 100 that approximates the two-dimensional Laplacian operator on a 20-by-20 grid. One forward and one backward Gauss-Seidel update are used for preconditioning, which halves the number of iterations compared with solving the same problem without preconditioning. The header laplace2d.h (examples/C/ssmfe/laplace2d.h) supplies the subroutine apply_laplacian() that multiplies a block of vectors by \(A\), and the subroutine apply_gauss_seidel_step() that computes \(y = T x\) for a block of vectors \(x\) by applying one forward and one backward update of the Gauss-Seidel method to the system \(A y = x\).
/* examples/C/ssmfe/precond_expert.c */
/* Laplacian on a square grid (using SPRAL_SSMFE_EXPERT routines) */
#include "spral.h"
#include <math.h>
#include <stdlib.h>
#include <stdio.h>
#include <cblas.h>
/* Header that implements Laplacian and preconditioners */
#include "laplace2d.h"
int main(void) {
const int ngrid = 20; /* grid points along each side */
const int n = ngrid*ngrid; /* problem size */
const int nep = 5; /* eigenpairs wanted */
const int m = 3; /* dimension of the iterated subspace */
int state = SPRAL_RANDOM_INITIAL_SEED; /* PRNG state */
int *ind = malloc(m * sizeof(*ind)); /* permutation index */
double *lambda = malloc(n * sizeof(*lambda)); /* eigenvalues */
double (*X)[n][n] = malloc(sizeof(*X)); /* eigenvectors */
/* Work arrays */
double (*rr)[3][2*m][2*m] = malloc(sizeof(*rr));
double (*W)[6][m][n] = malloc(sizeof(*W));
double (*U)[m][n] = malloc(sizeof(*U));
/* Derived types */
struct spral_ssmfe_rcid rci; /* reverse communication data */
struct spral_ssmfe_options options; /* options */
void *keep; /* private data */
struct spral_ssmfe_inform inform; /* information */
/* Initialize options to default values */
spral_ssmfe_default_options(&options);
/* gap between the last converged eigenvalue and the rest of the spectrum
* must be at least 0.1 times average gap between computed eigenvalues */
options.left_gap = -0.1;
/* block size is small, convergence may be slow, allow more iterations */
options.max_iterations = 200;
/* Initialize W to lin indep vectors by randomizing */
for(int i=0; i<n; i++)
for(int j=0; j<m; j++)
(*W)[0][j][i] = spral_random_real(&state, true);
int ncon = 0; /* number of converged eigenpairs */
rci.job = 0; keep = NULL;
while(true) { /* reverse communication loop */
spral_ssmfe_expert_standard_double(&rci, nep, n, lambda, m, &(*rr)[0][0][0],
ind, &keep, &options, &inform);
switch ( rci.job ) {
case 1: /* V-bar = A*U: apply the discretized Laplacian to the block */
apply_laplacian(
ngrid, ngrid, rci.nx, &(*W)[rci.kx][rci.jx][0], &(*W)[rci.ky][rci.jy][0]
);
break;
case 2: /* V-bar = T*U: apply the Gauss-Seidel preconditioner to the block */
apply_gauss_seidel_step (
ngrid, ngrid, rci.nx, &(*W)[rci.kx][rci.jx][0], &(*W)[rci.ky][rci.jy][0]
);
break;
case 5: /* copy newly converged eigenvectors into user storage X */
if ( rci.i < 0 ) continue;
for(int k=0; k<rci.nx; k++) {
int j = ncon + k;
cblas_dcopy( n, &(*W)[0][rci.jx+k][0], 1, &(*X)[j][0], 1 );
}
ncon += rci.nx;
break;
case 11: /* copy the block, or reorder its columns according to ind[] */
if ( rci.i == 0 ) {
if ( rci.kx != rci.ky || rci.jx > rci.jy ) {
cblas_dcopy(n*rci.nx, &(*W)[rci.kx][rci.jx][0], 1, &(*W)[rci.ky][rci.jy][0], 1);
} else if ( rci.jx < rci.jy ) {
for(int j=rci.nx-1; j>=0; j--)
cblas_dcopy(n, &(*W)[rci.kx][rci.jx+j][0], 1, &(*W)[rci.ky][rci.jy+j][0], 1);
}
} else {
for(int i=0; i<n; i++) {
for(int j=0; j<rci.nx; j++)
(*U)[j][i] = (*W)[rci.kx][ind[j]][i];
for(int j=0; j<rci.nx; j++)
(*W)[rci.kx][j][i] = (*U)[j][i];
if(rci.ky != rci.kx) {
for(int j=0; j<rci.nx; j++)
(*U)[j][i] = (*W)[rci.ky][ind[j]][i];
for(int j=0; j<rci.nx; j++)
(*W)[rci.ky][j][i] = (*U)[j][i];
}
}
}
break;
case 12:
for(int i=0; i<rci.nx; i++)
(*rr)[rci.k][rci.j+i][rci.i+i] =
cblas_ddot(n, &(*W)[rci.kx][rci.jx+i][0], 1, &(*W)[rci.ky][rci.jy+i][0], 1);
break;
case 13:
for(int i=0; i<rci.nx; i++) {
if( rci.kx == rci.ky ) {
double s = cblas_dnrm2(n, &(*W)[rci.kx][rci.jx+i][0], 1);
if( s > 0 )
cblas_dscal(n, 1/s, &(*W)[rci.kx][rci.jx+i][0], 1);
} else {
double s = sqrt(fabs(cblas_ddot(
n, &(*W)[rci.kx][rci.jx+i][0], 1, &(*W)[rci.ky][rci.jy+i][0], 1)
));
if ( s > 0 ) {
cblas_dscal(n, 1/s, &(*W)[rci.kx][rci.jx+i][0], 1);
cblas_dscal(n, 1/s, &(*W)[rci.ky][rci.jy+i][0], 1);
} else {
for(int j=0; j<n; j++)
(*W)[rci.ky][rci.jy+i][j] = 0.0;
}
}
}
break;
case 14:
for(int i=0; i<rci.nx; i++) {
double s = -(*rr)[rci.k][rci.j+i][rci.i+i];
cblas_daxpy(n, s, &(*W)[rci.kx][rci.jx+i][0], 1, &(*W)[rci.ky][rci.jy+i][0], 1);
}
break;
case 15:
if ( rci.nx > 0 && rci.ny > 0 )
cblas_dgemm(
CblasColMajor, CblasTrans, CblasNoTrans, rci.nx, rci.ny, n,
rci.alpha, &(*W)[rci.kx][rci.jx][0], n, &(*W)[rci.ky][rci.jy][0], n,
rci.beta, &(*rr)[rci.k][rci.j][rci.i], 2*m
);
break;
case 16: // Fall through to 17
case 17:
if( rci.ny < 1 ) continue;
if( rci.nx < 1 ) {
if( rci.job == 17 ) continue;
if( rci.beta == 1.0 ) continue;
for(int j=rci.jy; j<rci.jy+rci.ny; j++)
cblas_dscal(n, rci.beta, &(*W)[rci.ky][j][0], 1);
continue;
}
if( rci.job == 17 ) {
cblas_dgemm(
CblasColMajor, CblasNoTrans, CblasNoTrans, n, rci.ny, rci.nx,
1.0, &(*W)[rci.kx][rci.jx][0], n, &(*rr)[rci.k][rci.j][rci.i], 2*m,
0.0, &(*W)[rci.ky][rci.jy][0], n
);
cblas_dcopy(n*rci.ny, &(*W)[rci.ky][rci.jy][0], 1, &(*W)[rci.kx][rci.jx][0], 1);
} else {
cblas_dgemm(
CblasColMajor, CblasNoTrans, CblasNoTrans, n, rci.ny, rci.nx,
rci.alpha, &(*W)[rci.kx][rci.jx][0], n, &(*rr)[rci.k][rci.j][rci.i],
2*m, rci.beta, &(*W)[rci.ky][rci.jy][0], n
);
}
break;
case 21: // Fall through to 22
case 22:
if( ncon > 0 ) {
cblas_dgemm(
CblasColMajor, CblasTrans, CblasNoTrans, ncon, rci.nx, n,
1.0, &(*X)[0][0], n, &(*W)[rci.ky][rci.jy][0], n, 0.0, &(*U)[0][0], n
);
cblas_dgemm(
CblasColMajor, CblasNoTrans, CblasNoTrans, n, rci.nx, ncon,
-1.0, &(*X)[0][0], n, &(*U)[0][0], n, 1.0, &(*W)[rci.kx][rci.jx][0], n
);
}
break;
case 999: /* restart: refill the columns of block 0 preceding rci.jx with random vectors */
if( rci.k == 0 ) {
if( rci.jx > 1 ) {
for(int j=0; j<rci.jx; j++)
for(int i=0; i<n; i++)
(*W)[0][j][i] = spral_random_real(&state, true);
}
}
break;
default:
goto finished;
}
}
finished:
if(inform.flag != 0) printf("inform.flag = %d\n", inform.flag);
printf("%3d eigenpairs converged in %d iterations\n", ncon, inform.iteration);
for(int i=0; i<ncon; i++)
printf(" lambda[%1d] = %13.7e\n", i, lambda[i]);
spral_ssmfe_expert_free(&keep, &inform);
free(ind);
free(lambda);
free(X);
free(rr);
free(W);
free(U);
/* Success */
return 0;
}
This code produces the following output:
6 eigenpairs converged in 129 iterations
lambda[0] = 4.4676695e-02
lambda[1] = 1.1119274e-01
lambda[2] = 1.1119274e-01
lambda[3] = 1.7770878e-01
lambda[4] = 2.2040061e-01
lambda[5] = 2.2040061e-01
Note that the code computed one extra eigenpair because of the insufficient gap between the 5th and 6th eigenvalues.
Method
The algorithm
The solver procedures of spral_ssmfe_expert
are interfaces to solver
procedures of spral_ssmfe_core
, which implement a block iterative
algorithm based on the Jacobi-conjugate preconditioned gradients method
[2], [3]. Further information on the algorithm used by
spral_ssmfe_expert
can be found in the specification document for
spral_ssmfe_core
and in [1].
Stopping criteria
An approximate eigenpair \(\{x,\lambda\}\) is considered to have converged if the following three conditions are all satisfied:
if options.abs_tol_lambda and options.rel_tol_lambda are not both equal to zero, then the estimated error in the approximate eigenvalue must be less than or equal to \(\max(\mathrm{options.abs\_tol\_lambda},\ \delta*\mathrm{options.rel\_tol\_lambda})\), where \(\delta\) is the estimated average distance between eigenvalues.
if options.tol_x is not zero, then the estimated sine of the angle between the approximate eigenvector and the invariant subspace corresponding to the eigenvalue approximated by \(\lambda\) must be less than or equal to options.tol_x.
if options.abs_tol_residual and options.rel_tol_residual are not both equal to zero, then the Euclidean norm of the residual, \(\|A x - \lambda B x\|_2\), must be less than or equal to \(\max(\mathrm{options.abs\_tol\_residual},\ \mathrm{options.rel\_tol\_residual}*\|\lambda B x\|_2)\).
The extra eigenpairs are not checked for convergence, as their role is purely auxiliary.
Improving eigenvector accuracy
If the gap between the last computed eigenvalue and the rest of the
spectrum is small, then the accuracy of the corresponding eigenvector
may be very low. To prevent this from happening, the user should set the
eigenpairs storage size mep to a value that is larger than the number of
desired eigenpairs, and set the options options.left_gap and
options.right_gap to non-zero values \(\delta_l\) and
\(\delta_r\). These values determine the size of the minimal
acceptable gaps between the computed eigenvalues and the rest of the
spectrum, \(\delta_l\) referring to either the leftmost eigenvalues (for
spral_ssmfe_expert_standard_double() and
spral_ssmfe_expert_generalized_double() only) or those
to the left of the shift sigma, and \(\delta_r\) to those to the right of
the shift sigma. Positive values of \(\delta_l\) and \(\delta_r\) set
the gap explicitly, and negative values require the gap to be not less than
their absolute value times the average distance between the computed
eigenvalues. A recommended value of \(\delta_l\) and
\(\delta_r\) is \(-0.1\). The value of mep has little effect on
the speed of computation, hence it might be set to any reasonably large
value. The larger the value of mep, the larger the size of an eigenvalue
cluster for which accurate eigenvectors can be computed, notably: to
safeguard against clusters of size up to \(k\), it is sufficient to
set mep to the number of desired eigenpairs plus \(k - 1\).
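For example, if five eigenpairs are wanted and clusters of up to three eigenvalues are anticipated, the settings below (illustrative values only) enlarge the eigenpair storage accordingly and request a gap of at least 0.1 times the average spacing of the computed eigenvalues.
int nep = 5;              /* eigenpairs wanted                         */
int k   = 3;              /* largest anticipated cluster size          */
int mep = nep + k - 1;    /* = 7: room for extra eigenpairs            */
options.left_gap = -0.1;  /* gap >= 0.1 * average eigenvalue spacing   */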
The use of shifted matrix factorization
When using the solver procedures that employ the shift-and-invert technique, it is very important to ensure that the numbers of desired eigenvalues each side of the shift do not exceed the actual numbers of these eigenvalues, as the eigenpairs ‘approximating’ non-existing eigenpairs of the problem will not converge. It is therefore strongly recommended that the user employs a linear system solver that performs the \(LDL^T\) factorization of the shifted system, e.g. HSL_MA97 or SPRAL_SSIDS. The \(LDL^T\) factorization of the matrix \(A - \sigma B\) consists in finding a lower triangular matrix \(L\), a block-diagonal matrix \(D\) with \(1\times 1\) and \(2\times 2\) blocks on the diagonal and a permutation matrix \(P\) such that \(P^T(A - \sigma B)P = L D L^T\). By the inertia theorem, the number of eigenvalues to the left and right from the shift \(\sigma\) is equal to the number of negative and positive eigenvalues of \(D\), which allows quick computation of the eigenvalue numbers each side of the shift.
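In other words, if \(P^T(A - \sigma B)P = LDL^T\), then by Sylvester's law of inertia
\[\nu_-(D) = \#\{i : \lambda_i < \sigma\} \quad\mbox{and}\quad \nu_+(D) = \#\{i : \lambda_i > \sigma\},\]
where \(\nu_-(D)\) and \(\nu_+(D)\) denote the numbers of negative and positive eigenvalues of the block-diagonal factor \(D\).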
Error estimation
Standard problem
If options.err_est=1, the error estimates for the eigenvalues are based on the eigenvalues of a matrix of the form
\[\hat A = \tilde\Lambda_k - S_k^T S_k,\]
where \(\tilde\Lambda_k\) is a diagonal matrix with the \(k-1\) leftmost Ritz values \(\tilde\lambda_j\) on the diagonal, and the columns of \(S_k\) are the respective residual vectors \(r_j = A \tilde x_j - \tilde\lambda_j \tilde x_j\) divided by \(\sqrt{\lambda_k - \tilde\lambda_j}\). If \(k\) is such that \(\tilde\lambda_{k-1} < \lambda_k\), then the eigenvalues of \(\hat A\) are lower bounds for the eigenvalues \(\lambda_i\), and thus the difference \(\tilde\lambda_j - \hat\lambda_j\) estimates the eigenvalue error \(\tilde\lambda_j - \lambda_j\). The unknown \(\lambda_k\) is replaced by \(\tilde\lambda_k\), and we select the maximal \(k \le m\) for which the distance between \(\tilde\lambda_{k-1}\) and \(\tilde\lambda_k\) exceeds the sum of the absolute error tolerance for eigenvalues and the Frobenius norm of the matrix formed by the residuals \(r_j\), \(j = 1, \ldots, k-1\). If \(\tilde\lambda_j - \hat\lambda_j\) is close to machine accuracy, it may be too polluted by round-off errors to rely upon, in which case a different estimate is used instead.
The eigenvector errors are estimated based on the Davis-Kahan inequality
\[\sin\angle(\tilde x_j, \mathcal{X}_{k-1}) \le \frac{\|r_j\|}{\lambda_k - \tilde\lambda_j},\]
where \(\mathcal{X}_{k-1}\) is the invariant subspace corresponding to the \(k-1\) leftmost eigenvalues.
If options.err_est=2
the errors are estimated based on the
eigenvalue decrements history, which produces an estimate for the
average eigenvalue error reduction per iteration, which in turn yields
error estimates for both eigenvalues and eigenvectors. Unlike the
residual estimates mentioned in this section, such ‘kinematic’ error
estimates are not guaranteed to be upper bounds for the actual errors.
However, the numerical tests have demonstrated that kinematic error
estimates are significantly more accurate, i.e. closer to the actual
error, than the residual-based estimates. Furthermore, they
straightforwardly apply to the generalized case as well.
Generalized problem
In the case of the generalized eigenvalue problem solved by iterations
with preconditioning, all of the residual norms in the previous section
must be replaced with \(\|\cdot\|_{B^{-1}}\)-norm of the residual
\(r_j = A \tilde x_j - \tilde\lambda_j B \tilde x_j\)
(\(\|r_j\|_{B^{-1}}^2 = r_j^* B^{-1} r_j\)) or its upper estimate,
e.g. \(\beta_1^{-1/2}\|\cdot\|\), where \(\beta_1\) is the
smallest eigenvalue of \(B\). Hence, if \(\beta_1\) is known,
then the error tolerances for eigenvalues and eigenvectors must be
multiplied by \(\beta_1\) and \(\sqrt{\beta_1}\) respectively.
If no estimate for \(\|\cdot\|_{B^{-1}}\)-norm is available, then
the use of non-zero residual tolerances and
options.err_est=1
is not recommended. In the case of problems solved by iterations with
shift-and-invert, the residuals are computed as
\(r_j = T B \tilde x_j - \tilde \lambda_j \tilde x_j\), where
\(T = (A - \sigma B)^{-1}\) for the shifted standard (\(B = I\)) and
generalized problems and \(T = (B - \sigma A)^{-1}\) for the buckling
problem, and \(B\)-norms of \(r_j\) are used, so that the Lehmann
matrix becomes \(\hat A = \tilde\Lambda_k - S_k^T B S_k\). Note
that the residual estimates may considerably overestimate the actual
error of direct iterations because of the use of the Euclidean norm of
the residual, which is too strong a norm for it when \(A\) is the
discretization of a differential operator.