SafeOps: Linear Algebra

SafeOps: Linear Algebra#

A collection of common mathematical functions.

This module contains a collection of batch-operable, back-propagatable mathematical functions.

tad_mctc.storch.linalg.eighb(a, b=None, scheme='chol', broadening_method='cond', factor=1e-12, sort_out=True, aux=True, is_posdef=False, **kwargs)[source]#

Solves general & standard eigen-problems, with optional broadening.

Solves standard and generalised eigenvalue problems of the form Az = λBz for a real symmetric matrix a and can apply conditional or Lorentzian broadening to the eigenvalues during the backwards pass to increase gradient stability. Multiple matrices may be passed in batch major form, i.e. the first axis iterates over entries.

Parameters:

a (array_like) – Real symmetric matrix whose eigen-values/vectors will be computed.
b (array_like) – Complementary positive definite real symmetric matrix for the generalised eigenvalue problem.
scheme (str, optional) – Scheme to convert generalised eigenvalue problems to standard ones. Options are:
- “chol”: Cholesky factorisation. [DEFAULT=’chol’]
- “lowd”: Löwdin orthogonalisation.
Has no effect on solving standard problems.
broadening_method (str, optional) – Broadening method to used. Options are:
- “cond”: conditional broadening. [DEFAULT=’cond’]
- “lorn”: Lorentzian broadening.
- None: no broadening (uses torch.linalg.eigh).
factor (float, optional) – The degree of broadening (broadening factor). [Default=1E-12]
sort_out (bool, optional) – If True, eigen-vector/value tensors are reordered so that any “ghost” entries are moved to the end. “Ghost” are values which emerge as a result of zero-padding. [DEFAULT=True]
aux (bool, optional) – Converts zero-padding to identity-padding. This can improve the stability of backwards propagation. [DEFAULT=True]
direct_inv (bool, optional) – If True then the matrix inversion will be computed directly rather than via a call to torch.solve. Only relevant to the cholesky scheme. [DEFAULT=False]

Returns:

w (ndarray) – The eigenvalues, in ascending order.
v (ndarray) – The eigenvectors.

Notes

Results from backward passes through eigen-decomposition operations tend to suffer from numerical stability issues when operating on systems with degenerate eigenvalues. Fortunately, the stability of such operations can be increased through the application of eigen value broadening. However, such methods will induce small errors in the returned gradients as they effectively mutate the eigen-values in the backwards pass. Thus, it is important to be aware that while increasing the extent of broadening will help to improve stability it will also increase the error in the gradients.

Two different broadening methods have been implemented within this class. Conditional broadening as described by Seeger [MS2019], and Lorentzian as detailed by Liao [LH2019]. During the forward pass the torch.symeig function is used to calculate both the eigenvalues & the eigenvectors (U & \(\lambda\) respectively). The gradient is then calculated following:

\[\bar{A} = U (\bar{\Lambda} + sym(F \circ (U^t \bar{U}))) U^T\]

Where bar indicates a value’s gradient, passed in from the previous layer, \(\Lambda\) is the diagonal matrix associated with the \(\bar{\lambda}\) values, \(\circ\) is the so called Hadamard product, \(sym\) is the symmetrisation operator and F is:

\[F_{i, j} = \frac{I_{i \ne j}}{h(\lambda_i - \lambda_j)}\]

Where, for conditional broadening, h is:

\[h(t) = max(|t|, \epsilon)sgn(t)\]

and for, Lorentzian broadening:

\[h(t) = \frac{t^2 + \epsilon}{t}\]

The advantage of conditional broadening is that it is only applied when needed, thus the errors induced in the gradients will be restricted to systems whose gradients would be nan’s otherwise. The Lorentzian method, on the other hand, will apply broadening to all systems, irrespective of whether or not it is necessary. Note that if the h function is a unity operator then this is identical to a standard backwards pass through an eigen-solver.

Mathematical discussions regarding the Cholesky decomposition are made with reference to the “Generalized Symmetric Definite Eigenproblems” chapter of Lapack. [Lapack]

When operating in batch mode the zero valued padding columns and rows will result in the generation of “ghost” eigen-values/vectors. These are mostly harmless, but make it more difficult to extract the actual eigen-values/vectors. This function will move the “ghost” entities to the ends of their respective lists, making it easy to clip them out.

Warning

If operating upon zero-padded packed tensors then degenerate and zero valued eigen values will be encountered. This will always cause an error during the backwards pass unless broadening is enacted.

As torch.symeig sorts its results prior to returning them, it is likely that any “ghost” eigen-values/vectors, which result from zero- padded packing, will be located in the middle of the returned arrays. This makes down-stream processing more challenging. Thus, the sort_out option is enabled by default. This results in the “ghost” values being moved to the end. However, this method identifies any entry with a zero-valued eigenvalue and an eigenvector which can be interpreted as a column of an identity matrix as a ghost.

References

[MS2019]

Seeger, M., Hetzel, A., Dai, Z., & Meissner, E. Auto- Differentiating Linear Algebra. ArXiv:1710.08717 [Cs, Stat], Aug. 2019. arXiv.org, http://arxiv.org/abs/1710.08717.

[LH2019]

Liao, H.-J., Liu, J.-G., Wang, L., & Xiang, T. (2019). Differentiable Programming Tensor Networks. Physical Review X, 9(3).

[Lapack]

www.netlib.org/lapack/lug/node54.html (Accessed 21/04/2023)

SafeOps: Linear Algebra

Contents

SafeOps: Linear Algebra#