14  Eigenvalues and Decompositions

Prerequisites: matrix multiplication and determinants.

Some vectors survive a matrix transformation with only a change in scale — these eigenvectors and their eigenvalues reveal the natural axes of a linear map. This chapter builds from eigenvalues through diagonalisation, the spectral theorem, the SVD, and positive definite matrices to a capstone example tying everything together.

(ns lalinea-book.eigenvalues-and-decompositions
  (:require
   ;; La Linea (https://github.com/scicloj/lalinea):
   [scicloj.lalinea.linalg :as la]
   [scicloj.lalinea.tensor :as t]
   [scicloj.lalinea.elementwise :as el]
   ;; Visualization annotations (https://scicloj.github.io/kindly-noted/):
   [scicloj.kindly.v4.kind :as kind]
   ;; Arrow diagrams for 2D vectors:
   [scicloj.lalinea.vis :as vis]))

Eigenvalues and eigenvectors

The idea

Most vectors change direction when multiplied by a matrix. But some special vectors only get scaled — they keep pointing the same way (or flip to the opposite direction). These are the eigenvectors.

Formally, \(\mathbf{v}\) is an eigenvector of \(A\) if:

\[A\mathbf{v} = \lambda \mathbf{v}\]

for some scalar \(\lambda\) (the eigenvalue). The map \(A\) acts on \(\mathbf{v}\) by merely scaling it by \(\lambda\).

In two dimensions we can see this directly. The matrix \(\begin{bmatrix} 2 & 1 \\ 0 & 3 \end{bmatrix}\) has eigenvectors \([1,0]^T\) (eigenvalue 2) and \([1,1]^T\) (eigenvalue 3). Each eigenvector only gets scaled — it stays on the same line:

(vis/arrow-plot [{:label "v₁" :xy [1 0] :color "#2266cc"}
                 {:label "Av₁=2v₁" :xy [2 0] :color "#2266cc" :dashed? true}
                 {:label "v₂" :xy [1 1] :color "#cc4422"}
                 {:label "Av₂=3v₂" :xy [3 3] :color "#cc4422" :dashed? true}]
                {})

A non-eigenvector \([0,1]^T\) changes direction under the same transformation — it maps to \([1,3]^T\):

(vis/arrow-plot [{:label "u" :xy [0 1] :color "#999999"}
                 {:label "Au" :xy [1 3] :color "#228833" :dashed? true}
                 {:label "v₂" :xy [1 1] :color "#cc4422"}
                 {:label "Av₂=3v₂" :xy [3 3] :color "#cc4422" :dashed? true}]
                {})
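The arrow plots only render in the live notebook. As an outside cross-check (NumPy is not part of this book's toolchain), the same arithmetic:

```python
import numpy as np

# The matrix from the plots above.
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])

print(A @ np.array([1.0, 0.0]))  # eigenvector v1, only scaled: [2. 0.]
print(A @ np.array([1.0, 1.0]))  # eigenvector v2, only scaled: [3. 3.]
print(A @ np.array([0.0, 1.0]))  # non-eigenvector, direction changes: [1. 3.]
```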

Consider:

(def A-eig
  (t/matrix [[4 1 2]
             [0 3 1]
             [0 0 2]]))

This upper-triangular matrix has eigenvalues on the diagonal: 4, 3, and 2.

(def eig-result (la/eigen A-eig))

la/eigen returns eigenvalues as a ComplexTensor, because eigenvalues of a real matrix can be complex — the roots of the characteristic polynomial may leave \(\mathbb{R}\). For this matrix they happen to be real. Since el/ functions are polymorphic, you can use el/re, el/im, el/abs, and el/sum directly on these ComplexTensor eigenvalues.

When we know all eigenvalues are real (e.g., for symmetric matrices), la/real-eigenvalues is a convenience that extracts the real parts and returns them as a sorted real tensor:

(la/real-eigenvalues A-eig)
#la/R [:float64 [3] [2.000 3.000 4.000]]
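NumPy agrees that a triangular matrix's eigenvalues are its diagonal entries; a sketch used purely as an outside cross-check:

```python
import numpy as np

# Upper-triangular matrix from above: eigenvalues are the diagonal entries.
A = np.array([[4.0, 1.0, 2.0],
              [0.0, 3.0, 1.0],
              [0.0, 0.0, 2.0]])

eigvals = np.linalg.eigvals(A)
print(np.sort(eigvals.real))  # ≈ [2. 3. 4.], matching the sorted real tensor above
```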

Verifying the eigenvalue equation

For each eigenpair \((\lambda, \mathbf{v})\), the residual \(A\mathbf{v} - \lambda\mathbf{v}\) should be zero:

(every? (fn [i]
          (let [lam (el/re ((:eigenvalues eig-result) i))
                ev (nth (:eigenvectors eig-result) i)]
            (< (la/norm (el/- (la/mmul A-eig ev)
                              (el/scale ev lam)))
               1e-10)))
        (range 3))
true

Why eigenvalues matter

Eigenvalues encode essential properties of a matrix:

  • Trace = sum of eigenvalues
  • Determinant = product of eigenvalues
  • Invertibility: a matrix is invertible iff no eigenvalue is zero
  • Stability (in differential equations): negative eigenvalues mean decay

Let us verify the trace and determinant connections:

(def eig-reals (el/re (:eigenvalues eig-result)))
(< (abs (- (la/trace A-eig) (el/sum eig-reals))) 1e-10)
true
(< (abs (- (la/det A-eig) (el/reduce-* eig-reals))) 1e-10)
true
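The same two identities can be confirmed independently in NumPy (an outside cross-check, not part of the book's toolchain):

```python
import numpy as np

A = np.array([[4.0, 1.0, 2.0],
              [0.0, 3.0, 1.0],
              [0.0, 0.0, 2.0]])
lams = np.linalg.eigvals(A)

assert abs(np.trace(A) - lams.sum()) < 1e-10        # trace = sum of eigenvalues
assert abs(np.linalg.det(A) - lams.prod()) < 1e-10  # det = product of eigenvalues
```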

Change of basis and diagonalisation

The idea

A matrix represents a linear map with respect to a chosen basis. Choose a different basis and you get a different matrix for the same map. This is a change of basis.

If \(P\) is the matrix whose columns are the new basis vectors, then the matrix of the map in the new basis is:

\[B = P^{-1} A P\]

This is called a similarity transform. Two matrices related this way represent the same linear map — they share eigenvalues, trace, determinant, and rank.

Diagonalisation

The most illuminating basis change uses the eigenvectors. If \(A\) has \(n\) linearly independent eigenvectors, we can diagonalise it: \(A = P D P^{-1}\) where \(D\) is diagonal.

Consider:

(def A-diag
  (t/matrix [[2 1]
             [0 3]]))

Eigenvalues are 2 and 3. Build \(P\) from the eigenvectors:

(def eig-diag (la/eigen A-diag))
(def P-cols
  (let [evecs (:eigenvectors eig-diag)
        sorted-idx (el/argsort (el/re (:eigenvalues eig-diag)))]
    (t/hstack (mapv #(nth evecs %) sorted-idx))))

The similarity transform yields a diagonal matrix:

(def D-result
  (la/mmul (la/invert P-cols) (la/mmul A-diag P-cols)))
D-result
#la/R [:float64 [2 2]
       [[2.000 0.000]
        [0.000 3.000]]]

In the eigenvector basis, the map is just scaling along each axis. This is the simplest possible form.
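The same diagonalisation can be reproduced in NumPy as an outside cross-check; np.linalg.eig already returns the eigenvectors as columns, so no stacking step is needed:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 3.0]])
lams, P = np.linalg.eig(A)    # columns of P are eigenvectors
D = np.linalg.inv(P) @ A @ P  # similarity transform into the eigenbasis

# D is diagonal, with the eigenvalues on the diagonal.
assert np.allclose(D, np.diag(lams))
```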

Powers of a matrix

Diagonalisation makes it easy to compute powers: \(A^k = P D^k P^{-1}\), and \(D^k\) just raises each diagonal entry to the \(k\)th power.

(def A-diag-sq
  (la/mmul A-diag A-diag))
(def A-diag-sq-via-eigen
  (let [Pinv (la/invert P-cols)
        D2 (t/diag (t/make-reader :float64 2
                                  (let [lam (D-result idx idx)]
                                    (* lam lam))))]
    (la/mmul P-cols (la/mmul D2 Pinv))))
(la/close? A-diag-sq A-diag-sq-via-eigen)
true
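The same power computation, sketched in NumPy as an outside cross-check (with a larger exponent to make the shortcut worthwhile):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 3.0]])
lams, P = np.linalg.eig(A)
k = 5

# A^k via the eigendecomposition: P diag(lam^k) P^{-1}
A_pow = P @ np.diag(lams ** k) @ np.linalg.inv(P)
assert np.allclose(A_pow, np.linalg.matrix_power(A, k))
```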

Symmetric matrices and the spectral theorem

Symmetric matrices

A matrix \(A\) is symmetric if \(A = A^T\). Symmetric matrices are everywhere in applications:

  • Covariance matrices in statistics
  • Graph Laplacians in network analysis
  • Hessians of scalar-valued functions in optimization

They have three properties that general matrices lack:

  1. All eigenvalues are real (no imaginary parts)
  2. Eigenvectors for distinct eigenvalues are orthogonal
  3. The matrix can always be diagonalised (by an orthogonal matrix)

(def S-sym
  (t/matrix [[4 2 0]
             [2 5 1]
             [0 1 3]]))

Verify symmetry:

(la/close? S-sym (la/transpose S-sym))
true
(def eig-S (la/eigen S-sym))

All eigenvalues are real (imaginary parts zero):

(< (el/reduce-max (el/abs (el/im (:eigenvalues eig-S)))) 1e-10)
true

Eigenvectors are orthonormal. Build a matrix from them and check \(Q^T Q = I\):

(def Q-eig
  (t/hstack (:eigenvectors eig-S)))
(def QtQ (la/mmul (la/transpose Q-eig) Q-eig))
(la/norm (el/- QtQ (t/eye 3)))
2.5814177514652363E-16

The spectral theorem

Every real symmetric matrix \(A\) can be written:

\[A = Q \Lambda Q^T\]

where \(Q\) is orthogonal (its columns are the eigenvectors) and \(\Lambda\) is diagonal (the eigenvalues on the diagonal).

Geometrically, the theorem says every symmetric matrix is just scaling along orthogonal axes.
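NumPy's dedicated symmetric solver, np.linalg.eigh, returns exactly the factors of this decomposition; as an outside cross-check on the same matrix:

```python
import numpy as np

S = np.array([[4.0, 2.0, 0.0],
              [2.0, 5.0, 1.0],
              [0.0, 1.0, 3.0]])

lams, Q = np.linalg.eigh(S)  # real eigenvalues (ascending), orthonormal columns

assert np.allclose(Q @ np.diag(lams) @ Q.T, S)  # A = Q Lambda Q^T
assert np.allclose(Q.T @ Q, np.eye(3))          # Q is orthogonal
```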

The word “spectrum” was introduced by Hilbert, by analogy with optical spectra.

Symmetry (\(A = A^T\)) is the matrix expression of a deeper abstract property: the operator is self-adjoint with respect to the inner product:

\[\langle A\mathbf{u}, \mathbf{v} \rangle = \langle \mathbf{u}, A\mathbf{v} \rangle \quad \text{for all vectors } \mathbf{u}, \mathbf{v}\]

The spectral theorem is really about self-adjoint operators on inner product spaces. In different settings it takes different forms: for the integral inner product \(\langle f, g \rangle = \int f\,g\,dx\) on function spaces, the self-adjoint condition leads to Sturm-Liouville theory — the same spectral structure (real eigenvalues, orthogonal eigenfunctions) appears in differential equations.


The SVD

Beyond square matrices

Eigendecomposition only applies to square matrices. The Singular Value Decomposition (SVD) works for any \(m \times n\) matrix:

\[A = U \Sigma V^T\]

  • \(U\) (\(m \times m\), orthogonal): left singular vectors
  • \(\Sigma\) (\(m \times n\), diagonal): singular values \(\sigma_i \geq 0\)
  • \(V^T\) (\(n \times n\), orthogonal): right singular vectors

The singular values are always non-negative and arranged in decreasing order. They measure how much the matrix stretches each direction.

(def A-svd
  (t/matrix [[1 0 1]
             [0 1 1]]))
(def svd-A (la/svd A-svd))
(:S svd-A)
#la/R [:float64 [2] [1.732 1.000]]
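As an outside cross-check, np.linalg.svd on the same matrix yields the same singular values, \(\sqrt{3}\) and \(1\):

```python
import numpy as np

A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
U, s, Vt = np.linalg.svd(A)  # U is 2x2, s has length 2, Vt is 3x3

print(s)  # singular values, descending: sqrt(3) ≈ 1.732 and 1

# Reconstruct A from the factors (only the first 2 rows of Vt contribute):
assert np.allclose((U * s) @ Vt[:2], A)
```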

What the SVD reveals

  • Rank = number of non-zero singular values

  • Condition number \(\kappa = \sigma_1 / \sigma_r\) — how sensitive the matrix is to perturbations

  • Column space = span of the first \(r\) columns of \(U\)

  • Row space = span of the first \(r\) rows of \(V^T\)

  • Null space = span of the last \(n - r\) rows of \(V^T\)

The SVD unifies all four fundamental subspaces in one factorisation.

Low-rank approximation

The Eckart-Young theorem says the best rank-\(k\) approximation to \(A\) (minimising the Frobenius norm of the error) is obtained by keeping only the \(k\) largest singular values and zeroing the rest.

This is the mathematical foundation of dimensionality reduction, image compression, and latent semantic analysis.

(def A-lr
  (t/matrix [[3 2 2]
             [2 3 -2]]))
(def svd-lr (la/svd A-lr))
(def sigmas (:S svd-lr))
sigmas
#la/R [:float64 [2] [5.000 3.000]]

Rank-1 approximation — keep only \(\sigma_1\):

(def A-rank1
  (el/scale (la/mmul (t/submatrix (:U svd-lr) :all [0])
                     (t/submatrix (:Vt svd-lr) [0] :all)) (first sigmas)))

The approximation error equals \(\sigma_2\):

(def approx-err (la/norm (el/- A-lr A-rank1)))
(< (abs (- approx-err (second sigmas))) 1e-10)
true
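The same rank-1 truncation in NumPy, as an outside cross-check of the Eckart-Young error:

```python
import numpy as np

A = np.array([[3.0, 2.0, 2.0],
              [2.0, 3.0, -2.0]])
U, s, Vt = np.linalg.svd(A)   # s ≈ [5. 3.]

# Rank-1 truncation: keep only sigma_1 and its singular vectors.
A1 = s[0] * np.outer(U[:, 0], Vt[0])
err = np.linalg.norm(A - A1)  # Frobenius norm by default

assert abs(err - s[1]) < 1e-10  # the error equals the dropped sigma_2
```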

Positive definite matrices

Definition

A symmetric matrix \(M\) is positive definite (PD) if, for every non-zero vector \(\mathbf{x}\): \(\mathbf{x}^T M \mathbf{x} > 0\).

Think of it as a bowl shape — the quadratic form always curves upward. Equivalent conditions:

  • All eigenvalues are positive
  • All pivots are positive
  • \(M = R^T R\) for some invertible \(R\)
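Each of these equivalent conditions can be checked numerically. A NumPy sketch on a small symmetric matrix (a hypothetical example, not from this chapter):

```python
import numpy as np

M = np.array([[2.0, 1.0],
              [1.0, 2.0]])  # hypothetical example; eigenvalues are 1 and 3

# All eigenvalues positive ...
assert np.all(np.linalg.eigvalsh(M) > 0)

# ... so x^T M x > 0 for sampled non-zero x ...
rng = np.random.default_rng(0)
for _ in range(100):
    x = rng.standard_normal(2)
    assert x @ M @ x > 0

# ... and M = R^T R with R invertible (take R = L^T from Cholesky):
L = np.linalg.cholesky(M)
assert np.allclose(L @ L.T, M)
```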

Positive definite matrices arise as:

  • Covariance matrices in statistics
  • Gram matrices \(A^T A\) (always positive semi-definite)
  • Hessians of strictly convex functions

\(A^T A\) is always positive semi-definite:

(def ATA (la/mmul (la/transpose A-svd) A-svd))
(every? #(>= % -1e-10) (el/re (:eigenvalues (la/eigen ATA))))
true

Cholesky decomposition

Positive definite matrices have a special factorisation: \(M = L L^T\) where \(L\) is lower triangular. This is the Cholesky decomposition — like a “square root” for matrices.

It is about twice as fast as general LU decomposition and is the standard method for solving PD systems.

(def spd-mat
  (el/+ (la/mmul (la/transpose A-eig) A-eig) (t/eye 3)))
(def chol-L (la/cholesky spd-mat))
chol-L
#la/R [:float64 [3 3]
       [[ 4.123  0.000 0.000]
        [0.9701  3.172 0.000]
        [ 1.940 0.9830 2.295]]]

Verify \(L L^T = M\):

(la/norm (el/- (la/mmul chol-L (la/transpose chol-L)) spd-mat))
1.7763568394002505E-15

Non-positive-definite matrices have no Cholesky factor:

(la/cholesky (t/matrix [[1 2] [2 1]]))
nil

(The eigenvalues of that matrix are 3 and \(-1\) — the negative eigenvalue prevents a Cholesky factorisation.)
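For comparison, NumPy signals the same failure with an exception rather than a nil return:

```python
import numpy as np

M = np.array([[1.0, 2.0],
              [2.0, 1.0]])  # eigenvalues 3 and -1, so no Cholesky factor

try:
    np.linalg.cholesky(M)
except np.linalg.LinAlgError:
    print("not positive definite")  # NumPy raises instead of returning nil
```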


Putting it all together

Let us explore a single matrix through all the lenses we have developed.

(def A-final
  (t/matrix [[2 1 0]
             [1 3 1]
             [0 1 2]]))

This matrix is:

  • Symmetric: \(A = A^T\)
  • Positive definite: all eigenvalues > 0

(la/close? A-final (la/transpose A-final))
true

Eigenvalues and eigenvectors

(def eig-final (la/eigen A-final))
(def final-eigenvalues
  (la/real-eigenvalues A-final))
final-eigenvalues
#la/R [:float64 [3] [1.000 2.000 4.000]]

All eigenvalues are positive — the matrix is positive definite.

Trace = sum of eigenvalues

(< (abs (- (la/trace A-final)
           (el/sum final-eigenvalues)))
   1e-10)
true

Determinant = product of eigenvalues

(< (abs (- (la/det A-final)
           (el/reduce-* final-eigenvalues)))
   1e-10)
true

SVD

For a symmetric positive definite matrix, the singular values equal the eigenvalues — the spectral theorem and the SVD coincide.

(def final-svd (la/svd A-final))
(< (el/reduce-max
    (el/abs (el/- (el/sort (:S final-svd))
                  final-eigenvalues)))
   1e-10)
true

Full rank — invertible

(long (el/sum (el/> (:S final-svd) 1e-10)))
3
(def A-inv (la/invert A-final))
A-inv
#la/R [:float64 [3 3]
       [[ 0.6250 -0.2500  0.1250]
        [-0.2500  0.5000 -0.2500]
        [ 0.1250 -0.2500  0.6250]]]
(la/close? (la/mmul A-final A-inv) (t/eye 3))
true

Cholesky factorisation

Since the matrix is symmetric and positive definite, it has a Cholesky factorisation \(A = LL^T\) where \(L\) is lower triangular.

(def chol-final (la/cholesky A-final))
chol-final
#la/R [:float64 [3 3]
       [[ 1.414  0.000 0.000]
        [0.7071  1.581 0.000]
        [ 0.000 0.6325 1.265]]]
(la/close? (la/mmul chol-final (la/transpose chol-final)) A-final)
true
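As one final outside cross-check, the whole capstone can be replayed in NumPy:

```python
import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])

lams = np.linalg.eigvalsh(A)                        # ≈ [1. 2. 4.]
assert np.all(lams > 0)                             # positive definite
assert abs(np.trace(A) - lams.sum()) < 1e-10        # trace = sum of eigenvalues
assert abs(np.linalg.det(A) - lams.prod()) < 1e-10  # det = product of eigenvalues

s = np.linalg.svd(A, compute_uv=False)              # singular values, descending
assert np.allclose(np.sort(s), lams)                # equal the eigenvalues (SPD)

L = np.linalg.cholesky(A)                           # lower-triangular factor
assert np.allclose(L @ L.T, A)                      # L L^T reconstructs A
```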

These concepts are more connected than they may first appear. Eigenvalues show up in the trace, determinant, and singular values. Symmetry leads to orthogonal eigenvectors and the spectral theorem. Positive definiteness gives us Cholesky. The threads run through most of what follows in the book.

source: notebooks/lalinea_book/eigenvalues_and_decompositions.clj