16 Data Representations

This chapter looks under the hood at how harmonica represents mathematical objects as Clojure data. If you want to contribute, port the library, or simply understand what your REPL is printing, this is the place to start.

Design principle: groups are first-class values (defrecords implementing protocols), and group elements are plain Clojure data — integers, vectors, tagged pairs. There are no wrapper types around elements; the group object carries the algebra.

(ns harmonica-book.data-representations
  (:require
   [scicloj.harmonica :as hm]
   [scicloj.lalinea.tensor :as t]
   [scicloj.lalinea.elementwise :as el]
   [scicloj.harmonica.protocols :as p]
   [scicloj.harmonica.combinatorics.permutation :as perm]
   [scicloj.harmonica.combinatorics.partition :as part]
   [scicloj.harmonica.combinatorics.young-tableaux :as yt]
   [scicloj.harmonica.combinatorics.murnaghan-nakayama :as mn]
   [fastmath.matrix :as fm]
   [scicloj.kindly.v4.kind :as kind]))

Groups as protocol values

The algebraic structure is defined by four Clojure protocols:

Protocol	Methods	Purpose
`Group`	`op`, `inv`, `id`	Binary operation, inverse, identity
`FiniteGroup`	`elements`, `order`	Enumerate elements, count them
`GroupStructure`	`conjugacy-classes`	Structural decomposition
`GroupType`	`group-type`	Keyword tag for multimethod dispatch

Each group family is a defrecord implementing these protocols. The same functions — hm/op, hm/inv, hm/id, hm/elements, hm/order — work on any group.

Cyclic group \(\mathbb{Z}/n\mathbb{Z}\)

Elements are integers 0, 1, ..., n-1. The operation is addition mod \(n\).

(def Z6 (hm/cyclic-group 6))

(type Z6)

scicloj.harmonica.group.cyclic.CyclicGroup

The group stores only its order. Elements are generated on the fly.

(select-keys Z6 [:n])

{:n 6}

(hm/elements Z6)

(0 1 2 3 4 5)

The operation, inverse, and identity are exactly what you’d expect:

[(hm/op Z6 2 3)
 (hm/inv Z6 2)
 (hm/id Z6)]

[5 4 0]

Because cyclic groups are abelian, each element is its own conjugacy class.

(count (hm/conjugacy-classes Z6))

Symmetric group \(S_n\)

Elements are 0-indexed one-line notation vectors: [σ(0) σ(1) ... σ(n-1)]. The vector [2 0 1] means \(0 \mapsto 2\), \(1 \mapsto 0\), \(2 \mapsto 1\).

Composition is right-to-left: (op G σ τ) \(= \sigma \circ \tau\) where \((\sigma \circ \tau)(i) = \sigma(\tau(i))\).

(def S4 (hm/symmetric-group 4))

(type S4)

scicloj.harmonica.group.symmetric.SymmetricGroup

(hm/order S4)

Composition of two permutations:

(hm/op S4 [1 0 3 2] [2 3 0 1])

[3 2 1 0]

The identity is [0 1 2 3] and inverses swap the mapping:

[(hm/id S4)
 (hm/inv S4 [1 2 3 0])]

[[0 1 2 3] [3 0 1 2]]

Conjugacy classes of \(S_n\) are indexed by cycle type — the partition giving the lengths of disjoint cycles. \(S_4\) has 5 classes, one for each partition of 4:

(mapv :cycle-type (hm/conjugacy-classes S4))

[[4] [3 1] [2 2] [2 1 1] [1 1 1 1]]

Dihedral group \(D_n\)

The symmetries of a regular \(n\)-gon. Order \(2n\). Elements are tagged pairs:

[:r k] — rotation by \(2\pi k / n\)
[:s k] — reflection (rotation \(r^k\) followed by base reflection \(s\))

The algebra follows from the presentation \(r^n = s^2 = e\), \(s r s = r^{-1}\).

(def D4 (hm/dihedral-group 4))

(hm/order D4)

Rotation then reflection:

(hm/op D4 [:r 1] [:s 0])

[:s 1]

Two reflections give a rotation:

(hm/op D4 [:s 2] [:s 0])

[:r 2]

Reflections are involutions (\(s^2 = e\)):

(hm/inv D4 [:s 3])

[:s 3]

Product group \(G_1 \times G_2\)

Elements are pairs [g h]; all operations are componentwise.

(def Z2xZ3 (hm/product-group (hm/cyclic-group 2) (hm/cyclic-group 3)))

(hm/order Z2xZ3)

(hm/op Z2xZ3 [1 2] [1 1])

[0 0]

Uniform interface

The same protocol functions work on every group type. Here we verify the group axioms generically:

(defn group-axioms-hold? [G]
  (let [e (hm/id G)
        elts (take 20 (hm/elements G))]
    (and
     ;; Identity: e * g = g
     (every? #(= (hm/op G e %) %) elts)
     ;; Inverse: g * g^{-1} = e
     (every? #(= (hm/op G % (hm/inv G %)) e) elts))))

(mapv group-axioms-hold? [Z6 S4 D4 Z2xZ3])

[true true true true]

Permutations

Permutations are plain vectors. The representation is 0-indexed one-line notation: sigma[i] is the image of i.

Composition is a single mapv — apply \(\tau\) first, then \(\sigma\):

(let [sigma [2 0 1]
      tau   [1 2 0]]
  (mapv sigma tau))

[0 1 2]

Inverse uses a mutable int-array for O(n) construction — if \(\sigma(i) = j\) then \(\sigma^{-1}(j) = i\):

(perm/inverse [2 0 1])

[1 2 0]

Cycle decomposition walks the permutation using a boolean-array to track visited positions. Fixed points (1-cycles) are omitted:

(perm/cycles [1 3 0 2])

[[0 1 3 2]]

Cycle type is the partition of cycle lengths (including fixed points), sorted descending:

(perm/cycle-type [1 2 0 3 4])

[3 1 1]

Sign counts the total number of transpositions in the cycle decomposition:

[(perm/sign [0 1 2 3])
 (perm/sign [1 0 2 3])
 (perm/sign [1 2 0 3])]

[1 -1 1]

Partitions and Young tableaux

A partition of \(n\) is a descending vector of positive integers summing to \(n\). Partitions index the conjugacy classes and the irreducible representations of \(S_n\).

(hm/partitions 5)

[[5] [4 1] [3 2] [3 1 1] [2 2 1] [2 1 1 1] [1 1 1 1 1]]

The conjugate (transpose) swaps rows and columns in the Young diagram:

(part/conjugate [4 2 1])

[3 2 1 1]

A Standard Young Tableau (SYT) fills the diagram with \(1, 2, \ldots, n\) so that entries increase along rows and down columns. SYTs index the basis vectors of irreducible representations. They are enumerated by depth-first backtracking:

(hm/standard-young-tableaux [3 2])

[[[1 2 3] [4 5]]
 [[1 2 4] [3 5]]
 [[1 2 5] [3 4]]
 [[1 3 4] [2 5]]
 [[1 3 5] [2 4]]]

The number of SYTs equals the dimension of the corresponding irrep, computed by the hook-length formula:

\[d_\lambda = \frac{n!}{\prod_{(i,j) \in \lambda} h(i,j)}\]

(hm/hook-length-dimension [3 2])

Both methods agree:

(= (count (hm/standard-young-tableaux [4 3 1]))
   (hm/hook-length-dimension [4 3 1]))

true

Character tables

A character table is a Clojure map with five keys:

(def ct-s4 (hm/character-table S4))

(sort (keys ct-s4))

(:class-sizes :classes :group :irrep-labels :table)

For \(S_4\), irrep labels are the 5 partitions of 4:

(:irrep-labels ct-s4)

[[4] [3 1] [2 2] [2 1 1] [1 1 1 1]]

Classes are also partitions (cycle types), with the identity class [1 1 1 1] first:

(:classes ct-s4)

[[1 1 1 1] [2 1 1] [2 2] [3 1] [4]]

The :table is a ComplexTensor matrix — a [5 × 5] complex matrix whose real part contains the character values (all integers for \(S_n\)):

(t/complex-shape (:table ct-s4))

[5 5]

(:table ct-s4)

#la/C [:float64 [5 5]
       [[1.000 + 0.000 i   1.000 + 0.000 i   1.000 + 0.000 i   1.000 + 0.000 i   1.000 + 0.000 i]
        [3.000 + 0.000 i   1.000 + 0.000 i  -1.000 + 0.000 i   0.000 + 0.000 i  -1.000 + 0.000 i]
        [2.000 + 0.000 i   0.000 + 0.000 i   2.000 + 0.000 i  -1.000 + 0.000 i   0.000 + 0.000 i]
        [3.000 + 0.000 i  -1.000 + 0.000 i  -1.000 + 0.000 i   0.000 + 0.000 i   1.000 + 0.000 i]
        [1.000 + 0.000 i  -1.000 + 0.000 i   1.000 + 0.000 i   1.000 + 0.000 i  -1.000 + 0.000 i]]]

Character table computation is a multimethod dispatching on group-type:

Group type	Algorithm
`:cyclic`	DFT matrix: χⱼ(k) = ωʲᵏ where ω = e^(2πi/n)
`:symmetric`	Murnaghan-Nakayama rule (integer-valued)
`:dihedral`	Closed-form trig expressions
`:product`	Kronecker product of factor character tables

ComplexTensor

ComplexTensor is the library’s complex number type, provided by lalinea. The core idea: a complex tensor of shape \([d_1, d_2, \ldots]\) is stored as a real tensor of shape \([d_1, d_2, \ldots, 2]\) with interleaved real/imaginary pairs.

A complex vector of length 3:

(def v (t/complex-tensor [1.0 2.0 3.0] [0.5 -0.5 1.0]))

#la/C [:float64 [3] [1.000 + 0.5000 i  2.000 - 0.5000 i  3.000 + 1.000 i]]

The underlying storage is a [3, 2] real tensor:

(t/shape (t/->tensor v))

[3 2]

re and im return zero-copy tensor views — slicing the last axis at index 0 or 1:

[(vec (el/re v)) (vec (el/im v))]

[[1.0 2.0 3.0] [0.5 -0.5 1.0]]

This interleaved layout is deliberate:

Cache-friendly: real and imaginary parts of the same element sit in adjacent memory
EJML compatible: EJML’s ZMatrixRMaj uses the same interleaved double[] layout, enabling zero-copy sharing
Field dispatch: el/+, el/-, el/* dispatch correctly on ComplexTensors (complex addition, subtraction, multiplication)

Complex multiplication via el/*:

(let [a (t/complex 3.0 4.0)
      b (t/complex 1.0 2.0)]
  {:cmul-re (el/re (el/* a b))
   :cmul-im (el/im (el/* a b))})

{:cmul-re -5.0, :cmul-im 10.0}

ComplexTensors implement IFn, Indexed, Seqable, and Counted, so they work naturally with Clojure’s indexing and sequence operations:

(let [ct-row ((:table ct-s4) 0)]
  {:first-value (el/re (((:table ct-s4) 0) 0))
   :count (count ((:table ct-s4) 0))})

{:first-value 1.0, :count 5}

Representations

An irreducible representation of \(S_n\) is built from Young’s orthogonal form. The result is a Clojure map:

(def ir-31 (hm/irrep [3 1]))

(sort (keys ir-31))

(:dimension :generators :lambda :syts)

The four fields:

Key	Type	Content
`:lambda`	vector	The partition labeling this irrep
`:dimension`	long	Number of SYTs = dimension of the representation
`:syts`	vector of SYTs	The basis (each SYT is a vector of row vectors)
`:generators`	vector of `RealMatrix`	One matrix per adjacent transposition sᵢ

(:lambda ir-31)

[3 1]

(:dimension ir-31)

The basis vectors are the 3 standard Young tableaux of shape [3, 1]:

(:syts ir-31)

[[[1 2 3] [4]] [[1 2 4] [3]] [[1 3 4] [2]]]

The generators are \((n-1) = 3\) real orthogonal matrices, one for each adjacent transposition \(s_1 = (1\;2)\), \(s_2 = (2\;3)\), \(s_3 = (3\;4)\):

(count (:generators ir-31))

To get \(\rho(\sigma)\) for any permutation \(\sigma\), the library decomposes \(\sigma\) into adjacent transpositions and multiplies the corresponding generators:

(hm/rep-matrix ir-31 [1 0 2 3])

#object[org.apache.commons.math3.linear.Array2DRowRealMatrix 0x399dfae1 "Array2DRowRealMatrix{{1.0,0.0,0.0},{0.0,1.0,0.0},{0.0,0.0,-1.0}}"]

Composite representations (tensor product, direct sum) use a different mechanism: a :matrix-fn closure that computes \(\rho(\sigma)\) on demand:

(let [rep1 (hm/irrep [3 1])
      rep2 (hm/irrep [2 2])
      tp (hm/tensor-product rep1 rep2)]
  {:has-matrix-fn (some? (:matrix-fn tp))
   :dimension (:dimension tp)})

{:has-matrix-fn true, :dimension 6}

This two-track dispatch lets generator-based irreps and closure-based composites share the same rep-matrix function.

Fourier transform pipeline

The Fourier transform on a finite abelian group uses lalinea’s field dispatch. The forward transform computes each coefficient as a Hermitian inner product:

\[\hat{f}(k) = \langle f, \chi_k \rangle = \sum_g f(g) \, \overline{\chi_k(g)}\]

via la/dot. The inverse uses el/sum(el/*) for the bilinear sum.

(let [G (hm/cyclic-group 8)
      ct (hm/character-table G)
      signal (t/complex-tensor-real [1 0 1 0 1 0 1 0])]
  (hm/fourier-transform ct signal))

#la/C [:float64 [8] [4.000 + 0.000 i  -7.267E-17 - 2.220E-16 i  0.000 - 2.449E-16 i  -1.608E-16 - 1.110E-16 i  4.000 + 0.000 i  -7.267E-17 - 2.220E-16 i  0.000 - 2.449E-16 i  -1.608E-16 - 1.110E-16 i]]

The inverse transform reverses the process, with a \(1/|G|\) scaling factor. Convolution composes: forward transform both inputs, pointwise complex multiply (el/*), inverse transform.

What comes next

The next chapter, Algorithms, walks through the key algorithms — Murnaghan-Nakayama, Young’s orthogonal form, bubble-sort decomposition — with step-by-step worked examples.

source: notebooks/harmonica_book/data_representations.clj