9 Computation tape
The tape namespace records every la/, t/, and el/ operation as a directed acyclic graph (DAG). This serves two purposes: it powers automatic differentiation (reverse-mode autodiff walks the tape backward), and it helps you inspect what the runtime actually computed — which results share memory, which are lazy, and where copies happen.
Setup
```clojure
(ns lalinea-book.computation-tape
  (:require [scicloj.lalinea.linalg :as la]
            [scicloj.lalinea.tensor :as t]
            [scicloj.lalinea.elementwise :as el]
            [scicloj.lalinea.tape :as tape]
            [scicloj.kindly.v4.kind :as kind])
  (:import [org.ejml.data DMatrixRMaj]))
```

Inspecting memory status
`tape/memory-status` classifies a tensor's memory backing without needing a tape. Three states:

- `:contiguous`: backed by a `double[]`; fast, mutable
- `:strided`: shares a `double[]` but with reordered strides
- `:lazy`: no backing array; recomputes on each access
```clojure
(def A (t/matrix [[1 2] [3 4]]))

(tape/memory-status A)
;; => :contiguous
```

`la/transpose` returns a strided view: same backing data, different stride order.

```clojure
(tape/memory-status (la/transpose A))
;; => :strided
```

`el/+` returns a lazy reader: no allocation, recomputes on every access.

```clojure
(def B (t/matrix [[5 6] [7 8]]))

(tape/memory-status (el/+ A B))
;; => :lazy
```

`la/mmul` goes through EJML and returns a fresh contiguous tensor.

```clojure
(tape/memory-status (la/mmul A B))
;; => :contiguous
```

Memory relation
`tape/memory-relation` classifies the relationship between two tensors:

- `:shared`: same backing `double[]`
- `:independent`: separate backing arrays
- `:unknown-lazy`: at least one is lazy; the relationship is indeterminate
A matrix and its transpose share backing memory:
```clojure
(tape/memory-relation A (la/transpose A))
;; => :shared
```

Two separate matrices are independent:

```clojure
(tape/memory-relation A B)
;; => :independent
```

Two `t/column` calls on the same `double[]` share memory:

```clojure
(def arr (double-array [10 20 30]))

(tape/memory-relation (t/column arr) (t/column arr))
;; => :shared
```

A lazy result (from `el/+`) actually reads through to its inputs on every access, but without a tape, dtype-next does not expose that dependency chain:

```clojure
(tape/memory-relation A (el/+ A B))
;; => :unknown-lazy
```

With a tape, `tape/detect-memory-status` gives the complete answer. It knows the inputs of each operation, so it can report `:reads-through` instead of `:unknown-lazy`:
```clojure
(let [tr (tape/with-tape
           (let [M (t/matrix [[1 2] [3 4]])
                 S (el/+ M M)]
             S))]
  (tape/detect-memory-status (last (:entries tr))))
;; => :reads-through
```

Recording a computation tape
`tape/with-tape` records all `la/`, `t/`, and `el/` operations within its scope. It returns a map with `:result` (the computed value), `:entries` (the recorded operations), and an internal `:registry`.
```clojure
(def tape-result
  (tape/with-tape
    (let [M (t/matrix [[1 2] [3 4]])
          S (el/scale M 2.0)
          I (t/eye 2)
          C (el/+ S I)
          D (la/mmul C (la/transpose M))]
      D)))

(dissoc tape-result :registry)
```

```clojure
{:entries
 [{:id "t1",
   :op :t/matrix,
   :inputs [{:external true}],
   :input-tensors [[[1 2] [3 4]]],
   :output #la/R [:float64 [2 2]
                  [[1.000 2.000]
                   [3.000 4.000]]],
   :shape [2 2],
   :complex? false}
  {:id "t2",
   :op :el/scale,
   :inputs [{:id "t1"} {:external true}],
   :input-tensors [#la/R [:float64 [2 2]
                          [[1.000 2.000]
                           [3.000 4.000]]]
                   2.0],
   :output #la/R [:float64 [2 2]
                  [[2.000 4.000]
                   [6.000 8.000]]],
   :shape [2 2],
   :complex? false}
  {:id "t3",
   :op :t/eye,
   :inputs [{:external true}],
   :input-tensors [2],
   :output #la/R [:float64 [2 2]
                  [[1.000 0.000]
                   [0.000 1.000]]],
   :shape [2 2],
   :complex? false}
  {:id "t4",
   :op :el/+,
   :inputs [{:id "t2"} {:id "t3"}],
   :input-tensors [#la/R [:float64 [2 2]
                          [[2.000 4.000]
                           [6.000 8.000]]]
                   #la/R [:float64 [2 2]
                          [[1.000 0.000]
                           [0.000 1.000]]]],
   :output #la/R [:float64 [2 2]
                  [[3.000 4.000]
                   [6.000 9.000]]],
   :shape [2 2],
   :complex? false}
  {:id "t5",
   :op :la/transpose,
   :inputs [{:id "t1"}],
   :input-tensors [#la/R [:float64 [2 2]
                          [[1.000 2.000]
                           [3.000 4.000]]]],
   :output #la/R [:float64 [2 2]
                  [[1.000 3.000]
                   [2.000 4.000]]],
   :shape [2 2],
   :complex? false}
  {:id "t6",
   :op :la/mmul,
   :inputs [{:id "t4"} {:id "t5"}],
   :input-tensors [#la/R [:float64 [2 2]
                          [[3.000 4.000]
                           [6.000 9.000]]]
                   #la/R [:float64 [2 2]
                          [[1.000 3.000]
                           [2.000 4.000]]]],
   :output #la/R [:float64 [2 2]
                  [[11.00 25.00]
                   [24.00 54.00]]],
   :shape [2 2],
   :complex? false}],
 :result #la/R [:float64 [2 2]
                [[11.00 25.00]
                 [24.00 54.00]]]}
```

The result is a map with `:result` (the computed value) and `:entries` (the recorded operations). Each entry tracks the operation, its inputs (linked by ID or marked `:external`), and the output shape.
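Because each entry is a plain map, the tape can be queried with ordinary sequence functions; no special API is needed. A small sketch, using the `tape-result` recorded above, that finds every operation reading the output of `"t1"` (the original matrix):

```clojure
;; Entries are plain maps, so standard sequence functions work on them.
;; Find the ops whose :inputs reference entry "t1" (the original matrix M):
(->> (:entries tape-result)
     (filter (fn [entry]
               (some #(= "t1" (:id %)) (:inputs entry))))
     (mapv :op))
;; => [:el/scale :la/transpose]
```

Given the entries above, `el/scale` (`"t2"`) and `la/transpose` (`"t5"`) are the two direct consumers of `"t1"`.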
External inputs
The tape tracks la/, t/, and el/ operations. Inputs that originate outside La Linea (raw arrays, Clojure data structures, dtype-next operations, EJML objects) appear as {:external true}.
Double arrays
A double[] passed to t/column is external — the tape records t/column but not the array construction.
```clojure
(def array-tape
  (tape/with-tape
    (let [v (t/column [1 2 3])
          w (el/scale v 5.0)]
      w)))

(mapv (fn [e] (select-keys e [:id :op :inputs]))
      (:entries array-tape))
;; => [{:id "t1", :op :t/column, :inputs [{:external true}]}
;;     {:id "t2", :op :el/scale, :inputs [{:id "t1"} {:external true}]}]
```

Clojure vectors and sequences
Clojure data structures are external too. t/matrix wraps them into tensors — the tape records that wrapping.
```clojure
(def seq-tape
  (tape/with-tape
    (let [M (t/matrix (for [i (range 3)]
                        (for [j (range 3)]
                          (* (inc i) (inc j)))))
          v (t/column (repeat 3 1.0))]
      (la/mmul M v))))

(mapv :op (:entries seq-tape))
;; => [:t/matrix :t/column :la/mmul]
```

dtype-next operations
Using el/* instead of raw dfn/* means the tape captures the full chain. The t/matrix wrapper around the result is also recorded.
```clojure
(def mul-tape
  (tape/with-tape
    (let [A (t/matrix [[1 2] [3 4]])
          doubled (el/* A 2.0)
          result (el/+ (t/matrix doubled) A)]
      result)))

(mapv (fn [e] (select-keys e [:id :op :inputs]))
      (:entries mul-tape))
;; => [{:id "t1", :op :t/matrix, :inputs [{:external true}]}
;;     {:id "t2", :op :el/*, :inputs [{:id "t1"} {:external true}]}
;;     {:id "t3", :op :t/matrix, :inputs [{:id "t2"}]}
;;     {:id "t4", :op :el/+, :inputs [{:id "t3"} {:id "t1"}]}]
```

`el/*` computes the element-wise product, and `t/matrix` wraps its result. All four operations (`t/matrix`, `el/*`, `t/matrix`, `el/+`) are tracked on the tape.
EJML structures
EJML’s DMatrixRMaj can be converted to a tensor via t/dmat->tensor. The conversion itself is not instrumented (it is a low-level interop function), so the resulting tensor is external to the tape.
```clojure
(def ejml-tape
  (tape/with-tape
    (let [dm (doto (DMatrixRMaj. 2 2)
               (.setData (double-array [1 0 0 1])))
          I (t/dmat->tensor dm)
          result (el/+ (t/matrix [[5 6] [7 8]]) I)]
      result)))

(mapv (fn [e] (select-keys e [:id :op :inputs]))
      (:entries ejml-tape))
;; => [{:id "t1", :op :t/matrix, :inputs [{:external true}]}
;;     {:id "t2", :op :el/+, :inputs [{:id "t1"} {:id "x3", :external true}]}]
```

The tensor from `t/dmat->tensor` enters the tape as an external input to `el/+`.
Complex operations
Complex constructors are recorded as t/ operations on the tape. The tape shows the full chain: t/matrix → t/complex-tensor → el/+.
```clojure
(def complex-tape
  (tape/with-tape
    (let [z1 (t/complex-tensor (t/matrix [[1 0] [0 1]]))
          z2 (t/complex-tensor (t/matrix [[0 1] [1 0]]))
          s (el/+ z1 z2)]
      s)))

(mapv :op (:entries complex-tape))
;; => [:t/matrix :t/complex-tensor :t/matrix :t/complex-tensor :el/+]
```

The polymorphic `el/+` works for both real and complex inputs. The tape always records `:el/+` regardless of the input type.
Complex inputs produce the same :el/+ tape key:
```clojure
(mapv :op (:entries (tape/with-tape
                      (el/+ (t/complex-tensor [1 2])
                            (t/complex-tensor [3 4])))))
;; => [:t/complex-tensor :t/complex-tensor :el/+]
```

Tape summary
tape/summary aggregates statistics about the tape: operation counts and memory status breakdown.
```clojure
(tape/summary tape-result)
```

```clojure
{:total 6,
 :by-op
 {:t/matrix 1,
  :el/scale 1,
  :t/eye 1,
  :el/+ 1,
  :la/transpose 1,
  :la/mmul 1},
 :by-memory {:independent 2, :reads-through 3, :shared 1}}
```

The summary shows:

- `:by-op`: one of each operation
- `:by-memory`: which operations read through, share memory, or are independent
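The `:by-op` counts are nothing more than frequencies over the entry list, which you can reproduce (and sanity-check) with plain Clojure, assuming the `tape-result` recorded above:

```clojure
;; :by-op is just a frequency count of the :op key across entries:
(frequencies (map :op (:entries tape-result)))
;; => a map with one count per op, matching :by-op in the summary above
```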
Origin DAG
tape/origin walks the tape backwards from a value to reconstruct its computation DAG.
```clojure
(tape/origin tape-result (:result tape-result))
```

```clojure
{:id "t6",
 :op :la/mmul,
 :shape [2 2],
 :inputs
 [{:id "t4",
   :op :el/+,
   :shape [2 2],
   :inputs
   [{:id "t2",
     :op :el/scale,
     :shape [2 2],
     :inputs
     [{:id "t1",
       :op :t/matrix,
       :shape [2 2],
       :inputs [{:external true}]}
      {:external true}]}
    {:id "t3", :op :t/eye, :shape [2 2], :inputs [{:external true}]}]}
  {:id "t5", :op :la/transpose, :shape [2 2], :inputs [{:ref "t1"}]}]}
```

The DAG shows that `mmul` depends on the addition and the transpose, which both trace back to the original matrix. The shared reference to that matrix appears as `{:ref "t1"}` in the transpose branch: a proper DAG, not a tree.
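Since the origin value is an ordinary nested map with children under `:inputs`, it can be traversed with `tree-seq`. A sketch counting the operation nodes in the DAG above (assumes the structure shown: operation nodes carry `:op`, while external inputs and `:ref` placeholders do not):

```clojure
;; Walk the nested origin map; nodes with an :inputs key are branches.
;; External inputs ({:external true}) and refs ({:ref "t1"}) carry no :op,
;; so (keep :op ...) drops them.
(->> (tree-seq :inputs :inputs
               (tape/origin tape-result (:result tape-result)))
     (keep :op)
     count)
;; => 6  (one node per recorded operation, t1 through t6)
```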
Mermaid visualization
`tape/mermaid` renders the DAG as a Mermaid flowchart. Memory status annotations:

- `~` reads-through (lazy; recomputes from its inputs on access)
- `=` shared (same backing array as an input)
- `+` independent (fresh allocation)

```clojure
(tape/mermaid tape-result (:result tape-result))
```

Practical example: tracing a pipeline
A common question: “after this chain of operations, what is materialized and what is lazy?”
```clojure
(def pipeline-result
  (tape/with-tape
    (let [data (t/matrix [[1 0 2]
                          [0 3 0]
                          [4 0 5]])
          centered (el/- data (el/scale (t/matrix [[1 1 1]
                                                   [1 1 1]
                                                   [1 1 1]])
                                        (/ (double (la/trace data)) 3.0)))
          {:keys [U S Vt]} (la/svd centered)
          projection (la/mmul (la/transpose Vt) (t/column [1 0 0]))]
      projection)))

(tape/summary pipeline-result)
```

```clojure
{:total 9,
 :by-op
 {:t/matrix 2,
  :la/trace 1,
  :el/scale 1,
  :el/- 1,
  :la/svd 1,
  :la/transpose 1,
  :t/column 1,
  :la/mmul 1},
 :by-memory {:independent 4, :reads-through 5}}
```

The summary reveals the mix of lazy intermediate operations and EJML-backed materializations.
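To see exactly which steps are lazy and which are materialized, the per-entry classifier from earlier can be mapped over the whole tape. A sketch, assuming `tape/detect-memory-status` accepts any recorded entry as in the single-entry example above:

```clojure
;; Pair each recorded op with its detected memory status,
;; e.g. [:el/- :reads-through] or [:la/mmul :independent]:
(mapv (juxt :op tape/detect-memory-status)
      (:entries pipeline-result))
```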
```clojure
(tape/mermaid pipeline-result (:result pipeline-result))
```