5 Core Concepts

The vocabulary you reach for daily: data, mappings, scope, layer types, and how they fit together. This chapter takes the mental model from Poses and turns it into a working reference you can scan while building a plot.

Read Poses first if you have not already – this chapter builds on the pose vocabulary it introduces.

(ns plotje-book.core-concepts
  (:require
   ;; Tablecloth -- dataset manipulation
   [tablecloth.api :as tc]
   ;; Kindly -- notebook rendering protocol
   [scicloj.kindly.v4.kind :as kind]
   ;; Plotje -- composable plotting
   [scicloj.plotje.api :as pj]
   ;; Rdatasets -- standard datasets
   [scicloj.metamorph.ml.rdatasets :as rdatasets]))

Data

Plotje accepts plain Clojure data – maps, vectors of maps – or columnar datasets. No wrapping needed for simple cases. The Datasets chapter covers data formats and loading in detail.

We use the classic iris flower dataset throughout these examples. Each column has a name (a keyword like :sepal-length) and holds values of one type.

(rdatasets/datasets-iris)

https://vincentarelbundock.github.io/Rdatasets/csv/datasets/iris.csv [150 6]:

:rownames	:sepal-length	:sepal-width	:petal-length	:petal-width	:species
1	5.1	3.5	1.4	0.2	setosa
2	4.9	3.0	1.4	0.2	setosa
3	4.7	3.2	1.3	0.2	setosa
4	4.6	3.1	1.5	0.2	setosa
5	5.0	3.6	1.4	0.2	setosa
6	5.4	3.9	1.7	0.4	setosa
7	4.6	3.4	1.4	0.3	setosa
8	5.0	3.4	1.5	0.2	setosa
9	4.4	2.9	1.4	0.2	setosa
10	4.9	3.1	1.5	0.1	setosa
…	…	…	…	…	…
140	6.9	3.1	5.4	2.1	virginica
141	6.7	3.1	5.6	2.4	virginica
142	6.9	3.1	5.1	2.3	virginica
143	5.8	2.7	5.1	1.9	virginica
144	6.8	3.2	5.9	2.3	virginica
145	6.7	3.3	5.7	2.5	virginica
146	6.7	3.0	5.2	2.3	virginica
147	6.3	2.5	5.0	1.9	virginica
148	6.5	3.0	5.2	2.0	virginica
149	6.2	3.4	5.4	2.3	virginica
150	5.9	3.0	5.1	1.8	virginica

The dataset has 150 rows and 6 columns: a :rownames index plus four numerical measurements (in centimeters) and one categorical column (the species name – one of three strings).

This distinction matters: Plotje treats numerical and categorical columns differently when choosing axes, colors, and statistical transforms.
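
To make the numerical/categorical distinction concrete, here is a minimal sketch in plain Clojure. The helper column-kind is hypothetical and is not how Plotje classifies columns internally; it only illustrates the idea that a column's values decide which treatment it gets.

```clojure
;; Hypothetical helper, not part of Plotje: classify a column by its values.
(defn column-kind
  "Returns :numerical when every value is a number, otherwise :categorical."
  [values]
  (if (every? number? values) :numerical :categorical))

;; A three-row excerpt of two iris columns, written as a map of columns.
(def iris-sample
  {:sepal-length [5.1 4.9 6.9]
   :species      ["setosa" "setosa" "virginica"]})

;; Classify every column in the sample.
(def column-kinds
  (into {}
        (map (fn [[column-name values]] [column-name (column-kind values)]))
        iris-sample))

column-kinds
;; => {:sepal-length :numerical, :species :categorical}
```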

Here is a scatter plot of sepal dimensions, colored by species:

(-> (rdatasets/datasets-iris)
    (pj/pose :sepal-length :sepal-width)
    (pj/lay-point {:color :species}))

Printed, the pose carries the data and the :x/:y position mapping at the top, and one point layer with its own layer-scoped :color:

(-> (rdatasets/datasets-iris)
    (pj/pose :sepal-length :sepal-width)
    (pj/lay-point {:color :species})
    kind/pprint)

{:mapping {:x :sepal-length, :y :sepal-width},
 :layers [{:layer-type :point, :mapping {:color :species}}],
 :data
 https://vincentarelbundock.github.io/Rdatasets/csv/datasets/iris.csv [150 6]:

| :rownames | :sepal-length | :sepal-width | :petal-length | :petal-width |  :species |
|----------:|--------------:|-------------:|--------------:|-------------:|-----------|
|         1 |           5.1 |          3.5 |           1.4 |          0.2 |    setosa |
|         2 |           4.9 |          3.0 |           1.4 |          0.2 |    setosa |
|         3 |           4.7 |          3.2 |           1.3 |          0.2 |    setosa |
|         4 |           4.6 |          3.1 |           1.5 |          0.2 |    setosa |
|         5 |           5.0 |          3.6 |           1.4 |          0.2 |    setosa |
|         6 |           5.4 |          3.9 |           1.7 |          0.4 |    setosa |
|         7 |           4.6 |          3.4 |           1.4 |          0.3 |    setosa |
|         8 |           5.0 |          3.4 |           1.5 |          0.2 |    setosa |
|         9 |           4.4 |          2.9 |           1.4 |          0.2 |    setosa |
|        10 |           4.9 |          3.1 |           1.5 |          0.1 |    setosa |
|       ... |           ... |          ... |           ... |          ... |       ... |
|       140 |           6.9 |          3.1 |           5.4 |          2.1 | virginica |
|       141 |           6.7 |          3.1 |           5.6 |          2.4 | virginica |
|       142 |           6.9 |          3.1 |           5.1 |          2.3 | virginica |
|       143 |           5.8 |          2.7 |           5.1 |          1.9 | virginica |
|       144 |           6.8 |          3.2 |           5.9 |          2.3 | virginica |
|       145 |           6.7 |          3.3 |           5.7 |          2.5 | virginica |
|       146 |           6.7 |          3.0 |           5.2 |          2.3 | virginica |
|       147 |           6.3 |          2.5 |           5.0 |          1.9 | virginica |
|       148 |           6.5 |          3.0 |           5.2 |          2.0 | virginica |
|       149 |           6.2 |          3.4 |           5.4 |          2.3 | virginica |
|       150 |           5.9 |          3.0 |           5.1 |          1.8 | virginica |
}

Input formats

Plotje accepts several common Clojure data shapes and coerces them into a dataset internally.

Map of columns – keys are column names, values are sequences:

(-> {:x [1 2 3 4 5]
     :y [2 4 3 5 4]}
    (pj/lay-point :x :y))

Sequence of row maps – each map is one row:

(-> [{:city "Paris" :temperature 22}
     {:city "London" :temperature 18}
     {:city "Berlin" :temperature 20}
     {:city "Rome" :temperature 28}]
    (pj/lay-value-bar :city :temperature))
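
The two shapes above carry the same information. As a rough illustration of what coercion between them involves, the sketch below turns row maps into a map of columns using only clojure.core. The function rows->columns is a hypothetical name, and the sketch assumes every row has the same keys as the first row; Plotje's own coercion may work differently.

```clojure
;; Hypothetical helper, not part of Plotje.
(defn rows->columns
  "Turns a sequence of row maps into a map of column name -> vector of values.
  Assumes every row has the same keys as the first row."
  [rows]
  (into {}
        (map (fn [k] [k (mapv #(get % k) rows)]))
        (keys (first rows))))

(def city-rows
  [{:city "Paris" :temperature 22}
   {:city "London" :temperature 18}])

(rows->columns city-rows)
;; => {:city ["Paris" "London"], :temperature [22 18]}
```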

When the dataset has 1, 2, or 3 columns, you can omit the column names entirely – they are inferred by position: the first column becomes x, the second becomes y, the third becomes color.

(-> {:x [1 2 3 4 5] :y [2 4 3 5 4]}
    pj/lay-point)

Datasets with four or more columns require explicit column names. See the Datasets chapter for loading from CSV, URLs, and other file formats.

Mappings and Layers

The Poses chapter introduced the mapping-vs-layer split (what vs how). This section is the practical follow-up: how the split plays out across multi-layer plots and explicit-vs-shorthand calls.

Declare a mapping once with pj/pose, then add layers with pj/lay-* – both layers share the same axes:

(-> (rdatasets/datasets-iris)
    (pj/pose :sepal-length :sepal-width)
    pj/lay-point
    (pj/lay-smooth {:stat :linear-model}))

One mapping, two layers: points and a regression line.

When pj/lay-* is called with columns, it creates a pose and attaches a layer in one step:

(-> (rdatasets/datasets-iris)
    (pj/lay-point :sepal-length :sepal-width))

We will revisit where mappings flow (to all layers or to a single one) in the Scope section below.

To arrange multiple poses side by side, use pj/arrange:

(def two-panel
  (pj/arrange
   [(-> (rdatasets/datasets-iris)
        (pj/lay-point :sepal-length :sepal-width))
    (-> (rdatasets/datasets-iris)
        (pj/lay-point :petal-length :petal-width))]))

two-panel

Each sub-pose has its own mapping and layers. pj/arrange produces a composite pose that contains them as siblings.

Scope

A mapping connects a column to a visual property – like mapping :species to color. Where you write a mapping determines who sees it. There are two levels:

Where you write it	What sees it
`(pj/pose ... {:color :c})`	All layers on this pose
`(pj/lay-point ... {:color :c})`	This layer only

This is lexical scope – the closest enclosing definition wins.

Pose-level mapping

pj/pose’s mapping flows to every layer attached to the pose:

(-> (rdatasets/datasets-iris)
    (pj/pose :sepal-length :sepal-width {:color :species})
    pj/lay-point
    (pj/lay-smooth {:stat :linear-model}))

Both point and smooth layers see :color :species. Three regression lines – one per species.

Layer-level mapping

A mapping in pj/lay-* scopes to that layer alone:

(-> (rdatasets/datasets-iris)
    (pj/pose :sepal-length :sepal-width)
    (pj/lay-point {:color :species})
    (pj/lay-smooth {:stat :linear-model}))

Printed, :color lives on the point layer’s own :mapping, not on the pose – so only the point layer sees it:

(-> (rdatasets/datasets-iris)
    (pj/pose :sepal-length :sepal-width)
    (pj/lay-point {:color :species})
    (pj/lay-smooth {:stat :linear-model})
    kind/pprint)

{:mapping {:x :sepal-length, :y :sepal-width},
 :layers
 [{:layer-type :point, :mapping {:color :species}}
  {:layer-type :smooth, :stat :linear-model}],
 :data
 https://vincentarelbundock.github.io/Rdatasets/csv/datasets/iris.csv [150 6]:

| :rownames | :sepal-length | :sepal-width | :petal-length | :petal-width |  :species |
|----------:|--------------:|-------------:|--------------:|-------------:|-----------|
|         1 |           5.1 |          3.5 |           1.4 |          0.2 |    setosa |
|         2 |           4.9 |          3.0 |           1.4 |          0.2 |    setosa |
|         3 |           4.7 |          3.2 |           1.3 |          0.2 |    setosa |
|         4 |           4.6 |          3.1 |           1.5 |          0.2 |    setosa |
|         5 |           5.0 |          3.6 |           1.4 |          0.2 |    setosa |
|         6 |           5.4 |          3.9 |           1.7 |          0.4 |    setosa |
|         7 |           4.6 |          3.4 |           1.4 |          0.3 |    setosa |
|         8 |           5.0 |          3.4 |           1.5 |          0.2 |    setosa |
|         9 |           4.4 |          2.9 |           1.4 |          0.2 |    setosa |
|        10 |           4.9 |          3.1 |           1.5 |          0.1 |    setosa |
|       ... |           ... |          ... |           ... |          ... |       ... |
|       140 |           6.9 |          3.1 |           5.4 |          2.1 | virginica |
|       141 |           6.7 |          3.1 |           5.6 |          2.4 | virginica |
|       142 |           6.9 |          3.1 |           5.1 |          2.3 | virginica |
|       143 |           5.8 |          2.7 |           5.1 |          1.9 | virginica |
|       144 |           6.8 |          3.2 |           5.9 |          2.3 | virginica |
|       145 |           6.7 |          3.3 |           5.7 |          2.5 | virginica |
|       146 |           6.7 |          3.0 |           5.2 |          2.3 | virginica |
|       147 |           6.3 |          2.5 |           5.0 |          1.9 | virginica |
|       148 |           6.5 |          3.0 |           5.2 |          2.0 | virginica |
|       149 |           6.2 |          3.4 |           5.4 |          2.3 | virginica |
|       150 |           5.9 |          3.0 |           5.1 |          1.8 | virginica |
}

Color is on the point layer. The smooth layer does not see it – one overall regression line.

Override

Lower scopes override higher ones. A layer can cancel a mapping by setting it to nil:

(-> (rdatasets/datasets-iris)
    (pj/pose :sepal-length :sepal-width {:color :species})
    (pj/lay-point {:color nil})
    (pj/lay-smooth {:stat :linear-model}))

Printed, the override appears as :color nil in the point layer’s own :mapping, erasing the pose-level color for that layer only:

(-> (rdatasets/datasets-iris)
    (pj/pose :sepal-length :sepal-width {:color :species})
    (pj/lay-point {:color nil})
    (pj/lay-smooth {:stat :linear-model})
    kind/pprint)

{:mapping {:color :species, :x :sepal-length, :y :sepal-width},
 :layers
 [{:layer-type :point, :mapping {:color nil}}
  {:layer-type :smooth, :stat :linear-model}],
 :data
 https://vincentarelbundock.github.io/Rdatasets/csv/datasets/iris.csv [150 6]:

| :rownames | :sepal-length | :sepal-width | :petal-length | :petal-width |  :species |
|----------:|--------------:|-------------:|--------------:|-------------:|-----------|
|         1 |           5.1 |          3.5 |           1.4 |          0.2 |    setosa |
|         2 |           4.9 |          3.0 |           1.4 |          0.2 |    setosa |
|         3 |           4.7 |          3.2 |           1.3 |          0.2 |    setosa |
|         4 |           4.6 |          3.1 |           1.5 |          0.2 |    setosa |
|         5 |           5.0 |          3.6 |           1.4 |          0.2 |    setosa |
|         6 |           5.4 |          3.9 |           1.7 |          0.4 |    setosa |
|         7 |           4.6 |          3.4 |           1.4 |          0.3 |    setosa |
|         8 |           5.0 |          3.4 |           1.5 |          0.2 |    setosa |
|         9 |           4.4 |          2.9 |           1.4 |          0.2 |    setosa |
|        10 |           4.9 |          3.1 |           1.5 |          0.1 |    setosa |
|       ... |           ... |          ... |           ... |          ... |       ... |
|       140 |           6.9 |          3.1 |           5.4 |          2.1 | virginica |
|       141 |           6.7 |          3.1 |           5.6 |          2.4 | virginica |
|       142 |           6.9 |          3.1 |           5.1 |          2.3 | virginica |
|       143 |           5.8 |          2.7 |           5.1 |          1.9 | virginica |
|       144 |           6.8 |          3.2 |           5.9 |          2.3 | virginica |
|       145 |           6.7 |          3.3 |           5.7 |          2.5 | virginica |
|       146 |           6.7 |          3.0 |           5.2 |          2.3 | virginica |
|       147 |           6.3 |          2.5 |           5.0 |          1.9 | virginica |
|       148 |           6.5 |          3.0 |           5.2 |          2.0 | virginica |
|       149 |           6.2 |          3.4 |           5.4 |          2.3 | virginica |
|       150 |           5.9 |          3.0 |           5.1 |          1.8 | virginica |
}

The pose says :color :species. The point layer cancels it with nil – uncolored points. The smooth layer has no override, so it keeps the pose-level color – three lines.
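
The override rule can be modelled with ordinary map operations. The sketch below is an illustration in plain Clojure, not Plotje's implementation: resolve-layer-mapping is a hypothetical helper that merges the pose-level mapping with a layer's own mapping and then drops any entry whose value is nil.

```clojure
;; Hypothetical helper, not part of Plotje.
(defn resolve-layer-mapping
  "Merges pose-level and layer-level mappings. The layer wins on conflict,
  and a nil value erases the inherited entry."
  [pose-mapping layer-mapping]
  (into {}
        (remove (comp nil? val))
        (merge pose-mapping layer-mapping)))

(def pose-mapping {:x :sepal-length :y :sepal-width :color :species})

;; Point layer: the explicit nil cancels the inherited color.
(resolve-layer-mapping pose-mapping {:color nil})
;; => {:x :sepal-length, :y :sepal-width}

;; Smooth layer: no mapping of its own, so it inherits everything.
(resolve-layer-mapping pose-mapping nil)
;; => {:x :sepal-length, :y :sepal-width, :color :species}
```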

How scope is applied

The same scoping principle governs three things – mappings, layers, and data – but each combines differently when pose level meets layer level:

Mappings: pose and layer mappings merge; the innermost wins on conflict, and an explicit nil erases a mapping inherited from above.
Layers: every pj/lay-* accumulates a new layer; layers do not override, they pile up.
Data: the first argument to pj/pose or pj/lay-* sets the pose-level dataset; passing :data in a layer’s options overrides it for that layer alone (innermost non-nil wins).

Layer-level data

Pass :data in the options map of pj/lay-* to give that layer its own dataset:

(def setosa
  (tc/select-rows (rdatasets/datasets-iris)
                  #(= "setosa" (:species %))))

(def versicolor
  (tc/select-rows (rdatasets/datasets-iris)
                  #(= "versicolor" (:species %))))

(-> (rdatasets/datasets-iris)
    (pj/pose :sepal-length :sepal-width)
    (pj/lay-point {:data setosa})
    (pj/lay-smooth {:stat :linear-model :data versicolor}))

Printed, each layer carries its own :data alongside its :mapping; the pose-level :data remains as a default for any layer that does not override it:

(-> (rdatasets/datasets-iris)
    (pj/pose :sepal-length :sepal-width)
    (pj/lay-point {:data setosa})
    (pj/lay-smooth {:stat :linear-model :data versicolor})
    kind/pprint)

{:mapping {:x :sepal-length, :y :sepal-width},
 :layers
 [{:layer-type :point,
   :data
   https://vincentarelbundock.github.io/Rdatasets/csv/datasets/iris.csv [50 6]:

| :rownames | :sepal-length | :sepal-width | :petal-length | :petal-width | :species |
|----------:|--------------:|-------------:|--------------:|-------------:|----------|
|         1 |           5.1 |          3.5 |           1.4 |          0.2 |   setosa |
|         2 |           4.9 |          3.0 |           1.4 |          0.2 |   setosa |
|         3 |           4.7 |          3.2 |           1.3 |          0.2 |   setosa |
|         4 |           4.6 |          3.1 |           1.5 |          0.2 |   setosa |
|         5 |           5.0 |          3.6 |           1.4 |          0.2 |   setosa |
|         6 |           5.4 |          3.9 |           1.7 |          0.4 |   setosa |
|         7 |           4.6 |          3.4 |           1.4 |          0.3 |   setosa |
|         8 |           5.0 |          3.4 |           1.5 |          0.2 |   setosa |
|         9 |           4.4 |          2.9 |           1.4 |          0.2 |   setosa |
|        10 |           4.9 |          3.1 |           1.5 |          0.1 |   setosa |
|       ... |           ... |          ... |           ... |          ... |      ... |
|        40 |           5.1 |          3.4 |           1.5 |          0.2 |   setosa |
|        41 |           5.0 |          3.5 |           1.3 |          0.3 |   setosa |
|        42 |           4.5 |          2.3 |           1.3 |          0.3 |   setosa |
|        43 |           4.4 |          3.2 |           1.3 |          0.2 |   setosa |
|        44 |           5.0 |          3.5 |           1.6 |          0.6 |   setosa |
|        45 |           5.1 |          3.8 |           1.9 |          0.4 |   setosa |
|        46 |           4.8 |          3.0 |           1.4 |          0.3 |   setosa |
|        47 |           5.1 |          3.8 |           1.6 |          0.2 |   setosa |
|        48 |           4.6 |          3.2 |           1.4 |          0.2 |   setosa |
|        49 |           5.3 |          3.7 |           1.5 |          0.2 |   setosa |
|        50 |           5.0 |          3.3 |           1.4 |          0.2 |   setosa |
}
  {:layer-type :smooth,
   :stat :linear-model,
   :data
   https://vincentarelbundock.github.io/Rdatasets/csv/datasets/iris.csv [50 6]:

| :rownames | :sepal-length | :sepal-width | :petal-length | :petal-width |   :species |
|----------:|--------------:|-------------:|--------------:|-------------:|------------|
|        51 |           7.0 |          3.2 |           4.7 |          1.4 | versicolor |
|        52 |           6.4 |          3.2 |           4.5 |          1.5 | versicolor |
|        53 |           6.9 |          3.1 |           4.9 |          1.5 | versicolor |
|        54 |           5.5 |          2.3 |           4.0 |          1.3 | versicolor |
|        55 |           6.5 |          2.8 |           4.6 |          1.5 | versicolor |
|        56 |           5.7 |          2.8 |           4.5 |          1.3 | versicolor |
|        57 |           6.3 |          3.3 |           4.7 |          1.6 | versicolor |
|        58 |           4.9 |          2.4 |           3.3 |          1.0 | versicolor |
|        59 |           6.6 |          2.9 |           4.6 |          1.3 | versicolor |
|        60 |           5.2 |          2.7 |           3.9 |          1.4 | versicolor |
|       ... |           ... |          ... |           ... |          ... |        ... |
|        90 |           5.5 |          2.5 |           4.0 |          1.3 | versicolor |
|        91 |           5.5 |          2.6 |           4.4 |          1.2 | versicolor |
|        92 |           6.1 |          3.0 |           4.6 |          1.4 | versicolor |
|        93 |           5.8 |          2.6 |           4.0 |          1.2 | versicolor |
|        94 |           5.0 |          2.3 |           3.3 |          1.0 | versicolor |
|        95 |           5.6 |          2.7 |           4.2 |          1.3 | versicolor |
|        96 |           5.7 |          3.0 |           4.2 |          1.2 | versicolor |
|        97 |           5.7 |          2.9 |           4.2 |          1.3 | versicolor |
|        98 |           6.2 |          2.9 |           4.3 |          1.3 | versicolor |
|        99 |           5.1 |          2.5 |           3.0 |          1.1 | versicolor |
|       100 |           5.7 |          2.8 |           4.1 |          1.3 | versicolor |
}],
 :data
 https://vincentarelbundock.github.io/Rdatasets/csv/datasets/iris.csv [150 6]:

| :rownames | :sepal-length | :sepal-width | :petal-length | :petal-width |  :species |
|----------:|--------------:|-------------:|--------------:|-------------:|-----------|
|         1 |           5.1 |          3.5 |           1.4 |          0.2 |    setosa |
|         2 |           4.9 |          3.0 |           1.4 |          0.2 |    setosa |
|         3 |           4.7 |          3.2 |           1.3 |          0.2 |    setosa |
|         4 |           4.6 |          3.1 |           1.5 |          0.2 |    setosa |
|         5 |           5.0 |          3.6 |           1.4 |          0.2 |    setosa |
|         6 |           5.4 |          3.9 |           1.7 |          0.4 |    setosa |
|         7 |           4.6 |          3.4 |           1.4 |          0.3 |    setosa |
|         8 |           5.0 |          3.4 |           1.5 |          0.2 |    setosa |
|         9 |           4.4 |          2.9 |           1.4 |          0.2 |    setosa |
|        10 |           4.9 |          3.1 |           1.5 |          0.1 |    setosa |
|       ... |           ... |          ... |           ... |          ... |       ... |
|       140 |           6.9 |          3.1 |           5.4 |          2.1 | virginica |
|       141 |           6.7 |          3.1 |           5.6 |          2.4 | virginica |
|       142 |           6.9 |          3.1 |           5.1 |          2.3 | virginica |
|       143 |           5.8 |          2.7 |           5.1 |          1.9 | virginica |
|       144 |           6.8 |          3.2 |           5.9 |          2.3 | virginica |
|       145 |           6.7 |          3.3 |           5.7 |          2.5 | virginica |
|       146 |           6.7 |          3.0 |           5.2 |          2.3 | virginica |
|       147 |           6.3 |          2.5 |           5.0 |          1.9 | virginica |
|       148 |           6.5 |          3.0 |           5.2 |          2.0 | virginica |
|       149 |           6.2 |          3.4 |           5.4 |          2.3 | virginica |
|       150 |           5.9 |          3.0 |           5.1 |          1.8 | virginica |
}

Points from setosa (50 rows), regression from versicolor. Same pose, different data per layer.
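
The "innermost non-nil wins" rule for data is small enough to write down directly. This is a sketch of the rule only, with keywords standing in for real datasets; layer-data is a hypothetical name and not a Plotje function.

```clojure
;; Hypothetical helper, not part of Plotje.
(defn layer-data
  "Picks the dataset a layer sees: its own :data if present, else the pose's."
  [pose layer]
  (or (:data layer) (:data pose)))

;; Keywords stand in for datasets to keep the example short.
(def example-pose
  {:data   :all-iris
   :layers [{:layer-type :point :data :setosa-only}
            {:layer-type :smooth}]})

(mapv #(layer-data example-pose %) (:layers example-pose))
;; => [:setosa-only :all-iris]
```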

Faceting splits a single dataset into panels automatically:

(-> (rdatasets/datasets-iris)
    (pj/lay-point :sepal-length :sepal-width)
    (pj/facet :species))

Three panels, each with its own data subset.

Identity

Scope determines what each pose and layer sees. But how do layers find their pose? The rule is:

pj/lay-* with columns finds the most recent leaf pose whose position mappings match, or creates a new one if none matches.

This makes the threading pipeline sequential and predictable. When the columns match, the new layer attaches to the existing pose:

(-> (rdatasets/datasets-iris)
    (pj/lay-point :sepal-length :sepal-width)
    (pj/lay-smooth :sepal-length :sepal-width {:stat :linear-model}))

One pose, two layers: scatter points and a regression line sharing the same axes.
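
The matching rule can be pictured as a search over a vector of leaf poses. The sketch below is a simplified model in plain Clojure, not Plotje's code: attach-layer and position-of are hypothetical names, a pose is reduced to its :mapping and :layers, and a new pose is simply appended when nothing matches.

```clojure
;; Hypothetical model, not part of Plotje.
(defn position-of
  "The position part (:x and :y) of a pose's mapping."
  [pose]
  (select-keys (:mapping pose) [:x :y]))

(defn attach-layer
  "Adds `layer` to the most recent pose whose position matches `position`,
  or appends a new pose when none matches."
  [poses position layer]
  (let [match (->> (map-indexed vector poses)
                   (filter (fn [[_ pose]] (= position (position-of pose))))
                   last)]
    (if-let [[idx _] match]
      (update-in poses [idx :layers] conj layer)
      (conj poses {:mapping position :layers [layer]}))))

(def sepal {:x :sepal-length :y :sepal-width})
(def petal {:x :petal-length :y :petal-width})

(def example-poses
  (-> []
      (attach-layer sepal {:layer-type :point})    ; no match: new pose
      (attach-layer sepal {:layer-type :smooth})   ; match: same pose
      (attach-layer petal {:layer-type :point})))  ; no match: second pose

(mapv (comp count :layers) example-poses)
;; => [2 1]
```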

To place two plots with different columns side by side, pass pj/pose a vector of column pairs; each pair becomes its own panel:

(-> (rdatasets/datasets-iris)
    (pj/pose [[:sepal-length :sepal-width] [:petal-length :petal-width]])
    (pj/lay-point))

Printed, the two-panel outcome is a composite with two sub-poses:

(-> (rdatasets/datasets-iris)
    (pj/pose [[:sepal-length :sepal-width] [:petal-length :petal-width]])
    (pj/lay-point)
    kind/pprint)

{:layout {:direction :matrix},
 :data
 https://vincentarelbundock.github.io/Rdatasets/csv/datasets/iris.csv [150 6]:

| :rownames | :sepal-length | :sepal-width | :petal-length | :petal-width |  :species |
|----------:|--------------:|-------------:|--------------:|-------------:|-----------|
|         1 |           5.1 |          3.5 |           1.4 |          0.2 |    setosa |
|         2 |           4.9 |          3.0 |           1.4 |          0.2 |    setosa |
|         3 |           4.7 |          3.2 |           1.3 |          0.2 |    setosa |
|         4 |           4.6 |          3.1 |           1.5 |          0.2 |    setosa |
|         5 |           5.0 |          3.6 |           1.4 |          0.2 |    setosa |
|         6 |           5.4 |          3.9 |           1.7 |          0.4 |    setosa |
|         7 |           4.6 |          3.4 |           1.4 |          0.3 |    setosa |
|         8 |           5.0 |          3.4 |           1.5 |          0.2 |    setosa |
|         9 |           4.4 |          2.9 |           1.4 |          0.2 |    setosa |
|        10 |           4.9 |          3.1 |           1.5 |          0.1 |    setosa |
|       ... |           ... |          ... |           ... |          ... |       ... |
|       140 |           6.9 |          3.1 |           5.4 |          2.1 | virginica |
|       141 |           6.7 |          3.1 |           5.6 |          2.4 | virginica |
|       142 |           6.9 |          3.1 |           5.1 |          2.3 | virginica |
|       143 |           5.8 |          2.7 |           5.1 |          1.9 | virginica |
|       144 |           6.8 |          3.2 |           5.9 |          2.3 | virginica |
|       145 |           6.7 |          3.3 |           5.7 |          2.5 | virginica |
|       146 |           6.7 |          3.0 |           5.2 |          2.3 | virginica |
|       147 |           6.3 |          2.5 |           5.0 |          1.9 | virginica |
|       148 |           6.5 |          3.0 |           5.2 |          2.0 | virginica |
|       149 |           6.2 |          3.4 |           5.4 |          2.3 | virginica |
|       150 |           5.9 |          3.0 |           5.1 |          1.8 | virginica |
,
 :poses
 [{:mapping {:x :sepal-length, :y :sepal-width}, :layers []}
  {:mapping {:x :petal-length, :y :petal-width}, :layers []}],
 :layers [{:layer-type :point}]}

Two panels, arranged side by side. To combine independent poses with different layer kinds (a histogram and a density curve, say), use pj/arrange:

(pj/arrange
 [(-> (rdatasets/datasets-iris) (pj/lay-histogram :sepal-width))
  (-> (rdatasets/datasets-iris) (pj/lay-density :sepal-width))])

The Pose

The Poses chapter walked through the shape of a pose end to end. This is the per-field reference card – what each slot holds and which API call sets it:

Field	Contains	Set by
`:data`	the dataset	`pj/pose`, `pj/lay-*`, or `pj/with-data`
`:mapping`	pose-level mappings	`pj/pose`
`:layers`	layers attached to the pose	`pj/lay-*`
`:opts`	title, width, theme, scale, coord	`pj/options`, `pj/scale`, `pj/coord`

A composite pose adds :poses (sub-poses) and optionally :layout and :share-scales; see the Composition chapter for that shape.

Mark, Stat, and Position

Each layer has a layer-type – a rendering recipe with three parts:

Mark – the visual shape (point, bar, line, area, tile, …)
Stat – the computation before rendering (identity, bin, linear-model, density, …)
Position – how overlapping groups share space (identity, dodge, stack, …)

A keyword like :histogram or :point names a layer-type – look it up to see its parts:

(pj/layer-type-lookup :histogram)

{:mark :bar,
 :stat :bin,
 :x-only true,
 :accepts [:normalize :bins :binwidth],
 :doc "Histogram — bins numerical data into bars."}

A histogram: stat :bin groups values into ranges and counts them, mark :bar draws one bar per range:

(-> (rdatasets/datasets-iris)
    (pj/lay-histogram :sepal-length))

A smoother: the :smooth layer type pairs mark :line with stat :loess by default; pass {:stat :linear-model} to fit a regression line instead:

(pj/layer-type-lookup :smooth)

{:mark :line,
 :stat :loess,
 :accepts
 [:confidence-band
  :level
  :bootstrap-resamples
  :bandwidth
  :size
  :nudge-x
  :nudge-y],
 :doc
 "Smoothed trend line — defaults to LOESS; pass {:stat :linear-model} for OLS."}

Position :stack places groups on top of each other:

(-> {:day ["Mon" "Mon" "Tue" "Tue"]
     :count [30 20 45 15]
     :meal ["lunch" "dinner" "lunch" "dinner"]}
    (pj/lay-value-bar :day :count {:color :meal :position :stack}))
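
To see what stacking computes, the sketch below works out the vertical interval of each bar for the same meal data, in plain Clojure. The function stack is hypothetical, and the sketch assumes bars stack in row order within each day; which group Plotje actually places at the bottom is not stated here.

```clojure
(def bars
  [{:day "Mon" :meal "lunch"  :count 30}
   {:day "Mon" :meal "dinner" :count 20}
   {:day "Tue" :meal "lunch"  :count 45}
   {:day "Tue" :meal "dinner" :count 15}])

;; Hypothetical helper, not part of Plotje.
(defn stack
  "Within each x value, gives every row a [:y0 :y1] interval that starts
  where the previous row's interval ended."
  [rows x-key y-key]
  (mapcat (fn [[_ group]]
            (let [tops    (reductions + (map y-key group))
                  bottoms (cons 0 tops)]
              (map (fn [row y0 y1] (assoc row :y0 y0 :y1 y1))
                   group bottoms tops)))
          (group-by x-key rows)))

(def stacked (stack bars :day :count))
;; Monday: lunch spans 0 to 30, dinner spans 30 to 50.
;; Tuesday: lunch spans 0 to 45, dinner spans 45 to 60.
```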

See the Layer Types chapter for complete tables of every mark, stat, and position.
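
One way to think about a layer-type is as a map of defaults that the layer's options can override. The sketch below is a mental model only: the layer-types table copies the :mark and :stat values from the two lookups printed above, while layer-recipe is a hypothetical helper and not a Plotje function.

```clojure
;; Defaults copied from the lookup output above; other keys are omitted.
(def layer-types
  {:histogram {:mark :bar  :stat :bin}
   :smooth    {:mark :line :stat :loess}})

;; Hypothetical helper, not part of Plotje.
(defn layer-recipe
  "Starts from the layer-type's defaults and lets options override
  :mark, :stat, or :position."
  [layer-type options]
  (merge (get layer-types layer-type)
         (select-keys options [:mark :stat :position])))

(layer-recipe :smooth {})
;; => {:mark :line, :stat :loess}

(layer-recipe :smooth {:stat :linear-model})
;; => {:mark :line, :stat :linear-model}
```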

Inference

Plotje tries to make small poses work without you having to specify everything. You give it what you know – a dataset, perhaps a column or two – and it fills in the rest by looking at the data.

The underlying principle is short: resolved = your-choice, or else inferred-from-data. Wherever you make a choice it wins; wherever you don’t, the library picks something sensible.

Column inference kicks in when a dataset has up to three columns and you call pj/pose (or a pj/lay-*) without naming any column. Plotje pairs the columns with aesthetics in dataset order:

Columns in data	Inferred mapping
1	`:x`
2	`:x`, `:y`
3	`:x`, `:y`, `:color`

With four or more columns the library does not guess – you have to name the columns you want. Column inference is most useful for quick sketches of small, focused datasets.

(-> {:height [170 180 165 175] :weight [70 80 65 75]}
    pj/lay-point)
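
The "your choice, or else inferred" principle and the positional table can be sketched in a few lines of plain Clojure. The names infer-mapping and resolve-mapping are hypothetical; this models the rule as described in this section rather than reproducing Plotje's code.

```clojure
(def positional-aesthetics [:x :y :color])

;; Hypothetical helpers, not part of Plotje.
(defn infer-mapping
  "Pairs column names with :x, :y, :color by position.
  Returns nil for four or more columns, where nothing is guessed."
  [column-names]
  (when (<= 1 (count column-names) 3)
    (zipmap positional-aesthetics column-names)))

(defn resolve-mapping
  "An explicit mapping wins; otherwise fall back to positional inference."
  [explicit column-names]
  (or explicit (infer-mapping column-names)))

(resolve-mapping nil [:height :weight])
;; => {:x :height, :y :weight}

(resolve-mapping {:x :weight :y :height} [:height :weight])
;; => {:x :weight, :y :height}

(resolve-mapping nil [:a :b :c :d])
;; => nil
```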

Layer-type inference fires when a pose has no explicit layer. The library inspects the types of the columns the mapping refers to and picks a chart type that fits. Two numerical columns produce a scatter plot:

(-> (rdatasets/datasets-iris)
    (pj/pose :sepal-length :sepal-width))

A single numerical column produces a histogram:

(-> (rdatasets/datasets-iris)
    (pj/pose :sepal-length))

In both cases the inferred plot is the same one you would get from pj/lay-point or pj/lay-histogram. Inference is a shorthand, not a separate rendering path. Every inferred choice can be overridden – see Inference Rules for the full decision logic and override settings, or Architecture for where in the pipeline each kind of inference happens.
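
The two cases shown above can be written as a small decision function. This is a sketch that covers only those two cases; infer-layer-type is a hypothetical name, and the full decision logic lives in the Inference Rules chapter, not here.

```clojure
;; Hypothetical helper, not part of Plotje. Covers only the two cases above.
(defn infer-layer-type
  "Returns :point for two numerical columns, :histogram for one numerical
  column, and nil for anything this sketch does not cover."
  [dataset mapping]
  (let [numeric? (fn [column] (every? number? (get dataset column)))
        {:keys [x y]} mapping]
    (cond
      (and x y (numeric? x) (numeric? y)) :point
      (and x (nil? y) (numeric? x))       :histogram
      :else                               nil)))

(def iris-excerpt
  {:sepal-length [5.1 4.9 6.9]
   :sepal-width  [3.5 3.0 3.1]
   :species      ["setosa" "setosa" "virginica"]})

(infer-layer-type iris-excerpt {:x :sepal-length :y :sepal-width})
;; => :point

(infer-layer-type iris-excerpt {:x :sepal-length})
;; => :histogram
```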

Incremental Building

Because poses are plain data, you can save a partial plot and extend it later. Each call returns a new pose without changing the original.

(def scatter-base
  (-> (rdatasets/datasets-iris)
      (pj/lay-point :sepal-length :sepal-width)))

Add a regression line:

(-> scatter-base (pj/lay-smooth {:stat :linear-model}))

Or a LOESS smoother instead:

(-> scatter-base pj/lay-smooth)
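
The reason this works is ordinary Clojure immutability. The sketch below uses a plain map shaped like the printed poses earlier in this chapter; add-layer is a hypothetical stand-in for what a pj/lay-* call does to the :layers vector, not the library's actual code.

```clojure
;; A plain map shaped like a printed pose (data omitted).
(def base
  {:mapping {:x :sepal-length :y :sepal-width}
   :layers  [{:layer-type :point}]})

;; Hypothetical stand-in for pj/lay-*: returns a new map, leaves `pose` alone.
(defn add-layer
  [pose layer]
  (update pose :layers conj layer))

(def with-regression (add-layer base {:layer-type :smooth :stat :linear-model}))
(def with-loess      (add-layer base {:layer-type :smooth}))

(count (:layers base))            ;; => 1
(count (:layers with-regression)) ;; => 2
(count (:layers with-loess))      ;; => 2
```

Both extensions start from the same base, and neither one affects the other.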

Reusable Pose Templates

A pose does not need to carry data. (pj/pose) creates an empty pose you can evolve like any other – adding layers, options – and then attach a dataset at the end with pj/with-data. The result is a plotting instrument that can be applied to many datasets:

(def scatter-with-regression
  (-> (pj/pose nil {:x :x :y :y :color :group})
      pj/lay-point
      (pj/lay-smooth {:stat :linear-model})
      (pj/options {:title "Scatter with Regression"})))

Printed, the template has no :data entry – it is a pose that carries mapping, layers, and options but no data yet:

(kind/pprint scatter-with-regression)

{:mapping {:x :x, :y :y, :color :group},
 :layers
 [{:layer-type :point} {:layer-type :smooth, :stat :linear-model}],
 :opts {:title "Scatter with Regression"}}

Apply to one dataset:

(-> scatter-with-regression
    (pj/with-data {:x [1 2 3 4 5 6]
                   :y [2 4 3 5 6 8]
                   :group ["a" "a" "a" "b" "b" "b"]}))

Apply the same template to a different dataset:

(-> scatter-with-regression
    (pj/with-data {:x [10 20 30 40 50 60]
                   :y [15 18 22 20 25 28]
                   :group ["x" "x" "x" "y" "y" "y"]}))

pj/with-data validates at attach time: if the dataset is missing a column the pose references, you get a clear error naming the missing columns – no cryptic failure deep in the rendering path.
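
A check of this kind is easy to picture in plain Clojure. The sketch below is an assumption about the general shape of such validation, not pj/with-data itself: attach-data and missing-columns are hypothetical names, the dataset is a map of columns, and every value in the mapping is assumed to be a column reference.

```clojure
(require '[clojure.set :as set])

;; Hypothetical helpers, not part of Plotje.
(defn missing-columns
  "Columns the mapping refers to that the dataset does not have."
  [mapping dataset]
  (set/difference (set (vals mapping)) (set (keys dataset))))

(defn attach-data
  "Attaches `dataset` to `template`, or throws naming the missing columns."
  [template dataset]
  (let [missing (missing-columns (:mapping template) dataset)]
    (if (seq missing)
      (throw (ex-info (str "Dataset is missing columns: "
                           (pr-str (vec (sort missing))))
                      {:missing missing}))
      (assoc template :data dataset))))

(def template {:mapping {:x :x :y :y :color :group}})

;; Succeeds: all three columns are present.
(attach-data template {:x [1 2] :y [3 4] :group ["a" "b"]})

;; Throws ex-info whose data is {:missing #{:group}}.
(try
  (attach-data template {:x [1 2] :y [3 4]})
  (catch clojure.lang.ExceptionInfo e
    (ex-data e)))
;; => {:missing #{:group}}
```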

Color and Grouping

:color controls point and line colors. Its behavior depends on what you pass.

Categorical column – each unique value gets a distinct color. A legend maps labels to colors:

(-> (rdatasets/datasets-iris)
    (pj/lay-point :sepal-length :sepal-width {:color :species}))

Numeric column – values map to a continuous gradient:

(-> (rdatasets/datasets-iris)
    (pj/lay-point :sepal-length :sepal-width {:color :petal-length}))

Fixed color string – all points colored uniformly:

(-> (rdatasets/datasets-iris)
    (pj/lay-point :sepal-length :sepal-width {:color "steelblue"}))

Coming from ggplot2. In ggplot2, colour = "blue" set outside aes() is always a literal color. In Plotje, a string {:color "blue"} is interpreted as a column reference if a column with that exact name exists in the data, otherwise as a literal CSS color. Matching is strict: a string only matches a string column name, and a keyword only matches a keyword column name. Hex codes like "#0000ff" cannot collide with a column name and are unambiguous. A keyword {:color :blue} is always a column reference and throws if the column is missing.

The disambiguation matters when the dataset uses string column names. With a string column literally named "blue", the column wins – three palette colors render, not a single literal blue:

(-> (tc/dataset {"x" [1 2 3] "y" [1 2 3] "blue" ["a" "b" "c"]})
    (pj/lay-point "x" "y" {:color "blue"}))

Same string :color, but the dataset has no "blue" column, so "blue" parses as a literal CSS color:

(-> (tc/dataset {"x" [1 2 3] "y" [1 2 3]})
    (pj/lay-point "x" "y" {:color "blue"}))
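
The resolution rule described above fits in a few lines of plain Clojure. This is a sketch of the rule as stated, not Plotje's code: resolve-color and the shape of its return value are hypothetical.

```clojure
;; Hypothetical sketch of the :color resolution rule.
;; column-names: the dataset's column names (strings or keywords).
(defn resolve-color [column-names color]
  (let [cols (set column-names)]
    (cond
      ;; Exact match only: "blue" never matches :blue, and vice versa.
      (contains? cols color) {:kind :column :column color}
      ;; A keyword is always a column reference, so a miss is an error.
      (keyword? color) (throw (ex-info (str "Unknown column: " color)
                                       {:column color}))
      ;; Any other string falls through to a literal CSS color.
      :else {:kind :literal :css color})))

(resolve-color ["x" "y" "blue"] "blue") ;; => {:kind :column, :column "blue"}
(resolve-color ["x" "y"] "blue")        ;; => {:kind :literal, :css "blue"}
(resolve-color [:x :y :blue] "blue")    ;; => {:kind :literal, :css "blue"}
```

The third call shows the strictness: a keyword column :blue does not capture the string "blue".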

Categorical color does more than set colors – it creates groups. Each group is processed independently: it gets its own regression line, density curve, or boxplot:

(-> (rdatasets/datasets-iris)
    (pj/lay-density :sepal-length {:color :species}))

Other visual properties include :alpha (transparency), :size, and :shape. Each accepts a literal value or a column reference, the same way :color does.

Bubble plot – :size mapped to a numeric column gives each point a radius reflecting the value:

(-> (rdatasets/datasets-iris)
    (pj/lay-point :sepal-length :sepal-width
                  {:color :petal-length :size :petal-width :alpha 0.7}))

Shape by category – :shape mapped to a categorical column renders each group with a different marker. Useful for monochrome printing or to reinforce the color encoding:

(-> (rdatasets/datasets-iris)
    (pj/lay-point :sepal-length :sepal-width {:shape :species}))

The :group option creates groups without changing colors:

(-> (rdatasets/datasets-iris)
    (pj/pose :sepal-length :sepal-width {:group :species})
    pj/lay-point
    (pj/lay-smooth {:stat :linear-model}))

The result is three regression lines, one per species, all drawn in the same color.
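
What "each group is processed independently" means for a regression can be sketched without any plotting: split the rows by the grouping column, then fit each subset on its own. The helpers below are hypothetical and use an ordinary least-squares slope; they illustrate the idea and make no claim about how Plotje computes its :linear-model stat.

```clojure
;; Least-squares slope for a sequence of [x y] pairs.
(defn slope [points]
  (let [n   (count points)
        mx  (/ (reduce + (map first points)) n)
        my  (/ (reduce + (map second points)) n)
        sxy (reduce + (map (fn [[x y]] (* (- x mx) (- y my))) points))
        sxx (reduce + (map (fn [[x _]] (* (- x mx) (- x mx))) points))]
    (/ sxy sxx)))

;; Hypothetical helper: one slope per distinct value of group-key.
(defn slopes-by-group [rows group-key x-key y-key]
  (into {}
        (map (fn [[g rs]] [g (slope (map (juxt x-key y-key) rs))]))
        (group-by group-key rows)))

(slopes-by-group
 [{:g "a" :x 1 :y 2} {:g "a" :x 2 :y 4} {:g "a" :x 3 :y 6}
  {:g "b" :x 1 :y 3} {:g "b" :x 2 :y 2} {:g "b" :x 3 :y 1}]
 :g :x :y)
;; => {"a" 2, "b" -1}
```

Fitting all six rows together would hide the fact that the two groups trend in opposite directions, which is why grouping changes the statistics and not only the appearance.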

Plot Options and Annotations

So far you’ve seen mappings, layers, and data – all scoped at pose or layer level. The functions in this section set plot-level options instead: values that configure the whole rendered plot and cannot be scoped down. See Options and Scopes for the full picture.

pj/options sets plot-level settings – title, axis labels, size, theme overrides:

(-> (rdatasets/datasets-iris)
    (pj/lay-point :sepal-length :sepal-width {:color :species})
    (pj/options {:title "Iris Measurements"
                 :width 500 :palette :dark2}))

Reference lines and shaded bands are themselves layers, added with pj/lay-rule-h, pj/lay-rule-v, pj/lay-band-h, pj/lay-band-v. Positions come from the options map (:y-intercept / :x-intercept for rules; :y-min/:y-max or :x-min/:x-max for bands); appearance aesthetics like :color and :alpha work the same way they do on any other layer.

(-> (rdatasets/datasets-iris)
    (pj/lay-point :sepal-length :sepal-width {:color :species})
    (pj/lay-rule-h {:y-intercept 3.0})
    (pj/lay-band-v {:x-min 5.0 :x-max 6.0 :alpha 0.1}))

Printed, annotation layers carry their positions (:y-intercept, :x-min, :x-max) and appearance (:alpha) inside the :mapping slot, the same slot chart layers use for their mappings:

(-> (rdatasets/datasets-iris)
    (pj/lay-point :sepal-length :sepal-width {:color :species})
    (pj/lay-rule-h {:y-intercept 3.0})
    (pj/lay-band-v {:x-min 5.0 :x-max 6.0 :alpha 0.1})
    kind/pprint)

{:layers
 [{:layer-type :point, :mapping {:color :species}}
  {:layer-type :rule-h, :mapping {:y-intercept 3.0}}
  {:layer-type :band-v,
   :mapping {:x-min 5.0, :x-max 6.0, :alpha 0.1}}],
 :data
 https://vincentarelbundock.github.io/Rdatasets/csv/datasets/iris.csv [150 6]:

| :rownames | :sepal-length | :sepal-width | :petal-length | :petal-width |  :species |
|----------:|--------------:|-------------:|--------------:|-------------:|-----------|
|         1 |           5.1 |          3.5 |           1.4 |          0.2 |    setosa |
|         2 |           4.9 |          3.0 |           1.4 |          0.2 |    setosa |
|         3 |           4.7 |          3.2 |           1.3 |          0.2 |    setosa |
|         4 |           4.6 |          3.1 |           1.5 |          0.2 |    setosa |
|         5 |           5.0 |          3.6 |           1.4 |          0.2 |    setosa |
|         6 |           5.4 |          3.9 |           1.7 |          0.4 |    setosa |
|         7 |           4.6 |          3.4 |           1.4 |          0.3 |    setosa |
|         8 |           5.0 |          3.4 |           1.5 |          0.2 |    setosa |
|         9 |           4.4 |          2.9 |           1.4 |          0.2 |    setosa |
|        10 |           4.9 |          3.1 |           1.5 |          0.1 |    setosa |
|       ... |           ... |          ... |           ... |          ... |       ... |
|       140 |           6.9 |          3.1 |           5.4 |          2.1 | virginica |
|       141 |           6.7 |          3.1 |           5.6 |          2.4 | virginica |
|       142 |           6.9 |          3.1 |           5.1 |          2.3 | virginica |
|       143 |           5.8 |          2.7 |           5.1 |          1.9 | virginica |
|       144 |           6.8 |          3.2 |           5.9 |          2.3 | virginica |
|       145 |           6.7 |          3.3 |           5.7 |          2.5 | virginica |
|       146 |           6.7 |          3.0 |           5.2 |          2.3 | virginica |
|       147 |           6.3 |          2.5 |           5.0 |          1.9 | virginica |
|       148 |           6.5 |          3.0 |           5.2 |          2.0 | virginica |
|       149 |           6.2 |          3.4 |           5.4 |          2.3 | virginica |
|       150 |           5.9 |          3.0 |           5.1 |          1.8 | virginica |
,
 :mapping {:x :sepal-length, :y :sepal-width}}

See the Customization chapter for themes, palettes, and annotation appearance (default opacity, color overrides). Temporal intercepts (LocalDate, Instant) are covered in Timelines.

Coordinates and Scales

pj/coord sets the coordinate system. :flip swaps the axes:

(-> (rdatasets/datasets-iris)
    (pj/lay-point :sepal-length :sepal-width {:color :species})
    (pj/coord :flip))

:fixed locks the aspect ratio so one data unit on x equals one data unit on y – useful for spatial data or when distances on the two axes should read the same. The panel width adjusts to preserve the ratio:

(-> {:x [-1 1 -1 1] :y [-1 -1 1 1]}
    (pj/lay-point :x :y)
    (pj/coord :fixed))
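
The arithmetic behind a fixed aspect ratio is short enough to write out. Assuming the panel height is held constant and axis padding is ignored, the width follows from the ratio of the two data spans. fixed-panel-width is a hypothetical name used only for this sketch.

```clojure
;; Hypothetical sketch: width that makes one x unit as long as one y unit,
;; given the data ranges and a fixed panel height in pixels.
(defn fixed-panel-width [[x-min x-max] [y-min y-max] panel-height]
  (* panel-height
     (/ (- x-max x-min) (double (- y-max y-min)))))

(fixed-panel-width [-1 1] [-1 1] 400) ;; => 400.0, a square panel
(fixed-panel-width [0 10] [0 5] 300)  ;; => 600.0, twice as wide as tall
```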

pj/scale changes how a numeric axis is shown. :log applies a logarithmic transformation:

(-> {:population [1000 5000 50000 200000 1000000 5000000]
     :area [2 8 30 120 500 2100]}
    (pj/lay-point :population :area)
    (pj/scale :x :log)
    (pj/scale :y :log))
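
To see why a log scale helps with data like this, consider where each value lands on the axis. The sketch below positions a value by its base-10 logarithm, so every factor of ten takes up the same distance. log-position is a hypothetical helper; the base, domain, and pixel range are assumptions for illustration.

```clojure
;; Hypothetical sketch: pixel offset of v on a log10 axis spanning [lo, hi].
(defn log-position [[lo hi] pixels v]
  (let [l (Math/log10 (double lo))
        h (Math/log10 (double hi))]
    (* pixels (/ (- (Math/log10 (double v)) l) (- h l)))))

(log-position [1 100] 200 1)   ;; => 0.0
(log-position [1 100] 200 10)  ;; => 100.0
(log-position [1 100] 200 100) ;; => 200.0
```

On a linear axis the value 10 would sit about 18 pixels from the left edge of the same 200-pixel range; on the log axis it sits in the middle.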

Both are plot-level – they apply uniformly across the whole pose.

Faceting and Multi-Panel Layouts

Faceting splits a pose into panels by a column’s values:

(-> (rdatasets/datasets-iris)
    (pj/pose :sepal-length :sepal-width)
    (pj/facet :species)
    pj/lay-point
    (pj/lay-smooth {:stat :linear-model}))

Printed, the facet column lives in :opts as :facet-col – the pose itself is not split until render time:

(-> (rdatasets/datasets-iris)
    (pj/pose :sepal-length :sepal-width)
    (pj/facet :species)
    pj/lay-point
    (pj/lay-smooth {:stat :linear-model})
    kind/pprint)

{:mapping {:x :sepal-length, :y :sepal-width},
 :layers
 [{:layer-type :point} {:layer-type :smooth, :stat :linear-model}],
 :data
 https://vincentarelbundock.github.io/Rdatasets/csv/datasets/iris.csv [150 6]:

| :rownames | :sepal-length | :sepal-width | :petal-length | :petal-width |  :species |
|----------:|--------------:|-------------:|--------------:|-------------:|-----------|
|         1 |           5.1 |          3.5 |           1.4 |          0.2 |    setosa |
|         2 |           4.9 |          3.0 |           1.4 |          0.2 |    setosa |
|         3 |           4.7 |          3.2 |           1.3 |          0.2 |    setosa |
|         4 |           4.6 |          3.1 |           1.5 |          0.2 |    setosa |
|         5 |           5.0 |          3.6 |           1.4 |          0.2 |    setosa |
|         6 |           5.4 |          3.9 |           1.7 |          0.4 |    setosa |
|         7 |           4.6 |          3.4 |           1.4 |          0.3 |    setosa |
|         8 |           5.0 |          3.4 |           1.5 |          0.2 |    setosa |
|         9 |           4.4 |          2.9 |           1.4 |          0.2 |    setosa |
|        10 |           4.9 |          3.1 |           1.5 |          0.1 |    setosa |
|       ... |           ... |          ... |           ... |          ... |       ... |
|       140 |           6.9 |          3.1 |           5.4 |          2.1 | virginica |
|       141 |           6.7 |          3.1 |           5.6 |          2.4 | virginica |
|       142 |           6.9 |          3.1 |           5.1 |          2.3 | virginica |
|       143 |           5.8 |          2.7 |           5.1 |          1.9 | virginica |
|       144 |           6.8 |          3.2 |           5.9 |          2.3 | virginica |
|       145 |           6.7 |          3.3 |           5.7 |          2.5 | virginica |
|       146 |           6.7 |          3.0 |           5.2 |          2.3 | virginica |
|       147 |           6.3 |          2.5 |           5.0 |          1.9 | virginica |
|       148 |           6.5 |          3.0 |           5.2 |          2.0 | virginica |
|       149 |           6.2 |          3.4 |           5.4 |          2.3 | virginica |
|       150 |           5.9 |          3.0 |           5.1 |          1.8 | virginica |
,
 :opts {:facet-col :species}}

A vector of column names creates one panel per variable:

(-> (rdatasets/datasets-iris)
    (pj/lay-histogram [:sepal-length :sepal-width :petal-length]))

Printed, each named column becomes a sub-pose with its own x mapping; the bare pj/lay-histogram attaches at the root and flows into every panel at plan time:

(-> (rdatasets/datasets-iris)
    (pj/lay-histogram [:sepal-length :sepal-width :petal-length])
    kind/pprint)

{:layout {:direction :matrix},
 :data
 https://vincentarelbundock.github.io/Rdatasets/csv/datasets/iris.csv [150 6]:

| :rownames | :sepal-length | :sepal-width | :petal-length | :petal-width |  :species |
|----------:|--------------:|-------------:|--------------:|-------------:|-----------|
|         1 |           5.1 |          3.5 |           1.4 |          0.2 |    setosa |
|         2 |           4.9 |          3.0 |           1.4 |          0.2 |    setosa |
|         3 |           4.7 |          3.2 |           1.3 |          0.2 |    setosa |
|         4 |           4.6 |          3.1 |           1.5 |          0.2 |    setosa |
|         5 |           5.0 |          3.6 |           1.4 |          0.2 |    setosa |
|         6 |           5.4 |          3.9 |           1.7 |          0.4 |    setosa |
|         7 |           4.6 |          3.4 |           1.4 |          0.3 |    setosa |
|         8 |           5.0 |          3.4 |           1.5 |          0.2 |    setosa |
|         9 |           4.4 |          2.9 |           1.4 |          0.2 |    setosa |
|        10 |           4.9 |          3.1 |           1.5 |          0.1 |    setosa |
|       ... |           ... |          ... |           ... |          ... |       ... |
|       140 |           6.9 |          3.1 |           5.4 |          2.1 | virginica |
|       141 |           6.7 |          3.1 |           5.6 |          2.4 | virginica |
|       142 |           6.9 |          3.1 |           5.1 |          2.3 | virginica |
|       143 |           5.8 |          2.7 |           5.1 |          1.9 | virginica |
|       144 |           6.8 |          3.2 |           5.9 |          2.3 | virginica |
|       145 |           6.7 |          3.3 |           5.7 |          2.5 | virginica |
|       146 |           6.7 |          3.0 |           5.2 |          2.3 | virginica |
|       147 |           6.3 |          2.5 |           5.0 |          1.9 | virginica |
|       148 |           6.5 |          3.0 |           5.2 |          2.0 | virginica |
|       149 |           6.2 |          3.4 |           5.4 |          2.3 | virginica |
|       150 |           5.9 |          3.0 |           5.1 |          1.8 | virginica |
,
 :poses
 [{:mapping {:x :sepal-length}, :layers []}
  {:mapping {:x :sepal-width}, :layers []}
  {:mapping {:x :petal-length}, :layers []}],
 :layers [{:layer-type :histogram}]}
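
One way to picture root layers "flowing" into each panel is a function from the printed structure to a per-panel view. The sketch below is an illustration only: effective-poses is hypothetical, and both the merge of root and sub-pose mappings and the ordering of root layers before sub-pose layers are assumptions, not documented Plotje behavior.

```clojure
;; Hypothetical sketch: resolve what each sub-pose would draw, given a
;; composite pose shaped like the printed output above (minus :data).
(defn effective-poses [{:keys [mapping layers poses]}]
  (mapv (fn [sub]
          {:mapping (merge mapping (:mapping sub))        ; sub-pose wins on conflict
           :layers  (into (vec layers) (:layers sub))})   ; root layers first
        poses))

(effective-poses
 {:poses  [{:mapping {:x :sepal-length} :layers []}
           {:mapping {:x :sepal-width} :layers []}]
  :layers [{:layer-type :histogram}]})
;; => [{:mapping {:x :sepal-length}, :layers [{:layer-type :histogram}]}
;;     {:mapping {:x :sepal-width}, :layers [{:layer-type :histogram}]}]
```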

To place whole poses side by side, use pj/arrange:

(pj/arrange
 [(-> (rdatasets/datasets-iris)
      (pj/lay-point :sepal-length :sepal-width))
  (-> (rdatasets/datasets-iris)
      (pj/lay-point :petal-length :petal-width))])

Each sub-pose inside pj/arrange can have its own data, mapping, layers, and options – they are independent plots tiled into a single rendered image.

What’s Next

Composition – composite poses, shared scales, and multi-panel patterns
Options and Scopes – where options live and how scope determines what they reach
Pose Rules – 30 rules that formalize the model with tested assertions

source: notebooks/plotje_book/core_concepts.clj