9  Inference Rules

Plotje infers many parameters automatically so you can write less and get reasonable defaults. This notebook walks through each rule with a worked example: a small pose, the rendered plot, and a description of what was inferred. Every rule is also checked against the resolved plot on every run, so the claims here stay honest as the library evolves.

This chapter is a reference: each rule with its default and its override. For the conceptual overview, read Poses and Core Concepts first. The examples use small inline datasets so the relationships are easy to read at a glance.

(ns plotje-book.inference-rules
  (:require
   ;; Tablecloth -- dataset manipulation
   [tablecloth.api :as tc]
   ;; Kindly -- notebook rendering protocol
   [scicloj.kindly.v4.kind :as kind]
   ;; Plotje -- composable plotting
   [scicloj.plotje.api :as pj]
   ;; Rdatasets -- standard datasets
   [scicloj.metamorph.ml.rdatasets :as rdatasets]))

A Worked Example

Before the rule-by-rule tour, here is what “inference” looks like in practice: a five-point scatter where Plotje filled in almost everything for us.

(def five-points
  {:x [1.0 2.0 3.0 4.0 5.0]
   :y [2.1 4.3 3.0 5.2 4.8]})
(def scatter-pose
  (-> five-points
      (pj/lay-point :x :y)))
scatter-pose
[Plot: scatter of y against x]

Notice what was inferred:

  • The x-axis label "x" and y-axis label "y", taken from the column keywords

  • A linear scale on each axis, since both columns are numeric

  • The data range [1.0, 5.0] widened to [0.8, 5.2] – a 5% padding so the extreme points do not sit on the panel edge

  • Round tick values: 1.0, 1.5, 2.0, ...

  • No legend, since no color mapping was given

  • A single point group rendered in the default color (dark gray, #333)

Each of those decisions is its own inference rule, with a default and an explicit override.

Overrides at a Glance

Every inference rule has an explicit override. The table below lists them all – scan it to find what you need, then jump to the matching section for the details and worked examples.

What is inferred | Default | Override
Column selection | one column fills x; two fill x, y; three fill x, y, color | explicit column args in pj/pose or pj/lay-*
Column type | dtype inspection | :x-type, :y-type, :color-type in pose or layer options
Aesthetic classification | keyword = column, string = color/column | explicit :color keyword vs hex string
Grouping | categorical color column | :group aesthetic
Layer type (mark + stat) | column types (see Layer Type section) | pj/lay-point, pj/lay-histogram, etc.
Domain extent | data range + 5% padding | (pj/scale pose :x {:domain [0 10]})
Domain zero-anchor | bar/stacked charts include zero | (pj/scale pose :y {:domain [5 20]})
Fill domain | [0.0, 1.0] for fill position | (pj/scale pose :y {:domain [0 2]})
Tick values | round intervals (linear), powers of 10 (log) | wadogo scale configuration
Tick labels | number formatting, calendar formatting | wadogo label formatting
Axis labels | column name, with underscores and hyphens replaced by spaces | (pj/options {:x-label "Custom"})
Color legend | categorical = discrete, numerical = continuous, none = no legend | :color mapping controls presence
Size legend | 5 graduated circles when :size maps to numerical column | :size mapping controls presence
Alpha legend | about 5 graduated opacity squares when :alpha maps to numerical column | :alpha mapping controls presence
Layout padding | adjusts for title, labels, legend | :width, :height in options
Layout type | single, facet-grid, multi-variable | pj/facet, multiple x-y pairs
Coordinate system | :cartesian | (pj/coord :flip), (pj/coord :polar)

The sections below walk through each rule in detail. The order roughly follows how a pose is resolved into a plot – column selection, column types, aesthetics, grouping, layer type, domains, ticks, labels, legends, layout, coord flip – with two cross-cutting closing sections on how the rules combine in multi-layer plots and a diagram of the full resolution flow.

Column Selection

When column names are omitted, Plotje infers them from the dataset shape:

Number of columns | Inferred mapping
1 | first column becomes x
2 | first becomes x, second becomes y
3 | first becomes x, second becomes y, third becomes color
4+ | no inference – see the note below

The same rule applies whether you start with pj/lay-* on raw data or pj/pose on raw data. Both read the first 1-3 columns of the dataset in the order they appear and build the mapping from there.
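The rule in the table can be read as a small function. The sketch below is not Plotje's source: `infer-mapping` is a hypothetical name, and it assumes the column names arrive as an ordered vector.

```clojure
;; Hypothetical sketch of the column-selection rule, not Plotje's implementation.
(defn infer-mapping
  "Maps the first 1-3 column names to :x, :y, :color.
  Returns nil when there are no columns or four or more."
  [column-names]
  (when (<= 1 (count column-names) 3)
    ;; zipmap stops at the shorter sequence, so one column fills only :x
    (zipmap [:x :y :color] column-names)))

(infer-mapping [:values])      ;; => {:x :values}
(infer-mapping [:x :y])        ;; => {:x :x, :y :y}
(infer-mapping [:x :y :g])     ;; => {:x :x, :y :y, :color :g}
(infer-mapping [:a :b :c :d])  ;; => nil
```

Returning nil for four or more columns mirrors the "no inference" row; what the caller does next (throw or build an unmapped pose) is a separate decision.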

One column:

(-> {:values [1 2 3 4 5 6]}
    pj/lay-histogram)
[Plot: histogram of values]

Two columns:

(-> {:x [1 2 3 4 5] :y [2 4 3 5 4]}
    pj/lay-point)
[Plot: scatter of y against x]

Three columns – the third becomes :color:

(-> {:x [1 2 3 4] :y [4 5 6 7] :g ["a" "a" "b" "b"]}
    pj/lay-point)
[Plot: scatter of y against x, colored by g (legend: a, b)]

Pose construction infers the same mapping

Calling pj/pose on raw data without explicit column arguments runs the same column-selection rule. A 1-3 column dataset gets its mapping filled in; the resulting pose carries the mapping but has no layer attached yet, so layer type inference (covered below) supplies the mark at render time.

(def two-col-pose
  (pj/pose {:x [1.0 2.0 3.0 4.0 5.0]
            :y [1.0 4.0 9.0 16.0 25.0]}))
two-col-pose
[Plot: scatter of y against x]

The inferred mapping is visible on the pose itself:

(-> two-col-pose (select-keys [:mapping :layers]) kind/pprint)
{:mapping {:x :x, :y :y}, :layers []}

4+ columns

With four or more columns there is no unambiguous default, so inference stops:

  • (pj/lay-* data) throws with a message listing the available columns, asking you to pass explicit :x and :y.

  • (pj/pose data) is gentler – it builds a pose with the data attached but no mapping, so you can add one downstream with (pj/pose pose :col-a :col-b) or (pj/lay-point pose :col-a :col-b).

When you provide explicit columns, inference is skipped – you are in full control:

(-> (rdatasets/datasets-iris)
    (pj/lay-point :petal-length :petal-width {:color :species}))
[Plot: scatter of petal width against petal length, colored by species (legend: setosa, versicolor, virginica)]

Column Types

Once columns are selected, the next step is determining the type of each column – numerical, categorical, or temporal. This determines the scale type, domain, tick style, and the default mark.

Column dtype | Inferred type
float, int | :numerical
string, keyword, boolean, symbol, text | :categorical
LocalDate, LocalDateTime, Instant, java.util.Date | :temporal (numerical, with calendar-aware ticks)
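As a rough illustration of the table, the sketch below classifies a column by looking at its first non-nil value. This is an assumption made for the example: Plotje inspects the column dtype, and `value-type` and `column-type` are hypothetical names.

```clojure
(import '(java.time Instant LocalDate LocalDateTime)
        '(java.util Date))

;; Hypothetical sketch: classify by value, as a stand-in for dtype inspection.
(defn value-type [v]
  (cond
    (number? v) :numerical
    (or (string? v) (keyword? v) (boolean? v) (symbol? v)) :categorical
    (or (instance? Date v) (instance? LocalDate v)
        (instance? LocalDateTime v) (instance? Instant v)) :temporal
    :else nil))

(defn column-type
  "Type of the first non-nil value in the column, or nil if there is none."
  [values]
  (value-type (first (remove nil? values))))

(column-type [12 8 15 5])                  ;; => :numerical
(column-type ["cat" "dog" "bird" "fish"])  ;; => :categorical
(column-type [#inst "2024-01-01"])         ;; => :temporal
```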

A categorical column produces a band scale with string domain values. Compare:

(def animals
  {:animal ["cat" "dog" "bird" "fish"]
   :count [12 8 15 5]})
(def bar-pose
  (-> animals
      (pj/lay-value-bar :animal :count)))
bar-pose
[Plot: bar chart of count by animal (cat, dog, bird, fish)]

The x-axis lays out the four animal names in order of appearance – strings, treated as a categorical band scale. The y-axis starts at zero because this is a bar chart.

Temporal columns

Dates are detected and converted to epoch-milliseconds internally, with calendar-aware tick labels. Clojure’s #inst reader literal is a convenient way to write dates:

(def temporal-pose
  (-> {:date [#inst "2024-01-01" #inst "2024-06-01" #inst "2024-12-01"]
       :val [10 25 18]}
      (pj/lay-point :date :val)))
temporal-pose
[Plot: scatter of val against date, with tick labels from Feb-01 to Nov-01]

The x-axis carries epoch-millisecond numbers internally, but the 10 tick labels show human-readable dates like "Feb-01". Plotje accepts java.util.Date (from #inst), LocalDate, LocalDateTime, and Instant – all are converted to epoch-milliseconds for plotting, with calendar-aware tick formatting.
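The conversion to epoch-milliseconds can be sketched with plain Java interop. The function name `->epoch-ms` is hypothetical, and treating zone-less values (LocalDate, LocalDateTime) as UTC is an assumption of this example, not a documented Plotje behavior.

```clojure
(import '(java.time Instant LocalDate LocalDateTime ZoneOffset))

;; Hypothetical sketch: one number line for all four supported temporal types.
(defn ->epoch-ms [t]
  (cond
    (instance? java.util.Date t) (.getTime ^java.util.Date t)
    (instance? Instant t)        (.toEpochMilli ^Instant t)
    ;; Assumption: zone-less values are interpreted as UTC.
    (instance? LocalDateTime t)  (.toEpochMilli
                                  (.toInstant ^LocalDateTime t ZoneOffset/UTC))
    (instance? LocalDate t)      (.toEpochMilli
                                  (.toInstant
                                   (.atStartOfDay ^LocalDate t ZoneOffset/UTC)))
    :else (throw (ex-info "Not a temporal value" {:value t}))))

(->epoch-ms #inst "2024-01-01")        ;; => 1704067200000
(->epoch-ms (LocalDate/of 2024 1 1))   ;; => 1704067200000
```

Once every value is a plain number, the same linear scale machinery applies; only the tick labels need calendar awareness.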

Overriding inferred types with :x-type / :y-type

Sometimes a numeric column is really categorical – for example, hours of the day, years, or subject IDs. The inference system sees numbers and treats them as numerical, but you may want discrete categorical bands. Pass :x-type :categorical (or :y-type) to the pose or layer options to override:

(def hour-bar-pose
  (-> {:hour [9 10 11 12] :count [5 8 12 7]}
      (pj/lay-value-bar :hour :count {:x-type :categorical})))
hour-bar-pose
[Plot: bar chart of count by hour (9, 10, 11, 12)]

Four bars at discrete hour bands. Without the override, lay-value-bar would reject the numeric :hour column; with it, the column is treated as categorical (values cast to strings for display). The same override exists for :y-type and for :color-type (see the Grouping section below for a :color-type example).

Aesthetic Resolution

The :color parameter triggers different behaviors depending on what you pass. Each aesthetic channel (:color, :size, :alpha, :text) is classified as either a column reference or a fixed literal.

Column reference – colored by palette

(def colored-pose
  (-> {:x [1 2 3 4 5 6]
       :y [3 5 4 7 6 8]
       :g ["a" "a" "a" "b" "b" "b"]}
      (pj/lay-point :x :y {:color :g})))
colored-pose
[Plot: scatter of y against x, colored by g (legend: a, b)]

The categorical column :g splits the data into two groups, each with its own color drawn from the palette. A legend appears on the right (100 pixels wide) and the panel shrinks to make room.

The next section explores why a categorical color column triggers grouping while a numeric color column does not.

Fixed color string – single color, no legend

(def fixed-color-pose
  (-> five-points
      (pj/lay-point :x :y {:color "#E74C3C"})))
fixed-color-pose
[Plot: scatter of y against x, all points in #E74C3C]

A literal hex string maps every point to that single color: no grouping, no legend, no legend strip. The hex was parsed into the RGBA tuple [0.906 0.298 0.235 1.0].
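The arithmetic behind that tuple is a division of each two-digit channel by 255. A minimal sketch, assuming a six-digit hex string; `hex->rgba` is a hypothetical helper, and the rounding to three decimals is only there to match the figures quoted above.

```clojure
;; Hypothetical sketch: "#RRGGBB" -> [r g b a] with channels in [0, 1].
(defn hex->rgba [hex]
  (let [digits  (if (= \# (first hex)) (subs hex 1) hex)
        channel (fn [i]
                  (let [v (Integer/parseInt (subs digits i (+ i 2)) 16)]
                    ;; scale 0-255 to 0-1, rounded to three decimals
                    (/ (Math/round (* 1000.0 (/ v 255.0))) 1000.0)))]
    [(channel 0) (channel 2) (channel 4) 1.0]))

;; E7 = 231, 4C = 76, 3C = 60
(hex->rgba "#E74C3C")  ;; => [0.906 0.298 0.235 1.0]
```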

Named colors and string disambiguation

CSS color names like "red" and "steelblue" also work as fixed colors:

(-> five-points
    (pj/lay-point :x :y {:color "steelblue"}))
[Plot: scatter of y against x, all points in steelblue]

This raises a question: since :color also accepts column names as strings (like "species"), how does the system decide whether "red" means the column :red or the color red?

The rule is: check the dataset first. If the string matches a column name in the dataset, it is treated as a column reference. Otherwise, it is treated as a color value – first trying hex parsing, then CSS color name lookup.

Here is the full resolution order for a string :color value:

  1. If the string matches a dataset column, it is a column reference (grouping)
  2. If it starts with #, it is a hex color ("#E74C3C", "#F00")
  3. If it parses as hex without #, it is a hex color ("00FF00")
  4. If it matches a CSS color name, it is a named color ("red", "steelblue")
  5. Otherwise, error with a helpful message

In practice, ambiguity is rare. Column names like "species" or "temperature" are not valid CSS colors, and color names like "red" are unlikely column names. When true ambiguity exists, use a keyword for the column (:red) or a hex string for the color ("#FF0000").
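The five-step resolution order can be written as a single `cond`. This is a sketch of the rule as described, not Plotje's code: `classify-color-string` is a hypothetical name and `css-color-names` is a deliberately tiny stand-in for the full CSS list.

```clojure
(require '[clojure.string :as str])

;; Tiny illustrative subset; the real CSS list has well over a hundred names.
(def css-color-names #{"red" "green" "blue" "black" "white" "steelblue"})

;; Hypothetical sketch of the resolution order for a string :color value.
(defn classify-color-string [column-names s]
  (let [columns (set (map name column-names))]
    (cond
      ;; 1. the dataset wins: a matching column is a column reference
      (contains? columns s) {:kind :column :value s}
      ;; 2. hex with a leading #
      (re-matches #"#([0-9a-fA-F]{3}|[0-9a-fA-F]{6})" s) {:kind :hex :value s}
      ;; 3. hex without the #
      (re-matches #"[0-9a-fA-F]{6}" s) {:kind :hex :value (str "#" s)}
      ;; 4. CSS color name
      (contains? css-color-names (str/lower-case s))
      {:kind :named :value (str/lower-case s)}
      ;; 5. nothing matched
      :else (throw (ex-info (str "Not a column or a color: " s)
                            {:value s :columns columns})))))

(classify-color-string [:x :y] "red")     ;; => {:kind :named, :value "red"}
(classify-color-string [:x :red] "red")   ;; => {:kind :column, :value "red"}
```

The second call shows the one ambiguous case: the same string becomes a column reference as soon as the dataset has a column of that name.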

Verify: "red" is a fixed color when the dataset has no red column:

(def red-color-pose
  (-> five-points
      (pj/lay-point :x :y {:color "red"})))
red-color-pose
[Plot: scatter of y against x, all points in red]

No legend, points drawn red – treated as a fixed color, not a column.

No color – default gray

The Worked Example at the top of the chapter shows this case: with no :color mapping, all points render in the default dark gray (#333) and no legend appears.

Grouping

The colored examples above all rest on the same concept: grouping controls how data is split into independent subsets. Each group gets its own visual elements – its own set of points, its own regression line, its own density curve, its own bar in a dodged layout.

Grouping can be derived (from a categorical :color mapping) or explicit (via the :group aesthetic).

Categorical color implies grouping

When :color maps to a categorical column (as with colored-pose above), the data is split into one group per category. Each group gets a distinct palette color and a legend entry:

colored-pose
[Plot: scatter of y against x, colored by g (legend: a, b)]

Two groups, two legend entries – one per category in :g.

Numeric color does not create groups

When :color maps to a numerical column, data is NOT split. Instead, each point gets an individual color from a continuous gradient. There is one group, and the legend is continuous with 20 pre-computed color stops.

(def numeric-color-pose
  (-> {:x [1 2 3 4 5]
       :y [2 4 3 5 4]
       :val [10 20 30 40 50]}
      (pj/lay-point :x :y {:color :val})))
numeric-color-pose
[Plot: scatter of y against x, colored by val (continuous legend from 10.00 to 50.00)]

A single group with a continuous legend of 20 color stops – the color is a visual encoding, not a grouping variable.

Overriding color type with :color-type

Sometimes a numeric column is really a categorical identifier – for example, subject IDs in a repeated-measures study. Inference treats numeric columns as continuous, but you want discrete groups. Setting :color-type :categorical overrides this so the column is treated as categorical despite its numeric dtype.

This is a core principle of the library: inference provides good defaults, but the user can always override.

(def study-data
  {:subject [1 1 1 2 2 2 3 3 3]
   :day     [1 2 3 1 2 3 1 2 3]
   :score   [5 7 6 3 4 5 8 9 7]})

Without override – one group, continuous gradient:

(def study-continuous-pose
  (-> study-data
      (pj/lay-line :day :score {:color :subject})))
study-continuous-pose
[Plot: line of score against day, colored by subject (continuous legend from 1.000 to 3.000)]

With :color-type :categorical – three groups, one per subject:

(def study-categorical-pose
  (-> study-data
      (pj/lay-line :day :score {:color :subject
                                :color-type :categorical})))
study-categorical-pose
[Plot: lines of score against day, colored by subject (legend: 1, 2, 3)]

The same data, the same columns – but :color-type :categorical changes inference from “one gradient” to “three distinct groups.” This affects grouping, line splitting, legend style, and palette assignment. The rendered plots look completely different:

(-> {:subject [1 1 1 2 2 2 3 3 3]
     :day     [1 2 3 1 2 3 1 2 3]
     :score   [5 7 6 3 4 5 8 9 7]}
    (pj/lay-line :day :score {:color :subject
                              :color-type :categorical})
    pj/lay-point
    (pj/options {:title "Scores by Subject (categorical override)"}))
[Plot: "Scores by Subject (categorical override)"; lines and points of score against day, colored by subject (legend: 1, 2, 3)]

Explicit grouping with :group

The :group aesthetic splits data into groups without assigning distinct colors or creating a legend. This is useful when you want per-group statistics but uniform appearance.

(def grouped-data
  {:x [1 2 3 4 5 6]
   :y [3 5 4 7 6 8]
   :g ["a" "a" "a" "b" "b" "b"]})
(def explicit-group-pose
  (-> grouped-data
      (pj/lay-point :x :y {:group :g})))
explicit-group-pose
[Plot: scatter of y against x, uniform color, no legend]

Two groups, but no legend and no color differentiation. Use :group when you need separate statistical fits but want a uniform visual style.
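The grouping rules in this section (categorical color groups, numeric color does not, :color-type can override, :group always groups) can be condensed into one function. This is a hypothetical sketch: `grouping-columns` is an invented name, and it assumes column references are keywords and that inferred column types are passed in as a map.

```clojure
;; Hypothetical sketch: which columns split the data into groups?
(defn grouping-columns
  [{:keys [color group color-type]} inferred-types]
  (let [;; a string :color is treated here as a fixed literal, never a column
        color-kind (when (keyword? color)
                     ;; an explicit :color-type beats the inferred type
                     (or color-type (get inferred-types color)))]
    (vec (distinct
          (cond-> []
            group                       (conj group)
            (= :categorical color-kind) (conj color))))))

(grouping-columns {:color :g} {:g :categorical})      ;; => [:g]
(grouping-columns {:color :val} {:val :numerical})    ;; => []
(grouping-columns {:color :subject :color-type :categorical}
                  {:subject :numerical})              ;; => [:subject]
(grouping-columns {:group :g} {:g :categorical})      ;; => [:g]
```

An empty result means a single group: every row takes part in the same statistic.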

What grouping affects

Grouping determines how statistical transformations operate. Without grouping, (pj/lay-smooth {:stat :linear-model}) fits one regression line through all the data. With grouping, it fits one line per group.

One regression line – no grouping:

(-> grouped-data
    (pj/pose :x :y)
    pj/lay-point
    (pj/lay-smooth {:stat :linear-model}))
[Plot: scatter of y against x with one regression line]

Two regression lines – grouped by color:

(-> grouped-data
    (pj/pose :x :y {:color :g})
    pj/lay-point
    (pj/lay-smooth {:stat :linear-model}))
[Plot: scatter of y against x, colored by g (legend: a, b), with one regression line per group]

The same applies to other statistics: density curves, LOESS smoothers, boxplots, and dodge/stack positioning all operate per group.
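To make "one fit per group" concrete, here is a plain-Clojure sketch of ordinary least squares applied once to all rows and once per group, using the same six points as grouped-data. The helper names are hypothetical, and this is not how Plotje computes its :linear-model stat internally.

```clojure
;; Ordinary least squares for a seq of [x y] pairs (assumes at least two distinct x).
(defn linear-fit [points]
  (let [n     (double (count points))
        mx    (/ (reduce + (map first points)) n)
        my    (/ (reduce + (map second points)) n)
        sxy   (reduce + (map (fn [[x y]] (* (- x mx) (- y my))) points))
        sxx   (reduce + (map (fn [[x _]] (* (- x mx) (- x mx))) points))
        slope (/ sxy sxx)]
    {:slope slope :intercept (- my (* slope mx))}))

(def rows
  (map (fn [x y g] {:x x :y y :g g})
       [1 2 3 4 5 6] [3 5 4 7 6 8] ["a" "a" "a" "b" "b" "b"]))

;; One fit per distinct value of group-key.
(defn fit-per-group [rows group-key]
  (into {}
        (map (fn [[k grp]] [k (linear-fit (map (juxt :x :y) grp))]))
        (group-by group-key rows)))

(linear-fit (map (juxt :x :y) rows))  ; ungrouped: slope is about 0.886
(fit-per-group rows :g)               ; grouped: slope 0.5 in both "a" and "b"
```

The ungrouped slope is steeper than either per-group slope because the single line also has to bridge the jump between the two groups.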

Layer Type

When you use pj/pose without an explicit pj/lay-* call, Plotje infers the layer type – a mark + stat bundle – from the column types of the referenced columns.

Single-column cases

Column type | Inferred | Mark + stat
numerical | histogram | :bar + :bin
temporal | histogram (over epoch-ms, with calendar-aware ticks) | :bar + :bin
categorical | bar chart of category counts | :rect + :count

Two-column cases

x type | y type | Inferred | Mark + stat
numerical | numerical | scatter | :point + :identity
temporal | numerical | time-series line | :line + :identity
categorical | numerical | boxplot (vertical) | :boxplot + :boxplot
numerical | categorical | boxplot (horizontal) | :boxplot + :boxplot
any other | any other | scatter (fallback) | :point + :identity

Fallback pairs include temporal x + categorical y, categorical x + categorical y, and temporal x + temporal y. These are rarer in practice, and giving them a dedicated inference is deferred. You can always override with an explicit pj/lay-* call; the inferred layer type is only a default.

When you use pj/lay-point, pj/lay-histogram, etc., the layer type’s stat takes precedence – column-type inference is bypassed.
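Both tables amount to a lookup keyed by column type, with a scatter fallback for pairs that have no dedicated entry. A hypothetical sketch (the var and function names are invented for this example; the mark and stat keywords are the ones listed in the tables):

```clojure
;; Hypothetical lookup tables mirroring the single- and two-column cases.
(def one-column-layers
  {:numerical   {:mark :bar  :stat :bin}
   :temporal    {:mark :bar  :stat :bin}
   :categorical {:mark :rect :stat :count}})

(def two-column-layers
  {[:numerical   :numerical]   {:mark :point   :stat :identity}
   [:temporal    :numerical]   {:mark :line    :stat :identity}
   [:categorical :numerical]   {:mark :boxplot :stat :boxplot}
   [:numerical   :categorical] {:mark :boxplot :stat :boxplot}})

(def fallback-layer {:mark :point :stat :identity})

(defn infer-layer
  ([x-type] (get one-column-layers x-type))
  ([x-type y-type] (get two-column-layers [x-type y-type] fallback-layer)))

(infer-layer :numerical)                 ;; => {:mark :bar, :stat :bin}
(infer-layer :temporal :numerical)       ;; => {:mark :line, :stat :identity}
(infer-layer :categorical :categorical)  ;; => {:mark :point, :stat :identity}
```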

A single numerical column produces a histogram:

(def hist-pose
  (-> five-points
      (pj/pose :x)))
hist-pose
[Plot: histogram of x]

The inferred layer is a histogram – a :bar mark fed by the :bin stat, so the data is binned into rectangles before rendering.

A single temporal column also becomes a histogram, binned over epoch-milliseconds with calendar-aware tick labels:

(def temporal-hist-pose
  (-> {:date [#inst "2024-01-01" #inst "2024-02-01" #inst "2024-03-01"
              #inst "2024-04-01" #inst "2024-05-01"]}
      (pj/pose :date)))
temporal-hist-pose
[Plot: histogram of date, with tick labels from Jan-02 to Apr-28]

A single categorical column produces a bar chart of counts:

(def count-pose
  (-> animals
      (pj/pose :animal)))
count-pose
[Plot: bar chart of counts per animal (cat, dog, bird, fish)]

The inferred layer uses a :rect mark fed by the :count stat, which tallied each of the 4 categories.

Two numerical columns produce a scatter (the chapter’s opening scatter-pose is such a pose):

(def num-num-pose
  (-> five-points (pj/pose :x :y)))
num-num-pose
[Plot: scatter of y against x]

A temporal x with a numerical y infers a time-series line. Row order is preserved, so pre-sort temporal data to avoid zigzag:

(def ts-line-pose
  (-> {:date [#inst "2024-01-01" #inst "2024-02-01" #inst "2024-03-01"]
       :val  [10 25 18]}
      (pj/pose :date :val)))
ts-line-pose
[Plot: line of val against date, with tick labels from Jan-02 to Feb-25]

A categorical x with a numerical y infers a boxplot – the default for summarizing a distribution across groups:

(def boxplot-pose
  (-> {:species ["a" "a" "a" "b" "b" "b" "c" "c" "c"]
       :val     [8  10  12  18  20  22  14  15  17]}
      (pj/pose :species :val)))
boxplot-pose
[Plot: vertical boxplots of val by species (a, b, c)]

A numerical x with a categorical y infers a horizontal boxplot – the same summary laid out with the category axis on y:

(def horizontal-boxplot-pose
  (-> {:val     [8  10  12  18  20  22  14  15  17]
       :species ["a" "a" "a" "b" "b" "b" "c" "c" "c"]}
      (pj/pose :val :species)))
horizontal-boxplot-pose
[Plot: horizontal boxplots of val by species (a, b, c)]

Domains

Numerical domains extend 5% beyond the data range so points aren’t clipped at the edges.

scatter-pose
[Plot: scatter of y against x]

The x-domain is [0.8, 5.2] – the data range [1.0, 5.0] plus 0.2 padding on each side (5% of the data range, 4.0).
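The padding arithmetic is short enough to write out. A minimal sketch, with `pad-domain` as a hypothetical helper name and the 5% fraction taken from the rule above:

```clojure
;; Hypothetical sketch: widen [lo hi] by a fraction of its span on each side.
(defn pad-domain
  ([domain] (pad-domain domain 0.05))
  ([[lo hi] fraction]
   (let [pad (* fraction (- hi lo))]   ; 0.05 * (5.0 - 1.0) = 0.2
     [(- lo pad) (+ hi pad)])))

(pad-domain [1.0 5.0])  ; approximately [0.8 5.2]
```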

Special domain rules apply in certain contexts:

Bar chart y-domains always include zero:

bar-pose
[Plot: bar chart of count by animal (cat, dog, bird, fish), y-axis starting at 0]

Percentage-filled layers normalize the y-domain to [0.0, 1.0]:

(def fill-pose
  (-> {:x ["a" "a" "b" "b"]
       :g ["m" "n" "m" "n"]}
      (pj/lay-bar :x {:position :fill :color :g})))
fill-pose
[Plot: filled bar chart of x (a, b), colored by g (legend: m, n), y-axis from 0.0 to 1.0]

The y-domain is exactly [0.0, 1.0] – each category sums to 100%.
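Why the domain is exactly [0.0, 1.0] follows from the normalization: within each x category, the group counts are divided by the category total. A plain-Clojure sketch on the same four rows (`fill-proportions` is a hypothetical name, not a Plotje function):

```clojure
(def rows
  [{:x "a" :g "m"} {:x "a" :g "n"}
   {:x "b" :g "m"} {:x "b" :g "n"}])

;; Hypothetical sketch: per-category share of each group.
(defn fill-proportions [rows x-key g-key]
  (into {}
        (map (fn [[x grp]]
               (let [total (double (count grp))]
                 [x (into {}
                          (map (fn [[g n]] [g (/ n total)]))
                          (frequencies (map g-key grp)))])))
        (group-by x-key rows)))

(fill-proportions rows :x :g)
;; => {"a" {"m" 0.5, "n" 0.5}, "b" {"m" 0.5, "n" 0.5}}
```

Stacking the shares within a category always ends at 1.0, which is why no data-dependent domain computation is needed for this position.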

Multi-layer plots merge domains across layers – see “Multi-Layer Plots” below.

Ticks

Once domains are computed, Plotje selects “nice” round tick values. The logic depends on the scale type:

  • Linear – wadogo selects ticks at round intervals (1, 2, 2.5, 5, …)

  • Log – 1-2-5 nice numbers: powers of 10 when they give at least 3 ticks, otherwise intermediates at 1-2-5 or 1-2-3-5 multiples per decade

  • Categorical – tick at each category, in order of appearance

  • Temporal – calendar-aware snapping (year, month, day, hour) with adaptive formatting
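For the linear case, one common way to get round intervals is to pick the smallest step from the 1, 2, 2.5, 5 family that keeps the tick count near a target. The sketch below illustrates that general technique only; it is not wadogo's algorithm, and both the function names and the target of 10 ticks are assumptions.

```clojure
;; Hypothetical sketch of "nice" linear ticks; wadogo's actual logic may differ.
(defn nice-step [span target-count]
  (let [raw       (/ span (double target-count))
        magnitude (Math/pow 10.0 (Math/floor (Math/log10 raw)))]
    ;; smallest round multiple of the magnitude that is at least the raw step
    (first (filter #(>= % raw)
                   (map #(* % magnitude) [1 2 2.5 5 10])))))

(defn nice-ticks [[lo hi] target-count]
  (let [step (nice-step (- hi lo) target-count)
        i-lo (long (Math/ceil (/ lo step)))
        i-hi (long (Math/floor (/ hi step)))]
    ;; multiples of the step that fall inside the domain
    (mapv #(* % step) (range i-lo (inc i-hi)))))

;; For the padded scatter domain this gives a step of 0.5:
;; nine ticks from 1.0 to 5.0.
(nice-ticks [0.8 5.2] 10)
```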

Linear ticks for the scatter example:

scatter-pose
[Plot: scatter of y against x; x ticks from 1.0 to 5.0 in steps of 0.5]

Nine ticks from 1.0 to 5.0 at 0.5 intervals – round and readable.

Log ticks for a multi-decade range:

(def log-scale-pose
  (-> {:x [0.1 1.0 10.0 100.0 1000.0]
       :y [5 10 15 20 25]}
      (pj/lay-point :x :y)
      (pj/scale :x :log)))
log-scale-pose
[Plot: scatter of y against x on a log x scale; x ticks at 0.1, 1, 10, 100, 1000]

Five ticks at exact powers of 10 – no irrational intermediates. Whole numbers display without decimals, sub-1 values use minimal decimal places.

Categorical ticks match domain order:

bar-pose
[Plot: bar chart of count by animal; x ticks in order cat, dog, bird, fish]

Axis Labels

Labels come from column names. Underscores and hyphens become spaces.
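The label rule is a one-line string transformation. A sketch with a hypothetical helper name, `column->label`:

```clojure
(require '[clojure.string :as str])

;; Hypothetical sketch: column name -> axis label.
(defn column->label [column]
  ;; `name` accepts both keywords and strings
  (str/replace (name column) #"[_-]" " "))

(column->label :sepal-length)  ;; => "sepal length"
(column->label "petal_width")  ;; => "petal width"
```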

(def iris-label-pose
  (-> (rdatasets/datasets-iris)
      (pj/lay-point :sepal-length :sepal-width)))
iris-label-pose
[Plot: scatter of sepal width against sepal length]

When only one column is specified, the y-axis shows computed counts. The system omits the y-label since it would repeat the column name:

(def x-only-pose
  (-> five-points (pj/pose :x)))
x-only-pose
[Plot: histogram of x, with no y-axis label]

Explicit labels override inference:

(def explicit-label-pose
  (-> five-points
      (pj/lay-point :x :y)
      (pj/options {:x-label "Length (cm)" :y-label "Width (cm)"})))
explicit-label-pose
[Plot: scatter with x-axis label "Length (cm)" and y-axis label "Width (cm)"]

Legends

A legend appears when a column is mapped to color. Four cases:

A categorical color mapping produces a discrete legend with one entry per category:

colored-pose
[Plot: scatter of y against x, colored by g (legend: a, b)]

The legend’s title is the column name; each entry has a :label and a palette color.

No color mapping means no legend:

scatter-pose
[Plot: scatter of y against x, no legend]

A fixed color string also suppresses the legend:

fixed-color-pose
[Plot: scatter of y against x, all points in #E74C3C, no legend]

A numeric color mapping produces a continuous legend (gradient bar):

(def continuous-color-pose
  (-> {:x [1 2 3] :y [4 5 6] :val [10 20 30]}
      (pj/lay-point :x :y {:color :val})))
continuous-color-pose
[Plot: scatter of y against x, colored by val (continuous legend from 10.00 to 30.00)]

Size Legend

When :size maps to a numerical column, a size legend shows five graduated circles spanning the data range, with radii proportional to the values they represent.

(def size-legend-pose
  (-> {:x [1 2 3 4 5] :y [1 2 3 4 5] :s [10 20 30 40 50]}
      (pj/lay-point :x :y {:size :s})))
size-legend-pose
[Plot: scatter of y against x, sized by s (legend: 10.0, 20.0, 30.0, 40.0, 50.0)]

The legend has 5 entries, each pairing a value with a circle of the corresponding radius. No size mapping means no size legend:

scatter-pose
[Plot: scatter of y against x, no size legend]

Alpha Legend

When :alpha maps to a numerical column, an alpha legend shows graduated opacity squares – about five nice 1/2/5 breaks; the exact count depends on the range (here [0.1, 0.9] yields four).

(def alpha-legend-pose
  (-> {:x [1 2 3 4 5] :y [1 2 3 4 5] :a [0.1 0.3 0.5 0.7 0.9]}
      (pj/lay-point :x :y {:alpha :a})))
alpha-legend-pose
[Plot: scatter of y against x, with opacity by a (legend: 0.2, 0.4, 0.6, 0.8)]

No alpha mapping means no alpha legend:

scatter-pose
[Plot: scatter of y against x, no alpha legend]

Layout

Layout padding adjusts based on what elements are present – titles, axis labels, and legends each reserve their own space.

Compare a bare plot to one with a title and a legend:

scatter-pose
[Plot: scatter of y against x, no title, no legend]

(def full-layout-pose
  (-> {:x [1 2 3 4 5 6]
       :y [3 5 4 7 6 8]
       :g ["a" "a" "a" "b" "b" "b"]}
      (pj/lay-point :x :y {:color :g})
      (pj/options {:title "My Plot"})))
full-layout-pose
[Plot: "My Plot"; scatter of y against x, colored by g (legend: a, b)]

The bare plot reserves no space for a title and no legend strip. The full plot adds padding above for the title and 100 pixels on the right for the legend.

Layout type is also inferred from the pose structure:

  • A single panel is :single
  • A facet grid (:facet-row or :facet-col) is :facet-grid
  • Multiple x-y pairs (scatter plot matrix) are :multi-variable
scatter-pose
[Plot: scatter of y against x, single panel]

Coordinate Flipping

Setting :coord :flip swaps the visual axes. The data stays the same – the categorical band that was on x ends up on y, with ticks and labels following along.

(def normal-pose
  (-> animals
      (pj/lay-value-bar :animal :count)))
normal-pose
[Plot: bar chart of count by animal (cat, dog, bird, fish)]

(def flip-pose
  (-> animals
      (pj/lay-value-bar :animal :count)
      (pj/coord :flip)))
flip-pose
[Plot: horizontal bar chart, count on the x-axis and animal (cat, dog, bird, fish) on the y-axis]

The categorical axis moved from x to y.

Labels are also swapped – the x-label and y-label follow their visual axis, not the data axis:

(def flipped-labels-pose
  (-> five-points
      (pj/lay-point :x :y)
      (pj/coord :flip)))
flipped-labels-pose
[Plot: flipped scatter, x-axis labeled "y" and y-axis labeled "x"]

After flipping, the visual x-axis shows “y” and the visual y-axis shows “x” – labels track the visual axes.

Polar coordinates (:coord :polar) are covered separately – see the Polar Coordinates chapter for rose charts, radial bars, and related plots.

Multi-Layer Plots

When multiple layers share a panel, their domains are merged:

(def multi-pose
  (-> five-points
      (pj/pose :x :y)
      pj/lay-point
      (pj/lay-smooth {:stat :linear-model})))
multi-pose
[Plot: scatter of y against x with a fitted regression line]

Two layers – one :point, one :line – sharing the same domain. The line carries the regression curve as a polyline.
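Merging means taking the union of the per-layer extents. A minimal sketch with a hypothetical `merge-domains` helper; the line extent used here comes from an ordinary least-squares fit of five-points (y = 1.99 + 0.63x, so about 2.62 at x = 1 and 5.14 at x = 5), worked out by hand for this example.

```clojure
;; Hypothetical sketch: union of [lo hi] extents across layers.
(defn merge-domains [domains]
  [(apply min (map first domains))
   (apply max (map second domains))])

(def point-y-extent [2.1 5.2])    ; min and max of the :y column in five-points
(def line-y-extent  [2.62 5.14])  ; fitted line evaluated at x = 1 and x = 5

(merge-domains [point-y-extent line-y-extent])  ;; => [2.1 5.2]
```

Here the fitted line stays inside the range of the points, so the merged extent equals the point layer's extent; a layer that reached further would widen it.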

Resolution Overview

The diagram below sketches how the rules above combine – which inferences feed which others on the way from a pose to a rendered plot:

graph TD
  POSE["pose + options"]
  POSE --> CT["Column types"]
  POSE --> AE["Aesthetics"]
  CT --> GR["Grouping"]
  AE --> GR
  CT --> ME["Layer type"]
  GR --> STATS["Statistics"]
  ME --> STATS
  STATS --> DOM["Domains"]
  DOM --> TK["Ticks"]
  POSE --> LBL["Axis labels"]
  AE --> LEG["Color legend"]
  AE --> SLEG["Size legend"]
  AE --> ALEG["Alpha legend"]
  DOM --> LAYOUT["Layout"]
  LBL --> LAYOUT
  LEG --> LAYOUT
  SLEG --> LAYOUT
  ALEG --> LAYOUT
  DOM --> PLOT["Rendered plot"]
  TK --> PLOT
  LBL --> PLOT
  LEG --> PLOT
  SLEG --> PLOT
  ALEG --> PLOT
  LAYOUT --> PLOT
  STATS --> PLOT
  style POSE fill:#e8f5e9
  style PLOT fill:#fff3e0
  style STATS fill:#e3f2fd
  style DOM fill:#e3f2fd

Column types and aesthetic classification are the starting points; everything else flows from them. Statistics and domains together set the geometry; labels, legends, and layout round out the surrounding plot.

What’s Next

  • Layer Types – the full registry of marks, stats, and positions that inference selects from
  • Relationships – see inference in action on scatter, regression, and SPLOM
source: notebooks/plotje_book/inference_rules.clj