6 Inference Rules
Napkinsketch infers many parameters automatically so you can write less and get reasonable defaults. This notebook shows those rules in action by examining the plan, the resolved data structure that captures every inference decision.
The examples use small inline datasets so the full plan is readable.
(ns napkinsketch-book.inference-rules
(:require
;; Tablecloth: dataset manipulation
[tablecloth.api :as tc]
;; Kindly: notebook rendering protocol
[scicloj.kindly.v4.kind :as kind]
;; Napkinsketch: composable plotting
[scicloj.napkinsketch.api :as sk]
;; Shared datasets
[napkinsketch-book.datasets :as data]))

What Gets Inferred
When you write (-> data (sk/lay-point :x :y)), or even just (sk/lay-point data), the library fills in everything needed to render a plot. Here is the full list of inference steps, in the order they happen:
- Column selection: which columns map to x, y, and color (inferred from dataset shape when omitted)
- Column types: numerical, categorical, or temporal
- Aesthetic resolution: is :color a column reference, a hex string, or a CSS name?
- Grouping: which column(s) split data into subsets
- Method: which mark and stat to use (scatter, histogram, bar, …)
- Domains: data extent for each axis, with padding
- Ticks: nice round values and formatted labels
- Axis labels: derived from column names
- Legend: type, entries, and layout space
- Layout: single panel, facet grid, or multi-variable
- Coordinate transform: cartesian, flip, or polar
Each rule has a sensible default and an explicit override. The sections below demonstrate each rule with live examples.
Inspecting the Plan
Every call to sk/plan returns a plain Clojure map: the plan. It contains everything needed to render a plot: domains, ticks, scales, layers with positioned data, legend, and layout dimensions.
To understand what Napkinsketch inferred, look at the plan.
(def five-points
{:x [1.0 2.0 3.0 4.0 5.0]
:y [2.1 4.3 3.0 5.2 4.8]})

(def scatter-views
(-> five-points
(sk/lay-point :x :y)))

Here is the full plan:

(sk/plan scatter-views)

{:panels
[{:coord :cartesian,
:y-domain [1.945 5.355],
:x-scale {:type :linear},
:x-domain [0.8 5.2],
:x-ticks
{:values [1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0],
:labels ["1.0" "1.5" "2.0" "2.5" "3.0" "3.5" "4.0" "4.5" "5.0"],
:categorical? false},
:col 0,
:layers
[{:mark :point,
:style {:opacity 0.75, :radius 3.0},
:groups
[{:color [0.2 0.2 0.2 1.0], :xs #tech.v3.dataset.column<float64>[5]
:x
[1.000, 2.000, 3.000, 4.000, 5.000],
:ys #tech.v3.dataset.column<float64>[5]
:y
[2.100, 4.300, 3.000, 5.200, 4.800],
:row-indices #tech.v3.dataset.column<int64>[5]
:__row-idx
[0, 1, 2, 3, 4]}],
:y-domain [2.1 5.2],
:x-domain [1.0 5.0]}],
:y-scale {:type :linear},
:y-ticks
{:values [2.0 2.5 3.0 3.5 4.0 4.5 5.0],
:labels ["2.0" "2.5" "3.0" "3.5" "4.0" "4.5" "5.0"],
:categorical? false},
:row 0}],
:width 600,
:height 400,
:caption nil,
:total-width 622.5,
:legend-position :right,
:layout-type :single,
:layout
{:subtitle-pad 0,
:legend-w 0,
:caption-pad 0,
:y-label-pad 22.5,
:legend-h 0,
:title-pad 0,
:strip-h 0,
:x-label-pad 18,
:strip-w 0},
:grid {:rows 1, :cols 1},
:legend nil,
:panel-height 400.0,
:title nil,
:y-label "y",
:alpha-legend nil,
:x-label "x",
:subtitle nil,
:panel-width 600.0,
:size-legend nil,
:total-height 418.0,
:margin 30}

And the resulting plot:

scatter-views

Notice in the plan above:
- :x-domain is [0.8 5.2], wider than the data range [1.0, 5.0] because of 5% padding
- :x-scale is {:type :linear}, inferred from numeric data
- :x-ticks has nice round values: 1.0, 1.5, 2.0, ...
- :x-label is "x", derived from the column keyword
- :legend is nil, since there is no color mapping
- :layout has :legend-w 0, so no space is reserved for a legend
- The single layer has :mark :point and a single :groups entry with all 5 data points, colored in the default color, dark gray ([0.2 0.2 0.2 1.0])
Column Selection
When column names are omitted, Napkinsketch infers them from the dataset shape:
| Number of columns | Inferred mapping |
|---|---|
| 1 | first → x |
| 2 | first → x, second → y |
| 3 | first → x, second → y, third → color |
| 4+ | error: specify columns explicitly |
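The table above can be read as a small dispatch on column count. The following is a minimal sketch under that reading; `infer-columns` is a hypothetical helper written for illustration and is not the library's internal function.

```clojure
;; Hypothetical helper: maps an ordered list of column names to aesthetic
;; roles, following the table above. Not Napkinsketch's actual code.
(defn infer-columns [column-names]
  (let [[a b c] column-names]
    (case (count column-names)
      1 {:x a}
      2 {:x a :y b}
      3 {:x a :y b :color c}
      ;; Any other count (including 4+) is rejected in this sketch.
      (throw (ex-info "Cannot infer columns; specify them explicitly"
                      {:columns column-names})))))

(infer-columns [:x :y :g])
;; => {:x :x, :y :y, :color :g}
```

Throwing on four or more columns keeps the rule predictable: past three columns there is no obvious default role for the extras.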
One column:
(-> {:values [1 2 3 4 5 6]}
sk/lay-histogram)

Two columns:
(-> {:x [1 2 3 4 5] :y [2 4 3 5 4]}
sk/lay-point)

Three columns (the third becomes :color):
(-> {:x [1 2 3 4] :y [4 5 6 7] :g ["a" "a" "b" "b"]}
sk/lay-point)

When you provide explicit columns, inference is skipped and you are in full control:
(-> data/iris
(sk/lay-point :petal_length :petal_width {:color :species}))

Column Type Detection
Once columns are selected, the next step is determining the type of each column: numerical, categorical, or temporal? This determines the scale type, domain, tick style, and the default mark.
| Column dtype | Inferred type |
|---|---|
| float, int | :numerical |
| string, keyword, boolean, symbol, text | :categorical |
| LocalDate, LocalDateTime, Instant, java.util.Date | :temporal (numerical with calendar-aware ticks) |
Internally, infer-column-types in view.clj handles this step.
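As a rough illustration of the table above, the classification can be sketched by looking at a sample value. This is an assumption-laden sketch: `classify-column` is a hypothetical name, and it inspects the first non-nil value rather than the column dtype that the library is described as using.

```clojure
(import '(java.time LocalDate LocalDateTime Instant))

;; Hypothetical sketch: classify a column from its first non-nil value.
;; The real infer-column-types works on column dtypes; this only mirrors
;; the mapping in the table above.
(defn classify-column [values]
  (let [v (first (remove nil? values))]
    (cond
      (number? v) :numerical
      (or (string? v) (keyword? v) (boolean? v) (symbol? v)) :categorical
      (or (instance? LocalDate v)
          (instance? LocalDateTime v)
          (instance? Instant v)
          (instance? java.util.Date v)) :temporal
      :else (throw (ex-info "Unrecognized column type" {:value v})))))

(classify-column [1.0 2.0 3.0])        ;; => :numerical
(classify-column ["cat" "dog"])        ;; => :categorical
(classify-column [#inst "2024-01-01"]) ;; => :temporal
```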
A categorical column produces a band scale with string domain values. Compare:
(def animals
{:animal ["cat" "dog" "bird" "fish"]
:count [12 8 15 5]})

(def bar-views
(-> animals
(sk/lay-value-bar :animal :count)))

(sk/plan bar-views)

{:panels
[{:coord :cartesian,
:y-domain [-0.75 15.75],
:x-scale {:type :linear},
:x-domain ["cat" "dog" "bird" "fish"],
:x-ticks
{:values ["cat" "dog" "bird" "fish"],
:labels ["cat" "dog" "bird" "fish"],
:categorical? true},
:col 0,
:layers
[{:mark :rect,
:style {:opacity 0.85},
:position :dodge,
:groups
[{:color [0.2 0.2 0.2 1.0],
:label "",
:xs #tech.v3.dataset.column<string>[4]
:animal
[cat, dog, bird, fish],
:ys #tech.v3.dataset.column<int64>[4]
:count
[12, 8, 15, 5],
:dodge-idx 0}],
:y-domain [0 15],
:x-domain ("cat" "dog" "bird" "fish"),
:dodge-ctx {:n-groups 1}}],
:y-scale {:type :linear},
:y-ticks
{:values [0.0 2.0 4.0 6.0 8.0 10.0 12.0 14.0],
:labels ["0" "2" "4" "6" "8" "10" "12" "14"],
:categorical? false},
:row 0}],
:width 600,
:height 400,
:caption nil,
:total-width 618.0,
:legend-position :right,
:layout-type :single,
:layout
{:subtitle-pad 0,
:legend-w 0,
:caption-pad 0,
:y-label-pad 18.0,
:legend-h 0,
:title-pad 0,
:strip-h 0,
:x-label-pad 18,
:strip-w 0},
:grid {:rows 1, :cols 1},
:legend nil,
:panel-height 400.0,
:title nil,
:y-label "count",
:alpha-legend nil,
:x-label "animal",
:subtitle nil,
:panel-width 600.0,
:size-legend nil,
:total-height 418.0,
:margin 30}

bar-views

The x-domain is ["cat" "dog" "bird" "fish"]: strings in order of appearance. The ticks have :categorical? true. The y-domain includes zero because this is a bar chart: the layer's :y-domain is [0 15], and padding widens the panel's :y-domain to [-0.75 15.75].
Temporal columns
Dates are detected and converted to epoch-milliseconds internally, with calendar-aware tick labels. Clojure's #inst reader literal is a convenient way to write dates:
(let [pl (-> {:date [#inst "2024-01-01" #inst "2024-06-01" #inst "2024-12-01"]
:val [10 25 18]}
(sk/lay-point :date :val)
sk/plan)
p (first (:panels pl))]
{:x-domain-numeric? (number? (first (:x-domain p)))
:tick-count (count (:values (:x-ticks p)))
:first-tick-label (first (:labels (:x-ticks p)))})

{:x-domain-numeric? true, :tick-count 10, :first-tick-label "Feb-01"}

The x-domain contains epoch-millisecond numbers, but the 10 tick labels show human-readable dates like "Feb-01". Napkinsketch accepts java.util.Date (from #inst), LocalDate, LocalDateTime, and Instant. All are converted to epoch-milliseconds for plotting, with calendar-aware tick formatting.
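The conversion to epoch-milliseconds can be sketched with plain Java interop. The helper name `->epoch-ms` is hypothetical, and the choice of UTC for zone-less values is an assumption; the notebook does not state how the library handles time zones.

```clojure
(import '(java.time LocalDate LocalDateTime Instant ZoneOffset))

;; Hypothetical helper: convert the supported temporal types to
;; epoch-milliseconds. Assumes UTC for LocalDate and LocalDateTime.
(defn ->epoch-ms [t]
  (cond
    (instance? java.util.Date t) (.getTime t)
    (instance? Instant t)        (.toEpochMilli t)
    (instance? LocalDate t)      (-> t
                                     (.atStartOfDay ZoneOffset/UTC)
                                     .toInstant
                                     .toEpochMilli)
    (instance? LocalDateTime t)  (-> t
                                     (.toInstant ZoneOffset/UTC)
                                     .toEpochMilli)
    :else (throw (ex-info "Not a temporal value" {:value t}))))

;; Under the UTC assumption, these two describe the same instant:
(= (->epoch-ms #inst "2024-01-01")
   (->epoch-ms (LocalDate/of 2024 1 1)))
;; => true
```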
Aesthetic Resolution
The :color parameter triggers different behaviors depending on what you pass. Internally, resolve-aesthetics in view.clj classifies each aesthetic channel (:color, :size, :alpha, :text) as either a column reference or a fixed literal.
Column reference β colored by palette
(def colored-views
(-> {:x [1 2 3 4 5 6]
:y [3 5 4 7 6 8]
:g ["a" "a" "a" "b" "b" "b"]}
(sk/lay-point :x :y {:color :g})))

(sk/plan colored-views)

{:panels
[{:coord :cartesian,
:y-domain [2.75 8.25],
:x-scale {:type :linear},
:x-domain [0.75 6.25],
:x-ticks
{:values [1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0 5.5 6.0],
:labels
["1.0"
"1.5"
"2.0"
"2.5"
"3.0"
"3.5"
"4.0"
"4.5"
"5.0"
"5.5"
"6.0"],
:categorical? false},
:col 0,
:layers
[{:mark :point,
:style {:opacity 0.75, :radius 3.0},
:groups
[{:color
[0.8941176470588236
0.10196078431372549
0.10980392156862745
1.0],
:xs #tech.v3.dataset.column<int64>[3]
:x
[1, 2, 3],
:ys #tech.v3.dataset.column<int64>[3]
:y
[3, 5, 4],
:label "a",
:row-indices #tech.v3.dataset.column<int64>[3]
:__row-idx
[0, 1, 2]}
{:color
[0.21568627450980393
0.49411764705882355
0.7215686274509804
1.0],
:xs #tech.v3.dataset.column<int64>[3]
:x
[4, 5, 6],
:ys #tech.v3.dataset.column<int64>[3]
:y
[7, 6, 8],
:label "b",
:row-indices #tech.v3.dataset.column<int64>[3]
:__row-idx
[3, 4, 5]}],
:y-domain [3 8],
:x-domain [1 6]}],
:y-scale {:type :linear},
:y-ticks
{:values [3.0 3.5 4.0 4.5 5.0 5.5 6.0 6.5 7.0 7.5 8.0],
:labels
["3.0"
"3.5"
"4.0"
"4.5"
"5.0"
"5.5"
"6.0"
"6.5"
"7.0"
"7.5"
"8.0"],
:categorical? false},
:row 0}],
:width 600,
:height 400,
:caption nil,
:total-width 722.5,
:legend-position :right,
:layout-type :single,
:layout
{:subtitle-pad 0,
:legend-w 100,
:caption-pad 0,
:y-label-pad 22.5,
:legend-h 0,
:title-pad 0,
:strip-h 0,
:x-label-pad 18,
:strip-w 0},
:grid {:rows 1, :cols 1},
:legend
{:title :g,
:entries
[{:label "a",
:color
[0.8941176470588236 0.10196078431372549 0.10980392156862745 1.0]}
{:label "b",
:color
[0.21568627450980393
0.49411764705882355
0.7215686274509804
1.0]}]},
:panel-height 400.0,
:title nil,
:y-label "y",
:alpha-legend nil,
:x-label "x",
:subtitle nil,
:panel-width 600.0,
:size-legend nil,
:total-height 418.0,
:margin 30}

colored-views

Two entries in :groups, each with its own :color (RGBA), :xs, :ys, and :label. A :legend appeared with 2 entries. The :layout now has :legend-w 100, which is space reserved on the right.
Why two entries? Because :g is a categorical column. The next section explores this mechanism in detail.
Fixed color string β single color, no legend
(def fixed-color-views
(-> five-points
(sk/lay-point :x :y {:color "#E74C3C"})))

(sk/plan fixed-color-views)

{:panels
[{:coord :cartesian,
:y-domain [1.945 5.355],
:x-scale {:type :linear},
:x-domain [0.8 5.2],
:x-ticks
{:values [1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0],
:labels ["1.0" "1.5" "2.0" "2.5" "3.0" "3.5" "4.0" "4.5" "5.0"],
:categorical? false},
:col 0,
:layers
[{:mark :point,
:style {:opacity 0.75, :radius 3.0},
:groups
[{:color
[0.9058823529411765 0.2980392156862745 0.23529411764705882 1.0],
:xs #tech.v3.dataset.column<float64>[5]
:x
[1.000, 2.000, 3.000, 4.000, 5.000],
:ys #tech.v3.dataset.column<float64>[5]
:y
[2.100, 4.300, 3.000, 5.200, 4.800],
:row-indices #tech.v3.dataset.column<int64>[5]
:__row-idx
[0, 1, 2, 3, 4]}],
:y-domain [2.1 5.2],
:x-domain [1.0 5.0]}],
:y-scale {:type :linear},
:y-ticks
{:values [2.0 2.5 3.0 3.5 4.0 4.5 5.0],
:labels ["2.0" "2.5" "3.0" "3.5" "4.0" "4.5" "5.0"],
:categorical? false},
:row 0}],
:width 600,
:height 400,
:caption nil,
:total-width 622.5,
:legend-position :right,
:layout-type :single,
:layout
{:subtitle-pad 0,
:legend-w 0,
:caption-pad 0,
:y-label-pad 22.5,
:legend-h 0,
:title-pad 0,
:strip-h 0,
:x-label-pad 18,
:strip-w 0},
:grid {:rows 1, :cols 1},
:legend nil,
:panel-height 400.0,
:title nil,
:y-label "y",
:alpha-legend nil,
:x-label "x",
:subtitle nil,
:panel-width 600.0,
:size-legend nil,
:total-height 418.0,
:margin 30}

fixed-color-views

A single :groups entry with red RGBA values. No :legend, and :legend-w is 0. The hex string was converted to approximately [0.906 0.298 0.235 1.0].
Named colors and string disambiguation
CSS color names like "red" and "steelblue" also work as fixed colors:
(-> five-points
(sk/lay-point :x :y {:color "steelblue"}))

This raises a question: since :color also accepts column names as strings (like "species"), how does the system decide whether "red" means the column :red or the color red?
The rule is: check the dataset first. If the string matches a column name in the dataset, it is treated as a column reference. Otherwise, it is treated as a color value, first trying hex parsing, then CSS color name lookup.
Here is the full resolution order for a string :color value:
- If the string matches a dataset column → column reference (grouping)
- If it starts with # → hex color ("#E74C3C", "#F00")
- If it parses as hex without # → hex color ("00FF00")
- If it matches a CSS color name → named color ("red", "steelblue")
- Otherwise → error with a helpful message
In practice, ambiguity is rare. Column names like "species" or "temperature" are not valid CSS colors, and color names like "red" are unlikely column names. When true ambiguity exists, use a keyword for the column (:red) or a hex string for the color ("#FF0000").
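The resolution order above can be sketched as a single `cond`. Everything here is illustrative: `resolve-color-string`, `hex->rgba`, and the two-entry `css-colors` table are hypothetical stand-ins, column names are assumed to be strings, and the two hex steps are folded into one by stripping an optional leading `#`.

```clojure
(require '[clojure.string :as str])

;; Illustrative subset of a CSS color table (RGB, 0-255).
(def css-colors {"red" [255 0 0] "steelblue" [70 130 180]})

(defn hex->rgba
  "Takes RRGGBB or RGB (without #) and returns [r g b a] in the range 0..1."
  [s]
  (let [full (if (= 3 (count s))
               (apply str (mapcat (fn [ch] [ch ch]) s))
               s)]
    (conj (mapv #(/ (Integer/parseInt (subs full % (+ % 2)) 16) 255.0)
                [0 2 4])
          1.0)))

(defn resolve-color-string [column-names s]
  (let [hex  (str/replace s #"^#" "")
        name (str/lower-case s)]
    (cond
      ;; 1. Dataset columns win over everything else.
      (contains? (set column-names) s)
      {:kind :column :column s}
      ;; 2-3. Hex, with or without the leading #.
      (re-matches #"[0-9a-fA-F]{6}|[0-9a-fA-F]{3}" hex)
      {:kind :fixed :rgba (hex->rgba hex)}
      ;; 4. CSS color name.
      (css-colors name)
      {:kind :fixed :rgba (conj (mapv #(/ % 255.0) (css-colors name)) 1.0)}
      ;; 5. Nothing matched.
      :else
      (throw (ex-info "Not a column, hex color, or CSS color name"
                      {:value s})))))

(resolve-color-string ["x" "y"] "red")
;; => {:kind :fixed, :rgba [1.0 0.0 0.0 1.0]}
(resolve-color-string ["x" "red"] "red")
;; => {:kind :column, :column "red"}
```

Checking the dataset first means adding a column named like a color changes the meaning of an existing call, which is why the keyword or hex form is the safer choice when ambiguity is possible.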
Verify: "red" is a fixed color when the dataset has no red column:
(let [pl (-> five-points
(sk/lay-point :x :y {:color "red"})
sk/plan)]
{:legend (:legend pl)
:color (:color (first (:groups (first (:layers (first (:panels pl)))))))})

{:legend nil, :color [1.0 0.0 0.0 1.0]}

No legend, red RGBA: the string is treated as a fixed color, not a column.
No color: default gray
Look back at the first scatter plan above: its single :groups entry has the default color, dark gray ([0.2 0.2 0.2 1.0]). No legend.
Grouping
The :groups entries you saw above reflect a key concept: grouping controls how data is split into independent subsets. Each group gets its own visual elements β its own set of points, its own regression line, its own density curve, its own bar in a dodged layout.
Internally, infer-grouping in view.clj builds the grouping vector from explicit :group and categorical color.
Grouping can be derived (from a categorical :color mapping) or explicit (via the :group aesthetic).
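To make the idea of splitting concrete, here is a minimal sketch that partitions rows by a categorical column. `split-groups` is a hypothetical helper operating on plain row maps, not the library's infer-grouping; the result mirrors the shape of the two :groups entries in the colored-views plan above.

```clojure
;; Hypothetical sketch: split row maps into groups by a categorical
;; column, keeping groups in order of first appearance.
(defn split-groups [rows group-col]
  (let [labels   (distinct (map group-col rows))
        by-label (group-by group-col rows)]
    (mapv (fn [label]
            {:label (str label)
             :xs    (mapv :x (by-label label))
             :ys    (mapv :y (by-label label))})
          labels)))

(def rows
  (map (fn [x y g] {:x x :y y :g g})
       [1 2 3 4 5 6]
       [3 5 4 7 6 8]
       ["a" "a" "a" "b" "b" "b"]))

(split-groups rows :g)
;; => [{:label "a", :xs [1 2 3], :ys [3 5 4]}
;;     {:label "b", :xs [4 5 6], :ys [7 6 8]}]
```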
Categorical color implies grouping
When :color maps to a categorical column (as with colored-views above), the data is split into one group per category. Each group gets a distinct palette color and a legend entry:
(let [pl (sk/plan colored-views)
layer (first (:layers (first (:panels pl))))]
{:group-count (count (:groups layer))
:group-labels (mapv :label (:groups layer))
:has-legend? (some? (:legend pl))})

{:group-count 2, :group-labels ["a" "b"], :has-legend? true}

Two groups, two legend entries. Each group has its own :xs, :ys, and :color.
Numeric color does not create groups
When :color maps to a numerical column, data is NOT split. Instead, each point gets an individual color from a continuous gradient. There is one group, and the legend is continuous with 20 pre-computed color stops.
(let [pl (-> {:x [1 2 3 4 5]
:y [2 4 3 5 4]
:val [10 20 30 40 50]}
(sk/lay-point :x :y {:color :val})
sk/plan)
layer (first (:layers (first (:panels pl))))]
{:group-count (count (:groups layer))
:legend-type (:type (:legend pl))
:color-stops (count (:stops (:legend pl)))})

{:group-count 1, :legend-type :continuous, :color-stops 20}

One group, continuous legend with 20 stops. No splitting occurred: the color is a visual encoding, not a grouping variable.
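One way to picture the pre-computed stops is as evenly spaced samples along a linear blend between two colors. The sketch below assumes plain linear interpolation in RGBA; `gradient-stops` is a hypothetical name and the endpoint colors are illustrative, not the library's palette.

```clojure
;; Hypothetical sketch: n evenly spaced stops, linearly interpolated
;; between two RGBA colors. Endpoint colors here are made up.
(defn gradient-stops [from to n]
  (mapv (fn [i]
          (let [t (/ i (double (dec n)))]
            {:t     t
             :color (mapv (fn [a b] (+ a (* t (- b a)))) from to)}))
        (range n)))

(def stops (gradient-stops [0.0 0.0 0.5 1.0] [1.0 1.0 0.0 1.0] 20))

(count stops)      ;; => 20
(:t (first stops)) ;; => 0.0
(:t (last stops))  ;; => 1.0
```

With 20 stops the step in `t` is 1/19, so the first and last stops land exactly on the two endpoint colors.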
Explicit grouping with :group
The :group aesthetic splits data into groups without assigning distinct colors or creating a legend. This is useful when you want per-group statistics but uniform appearance.
(def grouped-data
{:x [1 2 3 4 5 6]
:y [3 5 4 7 6 8]
:g ["a" "a" "a" "b" "b" "b"]})

(let [pl (-> grouped-data
(sk/lay-point :x :y {:group :g})
sk/plan)
layer (first (:layers (first (:panels pl))))]
{:group-count (count (:groups layer))
:has-legend? (some? (:legend pl))})

{:group-count 2, :has-legend? false}

Two groups, but no legend and no color differentiation. Use :group when you need separate statistical fits but want a uniform visual style.
What grouping affects
Grouping determines how statistical transformations operate. Without grouping, sk/lay-lm (linear model) fits one regression line through all the data. With grouping, it fits one line per group.
One regression line (no grouping):
(-> grouped-data
(sk/view :x :y)
sk/lay-point
sk/lay-lm)

Two regression lines (grouped by color):
(-> grouped-data
(sk/view :x :y {:color :g})
sk/lay-point
sk/lay-lm)

The same applies to other statistics: density curves, LOESS smoothers, boxplots, and dodge/stack positioning all operate per group.
Method Inference
When you use sk/view without an explicit sk/lay-* call, Napkinsketch infers the method (a mark + stat bundle) from the column types. Internally, infer-method in view.clj implements these rules:
| Columns | Inferred mark | Inferred stat |
|---|---|---|
| one numerical | :bar | :bin (histogram) |
| one categorical | :rect | :count (bar chart) |
| two numerical | :point | :identity (scatter) |
| mixed (categorical + numerical) | :point | :identity (scatter) |
When you use sk/lay-point, sk/lay-histogram, etc., the method's stat takes precedence and column-type inference is bypassed.
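The table above amounts to a lookup on the column types. A minimal sketch follows; the function below shares a name with the internal infer-method only for readability and is not its source. Treating every two-column combination as a scatter is an assumption that goes slightly beyond the table's rows.

```clojure
;; Hypothetical sketch of the table above: column types in, mark + stat out.
;; y-type is nil when only one column is given.
(defn infer-method [x-type y-type]
  (cond
    (and (nil? y-type) (= x-type :numerical))   {:mark :bar   :stat :bin}
    (and (nil? y-type) (= x-type :categorical)) {:mark :rect  :stat :count}
    ;; Assumption: any two-column combination falls through to a scatter.
    :else                                       {:mark :point :stat :identity}))

(infer-method :numerical nil)          ;; => {:mark :bar, :stat :bin}
(infer-method :categorical nil)        ;; => {:mark :rect, :stat :count}
(infer-method :categorical :numerical) ;; => {:mark :point, :stat :identity}
```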
A single numerical column:
(def hist-views
(-> five-points
(sk/view :x)))

(sk/plan hist-views)

{:panels
[{:coord :cartesian,
:y-domain [-0.1 2.1],
:x-scale {:type :linear},
:x-domain [0.8 5.2],
:x-ticks
{:values [1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0],
:labels ["1.0" "1.5" "2.0" "2.5" "3.0" "3.5" "4.0" "4.5" "5.0"],
:categorical? false},
:col 0,
:layers
[{:mark :bar,
:style {:opacity 0.85},
:groups
[{:color [0.2 0.2 0.2 1.0],
:bars
[{:lo 1.0, :hi 2.0, :count 1}
{:lo 2.0, :hi 3.0, :count 1}
{:lo 3.0, :hi 4.0, :count 1}
{:lo 4.0, :hi 5.0, :count 2}]}],
:y-domain [0 2],
:x-domain [1.0 5.0]}],
:y-scale {:type :linear},
:y-ticks
{:values [-0.0 0.2 0.4 0.6 0.8 1.0 1.2 1.4 1.6 1.8 2.0],
:labels
["0.0"
"0.2"
"0.4"
"0.6"
"0.8"
"1.0"
"1.2"
"1.4"
"1.6"
"1.8"
"2.0"],
:categorical? false},
:row 0}],
:width 600,
:height 400,
:caption nil,
:total-width 600.0,
:legend-position :right,
:layout-type :single,
:layout
{:subtitle-pad 0,
:legend-w 0,
:caption-pad 0,
:y-label-pad 0,
:legend-h 0,
:title-pad 0,
:strip-h 0,
:x-label-pad 18,
:strip-w 0},
:grid {:rows 1, :cols 1},
:legend nil,
:panel-height 400.0,
:title nil,
:y-label nil,
:alpha-legend nil,
:x-label "x",
:subtitle nil,
:panel-width 600.0,
:size-legend nil,
:total-height 418.0,
:margin 30}

hist-views

The layer mark is :bar, inferred because a single numerical column means histogram. The layer data contains :bars with :lo, :hi, and :count, the result of the :bin stat.
A single categorical column:
(def count-views
(-> animals
(sk/view :animal)))

(sk/plan count-views)

{:panels
[{:coord :cartesian,
:y-domain [-0.05 1.05],
:x-scale {:type :linear},
:x-domain ["cat" "dog" "bird" "fish"],
:x-ticks
{:values ["cat" "dog" "bird" "fish"],
:labels ["cat" "dog" "bird" "fish"],
:categorical? true},
:col 0,
:layers
[{:mark :rect,
:style {:opacity 0.85},
:position :dodge,
:categories ["cat" "dog" "bird" "fish"],
:groups
[{:color [0.2 0.2 0.2 1.0],
:label "",
:counts
[{:category "cat", :count 1}
{:category "dog", :count 1}
{:category "bird", :count 1}
{:category "fish", :count 1}],
:dodge-idx 0}],
:y-domain [0 1],
:x-domain ("cat" "dog" "bird" "fish"),
:dodge-ctx {:n-groups 1}}],
:y-scale {:type :linear},
:y-ticks
{:values [-0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0],
:labels
["0.0"
"0.1"
"0.2"
"0.3"
"0.4"
"0.5"
"0.6"
"0.7"
"0.8"
"0.9"
"1.0"],
:categorical? false},
:row 0}],
:width 600,
:height 400,
:caption nil,
:total-width 600.0,
:legend-position :right,
:layout-type :single,
:layout
{:subtitle-pad 0,
:legend-w 0,
:caption-pad 0,
:y-label-pad 0,
:legend-h 0,
:title-pad 0,
:strip-h 0,
:x-label-pad 18,
:strip-w 0},
:grid {:rows 1, :cols 1},
:legend nil,
:panel-height 400.0,
:title nil,
:y-label nil,
:alpha-legend nil,
:x-label "animal",
:subtitle nil,
:panel-width 600.0,
:size-legend nil,
:total-height 418.0,
:margin 30}

count-views

Mark is :rect with :counts: the :count stat tallied each of the 4 categories.
Mixed column types (categorical x, numerical y) default to :point:
(let [pl (-> {:species ["a" "b" "c"] :val [10 20 15]}
(sk/view :species :val)
sk/plan)
layer (first (:layers (first (:panels pl))))]
(:mark layer))

:point

Domain Inference
Numerical domains extend 5% beyond the data range so points aren't clipped at the edges. Internally, pad-domain in scale.clj computes this padding.
(let [pl (sk/plan scatter-views)
p (first (:panels pl))]
{:x-domain (:x-domain p)
:data-range [1.0 5.0]
:padding-each-side (* 0.05 (- 5.0 1.0))})

{:x-domain [0.8 5.2], :data-range [1.0 5.0], :padding-each-side 0.2}

The domain [0.8, 5.2] = data range [1.0, 5.0] ± 0.2 (5% of 4.0).
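The padding arithmetic is small enough to write out. This is a sketch of the rule as described, not the source of the library's pad-domain; the optional `fraction` argument is an assumption added for illustration.

```clojure
;; Hypothetical sketch of 5% padding on each side of a numerical domain.
(defn pad-domain
  ([domain] (pad-domain domain 0.05))
  ([[lo hi] fraction]
   (let [pad (* fraction (- hi lo))]
     [(- lo pad) (+ hi pad)])))

(pad-domain [1.0 5.0]) ;; approximately [0.8 5.2], the x-domain above
(pad-domain [2.1 5.2]) ;; approximately [1.945 5.355], the y-domain above
```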
Special domain rules apply in certain contexts:
Bar chart y-domains always include zero:
(let [pl (sk/plan bar-views)
p (first (:panels pl))]
{:y-domain (:y-domain p)})

{:y-domain [-0.75 15.75]}

Percentage-filled layers normalize the y-domain to [0.0, 1.0]:
(let [fill-pl (-> {:x ["a" "a" "b" "b"]
:g ["m" "n" "m" "n"]}
(sk/lay-stacked-bar-fill :x {:color :g})
sk/plan)
p (first (:panels fill-pl))]
(:y-domain p))

[0.0 1.0]

The y-domain is exactly [0.0, 1.0]: each category sums to 100%.
Multi-layer plots merge domains across layers; see "Multi-Layer Plans" below.
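Two of the special cases above, zero anchoring and merging across layers, can be sketched for numerical domains as follows. Both helper names are hypothetical, and the sketch ignores categorical domains.

```clojure
;; Hypothetical helpers for numerical [lo hi] domains.

;; Stretch a domain so that it contains zero (bar charts).
(defn include-zero [[lo hi]]
  [(min 0 lo) (max 0 hi)])

;; Smallest domain covering every layer's domain.
(defn merge-domains [domains]
  [(apply min (map first domains))
   (apply max (map second domains))])

(include-zero [5 15])                 ;; => [0 15]
(merge-domains [[1.0 5.0] [0.5 4.0]]) ;; => [0.5 5.0]
```

Padding [0 15] by 5% of its span (0.75) gives [-0.75 15.75], which is the panel :y-domain reported for bar-views.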
Tick Inference
Once domains are computed, Napkinsketch selects "nice" round tick values. The logic depends on the scale type:
- Linear: wadogo selects ticks at round intervals (1, 2, 2.5, 5, …)
- Log: ggplot2-style 1-2-5 nice numbers, using powers of 10 when they give at least 3 ticks and otherwise intermediates at 1-2-5 or 1-2-3-5 multiples per decade
- Categorical: a tick at each category, in order of appearance
- Temporal: calendar-aware snapping (year, month, day, hour) with adaptive formatting
Linear ticks for the scatter example:
(let [pl (sk/plan scatter-views)
p (first (:panels pl))]
{:x-tick-values (:values (:x-ticks p))
:x-tick-labels (:labels (:x-ticks p))})

{:x-tick-values [1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0],
:x-tick-labels ["1.0" "1.5" "2.0" "2.5" "3.0" "3.5" "4.0" "4.5" "5.0"]}

Nine ticks from 1.0 to 5.0 at 0.5 intervals: round and readable.
Log ticks for a multi-decade range:
(let [pl (-> {:x [0.1 1.0 10.0 100.0 1000.0]
:y [5 10 15 20 25]}
(sk/lay-point :x :y)
(sk/scale :x :log)
sk/plan)
p (first (:panels pl))]
{:tick-values (:values (:x-ticks p))
:tick-labels (:labels (:x-ticks p))})

{:tick-values [0.1 1.0 10.0 100.0 1000.0],
:tick-labels ["0.1" "1" "10" "100" "1000"]}

Five ticks at exact powers of 10, with no irrational intermediates. Whole numbers display without decimals; sub-1 values use minimal decimal places.
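The "powers of 10" case can be sketched on its own. This is only a partial illustration: `decade-ticks` is a hypothetical helper, it does not model the 1-2-5 intermediate rule, and the small epsilon is an assumption added to absorb floating-point error in the logarithm.

```clojure
;; Hypothetical sketch: ticks at the powers of 10 inside [lo hi].
;; lo must be positive. Intermediate 1-2-5 ticks are not modeled.
(defn decade-ticks [lo hi]
  (let [eps   1e-9
        e-min (long (Math/ceil  (- (Math/log10 (double lo)) eps)))
        e-max (long (Math/floor (+ (Math/log10 (double hi)) eps)))]
    (mapv #(Math/pow 10.0 (double %)) (range e-min (inc e-max)))))

(count (decade-ticks 0.1 1000.0)) ;; => 5
```

For the range 0.1 to 1000 the exponents run from -1 to 3, which is where the five ticks in the output above come from.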
Categorical ticks match domain order:
(let [pl (sk/plan bar-views)
p (first (:panels pl))]
(:values (:x-ticks p)))

["cat" "dog" "bird" "fish"]

Axis Label Inference
Labels come from column names. Underscores and hyphens become spaces. Internally, resolve-labels in plan.clj handles this.
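The naming rule is a one-line string transformation. A minimal sketch, where `column->label` is a hypothetical helper and not the library's resolve-labels:

```clojure
(require '[clojure.string :as str])

;; Hypothetical helper: column keyword or string to a display label.
(defn column->label [col]
  (str/replace (name col) #"[_-]" " "))

(column->label :sepal_length) ;; => "sepal length"
(column->label :petal-width)  ;; => "petal width"
```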
(def iris data/iris)

(let [pl (-> iris
(sk/lay-point :sepal_length :sepal_width)
sk/plan)]
{:x-label (:x-label pl)
:y-label (:y-label pl)})

{:x-label "sepal length", :y-label "sepal width"}

When only one column is specified, the y-axis shows computed counts. The system omits the y-label since it would repeat the column name:
(let [pl (-> five-points (sk/view :x) sk/plan)]
{:x-label (:x-label pl)
:y-label (:y-label pl)})

{:x-label "x", :y-label nil}

Explicit labels override inference:
(let [pl (-> five-points
(sk/lay-point :x :y)
(sk/options {:x-label "Length (cm)" :y-label "Width (cm)"})
sk/plan)]
{:x-label (:x-label pl)
:y-label (:y-label pl)})

{:x-label "Length (cm)", :y-label "Width (cm)"}

Legend Inference
A legend appears when a column is mapped to color. Internally, build-legend in plan.clj constructs the legend from the collected color information. Three cases:
Categorical color → discrete legend with one entry per category:
(:legend (sk/plan colored-views))

{:title :g,
:entries
[{:label "a",
:color
[0.8941176470588236 0.10196078431372549 0.10980392156862745 1.0]}
{:label "b",
:color
[0.21568627450980393 0.49411764705882355 0.7215686274509804 1.0]}]}

Title is the column name. Each entry has a :label and :color (RGBA).
No color mapping → no legend:
(:legend (sk/plan scatter-views))

nil

Fixed color string → no legend:
(:legend (sk/plan fixed-color-views))

nil

Numeric color → continuous legend (gradient bar):
(:legend (-> {:x [1 2 3] :y [4 5 6] :val [10 20 30]}
(sk/lay-point :x :y {:color :val})
sk/plan))

{:title :val,
:type :continuous,
:min 10,
:max 30,
:color-scale nil,
:stops
[{:t 0.0,
:color
[0.07450980392156863 0.16862745098039217 0.2627450980392157 1.0]}
{:t 0.05263157894736842,
:color
[0.08833849329205366 0.19628482972136224 0.2998968008255934 1.0]}
{:t 0.10526315789473684,
:color
[0.1021671826625387 0.22394220846233232 0.3370485036119711 1.0]}
{:t 0.15789473684210525,
:color
[0.11599587203302374 0.2515995872033024 0.3742002063983488 1.0]}
{:t 0.21052631578947367,
:color
[0.1298245614035088 0.27925696594427246 0.4113519091847265 1.0]}
{:t 0.2631578947368421,
:color
[0.14365325077399382 0.30691434468524253 0.4485036119711042 1.0]}
{:t 0.3157894736842105,
:color
[0.15748194014447883 0.33457172342621255 0.4856553147574819 1.0]}
{:t 0.3684210526315789,
:color
[0.17131062951496387 0.3622291021671826 0.5228070175438597 1.0]}
{:t 0.42105263157894735,
:color
[0.1851393188854489 0.3898864809081527 0.5599587203302373 1.0]}
{:t 0.47368421052631576,
:color
[0.19896800825593394 0.41754385964912283 0.597110423116615 1.0]}
{:t 0.5263157894736842,
:color
[0.21279669762641898 0.4452012383900929 0.6342621259029928 1.0]}
{:t 0.5789473684210527,
:color
[0.22662538699690402 0.472858617131063 0.6714138286893705 1.0]}
{:t 0.631578947368421,
:color
[0.24045407636738905 0.500515995872033 0.7085655314757482 1.0]}
{:t 0.6842105263157895,
:color
[0.25428276573787406 0.5281733746130031 0.7457172342621259 1.0]}
{:t 0.7368421052631579,
:color
[0.2681114551083591 0.5558307533539731 0.7828689370485036 1.0]}
{:t 0.7894736842105263,
:color
[0.28194014447884413 0.5834881320949432 0.8200206398348814 1.0]}
{:t 0.8421052631578947,
:color
[0.29576883384932917 0.6111455108359133 0.857172342621259 1.0]}
{:t 0.8947368421052632,
:color
[0.3095975232198142 0.6388028895768834 0.8943240454076368 1.0]}
{:t 0.9473684210526315,
:color
[0.3234262125902993 0.6664602683178534 0.9314757481940144 1.0]}
{:t 1.0,
:color
[0.33725490196078434 0.6941176470588235 0.9686274509803922 1.0]}]}

Size Legend
When :size maps to a numerical column, a size legend shows graduated circles spanning the data range. Internally, build-size-legend in plan.clj generates five entries with proportional radii.
(:size-legend (-> {:x [1 2 3 4 5] :y [1 2 3 4 5] :s [10 20 30 40 50]}
(sk/lay-point :x :y {:size :s})
sk/plan))

{:title :s,
:type :size,
:min 10,
:max 50,
:entries
[{:value 10.0, :radius 2.0}
{:value 20.0, :radius 3.5}
{:value 30.0, :radius 5.0}
{:value 40.0, :radius 6.5}
{:value 50.0, :radius 8.0}]}

Each entry has a :value and :radius. No size mapping → no size legend:
(:size-legend (sk/plan scatter-views))

nil

Alpha Legend
When :alpha maps to a numerical column, an alpha legend shows graduated opacity squares. Internally, build-alpha-legend in plan.clj generates five entries with proportional opacity.
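"Proportional" entries can be pictured as a linear rescale from the data range to a visual range. The sketch below assumes plain linear interpolation; `legend-entries` is a hypothetical helper, and the output range [2.0 8.0] is read off the size legend shown above rather than taken from the library's source. The same computation would apply to opacity with a different output range.

```clojure
;; Hypothetical sketch: n legend entries, with values evenly spaced over
;; the data range and outputs linearly rescaled into [out-lo out-hi].
(defn legend-entries [[lo hi] [out-lo out-hi] n]
  (mapv (fn [i]
          (let [t (/ i (double (dec n)))]
            {:value  (+ lo (* t (- hi lo)))
             :output (+ out-lo (* t (- out-hi out-lo)))}))
        (range n)))

(mapv :output (legend-entries [10 50] [2.0 8.0] 5))
;; => [2.0 3.5 5.0 6.5 8.0]
```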
(:alpha-legend (-> {:x [1 2 3 4 5] :y [1 2 3 4 5] :a [0.1 0.3 0.5 0.7 0.9]}
(sk/lay-point :x :y {:alpha :a})
sk/plan))

{:title :a,
:type :alpha,
:min 0.1,
:max 0.9,
:entries
[{:value 0.1, :alpha 0.2}
{:value 0.3, :alpha 0.4}
{:value 0.5, :alpha 0.6000000000000001}
{:value 0.7, :alpha 0.8}
{:value 0.9, :alpha 1.0}]}

No alpha mapping → no alpha legend:
(:alpha-legend (sk/plan scatter-views))

nil

Layout Inference
The :layout map adjusts padding based on what elements are present. Internally, compute-layout-dims in plan.clj calculates the space needed for titles, labels, and legends.
Compare a bare plot to one with a title and a legend:
(let [bare (sk/plan scatter-views)
full (-> {:x [1 2 3 4 5 6]
:y [3 5 4 7 6 8]
:g ["a" "a" "a" "b" "b" "b"]}
(sk/lay-point :x :y {:color :g})
(sk/options {:title "My Plot"})
sk/plan)]
{:bare-title-pad (get-in bare [:layout :title-pad])
:full-title-pad (get-in full [:layout :title-pad])
:bare-legend-w (get-in bare [:layout :legend-w])
:full-legend-w (get-in full [:layout :legend-w])})

{:bare-title-pad 0,
:full-title-pad 18,
:bare-legend-w 0,
:full-legend-w 100}

The bare plot has zero title padding and zero legend width. The full plot adds padding for the title and 100 pixels for the legend.
Layout type is also inferred from the view structure:
- Single panel → :single
- Facet grid (:facet-row or :facet-col) → :facet-grid
- Multiple x-y pairs (scatter plot matrix) → :multi-variable
(let [pl (sk/plan scatter-views)]
(:layout-type pl))

:single

Coordinate Flipping
Setting :coord :flip swaps axes in the plan. The layer data stays the same; the panel-level domains and ticks are swapped. Internally, make-coord in coord.clj handles the transformation.
(def normal-pl
(-> animals
(sk/lay-value-bar :animal :count)
sk/plan))

(def flip-pl
(-> animals
(sk/lay-value-bar :animal :count)
(sk/coord :flip)
sk/plan))

(let [np (first (:panels normal-pl))
fp (first (:panels flip-pl))]
{:normal {:x-categorical? (:categorical? (:x-ticks np))
:y-categorical? (:categorical? (:y-ticks np))}
:flipped {:x-categorical? (:categorical? (:x-ticks fp))
:y-categorical? (:categorical? (:y-ticks fp))}})

{:normal {:x-categorical? true, :y-categorical? false},
:flipped {:x-categorical? false, :y-categorical? true}}

(-> animals
(sk/lay-value-bar :animal :count)
(sk/coord :flip))

The categorical axis moved from x to y.
Labels are also swapped: the x-label and y-label follow their visual axis, not the data axis:
(let [pl (-> five-points
(sk/lay-point :x :y)
(sk/coord :flip)
sk/plan)]
{:x-label (:x-label pl)
:y-label (:y-label pl)})

{:x-label "y", :y-label "x"}

After flipping, the visual x-axis shows "y" and the visual y-axis shows "x": labels track the visual axes.
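The panel-level swap can be sketched as exchanging the x and y entries of a map. `flip-panel` is a hypothetical helper for illustration; it is not make-coord, and it only touches the domain and tick keys seen in the plans above.

```clojure
;; Hypothetical sketch: swap the x and y domains and ticks of a panel map,
;; leaving every other key (including the layers) untouched.
(defn flip-panel [panel]
  (merge panel
         {:x-domain (:y-domain panel)
          :y-domain (:x-domain panel)
          :x-ticks  (:y-ticks panel)
          :y-ticks  (:x-ticks panel)}))

(def panel
  {:x-domain ["cat" "dog"] :x-ticks {:categorical? true}
   :y-domain [0 15]        :y-ticks {:categorical? false}
   :layers   [:unchanged]})

(get-in (flip-panel panel) [:y-ticks :categorical?]) ;; => true
```

Applying the swap twice returns the original panel, which is a quick sanity check that nothing is lost in the exchange.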
Multi-Layer Plans
When multiple layers share a panel, their domains are merged:
(def multi-views
(-> five-points
(sk/view :x :y)
sk/lay-point
sk/lay-lm))

(sk/plan multi-views)

{:panels
[{:coord :cartesian,
:y-domain [1.945 5.355],
:x-scale {:type :linear},
:x-domain [0.8 5.2],
:x-ticks
{:values [1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0],
:labels ["1.0" "1.5" "2.0" "2.5" "3.0" "3.5" "4.0" "4.5" "5.0"],
:categorical? false},
:col 0,
:layers
[{:mark :point,
:style {:opacity 0.75, :radius 3.0},
:groups
[{:color [0.2 0.2 0.2 1.0], :xs #tech.v3.dataset.column<float64>[5]
:x
[1.000, 2.000, 3.000, 4.000, 5.000],
:ys #tech.v3.dataset.column<float64>[5]
:y
[2.100, 4.300, 3.000, 5.200, 4.800],
:row-indices #tech.v3.dataset.column<int64>[5]
:__row-idx
[0, 1, 2, 3, 4]}],
:y-domain [2.1 5.2],
:x-domain [1.0 5.0]}
{:mark :line,
:style {:stroke-width 2.5},
:groups
[{:color [0.2 0.2 0.2 1.0],
:label "",
:x1 1.0,
:y1 2.6200000000000014,
:x2 5.0,
:y2 5.139999999999999}],
:y-domain [2.1 5.2],
:x-domain [1.0 5.0]}],
:y-scale {:type :linear},
:y-ticks
{:values [2.0 2.5 3.0 3.5 4.0 4.5 5.0],
:labels ["2.0" "2.5" "3.0" "3.5" "4.0" "4.5" "5.0"],
:categorical? false},
:row 0}],
:width 600,
:height 400,
:caption nil,
:total-width 622.5,
:legend-position :right,
:layout-type :single,
:layout
{:subtitle-pad 0,
:legend-w 0,
:caption-pad 0,
:y-label-pad 22.5,
:legend-h 0,
:title-pad 0,
:strip-h 0,
:x-label-pad 18,
:strip-w 0},
:grid {:rows 1, :cols 1},
:legend nil,
:panel-height 400.0,
:title nil,
:y-label "y",
:alpha-legend nil,
:x-label "x",
:subtitle nil,
:panel-width 600.0,
:size-legend nil,
:total-height 418.0,
:margin 30}

multi-views

Two layers, one :point and one :line, sharing the same domain. The :line layer has :mark :line and its single group contains :x1, :y1, :x2, and :y2, the endpoints of the regression line.
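The endpoint values in the :line layer can be checked by hand with ordinary least squares. The sketch below is a textbook fit evaluated at the ends of the x range; `lm-endpoints` is a hypothetical helper, and the library's own lm stat is not shown in this notebook.

```clojure
;; Ordinary least squares for y = a + b*x, evaluated at min and max x.
;; A sketch for checking the numbers above, not the library's code.
(defn lm-endpoints [xs ys]
  (let [n     (count xs)
        mx    (/ (reduce + xs) n)
        my    (/ (reduce + ys) n)
        sxy   (reduce + (map #(* (- %1 mx) (- %2 my)) xs ys))
        sxx   (reduce + (map #(* (- % mx) (- % mx)) xs))
        slope (/ sxy sxx)
        icpt  (- my (* slope mx))
        x1    (apply min xs)
        x2    (apply max xs)]
    {:x1 x1 :y1 (+ icpt (* slope x1))
     :x2 x2 :y2 (+ icpt (* slope x2))}))

(lm-endpoints [1.0 2.0 3.0 4.0 5.0] [2.1 4.3 3.0 5.2 4.8])
;; :y1 is approximately 2.62 and :y2 approximately 5.14,
;; in line with the :line layer in the plan above
```

For this data the slope works out to 0.63 and the intercept to 1.99, so the line runs from about (1.0, 2.62) to (5.0, 5.14).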
Resolution Overview
All of the inference rules above feed into views->plan, which orchestrates a resolution pipeline. The diagram below shows the key steps and their data dependencies:
[Diagram: resolution pipeline. Each step is listed with the steps it depends on.]
- Column Types (infer-column-types): from Views
- Aesthetics (resolve-aesthetics): from Views
- Grouping (infer-grouping): from Column Types and Aesthetics
- Method (infer-method): from Column Types
- Statistics (compute-stat): from Grouping and Method
- Domains (collect-domain + pad-domain): from Statistics
- Ticks (compute-ticks): from Domains
- Labels (resolve-labels): from Views
- Color Legend (build-legend): from Aesthetics
- Size Legend (build-size-legend): from Aesthetics
- Alpha Legend (build-alpha-legend): from Aesthetics
- Layout (compute-layout-dims): from Domains, Labels, Color Legend, Size Legend, and Alpha Legend
- Plan: from Domains, Ticks, Labels, Color Legend, Size Legend, Alpha Legend, Layout, and Statistics
Each box corresponds to a named function in the codebase. The top four boxes (Column Types, Aesthetics, Grouping, and Method) are the per-view inference steps (in view.clj). The remaining boxes are the plan-level orchestration steps (in plan.clj and scale.clj).
Summary
Every inference can be overridden. Here is the complete list:
| What is inferred | Default | Override |
|---|---|---|
| Column selection | 1 → x; 2 → x, y; 3 → x, y, color | explicit column args in sk/view or sk/lay-* |
| Column type | dtype inspection | :x-type, :y-type, :color-type in view options |
| Aesthetic classification | keyword = column, string = color/column | explicit :color keyword vs hex string |
| Grouping | categorical color column | :group aesthetic |
| Method (mark + stat) | column types (see table above) | sk/lay-point, sk/lay-histogram, etc. |
| Domain extent | data range + 5% padding | (sk/scale views :x {:domain [0 10]}) |
| Domain zero-anchor | bar/stacked charts include zero | (sk/scale views :y {:domain [5 20]}) |
| Fill domain | [0.0, 1.0] for fill position | (sk/scale views :y {:domain [0 2]}) |
| Tick values | round intervals (linear), powers of 10 (log) | wadogo scale configuration |
| Tick labels | number formatting, calendar formatting | wadogo label formatting |
| Axis labels | column name, underscores → spaces | (sk/options {:x-label "Custom"}) |
| Color legend | categorical = discrete, numerical = continuous, none = no legend | :color mapping controls presence |
| Size legend | 5 graduated circles when :size maps to numerical column | :size mapping controls presence |
| Alpha legend | 5 graduated opacity squares when :alpha maps to numerical column | :alpha mapping controls presence |
| Layout padding | adjusts for title, labels, legend | :width, :height in options |
| Layout type | single, facet-grid, multi-variable | sk/facet, multiple x-y pairs |
| Coordinate system | :cartesian | (sk/coord :flip), (sk/coord :polar) |
The plan captures the result of all inference. When in doubt, look at the plan.
What's Next
- Methods: the full registry of marks, stats, and positions that inference selects from
- Scatter Plots: see inference in action with the most common chart type