23  Troubleshooting

Common mistakes and how to fix them.

(ns plotje-book.troubleshooting
  (:require
   ;; Rdatasets -- standard datasets
   [scicloj.metamorph.ml.rdatasets :as rdatasets]
   ;; Kindly -- notebook rendering protocol
   [scicloj.kindly.v4.kind :as kind]
   ;; Tablecloth -- dataset manipulation
   [tablecloth.api :as tc]
   ;; Plotje -- composable plotting
   [scicloj.plotje.api :as pj]))

Column Not Found

Symptom: "Column :foo (from :x) not found in dataset" error, listing the available columns at the end of the message.

Cause: The column reference does not match a dataset column name. Matching is strict – :foo matches keyword column :foo only, and "foo" matches string column "foo" only. The two forms do not interchange. Three common triggers:

1. Typo. A misspelled column name. Always check the spelling against the dataset’s actual columns:

(tc/column-names (rdatasets/datasets-iris))
(:rownames
 :sepal-length
 :sepal-width
 :petal-length
 :petal-width
 :species)

2. Keyword vs string. A CSV loaded without :key-fn keyword produces string column names; using a keyword reference against that dataset throws:

(try
  (-> (tc/dataset {"sepal_length" [5.0 6.0] "sepal_width" [3.0 3.5]})
      (pj/pose :sepal_length :sepal_width)
      pj/lay-point pj/plot)
  (catch clojure.lang.ExceptionInfo e (ex-message e)))
"Column :sepal_width (from :y) not found in dataset. Available: (\"sepal_length\" \"sepal_width\")"

The fix is to either pass {:key-fn keyword} when loading the CSV (so the dataset has keyword columns) or to use string references everywhere.

3. Whitespace or punctuation mismatch. A column literally named "sepal length" (with a space) does not match :sepal-length (with a hyphen):

(try
  (-> (tc/dataset {"sepal length" [5.0 6.0] "sepal width" [3.0 3.5]})
      (pj/pose :sepal-length :sepal-width)
      pj/lay-point pj/plot)
  (catch clojure.lang.ExceptionInfo e (ex-message e)))
"Column :sepal-width (from :y) not found in dataset. Available: (\"sepal length\" \"sepal width\")"

Note that :key-fn keyword on "sepal length" produces :sepal length – a keyword whose printed form contains a space, not the hyphenated form a Clojure reader would normally produce. Spaces and other special characters in CSV headers usually need a custom :key-fn, e.g. (comp keyword #(clojure.string/replace % " " "-")).

Wrong Chart Type from Inference

Symptom: pj/pose produces a chart type that isn’t what you wanted – a boxplot when you wanted individual points, a line when you wanted a scatter.

Cause: pj/pose infers the layer type from column types. The defaults fit the most common use case for each column-type pair (see Inference Rules), but they can be overridden.

Fix: Use an explicit pj/lay-* function. For example, a categorical x with a numerical y defaults to a boxplot:

(-> (rdatasets/datasets-iris)
    (pj/pose :species :sepal-width))
sepal widthspeciessetosaversicolorvirginica2.02.53.03.54.04.5

Use pj/lay-point if you want the individual points instead:

(-> (rdatasets/datasets-iris)
    (pj/lay-point :species :sepal-width))
sepal widthspeciessetosaversicolorvirginica2.02.53.03.54.04.5

Numeric IDs Treated as Continuous Color

Symptom: You color by a subject/group ID column that contains numbers (e.g., 1, 2, 3), but instead of discrete colored groups you get a single continuous gradient.

Cause: The inference system sees a numeric column and treats it as continuous. Continuous color means no grouping – all data stays in one group with a gradient legend.

(def subject-scores
  {:day     [1 2 3 4 1 2 3 4 1 2 3 4]
   :score   [3 5 4 6 6 7 5 8 8 9 7 10]
   :subject [1 1 1 1 2 2 2 2 3 3 3 3]})

Gradient (wrong for IDs) – one line for the whole dataset, with the color sampled from the gradient legend. Plotje also prints a warning at the REPL pointing at the fix:

(-> subject-scores
    (pj/lay-line :day :score {:color :subject}))
scoredaysubject1.0003.0001.01.52.02.53.03.54.0345678910

Fix: Add :color-type :categorical to override the inference – three discrete groups, one line per subject:

(-> subject-scores
    (pj/lay-line :day :score {:color :subject :color-type :categorical}))
scoredaysubject1231.01.52.02.53.03.54.0345678910

See Inference Rules for the full mechanism.

Numeric Column Rejected by a Categorical-Axis Mark

Symptom: An error like "Mark :rect (lay-value-bar) requires a categorical column for :x, but :hour is numerical", or the equivalent for :boxplot, :violin, :lollipop, or similar marks that need a categorical axis.

Cause: The column you passed (e.g., hour of day, year, subject ID) contains numbers, so column-type inference classifies it as :numerical. The mark needs :categorical.

A bar chart of hourly counts hits this – :hour looks like integers, so it is inferred numerical:

(try
  (-> {:hour [9 10 11 12] :count [5 8 12 7]}
      (pj/lay-value-bar :hour :count)
      pj/plan)
  (catch clojure.lang.ExceptionInfo e (ex-message e)))
"Mark :rect (lay-value-bar) requires a categorical column for :x, but :hour is numerical. Use lay-line/lay-point for numeric x, or convert :hour to a string column."

Fix: Add :x-type :categorical (or :y-type :categorical for horizontal layouts) to override the inferred type. No need to convert the column itself:

(-> {:hour [9 10 11 12] :count [5 8 12 7]}
    (pj/lay-value-bar :hour :count {:x-type :categorical}))
counthour9101112024681012

The override propagates into infer-column-types, so every downstream step (scale type, tick placement, domain) treats :hour as categorical. The same switch works for :y-type when a numeric column is on the y axis of a horizontal boxplot or similar layout. See Inference Rules for a worked example.

Log Scale via :scale-x / :scale-y Options

Symptom: Passing {:scale-x :log} (or {:scale-y :log}) to a layer or to pj/options prints a warning – "does not recognize option(s): [:scale-x]" – and the chart comes out on a linear axis.

Cause: Scales are plot-level, not layer-level or option-map keys. They are set by the pj/scale function, not by a :scale-* key.

The wrong form does not throw; it warns and silently falls back to a linear axis:

(with-out-str
  (-> (rdatasets/ggplot2-diamonds)
      (pj/lay-point :carat :price {:scale-y :log})
      pj/plan))
"Warning: lay-point does not recognize option(s): [:scale-y]. Accepted: [:alpha :color :color-type :data :group :jitter :mark :nudge-x :nudge-y :position :shape :size :stat :text :x :x-type :y :y-type]\n"

Fix: Use pj/scale:

(-> (rdatasets/ggplot2-diamonds)
    (pj/lay-point :carat :price {:alpha 0.1})
    (pj/scale :y :log))
pricecarat0.00.51.01.52.02.53.03.54.04.55.030050010002000300050001000020000

pj/scale takes the pose, the axis (:x or :y), and either a type keyword (:linear, :log) or a scale spec map with :type and an optional :domain override. See the Inference Rules chapter for how scale types and domains interact with column inference.

x-Only Layer Types Do Not Accept a y Column

Symptom: "lay-histogram uses only the x column; do not pass a y column" error.

Cause: Histogram, bar, density, and rug layer types use only the x column. Passing a y column is an error.

(try
  (-> (rdatasets/datasets-iris)
      (pj/lay-histogram :sepal-length :sepal-width)
      pj/plan)
  (catch clojure.lang.ExceptionInfo e (ex-message e)))
"lay-histogram uses only the x column; do not pass a y column"

Fix: Remove the y column:

(-> (rdatasets/datasets-iris)
    (pj/lay-histogram :sepal-length))
sepal length4.55.05.56.06.57.07.58.00510152025

Categorical Column with Log Scale

Symptom: "Log scale requires numeric data" error.

Cause: Log scales only work with numerical columns. Categorical columns (strings, keywords) have no meaningful log transform.

(try
  (-> (rdatasets/datasets-iris)
      (pj/lay-bar :species)
      (pj/scale :x :log)
      pj/plot)
  (catch clojure.lang.ExceptionInfo e (ex-message e)))
"Log scale on :x requires numeric data, but column :species is non-numeric."

Fix: Use a numerical column for the log-scaled axis, or drop the log scale on the categorical axis.

Polar Coordinates with Unsupported Marks

Symptom: "Mark :line is not supported with polar coordinates. Supported polar marks: (:bar :point :rect :rug :text)" (or the same message for :area and other unsupported marks).

Cause: Polar coordinates currently support a subset of marks: :bar, :point, :rect, :rug, and :text. Layer types built on these marks (such as :value-bar and :histogram, which both render as bars) work too.

(try
  (-> {:x [1 2 3 4 5] :y [2 4 3 5 4]}
      (pj/lay-line :x :y)
      (pj/coord :polar)
      pj/plan)
  (catch clojure.lang.ExceptionInfo e (ex-message e)))
"Mark :line is not supported with polar coordinates. Supported polar marks: (:bar :point :rect :rug :text)"

Fix for now: Use a supported mark. A bar chart flipped to polar becomes a rose chart:

(-> (rdatasets/datasets-chickwts)
    (pj/pose :feed)
    pj/lay-bar
    (pj/coord :polar))

Support for :line, :area, and other marks in polar is planned. See the Polar Coordinates chapter for the full set of currently supported marks and examples.

Tooltip and Brush Not Working

Symptom: You set {:tooltip true} but no tooltip appears when hovering over points.

Cause: Tooltip and brush interactivity use JavaScript that requires a compatible notebook viewer. Static HTML export or some viewers may not support it.

Fix: Use Clay or another Kindly-compatible tool that supports kind/hiccup with embedded scripts.

(-> (rdatasets/datasets-iris)
    (pj/lay-point :sepal-length :sepal-width {:color :species})
    (pj/options {:tooltip true}))
sepal widthsepal lengthspeciessetosaversicolorvirginica4.55.05.56.06.57.07.58.02.02.53.03.54.04.5

Faceting Keys in a Layer’s Options Map

Symptom: An error like "Faceting is plot-level, not layer-level. Use (pj/facet pose col) ..." when you put :facet-col, :facet-row, :facet-x, or :facet-y inside a pj/lay-* options map.

Cause: Faceting configures the plot as a whole, not a single layer. Putting these keys in a layer’s options map is rejected with a guidance message.

(try
  (-> (rdatasets/datasets-iris)
      (pj/pose :sepal-length :sepal-width)
      (pj/lay-point {:facet-col :species})
      pj/plan)
  (catch clojure.lang.ExceptionInfo e (ex-message e)))
"Faceting is plot-level, not layer-level. Use (pj/facet pose col) or (pj/facet-grid pose col-col row-col) instead of putting facet-col in a layer's options map."

Fix: Use pj/facet (single-axis) or pj/facet-grid (two-axis) as a top-level step in the pipeline:

(-> (rdatasets/datasets-iris)
    (pj/lay-point :sepal-length :sepal-width)
    (pj/facet :species))
sepal widthsepal length682.02.53.03.54.04.56868setosaversicolorvirginica

Constant :x or :y in a Layer’s Options

Symptom: An error like "lay-text :y must be a column reference (keyword or string), but got 3.0", typically when adding a text or label layer at a fixed horizontal or vertical position. (Reference lines use pj/lay-rule-h with :y-intercept or pj/lay-rule-v with :x-intercept instead.)

Cause: :x and :y are position mappings – they must name a column that the stat can index into, not hold a scalar constant.

(try
  (-> (rdatasets/datasets-iris)
      (pj/lay-point :sepal-length :sepal-width)
      (pj/lay-text {:x :sepal-length :y 3.0 :text :species})
      pj/plan)
  (catch clojure.lang.ExceptionInfo e (ex-message e)))
"lay-text :y must be a column reference (keyword or string), but got 3.0. For a constant position, add a column to :data with that value, e.g. `(tc/add-column data :y (constantly 3.0))` and pass :y :y."

Fix: Provide a small one-row dataset via :data whose columns hold the constant values, then reference those columns:

(-> (rdatasets/datasets-iris)
    (pj/lay-point :sepal-length :sepal-width)
    (pj/lay-text {:data {:sepal-length [6.5] :species ["mean"] :yy [3.5]}
                  :x :sepal-length :y :yy :text :species}))
sepal widthsepal lengthmean4.55.05.56.06.57.07.58.02.02.53.03.54.04.5

Dataset Missing Columns a Template References

Symptom: An error like "Cannot attach data: pose references column(s) [:group] not present in the dataset. Available columns: [:x :y]" when calling pj/with-data on a dataless template pose.

Cause: pj/with-data validates at attach time – every keyword column reference in the template must exist in the dataset, or the attachment fails fast.

(def template
  (-> (pj/pose nil {:x :x :y :y :color :group})
      pj/lay-point))
(try
  (-> template
      (pj/with-data {:x [1 2 3] :y [4 5 6]}))
  (catch clojure.lang.ExceptionInfo e (ex-message e)))
"Cannot attach data: pose references column(s) [:group] not present in the dataset. Available columns: [:x :y]."

Fix: Either rename the dataset columns to match the template (tc/rename-columns), or adjust the template to reference the columns the dataset has.

(-> (pj/pose nil {:x :x :y :y})
    pj/lay-point
    (pj/with-data {:x [1 2 3] :y [4 5 6]}))
yx1.01.21.41.61.82.02.22.42.62.83.04.04.24.44.64.85.05.25.45.65.86.0

Horizontal Ranking Bars Draw Biggest-at-Bottom

Symptom: A horizontal bar chart made with (pj/coord :flip) shows the first row of the data at the bottom of the chart. A descending-sorted “top-N” dataset ends up with the biggest bar at the bottom instead of the top.

Cause: coord :flip draws categories bottom-to-top in the order they appear in the data (matching ggplot2’s coord_flip()).

Descending data plotted as-is – “A” (the biggest) renders at the bottom, not the top:

(-> [{:category "A" :value 100}
     {:category "B" :value 50}
     {:category "C" :value 25}]
    (tc/dataset)
    (pj/lay-value-bar :category :value)
    (pj/coord :flip))
categoryvalue0102030405060708090100ABC

Fix for now: Sort the dataset ascending before plotting – the ascending order shows up top-to-bottom on the flipped axis, so the biggest value lands at the top:

(-> [{:category "A" :value 100}
     {:category "B" :value 50}
     {:category "C" :value 25}]
    (tc/dataset)
    (tc/order-by [:value] :asc)
    (pj/lay-value-bar :category :value)
    (pj/coord :flip))
categoryvalue0102030405060708090100CBA

A future opt-in option (e.g. (pj/coord :flip {:reverse-categorical true})) would spare the sort dance. Tracked in CHANGELOG.md Known limitations.

Stacked Bars Reject Pre-Aggregated Counts

Symptom: "lay-bar uses only the x column; do not pass a y column" when you have already grouped and aggregated the data and want a stacked bar chart of the computed values.

Cause: pj/lay-bar {:position :stack} is count-only – it bins by x internally and sums counts. It has no mode that accepts a pre-computed y.

(try
  (-> {:x [1 2 3] :y [10 20 30] :group ["A" "B" "A"]}
      (pj/lay-bar :x :y {:position :stack :color :group})
      pj/plan)
  (catch clojure.lang.ExceptionInfo e (ex-message e)))
"lay-bar uses only the x column; do not pass a y column"

Fix for now: Either use (pj/lay-area ... {:position :stack}) on a numeric x (it accepts pre-aggregated y), or expand aggregated rows back into count-many duplicates so the count stat sums to the pre-aggregated value. A proper stacked value-bar is tracked in CHANGELOG.md Known limitations.

(-> {:x     (concat (range 5) (range 5))
     :y     [1  2  3  4  5  2  2  2  3  3]
     :group (concat (repeat 5 "A") (repeat 5 "B"))}
    (pj/lay-area :x :y {:position :stack :color :group}))
yxgroupAB0.00.51.01.52.02.53.03.54.0012345678

Dodge Has No Effect on Point Layers

Symptom: Adding :position :dodge to pj/lay-point (or other non-bar marks) does not spread points apart by group – the plot looks identical to the version without :position :dodge.

Cause: :position :dodge is implemented for bar-family marks (pj/lay-bar, pj/lay-value-bar). On point/line/jitter and several other marks the option is accepted but silently ignored.

The two plans below produce identical x-coordinates for the rendered points – :position :dodge is a no-op on points:

(def points-data
  {:x [1 1 2 2 3 3] :y [10 15 20 25 30 35] :group ["A" "B" "A" "B" "A" "B"]})
(defn point-xs [pose]
  (-> pose pj/plan :panels first :layers first :groups
      (->> (mapcat :xs) sort vec)))
(= (point-xs (-> points-data (pj/lay-point :x :y {:color :group})))
   (point-xs (-> points-data (pj/lay-point :x :y {:color :group :position :dodge}))))
true

Fix for now: For grouped categorical layouts use pj/lay-value-bar (or pj/lay-bar when binning a count); dodge works there. To distinguish overlapping points by group on a numeric x, encode the group with :color, :shape, or pre-compute small offsets in the data. A proper dodge for points is tracked in CHANGELOG.md Known limitations.

(-> {:cat   ["A" "A" "B" "B" "C" "C"]
     :y     [10 20 30 40 50 60]
     :group ["a" "b" "a" "b" "a" "b"]}
    (pj/lay-value-bar :cat :y {:color :group :position :dodge}))
ycatgroupabABC0102030405060

Polar Bar Chart Has No Category Labels

Symptom: A bar chart flipped to polar ((pj/coord :polar)) renders as a rose chart, but no category text appears anywhere around the wedges.

Cause: Polar coord does not currently emit angular tick labels for bar-family marks – the underlying axis machinery places labels along Cartesian axes that polar replaces with a circular layout, and the equivalent angular ticks are not yet implemented.

The polar version shows the wedges sized by category, but the category names are absent:

(-> (rdatasets/datasets-chickwts)
    (pj/pose :feed)
    pj/lay-bar
    (pj/coord :polar))

Fix for now: Drop (pj/coord :polar) for the labeled view, or combine the polar plot with a separate Cartesian-coord version for the legend. A proper rose-chart label pass is tracked in CHANGELOG.md Known limitations.

(-> (rdatasets/datasets-chickwts)
    (pj/pose :feed)
    pj/lay-bar)
feedhorsebeanlinseedsoybeansunflowermeatmealcasein02468101214

Heatmap with Categorical Axes

Symptom: "class java.lang.String cannot be cast to class java.lang.Number" when passing a string column to pj/lay-tile.

Cause: pj/lay-tile (and the underlying :bin2d stat) requires numeric x and y columns – the tile boundaries are numeric intervals. Categorical axes are not yet supported for tile.

(try
  (-> {:x ["a" "b" "c"] :y ["a" "b" "c"] :v [1 2 3]}
      (pj/lay-tile :x :y {:fill :v})
      pj/plan)
  (catch Throwable t (.getMessage t)))
"class java.lang.String cannot be cast to class java.lang.Number (java.lang.String and java.lang.Number are in module java.base of loader 'bootstrap')"

Fix: render a numeric-indexed grid (1-N integers in place of the categorical column) and pair :breaks with :labels on the axis so the tick text shows the original category names:

(-> (for [day (range 1 8) hour (range 0 24)]
      {:day day :hour hour :v (+ (* 0.3 (Math/sin (* 0.5 hour)))
                                 (* 0.2 (mod day 3)))})
    (pj/lay-tile :day :hour {:fill :v})
    (pj/scale :x {:type :linear
                  :breaks [1 2 3 4 5 6 7]
                  :labels ["Mon" "Tue" "Wed" "Thu" "Fri" "Sat" "Sun"]}))
hourdayfill-0.30000.6992MonTueWedThuFriSatSun05101520

If a true categorical axis (with binning over labels rather than numeric intervals) is what you need, that is tracked in CHANGELOG.md Known limitations. The integer-plus-:labels pattern above covers most heatmap-with-categorical-axis cases.

See Also

  • Core Concepts – the mapping and inference rules behind most of these symptoms

What’s Next

source: notebooks/plotje_book/troubleshooting.clj