3  Plotly walkthrough 👣

Tableplot offers a Clojure API for creating Plotly.js plots through layered pipelines.

The API uses Hanami templates, but its own templates and parameters are completely separate from the classical Hanami ones.

Here, we provide a walkthrough of that API.

See also the more detailed reference 📖.

3.1 Setup

For this tutorial, we require:

(ns tableplot-book.plotly-walkthrough
  (:require [scicloj.tableplot.v1.plotly :as plotly]
            [tablecloth.api :as tc]
            [tablecloth.column.api :as tcc]
            [tech.v3.datatype.datetime :as datetime]
            [tech.v3.dataset.print :as print]
            [scicloj.kindly.v4.kind :as kind]
            [clojure.string :as str]
            [scicloj.kindly.v4.api :as kindly]
            [tableplot-book.datasets :as datasets]
            [aerial.hanami.templates :as ht]))

3.2 Basic usage

Plotly plots are created by passing datasets to a pipeline of layer functions.

Additional parameters to the functions are passed as maps. Map keys begin with = (e.g., :=color).

For example, let us plot a scatterplot (a layer of points) of 10 random items from the Iris dataset.

(-> datasets/iris
    (tc/random 10 {:seed 1})
    (plotly/layer-point
     {:=x :sepal-width
      :=y :sepal-length
      :=color :species
      :=mark-size 20
      :=mark-opacity 0.6}))
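
Since the parameters are ordinary Clojure maps, they can be assembled with the usual map functions before being handed to a layer function. The following minimal sketch only builds such a map in plain Clojure; the names shared-params and point-params are hypothetical and not part of Tableplot.

```clojure
;; Hypothetical names, used for illustration only.
(def shared-params
  {:=x :sepal-width
   :=y :sepal-length})

;; merge keeps the shared keys and adds the layer-specific ones.
(def point-params
  (merge shared-params
         {:=color :species
          :=mark-size 20
          :=mark-opacity 0.6}))

point-params
;; => {:=x :sepal-width, :=y :sepal-length, :=color :species, :=mark-size 20, :=mark-opacity 0.6}
```

Passing point-params as the second argument of plotly/layer-point should give the same plot as the example above.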

3.3 Templates and parameters

(💡 You do not need to understand these details for basic usage.)

Technically, the parameter maps contain Hanami substitution keys, which means they are processed by a simple set of rules, but you do not need to understand what this means yet.
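
To give a rough feeling for what such substitution rules do, here is a toy sketch in plain Clojure. It is not Hanami's or Tableplot's implementation: the functions substitute-once and substitute are hypothetical, and the real mechanism also handles function values and the removal of empty entries, which this sketch ignores.

```clojure
(require '[clojure.walk :as walk])

(defn substitute-once
  "Replace every keyword that has an entry in `env` by that entry's value."
  [env form]
  (walk/postwalk (fn [node]
                   (if (and (keyword? node) (contains? env node))
                     (get env node)
                     node))
                 form))

(defn substitute
  "Apply `substitute-once` until nothing changes.
  Assumes `env` has no cyclic definitions; otherwise this would not terminate."
  [env template]
  (loop [current template]
    (let [result (substitute-once env current)]
      (if (= result current)
        current
        (recur result)))))

(substitute {:=x-after-stat :=x
             :=x :sepal-width
             :=mark-opacity 0.6}
            {:x :=x-after-stat
             :opacity :=mark-opacity})
;; => {:x :sepal-width, :opacity 0.6}
```

Note how :=x-after-stat resolves to :=x first and only then to :sepal-width: keys may refer to other keys, which is what lets a user-supplied value such as :=x flow into several places in the final spec.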

The layer functions return a Hanami template. Let us print the resulting structure of the previous plot.

(def example1
  (-> datasets/iris
      (tc/random 10 {:seed 1})
      (plotly/layer-point
       {:=x :sepal-width
        :=y :sepal-length
        :=color :species
        :=mark-size 20
        :=mark-opacity 0.6})))
(kind/pprint example1)
{:data :=traces,
 :layout :=layout,
 :aerial.hanami.templates/defaults
 {:=textfont :com.rpl.specter.impl/NONE,
  :=x0 :com.rpl.specter.impl/NONE,
  :=y-type #function[clojure.lang.AFunction/1],
  :=coordinates :2d,
  :=boxmode :com.rpl.specter.impl/NONE,
  :=x0-after-stat :=x0,
  :=z-after-stat :=z,
  :=splom-traces #function[clojure.lang.AFunction/1],
  :=zmax :com.rpl.specter.impl/NONE,
  :=layers
  [{:y :=y-after-stat,
    :trace-base
    {:mode :=mode,
     :type :=type,
     :opacity :=mark-opacity,
     :textfont :=textfont},
    :colorscale :=colorscale,
    :color-type :=color-type,
    :r :=r,
    :coordinates :=coordinates,
    :group :=group,
    :color :=color,
    :meanline-visible :=meanline-visible,
    :mark :=mark,
    :x-title :=x-title,
    :symbol :=symbol,
    :name :=name,
    :fill :=mark-fill,
    :y1 :=y1-after-stat,
    :bar-width :=bar-width,
    :boxmode :=boxmode,
    :theta :=theta,
    :size :=size,
    :size-type :=size-type,
    :z :=z-after-stat,
    :lon :=lon,
    :aerial.hanami.templates/defaults
    {:=textfont :com.rpl.specter.impl/NONE,
     :=x0 :com.rpl.specter.impl/NONE,
     :=y-type #function[clojure.lang.AFunction/1],
     :=coordinates :2d,
     :=boxmode :com.rpl.specter.impl/NONE,
     :=x0-after-stat :=x0,
     :=z-after-stat :=z,
     :=splom-traces #function[clojure.lang.AFunction/1],
     :=zmax :com.rpl.specter.impl/NONE,
     :=layers [],
     :=mark-fill :com.rpl.specter.impl/NONE,
     :=x1 :com.rpl.specter.impl/NONE,
     :=title :com.rpl.specter.impl/NONE,
     :=annotations :com.rpl.specter.impl/NONE,
     :=z-type #function[clojure.lang.AFunction/1],
     :=y1 :com.rpl.specter.impl/NONE,
     :=y-type-after-stat #function[clojure.lang.AFunction/1],
     :=height 400,
     :=box-visible :com.rpl.specter.impl/NONE,
     :=mark-symbol :com.rpl.specter.impl/NONE,
     :=name :com.rpl.specter.impl/NONE,
     :=mark-opacity 0.6,
     :=inferred-group #function[clojure.lang.AFunction/1],
     :=y-showgrid true,
     :=density-bandwidth :com.rpl.specter.impl/NONE,
     :=mode #function[clojure.lang.AFunction/1],
     :=splom-layout #function[clojure.lang.AFunction/1],
     :=y-title :com.rpl.specter.impl/NONE,
     :=z-type-after-stat #function[clojure.lang.AFunction/1],
     :=size :com.rpl.specter.impl/NONE,
     :=model-options {:model-type :fastmath/ols},
     :=group :=inferred-group,
     :=y0 :com.rpl.specter.impl/NONE,
     :=mark-size 20,
     :=violinmode :com.rpl.specter.impl/NONE,
     :=design-matrix #function[clojure.lang.AFunction/1],
     :=size-type #function[clojure.lang.AFunction/1],
     :=zmin :com.rpl.specter.impl/NONE,
     :=x-showgrid true,
     :=r :com.rpl.specter.impl/NONE,
     :=color :species,
     :=mark-color :com.rpl.specter.impl/NONE,
     :=bar-width :com.rpl.specter.impl/NONE,
     :=y1-after-stat :=y1,
     :=x :sepal-width,
     :=symbol :com.rpl.specter.impl/NONE,
     :=x-after-stat :=x,
     :=yaxis-gridcolor "rgb(255,255,255)",
     :=lon :com.rpl.specter.impl/NONE,
     :=text :com.rpl.specter.impl/NONE,
     :=type #function[clojure.lang.AFunction/1],
     :=x-type-after-stat #function[clojure.lang.AFunction/1],
     :=traces #function[clojure.lang.AFunction/1],
     :=x-type #function[clojure.lang.AFunction/1],
     :=histogram-nbins 10,
     :=automargin false,
     :=stat :=dataset,
     :=z :z,
     :=width 500,
     :=lat :com.rpl.specter.impl/NONE,
     :=margin {:t 25},
     :=color-type #function[clojure.lang.AFunction/1],
     :=xaxis-gridcolor "rgb(255,255,255)",
     :=mark :point,
     :=x-title :com.rpl.specter.impl/NONE,
     :=colorscale :com.rpl.specter.impl/NONE,
     :=layout #function[clojure.lang.AFunction/1],
     :=colnames :com.rpl.specter.impl/NONE,
     :=y :sepal-length,
     :=x1-after-stat :=x1,
     :=dataset datasets/iris [10 6]:

| :rownames | :sepal-length | :sepal-width | :petal-length | :petal-width |   :species |
|----------:|--------------:|-------------:|--------------:|-------------:|------------|
|        27 |           5.0 |          3.4 |           1.6 |          0.4 |     setosa |
|        97 |           5.7 |          2.9 |           4.2 |          1.3 | versicolor |
|       127 |           6.2 |          2.8 |           4.8 |          1.8 |  virginica |
|        92 |           6.1 |          3.0 |           4.6 |          1.4 | versicolor |
|         7 |           4.6 |          3.4 |           1.4 |          0.3 |     setosa |
|        95 |           5.6 |          2.7 |           4.2 |          1.3 | versicolor |
|       125 |           6.7 |          3.3 |           5.7 |          2.1 |  virginica |
|        61 |           5.0 |          2.0 |           3.5 |          1.0 | versicolor |
|        73 |           6.3 |          2.5 |           4.9 |          1.5 | versicolor |
|        42 |           4.5 |          2.3 |           1.3 |          0.3 |     setosa |
,
     :=background "rgb(235,235,235)",
     :=theta :com.rpl.specter.impl/NONE,
     :=y0-after-stat :=y0,
     :=y-after-stat :=y,
     :=predictors [:=x],
     :=marker-size-key #function[clojure.lang.AFunction/1],
     :=meanline-visible :com.rpl.specter.impl/NONE},
    :lat :=lat,
    :y0 :=y0-after-stat,
    :zmax :=zmax,
    :annotations :=annotations,
    :inferred-group :=inferred-group,
    :marker-override
    {:color :=mark-color,
     :=marker-size-key :=mark-size,
     :symbol :=mark-symbol},
    :x :=x-after-stat,
    :x1 :=x1-after-stat,
    :x0 :=x0-after-stat,
    :zmin :=zmin,
    :y-title :=y-title,
    :box-visible :=box-visible,
    :dataset :=stat,
    :violinmode :=violinmode,
    :text :=text}],
  :=mark-fill :com.rpl.specter.impl/NONE,
  :=x1 :com.rpl.specter.impl/NONE,
  :=title :com.rpl.specter.impl/NONE,
  :=annotations :com.rpl.specter.impl/NONE,
  :=z-type #function[clojure.lang.AFunction/1],
  :=y1 :com.rpl.specter.impl/NONE,
  :=y-type-after-stat #function[clojure.lang.AFunction/1],
  :=height 400,
  :=box-visible :com.rpl.specter.impl/NONE,
  :=mark-symbol :com.rpl.specter.impl/NONE,
  :=name :com.rpl.specter.impl/NONE,
  :=mark-opacity :com.rpl.specter.impl/NONE,
  :=inferred-group #function[clojure.lang.AFunction/1],
  :=y-showgrid true,
  :=density-bandwidth :com.rpl.specter.impl/NONE,
  :=mode #function[clojure.lang.AFunction/1],
  :=splom-layout #function[clojure.lang.AFunction/1],
  :=y-title :com.rpl.specter.impl/NONE,
  :=z-type-after-stat #function[clojure.lang.AFunction/1],
  :=size :com.rpl.specter.impl/NONE,
  :=model-options {:model-type :fastmath/ols},
  :=group :=inferred-group,
  :=y0 :com.rpl.specter.impl/NONE,
  :=mark-size :com.rpl.specter.impl/NONE,
  :=violinmode :com.rpl.specter.impl/NONE,
  :=design-matrix #function[clojure.lang.AFunction/1],
  :=size-type #function[clojure.lang.AFunction/1],
  :=zmin :com.rpl.specter.impl/NONE,
  :=x-showgrid true,
  :=r :com.rpl.specter.impl/NONE,
  :=color :com.rpl.specter.impl/NONE,
  :=mark-color :com.rpl.specter.impl/NONE,
  :=bar-width :com.rpl.specter.impl/NONE,
  :=y1-after-stat :=y1,
  :=x :x,
  :=symbol :com.rpl.specter.impl/NONE,
  :=x-after-stat :=x,
  :=yaxis-gridcolor "rgb(255,255,255)",
  :=lon :com.rpl.specter.impl/NONE,
  :=text :com.rpl.specter.impl/NONE,
  :=type #function[clojure.lang.AFunction/1],
  :=x-type-after-stat #function[clojure.lang.AFunction/1],
  :=traces #function[clojure.lang.AFunction/1],
  :=x-type #function[clojure.lang.AFunction/1],
  :=histogram-nbins 10,
  :=automargin false,
  :=stat :=dataset,
  :=z :z,
  :=width 500,
  :=lat :com.rpl.specter.impl/NONE,
  :=margin {:t 25},
  :=color-type #function[clojure.lang.AFunction/1],
  :=xaxis-gridcolor "rgb(255,255,255)",
  :=mark :point,
  :=x-title :com.rpl.specter.impl/NONE,
  :=colorscale :com.rpl.specter.impl/NONE,
  :=layout #function[clojure.lang.AFunction/1],
  :=colnames :com.rpl.specter.impl/NONE,
  :=y :y,
  :=x1-after-stat :=x1,
  :=dataset datasets/iris [10 6]:

| :rownames | :sepal-length | :sepal-width | :petal-length | :petal-width |   :species |
|----------:|--------------:|-------------:|--------------:|-------------:|------------|
|        27 |           5.0 |          3.4 |           1.6 |          0.4 |     setosa |
|        97 |           5.7 |          2.9 |           4.2 |          1.3 | versicolor |
|       127 |           6.2 |          2.8 |           4.8 |          1.8 |  virginica |
|        92 |           6.1 |          3.0 |           4.6 |          1.4 | versicolor |
|         7 |           4.6 |          3.4 |           1.4 |          0.3 |     setosa |
|        95 |           5.6 |          2.7 |           4.2 |          1.3 | versicolor |
|       125 |           6.7 |          3.3 |           5.7 |          2.1 |  virginica |
|        61 |           5.0 |          2.0 |           3.5 |          1.0 | versicolor |
|        73 |           6.3 |          2.5 |           4.9 |          1.5 | versicolor |
|        42 |           4.5 |          2.3 |           1.3 |          0.3 |     setosa |
,
  :=background "rgb(235,235,235)",
  :=theta :com.rpl.specter.impl/NONE,
  :=y0-after-stat :=y0,
  :=y-after-stat :=y,
  :=predictors [:=x],
  :=marker-size-key #function[clojure.lang.AFunction/1],
  :=meanline-visible :com.rpl.specter.impl/NONE},
 :kindly/f #'scicloj.tableplot.v1.plotly/plotly-xform}

This template contains everything needed, including the substitution keys, to be turned into a plot. This happens when your visual tool (e.g., Clay) displays the plot. The tool knows what to do thanks to the Kindly metadata and a special function attached to the plot.

(meta example1)
#:kindly{:kind :kind/fn, :options nil}
(:kindly/f example1)
#'scicloj.tableplot.v1.plotly/plotly-xform

3.4 Realizing the plot

If you wish to see the resulting plot specification before displaying it as a plot, you can use the plot function. In this case, it generates a Plotly.js plot:

(-> example1
    plotly/plot
    kind/pprint)
{:data
 [{:y [5.0 4.6 4.5],
   :r nil,
   :name "setosa",
   :marker {:color "#1B9E77", :size 20},
   :fill nil,
   :mode :markers,
   :width nil,
   :type "scatter",
   :theta nil,
   :z nil,
   :opacity 0.6,
   :lon nil,
   :lat nil,
   :x [3.4 3.4 2.3],
   :text nil}
  {:y [5.7 6.1 5.6 5.0 6.3],
   :r nil,
   :name "versicolor",
   :marker {:color "#D95F02", :size 20},
   :fill nil,
   :mode :markers,
   :width nil,
   :type "scatter",
   :theta nil,
   :z nil,
   :opacity 0.6,
   :lon nil,
   :lat nil,
   :x [2.9 3.0 2.7 2.0 2.5],
   :text nil}
  {:y [6.2 6.7],
   :r nil,
   :name "virginica",
   :marker {:color "#7570B3", :size 20},
   :fill nil,
   :mode :markers,
   :width nil,
   :type "scatter",
   :theta nil,
   :z nil,
   :opacity 0.6,
   :lon nil,
   :lat nil,
   :x [2.8 3.3],
   :text nil}],
 :layout
 {:width 500,
  :height 400,
  :margin {:t 25},
  :automargin false,
  :plot_bgcolor "rgb(235,235,235)",
  :xaxis
  {:gridcolor "rgb(255,255,255)", :title :sepal-width, :showgrid true},
  :yaxis
  {:gridcolor "rgb(255,255,255)",
   :title :sepal-length,
   :showgrid true},
  :title nil}}

It is annotated as kind/plotly, so that visual tools know how to render it.

(-> example1
    plotly/plot
    meta)
#:kindly{:kind :kind/plotly, :options {:style {:height :auto}}}

This can be useful if you wish to process the actual Plotly.js spec rather than use the Tableplot Plotly API. Let us change the background color, for example:

(-> example1
    plotly/plot
    (assoc-in [:layout :plot_bgcolor] "#eeeedd"))

For another example, let us use a logarithmic scale for the y axis:

(-> example1
    plotly/plot
    (assoc-in [:layout :yaxis :type] "log"))
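
Because the realized spec is plain Clojure data, it can be inspected and transformed with core functions alone. The sketch below works on a hand-written, abbreviated copy of the spec printed above (the var name spec is hypothetical), so that it runs without Tableplot.

```clojure
;; Abbreviated copy of the realized spec shown earlier.
(def spec
  {:data [{:name "setosa" :x [3.4 3.4 2.3] :y [5.0 4.6 4.5]}
          {:name "versicolor" :x [2.9 3.0 2.7 2.0 2.5] :y [5.7 6.1 5.6 5.0 6.3]}
          {:name "virginica" :x [2.8 3.3] :y [6.2 6.7]}]
   :layout {:width 500 :height 400}})

;; Number of points in each trace.
(into {} (map (juxt :name (comp count :x))) (:data spec))
;; => {"setosa" 3, "versicolor" 5, "virginica" 2}

;; Nested layout options can be set with assoc-in.
(-> spec
    (assoc-in [:layout :yaxis :type] "log")
    :layout)
;; => {:width 500, :height 400, :yaxis {:type "log"}}
```

assoc-in returns a map that keeps the metadata of the original, which is why the modified spec is still rendered as a Plotly plot.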

3.5 Field type inference

Tableplot infers the type of relevant fields from the data.

The example above was colored the way it was because the :species column is nominal, so each species was assigned a distinct color.

In the following example, the coloring is by a quantitative column, so a color gradient is used:

(-> datasets/mtcars
    (plotly/layer-point
     {:=x :mpg
      :=y :disp
      :=color :cyl
      :=mark-size 20}))

We can override the inferred types and thus affect the generated plot:

(-> datasets/mtcars
    (plotly/layer-point
     {:=x :mpg
      :=y :disp
      :=color :cyl
      :=color-type :nominal
      :=mark-size 20}))
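
The idea of inferring a field type from the data can be illustrated with a toy heuristic. This is only an assumption about the general approach, not Tableplot's actual rule; the function infer-field-type and the keyword :quantitative are used here for illustration.

```clojure
(defn infer-field-type
  "Toy heuristic, not Tableplot's actual rule:
  all numbers -> :quantitative, all java.time values -> :temporal,
  anything else -> :nominal."
  [values]
  (cond
    (every? number? values) :quantitative
    (every? #(instance? java.time.temporal.Temporal %) values) :temporal
    :else :nominal))

(infer-field-type [6 4 8])                              ;; => :quantitative
(infer-field-type ["setosa" "versicolor"])              ;; => :nominal
(infer-field-type [(java.time.LocalDate/of 1967 7 1)])  ;; => :temporal
```

Under such a rule, a numeric column like :cyl is treated as quantitative even though it has only a few distinct values, which is why an explicit :=color-type override is useful.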

3.6 More examples

3.6.1 Boxplot

(-> datasets/mtcars
    (plotly/layer-boxplot
     {:=x :cyl
      :=y :disp}))

3.6.2 Area chart

(-> datasets/mtcars
    (tc/group-by [:cyl])
    (tc/aggregate {:total-disp #(-> % :disp tcc/sum)})
    (tc/order-by [:cyl])
    (plotly/layer-line
     {:=x :cyl
      :=mark-fill :tozeroy
      :=y :total-disp}))

3.6.3 Bar chart

(-> datasets/mtcars
    (tc/group-by [:cyl])
    (tc/aggregate {:total-disp #(-> % :disp tcc/sum)})
    (plotly/layer-bar
     {:=x :cyl
      :=y :total-disp}))
(-> datasets/mtcars
    (tc/group-by [:cyl])
    (tc/aggregate {:total-disp #(-> % :disp tcc/sum)})
    (tc/add-column :bar-width 0.5)
    (plotly/layer-bar
     {:=x :cyl
      :=bar-width :bar-width
      :=y :total-disp}))

3.6.4 Text

(-> datasets/mtcars
    (plotly/layer-text
     {:=x :mpg
      :=y :disp
      :=text :cyl
      :=mark-size 20}))
(-> datasets/mtcars
    (plotly/layer-text
     {:=x :mpg
      :=y :disp
      :=text :cyl
      :=textfont {:family "Courier New, monospace"
                  :size 16
                  :color :purple}
      :=mark-size 20}))

3.6.5 Segment plot

(-> datasets/iris
    (plotly/layer-segment
     {:=x0 :sepal-width
      :=y0 :sepal-length
      :=x1 :petal-width
      :=y1 :petal-length
      :=mark-opacity 0.4
      :=mark-size 3
      :=color :species}))

3.7 Varying color and size

(-> {:x (range 10)}
    tc/dataset
    (plotly/layer-point {:=x :x
                         :=y :x
                         :=mark-size (range 15 65 5)
                         :=mark-color ["#bebada", "#fdb462", "#fb8072", "#d9d9d9", "#bc80bd",
                                       "#b3de69", "#8dd3c7", "#80b1d3", "#fccde5", "#ffffb3"]}))
(-> {:ABCD (range 1 11)
     :EFGH [5 2.5 5 7.5 5 2.5 7.5 4.5 5.5 5]
     :IJKL [:A :A :A :A :A :B :B :B :B :B]
     :MNOP [:C :D :C :D :C :D :C :D :C :D]}
    tc/dataset
    (plotly/base {:=title "IJKLMNOP"})
    (plotly/layer-point {:=x :ABCD
                         :=y :EFGH
                         :=color :IJKL
                         :=size :MNOP
                         :=name "QRST1"})
    (plotly/layer-line
     {:=title "IJKL MNOP"
      :=x :ABCD
      :=y :ABCD
      :=name "QRST2"
      :=mark-color "magenta"
      :=mark-size 20
      :=mark-opacity 0.2}))

3.8 Time series

Date and time fields are handled appropriately. Let us, for example, draw the time series of unemployment counts.

(-> datasets/economics-long
    (tc/select-rows #(-> % :variable (= "unemploy")))
    (plotly/layer-line
     {:=x :date
      :=y :value
      :=mark-color "purple"}))

3.9 Multiple layers

We can draw more than one layer:

(-> datasets/economics-long
    (tc/select-rows #(-> % :variable (= "unemploy")))
    (plotly/layer-point {:=x :date
                         :=y :value
                         :=mark-color "green"
                         :=mark-size 20
                         :=mark-opacity 0.5})
    (plotly/layer-line {:=x :date
                        :=y :value
                        :=mark-color "purple"}))

We can also use the base function for the common parameters across layers:

(-> datasets/economics-long
    (tc/select-rows #(-> % :variable (= "unemploy")))
    (plotly/base {:=x :date
                  :=y :value})
    (plotly/layer-point {:=mark-color "green"
                         :=mark-size 20
                         :=mark-opacity 0.5})
    (plotly/layer-line {:=mark-color "purple"}))

Layers can be named:

(-> datasets/economics-long
    (tc/select-rows #(-> % :variable (= "unemploy")))
    (plotly/base {:=x :date
                  :=y :value})
    (plotly/layer-point {:=mark-color "green"
                         :=mark-size 20
                         :=mark-opacity 0.5
                         :=name "points"})
    (plotly/layer-line {:=mark-color "purple"
                        :=name "line"}))

3.10 Updating data

We can use the update-data function to vary the dataset along a plotting pipeline, affecting the layers that follow.

This functionality is inspired by ggbuilder and metamorph.

Here, for example, we draw a line, then sample 5 data rows, and draw them as points:

(-> datasets/economics-long
    (tc/select-rows #(-> % :variable (= "unemploy")))
    (plotly/base {:=x :date
                  :=y :value})
    (plotly/layer-line {:=mark-color "purple"})
    (plotly/update-data tc/random 5)
    (plotly/layer-point {:=mark-color "green"
                         :=mark-size 15
                         :=mark-opacity 0.5}))

3.11 Overriding layer data

(-> (tc/dataset {:x (range 4)
                 :y [1 2 5 9]})
    tc/dataset
    (tc/sq :y :x)
    (plotly/layer-point {:=mark-size 20})
    (plotly/layer-line {:=dataset (tc/dataset {:x [0 3]
                                               :y [1 10]})
                        :=mark-size 5}))

3.12 Smoothing

layer-smooth is a layer that applies statistical regression methods to the dataset to model it as a smooth shape. It is inspired by ggplot’s geom_smooth.

(-> datasets/iris
    (plotly/base {:=x :sepal-width
                  :=y :sepal-length})
    (plotly/layer-point {:=mark-color "green"
                         :=name "Actual"})
    (plotly/layer-smooth {:=mark-color "orange"
                          :=name "Predicted"}))

By default, the regression is computed with only one predictor variable, which is :=x. This can be overridden using the :=predictors key, so we may compute a regression with more than one predictor.

(-> datasets/iris
    (plotly/base {:=x :sepal-width
                  :=y :sepal-length})
    (plotly/layer-point {:=mark-color "green"
                         :=name "Actual"})
    (plotly/layer-smooth {:=predictors [:petal-width
                                        :petal-length]
                          :=mark-opacity 0.5
                          :=name "Predicted"}))

We can also specify the predictor columns as expressions through the :=design-matrix key. Here, we use the design matrix functionality of Metamorph.ml.

(-> datasets/iris
    (plotly/base {:=x :sepal-width
                  :=y :sepal-length})
    (plotly/layer-point {:=mark-color "green"
                         :=name "Actual"})
    (plotly/layer-smooth {:=design-matrix [[:sepal-width '(identity :sepal-width)]
                                           [:sepal-width-2 '(* :sepal-width
                                                               :sepal-width)]]
                          :=mark-opacity 0.5
                          :=name "Predicted"}))

Inspired by Sami Kallinen’s Heart of Clojure talk:

(-> datasets/iris
    (plotly/base {:=x :sepal-width
                  :=y :sepal-length})
    (plotly/layer-point {:=mark-color "green"
                         :=name "Actual"})
    (plotly/layer-smooth {:=design-matrix [[:sepal-width '(identity :sepal-width)]
                                           [:sepal-width-2 '(* :sepal-width
                                                               :sepal-width)]
                                           [:sepal-width-3 '(* :sepal-width
                                                               :sepal-width
                                                               :sepal-width)]]
                          :=mark-opacity 0.5
                          :=name "Predicted"}))
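
The two design matrices above describe polynomial features of :sepal-width. As a plain-Clojure illustration of what one row of such a matrix contains (the helper polynomial-row is hypothetical and unrelated to how Metamorph.ml evaluates the expressions):

```clojure
(defn polynomial-row
  "Powers x^1 .. x^degree of a single value x."
  [degree x]
  (mapv (fn [power] (apply * (repeat power x)))
        (range 1 (inc degree))))

;; Quadratic and cubic features for a sepal width of 2.5:
(polynomial-row 2 2.5) ;; => [2.5 6.25]
(polynomial-row 3 2.5) ;; => [2.5 6.25 15.625]
```

A linear model fitted on these columns is linear in its coefficients but curved as a function of :sepal-width, which is why the smoothed layer is no longer a straight line.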

One can also provide the regression model details through :=model-options and use any regression model and parameters registered by Metamorph.ml.

(require 'scicloj.ml.tribuo)
(def regression-tree-options
  {:model-type :scicloj.ml.tribuo/regression
   :tribuo-components [{:name "cart"
                        :type "org.tribuo.regression.rtree.CARTRegressionTrainer"
                        :properties {:maxDepth "8"
                                     :fractionFeaturesInSplit "1.0"
                                     :seed "12345"
                                     :impurity "mse"}}
                       {:name "mse"
                        :type "org.tribuo.regression.rtree.impurity.MeanSquaredError"}]
   :tribuo-trainer-name "cart"})
(-> datasets/iris
    (plotly/base {:=x :sepal-width
                  :=y :sepal-length})
    (plotly/layer-point {:=mark-color "green"
                         :=name "Actual"})
    (plotly/layer-smooth {:=model-options regression-tree-options
                          :=mark-opacity 0.5
                          :=name "Predicted"}))

The following is inspired by Plotly’s ML Regression in Python example:

(-> datasets/tips
    (tc/split :holdout {:seed 1})
    (plotly/base {:=x :total_bill
                  :=y :tip})
    (plotly/layer-point {:=color :$split-name})
    (plotly/update-data (fn [ds]
                          (-> ds
                              (tc/select-rows #(-> % :$split-name (= :train))))))
    (plotly/layer-smooth {:=model-options regression-tree-options
                          :=name "prediction"
                          :=mark-color "purple"}))

3.13 Grouping

The regression computed by layer-smooth is affected by the inferred grouping of the data.

For example, here we receive three regression lines, one for each species.

(-> datasets/iris
    (plotly/base {:=title "dummy"
                  :=color :species
                  :=x :sepal-width
                  :=y :sepal-length})
    plotly/layer-point
    plotly/layer-smooth)

This happened because the :=color field was :species, which is of :nominal type.

But we may override this using the :=group key. For example, let us avoid grouping:

(-> datasets/iris
    (plotly/base {:=title "dummy"
                  :=color :species
                  :=group []
                  :=x :sepal-width
                  :=y :sepal-length})
    plotly/layer-point
    plotly/layer-smooth)

Alternatively, we may assign the :=color only to the points layer without affecting the smoothing layer.

(-> datasets/iris
    (plotly/base {:=title "dummy"
                  :=x :sepal-width
                  :=y :sepal-length})
    (plotly/layer-point {:=color :species})
    (plotly/layer-smooth {:=name "Predicted"
                          :=mark-color "blue"}))
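
To make the effect of grouping concrete, here is a small plain-Clojure sketch that fits one least-squares line per group and then a single line for all rows. It uses made-up data and hypothetical helpers (fit-line, fit-by); it does not reflect how layer-smooth is implemented.

```clojure
(defn fit-line
  "Least-squares line through `points`, a seq of [x y] pairs.
  Assumes at least two distinct x values."
  [points]
  (let [n     (count points)
        xs    (map first points)
        ys    (map second points)
        mx    (/ (reduce + xs) n)
        my    (/ (reduce + ys) n)
        sxy   (reduce + (map (fn [x y] (* (- x mx) (- y my))) xs ys))
        sxx   (reduce + (map (fn [x] (* (- x mx) (- x mx))) xs))
        slope (/ sxy sxx)]
    {:slope slope
     :intercept (- my (* slope mx))}))

(defn fit-by
  "Fit one line per distinct value of (group-fn row)."
  [group-fn rows]
  (into {}
        (map (fn [[k group-rows]]
               [k (fit-line (map (juxt :x :y) group-rows))]))
        (group-by group-fn rows)))

;; Made-up data: group :a rises, group :b falls.
(def rows
  [{:group :a :x 0 :y 1} {:group :a :x 1 :y 3} {:group :a :x 2 :y 5}
   {:group :b :x 0 :y 0} {:group :b :x 1 :y -1} {:group :b :x 2 :y -2}])

(fit-by :group rows)
;; => {:a {:slope 2, :intercept 1}, :b {:slope -1, :intercept 0}}

;; No grouping: every row falls into the same (single) group.
(fit-by (constantly :all) rows)
;; => {:all {:slope 1/2, :intercept 1/2}}
```

The grouped fit recovers the two opposite trends, while the ungrouped fit averages them into one line, which is the same trade-off as choosing between the default grouping and :=group [].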

3.14 Example: out-of-sample predictions

Here is a slightly more elaborate example inspired by the London Clojurians talk mentioned in the preface.

Assume we wish to predict the unemployment counts for the next 96 months. Let us add those months to our dataset, and mark them as Future (considering the original data as Past):

(-> datasets/economics-long
    (tc/select-rows #(-> % :variable (= "unemploy")))
    (tc/add-column :relative-time "Past")
    (tc/concat (tc/dataset {:date (-> datasets/economics-long
                                      :date
                                      last
                                      (datetime/plus-temporal-amount (range 96) :months))
                            :relative-time "Future"}))
    (print/print-range 6))

ggplot2/economics_long [670 6]:

| :rownames |      :date | :variable | :value |   :value01 | :relative-time |
|----------:|------------|-----------|-------:|-----------:|----------------|
|      2297 | 1967-07-01 |  unemploy | 2944.0 | 0.02044683 |           Past |
|      2298 | 1967-08-01 |  unemploy | 2945.0 | 0.02052578 |           Past |
|      2299 | 1967-09-01 |  unemploy | 2958.0 | 0.02155206 |           Past |
|         … |          … |         … |      … |          … |              … |
|           | 2022-12-01 |           |        |            |         Future |
|           | 2023-01-01 |           |        |            |         Future |
|           | 2023-02-01 |           |        |            |         Future |
|           | 2023-03-01 |           |        |            |         Future |

Let us represent our dates as numbers, so that we can use them in linear regression:

(-> datasets/economics-long
    (tc/select-rows #(-> % :variable (= "unemploy")))
    (tc/add-column :relative-time "Past")
    (tc/concat (tc/dataset {:date (-> datasets/economics-long
                                      :date
                                      last
                                      (datetime/plus-temporal-amount (range 96) :months))
                            :relative-time "Future"}))
    (tc/add-column :year #(datetime/long-temporal-field :years (:date %)))
    (tc/add-column :month #(datetime/long-temporal-field :months (:date %)))
    (tc/map-columns :yearmonth [:year :month] (fn [y m] (+ m (* 12 y))))
    (print/print-range 6))

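ggplot2/economics_long [670 9]:

| :rownames |      :date | :variable | :value |   :value01 | :relative-time | :year | :month | :yearmonth |
|----------:|------------|-----------|-------:|-----------:|----------------|------:|-------:|-----------:|
|      2297 | 1967-07-01 |  unemploy | 2944.0 | 0.02044683 |           Past |  1967 |      7 |      23611 |
|      2298 | 1967-08-01 |  unemploy | 2945.0 | 0.02052578 |           Past |  1967 |      8 |      23612 |
|      2299 | 1967-09-01 |  unemploy | 2958.0 | 0.02155206 |           Past |  1967 |      9 |      23613 |
|         … |          … |         … |      … |          … |              … |     … |      … |          … |
|           | 2022-12-01 |           |        |            |         Future |  2022 |     12 |      24276 |
|           | 2023-01-01 |           |        |            |         Future |  2023 |      1 |      24277 |
|           | 2023-02-01 |           |        |            |         Future |  2023 |      2 |      24278 |
|           | 2023-03-01 |           |        |            |         Future |  2023 |      3 |      24279 |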
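
The :yearmonth value is simply a running month count, 12 × year + month, so consecutive months always differ by one, even across a year boundary. A plain-Clojure check of the values in the table, using java.time directly (the helper year-month-index is hypothetical):

```clojure
(import 'java.time.LocalDate)

(defn year-month-index
  "Running month count: 12 * year + month-of-year."
  [^LocalDate date]
  (+ (.getMonthValue date) (* 12 (.getYear date))))

(year-month-index (LocalDate/of 1967 7 1))  ;; => 23611
(year-month-index (LocalDate/of 2022 12 1)) ;; => 24276
(year-month-index (LocalDate/of 2023 1 1))  ;; => 24277
```

This makes the column a reasonable numeric predictor for a linear trend over time.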

Let us use the same regression line for the Past and Future groups. To do this, we avoid grouping by assigning [] to :=group. The line is affected only by the past, since in the Future, :=y is missing. We use the numerical field :yearmonth as the regression predictor, but for plotting, we still use the :temporal field :date.

(-> datasets/economics-long
    (tc/select-rows #(-> % :variable (= "unemploy")))
    (tc/add-column :relative-time "Past")
    (tc/concat (tc/dataset {:date (-> datasets/economics-long
                                      :date
                                      last
                                      (datetime/plus-temporal-amount (range 96) :months))
                            :relative-time "Future"}))
    (tc/add-column :year #(datetime/long-temporal-field :years (:date %)))
    (tc/add-column :month #(datetime/long-temporal-field :months (:date %)))
    (tc/map-columns :yearmonth [:year :month] (fn [y m] (+ m (* 12 y))))
    (plotly/base {:=x :date
                  :=y :value})
    (plotly/layer-smooth {:=color :relative-time
                          :=mark-size 15
                          :=group []
                          :=predictors [:yearmonth]})
    ;; Keep only the past for the following layer:
    (plotly/update-data (fn [dataset]
                          (-> dataset
                              (tc/select-rows (fn [row]
                                                (-> row :relative-time (= "Past")))))))
    (plotly/layer-line {:=mark-color "purple"
                        :=mark-size 3
                        :=name "Actual"}))

3.15 Histograms

Histograms can also be represented as layers with statistical processing:

(-> datasets/iris
    (plotly/layer-histogram {:=x :sepal-width}))
(-> datasets/iris
    (plotly/layer-histogram {:=x :sepal-width
                             :=histogram-nbins 30}))
(-> datasets/iris
    (plotly/layer-histogram {:=x :sepal-width
                             :=color :species
                             :=mark-opacity 0.5}))
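
The statistical processing behind a histogram amounts to counting values in bins. The following plain-Clojure sketch shows equal-width binning for a given number of bins; it is an illustration under that assumption, and Tableplot's own binning may differ in details such as bin edges. The helper bin-counts is hypothetical.

```clojure
(defn bin-counts
  "Count `values` in `nbins` equal-width bins spanning [min, max].
  The maximum value is placed in the last bin.
  Assumes the values are not all equal (the bin width must be positive)."
  [values nbins]
  (let [lo    (apply min values)
        hi    (apply max values)
        width (/ (- hi lo) (double nbins))
        index (fn [v] (min (dec nbins) (int (/ (- v lo) width))))]
    (reduce (fn [counts v] (update counts (index v) inc))
            (vec (repeat nbins 0))
            values)))

(bin-counts [0 1 2 3 4 5 6 7 8 9 10] 5)
;; => [2 2 2 2 3]
```

Increasing the number of bins, as :=histogram-nbins 30 does above, gives narrower bins and a more detailed but noisier shape.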

3.16 Density

(experimental)

Density estimates are handled similarly to histograms:

(-> datasets/iris
    (plotly/layer-density {:=x :sepal-width}))
(-> datasets/iris
    (plotly/layer-density {:=x :sepal-width
                           :=density-bandwidth 0.05}))
(-> datasets/iris
    (plotly/layer-density {:=x :sepal-width
                           :=density-bandwidth 1}))
(-> datasets/iris
    (plotly/layer-density {:=x :sepal-width
                           :=color :species}))
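
The role of the bandwidth can be illustrated with a textbook kernel density estimate. The sketch below assumes a Gaussian kernel, which is an assumption made for illustration; this walkthrough does not state which estimator layer-density uses. The helper gaussian-kde is hypothetical.

```clojure
(defn gaussian-kde
  "Return a function estimating the density of `values` at a point,
  using a Gaussian kernel with the given `bandwidth` (must be positive)."
  [values bandwidth]
  (let [n     (count values)
        scale (/ 1.0 (* n bandwidth (Math/sqrt (* 2.0 Math/PI))))]
    (fn [x]
      (* scale
         (reduce + (map (fn [v]
                          (let [u (/ (- x v) bandwidth)]
                            (Math/exp (* -0.5 u u))))
                        values))))))

(def widths [3.0 3.0 2.0])

;; A small bandwidth concentrates the estimate around the observed values;
;; a large one spreads it out.
(def narrow (gaussian-kde widths 0.05))
(def wide   (gaussian-kde widths 1.0))

(> (narrow 3.0) (wide 3.0)) ;; => true
(< (narrow 2.5) (wide 2.5)) ;; => true
```

This mirrors the difference between the :=density-bandwidth 0.05 and :=density-bandwidth 1 examples above: the former is spiky, the latter smooth.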

3.17 Coordinates

(WIP)

3.17.1 geo

Inspired by Plotly’s tutorial for Scatter Plots on Maps in JavaScript:

(-> {:lat [45.5, 43.4, 49.13, 51.1, 53.34, 45.24,
           44.64, 48.25, 49.89, 50.45]
     :lon [-73.57, -79.24, -123.06, -114.1, -113.28,
           -75.43, -63.57, -123.21, -97.13, -104.6]
     :text ["Montreal", "Toronto", "Vancouver", "Calgary", "Edmonton",
            "Ottawa", "Halifax", "Victoria", "Winnipeg", "Regina"]}
    tc/dataset
    (plotly/base {:=coordinates :geo
                  :=lat :lat
                  :=lon :lon})
    (plotly/layer-point {:=mark-opacity 0.8
                         :=mark-color ["#bebada", "#fdb462", "#fb8072", "#d9d9d9", "#bc80bd",
                                       "#b3de69", "#8dd3c7", "#80b1d3", "#fccde5", "#ffffb3"]
                         :=mark-size 20
                         :=name "Canadian cities"})
    (plotly/layer-text {:=text :text
                        :=textfont {:size 7
                                    :color :purple}})
    plotly/plot
    (assoc-in [:layout :geo]
              {:scope "north america"
               :resolution 10
               :lonaxis {:range [-130 -55]}
               :lataxis {:range [40 60]}
               :countrywidth 1.5
               :showland true
               :showlakes true
               :showrivers true}))

3.17.2 3d

(-> datasets/iris
    (plotly/layer-point {:=x :sepal-width
                         :=y :sepal-length
                         :=z :petal-length
                         :=color :petal-width
                         :=coordinates :3d}))
(-> datasets/iris
    (plotly/layer-point {:=x :sepal-width
                         :=y :sepal-length
                         :=z :petal-length
                         :=color :species
                         :=coordinates :3d}))

3.17.3 polar

Monthly rain amounts - polar bar-chart

(def rain-data
  (tc/dataset
   {:month [:Jan :Feb :Mar :Apr
            :May :Jun :Jul :Aug
            :Sep :Oct :Nov :Dec]
    ;; one random rain amount per month
    :rain (repeatedly 12 #(rand-int 200))}))
(-> rain-data
    (plotly/layer-bar
     {:=r :rain
      :=theta :month
      :=coordinates :polar
      :=mark-size 20
      :=mark-opacity 0.6}))

Controlling the polar layout (by manipulating the raw Plotly.js spec):

(-> rain-data
    (plotly/base
     {})
    (plotly/layer-bar
     {:=r :rain
      :=theta :month
      :=coordinates :polar
      :=mark-size 20
      :=mark-opacity 0.6})
    plotly/plot
    (assoc-in [:layout :polar]
              {:angularaxis {:tickfont {:size 16}
                             :rotation 90
                             :direction "counterclockwise"}
               :sector [0 180]}))

A polar random walk - polar line-chart

(let [n 50]
  (-> {:r (->> (repeatedly n #(- (rand) 0.5))
               (reductions +))
       :theta (->> (repeatedly n #(* 10 (rand)))
                   (reductions +)
                   (map #(rem % 360)))
       :color (range n)}
      tc/dataset
      (plotly/layer-point
       {:=r :r
        :=theta :theta
        :=coordinates :polar
        :=mark-size 10
        :=mark-opacity 0.6})
      (plotly/layer-line
       {:=r :r
        :=theta :theta
        :=coordinates :polar
        :=mark-size 3
        :=mark-opacity 0.6})))
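
The walk above is built from two core functions: reductions with + turns a sequence of steps into running totals (used for both :r and :theta), and rem wraps the accumulated angle so that it stays below 360. A small sketch with fixed, made-up step values instead of random ones:

```clojure
;; Fixed step values, chosen only for illustration.
(def angle-steps [100 150 200 300])

;; Running totals of the steps: the unwrapped angle after each step.
(def cumulative-angles (reductions + angle-steps))
;; => (100 250 450 750)

;; Wrap each angle into the range [0, 360), as done for the :theta column.
(def wrapped-angles (map #(rem % 360) cumulative-angles))
;; => (100 250 90 30)
```

Since the angle steps in the walk are never negative, rem always yields a non-negative result there.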

3.18 Debugging (WIP)

3.18.1 Viewing the computational DAG of substitution keys

(def example-to-debug
  (-> datasets/iris
      (tc/random 10 {:seed 1})
      (plotly/layer-point {:=x :sepal-width
                           :=y :sepal-length
                           :=color :species})))
(-> example-to-debug
    plotly/dag)

3.18.2 Viewing intermediate values in the computational DAG

Layers (Tableplot's intermediate data representation)

(-> example-to-debug
    (plotly/debug :=layers)
    kind/pprint)
[{:y :sepal-length,
  :trace-base {:mode :markers, :type "scatter"},
  :color-type :nominal,
  :coordinates :2d,
  :group (:species),
  :color :species,
  :mark :point,
  :z :z,
  :inferred-group (:species),
  :x :sepal-width,
  :dataset datasets/iris [10 6]:

| :rownames | :sepal-length | :sepal-width | :petal-length | :petal-width |   :species |
|----------:|--------------:|-------------:|--------------:|-------------:|------------|
|        27 |           5.0 |          3.4 |           1.6 |          0.4 |     setosa |
|        97 |           5.7 |          2.9 |           4.2 |          1.3 | versicolor |
|       127 |           6.2 |          2.8 |           4.8 |          1.8 |  virginica |
|        92 |           6.1 |          3.0 |           4.6 |          1.4 | versicolor |
|         7 |           4.6 |          3.4 |           1.4 |          0.3 |     setosa |
|        95 |           5.6 |          2.7 |           4.2 |          1.3 | versicolor |
|       125 |           6.7 |          3.3 |           5.7 |          2.1 |  virginica |
|        61 |           5.0 |          2.0 |           3.5 |          1.0 | versicolor |
|        73 |           6.3 |          2.5 |           4.9 |          1.5 | versicolor |
|        42 |           4.5 |          2.3 |           1.3 |          0.3 |     setosa |
}]

Traces (part of the Plotly spec)

(-> example-to-debug
    (plotly/debug :=traces)
    kind/pprint)
[{:y [5.0 4.6 4.5],
  :r nil,
  :name "setosa",
  :marker {:color "#1B9E77"},
  :fill nil,
  :mode :markers,
  :width nil,
  :type "scatter",
  :theta nil,
  :z nil,
  :lon nil,
  :lat nil,
  :x [3.4 3.4 2.3],
  :text nil}
 {:y [5.7 6.1 5.6 5.0 6.3],
  :r nil,
  :name "versicolor",
  :marker {:color "#D95F02"},
  :fill nil,
  :mode :markers,
  :width nil,
  :type "scatter",
  :theta nil,
  :z nil,
  :lon nil,
  :lat nil,
  :x [2.9 3.0 2.7 2.0 2.5],
  :text nil}
 {:y [6.2 6.7],
  :r nil,
  :name "virginica",
  :marker {:color "#7570B3"},
  :fill nil,
  :mode :markers,
  :width nil,
  :type "scatter",
  :theta nil,
  :z nil,
  :lon nil,
  :lat nil,
  :x [2.8 3.3],
  :text nil}]
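
The traces are printed as a vector of maps, so they can be checked with ordinary sequence functions. As a sketch, the following copies the :name and :x entries from the output above (rather than calling plotly/debug) and confirms that the three traces together account for all 10 sampled rows. The names printed-traces and points-per-trace are introduced here for illustration only.

```clojure
;; The :name and :x entries, copied from the traces printed above.
(def printed-traces
  [{:name "setosa" :x [3.4 3.4 2.3]}
   {:name "versicolor" :x [2.9 3.0 2.7 2.0 2.5]}
   {:name "virginica" :x [2.8 3.3]}])

;; Number of points in each trace, keyed by trace name.
(def points-per-trace
  (into {} (map (juxt :name (comp count :x))) printed-traces))
;; => {"setosa" 3, "versicolor" 5, "virginica" 2}

;; The counts add up to the number of rows in the sampled dataset.
(reduce + (vals points-per-trace))
;; => 10
```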

Both at once, by passing a map from result names to substitution keys:

(-> example-to-debug
    (plotly/debug {:layers :=layers
                   :traces :=traces})
    kind/pprint)
{:layers
 [{:y :sepal-length,
   :trace-base {:mode :markers, :type "scatter"},
   :color-type :nominal,
   :coordinates :2d,
   :group (:species),
   :color :species,
   :mark :point,
   :z :z,
   :inferred-group (:species),
   :x :sepal-width,
   :dataset datasets/iris [10 6]:

| :rownames | :sepal-length | :sepal-width | :petal-length | :petal-width |   :species |
|----------:|--------------:|-------------:|--------------:|-------------:|------------|
|        27 |           5.0 |          3.4 |           1.6 |          0.4 |     setosa |
|        97 |           5.7 |          2.9 |           4.2 |          1.3 | versicolor |
|       127 |           6.2 |          2.8 |           4.8 |          1.8 |  virginica |
|        92 |           6.1 |          3.0 |           4.6 |          1.4 | versicolor |
|         7 |           4.6 |          3.4 |           1.4 |          0.3 |     setosa |
|        95 |           5.6 |          2.7 |           4.2 |          1.3 | versicolor |
|       125 |           6.7 |          3.3 |           5.7 |          2.1 |  virginica |
|        61 |           5.0 |          2.0 |           3.5 |          1.0 | versicolor |
|        73 |           6.3 |          2.5 |           4.9 |          1.5 | versicolor |
|        42 |           4.5 |          2.3 |           1.3 |          0.3 |     setosa |
}],
 :traces
 [{:y [5.0 4.6 4.5],
   :r nil,
   :name "setosa",
   :marker {:color "#1B9E77"},
   :fill nil,
   :mode :markers,
   :width nil,
   :type "scatter",
   :theta nil,
   :z nil,
   :lon nil,
   :lat nil,
   :x [3.4 3.4 2.3],
   :text nil}
  {:y [5.7 6.1 5.6 5.0 6.3],
   :r nil,
   :name "versicolor",
   :marker {:color "#D95F02"},
   :fill nil,
   :mode :markers,
   :width nil,
   :type "scatter",
   :theta nil,
   :z nil,
   :lon nil,
   :lat nil,
   :x [2.9 3.0 2.7 2.0 2.5],
   :text nil}
  {:y [6.2 6.7],
   :r nil,
   :name "virginica",
   :marker {:color "#7570B3"},
   :fill nil,
   :mode :markers,
   :width nil,
   :type "scatter",
   :theta nil,
   :z nil,
   :lon nil,
   :lat nil,
   :x [2.8 3.3],
   :text nil}]}
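
Most entries in each printed trace are nil, which makes the output harder to scan. A minimal sketch of a helper that drops them before printing; drop-nil-entries is a hypothetical name, not a Tableplot function, and the input is the last trace copied from the output above.

```clojure
(defn drop-nil-entries
  "Removes the map entries whose value is nil."
  [m]
  (into {} (remove (comp nil? val)) m))

;; The "virginica" trace, copied from the printed output above.
(def virginica-trace
  {:y [6.2 6.7] :r nil :name "virginica" :marker {:color "#7570B3"}
   :fill nil :mode :markers :width nil :type "scatter" :theta nil
   :z nil :lon nil :lat nil :x [2.8 3.3] :text nil})

;; Keeps only the :x, :y, :name, :marker, :mode and :type entries.
(def compact-trace
  (drop-nil-entries virginica-trace))
```

Applied to a whole debug result, this could be used as (map drop-nil-entries traces) before kind/pprint.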

3.19 Coming soon

3.19.1 Facets

(coming soon)

3.19.2 Scales

(coming soon)

source: notebooks/tableplot_book/plotly_walkthrough.clj