2 Plotly walkthrough
Tableplot offers a Clojure API for creating Plotly.js plots through layered pipelines.
The API uses Hanami templates, but its templates and parameters are completely separate from the classical Hanami ones.
Here, we provide a walkthrough of that API.
See also the more detailed Tableplot Plotly reference. You might find the official Plotly.js reference helpful. (Tip: rotate narrow devices.) There are additional examples in Intro to data visualization with Tableplot in the Noj book.
Note: For a comprehensive reference of all available functions, substitution keys, and advanced features, see the Plotly reference. This walkthrough focuses on common use cases and getting started quickly.
2.1 Setup
For this tutorial, we require:
The Tableplot plotly API namespace
Tablecloth for dataset processing
the datetime namespace of dtype-next
the print namespace of tech.ml.dataset for customized dataset printing
Kindly (to specify how certain values should be visualized)
the datasets defined in the Datasets chapter
a few other namespaces used in particular examples.
(ns tableplot-book.plotly-walkthrough
(:require [scicloj.tableplot.v1.plotly :as plotly]
[tablecloth.api :as tc]
[tablecloth.column.api :as tcc]
[tech.v3.datatype.datetime :as datetime]
[tech.v3.dataset.print :as print]
[scicloj.kindly.v4.kind :as kind]
[scicloj.kindly.v4.api :as kindly]
[scicloj.metamorph.ml.rdatasets :as rdatasets]))
2.2 Basic usage
Plotly plots are created by passing datasets to a pipeline of layer functions.
Additional parameters to the functions are passed as maps. Map keys are keywords whose names begin with = (e.g., :=color).
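Since these parameter maps are ordinary Clojure maps, they can be assembled with core functions before being handed to a layer function. The following is a minimal sketch using only the standard library; the names shared-params and point-params are made up for illustration and are not part of Tableplot.

```clojure
(require '[clojure.string :as str])

;; Hypothetical parameter maps; the keys are ones used in this chapter.
(def shared-params
  {:=x :sepal-width
   :=y :sepal-length})

;; Combine the shared parameters with layer-specific ones.
(def point-params
  (merge shared-params
         {:=color :species
          :=mark-size 20}))

;; Every key is a keyword whose name starts with "=".
(def all-substitution-keys?
  (every? #(str/starts-with? (name %) "=")
          (keys point-params)))
```

A map built this way can be used wherever a literal parameter map appears in the examples of this chapter.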
For example, let us plot a scatterplot (a layer of points) of 10 random items from the Iris dataset.
(-> (rdatasets/datasets-iris)
(tc/random 10 {:seed 1})
(plotly/layer-point
{:=x :sepal-width
:=y :sepal-length
:=color :species
:=mark-size 20
:=mark-opacity 0.6}))
2.3 Processing overview
For basic use of Tableplot with a tool such as Clay, it's not necessary to understand the process leading to display of a plot. Knowing more might be helpful for debugging and advanced customizations, though. This section and the following ones provide more information about the process:
- The parameter map passed to a function such as plotly/layer-point typically contains Plotly-specific Hanami substitution keys.
- The values of those keys are automatically combined with default values calculated for other Plotly-specific keys.
- The preceding step results in an EDN map that specifies a Plotly.js plot.
- The EDN-format plot specification is automatically transformed into a Plotly JSON specification.
- The JSON specification is automatically used to display the plot.
Kindly-compatible tools like Clay know what to do with the maps at each step because previous steps add appropriate Kindly metadata annotations to the maps.
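To make the steps above more concrete, here is a toy model of key substitution written in plain Clojure. It is only a sketch of the idea, not Hanami's or Tableplot's actual implementation: toy-defaults, toy-template and toy-xform are hypothetical names, and the real defaults are far richer. User parameters are merged over the defaults, and keys found in the template are replaced repeatedly until nothing changes.

```clojure
(require '[clojure.walk :as walk])

;; Hypothetical defaults, loosely modelled on keys printed in this chapter.
(def toy-defaults
  {:=x :x
   :=x-after-stat :=x
   :=width 500
   :=background "rgb(235,235,235)"})

;; A tiny template whose values are substitution keys.
(def toy-template
  {:data {:x :=x-after-stat}
   :layout {:width :=width
            :plot_bgcolor :=background}})

(defn toy-xform
  "Merge user parameters over the defaults, then replace every known
  substitution key in the template until a fixed point is reached."
  [template user-params]
  (let [params (merge toy-defaults user-params)
        step   (fn [form] (walk/postwalk #(get params % %) form))]
    (->> (iterate step template)
         (partition 2 1)
         (drop-while (fn [[before after]] (not= before after)))
         ffirst)))

(toy-xform toy-template {:=x :sepal-width
                         :=background "#eeeedd"})
;; => {:data {:x :sepal-width},
;;     :layout {:width 500, :plot_bgcolor "#eeeedd"}}
```

Note how :=x-after-stat resolves to :=x first and only then to the user-supplied column name, which is why the sketch repeats the replacement step.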
2.4 Templates and parameters
Technically, the parameter maps contain Hanami substitution keys, which means they are processed by a simple set of rules, but you do not need to understand what this means yet.
The layer functions return a Hanami template. Let us print the resulting structure of the previous plot.
(def example1
(-> (rdatasets/datasets-iris)
(tc/random 10 {:seed 1})
(plotly/layer-point
{:=x :sepal-width
:=y :sepal-length
:=color :species
:=mark-size 20
:=mark-opacity 0.6})))
(kind/pprint example1)
{:data :=traces,
:layout :=layout,
:aerial.hanami.templates/defaults
{:=textfont :com.rpl.specter.impl/NONE,
:=x0 :com.rpl.specter.impl/NONE,
:=y-type #function[clojure.lang.AFunction/1],
:=coordinates :2d,
:=boxmode :com.rpl.specter.impl/NONE,
:=x0-after-stat :=x0,
:=z-after-stat :=z,
:=splom-traces #function[clojure.lang.AFunction/1],
:=zmax :com.rpl.specter.impl/NONE,
:=layers
[{:y :=y-after-stat,
:trace-base
{:mode :=mode,
:type :=type,
:opacity :=mark-opacity,
:textfont :=textfont},
:colorscale :=colorscale,
:color-type :=color-type,
:r :=r,
:coordinates :=coordinates,
:group :=group,
:color :=color,
:meanline-visible :=meanline-visible,
:mark :=mark,
:x-title :=x-title,
:symbol :=symbol,
:name :=name,
:fill :=mark-fill,
:y1 :=y1-after-stat,
:bar-width :=bar-width,
:boxmode :=boxmode,
:size-range :=size-range,
:theta :=theta,
:size :=size,
:size-type :=size-type,
:z :=z-after-stat,
:lon :=lon,
:aerial.hanami.templates/defaults
{:=textfont :com.rpl.specter.impl/NONE,
:=x0 :com.rpl.specter.impl/NONE,
:=y-type #function[clojure.lang.AFunction/1],
:=coordinates :2d,
:=boxmode :com.rpl.specter.impl/NONE,
:=x0-after-stat :=x0,
:=z-after-stat :=z,
:=splom-traces #function[clojure.lang.AFunction/1],
:=zmax :com.rpl.specter.impl/NONE,
:=layers [],
:=mark-fill :com.rpl.specter.impl/NONE,
:=x1 :com.rpl.specter.impl/NONE,
:=title :com.rpl.specter.impl/NONE,
:=annotations :com.rpl.specter.impl/NONE,
:=z-type #function[clojure.lang.AFunction/1],
:=y1 :com.rpl.specter.impl/NONE,
:=y-type-after-stat #function[clojure.lang.AFunction/1],
:=height 400,
:=box-visible :com.rpl.specter.impl/NONE,
:=mark-symbol :com.rpl.specter.impl/NONE,
:=name :com.rpl.specter.impl/NONE,
:=mark-opacity 0.6,
:=inferred-group #function[clojure.lang.AFunction/1],
:=y-showgrid true,
:=density-bandwidth :com.rpl.specter.impl/NONE,
:=mode #function[clojure.lang.AFunction/1],
:=splom-layout #function[clojure.lang.AFunction/1],
:=y-title :com.rpl.specter.impl/NONE,
:=z-type-after-stat #function[clojure.lang.AFunction/1],
:=size :com.rpl.specter.impl/NONE,
:=model-options {:model-type :metamorph.ml/ols},
:=group :=inferred-group,
:=y0 :com.rpl.specter.impl/NONE,
:=mark-size 20,
:=violinmode :com.rpl.specter.impl/NONE,
:=design-matrix #function[clojure.lang.AFunction/1],
:=size-type #function[clojure.lang.AFunction/1],
:=zmin :com.rpl.specter.impl/NONE,
:=x-showgrid true,
:=r :com.rpl.specter.impl/NONE,
:=color :species,
:=mark-color :com.rpl.specter.impl/NONE,
:=bar-width :com.rpl.specter.impl/NONE,
:=y1-after-stat :=y1,
:=x :sepal-width,
:=symbol :com.rpl.specter.impl/NONE,
:=x-after-stat :=x,
:=yaxis-gridcolor "rgb(255,255,255)",
:=lon :com.rpl.specter.impl/NONE,
:=text :com.rpl.specter.impl/NONE,
:=type #function[clojure.lang.AFunction/1],
:=x-type-after-stat #function[clojure.lang.AFunction/1],
:=traces #function[clojure.lang.AFunction/1],
:=x-type #function[clojure.lang.AFunction/1],
:=histogram-nbins 10,
:=automargin false,
:=stat :=dataset,
:=z :z,
:=width 500,
:=lat :com.rpl.specter.impl/NONE,
:=margin {:t 25},
:=color-type #function[clojure.lang.AFunction/1],
:=xaxis-gridcolor "rgb(255,255,255)",
:=mark :point,
:=size-range [10 30],
:=x-title :com.rpl.specter.impl/NONE,
:=colorscale :com.rpl.specter.impl/NONE,
:=layout #function[clojure.lang.AFunction/1],
:=colnames #function[clojure.lang.AFunction/1],
:=y :sepal-length,
:=x1-after-stat :=x1,
:=dataset
https://vincentarelbundock.github.io/Rdatasets/csv/datasets/iris.csv [10 6]:
| :rownames | :sepal-length | :sepal-width | :petal-length | :petal-width | :species |
|----------:|--------------:|-------------:|--------------:|-------------:|------------|
| 27 | 5.0 | 3.4 | 1.6 | 0.4 | setosa |
| 97 | 5.7 | 2.9 | 4.2 | 1.3 | versicolor |
| 127 | 6.2 | 2.8 | 4.8 | 1.8 | virginica |
| 92 | 6.1 | 3.0 | 4.6 | 1.4 | versicolor |
| 7 | 4.6 | 3.4 | 1.4 | 0.3 | setosa |
| 95 | 5.6 | 2.7 | 4.2 | 1.3 | versicolor |
| 125 | 6.7 | 3.3 | 5.7 | 2.1 | virginica |
| 61 | 5.0 | 2.0 | 3.5 | 1.0 | versicolor |
| 73 | 6.3 | 2.5 | 4.9 | 1.5 | versicolor |
| 42 | 4.5 | 2.3 | 1.3 | 0.3 | setosa |
,
:=background "rgb(235,235,235)",
:=theta :com.rpl.specter.impl/NONE,
:=y0-after-stat :=y0,
:=y-after-stat :=y,
:=predictors [:=x],
:=marker-size-key #function[clojure.lang.AFunction/1],
:=meanline-visible :com.rpl.specter.impl/NONE},
:lat :=lat,
:y0 :=y0-after-stat,
:zmax :=zmax,
:annotations :=annotations,
:inferred-group :=inferred-group,
:marker-override
{:color :=mark-color,
:=marker-size-key :=mark-size,
:symbol :=mark-symbol,
:colorscale :=colorscale},
:x :=x-after-stat,
:x1 :=x1-after-stat,
:x0 :=x0-after-stat,
:zmin :=zmin,
:y-title :=y-title,
:box-visible :=box-visible,
:dataset :=stat,
:violinmode :=violinmode,
:text :=text}],
:=mark-fill :com.rpl.specter.impl/NONE,
:=x1 :com.rpl.specter.impl/NONE,
:=title :com.rpl.specter.impl/NONE,
:=annotations :com.rpl.specter.impl/NONE,
:=z-type #function[clojure.lang.AFunction/1],
:=y1 :com.rpl.specter.impl/NONE,
:=y-type-after-stat #function[clojure.lang.AFunction/1],
:=height 400,
:=box-visible :com.rpl.specter.impl/NONE,
:=mark-symbol :com.rpl.specter.impl/NONE,
:=name :com.rpl.specter.impl/NONE,
:=mark-opacity :com.rpl.specter.impl/NONE,
:=inferred-group #function[clojure.lang.AFunction/1],
:=y-showgrid true,
:=density-bandwidth :com.rpl.specter.impl/NONE,
:=mode #function[clojure.lang.AFunction/1],
:=splom-layout #function[clojure.lang.AFunction/1],
:=y-title :com.rpl.specter.impl/NONE,
:=z-type-after-stat #function[clojure.lang.AFunction/1],
:=size :com.rpl.specter.impl/NONE,
:=model-options {:model-type :metamorph.ml/ols},
:=group :=inferred-group,
:=y0 :com.rpl.specter.impl/NONE,
:=mark-size :com.rpl.specter.impl/NONE,
:=violinmode :com.rpl.specter.impl/NONE,
:=design-matrix #function[clojure.lang.AFunction/1],
:=size-type #function[clojure.lang.AFunction/1],
:=zmin :com.rpl.specter.impl/NONE,
:=x-showgrid true,
:=r :com.rpl.specter.impl/NONE,
:=color :com.rpl.specter.impl/NONE,
:=mark-color :com.rpl.specter.impl/NONE,
:=bar-width :com.rpl.specter.impl/NONE,
:=y1-after-stat :=y1,
:=x :x,
:=symbol :com.rpl.specter.impl/NONE,
:=x-after-stat :=x,
:=yaxis-gridcolor "rgb(255,255,255)",
:=lon :com.rpl.specter.impl/NONE,
:=text :com.rpl.specter.impl/NONE,
:=type #function[clojure.lang.AFunction/1],
:=x-type-after-stat #function[clojure.lang.AFunction/1],
:=traces #function[clojure.lang.AFunction/1],
:=x-type #function[clojure.lang.AFunction/1],
:=histogram-nbins 10,
:=automargin false,
:=stat :=dataset,
:=z :z,
:=width 500,
:=lat :com.rpl.specter.impl/NONE,
:=margin {:t 25},
:=color-type #function[clojure.lang.AFunction/1],
:=xaxis-gridcolor "rgb(255,255,255)",
:=mark :point,
:=size-range [10 30],
:=x-title :com.rpl.specter.impl/NONE,
:=colorscale :com.rpl.specter.impl/NONE,
:=layout #function[clojure.lang.AFunction/1],
:=colnames #function[clojure.lang.AFunction/1],
:=y :y,
:=x1-after-stat :=x1,
:=dataset
https://vincentarelbundock.github.io/Rdatasets/csv/datasets/iris.csv [10 6]:
| :rownames | :sepal-length | :sepal-width | :petal-length | :petal-width | :species |
|----------:|--------------:|-------------:|--------------:|-------------:|------------|
| 27 | 5.0 | 3.4 | 1.6 | 0.4 | setosa |
| 97 | 5.7 | 2.9 | 4.2 | 1.3 | versicolor |
| 127 | 6.2 | 2.8 | 4.8 | 1.8 | virginica |
| 92 | 6.1 | 3.0 | 4.6 | 1.4 | versicolor |
| 7 | 4.6 | 3.4 | 1.4 | 0.3 | setosa |
| 95 | 5.6 | 2.7 | 4.2 | 1.3 | versicolor |
| 125 | 6.7 | 3.3 | 5.7 | 2.1 | virginica |
| 61 | 5.0 | 2.0 | 3.5 | 1.0 | versicolor |
| 73 | 6.3 | 2.5 | 4.9 | 1.5 | versicolor |
| 42 | 4.5 | 2.3 | 1.3 | 0.3 | setosa |
,
:=background "rgb(235,235,235)",
:=theta :com.rpl.specter.impl/NONE,
:=y0-after-stat :=y0,
:=y-after-stat :=y,
:=predictors [:=x],
:=marker-size-key #function[clojure.lang.AFunction/1],
:=meanline-visible :com.rpl.specter.impl/NONE},
:kindly/f #'scicloj.tableplot.v1.plotly/plotly-xform}
This template has all the necessary knowledge, including the substitution keys, to turn into a plot. This happens when your visual tool (e.g., Clay) displays the plot. The tool knows what to do thanks to the Kindly metadata and a special function attached to the plot. For example, the metadata lets Clay know that the template should be transformed into a specification with template keys and values replaced with what Plotly.js needs.
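The role of the metadata and the attached function can be imitated in a few lines of plain Clojure. This is a toy model under stated assumptions, not how Clay or Kindly are implemented: toy-annotated and toy-display are hypothetical names, and the dispatch rule (call the function stored under :kindly/f when the metadata says :kind/fn) is a simplification.

```clojure
;; A value that carries its own "how to realize me" function,
;; plus metadata announcing that it should be treated that way.
(def toy-annotated
  (with-meta
    {:greeting :=name
     :kindly/f (fn [template] (assoc template :greeting "world"))}
    {:kindly/kind :kind/fn}))

(defn toy-display
  "If the value is annotated as :kind/fn, apply its attached function
  to the rest of the map; otherwise return the value unchanged."
  [value]
  (if (= :kind/fn (:kindly/kind (meta value)))
    ((:kindly/f value) (dissoc value :kindly/f))
    value))

(toy-display toy-annotated)
;; => {:greeting "world"}
```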
(meta example1)
#:kindly{:kind :kind/fn, :options nil}
(:kindly/f example1)
#'scicloj.tableplot.v1.plotly/plotly-xform
2.5 Realizing the plot and further customization
If you wish to see the resulting EDN plot specification before displaying it as a plot, you can use the plot function. You can also use this specification for customizations that might not be supported by the Plotly Hanami keys mentioned above.
In this case, it generates a Plotly.js plot:
(-> example1
plotly/plot
kind/pprint)
{:data
[{:y [5.0 4.6 4.5],
:r nil,
:name "setosa",
:marker {:color "#1B9E77", :size 20},
:fill nil,
:mode :markers,
:width nil,
:type "scatter",
:theta nil,
:z nil,
:opacity 0.6,
:lon nil,
:lat nil,
:x [3.4 3.4 2.3],
:text nil}
{:y [5.7 6.1 5.6 5.0 6.3],
:r nil,
:name "versicolor",
:marker {:color "#D95F02", :size 20},
:fill nil,
:mode :markers,
:width nil,
:type "scatter",
:theta nil,
:z nil,
:opacity 0.6,
:lon nil,
:lat nil,
:x [2.9 3.0 2.7 2.0 2.5],
:text nil}
{:y [6.2 6.7],
:r nil,
:name "virginica",
:marker {:color "#7570B3", :size 20},
:fill nil,
:mode :markers,
:width nil,
:type "scatter",
:theta nil,
:z nil,
:opacity 0.6,
:lon nil,
:lat nil,
:x [2.8 3.3],
:text nil}],
:layout
{:width 500,
:height 400,
:margin {:t 25},
:automargin false,
:plot_bgcolor "rgb(235,235,235)",
:xaxis
{:gridcolor "rgb(255,255,255)", :title :sepal-width, :showgrid true},
:yaxis
{:gridcolor "rgb(255,255,255)",
:title :sepal-length,
:showgrid true},
:title nil}}
It is annotated as kind/plotly, so that visual tools know how to render it.
(-> example1
plotly/plot
meta)
#:kindly{:kind :kind/plotly, :options {:style {:height :auto}}}
You can manipulate the resulting Plotly EDN specification with arbitrary Clojure functions. In Kindly-compatible tools like Clay, the modified EDN is then used, by default, to generate the plot.
As a simple illustration, let us change the background colour this way.
We can use assoc-in to modify the value of :plot_bgcolor, nested near the end of the plot specification displayed above. (One could also do this using the :=background Hanami key.)
(-> example1
plotly/plot
(assoc-in [:layout :plot_bgcolor] "#eeeedd"))
Manipulating the Plotly EDN allows customizations that might not yet be supported by the Tableplot Plotly API. The next example compresses distances in the y direction, using a capability of Plotly.js that isn't directly supported using Tableplot's Hanami keys.
(-> example1
plotly/plot
(assoc-in [:layout :yaxis :scaleanchor] :x)
(assoc-in [:layout :yaxis :scaleratio] 0.25))
2.6 Field type inference
Tableplot infers the type of relevant fields from the data.
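As a rough illustration of what type inference means, the sketch below guesses a field type from a column's values and lets an explicit parameter override the guess. The rules and the names toy-column-type and toy-resolve-type are assumptions made for this example, as is the :quantitative keyword; Tableplot's actual inference rules are not shown in this chapter and may differ.

```clojure
(defn toy-column-type
  "Guess a field type from a column's values (toy rules, assumed)."
  [values]
  (cond
    (every? number? values) :quantitative
    (every? #(instance? java.time.temporal.Temporal %) values) :temporal
    :else :nominal))

(defn toy-resolve-type
  "Use an explicit :=color-type if given, else fall back to the guess."
  [params values]
  (get params :=color-type (toy-column-type values)))

(toy-column-type [3.4 2.9 2.8])                       ;; => :quantitative
(toy-column-type ["setosa" "versicolor"])             ;; => :nominal
(toy-column-type [(java.time.LocalDate/of 1967 7 1)]) ;; => :temporal

;; An explicit type wins over the inferred one.
(toy-resolve-type {:=color-type :nominal} [4 6 8])    ;; => :nominal
```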
The Iris examples above were drawn with distinct colours because the :species column is nominal.
In the following example, the coloring is by a quantitative column, so a color gradient is used:
(-> (rdatasets/datasets-mtcars)
(plotly/layer-point
{:=x :mpg
:=y :disp
:=color :cyl
:=mark-size 20}))
We can override the inferred types and thus affect the generated plot:
(-> (rdatasets/datasets-mtcars)
(plotly/layer-point
{:=x :mpg
:=y :disp
:=color :cyl
:=color-type :nominal
:=mark-size 20}))
2.7 More examples
2.7.1 Boxplot
(-> (rdatasets/datasets-mtcars)
(plotly/layer-boxplot
{:=x :cyl
:=y :disp}))
2.7.2 Violin plot
(-> (rdatasets/datasets-mtcars)
(plotly/layer-violin
{:=x :cyl
:=y :disp}))
Violin plot with box inside:
(-> (rdatasets/datasets-mtcars)
(plotly/layer-violin
{:=x :cyl
:=y :disp
:=box-visible true}))
2.7.3 Area chart
(-> (rdatasets/datasets-mtcars)
(tc/group-by [:cyl])
(tc/aggregate {:total-disp #(-> % :disp tcc/sum)})
(tc/order-by [:cyl])
(plotly/layer-line
{:=x :cyl
:=mark-fill :tozeroy
:=y :total-disp}))
2.7.4 Bar chart
(-> (rdatasets/datasets-mtcars)
(tc/group-by [:cyl])
(tc/aggregate {:total-disp #(-> % :disp tcc/sum)})
(plotly/layer-bar
{:=x :cyl
:=y :total-disp}))
(-> (rdatasets/datasets-mtcars)
(tc/group-by [:cyl])
(tc/aggregate {:total-disp #(-> % :disp tcc/sum)})
(tc/add-column :bar-width 0.5)
(plotly/layer-bar
{:=x :cyl
:=bar-width :bar-width
:=y :total-disp}))
2.7.5 Text
(-> (rdatasets/datasets-mtcars)
(plotly/layer-text
{:=x :mpg
:=y :disp
:=text :cyl
:=mark-size 20}))
(-> (rdatasets/datasets-mtcars)
(plotly/layer-text
{:=x :mpg
:=y :disp
:=text :cyl
:=textfont {:family "Courier New, monospace"
:size 16
:color :purple}
:=mark-size 20}))
2.7.6 Heatmap
Heatmaps are useful for visualizing 2D data or correlation matrices:
(-> {:x (range 5)
:y (range 5)
:z (for [i (range 5)]
(for [j (range 5)]
(+ i j)))}
tc/dataset
(plotly/layer-heatmap {:=colorscale :Viridis}))
2.7.7 Segment plot
(-> (rdatasets/datasets-iris)
(plotly/layer-segment
{:=x0 :sepal-width
:=y0 :sepal-length
:=x1 :petal-width
:=y1 :petal-length
:=mark-opacity 0.4
:=mark-size 3
:=color :species}))
2.8 Varying color and size
(-> {:x (range 10)}
tc/dataset
(plotly/layer-point {:=x :x
:=y :x
:=mark-size (range 15 65 5)
:=mark-color ["#bebada", "#fdb462", "#fb8072", "#d9d9d9", "#bc80bd",
"#b3de69", "#8dd3c7", "#80b1d3", "#fccde5", "#ffffb3"]}))
(-> {:ABCD (range 1 11)
:EFGH [5 2.5 5 7.5 5 2.5 7.5 4.5 5.5 5]
:IJKL [:A :A :A :A :A :B :B :B :B :B]
:MNOP [:C :D :C :D :C :D :C :D :C :D]}
tc/dataset
(plotly/base {:=title "IJKLMNOP"})
(plotly/layer-point {:=x :ABCD
:=y :EFGH
:=color :IJKL
:=size :MNOP
:=name "QRST1"})
(plotly/layer-line
{:=title "IJKL MNOP"
:=x :ABCD
:=y :ABCD
:=name "QRST2"
:=mark-color "magenta"
:=mark-size 20
:=mark-opacity 0.2}))
2.9 Time series
Date and time fields are handled appropriately. Let us, for example, draw the time series of unemployment counts.
(-> (rdatasets/ggplot2-economics_long)
(tc/select-rows #(-> % :variable (= "unemploy")))
(plotly/layer-line
{:=x :date
:=y :value
:=mark-color "purple"}))
2.10 Multivariate visualization with SPLOM
A Scatter Plot Matrix (SPLOM) is useful for exploring relationships between multiple variables at once:
(-> (rdatasets/datasets-iris)
(plotly/splom {:=colnames [:sepal-width :sepal-length :petal-width :petal-length]
:=color :species
:=height 600
:=width 600}))
2.11 Multiple layers
We can draw more than one layer:
(-> (rdatasets/ggplot2-economics_long)
(tc/select-rows #(-> % :variable (= "unemploy")))
(plotly/layer-point {:=x :date
:=y :value
:=mark-color "green"
:=mark-size 20
:=mark-opacity 0.5})
(plotly/layer-line {:=x :date
:=y :value
:=mark-color "purple"}))
We can also use the base function for the common parameters across layers:
(-> (rdatasets/ggplot2-economics_long)
(tc/select-rows #(-> % :variable (= "unemploy")))
(plotly/base {:=x :date
:=y :value})
(plotly/layer-point {:=mark-color "green"
:=mark-size 20
:=mark-opacity 0.5})
(plotly/layer-line {:=mark-color "purple"}))
Layers can be named:
(-> (rdatasets/ggplot2-economics_long)
(tc/select-rows #(-> % :variable (= "unemploy")))
(plotly/base {:=x :date
:=y :value})
(plotly/layer-point {:=mark-color "green"
:=mark-size 20
:=mark-opacity 0.5
:=name "points"})
(plotly/layer-line {:=mark-color "purple"
:=name "line"}))
2.12 Updating data
We can use the update-data function to vary the dataset along a plotting pipeline, affecting the layers that follow.
This functionality is inspired by ggbuilder and metamorph.
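One way to picture this is as a pipeline context that carries a current dataset: each layer captures the dataset at the point where it is added, and update-data replaces the dataset for whatever comes next. The sketch below is a toy model in plain Clojure, with hypothetical names (toy-layer, toy-update-data); it is not Tableplot's implementation.

```clojure
(defn toy-layer
  "Add a layer that remembers the dataset current at this point."
  [ctx mark]
  (update ctx :layers conj {:mark mark :dataset (:dataset ctx)}))

(defn toy-update-data
  "Replace the current dataset with (f dataset & args)."
  [ctx f & args]
  (apply update ctx :dataset f args))

(def toy-result
  (-> {:dataset (vec (range 10)) :layers []}
      (toy-layer :line)
      ;; Keep only the first 5 rows for the layers that follow.
      (toy-update-data (fn [rows n] (vec (take n rows))) 5)
      (toy-layer :point)))

(mapv (fn [layer] [(:mark layer) (count (:dataset layer))])
      (:layers toy-result))
;; => [[:line 10] [:point 5]]
```

The line layer keeps all ten rows because it was added before the dataset changed; only the point layer sees the reduced data.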
Here, for example, we draw a line, then sample 5 data rows, and draw them as points:
(-> (rdatasets/ggplot2-economics_long)
(tc/select-rows #(-> % :variable (= "unemploy")))
(plotly/base {:=x :date
:=y :value})
(plotly/layer-line {:=mark-color "purple"})
(plotly/update-data tc/random 5)
(plotly/layer-point {:=mark-color "green"
:=mark-size 15
:=mark-opacity 0.5}))
2.13 Overriding layer data
(-> (tc/dataset {:x (range 4)
:y [1 2 5 9]})
tc/dataset
(tc/sq :y :x)
(plotly/layer-point {:=mark-size 20})
(plotly/layer-line {:=dataset (tc/dataset {:x [0 3]
:y [1 10]})
:=mark-size 5}))
2.14 Smoothing
layer-smooth is a layer that applies statistical regression methods to the dataset to model it as a smooth shape. It is inspired by ggplot's geom_smooth.
(-> (rdatasets/datasets-iris)
(plotly/base {:=x :sepal-width
:=y :sepal-length})
(plotly/layer-point {:=mark-color "green"
:=name "Actual"})
(plotly/layer-smooth {:=mark-color "orange"
:=name "Predicted"}))
By default, the regression is computed with only one predictor variable, which is :=x. But this can be overridden using the :=predictors key. We may compute a regression with more than one predictor.
(-> (rdatasets/datasets-iris)
(plotly/base {:=x :sepal-width
:=y :sepal-length})
(plotly/layer-point {:=mark-color "green"
:=name "Actual"})
(plotly/layer-smooth {:=predictors [:petal-width
:petal-length]
:=mark-opacity 0.5
:=name "Predicted"}))
We can also specify the predictor columns as expressions through the :=design-matrix key. Here, we use the design matrix functionality of Metamorph.ml.
(-> (rdatasets/datasets-iris)
(plotly/base {:=x :sepal-width
:=y :sepal-length})
(plotly/layer-point {:=mark-color "green"
:=name "Actual"})
(plotly/layer-smooth {:=design-matrix [[:sepal-width '(identity :sepal-width)]
[:sepal-width-2 '(* :sepal-width
:sepal-width)]]
:=mark-opacity 0.5
:=name "Predicted"}))
Inspired by Sami Kallinen's Heart of Clojure talk:
(-> (rdatasets/datasets-iris)
(plotly/base {:=x :sepal-width
:=y :sepal-length})
(plotly/layer-point {:=mark-color "green"
:=name "Actual"})
(plotly/layer-smooth {:=design-matrix [[:sepal-width '(identity :sepal-width)]
[:sepal-width-2 '(* :sepal-width
:sepal-width)]
[:sepal-width-3 '(* :sepal-width
:sepal-width
:sepal-width)]]
:=mark-opacity 0.5
:=name "Predicted"}))
One can also provide the regression model details through :=model-options and use any regression model and parameters registered by Metamorph.ml.
(require 'scicloj.ml.tribuo)
(def regression-tree-options
{:model-type :scicloj.ml.tribuo/regression
:tribuo-components [{:name "cart"
:type "org.tribuo.regression.rtree.CARTRegressionTrainer"
:properties {:maxDepth "8"
:fractionFeaturesInSplit "1.0"
:seed "12345"
:impurity "mse"}}
{:name "mse"
:type "org.tribuo.regression.rtree.impurity.MeanSquaredError"}]
:tribuo-trainer-name "cart"})
(-> (rdatasets/datasets-iris)
(plotly/base {:=x :sepal-width
:=y :sepal-length})
(plotly/layer-point {:=mark-color "green"
:=name "Actual"})
(plotly/layer-smooth {:=model-options regression-tree-options
:=mark-opacity 0.5
:=name "Predicted"}))
An example inspired by Plotly's ML Regression in Python example.
(defonce tips
(-> "https://raw.githubusercontent.com/plotly/datasets/master/tips.csv"
(tc/dataset {:key-fn keyword})))
(-> tips
(tc/split :holdout {:seed 1})
(plotly/base {:=x :total_bill
:=y :tip})
(plotly/layer-point {:=color :$split-name})
(plotly/update-data (fn [ds]
(-> ds
(tc/select-rows #(-> % :$split-name (= :train))))))
(plotly/layer-smooth {:=model-options regression-tree-options
:=name "prediction"
:=mark-color "purple"}))
2.15 Grouping
The regression computed by layer-smooth is affected by the inferred grouping of the data.
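To see why grouping matters, consider fitting a straight line by least squares once per group versus once for all rows. The following is a self-contained sketch with made-up data; ols-line and fit are hypothetical helpers and not the model used by layer-smooth.

```clojure
(defn ols-line
  "Least-squares line through a collection of [x y] pairs."
  [points]
  (let [n     (count points)
        mean  (fn [vs] (/ (reduce + vs) n))
        mx    (mean (map first points))
        my    (mean (map second points))
        sxy   (reduce + (map (fn [[x y]] (* (- x mx) (- y my))) points))
        sxx   (reduce + (map (fn [[x _]] (* (- x mx) (- x mx))) points))
        slope (/ sxy sxx)]
    {:slope slope
     :intercept (- my (* slope mx))}))

;; Made-up rows: group :a rises, group :b falls.
(def rows
  [{:group :a :x 0 :y 0} {:group :a :x 1 :y 1} {:group :a :x 2 :y 2}
   {:group :b :x 0 :y 4} {:group :b :x 1 :y 2} {:group :b :x 2 :y 0}])

(defn fit [rows]
  (ols-line (map (juxt :x :y) rows)))

;; One line per group, as with an inferred grouping.
(def per-group
  (into {} (map (fn [[g rs]] [g (fit rs)])) (group-by :group rows)))

;; A single line for all rows, as when grouping is avoided.
(def pooled (fit rows))
```

With this data the per-group slopes are 1 and -2, while the pooled fit has slope -1/2, so the two approaches can tell quite different stories about the same rows.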
For example, here we receive three regression lines, one for each species.
(-> (rdatasets/datasets-iris)
(plotly/base {:=title "dummy"
:=color :species
:=x :sepal-width
:=y :sepal-length})
plotly/layer-point
plotly/layer-smooth)
This happened because the :=color field was :species, which is of :nominal type.
But we may override this using the :=group key. For example, let us avoid grouping:
(-> (rdatasets/datasets-iris)
(plotly/base {:=title "dummy"
:=color :species
:=group []
:=x :sepal-width
:=y :sepal-length})
plotly/layer-point
plotly/layer-smooth)
Alternatively, we may assign the :=color only to the points layer without affecting the smoothing layer.
(-> (rdatasets/datasets-iris)
(plotly/base {:=title "dummy"
:=x :sepal-width
:=y :sepal-length})
(plotly/layer-point {:=color :species})
(plotly/layer-smooth {:=name "Predicted"
:=mark-color "blue"}))
2.16 Example: out-of-sample predictions
Here is a slightly more elaborate example inspired by the London Clojurians talk mentioned in the preface.
Assume we wish to predict the unemployment rate for 96 months. Let us add those months to our dataset, and mark them as Future (considering the original data as Past):
(-> (rdatasets/ggplot2-economics_long)
(tc/select-rows #(-> % :variable (= "unemploy")))
(tc/add-column :relative-time "Past")
(tc/concat (tc/dataset {:date (-> (rdatasets/ggplot2-economics_long)
:date
last
(datetime/plus-temporal-amount (range 96) :months))
:relative-time "Future"}))
(print/print-range 6))
https://vincentarelbundock.github.io/Rdatasets/csv/ggplot2/economics_long.csv [670 6]:
| :rownames | :date | :variable | :value | :value-01 | :relative-time |
|---|---|---|---|---|---|
| 2297 | 1967-07-01 | unemploy | 2944.0 | 0.02044683 | Past |
| 2298 | 1967-08-01 | unemploy | 2945.0 | 0.02052578 | Past |
| 2299 | 1967-09-01 | unemploy | 2958.0 | 0.02155206 | Past |
| … | … | … | … | … | … |
|  | 2022-12-01 |  |  |  | Future |
|  | 2023-01-01 |  |  |  | Future |
|  | 2023-02-01 |  |  |  | Future |
|  | 2023-03-01 |  |  |  | Future |
Let us represent our dates as numbers, so that we can use them in linear regression:
(-> (rdatasets/ggplot2-economics_long)
(tc/select-rows #(-> % :variable (= "unemploy")))
(tc/add-column :relative-time "Past")
(tc/concat (tc/dataset {:date (-> (rdatasets/ggplot2-economics_long)
:date
last
(datetime/plus-temporal-amount (range 96) :months))
:relative-time "Future"}))
(tc/add-column :year #(datetime/long-temporal-field :years (:date %)))
(tc/add-column :month #(datetime/long-temporal-field :months (:date %)))
(tc/map-columns :yearmonth [:year :month] (fn [y m] (+ m (* 12 y))))
(print/print-range 6))
https://vincentarelbundock.github.io/Rdatasets/csv/ggplot2/economics_long.csv [670 9]:
| :rownames | :date | :variable | :value | :value-01 | :relative-time | :year | :month | :yearmonth |
|---|---|---|---|---|---|---|---|---|
| 2297 | 1967-07-01 | unemploy | 2944.0 | 0.02044683 | Past | 1967 | 7 | 23611 |
| 2298 | 1967-08-01 | unemploy | 2945.0 | 0.02052578 | Past | 1967 | 8 | 23612 |
| 2299 | 1967-09-01 | unemploy | 2958.0 | 0.02155206 | Past | 1967 | 9 | 23613 |
| … | … | … | … | … | … | … | … | … |
|  | 2022-12-01 |  |  |  | Future | 2022 | 12 | 24276 |
|  | 2023-01-01 |  |  |  | Future | 2023 | 1 | 24277 |
|  | 2023-02-01 |  |  |  | Future | 2023 | 2 | 24278 |
|  | 2023-03-01 |  |  |  | Future | 2023 | 3 | 24279 |
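The :yearmonth values in the table can be checked by hand: the index is 12 times the year plus the month number, so consecutive months differ by exactly 1, including across a year boundary. A small sketch with java.time follows; year-month-index is a hypothetical helper written for this check and is not used by the pipeline itself.

```clojure
(import 'java.time.LocalDate)

(defn year-month-index
  "Number of months counted as 12 * year + month-of-year."
  [^LocalDate date]
  (+ (.getMonthValue date)
     (* 12 (.getYear date))))

(year-month-index (LocalDate/parse "1967-07-01")) ;; => 23611
(year-month-index (LocalDate/parse "2022-12-01")) ;; => 24276
(year-month-index (LocalDate/parse "2023-01-01")) ;; => 24277
```

Because the index grows by one per month, it can serve as an evenly spaced numeric predictor for the regression.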
Let us use the same regression line for the Past and Future groups. To do this, we avoid grouping by assigning [] to :=group. The line is affected only by the past, since in the Future, :=y is missing. We use the numerical field :yearmonth as the regression predictor, but for plotting, we still use the :temporal field :date.
(-> (rdatasets/ggplot2-economics_long)
(tc/select-rows #(-> % :variable (= "unemploy")))
(tc/add-column :relative-time "Past")
(tc/concat (tc/dataset {:date (-> (rdatasets/ggplot2-economics_long)
:date
last
(datetime/plus-temporal-amount (range 96) :months))
:relative-time "Future"}))
(tc/add-column :year #(datetime/long-temporal-field :years (:date %)))
(tc/add-column :month #(datetime/long-temporal-field :months (:date %)))
(tc/map-columns :yearmonth [:year :month] (fn [y m] (+ m (* 12 y))))
(plotly/base {:=x :date
:=y :value})
(plotly/layer-smooth {:=color :relative-time
:=mark-size 15
:=group []
:=predictors [:yearmonth]})
;; Keep only the past for the following layer:
(plotly/update-data (fn [dataset]
(-> dataset
(tc/select-rows (fn [row]
(-> row :relative-time (= "Past")))))))
(plotly/layer-line {:=mark-color "purple"
:=mark-size 3
:=name "Actual"}))
2.17 Histograms
Histograms can also be represented as layers with statistical processing:
(-> (rdatasets/datasets-iris)
(plotly/layer-histogram {:=x :sepal-width}))
(-> (rdatasets/datasets-iris)
(plotly/layer-histogram {:=x :sepal-width
:=histogram-nbins 30}))
(-> (rdatasets/datasets-iris)
(plotly/layer-histogram {:=x :sepal-width
:=color :species
:=mark-opacity 0.5}))
2.17.1 2D Histograms
For bivariate data, we can create 2D histograms:
(-> (rdatasets/datasets-iris)
(plotly/layer-histogram2d {:=x :sepal-width
:=y :sepal-length
:=histogram-nbins 15}))
2.18 Density
(experimental)
Density estimates are handled similarly to Histograms:
(-> (rdatasets/datasets-iris)
(plotly/layer-density {:=x :sepal-width}))
(-> (rdatasets/datasets-iris)
(plotly/layer-density {:=x :sepal-width
:=density-bandwidth 0.05}))
(-> (rdatasets/datasets-iris)
(plotly/layer-density {:=x :sepal-width
:=density-bandwidth 1}))
(-> (rdatasets/datasets-iris)
(plotly/layer-density {:=x :sepal-width
:=color :species}))
2.19 Coordinates
(WIP)
2.19.1 geo
Inspired by Plotly's tutorial for Scatter Plots on Maps in JavaScript:
(-> {:lat [45.5, 43.4, 49.13, 51.1, 53.34, 45.24,
44.64, 48.25, 49.89, 50.45]
:lon [-73.57, -79.24, -123.06, -114.1, -113.28,
-75.43, -63.57, -123.21, -97.13, -104.6]
:text ["Montreal", "Toronto", "Vancouver", "Calgary", "Edmonton",
"Ottawa", "Halifax", "Victoria", "Winnipeg", "Regina"]}
tc/dataset
(plotly/base {:=coordinates :geo
:=lat :lat
:=lon :lon})
(plotly/layer-point {:=mark-opacity 0.8
:=mark-color ["#bebada", "#fdb462", "#fb8072", "#d9d9d9", "#bc80bd",
"#b3de69", "#8dd3c7", "#80b1d3", "#fccde5", "#ffffb3"]
:=mark-size 20
:=name "Canadian cities"})
(plotly/layer-text {:=text :text
:=textfont {:size 7
:color :purple}})
plotly/plot
(assoc-in [:layout :geo]
{:scope "north america"
:resolution 10
:lonaxis {:range [-130 -55]}
:lataxis {:range [40 60]}
:countrywidth 1.5
:showland true
:showlakes true
:showrivers true}))
2.19.2 3d
(-> (rdatasets/datasets-iris)
(plotly/layer-point {:=x :sepal-width
:=y :sepal-length
:=z :petal-length
:=color :petal-width
:=coordinates :3d}))
(-> (rdatasets/datasets-iris)
(plotly/layer-point {:=x :sepal-width
:=y :sepal-length
:=z :petal-length
:=color :species
:=coordinates :3d}))
2.19.3 Surface plots
Surface plots are useful for visualizing 3D functions:
(-> {:z (for [i (range 20)]
(for [j (range 20)]
(Math/sin (/ (* i j) 10))))}
tc/dataset
(plotly/layer-surface {:=colorscale :Viridis}))
2.19.4 polar
Monthly rain amounts - polar bar-chart
(def rain-data
(tc/dataset
{:month [:Jan :Feb :Mar :Apr
:May :Jun :Jul :Aug
:Sep :Oct :Nov :Dec]
:rain (repeatedly #(rand-int 200))}))
(-> rain-data
(plotly/layer-bar
{:=r :rain
:=theta :month
:=coordinates :polar
:=mark-size 20
:=mark-opacity 0.6}))
Controlling the polar layout (by manipulating the raw Plotly.js spec):
(-> rain-data
(plotly/base
{})
(plotly/layer-bar
{:=r :rain
:=theta :month
:=coordinates :polar
:=mark-size 20
:=mark-opacity 0.6})
plotly/plot
(assoc-in [:layout :polar]
{:angularaxis {:tickfont {:size 16}
:rotation 90
:direction "counterclockwise"}
:sector [0 180]}))
A polar random walk - polar line-chart
(let [n 50]
(-> {:r (->> (repeatedly n #(- (rand) 0.5))
(reductions +))
:theta (->> (repeatedly n #(* 10 (rand)))
(reductions +)
(map #(rem % 360)))
:color (range n)}
tc/dataset
(plotly/layer-point
{:=r :r
:=theta :theta
:=coordinates :polar
:=mark-size 10
:=mark-opacity 0.6})
(plotly/layer-line
{:=r :r
:=theta :theta
:=coordinates :polar
:=mark-size 3
:=mark-opacity 0.6})))
2.20 Debugging (WIP)
Tableplot provides several debugging utilities to help understand how plots are constructed internally.
2.20.1 Viewing the computational dag of substitution keys:
(def example-to-debug
(-> (rdatasets/datasets-iris)
(tc/random 10 {:seed 1})
(plotly/layer-point {:=x :sepal-width
:=y :sepal-length
:=color :species})))
(-> example-to-debug
plotly/dag)
2.20.2 Viewing intermediate values in the computational dag:
Layers (Tableplot's intermediate data representation)
(-> example-to-debug
(plotly/debug :=layers)
kind/pprint)
[{:y :sepal-length,
:trace-base {:mode :markers, :type "scatter"},
:color-type :nominal,
:coordinates :2d,
:group (:species),
:color :species,
:mark :point,
:size-range [10 30],
:z :z,
:inferred-group (:species),
:x :sepal-width,
:dataset
https://vincentarelbundock.github.io/Rdatasets/csv/datasets/iris.csv [10 6]:
| :rownames | :sepal-length | :sepal-width | :petal-length | :petal-width | :species |
|----------:|--------------:|-------------:|--------------:|-------------:|------------|
| 27 | 5.0 | 3.4 | 1.6 | 0.4 | setosa |
| 97 | 5.7 | 2.9 | 4.2 | 1.3 | versicolor |
| 127 | 6.2 | 2.8 | 4.8 | 1.8 | virginica |
| 92 | 6.1 | 3.0 | 4.6 | 1.4 | versicolor |
| 7 | 4.6 | 3.4 | 1.4 | 0.3 | setosa |
| 95 | 5.6 | 2.7 | 4.2 | 1.3 | versicolor |
| 125 | 6.7 | 3.3 | 5.7 | 2.1 | virginica |
| 61 | 5.0 | 2.0 | 3.5 | 1.0 | versicolor |
| 73 | 6.3 | 2.5 | 4.9 | 1.5 | versicolor |
| 42 | 4.5 | 2.3 | 1.3 | 0.3 | setosa |
}]
Traces (part of the Plotly spec)
(-> example-to-debug
(plotly/debug :=traces)
kind/pprint)
[{:y [5.0 4.6 4.5],
:r nil,
:name "setosa",
:marker {:color "#1B9E77"},
:fill nil,
:mode :markers,
:width nil,
:type "scatter",
:theta nil,
:z nil,
:lon nil,
:lat nil,
:x [3.4 3.4 2.3],
:text nil}
{:y [5.7 6.1 5.6 5.0 6.3],
:r nil,
:name "versicolor",
:marker {:color "#D95F02"},
:fill nil,
:mode :markers,
:width nil,
:type "scatter",
:theta nil,
:z nil,
:lon nil,
:lat nil,
:x [2.9 3.0 2.7 2.0 2.5],
:text nil}
{:y [6.2 6.7],
:r nil,
:name "virginica",
:marker {:color "#7570B3"},
:fill nil,
:mode :markers,
:width nil,
:type "scatter",
:theta nil,
:z nil,
:lon nil,
:lat nil,
:x [2.8 3.3],
:text nil}]
Both
(-> example-to-debug
(plotly/debug {:layers :=layers
:traces :=traces})
kind/pprint)
{:layers
[{:y :sepal-length,
:trace-base {:mode :markers, :type "scatter"},
:color-type :nominal,
:coordinates :2d,
:group (:species),
:color :species,
:mark :point,
:size-range [10 30],
:z :z,
:inferred-group (:species),
:x :sepal-width,
:dataset
https://vincentarelbundock.github.io/Rdatasets/csv/datasets/iris.csv [10 6]:
| :rownames | :sepal-length | :sepal-width | :petal-length | :petal-width | :species |
|----------:|--------------:|-------------:|--------------:|-------------:|------------|
| 27 | 5.0 | 3.4 | 1.6 | 0.4 | setosa |
| 97 | 5.7 | 2.9 | 4.2 | 1.3 | versicolor |
| 127 | 6.2 | 2.8 | 4.8 | 1.8 | virginica |
| 92 | 6.1 | 3.0 | 4.6 | 1.4 | versicolor |
| 7 | 4.6 | 3.4 | 1.4 | 0.3 | setosa |
| 95 | 5.6 | 2.7 | 4.2 | 1.3 | versicolor |
| 125 | 6.7 | 3.3 | 5.7 | 2.1 | virginica |
| 61 | 5.0 | 2.0 | 3.5 | 1.0 | versicolor |
| 73 | 6.3 | 2.5 | 4.9 | 1.5 | versicolor |
| 42 | 4.5 | 2.3 | 1.3 | 0.3 | setosa |
}],
:traces
[{:y [5.0 4.6 4.5],
:r nil,
:name "setosa",
:marker {:color "#1B9E77"},
:fill nil,
:mode :markers,
:width nil,
:type "scatter",
:theta nil,
:z nil,
:lon nil,
:lat nil,
:x [3.4 3.4 2.3],
:text nil}
{:y [5.7 6.1 5.6 5.0 6.3],
:r nil,
:name "versicolor",
:marker {:color "#D95F02"},
:fill nil,
:mode :markers,
:width nil,
:type "scatter",
:theta nil,
:z nil,
:lon nil,
:lat nil,
:x [2.9 3.0 2.7 2.0 2.5],
:text nil}
{:y [6.2 6.7],
:r nil,
:name "virginica",
:marker {:color "#7570B3"},
:fill nil,
:mode :markers,
:width nil,
:type "scatter",
:theta nil,
:z nil,
:lon nil,
:lat nil,
:x [2.8 3.3],
:text nil}]}
2.20.3 Quick inspection of key values
You can quickly inspect the value of any substitution key:
(-> example-to-debug
(plotly/debug :=background))
"rgb(235,235,235)"
2.20.4 Viewing the final Plotly.js specification
Use the plot function to see the final EDN specification, which is then converted to JSON and passed to Plotly.js:
(-> example-to-debug
plotly/plot
kind/pprint)
{:data
[{:y [5.0 4.6 4.5],
:r nil,
:name "setosa",
:marker {:color "#1B9E77"},
:fill nil,
:mode :markers,
:width nil,
:type "scatter",
:theta nil,
:z nil,
:lon nil,
:lat nil,
:x [3.4 3.4 2.3],
:text nil}
{:y [5.7 6.1 5.6 5.0 6.3],
:r nil,
:name "versicolor",
:marker {:color "#D95F02"},
:fill nil,
:mode :markers,
:width nil,
:type "scatter",
:theta nil,
:z nil,
:lon nil,
:lat nil,
:x [2.9 3.0 2.7 2.0 2.5],
:text nil}
{:y [6.2 6.7],
:r nil,
:name "virginica",
:marker {:color "#7570B3"},
:fill nil,
:mode :markers,
:width nil,
:type "scatter",
:theta nil,
:z nil,
:lon nil,
:lat nil,
:x [2.8 3.3],
:text nil}],
:layout
{:width 500,
:height 400,
:margin {:t 25},
:automargin false,
:plot_bgcolor "rgb(235,235,235)",
:xaxis
{:gridcolor "rgb(255,255,255)", :title :sepal-width, :showgrid true},
:yaxis
{:gridcolor "rgb(255,255,255)",
:title :sepal-length,
:showgrid true},
:title nil}}
2.21 Coming soon
2.21.1 Facets
(coming soon)
2.21.2 Scales
(coming soon)