13  Relationships

Regression, smoothing, density estimation, and heatmaps: revealing structure between two variables.

(ns napkinsketch-book.relationships
  (:require
   ;; Shared datasets for these docs
   [napkinsketch-book.datasets :as data]
   ;; Kindly: notebook rendering protocol
   [scicloj.kindly.v4.kind :as kind]
   ;; Napkinsketch: composable plotting
   [scicloj.napkinsketch.api :as sk]
   ;; Fastmath: random number generation
   [fastmath.random :as rng]))

Linear Regression

Fit a single regression line through all the data.

(-> data/iris
    (sk/lay-point :sepal_length :sepal_width)
    sk/lay-lm)
[Plot: sepal width (y) against sepal length (x)]
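To make the idea of a regression line concrete, here is a minimal sketch of an ordinary least-squares fit in plain Clojure. The helper name `fit-line` is hypothetical, and this is not presented as how Napkinsketch computes the line internally; it only shows the standard closed-form solution for a straight line.

```clojure
;; Ordinary least squares for y = intercept + slope * x.
;; `fit-line` is a hypothetical helper, not part of Napkinsketch.
(defn fit-line
  [xs ys]
  (let [n     (count xs)
        mean  (fn [vs] (/ (reduce + vs) (double n)))
        x-bar (mean xs)
        y-bar (mean ys)
        ;; Sum of cross-deviations and sum of squared x-deviations.
        sxy   (reduce + (map (fn [x y] (* (- x x-bar) (- y y-bar))) xs ys))
        sxx   (reduce + (map (fn [x] (let [d (- x x-bar)] (* d d))) xs))
        slope (/ sxy sxx)]
    {:slope     slope
     :intercept (- y-bar (* slope x-bar))}))

;; Points that lie exactly on y = 1 + 2x.
(fit-line [0 1 2 3] [1 3 5 7])
;; => {:slope 2.0, :intercept 1.0}
```

The sketch assumes at least two distinct x values; otherwise the denominator `sxx` is zero.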

Per-Group Regression

Map :color to a grouping column to fit one regression line per group.

(-> data/iris
    (sk/view :petal_length :petal_width {:color :species})
    sk/lay-point
    sk/lay-lm)
[Plot: petal width (y) against petal length (x), colored by species: setosa, versicolor, virginica]

Regression with Confidence Ribbon

Pass {:se true} to show a 95% confidence band around the line.

(-> data/iris
    (sk/view :sepal_length :sepal_width {:color :species})
    sk/lay-point
    (sk/lay-lm {:se true}))
[Plot: sepal width (y) against sepal length (x), colored by species: setosa, versicolor, virginica]
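A confidence band of this kind is commonly built from the standard error of the fitted mean at each x. The following is a rough sketch in plain Clojure under two stated assumptions: the helper `mean-band` is hypothetical, and the multiplier 1.96 is the normal approximation to a 95% interval (an exact band would use a t quantile with n - 2 degrees of freedom). Whether Napkinsketch uses this exact formula is not confirmed by the text above.

```clojure
;; Hypothetical helper: returns a function of x0 giving the fitted value
;; and an approximate 95% band for the mean response at x0.
(defn mean-band
  [xs ys]
  (let [n     (count xs)
        x-bar (/ (reduce + xs) (double n))
        y-bar (/ (reduce + ys) (double n))
        sxx   (reduce + (map #(Math/pow (- % x-bar) 2.0) xs))
        sxy   (reduce + (map #(* (- %1 x-bar) (- %2 y-bar)) xs ys))
        slope (/ sxy sxx)
        icpt  (- y-bar (* slope x-bar))
        ;; Residual sum of squares and residual standard deviation.
        rss   (reduce + (map #(Math/pow (- %2 (+ icpt (* slope %1))) 2.0) xs ys))
        s     (Math/sqrt (/ rss (- n 2)))
        z     1.96] ; assumption: normal approximation, not a t quantile
    (fn [x0]
      (let [fit (+ icpt (* slope x0))
            se  (* s (Math/sqrt (+ (/ 1.0 n)
                                   (/ (Math/pow (- x0 x-bar) 2.0) sxx))))]
        {:fit fit :lower (- fit (* z se)) :upper (+ fit (* z se))}))))

;; Evaluate the band at a few x positions.
(let [band (mean-band [1 2 3 4 5] [2.1 3.9 6.2 7.8 10.1])]
  (map band [1 3 5]))
```

The band is narrowest at the mean of x and widens toward the edges of the data, which is why ribbons of this kind flare out at both ends.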

Tips with Regression

Do smokers and non-smokers tip differently?

(-> data/tips
    (sk/view :total_bill :tip {:color :smoker})
    sk/lay-point
    sk/lay-lm)
[Plot: tip (y) against total bill (x), colored by smoker: No, Yes]

LOESS Smoothing

A smooth curve through noisy data.

(def noisy-wave (let [r (rng/rng :jdk 42)]
                  {:x (range 50)
                   :y (map #(+ (Math/sin (* % 0.2)) (* 0.3 (- (rng/drandom r) 0.5)))
                           (range 50))}))
(-> noisy-wave
    (sk/lay-point :x :y)
    sk/lay-loess)
[Plot: y against x]
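LOESS is usually described as a weighted straight-line fit repeated at each evaluation point, using only the nearest neighbors. The sketch below illustrates that idea in plain Clojure with tricube weights. The names `tricube` and `loess-at` and the `span` parameter are hypothetical, and this is not a claim about how `sk/lay-loess` is implemented; it also omits the robustness iterations some LOESS variants add.

```clojure
;; Tricube weight: 1 at distance 0, falling to 0 at scaled distance 1.
(defn tricube [u]
  (let [a (Math/abs (double u))]
    (if (< a 1.0)
      (Math/pow (- 1.0 (* a a a)) 3.0)
      0.0)))

;; Hypothetical helper: smoothed value at x0 from a weighted line fit
;; over the fraction `span` of points nearest to x0.
(defn loess-at
  [xs ys span x0]
  (let [k     (max 3 (long (Math/ceil (* span (count xs)))))
        dists (map #(Math/abs (double (- % x0))) xs)
        ;; Bandwidth: distance to the k-th nearest neighbor.
        h     (max 1e-12 (nth (sort dists) (dec k)))
        ws    (map #(tricube (/ % h)) dists)
        sw    (reduce + ws)
        xw    (/ (reduce + (map * ws xs)) sw)
        yw    (/ (reduce + (map * ws ys)) sw)
        sxx   (reduce + (map (fn [w x] (* w (- x xw) (- x xw))) ws xs))
        sxy   (reduce + (map (fn [w x y] (* w (- x xw) (- y yw))) ws xs ys))
        ;; Fall back to the weighted mean if the local x values coincide.
        slope (if (< sxx 1e-12) 0.0 (/ sxy sxx))]
    (+ yw (* slope (- x0 xw)))))

;; Smooth a short series at every observed x.
(let [xs (range 10)
      ys (map #(+ 1 (* 2 %)) xs)]
  (map #(loess-at xs ys 0.5 %) xs))
```

A larger `span` averages over more neighbors and gives a smoother curve; a smaller one follows the data more closely. The sketch assumes distinct x values and at least three points.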

Heatmap (Auto-Binned)

Bin x and y into a grid and count the points in each cell.

(-> data/iris
    (sk/lay-tile :sepal_length :sepal_width))
[Plot: sepal width (y) against sepal length (x)]
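The binning step described above can be sketched in plain Clojure as follows. The helpers `bin-index` and `bin-2d` and the explicit bin counts `nx` and `ny` are hypothetical; how Napkinsketch chooses the number of bins automatically is not covered here.

```clojure
;; Index of the bin containing v, for n equal-width bins spanning [lo, hi].
;; The maximum value is clamped into the last bin.
(defn bin-index
  [lo hi n v]
  (let [width (/ (- hi lo) (double n))]
    (min (dec n) (long (Math/floor (/ (- v lo) width))))))

;; Hypothetical helper: map of [x-bin y-bin] -> number of points in that cell.
(defn bin-2d
  [xs ys nx ny]
  (let [x-lo (apply min xs) x-hi (apply max xs)
        y-lo (apply min ys) y-hi (apply max ys)]
    (frequencies
     (map (fn [x y]
            [(bin-index x-lo x-hi nx x)
             (bin-index y-lo y-hi ny y)])
          xs ys))))

;; Four points on a 2x2 grid: two land in each diagonal cell.
(bin-2d [0.0 0.1 0.9 1.0] [0.0 0.2 0.8 1.0] 2 2)
;; => {[0 0] 2, [1 1] 2}
```

Cells with no points are simply absent from the map, so a renderer would treat missing keys as a count of zero.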

Heatmap (Pre-Computed)

Use a numeric column for tile color.

(def grid-data
  (let [r (rng/rng :jdk 99)]
    {:x (for [i (range 5) _j (range 5)] i)
     :y (for [_i (range 5) j (range 5)] j)
     :value (repeatedly 25 #(rng/irandom r 100))}))
(-> grid-data
    (sk/lay-tile :x :y {:fill :value}))
[Plot: y against x]

Density 2D

KDE-smoothed 2D density heatmap.

(-> data/iris
    (sk/lay-density2d :sepal_length :sepal_width))
[Plot: sepal width (y) against sepal length (x)]
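For readers unfamiliar with kernel density estimation, the sketch below evaluates a 2D KDE at a single location using a product of Gaussian kernels. The names `gaussian` and `kde-2d` and the explicit bandwidths `hx` and `hy` are hypothetical; the kernel and bandwidth rule that `sk/lay-density2d` actually uses are not stated in this chapter.

```clojure
;; Standard normal density.
(defn gaussian [u]
  (/ (Math/exp (* -0.5 u u))
     (Math/sqrt (* 2.0 Math/PI))))

;; Hypothetical helper: density estimate at (x0, y0), using a product of
;; Gaussian kernels with bandwidths hx and hy.
(defn kde-2d
  [xs ys hx hy x0 y0]
  (/ (reduce + (map (fn [x y]
                      (* (gaussian (/ (- x0 x) hx))
                         (gaussian (/ (- y0 y) hy))))
                    xs ys))
     (* (count xs) hx hy)))

;; Evaluate on a coarse grid; a heatmap would color each cell by its density.
(let [xs [1.0 1.2 0.8 3.0]
      ys [1.0 0.9 1.1 3.0]]
  (for [gx [0.0 1.0 2.0 3.0]
        gy [0.0 1.0 2.0 3.0]]
    {:x gx :y gy :density (kde-2d xs ys 0.5 0.5 gx gy)}))
```

Because each kernel extends beyond the data points, the estimated surface covers a wider range than the raw observations.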

Density 2D with Points

Overlay scatter points on the density heatmap.

(-> data/iris
    (sk/lay-density2d :sepal_length :sepal_width)
    (sk/lay-point {:alpha 0.5}))
[Plot: sepal width (y) against sepal length (x)]

Contour Lines

Iso-density contour lines from 2D KDE.

(-> data/iris
    (sk/lay-contour :sepal_length :sepal_width))
[Plot: sepal width (y) against sepal length (x)]

Contour with Points

Contour lines overlaid on scatter points.

(-> data/iris
    (sk/lay-point :sepal_length :sepal_width {:alpha 0.3})
    (sk/lay-contour {:levels 8}))
[Plot: sepal width (y) against sepal length (x)]

What’s Next

  • Polar: radial charts and pie-style visualizations
  • Faceting: split any chart into panels by category
source: notebooks/napkinsketch_book/relationships.clj