Noj prompts proof-of-concept
This is a small experiment demonstrating how we can generate a prompt by picking the relevant tasks from a catalogue, and then use it for basic data processing and visualization tasks.
The notebook was generated by rewriting "CLAUDE.md" with the concatenation of relevant retrieved prompts, and then gradually adding tasks like "@claude please ...", which are invoked by typing "handle notebooks/explore.clj" at the Claude Code prompt.
Setup
(ns explore
  (:require
   [scicloj.kindly.v4.kind :as kind]
   [scicloj.prompts.v1.api :as prompts]
   [tablecloth.api :as tc]
   [scicloj.tableplot.v1.plotly :as plotly]
   [scicloj.metamorph.ml.rdatasets :as rdatasets]))
The data
We wish to analyse the following dataset of economic time-series:
(def economics
  (rdatasets/ggplot2-economics_long))
source: ggplot2-economics_long at Rdatasets
Preparing the prompt
Let us explore the catalogue of relevant prompts.
(prompts/catalogue)
{:prompts
 {"joining-datasets.md"
  {:description "Joining and combining datasets in various ways",
   :tags #{:combine-datasets :join-datasets :merge-data},
   :last-modified "2025-05-20"},
  "time-series-visualization.md"
  {:description
   "Visualizing time series data with interactive features",
   :tags #{:visualize-time-series :analyze-temporal :plot-trends},
   :last-modified "2025-05-20"},
  "noise-generation.md"
  {:description
   "Generating various types of noise for procedural generation",
   :tags #{:create-fractal :generate-noise :simulate-terrain},
   :last-modified "2025-05-20"},
  "column-operations.md"
  {:description "Adding, updating, and manipulating dataset columns",
   :tags #{:add-column :update-column :drop-column :rename-column},
   :last-modified "2025-05-20"},
  "sequence-generation.md"
  {:description "Generating random and quasi-random sequences",
   :tags #{:sample-space :create-low-discrepancy :generate-sequence},
   :last-modified "2025-05-20"},
  "random-sampling-techniques.md"
  {:description "Advanced random sampling techniques and utilities",
   :tags #{:sample-randomly :shuffle-collection :weight-samples},
   :last-modified "2025-05-20"},
  "creating-datasets.md"
  {:description
   "Creating datasets from various data sources with Tablecloth",
   :tags #{:load-data :define-dataset :create-dataset},
   :last-modified "2025-05-20"},
  "basic-random-generation.md"
  {:description "Basic random number generation with fastmath.random",
   :tags #{:seed-random :generate-random :create-rng},
   :last-modified "2025-05-20"},
  "data-visualization-plotly.md"
  {:description
   "Creating interactive data visualizations with Tableplot's Plotly API",
   :tags #{:create-visualization :customize-plot :build-chart},
   :last-modified "2025-05-20"},
  "geo-spatial-visualization.md"
  {:description "Visualizing geo-spatial data on interactive maps",
   :tags #{:plot-spatial :create-map :visualize-geographic},
   :last-modified "2025-05-20"},
  "filtering-rows.md"
  {:description "Filtering and selecting rows from datasets",
   :tags #{:select-rows :filter-rows :drop-rows},
   :last-modified "2025-05-20"},
  "grouping-aggregating.md"
  {:description "Grouping datasets and performing aggregations",
   :tags #{:aggregate-data :calculate-statistics :group-data},
   :last-modified "2025-05-20"},
  "distribution-sampling.md"
  {:description
   "Sampling from statistical distributions with fastmath.random",
   :tags
   #{:sample-distribution :calculate-probability :generate-samples},
   :last-modified "2025-05-20"},
  "statistical-visualization.md"
  {:description
   "Creating statistical plots for data analysis and exploration",
   :tags
   #{:visualize-statistics :analyze-distribution :show-correlation},
   :last-modified "2025-05-20"}}}
You see, it is an annotated collection of Markdown files (which are the prompts themselves).
More specifically, let us explore all supported tasks:
(prompts/all-tags)
(:add-column
:aggregate-data
:analyze-distribution
:analyze-temporal
:build-chart
:calculate-probability
:calculate-statistics
:combine-datasets
:create-dataset
:create-fractal
:create-low-discrepancy
:create-map
:create-rng
:create-visualization
:customize-plot
:define-dataset
:drop-column
:drop-rows
:filter-rows
:generate-noise
:generate-random
:generate-samples
:generate-sequence
:group-data
:join-datasets
:load-data
:merge-data
:plot-spatial
:plot-trends
:rename-column
:sample-distribution
:sample-randomly
:sample-space
:seed-random
:select-rows
:show-correlation
:shuffle-collection
:simulate-terrain
:update-column
:visualize-geographic
:visualize-statistics
:visualize-time-series
:weight-samples)
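The tag list above looks like the sorted union of the `:tags` sets in the catalogue. Here is a minimal sketch of that derivation, assuming the catalogue is a plain map shaped like the `(prompts/catalogue)` output. `sample-catalogue` and `all-tags-sketch` are hypothetical names, not part of `scicloj.prompts.v1.api`, and the library's actual implementation may differ.

```clojure
;; A two-entry excerpt of the catalogue shown earlier (hypothetical sample).
(def sample-catalogue
  {:prompts
   {"filtering-rows.md"
    {:description "Filtering and selecting rows from datasets"
     :tags #{:select-rows :filter-rows :drop-rows}}
    "grouping-aggregating.md"
    {:description "Grouping datasets and performing aggregations"
     :tags #{:aggregate-data :calculate-statistics :group-data}}}})

(defn all-tags-sketch
  "Sorted, de-duplicated union of every :tags set in the catalogue."
  [catalogue]
  (->> (:prompts catalogue)
       vals
       (mapcat :tags)
       distinct
       sort))

(all-tags-sketch sample-catalogue)
;; => (:aggregate-data :calculate-statistics :drop-rows :filter-rows :group-data :select-rows)
```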
Let us generate a prompt for the tasks which are relevant to this notebook. It is created by concatenating the relevant prompts from the catalogue (along with a tiny base prompt).
(->> [:creating-datasets :plot-trends :aggregate-data]
     (prompts/generate-prompt)
     (spit "CLAUDE.md"))
nil
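To make the selection step concrete, here is a minimal sketch of tag-based prompt assembly. It is an assumption about how such a step could work, not the implementation of `prompts/generate-prompt`. The names `sample-catalogue`, `matching-files`, and `generate-prompt-sketch` and the `:body` key are all hypothetical, and files are ordered alphabetically only to keep the result deterministic.

```clojure
(require '[clojure.string :as str])

;; Hypothetical excerpt: real catalogue entries point at Markdown files;
;; here each entry carries its text under an assumed :body key.
(def sample-catalogue
  {:prompts
   {"creating-datasets.md"
    {:tags #{:load-data :define-dataset :create-dataset}
     :body "# Creating datasets"}
    "grouping-aggregating.md"
    {:tags #{:aggregate-data :calculate-statistics :group-data}
     :body "# Grouping and aggregating"}
    "time-series-visualization.md"
    {:tags #{:visualize-time-series :analyze-temporal :plot-trends}
     :body "# Time series visualization"}}})

(defn matching-files
  "Names of catalogue files having at least one of the requested tags."
  [catalogue tasks]
  (let [wanted (set tasks)]
    (->> (:prompts catalogue)
         (filter (fn [[_ {:keys [tags]}]] (some wanted tags)))
         (map key)
         sort)))

(defn generate-prompt-sketch
  "Concatenates a base prompt with the bodies of all matching files."
  [catalogue base-prompt tasks]
  (->> (matching-files catalogue tasks)
       (map #(get-in catalogue [:prompts % :body]))
       (cons base-prompt)
       (str/join "\n\n")))

(matching-files sample-catalogue
                [:creating-datasets :plot-trends :aggregate-data])
;; => ("grouping-aggregating.md" "time-series-visualization.md")
```

Under this matching rule, a keyword that matches no tag contributes nothing. `:creating-datasets` does not appear in the tag list above (the catalogue tag is `:create-dataset`), which would be consistent with the generated prompt containing only the time-series and grouping sections.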
Here is the prompt we just prepared:
(kind/hiccup
 [:div {:style {:height "400px"
                :overflow-y "scroll"
                :background-color "floralwhite"}}
  (kind/md
   (slurp "CLAUDE.md"))])
General guidelines
When handling requests in code, please do not erase the text of the request. In other words, please keep the paragraphs beginning with "@claude". Your role is only to add code, not change or erase anything.
Time Series Visualization with Tableplot
Purpose
This prompt focuses specifically on visualizing time series data using Tableplot's Plotly API.
Setup
(require '[scicloj.tableplot.v1.plotly :as plotly])
(require '[scicloj.tablecloth.api :as tc])
(require '[java-time :as jt])
Basic Time Series Plot
;; Sample time series data
(def time-data
  (tc/dataset {:date (map #(jt/local-date 2025 1 %) (range 1 31))
               :value (map #(+ 100 (* 10 (Math/sin (/ % 3.0)))) (range 30))
               :category (repeatedly 30 #(rand-nth ["A" "B" "C"]))}))
;; Basic time series line plot
(-> time-data
    (plotly/base {:=layout {:xaxis {:type "date"}}})
    (plotly/layer-line {:=x :date :=y :value})
    (plotly/plot))
Time Series with Date Range Selector
(-> time-data
    (plotly/base {:=layout {:xaxis {:type "date"
                                    :rangeselector {:buttons [{:count 7 :label "1w" :step "day" :stepmode "backward"}
                                                              {:count 14 :label "2w" :step "day" :stepmode "backward"}
                                                              {:step "all"}]}}
                            :title "Time Series with Range Selector"}})
    (plotly/layer-line {:=x :date :=y :value})
    (plotly/plot))
Time Series with Range Slider
(-> time-data
    (plotly/base {:=layout {:xaxis {:type "date"
                                    :rangeslider {:visible true}}
                            :title "Time Series with Range Slider"}})
    (plotly/layer-line {:=x :date :=y :value})
    (plotly/plot))
Multiple Time Series by Category
(-> time-data
    (plotly/base {:=layout {:xaxis {:type "date"}}})
    (plotly/layer-line {:=x :date :=y :value :=color :category})
    (plotly/plot))
Filled Time Series (Area Chart)
(-> time-data
    (plotly/base {:=layout {:xaxis {:type "date"}}})
    (plotly/layer-area {:=x :date :=y :value :=color :category})
    (plotly/plot))
Candlestick Chart for Financial Data
;; Sample financial data
(def financial-data
  (tc/dataset {:date (map #(jt/local-date 2025 1 %) (range 1 31))
               :open (repeatedly 30 #(+ 100 (rand 10)))
               :high (repeatedly 30 #(+ 105 (rand 10)))
               :low (repeatedly 30 #(+ 95 (rand 10)))
               :close (repeatedly 30 #(+ 100 (rand 10)))}))
;; Candlestick chart
(-> financial-data
    (plotly/base {:=layout {:xaxis {:type "date"}
                            :title "Stock Price"}})
    (plotly/layer-candlestick {:=x :date :=open :open :=high :high :=low :low :=close :close})
    (plotly/plot))
Time Series with Annotations
;; Define important events
(def events
  [{:date (jt/local-date 2025 1 5) :event "Product Launch"}
   {:date (jt/local-date 2025 1 15) :event "Quarterly Report"}
   {:date (jt/local-date 2025 1 25) :event "Conference"}])
;; Create annotations for the events
(def annotations
  (mapv (fn [{:keys [date event]}]
          (let [y-val (+ 100 (* 10 (Math/sin (/ (.getDayOfMonth date) 3.0))))]
            {:x date
             :y y-val
             :xref "x"
             :yref "y"
             :text event
             :showarrow true
             :arrowhead 2
             :arrowsize 1
             :arrowwidth 1
             :ax 0
             :ay -40}))
        events))
;; Plot with annotations
(-> time-data
    (plotly/base {:=layout {:xaxis {:type "date"}
                            :annotations annotations
                            :title "Time Series with Event Annotations"}})
    (plotly/layer-line {:=x :date :=y :value})
    (plotly/plot))
Time Series Heatmap Calendar
;; Generate a year of daily data
(def calendar-data
  (let [dates (map #(jt/local-date 2025 (inc (quot % 31)) (inc (rem % 31))) (range 365))
        values (map #(+ 50 (* 30 (Math/sin (/ % 30.0)))) (range 365))]
    (tc/dataset {:date dates
                 :value values
                 :day (map #(.getDayOfMonth %) dates)
                 :month (map #(.getMonthValue %) dates)
                 :weekday (map #(.getValue (.getDayOfWeek %)) dates)})))
;; Create a calendar heatmap
(-> calendar-data
    (plotly/base {:=layout {:title "Calendar Heatmap"}})
    (plotly/layer-heatmap {:=x :day :=y :month :=z :value :=color-scale "Viridis"})
    (plotly/plot))
Time Series Decomposition
;; Generate time series with trend, seasonality, and noise
(def decomp-data
  (let [dates (map #(jt/local-date 2025 (inc (quot % 30)) (inc (rem % 30))) (range 365))
        trend (map #(+ 100 (* 0.1 %)) (range 365))
        seasonality (map #(* 15 (Math/sin (/ % 30.0))) (range 365))
        noise (repeatedly 365 #(* 5 (- (rand) 0.5)))
        values (map + trend seasonality noise)]
    (tc/dataset {:date dates
                 :value values
                 :trend trend
                 :seasonality seasonality
                 :noise noise})))
;; Plot the decomposition
(-> decomp-data
    (plotly/base {:=layout {:title "Time Series Decomposition"
                            :grid {:rows 4 :columns 1 :pattern "independent"}
                            :height 800}})
    (plotly/layer-line {:=x :date :=y :value :=name "Original" :=subplot 1})
    (plotly/layer-line {:=x :date :=y :trend :=name "Trend" :=subplot 2})
    (plotly/layer-line {:=x :date :=y :seasonality :=name "Seasonality" :=subplot 3})
    (plotly/layer-line {:=x :date :=y :noise :=name "Residual" :=subplot 4})
    (plotly/plot))
Multiple Time Series with Synchronized Axes
;; Multiple metrics
(def multi-metric-data
  (tc/dataset {:date (map #(jt/local-date 2025 1 %) (range 1 31))
               :revenue (map #(+ 1000 (* 100 (Math/sin (/ % 5.0)))) (range 30))
               :customers (map #(+ 500 (* 50 (Math/cos (/ % 7.0)))) (range 30))
               :profit (map #(+ 300 (* 30 (Math/sin (/ % 3.0)))) (range 30))}))
;; Synchronized subplots
(-> multi-metric-data
    (plotly/base {:=layout {:title "Business Metrics"
                            :grid {:rows 3 :columns 1 :pattern "independent" :roworder "top to bottom"}
                            :height 800
                            :xaxis {:type "date"}
                            :xaxis2 {:type "date"}
                            :xaxis3 {:type "date"}}})
    (plotly/layer-line {:=x :date :=y :revenue :=name "Revenue" :=subplot 1})
    (plotly/layer-line {:=x :date :=y :customers :=name "Customers" :=subplot 2})
    (plotly/layer-line {:=x :date :=y :profit :=name "Profit" :=subplot 3})
    (plotly/plot))
Time Series Forecasting Visualization
;; Historical and forecast data
(def forecast-data
  (let [dates (concat (map #(jt/local-date 2025 1 %) (range 1 31))
                      (map #(jt/local-date 2025 2 %) (range 1 15)))
        historical (take 30 dates)
        forecast (drop 30 dates)
        hist-values (map #(+ 100 (* 10 (Math/sin (/ % 7.0)))) (range 30))
        forecast-values (map #(+ 100 (* 10 (Math/sin (/ (+ % 30) 7.0)))) (range 14))
        confidence-low (map #(- % 15) forecast-values)
        confidence-high (map #(+ % 15) forecast-values)]
    (tc/dataset {:date (concat historical forecast)
                 :value (concat hist-values (repeat 14 nil))
                 :forecast (concat (repeat 30 nil) forecast-values)
                 :conf-low (concat (repeat 30 nil) confidence-low)
                 :conf-high (concat (repeat 30 nil) confidence-high)
                 :is-forecast (concat (repeat 30 false) (repeat 14 true))})))
;; Visualize forecast with confidence interval
(-> forecast-data
    (plotly/base {:=layout {:title "Time Series Forecast"
                            :xaxis {:type "date"}
                            :shapes [{:type "rect"
                                      :xref "x"
                                      :yref "paper"
                                      :x0 (jt/local-date 2025 1 31)
                                      :y0 0
                                      :x1 (jt/local-date 2025 2 15)
                                      :y1 1
                                      :fillcolor "#f8f9fa"
                                      :opacity 0.3
                                      :line {:width 0}}]}})
    (plotly/layer-line {:=x :date :=y :value :=name "Historical" :=mark-width 2})
    (plotly/layer-line {:=x :date :=y :forecast :=name "Forecast" :=mark-width 2 :=mark-dash "dash"})
    (plotly/layer-ribbon {:=x :date :=y-min :conf-low :=y-max :conf-high :=name "Confidence Interval" :=mark-opacity 0.2})
    (plotly/plot))
Tips for Time Series Visualization
- Always specify {:xaxis {:type "date"}} in the layout for proper date handling
- Use range selectors and sliders for interactive exploration of long time series
- Consider synchronized subplots for multiple related time series
- Add annotations to highlight important events
- Use different chart types based on the data pattern:
  - Line charts for general trends
  - Area charts for cumulative values
  - Candlestick charts for OHLC financial data
- For forecasting visualization, clearly distinguish historical vs. predicted data
- Use confidence intervals to show prediction uncertainty
Grouping and Aggregating Data with Tablecloth
Purpose
This prompt demonstrates how to group datasets and perform aggregations using Tablecloth.
Setup
(require '[tablecloth.api :as tc])
;; Sample dataset for examples
(def DS
  (tc/dataset {:V1 [1 2 1 3 2]
               :V2 [10 20 30 40 50]
               :V3 [0.1 0.2 0.3 0.4 0.5]
               :V4 ["A" "B" "A" "C" "B"]}))
Grouping Data
Group by Single Column
(tc/group-by DS :V1)
Group by Multiple Columns
(tc/group-by DS [:V1 :V4])
Access Groups
;; Get first group
(-> DS
    (tc/group-by :V1)
    (tc/select-rows 0))
;; Get values for a specific group
(->> DS
     (tc/group-by :V1)
     (filter #(= 1 (:V1 (:name %)))))
Ungrouping
(->> DS (tc/group-by :V1) (tc/ungroup))
Aggregating Data
Simple Aggregation
;; Sum one column
(tc/aggregate DS #(reduce + (% :V2)))
Named Aggregations
(tc/aggregate DS {:sum-of-V2 #(reduce + (% :V2))
                  :mean-of-V3 #(/ (reduce + (% :V3)) (count (% :V3)))})
Group and Aggregate
;; Group and apply single aggregation
(->> DS
     (tc/group-by :V4)
     (tc/aggregate #(reduce + (% :V2))))
;; Multiple aggregations with names
(->> DS
     (tc/group-by [:V4])
     (tc/aggregate {:sum-v1 #(reduce + (% :V1))
                    :prod-v3 #(reduce * (% :V3))
                    :count #(count (% :V1))}))
Common Aggregation Functions
(require '[tech.v3.datatype.functional :as dfn])
(->> DS
     (tc/group-by :V4)
     (tc/aggregate {:count count
                    :sum-v2 #(dfn/sum (% :V2))
                    :mean-v2 #(dfn/mean (% :V2))
                    :min-v2 #(dfn/min (% :V2))
                    :max-v2 #(dfn/max (% :V2))
                    :std-v2 #(dfn/standard-deviation (% :V2))}))
Tips
- Grouped datasets retain all columns in each group
- Use tech.v3.datatype.functional for efficient numeric operations
- Combine aggregations with other operations for complex analysis
- Groups are named by their grouping values
Exploration
@claude please show a few rows of the economics dataset.
(tc/head economics 10)
https://vincentarelbundock.github.io/Rdatasets/csv/ggplot2/economics_long.csv [10 5]:
:rownames | :date | :variable | :value | :value-01 |
---|---|---|---|---|
1 | 1967-07-01 | pce | 506.7 | 0.00000000 |
2 | 1967-08-01 | pce | 509.8 | 0.00026525 |
3 | 1967-09-01 | pce | 515.6 | 0.00076152 |
4 | 1967-10-01 | pce | 512.2 | 0.00047060 |
5 | 1967-11-01 | pce | 517.4 | 0.00091554 |
6 | 1967-12-01 | pce | 525.1 | 0.00157439 |
7 | 1968-01-01 | pce | 530.9 | 0.00207066 |
8 | 1968-02-01 | pce | 533.6 | 0.00230168 |
9 | 1968-03-01 | pce | 544.3 | 0.00321722 |
10 | 1968-04-01 | pce | 544.0 | 0.00319155 |
@claude please show the distinct values of :variable and their counts.
(-> economics
    (tc/group-by :variable)
    (tc/aggregate {:count tc/row-count}))
_unnamed [5 2]:
:$group-name | :count |
---|---|
pce | 574 |
pop | 574 |
psavert | 574 |
uempmed | 574 |
unemploy | 574 |
@claude please define a new dataset that only includes the cases where :variable is "unemploy".
(def unemploy-data
  (tc/select-rows economics #(= "unemploy" (:variable %))))
@claude please plot the time series of unemployment data.
(-> unemploy-data
    (plotly/base {:=layout {:xaxis {:type "date"}
                            :title "US Unemployment Over Time"}})
    (plotly/layer-line {:=x :date :=y :value})
    (plotly/plot))
@claude please adapt the Plotly layout to support a few time ranges: last year, last decade, and whole period.
(-> unemploy-data
    (plotly/base {:=layout {:xaxis {:type "date"
                                    :rangeselector {:buttons [{:count 1 :label "1Y" :step "year" :stepmode "backward"}
                                                              {:count 10 :label "10Y" :step "year" :stepmode "backward"}
                                                              {:step "all" :label "All"}]}}
                            :title "US Unemployment Over Time with Range Selector"}})
    (plotly/layer-line {:=x :date :=y :value})
    (plotly/plot))
@claude for the whole-period series, please add a running index number and a regression line of the unemployment rate by this number. Do the whole thing in one -> pipeline.
(-> unemploy-data
    (tc/add-column :index (range (tc/row-count unemploy-data)))
    (plotly/base {:=layout {:xaxis {:title "Time Index"}
                            :yaxis {:title "Unemployment Rate"}
                            :title "Unemployment Rate vs Time Index with Regression"}})
    (plotly/layer-point {:=x :index :=y :value :=name "Unemployment"})
    (plotly/layer-smooth {:=x :index :=y :value :=name "Linear Trend"})
    (plotly/plot))
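For readers who want the numbers behind the trend line, the sketch below fits an ordinary least-squares line in plain Clojure. It assumes the smoothing layer draws a straight-line least-squares fit of `:value` on `:index`, which the "Linear Trend" label suggests but this notebook does not confirm. `linear-fit` is a hypothetical helper, not part of Tableplot or Tablecloth.

```clojure
(defn linear-fit
  "Ordinary least-squares fit of ys on xs; returns {:slope ... :intercept ...}."
  [xs ys]
  (let [n      (count xs)
        mean-x (/ (reduce + xs) n)
        mean-y (/ (reduce + ys) n)
        ;; Sum of cross-deviations and sum of squared x-deviations.
        sxy    (reduce + (map (fn [x y] (* (- x mean-x) (- y mean-y))) xs ys))
        sxx    (reduce + (map (fn [x] (let [d (- x mean-x)] (* d d))) xs))
        slope  (/ sxy sxx)]
    {:slope     slope
     :intercept (- mean-y (* slope mean-x))}))

;; Points lying exactly on y = 2x + 1.
(linear-fit [0 1 2 3] [1.0 3.0 5.0 7.0])
;; => {:slope 2.0, :intercept 1.0}
```

Applied to the notebook's data, something like `(linear-fit (range (tc/row-count unemploy-data)) (:value unemploy-data))` should give the slope and intercept of that line, assuming `:value` has no missing values.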
source: notebooks/explore.clj