scicloj.metamorph.ml.regression

Regression models for continuous target prediction.

This namespace provides implementations of various regression algorithms with a consistent metamorph.ml training and prediction interface. Models support statistical output formats (tidy, glance, augment) for analysis and diagnostics.

Available Models:

OLS (Ordinary Least Squares)

  • :metamorph.ml/ols: Apache Commons Math implementation (Java-based)
  • :fastmath/ols: FastMath implementation (pure Clojure)

Solves for regression coefficients β in: y = Xβ + ε
Assumes linear relationships and homoscedastic errors.
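
As a concrete illustration of the least-squares fit described above, here is a minimal sketch for the special case of one predictor plus an intercept. The helper names (mean-of, simple-ols) are hypothetical and are not part of metamorph.ml or fastmath; the library models handle the general multi-column case.

```clojure
;; Illustrative only: closed-form OLS for y = b0 + b1*x + e with one predictor.
;; `mean-of` and `simple-ols` are hypothetical helpers, not library functions.
(defn mean-of [xs]
  (/ (reduce + xs) (double (count xs))))

(defn simple-ols
  "Returns {:intercept b0 :slope b1} minimising the sum of squared residuals."
  [xs ys]
  (let [mx  (mean-of xs)
        my  (mean-of ys)
        ;; sum of cross-deviations and sum of squared x-deviations
        sxy (reduce + (map (fn [x y] (* (- x mx) (- y my))) xs ys))
        sxx (reduce + (map (fn [x] (let [d (- x mx)] (* d d))) xs))
        b1  (/ sxy sxx)]
    {:intercept (- my (* b1 mx))
     :slope     b1}))

;; Data generated exactly from y = 1 + 2x
(simple-ols [1 2 3 4] [3 5 7 9])
;; => {:intercept 1.0, :slope 2.0}
```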

GLM (Generalized Linear Model)

  • :fastmath/glm: FastMath GLM implementation

Extends linear regression to non-normal distributions and non-linear relationships via link functions and variance models.
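
To make the role of a link function concrete, the sketch below uses the logit link from logistic regression as one common example. It is generic background, not the fastmath implementation, and all function names are hypothetical.

```clojure
;; Illustrative only: a link function maps the mean mu to the linear
;; predictor eta = x . beta; its inverse maps eta back to the mean.
;; All names here are hypothetical, not fastmath API.
(defn logit [mu]
  (Math/log (/ mu (- 1.0 mu))))

(defn inverse-logit [eta]
  (/ 1.0 (+ 1.0 (Math/exp (- eta)))))

(defn linear-predictor
  "Dot product of the coefficient vector beta and the feature vector x."
  [beta x]
  (reduce + (map * beta x)))

;; eta = 0.0*1.0 + 1.0*0.0 = 0.0, so the predicted mean is 0.5
(inverse-logit (linear-predictor [0.0 1.0] [1.0 0.0]))
;; => 0.5
```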

Baseline Model

  • :metamorph.ml/dummy-regressor: Predicts the mean of the training target

A useful sanity check: any real model should outperform this baseline.
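
The idea behind a mean baseline can be sketched in a few lines. This is not the library's implementation; the function names are hypothetical and the model is reduced to a plain map.

```clojure
;; Illustrative only: a mean-predicting baseline and an RMSE metric.
;; These helpers are hypothetical, not metamorph.ml functions.
(defn train-mean-baseline
  "'Training' just remembers the mean of the target values."
  [ys]
  {:mean (/ (reduce + ys) (double (count ys)))})

(defn predict-mean-baseline
  "Predicts the stored mean for each of n rows, ignoring all features."
  [model n]
  (repeat n (:mean model)))

(defn rmse [actual predicted]
  (Math/sqrt (/ (reduce + (map (fn [a p] (let [d (- a p)] (* d d)))
                               actual predicted))
                (double (count actual)))))

(let [ys    [2.0 4.0 6.0]
      model (train-mean-baseline ys)]
  ;; Baseline error; a useful regression model should score below this.
  (rmse ys (predict-mean-baseline model (count ys))))
```

Comparing a candidate model's error against this number shows whether the features carry any predictive signal at all.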

Model Output Functions:

  • :tidy-fn: Extracts model coefficients with statistics.
    Returns a dataset with :term, :estimate, :std.error, :statistic, :p.value
  • :glance-fn: Extracts model-level diagnostics.
    Returns a dataset with :r.squared, :adj.r.squared, :rss, :aic, :bic, etc.
  • :augment-fn: Adds model predictions and residuals to data.
    Returns the augmented dataset with :.fitted and :.resid columns

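Example Usage (direct call; the same options map can be passed to ml/model inside a metamorph pipeline):

;; `ml` is assumed to alias scicloj.metamorph.ml; loading the regression
;; namespace registers the model types listed above.
(require '[scicloj.metamorph.ml :as ml]
         '[scicloj.metamorph.ml.regression])

;; `data` is a dataset whose inference target column is already set.
(ml/train
  data
  {:model-type :fastmath/ols})
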

Model Diagnostics:

(ml/glance model)        ; Overall model metrics
(ml/tidy model)          ; Coefficient table
(ml/augment model data)  ; Predicted values and residuals

See also: scicloj.metamorph.ml.r-model-matrix for R-formula-based feature engineering

extend-intervall

(extend-intervall [start end] extension)

inclusive-range

(inclusive-range start end)
(inclusive-range start end step)

Returns a sequence of numbers from START to END, both inclusive, by STEP.
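
For comparison with clojure.core/range, which excludes its end value, the documented behaviour can be approximated as below. This is a sketch under assumptions, not the library's source: the name inclusive-range-sketch is hypothetical, the default step of 1 is assumed, and only positive steps are handled.

```clojure
;; Illustrative only: an inclusive counterpart to clojure.core/range.
;; Assumes a default step of 1 and a positive step; hypothetical helper.
(defn inclusive-range-sketch
  ([start end]
   (inclusive-range-sketch start end 1))
  ([start end step]
   ;; keep stepping from start while the value has not passed end
   (take-while #(<= % end) (iterate #(+ % step) start))))

(range 1 5)                      ; => (1 2 3 4)     end excluded
(inclusive-range-sketch 1 5)     ; => (1 2 3 4 5)   end included
(inclusive-range-sketch 0 10 5)  ; => (0 5 10)
```

With floating-point steps, accumulated rounding error can push the last value slightly past END so that it is dropped; integer or ratio steps avoid this.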

lay-cooks-d

(lay-cooks-d pose cooks-d params-count pos-neg min-std-resid max-std-resid max-hat)

min-max-extended

(min-max-extended s extension)

residual-vs-leverage-pose

(residual-vs-leverage-pose augmented-ds model options)