scicloj.tcutils.api

between

(between ds col-name low high)
(between ds col-selector low high {:keys [missing-default]})

Keep the rows whose value in a numeric column falls within a specified range. This is a shortcut for (< low x high).

Usage

(between ds col-name low high)

(between ds col-name low high {:missing-default val})

Arguments

ds - A tech.ml.dataset (i.e., a tablecloth dataset)
col-name - Name of the column to use in the comparison
low - Lower bound for values of col-name
high - Upper bound for values of col-name
options - (optional) Options map. The :missing-default key specifies the value to use when (col-name row) is nil. An error is thrown if the column contains missing values and this option is not provided.

Returns

A dataset with only rows that contain values between low and high in column col-name
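
The filtering behavior described above can be sketched in plain Clojure, without tablecloth, by treating a dataset as a sequence of row maps. The helper `between-rows` and the sample data are hypothetical. They only mirror the documented behavior (the `(< low x high)` test, an optional `:missing-default`, and an error on missing values when no default is given) and are not the library's implementation.

```clojure
;; Hypothetical stand-in for `between`, operating on a seq of row maps
;; rather than on a real dataset.
(defn between-rows
  ([rows col-name low high]
   (between-rows rows col-name low high {}))
  ([rows col-name low high {:keys [missing-default] :as options}]
   (filter (fn [row]
             (let [v (get row col-name)
                   ;; Substitute the default only when the value is missing.
                   x (cond
                       (some? v) v
                       (contains? options :missing-default) missing-default
                       :else (throw (ex-info "Missing value and no :missing-default"
                                             {:row row :column col-name})))]
               ;; Same test as the documented shortcut.
               (< low x high)))
           rows)))

(def rows [{:x 1} {:x 5} {:x 10} {:x nil}])

;; doall forces the lazy seq so any error surfaces here.
(def in-range (doall (between-rows rows :x 1 10 {:missing-default 0})))
```

In this sketch `in-range` holds only the `{:x 5}` row: 1 and 10 fail the strict comparison, and the missing value is compared as 0. Calling `between-rows` on the same rows without the options map throws.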


clean-column-names

(clean-column-names ds)

Convert the column names of a dataset into ASCII-only, kebab-cased keywords. Throws an error if any column would be left with no name, e.g. a name consisting entirely of non-ASCII characters.

Usage

(clean-column-names ds)

Arguments

ds - A tech.ml.dataset (i.e., a tablecloth dataset)

Returns

A dataset with the column names converted to ASCII-only, kebab-cased keywords.
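
As a rough illustration of what "ASCII-only, kebab-cased keywords" can mean for a single name, here is a minimal sketch in plain Clojure. The helper `clean-name` is hypothetical, and its rules (dropping non-ASCII characters outright, splitting camelCase, collapsing separators into hyphens) are assumptions; the library may transliterate or split names differently.

```clojure
(require '[clojure.string :as str])

;; Hypothetical helper: one plausible way to clean a single column name.
(defn clean-name [col-name]
  (let [cleaned (-> (name col-name)
                    ;; Drop every non-ASCII character (an assumption).
                    (str/replace #"[^\x00-\x7F]" "")
                    ;; Split camelCase words with a hyphen.
                    (str/replace #"([a-z0-9])([A-Z])" "$1-$2")
                    str/lower-case
                    ;; Collapse runs of other characters into one hyphen.
                    (str/replace #"[^a-z0-9]+" "-")
                    ;; Trim leading and trailing hyphens.
                    (str/replace #"^-+|-+$" ""))]
    (if (str/blank? cleaned)
      ;; Mirrors the documented error for names that end up empty.
      (throw (ex-info "Column would be left with no name" {:column col-name}))
      (keyword cleaned))))

(def cleaned-names (mapv clean-name ["First Name" "totalCount (USD)"]))
```

Under these assumptions `cleaned-names` is `[:first-name :total-count-usd]`, while a name such as "日本語" throws because nothing is left after the non-ASCII characters are removed.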


cumsum

(cumsum ds column-name)
(cumsum ds new-column-name column-name)

Compute the cumulative sum of a column.

Usage

(cumsum ds column-name)

(cumsum ds new-column-name column-name)

Arguments

ds - A tech.ml.dataset (i.e., a tablecloth dataset)
new-column-name - (optional) Name for the column where the newly computed values will go. When omitted, the new column name defaults to the keyword <old-column-name>-cumulative-sum
column-name - Name of the column to use to compute the cumulative sum

Returns

A dataset with the additional column containing the cumulative sum.
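
The running total and the default column name can be illustrated with a small plain-Clojure sketch over a sequence of row maps. The helper `cumsum-rows` and the sample data are hypothetical and only approximate the documented behavior; the library itself works on datasets.

```clojure
;; Hypothetical helper: adds a running total to each row map.
(defn cumsum-rows
  ([rows column-name]
   ;; Default name follows the documented pattern
   ;; <old-column-name>-cumulative-sum.
   (cumsum-rows rows
                (keyword (str (name column-name) "-cumulative-sum"))
                column-name))
  ([rows new-column-name column-name]
   (let [totals (reductions + (map #(get % column-name) rows))]
     ;; Pair each row with the total accumulated up to and including it.
     (mapv #(assoc %1 new-column-name %2) rows totals))))

(def sales [{:amount 10} {:amount 20} {:amount 5}])
(def with-totals (cumsum-rows sales :amount))
```

Here `with-totals` carries an `:amount-cumulative-sum` key with the values 10, 30 and 35.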


duplicate-rows

(duplicate-rows ds)

Filter a dataset for only duplicated rows.

Usage

(duplicate-rows ds)

Arguments

ds - A tech.ml.dataset (i.e., a tablecloth dataset)

Returns

A dataset containing only rows that are exact duplicates.
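
One way to picture "only rows that are exact duplicates" is the plain-Clojure sketch below, which works on a sequence of row maps. The helper `duplicate-rows-seq` is hypothetical. It keeps every occurrence of a repeated row; whether the library keeps all occurrences or only the repeats is not stated here, so treat that choice as an assumption.

```clojure
;; Hypothetical helper: keep rows that appear more than once.
(defn duplicate-rows-seq [rows]
  (let [counts (frequencies rows)]
    ;; Rows are compared by value, so all columns must match exactly.
    (filterv #(> (counts %) 1) rows)))

(def people [{:a 1 :b 2} {:a 1 :b 3} {:a 1 :b 2}])
(def dupes (duplicate-rows-seq people))
```

In this sketch `dupes` contains the two `{:a 1 :b 2}` rows and omits `{:a 1 :b 3}`, which matches another row in only one column.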


lag

(lag ds column-name lag-size)
(lag ds new-column-name column-name lag-size)

Compute previous (lagged) values from one column in a new column. This can be used, e.g., to compare the current value with values behind it.

Usage

(lag ds column-name lag-size)

(lag ds new-column-name column-name lag-size)

Arguments

ds - A tech.ml.dataset (i.e., a tablecloth dataset)
new-column-name - (optional) Name for the column where the newly computed values will go. When omitted, the new column name defaults to the keyword <old-column-name>-lag-<lag-size>
column-name - Name of the column to use to compute the lagged values
lag-size - Positive integer indicating how many rows to skip over to compute the lag

Returns

A dataset with the new column populated with the lagged values.
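
The shifting that a lag performs can be sketched in plain Clojure over a sequence of row maps. The helper `lag-rows` is hypothetical, and filling the first `lag-size` rows with nil is an assumption about how rows without a predecessor are represented.

```clojure
;; Hypothetical helper: each row receives the value found lag-size rows earlier.
(defn lag-rows
  ([rows column-name lag-size]
   ;; Default name follows the documented pattern
   ;; <old-column-name>-lag-<lag-size>.
   (lag-rows rows
             (keyword (str (name column-name) "-lag-" lag-size))
             column-name
             lag-size))
  ([rows new-column-name column-name lag-size]
   (let [values (map #(get % column-name) rows)
         ;; Shift values down; the first lag-size rows have no predecessor.
         lagged (concat (repeat lag-size nil) values)]
     ;; mapv stops at the shorter collection, so extra values are ignored.
     (mapv #(assoc %1 new-column-name %2) rows lagged))))

(def readings [{:v 1} {:v 2} {:v 3}])
(def lagged-readings (lag-rows readings :v 1))
```

In this sketch `lagged-readings` gains a `:v-lag-1` key holding nil, 1 and 2, so each row can be compared with the one before it.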


lead

(lead ds column-name lead-size)
(lead ds new-column-name column-name lead-size)

Compute next (lead) values from one column in a new column. This can be used, e.g., to compare the current value with values ahead of it.

Usage

(lead ds column-name lead-size)

(lead ds new-column-name column-name lead-size)

Arguments

ds - A tech.ml.dataset (i.e., a tablecloth dataset)
new-column-name - (optional) Name for the column where the newly computed values will go. When omitted, the new column name defaults to the keyword <old-column-name>-lead-<lead-size>
column-name - Name of the column to use to compute the lead values
lead-size - Positive integer indicating how many rows to skip over to compute the lead

Returns

A dataset with the new column populated with the lead values.
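
A lead shifts values in the opposite direction from a lag. The plain-Clojure sketch below works on a sequence of row maps. The helper `lead-rows` is hypothetical, and padding the last `lead-size` rows with nil is an assumption about how rows without a successor are represented.

```clojure
;; Hypothetical helper: each row receives the value found lead-size rows later.
(defn lead-rows
  ([rows column-name lead-size]
   ;; Default name follows the documented pattern
   ;; <old-column-name>-lead-<lead-size>.
   (lead-rows rows
              (keyword (str (name column-name) "-lead-" lead-size))
              column-name
              lead-size))
  ([rows new-column-name column-name lead-size]
   (let [values (map #(get % column-name) rows)
         ;; Shift values up; the trailing rows have no successor, so pad with nil.
         led (concat (drop lead-size values) (repeat nil))]
     ;; mapv stops once the rows run out, so the infinite padding is safe.
     (mapv #(assoc %1 new-column-name %2) rows led))))

(def prices [{:v 1} {:v 2} {:v 3}])
(def led-prices (lead-rows prices :v 1))
```

In this sketch `led-prices` gains a `:v-lead-1` key holding 2, 3 and nil, so each row can be compared with the one after it.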


tcutils 0.1.0-alpha2