4 Cardio Algorithms Explained

A guide to the cardiovascular signal processing algorithms in Ripple, explaining why each step is needed, the mathematical intuition behind it, and how to use the Clojure implementation.

Ripple implements ECG and PPG processing pipelines based on NeuroKit2 and HeartPy. This notebook walks through each algorithm in pipeline order, using synthetic signals that make the effect of each step clearly visible.

References:

McSharry, P. E. et al. (2003). “A dynamical model for generating synthetic electrocardiogram signals.” IEEE TBME, 50(3), 289-294.
Makowski, D. et al. (2021). “NeuroKit2: A Python toolbox for neurophysiological signal processing.” Behavior Research Methods, 53(4), 1689-1696. doi:10.3758/s13428-020-01516-y
van Gent, P. et al. (2019). “HeartPy: A novel heart rate algorithm for the analysis of noisy signals.” Transportation Research Part F, 66, 368-378. doi:10.1016/j.trf.2019.09.015
Pham, T. et al. (2021). “Heart Rate Variability in Psychology: A Review of HRV Indices and an Analysis Tutorial.” Sensors, 21(12),
1. doi:10.3390/s21123998

(ns ripple-book.cardio-algorithms-explained
  (:require
   ;; Ripple cardio public API:
   [scicloj.ripple.cardio :as cardio]
   ;; Table processing (https://scicloj.github.io/tablecloth/):
   [tablecloth.api :as tc]
   ;; High-performance array math (https://github.com/cnuernber/dtype-next):
   [tech.v3.datatype.functional :as dfn]
   ;; Interactive plotting via Plotly (https://scicloj.github.io/tableplot/):
   [scicloj.tableplot.v1.plotly :as plotly]
   ;; Annotating kinds of visualizations (https://scicloj.github.io/kindly-noted/):
   [scicloj.kindly.v4.kind :as kind]))

The Cardiovascular Signal Processing Pipeline

Electrocardiography (ECG) records the electrical activity of the heart via skin electrodes. Photoplethysmography (PPG) measures blood volume changes optically, typically from a fingertip or wrist sensor. Both produce periodic signals from which heart rate and heart rate variability (HRV) can be extracted.

Raw signals contain noise and artifacts. Before analysis, each signal passes through a processing pipeline:

Step	Algorithm	Purpose
1	Butterworth filtering	Remove drift and powerline noise
2	R-peak / systolic detection	Find heartbeat locations
3	Artifact correction	Fix missed and extra beats
4	RR interval extraction	Measure time between beats
5	Time-domain HRV	Compute heart rate variability indices

Synthetic ECG Signal

We use the ECGSYN dynamical model (McSharry et al. 2003) to generate a realistic synthetic ECG with PQRST morphology and beat-to-beat variability. The model uses three coupled ODEs on a limit cycle, producing natural heart rate variability via LF (Mayer waves) and HF (respiratory) spectral components.

The clean signal is then corrupted with artifacts to demonstrate the cleaning pipeline.

(def sampling-rate 500.0)

(def duration 10.0)

(def n-samples (int (* sampling-rate duration)))

(def heart-rate-bpm 72.0)

(def time-axis
  (vec (map #(/ (double %) sampling-rate) (range n-samples))))

Generate a clean ECG using the ECGSYN model:

(def clean-ecg
  (cardio/synthetic-ecg {:sampling-rate sampling-rate
                         :duration duration
                         :heart-rate heart-rate-bpm
                         :seed 42}))

(count clean-ecg)

Add realistic artifacts for the cleaning demonstration: baseline drift (0.3 Hz respiratory), powerline interference (50 Hz), and broadband noise.

(def drift-signal
  (double-array (map #(* 0.2 (Math/sin (* 2.0 Math/PI 0.3 (/ (double %) sampling-rate))))
                     (range n-samples))))

(def powerline-signal
  (double-array (map #(* 0.1 (Math/sin (* 2.0 Math/PI 50.0 (/ (double %) sampling-rate))))
                     (range n-samples))))

(def noise-signal
  (double-array (map (fn [i]
                       (let [id (double i)]
                         (+ (* 0.03 (Math/sin (* id 0.7)))
                            (* 0.02 (Math/sin (* id 2.3)))
                            (* 0.01 (Math/sin (* id 5.1))))))
                     (range n-samples))))

(def raw-ecg
  (double-array (dfn/+ clean-ecg drift-signal powerline-signal noise-signal)))

The raw ECG signal

(-> (tc/dataset {:time time-axis :amplitude (vec raw-ecg)})
    (plotly/base {:=width 800 :=height 300 :=x :time :=y :amplitude
                  :=x-title "Time (s)" :=y-title "Amplitude"
                  :=title "Raw ECG signal (ECGSYN + artifacts)"})
    (plotly/layer-line {:=mark-opacity 0.8})
    plotly/plot)

The raw signal shows the PQRST complexes but also baseline wander (the slow undulation), high-frequency powerline noise (the fine ripple visible on the flat segments), and broadband noise.

ECG Cleaning — Butterworth Filtering

The problem

ECG recordings contain two main types of noise:

Baseline wander from respiration and electrode movement (< 1 Hz)
Powerline interference from mains electricity (50 Hz in EU/Israel, 60 Hz in US/Japan)

Both obscure the true ECG morphology and can confuse peak detection.

The solution: Butterworth filters

A Butterworth filter has a maximally flat magnitude response in the passband – no ripples, just a smooth roll-off. Ripple implements three Butterworth filter types via jDSP:

High-pass: removes frequencies below the cutoff (drift removal)
Band-stop (notch): removes a narrow frequency band (powerline)
Band-pass: keeps only frequencies in a range (QRS isolation)

The NeuroKit2 default cleaning method chains a high-pass at 0.5 Hz (order 5) with a notch at the powerline frequency.

(Makowski et al. 2021)

Seeing each filter’s effect

First, high-pass only – removes drift but keeps the 50 Hz noise:

(def ecg-highpass-only
  (cardio/butterworth-highpass raw-ecg sampling-rate {:cutoff 0.5 :order 5}))

Notch only – removes 50 Hz but keeps drift:

(def ecg-notch-only
  (cardio/butterworth-bandstop raw-ecg sampling-rate {:low-cutoff 49.0 :high-cutoff 51.0 :order 5}))

Both together – the NeuroKit default method:

(def ecg-cleaned
  (cardio/ecg-clean raw-ecg sampling-rate {:method :neurokit :powerline 50}))

Zoomed view of the first 2 seconds, showing all variants:

(let [zoom-end 1000
      sub-time (subvec time-axis 0 zoom-end)
      nt (count sub-time)]
  (-> (tc/dataset {:time (vec (concat sub-time sub-time sub-time sub-time))
                   :amplitude (vec (concat (take zoom-end raw-ecg)
                                           (take zoom-end ecg-highpass-only)
                                           (take zoom-end ecg-notch-only)
                                           (take zoom-end ecg-cleaned)))
                   :series (vec (concat (repeat nt "raw")
                                        (repeat nt "highpass only")
                                        (repeat nt "notch only")
                                        (repeat nt "cleaned (both)")))})
      (plotly/base {:=width 800 :=height 400 :=x :time :=y :amplitude :=color :series
                    :=x-title "Time (s)" :=y-title "Amplitude"})
      (plotly/layer-line {:=mark-opacity 0.7})
      plotly/plot))

The “highpass only” trace removes the slow drift but retains the 50 Hz ripple. The “notch only” trace removes the powerline noise but still drifts. The cleaned signal (both filters) has a stable baseline and clean morphology.

Verify the cleaned signal has the same length:

(count ecg-cleaned)

Hamilton cleaning method

The Hamilton (2002) method takes a different approach: a narrow bandpass filter at 8-16 Hz isolates the QRS complex frequency band. This aggressively removes everything except the sharp R-peak, which makes peak detection easier at the cost of losing P-wave and T-wave morphology.

(def ecg-hamilton-cleaned
  (cardio/ecg-clean raw-ecg sampling-rate {:method :hamilton}))

(let [zoom-end 1500
      sub-time (subvec time-axis 0 zoom-end)
      nt (count sub-time)]
  (-> (tc/dataset {:time (vec (concat sub-time sub-time))
                   :amplitude (vec (concat (take zoom-end ecg-cleaned)
                                           (take zoom-end ecg-hamilton-cleaned)))
                   :series (vec (concat (repeat nt "NeuroKit (broadband)")
                                        (repeat nt "Hamilton (8-16 Hz)")))})
      (plotly/base {:=width 800 :=height 350 :=x :time :=y :amplitude :=color :series
                    :=x-title "Time (s)" :=y-title "Amplitude"})
      (plotly/layer-line {:=mark-opacity 0.8})
      plotly/plot))

The Hamilton-cleaned signal retains only the QRS complex – the P-wave and T-wave are filtered out. This is useful when the goal is purely R-peak detection, not morphological analysis.

Synthetic PPG Signal

Photoplethysmography measures changes in blood volume using light. The PPG waveform is broader and smoother than ECG, with a systolic peak (blood volume maximum) and sometimes a dicrotic notch (a small dip from aortic valve closure).

PPG contains lower-frequency content than ECG, so it uses a bandpass filter at 0.5-8 Hz rather than the ECG’s highpass + notch combination.

(def ppg-heart-rate 75.0)

(def clean-ppg
  (cardio/synthetic-ppg {:sampling-rate sampling-rate
                         :duration duration
                         :heart-rate ppg-heart-rate
                         :seed 42}))

Add artifacts to demonstrate cleaning:

(def raw-ppg
  (let [drift (double-array (map #(* 0.15 (Math/sin (* 2.0 Math/PI 0.1 (/ (double %) sampling-rate))))
                                 (range n-samples)))
        noise (double-array (map (fn [i]
                                   (+ (* 0.03 (Math/sin (* (double i) 0.5)))
                                      (* 0.01 (Math/sin (* (double i) 2.3)))))
                                 (range n-samples)))]
    (double-array (dfn/+ clean-ppg drift noise))))

PPG cleaning with Elgendi method

The Elgendi method applies a 3rd-order Butterworth bandpass at 0.5-8 Hz, which removes both the low-frequency drift and any high-frequency noise while preserving the pulse waveform.

(Elgendi et al. 2013)

(def ppg-cleaned
  (cardio/ppg-clean raw-ppg sampling-rate {:method :elgendi}))

(let [zoom-end 2500
      sub-time (subvec time-axis 0 zoom-end)
      nt (count sub-time)]
  (-> (tc/dataset {:time (vec (concat sub-time sub-time))
                   :amplitude (vec (concat (take zoom-end raw-ppg)
                                           (take zoom-end ppg-cleaned)))
                   :series (vec (concat (repeat nt "raw PPG")
                                        (repeat nt "cleaned PPG")))})
      (plotly/base {:=width 800 :=height 350 :=x :time :=y :amplitude :=color :series
                    :=x-title "Time (s)" :=y-title "Amplitude"})
      (plotly/layer-line {:=mark-opacity 0.7})
      plotly/plot))

After cleaning, the baseline is centered and noise is reduced, while the pulse waveform shape is preserved.

(count ppg-cleaned)

ECG R-Peak Detection

The problem

The R-peak is the tallest, sharpest deflection in the ECG – it marks ventricular depolarization. Locating each R-peak gives us the time of each heartbeat, from which we compute heart rate and HRV.

Simple thresholding fails because ECG amplitude varies across the recording. Adaptive methods are needed.

Method 1: NeuroKit gradient-based detection

Compute the absolute gradient (rate of change) of the signal
Smooth the gradient with a short boxcar filter (0.1 s)
Compute a wider moving average (0.75 s) as an adaptive threshold
Mark regions where the smoothed gradient exceeds 1.5x the average
Within each region, the sample with the highest original amplitude is the R-peak
Enforce a minimum 0.3 s gap between peaks (physiological limit: 200 BPM)

(def peaks-neurokit
  (cardio/ecg-findpeaks ecg-cleaned sampling-rate {:method :neurokit}))

(count peaks-neurokit)

Method 2: Hamilton (2002) adaptive threshold

Compute the absolute first difference of the signal
Smooth with a moving average (0.08 s window)
Maintain running estimates of signal-peak and noise-peak amplitudes
Threshold = noise_avg + 0.45 x (signal_avg - noise_avg)
Detect local maxima exceeding this adaptive threshold
If an RR interval exceeds 1.5x the mean, search back for missed beats at a lower threshold (0.5x)

(Hamilton, P. (2002). Open Source ECG Analysis Software Documentation.)

(def peaks-hamilton
  (cardio/ecg-findpeaks ecg-hamilton-cleaned sampling-rate {:method :hamilton}))

(count peaks-hamilton)

Comparing both methods

(let [nk-peak-times (mapv #(/ (double %) sampling-rate) peaks-neurokit)
      nk-peak-amps (mapv #(aget ecg-cleaned %) peaks-neurokit)
      ham-peak-times (mapv #(/ (double %) sampling-rate) peaks-hamilton)
      ham-peak-amps (mapv #(aget ecg-cleaned %) peaks-hamilton)
      sig-times (subvec time-axis 0 (count ecg-cleaned))
      sig-amps (vec ecg-cleaned)
      ns-sig (count sig-times)
      ns-nk (count nk-peak-times)
      ns-ham (count ham-peak-times)]
  (-> (tc/dataset {:time (vec (concat sig-times nk-peak-times ham-peak-times))
                   :amplitude (vec (concat sig-amps nk-peak-amps ham-peak-amps))
                   :series (vec (concat (repeat ns-sig "cleaned ECG")
                                        (repeat ns-nk "NeuroKit peaks")
                                        (repeat ns-ham "Hamilton peaks")))})
      (plotly/base {:=width 800 :=height 400 :=x :time :=y :amplitude :=color :series
                    :=x-title "Time (s)" :=y-title "Amplitude"})
      (plotly/layer-line {:=mark-opacity 0.4})
      (plotly/layer-point {:=mark-size 8 :=mark-opacity 0.9})
      plotly/plot))

Both methods find peaks at the same locations. The Hamilton method starts detecting slightly later (it needs time to build up its running averages), but on stable signals the results converge.

PPG Systolic Peak Detection

Elgendi et al. (2013) method

PPG peaks are broader than ECG R-peaks, so a different detection strategy is needed:

Square positive values, zero-out negatives – emphasizes peaks
Compute two moving averages of the squared signal:
- Peak MA (0.111 s) – follows individual peaks
- Beat MA (0.667 s) – follows the overall beat envelope
Threshold = beat MA + 0.02 x mean(squared signal)
Find regions where peak MA exceeds the threshold
Within each region, locate the maximum of the original signal
Enforce minimum 0.3 s delay between peaks

(Elgendi et al. 2013. PLoS ONE, 8(10), e76585.)

(def ppg-peaks
  (cardio/ppg-findpeaks ppg-cleaned sampling-rate {:method :elgendi}))

(count ppg-peaks)

Visualizing PPG peaks

(let [peak-times (mapv #(/ (double %) sampling-rate) ppg-peaks)
      peak-amps (mapv #(aget ppg-cleaned %) ppg-peaks)
      sig-times (subvec time-axis 0 (count ppg-cleaned))
      sig-amps (vec ppg-cleaned)
      ns (count sig-times)
      np (count peak-times)]
  (-> (tc/dataset {:time (vec (concat sig-times peak-times))
                   :amplitude (vec (concat sig-amps peak-amps))
                   :series (vec (concat (repeat ns "cleaned PPG")
                                        (repeat np "systolic peaks")))})
      (plotly/base {:=width 800 :=height 350 :=x :time :=y :amplitude :=color :series
                    :=x-title "Time (s)" :=y-title "Amplitude"})
      (plotly/layer-line {:=mark-opacity 0.5})
      (plotly/layer-point {:=mark-size 8 :=mark-opacity 0.9})
      plotly/plot))

Each systolic peak (maximum blood volume) is correctly detected.

Peak Artifact Correction

The problem

Peak detection is imperfect. Common artifacts include:

Extra beats: noise spikes misidentified as peaks (too-short intervals)
Missed beats: low-amplitude beats not detected (too-long intervals)

The solution

A two-pass correction strategy:

Remove extra beats: discard peaks closer than a minimum interval (default 0.3 s – corresponding to 200 BPM maximum heart rate)
Interpolate missed beats: when an interval exceeds 1.5x the median interval, uniformly insert the expected number of beats

This is based on HeartPy’s approach (van Gent et al. 2019).

To demonstrate, we create a deliberately corrupted peak list from our detected NeuroKit peaks:

(def corrupted-peaks
  (let [good-peaks (vec peaks-neurokit)
        ;; Add an extra beat halfway between beats 3 and 4
        extra-beat (int (/ (+ (nth good-peaks 2) (nth good-peaks 3)) 2))
        ;; Remove beat 7 to simulate a missed beat
        with-extra (sort (conj good-peaks extra-beat))
        without-missed (vec (concat (take 7 with-extra) (drop 8 with-extra)))]
    (int-array without-missed)))

Before correction:

(vec corrupted-peaks)

[21 431 852 1063 1275 1685 2098 2942 3354 3766 4187 4611]

After correction:

(def corrected-peaks
  (cardio/fix-peaks corrupted-peaks sampling-rate
                    {:interval-min 0.3 :interval-max-factor 1.5}))

(vec corrected-peaks)

[21 431 852 1063 1275 1685 2098 2520 2942 3354 3766 4187 4611]

The extra beat (the too-close one) is removed, and the gap left by the missing beat is filled by interpolation.

(let [original-count (count peaks-neurokit)
      corrupted-count (count corrupted-peaks)
      corrected-count (count corrected-peaks)]
  {:original original-count
   :corrupted corrupted-count
   :corrected corrected-count})

{:original 12, :corrupted 12, :corrected 13}

RR Interval Extraction

From peaks to intervals

The RR interval is the time between successive R-peaks, measured in milliseconds:

RR[i] = (peak[i+1] - peak[i]) / sampling_rate x 1000

The RR interval series is the foundation of heart rate variability analysis. From it we can also compute instantaneous heart rate:

HR[i] = 60000 / RR[i] (in BPM)

(def rr-intervals
  (cardio/peaks->rr-intervals peaks-neurokit sampling-rate))

(def heart-rates
  (cardio/rr->heart-rate rr-intervals))

RR tachogram

The tachogram plots RR interval vs. beat number – it shows beat-to-beat variability at a glance:

(let [n-rr (count rr-intervals)
      beat-numbers (vec (range 1 (inc n-rr)))]
  (-> (tc/dataset {:beat beat-numbers :rr-ms (vec rr-intervals)})
      (plotly/base {:=width 800 :=height 300 :=x :beat :=y :rr-ms
                    :=x-title "Beat number" :=y-title "RR interval (ms)"
                    :=title "RR Tachogram"})
      (plotly/layer-point {:=mark-size 8 :=mark-opacity 0.8})
      (plotly/layer-line {:=mark-opacity 0.5})
      plotly/plot))

Instantaneous heart rate

(let [n-hr (count heart-rates)
      beat-numbers (vec (range 1 (inc n-hr)))]
  (-> (tc/dataset {:beat beat-numbers :hr-bpm (vec heart-rates)})
      (plotly/base {:=width 800 :=height 300 :=x :beat :=y :hr-bpm
                    :=x-title "Beat number" :=y-title "Heart rate (BPM)"
                    :=title "Instantaneous Heart Rate"})
      (plotly/layer-point {:=mark-size 8 :=mark-opacity 0.8})
      (plotly/layer-line {:=mark-opacity 0.5})
      plotly/plot))

The mean RR interval should be close to the expected value for 72 BPM (833.3 ms). The ECGSYN model introduces natural beat-to-beat variation, so exact match is not expected:

(def mean-rr (/ (reduce + (vec rr-intervals)) (count rr-intervals)))

mean-rr

834.5454545454545

Time-Domain HRV

Heart rate variability (HRV) quantifies the fluctuations in RR intervals. Higher HRV generally indicates better cardiovascular health and stronger autonomic nervous system function.

Key time-domain indices

Time-domain indices are computed directly from the RR interval series and its successive differences. They require no spectral analysis.

Index	Formula	Interpretation
MeanNN	mean(RR)	Average heart period
SDNN	std(RR)	Overall variability
RMSSD	sqrt(mean(diff(RR)^2))	Parasympathetic (vagal) tone
SDSD	std(diff(RR))	Short-term variability
pNN50	% of abs(diff(RR)) > 50 ms	Parasympathetic activity
pNN20	% of abs(diff(RR)) > 20 ms	More sensitive version of pNN50
CVNN	SDNN / MeanNN	Normalized variability
CVSD	RMSSD / MeanNN	Normalized parasympathetic tone
MedianNN	median(RR)	Robust central tendency
MadNN	MAD(RR) x 1.4826	Robust dispersion estimate

(Task Force, 1996. Circulation 93(5), 1043-1065; Shaffer & Ginsberg, 2017. Frontiers in Public Health 5, 258)

(def hrv-results
  (cardio/hrv-time rr-intervals))

HRV results

(kind/table
 {:column-names ["Index" "Value" "Unit"]
  :row-vectors (mapv (fn [[k v unit]]
                       [(name k) (format "%.2f" (double v)) unit])
                     [[:mean-nn (:mean-nn hrv-results) "ms"]
                      [:sdnn (:sdnn hrv-results) "ms"]
                      [:rmssd (:rmssd hrv-results) "ms"]
                      [:sdsd (:sdsd hrv-results) "ms"]
                      [:pnn50 (:pnn50 hrv-results) "%"]
                      [:pnn20 (:pnn20 hrv-results) "%"]
                      [:cvnn (:cvnn hrv-results) ""]
                      [:cvsd (:cvsd hrv-results) ""]
                      [:median-nn (:median-nn hrv-results) "ms"]
                      [:mad-nn (:mad-nn hrv-results) "ms"]
                      [:min-nn (:min-nn hrv-results) "ms"]
                      [:max-nn (:max-nn hrv-results) "ms"]
                      [:mean-hr (:mean-hr hrv-results) "BPM"]
                      [:sd-hr (:sd-hr hrv-results) "BPM"]
                      [:min-hr (:min-hr hrv-results) "BPM"]
                      [:max-hr (:max-hr hrv-results) "BPM"]])})

Index	Value	Unit
mean-nn	834.55	ms
sdnn	11.53	ms
rmssd	15.21	ms
sdsd	15.75	ms
pnn50	0.00	%
pnn20	20.00	%
cvnn	0.01
cvsd	0.02
median-nn	842.00	ms
mad-nn	8.90	ms
min-nn	820.00	ms
max-nn	848.00	ms
mean-hr	71.91	BPM
sd-hr	1.00	BPM
min-hr	70.75	BPM
max-hr	73.17	BPM

The ECGSYN model produces realistic beat-to-beat variability, so the HRV indices reflect physiologically plausible values. SDNN and RMSSD are non-trivial, unlike the old fixed-interval synthetic signal.

Verify key indices:

(:mean-hr hrv-results)

71.9079242963633

Summary

Algorithm	Function	Key Parameters	Reference
Butterworth high-pass	cardio/butterworth-highpass	:cutoff, :order	(standard DSP)
Butterworth band-pass	cardio/butterworth-bandpass	:low-cutoff, :high-cutoff, :order	(standard DSP)
Butterworth band-stop	cardio/butterworth-bandstop	:low-cutoff, :high-cutoff, :order	(standard DSP)
ECG cleaning	cardio/ecg-clean	:method, :powerline	Makowski et al. 2021
PPG cleaning	cardio/ppg-clean	:method	Elgendi et al. 2013
ECG R-peak detection	cardio/ecg-findpeaks	:method	Hamilton 2002
PPG systolic detection	cardio/ppg-findpeaks	:method	Elgendi et al. 2013
Artifact correction	cardio/fix-peaks	:interval-min, :interval-max-factor	van Gent et al. 2019
RR interval extraction	cardio/peaks->rr-intervals	(sampling-rate)	(standard)
Heart rate conversion	cardio/rr->heart-rate	(none)	(standard)
Time-domain HRV	cardio/hrv-time	(none)	Task Force 1996

source: notebooks/ripple_book/cardio_algorithms_explained.clj