4 Cardio Algorithms Explained
A guide to the cardiovascular signal processing algorithms in Ripple, explaining why each step is needed, the mathematical intuition behind it, and how to use the Clojure implementation.
Ripple implements ECG and PPG processing pipelines based on NeuroKit2 and HeartPy. This notebook walks through each algorithm in pipeline order, using synthetic signals that make the effect of each step clearly visible.
References:
McSharry, P. E. et al. (2003). “A dynamical model for generating synthetic electrocardiogram signals.” IEEE TBME, 50(3), 289-294.
Makowski, D. et al. (2021). “NeuroKit2: A Python toolbox for neurophysiological signal processing.” Behavior Research Methods, 53(4), 1689-1696. doi:10.3758/s13428-020-01516-y
van Gent, P. et al. (2019). “HeartPy: A novel heart rate algorithm for the analysis of noisy signals.” Transportation Research Part F, 66, 368-378. doi:10.1016/j.trf.2019.09.015
Pham, T. et al. (2021). “Heart Rate Variability in Psychology: A Review of HRV Indices and an Analysis Tutorial.” Sensors, 21(12),
(ns ripple-book.cardio-algorithms-explained
(:require
;; Ripple cardio public API:
[scicloj.ripple.cardio :as cardio]
;; Table processing (https://scicloj.github.io/tablecloth/):
[tablecloth.api :as tc]
;; High-performance array math (https://github.com/cnuernber/dtype-next):
[tech.v3.datatype.functional :as dfn]
;; Interactive plotting via Plotly (https://scicloj.github.io/tableplot/):
[scicloj.tableplot.v1.plotly :as plotly]
;; Annotating kinds of visualizations (https://scicloj.github.io/kindly-noted/):
[scicloj.kindly.v4.kind :as kind]))The Cardiovascular Signal Processing Pipeline
Electrocardiography (ECG) records the electrical activity of the heart via skin electrodes. Photoplethysmography (PPG) measures blood volume changes optically, typically from a fingertip or wrist sensor. Both produce periodic signals from which heart rate and heart rate variability (HRV) can be extracted.
Raw signals contain noise and artifacts. Before analysis, each signal passes through a processing pipeline:
| Step | Algorithm | Purpose |
|---|---|---|
| 1 | Butterworth filtering | Remove drift and powerline noise |
| 2 | R-peak / systolic detection | Find heartbeat locations |
| 3 | Artifact correction | Fix missed and extra beats |
| 4 | RR interval extraction | Measure time between beats |
| 5 | Time-domain HRV | Compute heart rate variability indices |
Synthetic ECG Signal
We use the ECGSYN dynamical model (McSharry et al. 2003) to generate a realistic synthetic ECG with PQRST morphology and beat-to-beat variability. The model uses three coupled ODEs on a limit cycle, producing natural heart rate variability via LF (Mayer waves) and HF (respiratory) spectral components.
The clean signal is then corrupted with artifacts to demonstrate the cleaning pipeline.
(def sampling-rate 500.0)(def duration 10.0)(def n-samples (int (* sampling-rate duration)))(def heart-rate-bpm 72.0)(def time-axis
(vec (map #(/ (double %) sampling-rate) (range n-samples))))Generate a clean ECG using the ECGSYN model:
(def clean-ecg
(cardio/synthetic-ecg {:sampling-rate sampling-rate
:duration duration
:heart-rate heart-rate-bpm
:seed 42}))(count clean-ecg)5000Add realistic artifacts for the cleaning demonstration: baseline drift (0.3 Hz respiratory), powerline interference (50 Hz), and broadband noise.
(def drift-signal
(double-array (map #(* 0.2 (Math/sin (* 2.0 Math/PI 0.3 (/ (double %) sampling-rate))))
(range n-samples))))(def powerline-signal
(double-array (map #(* 0.1 (Math/sin (* 2.0 Math/PI 50.0 (/ (double %) sampling-rate))))
(range n-samples))))(def noise-signal
(double-array (map (fn [i]
(let [id (double i)]
(+ (* 0.03 (Math/sin (* id 0.7)))
(* 0.02 (Math/sin (* id 2.3)))
(* 0.01 (Math/sin (* id 5.1))))))
(range n-samples))))(def raw-ecg
(double-array (dfn/+ clean-ecg drift-signal powerline-signal noise-signal)))The raw ECG signal
(-> (tc/dataset {:time time-axis :amplitude (vec raw-ecg)})
(plotly/base {:=width 800 :=height 300 :=x :time :=y :amplitude
:=x-title "Time (s)" :=y-title "Amplitude"
:=title "Raw ECG signal (ECGSYN + artifacts)"})
(plotly/layer-line {:=mark-opacity 0.8})
plotly/plot)The raw signal shows the PQRST complexes but also baseline wander (the slow undulation), high-frequency powerline noise (the fine ripple visible on the flat segments), and broadband noise.
ECG Cleaning — Butterworth Filtering
The problem
ECG recordings contain two main types of noise:
- Baseline wander from respiration and electrode movement (< 1 Hz)
- Powerline interference from mains electricity (50 Hz in EU/Israel, 60 Hz in US/Japan)
Both obscure the true ECG morphology and can confuse peak detection.
The solution: Butterworth filters
A Butterworth filter has a maximally flat magnitude response in the passband – no ripples, just a smooth roll-off. Ripple implements three Butterworth filter types via jDSP:
- High-pass: removes frequencies below the cutoff (drift removal)
- Band-stop (notch): removes a narrow frequency band (powerline)
- Band-pass: keeps only frequencies in a range (QRS isolation)
The NeuroKit2 default cleaning method chains a high-pass at 0.5 Hz (order 5) with a notch at the powerline frequency.
(Makowski et al. 2021)
Seeing each filter’s effect
First, high-pass only – removes drift but keeps the 50 Hz noise:
(def ecg-highpass-only
(cardio/butterworth-highpass raw-ecg sampling-rate {:cutoff 0.5 :order 5}))Notch only – removes 50 Hz but keeps drift:
(def ecg-notch-only
(cardio/butterworth-bandstop raw-ecg sampling-rate {:low-cutoff 49.0 :high-cutoff 51.0 :order 5}))Both together – the NeuroKit default method:
(def ecg-cleaned
(cardio/ecg-clean raw-ecg sampling-rate {:method :neurokit :powerline 50}))Zoomed view of the first 2 seconds, showing all variants:
(let [zoom-end 1000
sub-time (subvec time-axis 0 zoom-end)
nt (count sub-time)]
(-> (tc/dataset {:time (vec (concat sub-time sub-time sub-time sub-time))
:amplitude (vec (concat (take zoom-end raw-ecg)
(take zoom-end ecg-highpass-only)
(take zoom-end ecg-notch-only)
(take zoom-end ecg-cleaned)))
:series (vec (concat (repeat nt "raw")
(repeat nt "highpass only")
(repeat nt "notch only")
(repeat nt "cleaned (both)")))})
(plotly/base {:=width 800 :=height 400 :=x :time :=y :amplitude :=color :series
:=x-title "Time (s)" :=y-title "Amplitude"})
(plotly/layer-line {:=mark-opacity 0.7})
plotly/plot))The “highpass only” trace removes the slow drift but retains the 50 Hz ripple. The “notch only” trace removes the powerline noise but still drifts. The cleaned signal (both filters) has a stable baseline and clean morphology.
Verify the cleaned signal has the same length:
(count ecg-cleaned)5000Hamilton cleaning method
The Hamilton (2002) method takes a different approach: a narrow bandpass filter at 8-16 Hz isolates the QRS complex frequency band. This aggressively removes everything except the sharp R-peak, which makes peak detection easier at the cost of losing P-wave and T-wave morphology.
(def ecg-hamilton-cleaned
(cardio/ecg-clean raw-ecg sampling-rate {:method :hamilton}))(let [zoom-end 1500
sub-time (subvec time-axis 0 zoom-end)
nt (count sub-time)]
(-> (tc/dataset {:time (vec (concat sub-time sub-time))
:amplitude (vec (concat (take zoom-end ecg-cleaned)
(take zoom-end ecg-hamilton-cleaned)))
:series (vec (concat (repeat nt "NeuroKit (broadband)")
(repeat nt "Hamilton (8-16 Hz)")))})
(plotly/base {:=width 800 :=height 350 :=x :time :=y :amplitude :=color :series
:=x-title "Time (s)" :=y-title "Amplitude"})
(plotly/layer-line {:=mark-opacity 0.8})
plotly/plot))The Hamilton-cleaned signal retains only the QRS complex – the P-wave and T-wave are filtered out. This is useful when the goal is purely R-peak detection, not morphological analysis.
Synthetic PPG Signal
Photoplethysmography measures changes in blood volume using light. The PPG waveform is broader and smoother than ECG, with a systolic peak (blood volume maximum) and sometimes a dicrotic notch (a small dip from aortic valve closure).
PPG contains lower-frequency content than ECG, so it uses a bandpass filter at 0.5-8 Hz rather than the ECG’s highpass + notch combination.
(def ppg-heart-rate 75.0)(def clean-ppg
(cardio/synthetic-ppg {:sampling-rate sampling-rate
:duration duration
:heart-rate ppg-heart-rate
:seed 42}))Add artifacts to demonstrate cleaning:
(def raw-ppg
(let [drift (double-array (map #(* 0.15 (Math/sin (* 2.0 Math/PI 0.1 (/ (double %) sampling-rate))))
(range n-samples)))
noise (double-array (map (fn [i]
(+ (* 0.03 (Math/sin (* (double i) 0.5)))
(* 0.01 (Math/sin (* (double i) 2.3)))))
(range n-samples)))]
(double-array (dfn/+ clean-ppg drift noise))))PPG cleaning with Elgendi method
The Elgendi method applies a 3rd-order Butterworth bandpass at 0.5-8 Hz, which removes both the low-frequency drift and any high-frequency noise while preserving the pulse waveform.
(def ppg-cleaned
(cardio/ppg-clean raw-ppg sampling-rate {:method :elgendi}))(let [zoom-end 2500
sub-time (subvec time-axis 0 zoom-end)
nt (count sub-time)]
(-> (tc/dataset {:time (vec (concat sub-time sub-time))
:amplitude (vec (concat (take zoom-end raw-ppg)
(take zoom-end ppg-cleaned)))
:series (vec (concat (repeat nt "raw PPG")
(repeat nt "cleaned PPG")))})
(plotly/base {:=width 800 :=height 350 :=x :time :=y :amplitude :=color :series
:=x-title "Time (s)" :=y-title "Amplitude"})
(plotly/layer-line {:=mark-opacity 0.7})
plotly/plot))After cleaning, the baseline is centered and noise is reduced, while the pulse waveform shape is preserved.
(count ppg-cleaned)5000ECG R-Peak Detection
The problem
The R-peak is the tallest, sharpest deflection in the ECG – it marks ventricular depolarization. Locating each R-peak gives us the time of each heartbeat, from which we compute heart rate and HRV.
Simple thresholding fails because ECG amplitude varies across the recording. Adaptive methods are needed.
Method 1: NeuroKit gradient-based detection
- Compute the absolute gradient (rate of change) of the signal
- Smooth the gradient with a short boxcar filter (0.1 s)
- Compute a wider moving average (0.75 s) as an adaptive threshold
- Mark regions where the smoothed gradient exceeds 1.5x the average
- Within each region, the sample with the highest original amplitude is the R-peak
- Enforce a minimum 0.3 s gap between peaks (physiological limit: 200 BPM)
(def peaks-neurokit
(cardio/ecg-findpeaks ecg-cleaned sampling-rate {:method :neurokit}))(count peaks-neurokit)12Method 2: Hamilton (2002) adaptive threshold
- Compute the absolute first difference of the signal
- Smooth with a moving average (0.08 s window)
- Maintain running estimates of signal-peak and noise-peak amplitudes
- Threshold = noise_avg + 0.45 x (signal_avg - noise_avg)
- Detect local maxima exceeding this adaptive threshold
- If an RR interval exceeds 1.5x the mean, search back for missed beats at a lower threshold (0.5x)
(Hamilton, P. (2002). Open Source ECG Analysis Software Documentation.)
(def peaks-hamilton
(cardio/ecg-findpeaks ecg-hamilton-cleaned sampling-rate {:method :hamilton}))(count peaks-hamilton)12Comparing both methods
(let [nk-peak-times (mapv #(/ (double %) sampling-rate) peaks-neurokit)
nk-peak-amps (mapv #(aget ecg-cleaned %) peaks-neurokit)
ham-peak-times (mapv #(/ (double %) sampling-rate) peaks-hamilton)
ham-peak-amps (mapv #(aget ecg-cleaned %) peaks-hamilton)
sig-times (subvec time-axis 0 (count ecg-cleaned))
sig-amps (vec ecg-cleaned)
ns-sig (count sig-times)
ns-nk (count nk-peak-times)
ns-ham (count ham-peak-times)]
(-> (tc/dataset {:time (vec (concat sig-times nk-peak-times ham-peak-times))
:amplitude (vec (concat sig-amps nk-peak-amps ham-peak-amps))
:series (vec (concat (repeat ns-sig "cleaned ECG")
(repeat ns-nk "NeuroKit peaks")
(repeat ns-ham "Hamilton peaks")))})
(plotly/base {:=width 800 :=height 400 :=x :time :=y :amplitude :=color :series
:=x-title "Time (s)" :=y-title "Amplitude"})
(plotly/layer-line {:=mark-opacity 0.4})
(plotly/layer-point {:=mark-size 8 :=mark-opacity 0.9})
plotly/plot))Both methods find peaks at the same locations. The Hamilton method starts detecting slightly later (it needs time to build up its running averages), but on stable signals the results converge.
PPG Systolic Peak Detection
Elgendi et al. (2013) method
PPG peaks are broader than ECG R-peaks, so a different detection strategy is needed:
Square positive values, zero-out negatives – emphasizes peaks
Compute two moving averages of the squared signal:
- Peak MA (0.111 s) – follows individual peaks
- Beat MA (0.667 s) – follows the overall beat envelope
Threshold = beat MA + 0.02 x mean(squared signal)
Find regions where peak MA exceeds the threshold
Within each region, locate the maximum of the original signal
Enforce minimum 0.3 s delay between peaks
(Elgendi et al. 2013. PLoS ONE, 8(10), e76585.)
(def ppg-peaks
(cardio/ppg-findpeaks ppg-cleaned sampling-rate {:method :elgendi}))(count ppg-peaks)13Visualizing PPG peaks
(let [peak-times (mapv #(/ (double %) sampling-rate) ppg-peaks)
peak-amps (mapv #(aget ppg-cleaned %) ppg-peaks)
sig-times (subvec time-axis 0 (count ppg-cleaned))
sig-amps (vec ppg-cleaned)
ns (count sig-times)
np (count peak-times)]
(-> (tc/dataset {:time (vec (concat sig-times peak-times))
:amplitude (vec (concat sig-amps peak-amps))
:series (vec (concat (repeat ns "cleaned PPG")
(repeat np "systolic peaks")))})
(plotly/base {:=width 800 :=height 350 :=x :time :=y :amplitude :=color :series
:=x-title "Time (s)" :=y-title "Amplitude"})
(plotly/layer-line {:=mark-opacity 0.5})
(plotly/layer-point {:=mark-size 8 :=mark-opacity 0.9})
plotly/plot))Each systolic peak (maximum blood volume) is correctly detected.
Peak Artifact Correction
The problem
Peak detection is imperfect. Common artifacts include:
- Extra beats: noise spikes misidentified as peaks (too-short intervals)
- Missed beats: low-amplitude beats not detected (too-long intervals)
The solution
A two-pass correction strategy:
- Remove extra beats: discard peaks closer than a minimum interval (default 0.3 s – corresponding to 200 BPM maximum heart rate)
- Interpolate missed beats: when an interval exceeds 1.5x the median interval, uniformly insert the expected number of beats
This is based on HeartPy’s approach (van Gent et al. 2019).
To demonstrate, we create a deliberately corrupted peak list from our detected NeuroKit peaks:
(def corrupted-peaks
(let [good-peaks (vec peaks-neurokit)
;; Add an extra beat halfway between beats 3 and 4
extra-beat (int (/ (+ (nth good-peaks 2) (nth good-peaks 3)) 2))
;; Remove beat 7 to simulate a missed beat
with-extra (sort (conj good-peaks extra-beat))
without-missed (vec (concat (take 7 with-extra) (drop 8 with-extra)))]
(int-array without-missed)))Before correction:
(vec corrupted-peaks)[21 431 852 1063 1275 1685 2098 2942 3354 3766 4187 4611]After correction:
(def corrected-peaks
(cardio/fix-peaks corrupted-peaks sampling-rate
{:interval-min 0.3 :interval-max-factor 1.5}))(vec corrected-peaks)[21 431 852 1063 1275 1685 2098 2520 2942 3354 3766 4187 4611]The extra beat (the too-close one) is removed, and the gap left by the missing beat is filled by interpolation.
(let [original-count (count peaks-neurokit)
corrupted-count (count corrupted-peaks)
corrected-count (count corrected-peaks)]
{:original original-count
:corrupted corrupted-count
:corrected corrected-count}){:original 12, :corrupted 12, :corrected 13}RR Interval Extraction
From peaks to intervals
The RR interval is the time between successive R-peaks, measured in milliseconds:
RR[i] = (peak[i+1] - peak[i]) / sampling_rate x 1000
The RR interval series is the foundation of heart rate variability analysis. From it we can also compute instantaneous heart rate:
HR[i] = 60000 / RR[i] (in BPM)
(def rr-intervals
(cardio/peaks->rr-intervals peaks-neurokit sampling-rate))(def heart-rates
(cardio/rr->heart-rate rr-intervals))RR tachogram
The tachogram plots RR interval vs. beat number – it shows beat-to-beat variability at a glance:
(let [n-rr (count rr-intervals)
beat-numbers (vec (range 1 (inc n-rr)))]
(-> (tc/dataset {:beat beat-numbers :rr-ms (vec rr-intervals)})
(plotly/base {:=width 800 :=height 300 :=x :beat :=y :rr-ms
:=x-title "Beat number" :=y-title "RR interval (ms)"
:=title "RR Tachogram"})
(plotly/layer-point {:=mark-size 8 :=mark-opacity 0.8})
(plotly/layer-line {:=mark-opacity 0.5})
plotly/plot))Instantaneous heart rate
(let [n-hr (count heart-rates)
beat-numbers (vec (range 1 (inc n-hr)))]
(-> (tc/dataset {:beat beat-numbers :hr-bpm (vec heart-rates)})
(plotly/base {:=width 800 :=height 300 :=x :beat :=y :hr-bpm
:=x-title "Beat number" :=y-title "Heart rate (BPM)"
:=title "Instantaneous Heart Rate"})
(plotly/layer-point {:=mark-size 8 :=mark-opacity 0.8})
(plotly/layer-line {:=mark-opacity 0.5})
plotly/plot))The mean RR interval should be close to the expected value for 72 BPM (833.3 ms). The ECGSYN model introduces natural beat-to-beat variation, so exact match is not expected:
(def mean-rr (/ (reduce + (vec rr-intervals)) (count rr-intervals)))mean-rr834.5454545454545Time-Domain HRV
Heart rate variability (HRV) quantifies the fluctuations in RR intervals. Higher HRV generally indicates better cardiovascular health and stronger autonomic nervous system function.
Key time-domain indices
Time-domain indices are computed directly from the RR interval series and its successive differences. They require no spectral analysis.
| Index | Formula | Interpretation |
|---|---|---|
| MeanNN | mean(RR) | Average heart period |
| SDNN | std(RR) | Overall variability |
| RMSSD | sqrt(mean(diff(RR)^2)) | Parasympathetic (vagal) tone |
| SDSD | std(diff(RR)) | Short-term variability |
| pNN50 | % of abs(diff(RR)) > 50 ms | Parasympathetic activity |
| pNN20 | % of abs(diff(RR)) > 20 ms | More sensitive version of pNN50 |
| CVNN | SDNN / MeanNN | Normalized variability |
| CVSD | RMSSD / MeanNN | Normalized parasympathetic tone |
| MedianNN | median(RR) | Robust central tendency |
| MadNN | MAD(RR) x 1.4826 | Robust dispersion estimate |
(Task Force, 1996. Circulation 93(5), 1043-1065; Shaffer & Ginsberg, 2017. Frontiers in Public Health 5, 258)
(def hrv-results
(cardio/hrv-time rr-intervals))HRV results
(kind/table
{:column-names ["Index" "Value" "Unit"]
:row-vectors (mapv (fn [[k v unit]]
[(name k) (format "%.2f" (double v)) unit])
[[:mean-nn (:mean-nn hrv-results) "ms"]
[:sdnn (:sdnn hrv-results) "ms"]
[:rmssd (:rmssd hrv-results) "ms"]
[:sdsd (:sdsd hrv-results) "ms"]
[:pnn50 (:pnn50 hrv-results) "%"]
[:pnn20 (:pnn20 hrv-results) "%"]
[:cvnn (:cvnn hrv-results) ""]
[:cvsd (:cvsd hrv-results) ""]
[:median-nn (:median-nn hrv-results) "ms"]
[:mad-nn (:mad-nn hrv-results) "ms"]
[:min-nn (:min-nn hrv-results) "ms"]
[:max-nn (:max-nn hrv-results) "ms"]
[:mean-hr (:mean-hr hrv-results) "BPM"]
[:sd-hr (:sd-hr hrv-results) "BPM"]
[:min-hr (:min-hr hrv-results) "BPM"]
[:max-hr (:max-hr hrv-results) "BPM"]])})| Index | Value | Unit |
|---|---|---|
| mean-nn | 834.55 | ms |
| sdnn | 11.53 | ms |
| rmssd | 15.21 | ms |
| sdsd | 15.75 | ms |
| pnn50 | 0.00 | % |
| pnn20 | 20.00 | % |
| cvnn | 0.01 | |
| cvsd | 0.02 | |
| median-nn | 842.00 | ms |
| mad-nn | 8.90 | ms |
| min-nn | 820.00 | ms |
| max-nn | 848.00 | ms |
| mean-hr | 71.91 | BPM |
| sd-hr | 1.00 | BPM |
| min-hr | 70.75 | BPM |
| max-hr | 73.17 | BPM |
The ECGSYN model produces realistic beat-to-beat variability, so the HRV indices reflect physiologically plausible values. SDNN and RMSSD are non-trivial, unlike the old fixed-interval synthetic signal.
Verify key indices:
(:mean-hr hrv-results)71.9079242963633Summary
| Algorithm | Function | Key Parameters | Reference |
|---|---|---|---|
| Butterworth high-pass | cardio/butterworth-highpass | :cutoff, :order | (standard DSP) |
| Butterworth band-pass | cardio/butterworth-bandpass | :low-cutoff, :high-cutoff, :order | (standard DSP) |
| Butterworth band-stop | cardio/butterworth-bandstop | :low-cutoff, :high-cutoff, :order | (standard DSP) |
| ECG cleaning | cardio/ecg-clean | :method, :powerline | Makowski et al. 2021 |
| PPG cleaning | cardio/ppg-clean | :method | Elgendi et al. 2013 |
| ECG R-peak detection | cardio/ecg-findpeaks | :method | Hamilton 2002 |
| PPG systolic detection | cardio/ppg-findpeaks | :method | Elgendi et al. 2013 |
| Artifact correction | cardio/fix-peaks | :interval-min, :interval-max-factor | van Gent et al. 2019 |
| RR interval extraction | cardio/peaks->rr-intervals | (sampling-rate) | (standard) |
| Heart rate conversion | cardio/rr->heart-rate | (none) | (standard) |
| Time-domain HRV | cardio/hrv-time | (none) | Task Force 1996 |