Engee documentation
Notebook

Detection of R-peaks on an ECG signal

The project considers ECG signal processing for the detection of R-peaks. The signal is loaded, network interference is filtered, the signal is prepared for detection and the found R-peaks are displayed on the ECG signal.

Display Settings

Function engee.clear() cleans up workspaces:

In [ ]:
engee.clear()

Two graph rendering modes are available to visualize the results.:

  • For static, fast rendering, use gr().

  • For interactive, dynamic visualization with the ability to scale and view values, use plotlyjs().

Select one of the commands by commenting out the second one. Both backends are mutually exclusive, and the last one called will be activated.

In [ ]:
# gr()
plotlyjs()
Out[0]:
Plots.PlotlyJSBackend()

We will connect using the function include the "read" file.jl" for reading files:

In [ ]:
include("$(@__DIR__)/read.jl")
Out[0]:
safe_read_ann (generic function with 1 method)

Installing the necessary libraries

At this stage, the necessary Julia libraries are checked and automatically installed if they are not available in the work environment. The example in question uses the library Statistics necessary for performing basic statistical calculations in the analysis of heart rate variability, including processing RR intervals, calculating averages, and estimating data distribution.

In [ ]:
let
    installed_packages = collect(x.name for (_, x) in Pkg.dependencies() if x.is_direct_dep)
    list_packages = ["Statistics"]
    for pack in list_packages
        pack in installed_packages || Pkg.add(pack)
    end
end
using Statistics

The library is being activated EngeeDSP and the functions necessary for digital ECG signal processing are imported, including filtering, spectral analysis, peak search, and convolution.

In [ ]:
import EngeeDSP.Functions: fft, butter, filter, findpeaks, conv, freqz

ECG signal processing and analysis

This section discusses the sequence of ECG signal processing and analysis, including data loading, spectral analysis, signal filtering, R-peak detection, and calculation of RR intervals. The results obtained are then used to evaluate heart rate variability using an RR tachogram, a histogram of the distribution of RR intervals, and a scatterogram.

Signal spectrum calculation function

A function is being created calc_fft, designed to calculate the amplitude spectrum of a signal using a discrete Fourier transform. The function generates a one-sided spectrum, calculates the corresponding frequency axis, and determines the amplitude spectrum on a linear and logarithmic scale. The data obtained is then used for spectral analysis of the ECG signal and evaluation of its frequency composition.

In [ ]:
function calc_fft(x, fs)
    len = length(x)
    S = fft(x)  # DFT
    
    # One-way spectrum
    K = len ÷ 2
    S = S[1:K]      
    freq = (0:K-1) .* fs ./ len     # frequency axis from 0 to fs/2

    # One-sided amplitude spectrum
    S_amp = 2 .* abs.(S) ./ len
    S_amp[1] = 0.0

    # One-sided amplitude spectrum in dB
    S_dB = 20 .* log10.(S_amp .+ 1e-12)

    return freq, S_amp, S_dB
end
Out[0]:
calc_fft (generic function with 1 method)

Uploading and preparing data

The selected fragment of the ECG signal is loaded from the MIT-BIH database. Based on the information from the header file, the sampling frequency of the recording is determined, after which the beginning and duration of the analyzed signal segment are set. Next, the corresponding time fragment of the ECG recording is read and the first signal channel corresponding to the second standard ECG lead is selected, which is widely used in heart rate analysis and R-peak detection.

In [ ]:
data = "$(@__DIR__)/mitdb/100"

# Reading the header file and getting the sampling rate
hdr = read_header(data)
fs = Float64(hdr.fs)

# Selecting a signal fragment for analysis
t_start = 5.0 * 60.0        # The fragment starts at 5 min.
t_dur   = 5.0  * 60.0      # The duration of the fragment is 5 minutes.

# Converting time from seconds to countdown numbers
sampfrom = Int(round(t_start * fs))
sampto   = sampfrom + Int(round(t_dur * fs))

# Reading the selected fragment of the ECG signal
_, raw, phys = read_dat_212(data; sampfrom=sampfrom, sampto=sampto)

# Selection of the lead channel for analysis
ecg = phys === nothing ? Float64.(raw[:,1]) : Float64.(phys[:,1])

# Calculating the number of counts and forming the time axis
N = length(ecg)
t = collect(0:N-1) ./ fs
Out[0]:
108000-element Vector{Float64}:
   0.0
   0.002777777777777778
   0.005555555555555556
   0.008333333333333333
   0.011111111111111112
   0.013888888888888888
   0.016666666666666666
   0.019444444444444445
   0.022222222222222223
   0.025
   0.027777777777777776
   0.030555555555555555
   0.03333333333333333
   ⋮
 299.96666666666664
 299.96944444444443
 299.97222222222223
 299.975
 299.97777777777776
 299.98055555555555
 299.98333333333335
 299.9861111111111
 299.9888888888889
 299.9916666666667
 299.99444444444447
 299.9972222222222

Displaying the original ECG signal

The original fragment of the ECG signal is displayed in the time domain.

In [ ]:
p1 = plot(t, ecg,
    xlabel="Time, from",
    ylabel="Amplitude, mV",
    label="The original signal"
)
Out[0]:

Spectral analysis of the initial ECG signal

Using a previously created function calc_fft The ECG signal spectrum is calculated and a one-sided amplitude spectrum is constructed. The resulting graph allows us to estimate the frequency distribution of the signal amplitudes and determine the main spectral components of the ECG signal.

In [ ]:
freq0, S0_amp, _ = calc_fft(ecg, fs)

p2 = plot(freq0, S0_amp,
    xlabel="Frequency, Hz",
    ylabel="Amplitude, mV",
    label="The original signal",
    xlim=(0, fs/2)
)
Out[0]:

From the obtained spectrum of the ECG signal, a pronounced spectral component in the 60 Hz region can be observed, corresponding to network interference. The presence of this component is due to the influence of the electrical network and signal recording equipment. To reduce the impact of network interference, digital filtering of the ECG signal will be performed next.

ECG signal filtering

To suppress network interference, a Butterworth notch filter is synthesized. A frequency of 60 Hz is selected as the central suppression frequency, corresponding to the spectral component of the network interference previously detected by spectral analysis of the signal.

In [ ]:
f_center = 60.0     # resonant frequency, Hz
bw = 6.0        # filter bandwidth, Hz
f1 = f_center - bw/2      # lower limit frequency, Hz
f2 = f_center + bw/2      # upper limit frequency, Hz

# Normalized frequency relative to sampling rate
wn1 = f1 / (fs / 2)
wn2 = f2 / (fs / 2)

order = 6      # the order of the projected filter

b, a = butter(order, [wn1, wn2], "stop")    # synthesis of the Butterworth notch filter
Out[0]:
2-element Vector{Any}:
 [0.8167515449979387 -4.907234464272076 … -4.907234464272074 0.8167515449979387]
 [1.0 -5.805671517442012 … -4.14311738707587 0.6670830862565154]

To analyze the properties of the synthesized filter, its amplitude-frequency response is calculated and constructed. Using the function freqz The complex transmission coefficient of the filter is determined, after which the angular frequency is converted to hertz and the frequency response is calculated on a logarithmic scale.

In [ ]:
# Calculation of the amplitude-frequency response (AFC) of the filter
h, w = freqz(b, a, 2048)
f_resp = w .* fs ./ (2π)    # converting the angular frequency to Hz

# Calculation of frequency response in dB
amp_resp = 20 .* log10.(abs.(h) .+ 1e-12)

plot(f_resp, amp_resp,
    xlabel="Frequency, Hz",
    ylabel="Frequency response, dB",
    legend=false,
    xlim=(0, fs/2)
)
Out[0]:

According to the graph obtained, it is possible to observe the area of signal suppression near the frequency of 60 Hz, corresponding to network interference.

The ECG signal is filtered using a previously synthesized notch filter. During processing, the spectral component in the 60 Hz region is attenuated, corresponding to network interference. This makes it possible to reduce the influence of external electrical interference and improve the quality of the ECG signal before further detection of R-peaks and calculation of RR intervals.

In [ ]:
ecg_filt = filter(b, a, ecg)        # ECG signal filtering
Out[0]:
108000-element Vector{Float64}:
 -0.2613604943993404
 -0.21250240167607903
 -0.26573066807757795
 -0.36614738177362005
 -0.4161172656214138
 -0.39324744502909514
 -0.32258636917030514
 -0.2986311375931025
 -0.3313726245576049
 -0.3962893283973373
 -0.40059139122729925
 -0.3890850840151229
 -0.36649244512364276
  ⋮
 -0.34206831520401354
 -0.3327934850479681
 -0.3324339179822744
 -0.347472287278352
 -0.3435299337450428
 -0.3383511370436336
 -0.33223499162723336
 -0.33419189908414015
 -0.31892828917862315
 -0.3344302495989344
 -0.320462905333482
 -0.3240091843406425

The spectrum of the filtered ECG signal is calculated and displayed on a previously constructed graph. This allows you to visually compare the spectral composition of the signal before and after filtering, as well as evaluate the effectiveness of network interference suppression in the 60 Hz region.

In [ ]:
freq1, S1_amp, _ = calc_fft(ecg_filt, fs)

plot!(p2, freq1, S1_amp, label="After filtering")
Out[0]:
In [ ]:
plot!(p1, t, ecg_filt, label="After filtering")
Out[0]:

After applying the notch filter, the spectral component at a frequency of 60 Hz is noticeably suppressed. The main energy of the ECG signal remains concentrated in the low-frequency region, so the useful waveform is preserved. This shows that the selected filter effectively reduces network interference and does not significantly distort the informative part of the ECG signal.

Detection of R-peaks

To increase the severity of QRS complexes and R peaks, the ECG signal derivative is calculated. The derivative makes it possible to identify sharp changes in the signal characteristic of the QRS complex region. After applying the function diff the number of samples is reduced by one, so using the function vcat A zero value is added to the beginning of the array, which allows you to save the original length of the signal.

In [ ]:
d_ecg = vcat(0.0, diff(ecg_filt))   # calculation of the ECG signal derivative

plot(t, d_ecg,
    xlabel="Time, from",
    ylabel="The amplitude of the derivative, rel. units",
    title="The derivative of the signal",
    legend=false
)
Out[0]:

After calculating the derivative, the signal is squared. This operation makes it possible to amplify areas with large amplitudes corresponding to QRS complexes and R peaks, as well as additionally suppress low-amplitude components of the signal. In addition, after squaring, all signal values become positive, which simplifies further processing and detection of R-peaks.

In [ ]:
sq_ecg = d_ecg .^ 2     # squaring the derivative of a signal

plot(t, sq_ecg,
    xlabel="Time, from",
    ylabel="Amplitude, relative units",
    title="The square of the derivative",
    legend=false
)
Out[0]:

After squaring the derivative, the signal is smoothed using the moving average method. To do this, a window of about 20 ms is set, which is converted into the number of samples, taking into account the sampling frequency. Next, a set of identical averaging coefficients is formed, the sum of which is equal to one. The application of such a window to the signal is performed using the function conv from the EngeeDSP library, which implements the convolution operation.

In [ ]:
# Smoothing the signal using the moving average method
win = max(3, Int(round(0.010 * fs)))
kernel = ones(win) ./ win

det_sig = conv(sq_ecg, kernel)  # signal smoothing using convolution
det_sig = det_sig[1:N]  # reducing the signal length to its original size

plot(t, det_sig,
    xlabel="Time, from",
    ylabel="Amplitude, relative units",
    title="Signal for detecting R-peaks",
    legend=false
)
Out[0]:

A prepared signal is used to detect R-peaks. det_sig, in which the regions of the QRS complexes become more pronounced. The detection threshold is calculated adaptively as the sum of the average signal value and 1.5 standard deviations. This method is better than a fixed threshold from the maximum, since it depends less on single emissions and takes into account the overall signal strength.

Parameter MinPeakDistance sets the minimum distance between adjacent peaks found. This is necessary so that multiple maxima are not detected within a single QRS complex. For the analysis of normal rhythm, bradycardia and tachycardia, a distance of about 0.30 s is used, since a value of 0.50 s limits the detection of rhythms with a frequency above 120 beats/min and can lead to missing peaks in tachycardia.

Function findpeaks from the EngeeDSP library, it searches for local maxima that meet the specified conditions in height and distance between peaks. As a result, an array of amplitudes of the peaks found is formed. pks and an array of their positions locs , which are then used to calculate the RR intervals.

In [ ]:
thr = mean(det_sig) + 1.5 * std(det_sig)    # adaptive detection threshold
min_dist = Int(round(0.5 * fs))   # minimum distance between R-peaks

# Search for R-peaks
pks, locs = findpeaks(
    det_sig, out=:data,
    MinPeakDistance = min_dist,
    MinPeakHeight  = thr
)
Out[0]:
(Ypk = [0.07705275825787196, 0.0916773825278909, 0.10834663376541775, 0.09481633031944532, 0.09963496404605718, 0.10460719300050016, 0.08650497487378789, 0.08319442366179106, 0.09247478104048162, 0.11295427417467355  …  0.1004942866529982, 0.08518680051583521, 0.08025327798569036, 0.08981339449670496, 0.0967951803374813, 0.08915943080450855, 0.08684800494412531, 0.08676418960626395, 0.09614281275354113, 0.08352697256693856], Xpk = [52, 349, 650, 932, 1206, 1493, 1780, 2083, 2381, 2680  …  105334, 105614, 105901, 106188, 106472, 106744, 107012, 107288, 107570, 107857], Wpk = [4.3850150608365865, 4.442932153811, 4.291772746874244, 4.367091061929614, 4.328827355369185, 4.38801284823262, 4.733936150335694, 4.452001752022625, 4.403056166576334, 4.284659123849451  …  4.387227236977196, 4.42038689427136, 4.44097986187262, 4.375775133899879, 4.294981876926613, 4.350528891707654, 4.336650176890544, 4.633688054411323, 4.2678951177804265, 4.363150334407692], Ppk = [0.07704879058713884, 0.09167628535186181, 0.10834571777219215, 0.09481181465597753, 0.09963224995512014, 0.10460611620966644, 0.08650225879249397, 0.0831901731489352, 0.09247192121249741, 0.11295348704301397  …  0.10049394311552719, 0.0851814030130436, 0.08024837323306237, 0.08981076469939231, 0.09679261413663849, 0.08915510722090965, 0.08684034685887856, 0.08675754683512554, 0.09613799667239774, 0.08351632852143341])

The results of the initial detection of R-peaks are displayed

In [ ]:
plot(t, det_sig,
    xlabel="Time, from",
    ylabel="Amplitude, relative units",
    title="Found R-peaks",
    legend=false
)
r_times = (locs .- 1) ./ fs   # converting the indices of the peaks found to seconds

scatter!(r_times, pks, markersize=3)  # display of the found peaks on the detection signal
Out[0]:

Visual verification of the result shows that the prepared signal is suitable for searching for R-peaks: the found maxima coincide with the regions of the QRS complexes, and the selected threshold and minimum distance parameters help to avoid repeated triggers within the same cardiac cycle.

Since the primary search is performed not by the ECG signal itself, but by the signal det_sig the position of the found maximum may be slightly shifted relative to the actual R-peak. Therefore, the coordinates are further refined: a small section of the filtered ECG signal is analyzed near each found position, and its local maximum is taken as the R-peak.

In [ ]:
# Clarifying the position of the R-peaks
halfwin = Int(round(0.05 * fs))   # the search box
r_locs = Int[]

for idx in locs
    # The boundaries of the area around the found peak
    l = max(1, idx - halfwin)
    r = min(N, idx + halfwin)

     # Searching for the maximum inside the selected area
    seg = ecg_filt[l:r]
    _, imax = findmax(seg)

    push!(r_locs, l + imax - 1)
end

r_locs = unique(sort(r_locs))   # removing duplicate indexes

r_times = (r_locs .- 1) ./ fs   # time of the found R-peaks
r_vals = ecg_filt[r_locs]       # the amplitudes of the found R-peaks

plot(t, ecg_filt,
    xlabel="Time, from",
    ylabel="Amplitude, mV",
    title="Found R-peaks",
    legend=false
)

scatter!(r_times, r_vals, markersize=3)
Out[0]:

The results show that the refined positions of the R-peaks correctly correspond to the maxima of the filtered ECG signal. This approach makes it possible to link the primary detection based on the prepared signal with the actual shape of the ECG and obtain more accurate time coordinates of the R-peaks.

Calculation of RR intervals

The time coordinates of the found R peaks are used to calculate the RR intervals. First, the difference between neighboring R-peaks is determined, that is, the duration of each cardiac cycle in seconds. Then the obtained values are converted to milliseconds, since this dimension is more convenient for analyzing the heart rate. Additionally, the instantaneous heart rate is calculated as the inverse of the RR interval.

In [ ]:
rr = diff(r_times)  # the difference between neighboring R-peaks, with

rr_ms = rr.* 1e3  # converting RR intervals to ms

rr_time = r_times[2:end]    # time axis for RR intervals

heart_rate = 60.0 ./ rr # Calculation of heart rate, beats/min

plot(rr_time, rr_ms,
    xlabel="Time, from",
    ylabel="RR-interval, ms",
    title="RR intervals",
    legend=false
)
Out[0]:

The graph shows the change in RR intervals over time. It can be used to assess the uniformity of the heart rate and see how much the duration of cardiac cycles varies on the selected fragment of the ECG signal.

A graph of changes in heart rate over time is constructed. Heart rate values are calculated using RR intervals: the smaller the distance between adjacent R peaks, the higher the heart rate, and vice versa.

In [ ]:
plot(rr_time, heart_rate,
    xlabel="Time, from",
    ylabel="Heart rate, beats/min",
    title="Heart rate",
    legend=false
)
Out[0]:

The resulting graph allows you to evaluate the dynamics of the heart rate on the selected fragment of the ECG signal. With a relatively stable rhythm, heart rate values change smoothly and are in a close range. Sudden jumps on the graph can be associated with real changes in the heart rate, as well as with errors in detecting R-peaks or artifacts in the signal.

Materials used

The work used real experimental electrocardiogram recordings obtained from the open source PhysioNet. The [MIT-BIH Arrhythmia Database] was used as the source data (https://physionet.org/content/mitdb/1.0.0 /). In this example, the entry is used 100, represented by files 100.hea and 100.dat.

Conclusion

In this example, digital ECG signal processing using the EngeeDSP library for detecting R-peaks and subsequent calculation of RR intervals and heart rate was considered.

The work of the example includes the following main steps:

  • Data download. The initial ECG data was obtained from the open database PhysioNet MIT-BIH Arrhythmia Database. A recording fragment was used for the analysis. 100 containing a real experimental ECG signal.
  • Spectral analysis. To estimate the frequency composition of the original signal, a one-sided amplitude spectrum was constructed. The spectrum revealed a pronounced component in the 60 Hz region, corresponding to network interference.
  • Signal filtering. A Butterworth notch filter was synthesized to suppress network interference. Using functions butter, freqz, filter from the EngeeDSP libraries.
  • Preparing the signal for detection. To isolate the areas of the QRS complexes, the derivative of the filtered ECG signal was calculated, the derivative was squared and smoothed using the moving average method. Smoothing is implemented using the function conv from the EngeeDSP library.
  • Detection of R-peaks. The function was used to search for local maxima findpeaks from the EngeeDSP library. The detection was performed taking into account the adaptive threshold and the minimum distance between adjacent peaks, which made it possible to avoid repeated triggers within the same cardiac cycle.
  • Clarifying the position of the R-peaks. After the initial detection, the positions of the R-peaks were clarified on the filtered ECG signal. This made it possible to link the search results for the prepared signal with the actual shape of the ECG signal.
  • Calculation of RR intervals and heart rate. Based on the time coordinates of the found R-peaks, the RR intervals and the instantaneous heart rate were calculated. The obtained graphs allow us to estimate the change in the duration of cardiac cycles and the dynamics of the heart rate on the selected recording fragment.

As a result of the example, the sequence of ECG signal processing was implemented from loading the source data to detecting R-peaks, calculating RR intervals and heart rate.