Engee documentation

edr

The editing distance of the real signals.

Library

EngeeDSP

Syntax

Function call

  • dist = edr(x, y, tol) — returns Real signal editing distance between sequences x and y. Function edr returns the minimum number of items to be removed from x, y or from both sequences x and y so that the sum of the Euclidean distances between the remaining elements of the signal is within the specified tolerance tol.

  • dist, ix, iy = edr(x, y, tol) — also returns the conversion path where x(ix) and y(iy) they have the minimum possible distance between each other. If x and y — matrices, then ix and iy such that x(:,ix) and y(:,iy) minimally separated.

  • _ = edr(x, y, maxsamp) — restricts insertion operations so that the conversion path remains within maxsamp counts of a straight line between x and y. This syntax returns any of the output arguments of the previous syntaxes.

  • _ = edr(_, metric) — uses the distance metric in addition to any input arguments in previous syntaxes. Metric metric can take values of "euclidean", "absolute", "squared" or "symmkl".

Arguments

Input arguments

# x — input signal

+ vector | the matrix

Details

An input signal specified as a real or complex vector or matrix.

Типы данных

Float32, Float64

Support for complex numbers

Yes

# y — input signal

+ vector | the matrix

Details

An input signal specified as a real or complex vector or matrix.

Типы данных

Float32, Float64

Support for complex numbers

Yes

# tol — tolerance

+ scalar

Details

The tolerance is set as a positive scalar.

Типы данных

Float32, Float64

# maxsamp — width of the setup window

+ Inf (by default) | scalar

Details

The width of the setup window, set as a positive integer scalar.

Типы данных

Float32, Float64

# metric — distance metric

+ "euclidean" (by default) | "absolute" | "squared" | "symmkl"

Details

The distance metric set as one of the values: "euclidean", "absolute", "squared" or "symmkl". If and are -measured signals, then the metric sets — the distance between -m countdown and -m countdown . Learn more about see in the section Dynamic time conversion.

  • "euclidean" — the root of the sum of the squares of the differences, also known as the Euclidean metric or :

  • "absolute" — the sum of absolute differences, also known as the distance of city blocks, the Manhattan metric, the taxi metric, or :

  • "squared" — the square of the Euclidean metric, consisting of the sum of the squares of the differences:

  • "symmkl" — symmetric Kullback—Leibler metric. This metric is only true for real and positive numbers. and :

Output arguments

# dist — minimum distance

+ scalar

Details

The minimum distance between the signals, returned as a positive real scalar.

# ix, iy — conversion path

+ dimension

Details

The conversion path returned as index vectors. Arguments ix and iy they have the same length. Each vector contains a monotonically increasing sequence in which the indices of the elements of the corresponding signal x or y they are repeated the required number of times.

Examples

The editing distance between a cosine with varying frequency and a sinusoid with outliers

Details

We will generate two real signals: a cosine with varying frequency and a sinusoid. Let’s add an obviously drop-down section to each signal. Let’s draw the signals on a graph.

import EngeeDSP.Functions: edr


x = cos.(2 * pi * (3 * (1:1000) / 1000) .^ 2)
y = cos.(2 * pi * 9 * (1:399) / 400)

x[400:410] .= 7
y[100:115] .= 7

tol = 0.1

p1 = plot(1:length(x), x, linewidth=2, color=:blue)
plot!(1:length(y), y, linewidth=2, color=:red, linestyle=:dash, legend=false)
title!("Original signals")

edr 1

We transform the signals so that the distance between them is minimal. Specify the tolerance 0.1. Let’s plot the aligned signals after the transformation.

d,x1,y1 = edr(x, y, tol)

x_aligned = x[x1]
y_aligned = y[y1]


p2 = plot(1:length(x_aligned), x_aligned, linewidth=2, color=:blue)
plot!(1:length(y_aligned), y_aligned, linewidth=2, color=:red, linestyle=:dash, legend=false)
title!("Aligned signals")

edr 2

Additional Info

Real signal editing distance

Details

Two signals with equivalent characteristics arranged in the same order may look completely different due to the difference in the duration of their segments. Function edr transforms these durations in such a way that the corresponding characteristics appear in the same place on the common time axis, thereby emphasizing the similarity between the signals. The criterion used for conversion is designed to be emission-resistant.

Consider two -three dimensional signals:

and

which contain and counts, respectively. When set in metric the distance value between -m countdown and -m countdown , function edr stretches and for a common set of time points in such a way that the editing distance between the signals is minimal.

If available , a real number representing the tolerance set in tol, it is announced that -th countdown and -th countdown they match if . If there are two counts, and , do not match, they can be matched in any of three ways:

  1. Remove from the first signal, for example, when the next countdown coincides with . This deletion is equivalent to adding to the second signal and receive two consecutive matches.

  2. Extend the first signal by adding a countdown to the position corresponding to , and by shifting the remaining counts by one position. This addition is equivalent to deleting a mismatched one. from the second signal.

  3. Replace on in the first signal or, equivalently, delete and , and .

The edit distance is the total number of operations required to bring the two signals to match. This number is not unique. To calculate the minimum possible editing distance between and We proceed from the following facts:

  1. Two empty signals have zero distance between each other.

  2. The distance between the empty signal and the signal with counts are equal to because that is how many samples must be added to the empty signal in order to restore the other one. Similarly, — this is the number of samples that must be removed from the signal from count down to make it empty.

Create a matrix size on in such a way that

  1. ;

  2. for ;

  3. for ;

  4. For

The smallest editing distance between and in this case, it is equal to .

The conversion path via , resulting in this smallest transformation distance, is parameterized by two sequences of the same length, ix and iy, and represents a combination of moves of the "chess king":

  • Vertical passages: correspond to the removal of the reference from or adding a reference to . Each move increases the edit distance by 1;

  • Horizontal moves: correspond to the removal of the reference from or adding a reference to . Each move increases the edit distance by 1;

  • Diagonal moves: match if , or removing one sample from each signal if . Coincidences do not increase the distance. Deletions increase it by 1.

This structure ensures that any acceptable path aligns the full signals, does not skip samples, and does not repeat signal features. In addition, the desired path runs close to the diagonal line drawn between and . This is an additional limitation imposed by the argument maxsamp, ensures that segments of the same length are compared during the conversion.

The deterioration due to the coincidence of two samples does not depend on the difference in values between the samples. Two counts that differ slightly more than the tolerance are subject to the same degradation as two markedly different counts. For this reason, the editing distance is not affected by outliers. Conversely, repeating the samples to align the two signals has its costs, which is not the case with dynamic time conversion.

Literature

  1. Chen, Lei, M. Tamer Özsu, and Vincent Oria. «Robust and Fast Similarity Search for Moving Object Trajectories.» Proceedings of 24th ACM International Conference on management of Data (SIGMOD '05). 2005, pp. 491–502.

  2. Paliwal, K. K., Anant Agarwal, and Sarvajit S. Sinha. «A modification over Sakoe and Chiba’s Dynamic Time Warping Algorithm for Isolated Word Recognition.» Signal Processing. Vol. 4, 1982, pp. 329–333.

  3. Sakoe, Hiroaki, and Seibi Chiba. «Dynamic Programming Algorithm Optimization for Spoken Word Recognition.» IEEE® Transactions on Acoustics, Speech, and Signal Processing. Vol. ASSP-26, no. 1, 1978, pp. 43–49.