Engee documentation
Notebook

Linear Discriminant Analysis (LDA)

In this example, we will apply linear discriminant analysis (LDA) to Fisher's Iris dataset and compare the result with principal component analysis (PCA).

Linear discriminant analysis (LDA) is a statistical method that finds a linear combination of features separating observations into two classes; its multi-class generalization, used later in this example, separates several classes at once.

In [ ]:
import Pkg
Pkg.add(["MultivariateStats", "RDatasets"])
   Resolving package versions...
  No Changes to `~/.project/Project.toml`
  No Changes to `~/.project/Manifest.toml`

Suppose the samples of the positive and negative classes have mean values $\mu_p$ (for the positive class) and $\mu_n$ (for the negative class), as well as covariance matrices $\Sigma_p$ and $\Sigma_n$.

According to Fisher's criterion for the linear discriminant, the optimal projection direction is given by the formula

$$w = \alpha \, (\Sigma_p + \Sigma_n)^{-1} (\mu_p - \mu_n),$$

where $\alpha$ is an arbitrary non-negative coefficient.
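
This direction can be computed directly from the formula. Below is a minimal sketch, assuming hypothetical d×n matrices Xp and Xn that hold the positive- and negative-class samples column-wise (these names are not part of the example that follows):

using Statistics, LinearAlgebra

μp = vec(mean(Xp, dims=2))    # class means
μn = vec(mean(Xn, dims=2))
Σp = cov(Xp, dims=2)          # class covariances (observations in columns)
Σn = cov(Xn, dims=2)
w  = (Σp + Σn) \ (μp - μn)    # optimal projection direction, taking α = 1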

Installing and loading the required libraries:

In [ ]:
using MultivariateStats, RDatasets

Loading Fisher's Iris dataset:

In [ ]:
iris = dataset("datasets", "iris")
Out[0]:
150×5 DataFrame (125 rows omitted)
 Row │ SepalLength  SepalWidth  PetalLength  PetalWidth  Species
     │ Float64      Float64     Float64      Float64     Cat…
─────┼───────────────────────────────────────────────────────────
   1 │         5.1         3.5          1.4         0.2  setosa
   2 │         4.9         3.0          1.4         0.2  setosa
   3 │         4.7         3.2          1.3         0.2  setosa
   4 │         4.6         3.1          1.5         0.2  setosa
   5 │         5.0         3.6          1.4         0.2  setosa
   6 │         5.4         3.9          1.7         0.4  setosa
   7 │         4.6         3.4          1.4         0.3  setosa
   8 │         5.0         3.4          1.5         0.2  setosa
   9 │         4.4         2.9          1.4         0.2  setosa
  10 │         4.9         3.1          1.5         0.1  setosa
  11 │         5.4         3.7          1.5         0.2  setosa
  12 │         4.8         3.4          1.6         0.2  setosa
  13 │         4.8         3.0          1.4         0.1  setosa
  ⋮  │      ⋮           ⋮            ⋮           ⋮          ⋮
 139 │         6.0         3.0          4.8         1.8  virginica
 140 │         6.9         3.1          5.4         2.1  virginica
 141 │         6.7         3.1          5.6         2.4  virginica
 142 │         6.9         3.1          5.1         2.3  virginica
 143 │         5.8         2.7          5.1         1.9  virginica
 144 │         6.8         3.2          5.9         2.3  virginica
 145 │         6.7         3.3          5.7         2.5  virginica
 146 │         6.7         3.0          5.2         2.3  virginica
 147 │         6.3         2.5          5.0         1.9  virginica
 148 │         6.5         3.0          5.2         2.0  virginica
 149 │         6.2         3.4          5.4         2.3  virginica
 150 │         5.9         3.0          5.1         1.8  virginica

Extracting from the dataset the feature matrix X (every second observation, transposed so that features are rows and observations are columns) and the vector X_labels with the class of each object:

In [ ]:
X = Matrix(iris[1:2:end,1:4])'
X_labels = Vector(iris[1:2:end,5])
Out[0]:
75-element Vector{CategoricalArrays.CategoricalValue{String, UInt8}}:
 "setosa"
 "setosa"
 "setosa"
 "setosa"
 "setosa"
 "setosa"
 "setosa"
 "setosa"
 "setosa"
 "setosa"
 "setosa"
 "setosa"
 "setosa"
 ⋮
 "virginica"
 "virginica"
 "virginica"
 "virginica"
 "virginica"
 "virginica"
 "virginica"
 "virginica"
 "virginica"
 "virginica"
 "virginica"
 "virginica"

Let's compare linear discriminant analysis with principal component analysis (PCA).

Training the PCA model:

In [ ]:
pca = fit(PCA, X; maxoutdim=2)
Out[0]:
PCA(indim = 4, outdim = 2, principalratio = 0.9741445733283195)

Pattern matrix (unstandardized loadings):
────────────────────────
         PC1         PC2
────────────────────────
1   0.70954    0.344711
2  -0.227592   0.29865
3   1.77976   -0.0797511
4   0.764206  -0.0453779
────────────────────────

Importance of components:
──────────────────────────────────────────────
                                PC1        PC2
──────────────────────────────────────────────
SS Loadings (Eigenvalues)  4.3068    0.216437
Variance explained         0.927532  0.0466128
Cumulative variance        0.927532  0.974145
Proportion explained       0.95215   0.04785
Cumulative proportion      0.95215   1.0
──────────────────────────────────────────────
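
If needed, the fitted model can be queried for these quantities directly; a short sketch using MultivariateStats accessors on the pca object above:

projection(pca)       # 4×2 matrix that maps observations onto the principal components
principalvars(pca)    # variances along each principal component
principalratio(pca)   # fraction of the total variance retained (≈ 0.974 here)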

Applying PCA to the data:

In [ ]:
Ypca = predict(pca, X)
Out[0]:
2×75 Matrix{Float64}:
 2.71359    2.90321   2.75875   …  -2.39001   -1.51972   -1.87717
 0.238246  -0.233575  0.228345      0.333917  -0.297498   0.0985705
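
Since about 97% of the variance is retained, the original four-dimensional observations can be approximated from the two-dimensional scores. A small sketch using the reconstruct function from MultivariateStats:

Xr = reconstruct(pca, Ypca)   # map the 2-D scores back to the original 4-D feature space
maximum(abs.(Xr .- X))        # largest deviation of the rank-2 approximation from the data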

Training the LDA model:

In [ ]:
lda = fit(MulticlassLDA, X, X_labels; outdim=2);

Applying LDA to the data:

In [ ]:
Ylda = predict(lda, X)
Out[0]:
2×75 Matrix{Float64}:
 -0.758539  -0.685016  -0.773267  …   0.976876   0.790049   0.84761
 -0.766144  -0.703192  -0.79546      -1.0143    -0.682331  -1.01696
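
Because the LDA projection is driven by class labels, it can serve as a basis for classification. As a hypothetical follow-up (not part of the original example), here is a minimal nearest-centroid classifier in the 2-D LDA space, using only the variables defined above:

using Statistics, LinearAlgebra

species   = unique(X_labels)
# class centroids in the LDA space
centroids = [vec(mean(Ylda[:, X_labels .== s], dims=2)) for s in species]
# assign each projected observation to the class with the nearest centroid
predicted = [species[argmin([norm(Ylda[:, i] - c) for c in centroids])]
             for i in 1:size(Ylda, 2)]
accuracy  = count(predicted .== X_labels) / length(X_labels)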

Visualizing the results:

In [ ]:
using Plots
p = plot(layout=(1,2), size=(800,300))

for s in ["setosa", "versicolor", "virginica"]

    points = Ypca[:,X_labels.==s]
    scatter!(p[1], points[1,:],points[2,:], label=s)
    points = Ylda[:,X_labels.==s]
    scatter!(p[2], points[1,:],points[2,:], label=false, legend=:bottomleft)

end
plot!(p[1], title="PCA")
plot!(p[2], title="LDA")
Out[0]: [figure: scatter plots of the PCA projection (left) and the LDA projection (right)]

Conclusions:

PCA and LDA are dimensionality reduction methods with different goals: PCA maximizes the global variance of the data and is suitable for visualization without class labels, while LDA optimizes class separation using label information, which makes it effective for classification tasks. In the example with Fisher's Iris dataset, LDA produced a clear separation of the classes in the projection, while PCA preserved the overall structure of the data but with overlapping classes. The choice of method depends on the task: PCA for exploratory data analysis, LDA for improving classification when labeled classes are available.