
Linear discriminant analysis (LDA)

This case study examines the application of Linear Discriminant Analysis (LDA) to Fisher's Iris dataset. A comparison with the Principal Component Analysis (PCA) method is also made.

Linear discriminant analysis (LDA) is a statistical technique that finds a linear combination of features that best separates observations belonging to different classes (two classes in Fisher's classical formulation).

In [ ]:
import Pkg
Pkg.add(["MultivariateStats", "RDatasets"])
   Resolving package versions...
  No Changes to `~/.project/Project.toml`
  No Changes to `~/.project/Manifest.toml`

Suppose the samples of the positive and negative classes have means $\mu_p$ (positive class) and $\mu_n$ (negative class), and covariance matrices $C_p$ and $C_n$.

According to Fisher's criterion for the linear discriminant, the optimal projection direction is given by the formula: $$w = \alpha \cdot (C_p + C_n)^{-1} (\mu_p - \mu_n),$$ where $\alpha$ is an arbitrary positive coefficient.
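
To make the formula concrete, the following minimal sketch computes the Fisher direction for two small synthetic classes. The data and the names Xp and Xn are illustrative assumptions and are not part of the Iris case study below.

In [ ]:
using Statistics

# Hypothetical two-class data: rows are features, columns are observations
Xp = randn(2, 100) .+ [ 2.0, 0.0]   # positive class, shifted mean
Xn = randn(2, 100) .+ [-2.0, 0.0]   # negative class

μp, μn = mean(Xp, dims=2), mean(Xn, dims=2)   # class means
Cp, Cn = cov(Xp, dims=2), cov(Xn, dims=2)     # class covariance matrices

w = (Cp + Cn) \ (μp - μn)   # optimal projection direction (with α = 1)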

Loading the required libraries:

In [ ]:
using MultivariateStats, RDatasets

Loading Fisher's Iris dataset:

In [ ]:
iris = dataset("datasets", "iris")
Out[0]:
150×5 DataFrame (125 rows omitted)
 Row │ SepalLength  SepalWidth  PetalLength  PetalWidth  Species
     │ Float64      Float64     Float64      Float64     Cat…
─────┼───────────────────────────────────────────────────────────
   1 │ 5.1          3.5         1.4          0.2         setosa
   2 │ 4.9          3.0         1.4          0.2         setosa
   3 │ 4.7          3.2         1.3          0.2         setosa
   4 │ 4.6          3.1         1.5          0.2         setosa
   5 │ 5.0          3.6         1.4          0.2         setosa
   6 │ 5.4          3.9         1.7          0.4         setosa
   7 │ 4.6          3.4         1.4          0.3         setosa
   8 │ 5.0          3.4         1.5          0.2         setosa
   9 │ 4.4          2.9         1.4          0.2         setosa
  10 │ 4.9          3.1         1.5          0.1         setosa
  11 │ 5.4          3.7         1.5          0.2         setosa
  12 │ 4.8          3.4         1.6          0.2         setosa
  13 │ 4.8          3.0         1.4          0.1         setosa
  ⋮  │ ⋮            ⋮           ⋮            ⋮           ⋮
 139 │ 6.0          3.0         4.8          1.8         virginica
 140 │ 6.9          3.1         5.4          2.1         virginica
 141 │ 6.7          3.1         5.6          2.4         virginica
 142 │ 6.9          3.1         5.1          2.3         virginica
 143 │ 5.8          2.7         5.1          1.9         virginica
 144 │ 6.8          3.2         5.9          2.3         virginica
 145 │ 6.7          3.3         5.7          2.5         virginica
 146 │ 6.7          3.0         5.2          2.3         virginica
 147 │ 6.3          2.5         5.0          1.9         virginica
 148 │ 6.5          3.0         5.2          2.0         virginica
 149 │ 6.2          3.4         5.4          2.3         virginica
 150 │ 5.9          3.0         5.1          1.8         virginica

Extracting from the dataset the feature matrix X (observations as columns) and the corresponding vector of class labels X_labels; every second row is taken, so 75 observations are used:

In [ ]:
X = Matrix(iris[1:2:end,1:4])'
X_labels = Vector(iris[1:2:end,5])
Out[0]:
75-element Vector{CategoricalArrays.CategoricalValue{String, UInt8}}:
 "setosa"
 "setosa"
 "setosa"
 "setosa"
 "setosa"
 "setosa"
 "setosa"
 "setosa"
 "setosa"
 "setosa"
 "setosa"
 "setosa"
 "setosa"
 ⋮
 "virginica"
 "virginica"
 "virginica"
 "virginica"
 "virginica"
 "virginica"
 "virginica"
 "virginica"
 "virginica"
 "virginica"
 "virginica"
 "virginica"

Let's compare linear discriminant analysis with the principal component analysis (PCA) method.

Training the PCA model:

In [ ]:
pca = fit(PCA, X; maxoutdim=2)
Out[0]:
PCA(indim = 4, outdim = 2, principalratio = 0.9741445733283195)

Pattern matrix (unstandardized loadings):
────────────────────────
         PC1         PC2
────────────────────────
1   0.70954    0.344711
2  -0.227592   0.29865
3   1.77976   -0.0797511
4   0.764206  -0.0453779
────────────────────────

Importance of components:
──────────────────────────────────────────────
                                PC1        PC2
──────────────────────────────────────────────
SS Loadings (Eigenvalues)  4.3068    0.216437
Variance explained         0.927532  0.0466128
Cumulative variance        0.927532  0.974145
Proportion explained       0.95215   0.04785
Cumulative proportion      0.95215   1.0
──────────────────────────────────────────────
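
The quantities printed in this summary can also be queried from the fitted model. A short sketch using accessor functions from MultivariateStats (assuming the pca object from the cell above):

In [ ]:
principalvars(pca)    # variances along each principal component (the eigenvalues)
principalratio(pca)   # fraction of the total variance retained by the projection
projection(pca)       # 4×2 projection matrix applied to the centred observations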

Applying PCA to the data:

In [ ]:
Ypca = predict(pca, X)
Out[0]:
2×75 Matrix{Float64}:
 2.71359    2.90321   2.75875   …  -2.39001   -1.51972   -1.87717
 0.238246  -0.233575  0.228345      0.333917  -0.297498   0.0985705
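
Since PCA is a linear projection, the 2-D scores can be mapped back to the original feature space. A brief sketch of the relative reconstruction error, assuming the X, pca and Ypca objects defined above:

In [ ]:
using LinearAlgebra

Xrec = reconstruct(pca, Ypca)   # back-project the 2-D scores into the 4-D feature space
norm(X - Xrec) / norm(X)        # relative reconstruction error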

Training the LDA model:

In [ ]:
lda = fit(MulticlassLDA, X, X_labels; outdim=2);
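
Like PCA, the fitted LDA model exposes its projection matrix; a one-line sketch assuming the lda object above:

In [ ]:
projection(lda)   # 4×2 matrix mapping the original features onto the two discriminant axes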

Applying LDA to the data:

In [ ]:
Ylda = predict(lda, X)
Out[0]:
2×75 Matrix{Float64}:
 -0.758539  -0.685016  -0.773267  …   0.976876   0.790049   0.84761
 -0.766144  -0.703192  -0.79546      -1.0143    -0.682331  -1.01696
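
The discriminant space lends itself to a simple nearest-class-mean rule. The following sketch builds such a classifier on the Ylda scores above; the nearest helper and the accuracy check are illustrative assumptions, not part of MultivariateStats:

In [ ]:
using Statistics, LinearAlgebra

# class centroids in the 2-D discriminant space
classes   = unique(X_labels)
centroids = [vec(mean(Ylda[:, X_labels .== c], dims=2)) for c in classes]

# assign each projected observation to the nearest centroid
nearest(y) = classes[argmin([norm(y - m) for m in centroids])]
predicted  = [nearest(Ylda[:, i]) for i in 1:size(Ylda, 2)]

# fraction of training observations assigned to their true class
mean(predicted .== X_labels)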

Visualising the results:

In [ ]:
using Plots

p = plot(layout=(1,2), size=(800,300))

# left panel: PCA projection, right panel: LDA projection, one colour per species
for s in ["setosa", "versicolor", "virginica"]
    points = Ypca[:, X_labels .== s]
    scatter!(p[1], points[1,:], points[2,:], label=s)

    points = Ylda[:, X_labels .== s]
    scatter!(p[2], points[1,:], points[2,:], label=false, legend=:bottomleft)
end

plot!(p[1], title="PCA")
plot!(p[2], title="LDA")
Out[0]:

Conclusions:

PCA and LDA are dimensionality reduction methods with different goals: PCA maximises the variance retained in the projection and is suitable for visualisation without considering class labels, whereas LDA optimises class separation using label information, which makes it effective for classification tasks. In the Fisher Iris example, LDA produced a clear separation of the classes in the projection, while PCA preserved the overall structure of the data but left the classes overlapping. The choice of method depends on the task: PCA for exploring the data, LDA for improving class separation when labelled classes are available.