Linear discriminant analysis (LDA)¶
This case study examines the application of Linear Discriminant Analysis (LDA) to Fisher's Iris dataset. A comparison with the Principal Component Analysis (PCA) method is also made.
Linear discriminant analysis (LDA) is a statistical technique that finds a linear combination of features separating observations from two or more classes.
import Pkg
Pkg.add(["MultivariateStats", "RDatasets"])
Suppose that the samples of the positive and negative classes have mean vectors
$\mu_p$ (for the positive class) and
$\mu_n$ (for the negative class), and covariance matrices $C_p$ and $C_n$.
According to Fisher's criterion for the linear discriminant, the optimal projection direction is given by the formula: $$w = \alpha \cdot (C_p + C_n)^{-1} (\mu_p - \mu_n),$$ where $\alpha$ is an arbitrary positive coefficient.
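As a sketch, the formula can be checked numerically on a small synthetic two-class example; the data and variable names below are illustrative and not part of the tutorial:

```julia
using Statistics, LinearAlgebra

# Two toy classes in 2-D; columns are samples
Xp = [1.0 2.0 3.0; 1.0 2.5 3.5]   # "positive" class
Xn = [4.0 5.0 6.0; 0.5 1.0 0.0]   # "negative" class

μp = vec(mean(Xp, dims=2))        # class means
μn = vec(mean(Xn, dims=2))
Cp = cov(Xp; dims=2)              # class covariance matrices
Cn = cov(Xn; dims=2)

# Fisher's optimal projection direction, taking α = 1
w = (Cp + Cn) \ (μp - μn)

# The projected class means are well separated along w
println(w' * μp)
println(w' * μn)
```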
Loading the required libraries:
using MultivariateStats, RDatasets
Loading data from Fisher's Iris dataset:
iris = dataset("datasets", "iris")
Extracting from the dataset the matrix of observations with their features, X, and the vector of class labels for these observations, X_labels (every second row of the dataset is used):
X = Matrix(iris[1:2:end, 1:4])'      # features in rows, samples in columns
X_labels = Vector(iris[1:2:end, 5])  # species label for each sample
Let's compare linear discriminant analysis with the PCA (principal component analysis) method.
Training the PCA model:
pca = fit(PCA, X; maxoutdim=2)
Applying PCA to data:
Ypca = predict(pca, X)
Training the LDA model:
lda = fit(MulticlassLDA, X, X_labels; outdim=2);
Applying LDA to data:
Ylda = predict(lda, X)
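As an illustrative follow-up (not part of the original tutorial), the 2-D LDA projection can be used directly for simple nearest-centroid classification; the sketch below assumes Ylda and X_labels from the cells above:

```julia
using Statistics, LinearAlgebra

# Class centroids in the projected LDA space
classes = unique(X_labels)
centroids = [vec(mean(Ylda[:, X_labels .== s], dims=2)) for s in classes]

# Assign every sample to the class of the nearest centroid
pred = [classes[argmin([norm(Ylda[:, i] - c) for c in centroids])]
        for i in 1:size(Ylda, 2)]

# Fraction of correctly classified (training) samples
accuracy = mean(pred .== X_labels)
```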
Visualising the results:¶
using Plots
p = plot(layout=(1,2), size=(800,300))
for s in ["setosa", "versicolor", "virginica"]
    points = Ypca[:, X_labels .== s]
    scatter!(p[1], points[1,:], points[2,:], label=s)
    points = Ylda[:, X_labels .== s]
    scatter!(p[2], points[1,:], points[2,:], label=false, legend=:bottomleft)
end
plot!(p[1], title="PCA")
plot!(p[2], title="LDA")
Conclusions:¶
PCA and LDA are dimensionality-reduction methods with different goals: PCA maximises the global variance of the data and is suitable for visualisation without considering class labels, whereas LDA optimises class separation using label information, making it effective for classification tasks. In the Fisher Iris example, LDA produced a clear separation of the classes in the projection, while PCA preserved the overall structure of the data but left the classes overlapping. The choice of method depends on the task: PCA for exploring the data, LDA for improving class separation when labelled classes are available.