Building and training a neural network for handwritten digit recognition
In this example, we walk through data preprocessing and the training of a neural network model for image classification. The MNIST dataset, which contains 70,000 labeled images of handwritten digits, serves as the set of observations. The example uses a .csv file in which each image is flattened into a table row containing the brightness value of every pixel.
Installing and loading the libraries for data processing:
using Pkg
Pkg.add(["Colors", "CSV", "DataFrames", "Flux", "Optimisers", "Plots"])
using CSV, DataFrames
Loading data into a variable:
df = DataFrame(CSV.File("$(@__DIR__)/mnist_784.csv"));
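As a quick sanity check, the table should have 70,000 rows and 785 columns (784 pixel columns plus one class column), assuming the standard mnist_784 layout:
size(df)   # expected: (70000, 785)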
Displaying the first five rows of the DataFrame:
first(df,5)
Displaying the first five rows of the last few columns; the final column indicates the class each observation belongs to:
df[1:5,780:785]
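If the rows of the file happen to be ordered by class, a split by position can produce skewed samples. A minimal sketch of an optional row shuffle before splitting, using the standard Random library (the seed value is arbitrary):
using Random
Random.seed!(42)                 # fix the seed for reproducibility
df = df[randperm(nrow(df)), :]   # random permutation of the rows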
Splitting the dataset into training and test samples in an 80/20 ratio:
X_train, y_train = Matrix(df[1:56000,1:784]), df[1:56000,785]
X_test, y_test = Matrix(df[56001:end,1:784]), df[56001:end,785]
Converting the samples to types suitable for neural network processing:
X_train, X_test = convert(Matrix{Float32}, X_train), convert(Matrix{Float32}, X_test)
y_train, y_test = convert(Vector{Float32}, y_train), convert(Vector{Float32}, y_test)
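The pixel intensities range from 0 to 255. An optional extra step, not part of the original pipeline below, is to scale them to [0, 1], which often speeds up convergence:
X_train, X_test = X_train ./ 255f0, X_test ./ 255f0   # optional intensity normalization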
Loading the library for data visualization:
using Plots
Displaying an object and its class:
test_img = Vector(df[60000,1:784])
test_img = reshape(test_img, 28, 28) / 255   # scale pixel intensities to [0, 1]
using Colors
println("Object class: ", df[60000,785])
plot(Gray.(test_img'))   # transpose so the digit is displayed upright (reshape is column-major)
The final transformation of the data for neural network processing (Flux expects features along the first dimension and observations along the second):
X_train, X_test = X_train', X_test'
y_train, y_test = y_train', y_test'
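A quick check of the resulting shapes:
size(X_train), size(y_train)   # expected: ((784, 56000), (1, 56000))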
Loading the machine learning libraries:
using Flux, Optimisers;
Defining the structure of a neural network:
model = Chain(
    Dense(784, 15, elu),
    Dense(15, 10, sigmoid),
    softmax
)
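A forward pass on a single sample confirms that the network outputs a probability distribution over the ten digit classes:
model(X_train[:, 1:1])   # a 10×1 column of class probabilities summing to 1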
Before training, the model's predictions are essentially random; we verify this below, once the accuracy function has been defined.
Defining the training parameters:
learning_rate = 0.01f0
opt = Optimisers.Adam(learning_rate)   # Adam optimizer with the chosen learning rate
state = Optimisers.setup(opt, model)   # optimizer state tied to the model parameters
Defining the loss function (mean squared error between the softmax output and one-hot targets):
function loss(model, x, y)
    y_oh = Flux.onehotbatch(y, 0:9)                  # one-hot targets, size (10, 1, N) since y is a 1×N row
    y_pred = model(x)                                # predictions, size (10, N)
    y_pred_reshaped = Flux.unsqueeze(y_pred, dims=2) # add a dimension to match y_oh: now (10, 1, N)
    return Flux.mse(y_pred_reshaped, y_oh)
end
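MSE on one-hot targets works, but cross-entropy is the more common companion to a softmax output layer. A sketch of this alternative (loss_ce is a hypothetical name; it is not used in the training below):
function loss_ce(model, x, y)
    y_oh = Flux.onehotbatch(vec(y), 0:9)    # drop the singleton row dimension: targets of size (10, N)
    return Flux.crossentropy(model(x), y_oh)
end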
Defining a function to calculate the accuracy of the model:
function accuracy(model, X, y)
    correct = 0
    for i in 1:length(y)
        x_input = reshape(X[:, i], :, 1)        # input preparation: add a batch dimension, (features, 1)
        probs = model(x_input)                  # model prediction, size (10, 1)
        predicted_digit = argmax(probs)[1] - 1  # index of the largest probability, shifted to the digit 0-9
        if predicted_digit == y[i]              # comparison with the true label
            correct += 1
        end
    end
    return correct / length(y)
end
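The test recognition result before training; an untrained network should score close to the 10% random-guess baseline:
accuracy(model, X_test, y_test) * 100   # roughly 10% before training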
The iterative training process of the model:
loss_history = Float32[]
epochs = 100
for epoch in 1:epochs
    global state, model   # update the top-level bindings from inside the loop
    # Calculating gradients
    grads = gradient(model) do m
        loss(m, X_train, y_train)
    end
    # Updating the optimizer state and the model
    state, model = Optimisers.update(state, model, grads[1])
    # Calculating and recording the loss
    current_loss = loss(model, X_train, y_train)
    push!(loss_history, current_loss)
    # Calculating accuracy on the test sample
    acc = accuracy(model, X_test, y_test) * 100
    # Logging the first and every tenth epoch
    if epoch == 1 || epoch % 10 == 0
        println("Epoch $epoch: Training Loss = $current_loss, Accuracy = $acc%")
    end
end
Visualization of changes in the loss function at each training step:
plot(1:epochs, loss_history, title="Loss function during training", xlabel="Epoch", ylabel="Loss", legend=false)
Displaying results:
number = 3000
test_img = Vector(df[56000+number, 1:784])
test_img = reshape(test_img, 28, 28) / 255   # scale pixel intensities to [0, 1]
result = model(X_test[:, number])
println("Known object class: ", df[56000+number, 785], "\nThe digit recognized by the neural network: ", argmax(result) - 1)
plot(Gray.(test_img'))   # transpose so the digit is displayed upright
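To reuse the trained model later, it can be serialized. A minimal sketch using the BSON package (an assumption: BSON is not among the packages installed above and would need Pkg.add("BSON") first):
using BSON
BSON.@save "mnist_model.bson" model   # write the trained model to disk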
Conclusion
In this example, the pixel brightness data was preprocessed, and the neural network architecture, optimizer parameters, and loss function were defined.
The trained model achieved reasonably accurate, though not perfect, classification. Recognition quality could be improved by modifying the layer architecture of the network or by enlarging the training sample.