Engee documentation
Notebook

Building and training a neural network for handwritten digit recognition

In this example, we walk through data preparation and the training of a neural-network model for image classification. The MNIST dataset, which contains 70,000 labeled images of handwritten digits, serves as the set of observations. The example uses a .csv file in which each image is flattened into tabular data holding the brightness value of every pixel.
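The flattened layout can be sketched as follows: each CSV row stores 28 × 28 = 784 brightness values plus a class label, and an image is recovered by reshaping the pixel columns back into a square matrix (the values below are random stand-ins, not actual dataset rows):

```julia
# Each MNIST row: 784 pixel intensities (0–255) followed by the digit label.
row = rand(0:255, 784)      # stand-in for one CSV row's pixel columns
img = reshape(row, 28, 28)  # recover the 28×28 image
size(img)                   # (28, 28)
```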

Connecting libraries for data processing:

In [ ]:
Pkg.add(["Colors", "CSV", "DataFrames", "Flux", "Optimisers", "Plots"])
In [ ]:
using CSV, DataFrames

Loading data into a variable:

In [ ]:
df = DataFrame(CSV.File("$(@__DIR__)/mnist_784.csv")); 

Displaying the first five rows of the dataframe:

In [ ]:
first(df,5)
Out[0]:
5×785 DataFrame (685 columns omitted)
Columns pixel1 … pixel100 shown (all Int64); every displayed value in the first five rows is 0, since the outer border pixels of MNIST images are blank.

Displaying the first five rows together with the last column, which indicates the class each observation belongs to:

In [ ]:
df[1:5,780:785]
Out[0]:
5×6 DataFrame
 Row │ pixel780  pixel781  pixel782  pixel783  pixel784  class
     │ Int64     Int64     Int64     Int64     Int64     Int64
─────┼────────────────────────────────────────────────────────
   1 │        0         0         0         0         0      5
   2 │        0         0         0         0         0      0
   3 │        0         0         0         0         0      4
   4 │        0         0         0         0         0      1
   5 │        0         0         0         0         0      9

Splitting the data set into training and test samples in an 80/20 ratio:

In [ ]:
X_train, y_train = Matrix(df[1:56000,1:784]), df[1:56000,785]
X_test, y__test = Matrix(df[56001:end,1:784]), df[56001:end,785]
Out[0]:
([0 0 … 0 0; 0 0 … 0 0; … ; 0 0 … 0 0; 0 0 … 0 0], [1, 8, 5, 9, 8, 0, 3, 1, 3, 2  …  7, 8, 9, 0, 1, 2, 3, 4, 5, 6])
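The split above simply takes the first 56,000 rows for training. If the file is not pre-shuffled, a random permutation of the indices avoids ordering bias; a minimal sketch (the seed value 42 is an arbitrary choice, and `df` is assumed loaded as above):

```julia
using Random

# Shuffled 80/20 split over the 70,000 row indices.
n = 70_000
idx = shuffle(MersenneTwister(42), 1:n)
train_idx, test_idx = idx[1:56_000], idx[56_001:end]
```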

Converting the samples to types suitable for neural-network processing:

In [ ]:
X_train, X_test = convert(Matrix{Float32}, X_train), convert(Matrix{Float32}, X_test)
y_train, y__test = convert(Vector{Float32}, y_train), convert(Vector{Float32}, y__test)
Out[0]:
(Float32[5.0, 0.0, 4.0, 1.0, 9.0, 2.0, 1.0, 3.0, 1.0, 4.0  …  4.0, 0.0, 9.0, 0.0, 6.0, 1.0, 2.0, 2.0, 3.0, 3.0], Float32[1.0, 8.0, 5.0, 9.0, 8.0, 0.0, 3.0, 1.0, 3.0, 2.0  …  7.0, 8.0, 9.0, 0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
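After conversion, the pixel values still lie in the 0–255 range. Scaling inputs to [0, 1] is an optional but common preprocessing step that often speeds up convergence; a one-line sketch (the function name `scale01` is illustrative):

```julia
# Scale raw brightness values from [0, 255] to [0.0, 1.0].
scale01(X) = X ./ 255f0
```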

Connecting a library for data visualization:

In [ ]:
using Plots

Displaying an object and its class:

In [ ]:
test_img = Vector(df[60000,1:784])
test_img = (reshape(test_img, 28, 28)) / 256
using Colors
println("Object class: ", df[60000,785])
plot(Gray.(test_img))
Object class: 8
Out[0]:

Final transformation of the data for processing by the neural network:

In [ ]:
X_train, X_test = X_train', X_test'
y_train, y__test = y_train', y__test'
Out[0]:
(Float32[5.0 0.0 … 3.0 3.0], Float32[1.0 8.0 … 5.0 6.0])
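The transpose is needed because Flux's `Dense` layers treat each column as one sample: after the adjoint, `X_train` has size 784 × 56,000. A quick sketch of the layout convention with stand-in data:

```julia
# Flux convention: features along rows, samples along columns.
X = rand(Float32, 100, 784)'  # lazy adjoint: now 784 features × 100 samples
size(X)                       # (784, 100)
```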

Connecting the machine learning library:

In [ ]:
using Flux, Optimisers;

Defining the structure of a neural network:

In [ ]:
model = Chain(
    Dense(784, 15, elu),
    Dense(15, 10, sigmoid),
    softmax
)
Out[0]:
Chain(
  Dense(784 => 15, elu),                # 11_775 parameters
  Dense(15 => 10, σ),                   # 160 parameters
  NNlib.softmax,
)                   # Total: 4 arrays, 11_935 parameters, 46.871 KiB.

Defining learning parameters:

In [ ]:
learning_rate = 0.01f0
opt = Optimisers.Adam(learning_rate)
state = Optimisers.setup(opt, model)

function loss(model, x, y)
    y_oh = Flux.onehotbatch(y, 0:9)   # size (10, 1, N), since y is a 1×N row
    y_pred = model(x)                 # size (10, N)

    # Add a singleton dimension so the shape matches y_oh
    y_pred_reshaped = Flux.unsqueeze(y_pred, dims=2)  # now (10, 1, N)

    return Flux.mse(y_pred_reshaped, y_oh)
end
Out[0]:
loss (generic function with 1 method)
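MSE over one-hot targets works, but cross-entropy is the conventional loss for classification, and it is usually computed on raw logits with the final `sigmoid`/`softmax` removed from the model. A hedged sketch of that alternative (the names `logit_model` and `loss_ce` are illustrative, not part of the example above):

```julia
using Flux

# Alternative: cross-entropy on logits; note the head has no sigmoid/softmax.
logit_model = Chain(Dense(784, 15, elu), Dense(15, 10))

# vec(y) flattens the 1×N row of labels; Int.() makes the labels integer digits.
loss_ce(m, x, y) = Flux.logitcrossentropy(m(x), Flux.onehotbatch(Int.(vec(y)), 0:9))
```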

Defining a function to calculate the accuracy of the model:

In [ ]:
function accuracy(model, X, y)
    correct = 0
    for i in 1:length(y)
        # Prepare the input: add a batch dimension
        x_input = reshape(X[:, i], :, 1)  # (features, 1)

        # Model prediction
        probs = model(x_input)  # size (10, 1)

        # Convert the index of the largest probability to a digit
        predicted_digit = argmax(probs)[1] - 1

        # Compare with the true label
        if predicted_digit == y[i]
            correct += 1
        end
    end
    return correct / length(y)
end
Out[0]:
accuracy (generic function with 1 method)
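The per-sample loop above is easy to follow, but with Flux's `onecold` the same accuracy can be computed in a single batched call. A sketch under the same data layout (features × samples, labels 0–9; the name `accuracy_fast` is illustrative):

```julia
using Flux, Statistics

# Batched accuracy: onecold maps each probability column back to a digit 0–9.
accuracy_fast(model, X, y) = mean(Flux.onecold(model(X), 0:9) .== vec(y))
```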

The iterative learning process of the model:

In [ ]:
loss_history = []
epochs = 100

for epoch in 1:epochs
    # Compute gradients
    grads = gradient(model) do m
        loss(m, X_train, y_train)
    end

    # Update the model and optimiser state
    state, model = Optimisers.update(state, model, grads[1])

    # Compute and record the training loss
    current_loss = loss(model, X_train, y_train)
    push!(loss_history, current_loss)

    # Compute test accuracy
    acc = accuracy(model, X_test, y__test) * 100

    # Log every epoch
    println("Epoch $epoch: Training Loss = $current_loss, Accuracy = $acc%")
end
Epoch 1: Training Loss = 0.08964709, Accuracy = 12.314285714285713%
Epoch 2: Training Loss = 0.08764407, Accuracy = 13.814285714285715%
Epoch 3: Training Loss = 0.08620157, Accuracy = 14.399999999999999%
Epoch 4: Training Loss = 0.08494053, Accuracy = 14.371428571428572%
Epoch 5: Training Loss = 0.08356112, Accuracy = 15.828571428571427%
Epoch 6: Training Loss = 0.082328424, Accuracy = 19.071428571428573%
Epoch 7: Training Loss = 0.08158612, Accuracy = 23.864285714285714%
Epoch 8: Training Loss = 0.08063085, Accuracy = 28.221428571428568%
Epoch 9: Training Loss = 0.07972546, Accuracy = 31.321428571428573%
Epoch 10: Training Loss = 0.078886844, Accuracy = 34.30714285714286%
Epoch 11: Training Loss = 0.07808643, Accuracy = 38.44285714285714%
Epoch 12: Training Loss = 0.077700436, Accuracy = 41.84285714285714%
Epoch 13: Training Loss = 0.07742899, Accuracy = 44.48571428571428%
Epoch 14: Training Loss = 0.07700219, Accuracy = 47.35714285714286%
Epoch 15: Training Loss = 0.07660475, Accuracy = 49.471428571428575%
Epoch 16: Training Loss = 0.07617868, Accuracy = 51.41428571428571%
Epoch 17: Training Loss = 0.07569719, Accuracy = 53.22857142857143%
Epoch 18: Training Loss = 0.07526767, Accuracy = 54.478571428571435%
Epoch 19: Training Loss = 0.07488683, Accuracy = 55.72142857142857%
Epoch 20: Training Loss = 0.074611954, Accuracy = 56.57142857142857%
Epoch 21: Training Loss = 0.07444562, Accuracy = 57.107142857142854%
Epoch 22: Training Loss = 0.074215636, Accuracy = 58.25714285714285%
Epoch 23: Training Loss = 0.073985055, Accuracy = 59.58571428571429%
Epoch 24: Training Loss = 0.07374545, Accuracy = 60.621428571428574%
Epoch 25: Training Loss = 0.07343423, Accuracy = 61.76428571428572%
Epoch 26: Training Loss = 0.07313286, Accuracy = 63.29285714285714%
Epoch 27: Training Loss = 0.0728745, Accuracy = 64.26428571428572%
Epoch 28: Training Loss = 0.0726323, Accuracy = 64.92142857142858%
Epoch 29: Training Loss = 0.0724468, Accuracy = 65.7%
Epoch 30: Training Loss = 0.07231355, Accuracy = 66.12857142857142%
Epoch 31: Training Loss = 0.07217003, Accuracy = 66.67142857142856%
Epoch 32: Training Loss = 0.07196423, Accuracy = 67.48571428571428%
Epoch 33: Training Loss = 0.07176558, Accuracy = 68.5%
Epoch 34: Training Loss = 0.07160473, Accuracy = 69.43571428571428%
Epoch 35: Training Loss = 0.07146187, Accuracy = 70.53571428571429%
Epoch 36: Training Loss = 0.0712925, Accuracy = 71.92857142857143%
Epoch 37: Training Loss = 0.07111777, Accuracy = 73.78571428571429%
Epoch 38: Training Loss = 0.07095047, Accuracy = 75.75714285714285%
Epoch 39: Training Loss = 0.07079176, Accuracy = 77.77857142857142%
Epoch 40: Training Loss = 0.07066103, Accuracy = 79.02142857142857%
Epoch 41: Training Loss = 0.070562966, Accuracy = 80.27142857142857%
Epoch 42: Training Loss = 0.070510894, Accuracy = 80.86428571428571%
Epoch 43: Training Loss = 0.070434295, Accuracy = 81.25%
Epoch 44: Training Loss = 0.07028215, Accuracy = 81.69285714285715%
Epoch 45: Training Loss = 0.070133194, Accuracy = 82.0%
Epoch 46: Training Loss = 0.07004519, Accuracy = 82.07142857142857%
Epoch 47: Training Loss = 0.06997859, Accuracy = 82.39999999999999%
Epoch 48: Training Loss = 0.06991084, Accuracy = 82.95%
Epoch 49: Training Loss = 0.06985075, Accuracy = 83.32857142857144%
Epoch 50: Training Loss = 0.06978005, Accuracy = 83.51428571428572%
Epoch 51: Training Loss = 0.0697167, Accuracy = 83.62857142857143%
Epoch 52: Training Loss = 0.069677204, Accuracy = 83.85000000000001%
Epoch 53: Training Loss = 0.06961788, Accuracy = 83.77857142857142%
Epoch 54: Training Loss = 0.069562666, Accuracy = 83.89999999999999%
Epoch 55: Training Loss = 0.069532044, Accuracy = 84.07857142857142%
Epoch 56: Training Loss = 0.069504425, Accuracy = 84.26428571428572%
Epoch 57: Training Loss = 0.06948212, Accuracy = 84.41428571428573%
Epoch 58: Training Loss = 0.06941597, Accuracy = 84.68571428571428%
Epoch 59: Training Loss = 0.06936886, Accuracy = 84.85714285714285%
Epoch 60: Training Loss = 0.06936117, Accuracy = 85.07142857142857%
Epoch 61: Training Loss = 0.069321334, Accuracy = 85.28571428571429%
Epoch 62: Training Loss = 0.06929054, Accuracy = 85.22857142857143%
Epoch 63: Training Loss = 0.069279574, Accuracy = 85.21428571428571%
Epoch 64: Training Loss = 0.069251925, Accuracy = 85.13571428571429%
Epoch 65: Training Loss = 0.069236554, Accuracy = 84.95714285714286%
Epoch 66: Training Loss = 0.06921506, Accuracy = 84.95714285714286%
Epoch 67: Training Loss = 0.069184825, Accuracy = 85.15%
Epoch 68: Training Loss = 0.06915006, Accuracy = 85.5142857142857%
Epoch 69: Training Loss = 0.06913231, Accuracy = 85.75%
Epoch 70: Training Loss = 0.06910449, Accuracy = 85.93571428571428%
Epoch 71: Training Loss = 0.069074914, Accuracy = 86.15%
Epoch 72: Training Loss = 0.069066346, Accuracy = 86.05714285714285%
Epoch 73: Training Loss = 0.06902928, Accuracy = 86.27857142857142%
Epoch 74: Training Loss = 0.06902256, Accuracy = 86.4857142857143%
Epoch 75: Training Loss = 0.06904491, Accuracy = 86.77857142857142%
Epoch 76: Training Loss = 0.069038324, Accuracy = 86.9%
Epoch 77: Training Loss = 0.06900674, Accuracy = 86.6%
Epoch 78: Training Loss = 0.068971805, Accuracy = 86.35000000000001%
Epoch 79: Training Loss = 0.0689787, Accuracy = 86.33571428571429%
Epoch 80: Training Loss = 0.06896412, Accuracy = 86.24285714285715%
Epoch 81: Training Loss = 0.06892337, Accuracy = 86.3%
Epoch 82: Training Loss = 0.068898536, Accuracy = 86.49285714285713%
Epoch 83: Training Loss = 0.068888664, Accuracy = 86.70714285714286%
Epoch 84: Training Loss = 0.06886721, Accuracy = 86.74285714285715%
Epoch 85: Training Loss = 0.06888327, Accuracy = 86.67857142857143%
Epoch 86: Training Loss = 0.06883431, Accuracy = 86.78571428571429%
Epoch 87: Training Loss = 0.06884734, Accuracy = 86.87857142857143%
Epoch 88: Training Loss = 0.06882276, Accuracy = 86.92142857142858%
Epoch 89: Training Loss = 0.06881059, Accuracy = 87.02857142857144%
Epoch 90: Training Loss = 0.06878725, Accuracy = 87.13571428571429%
Epoch 91: Training Loss = 0.06878075, Accuracy = 87.15%
Epoch 92: Training Loss = 0.06877423, Accuracy = 87.17857142857143%
Epoch 93: Training Loss = 0.068756245, Accuracy = 87.17857142857143%
Epoch 94: Training Loss = 0.06874979, Accuracy = 87.16428571428571%
Epoch 95: Training Loss = 0.06873655, Accuracy = 87.12142857142857%
Epoch 96: Training Loss = 0.06873606, Accuracy = 87.15%
Epoch 97: Training Loss = 0.06871858, Accuracy = 87.28571428571429%
Epoch 98: Training Loss = 0.068701506, Accuracy = 87.52142857142857%
Epoch 99: Training Loss = 0.06870646, Accuracy = 87.68571428571428%
Epoch 100: Training Loss = 0.06867965, Accuracy = 87.8%
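The loop above performs full-batch gradient descent: one parameter update per epoch over all 56,000 samples. Mini-batch training usually converges in far fewer epochs; a hedged sketch using `Flux.DataLoader` (the batch size of 128 and the 10-epoch count are assumptions, and `loss`, `model`, `state` are taken from the cells above):

```julia
using Flux, Optimisers

# Mini-batch variant: one optimiser update per 128-sample batch.
loader = Flux.DataLoader((X_train, y_train), batchsize=128, shuffle=true)
for epoch in 1:10
    for (xb, yb) in loader
        grads = gradient(m -> loss(m, xb, yb), model)
        state, model = Optimisers.update(state, model, grads[1])
    end
end
```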

Visualization of changes in the loss function at each training step:

In [ ]:
plot((1:epochs), loss_history, title="Loss during training", xlabel="Training step", ylabel="Loss")
Out[0]:

Displaying results:

In [ ]:
number = 3000
test_img = Vector(df[56000+number,1:784])
test_img = (reshape(test_img, 28, 28)) / 256
using Colors
result = model(X_test[:,number])
println("Known object class: ", df[56000+number,785], "\n  ", "Digit recognized by the network: ", (findfirst(x -> x == maximum(result), result)-1))
plot(Gray.(test_img'))
Known object class: 4
  Digit recognized by the network: 4
Out[0]:

Conclusion

In this example, the pixel brightness data was preprocessed, and the neural-network architecture, optimizer parameters, and loss function were defined.
The trained model achieved reasonably accurate, though not perfect, classification. Recognition quality could be improved by modifying the layer architecture or by enlarging the training sample.