Building and training a neural network for handwritten digit recognition
In this example, we walk through data preparation and training of a neural network model for image classification. The data set is MNIST, which contains 70,000 labeled images of handwritten digits. The example uses a .csv file in which each image is flattened into one table row holding the brightness value of every pixel (28 × 28 = 784 columns), followed by a column with the class label.
Installing and loading the libraries for data processing:
import Pkg
Pkg.add(["Colors", "CSV", "DataFrames", "Flux", "Optimisers", "Plots"])
using CSV, DataFrames
Loading data into a variable:
df = DataFrame(CSV.File("$(@__DIR__)/mnist_784.csv"));
Displaying the first five rows of the data frame:
first(df,5)
Displaying the first five rows of the last six columns; the final column holds the class label of each observation:
df[1:5,780:785]
Splitting the data set into training and test samples in an 8-to-2 ratio (56,000 and 14,000 rows):
X_train, y_train = Matrix(df[1:56000,1:784]), df[1:56000,785]
X_test, y__test = Matrix(df[56001:end,1:784]), df[56001:end,785]
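The split above hard-codes row 56000. As a minimal sketch, the same boundary can be derived from the row count. The helper `split_index` and the toy matrix are hypothetical stand-ins, not part of the original example:

```julia
# Hypothetical helper: index of the last training row for a given training fraction.
split_index(n_rows::Integer, train_fraction::Real) = round(Int, n_rows * train_fraction)

n_rows = 70_000                       # MNIST size stated in the text
n_train = split_index(n_rows, 0.8)    # boundary between training and test rows

# Toy stand-in for the table: 10 rows, 3 feature columns, 1 label column.
toy = hcat(rand(10, 3), collect(0:9))
k = split_index(size(toy, 1), 0.8)
toy_X_train, toy_y_train = toy[1:k, 1:3], toy[1:k, 4]
toy_X_test,  toy_y_test  = toy[k+1:end, 1:3], toy[k+1:end, 4]
```

Like the original, this sketch takes the first rows as training data without shuffling.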
Converting the samples to Float32, the element type Flux uses for layer parameters by default:
X_train, X_test = convert(Matrix{Float32}, X_train), convert(Matrix{Float32}, X_test)
y_train, y__test = convert(Vector{Float32}, y_train), convert(Vector{Float32}, y__test)
Loading the library for data visualization:
using Plots
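Displaying an object and its class:
using Colors
test_img = Vector(df[60000,1:784])
test_img = reshape(test_img, 28, 28) / 256   # scale brightness into [0, 1)
println("Object class: ", df[60000,785])
plot(Gray.(test_img'))   # transpose: reshape fills columns first, so rows and columns are swapped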
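The orientation of the reshaped image depends on how the pixels were flattened. A small sketch, assuming the file stores each image row by row (a common convention, not confirmed by the source), shows why a transpose is needed after `reshape` in Julia:

```julia
# A 2x3 "image" stored row by row, the way many CSV exports flatten pixels (assumed here).
image = [1 2 3;
         4 5 6]
flat = vec(permutedims(image))        # row-by-row flattening: [1, 2, 3, 4, 5, 6]

# Julia's reshape fills columns first, so the image width goes in the first dimension.
columns_first = reshape(flat, 3, 2)   # 3x2: each column is one image row

# A transpose restores the original orientation.
restored = permutedims(columns_first)
```

For the square 28 × 28 images in this example both dimensions are equal, so only the transpose matters.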
Final data transformation: transposing the arrays so that each column holds one observation, the layout Flux layers expect:
X_train, X_test = X_train', X_test'
y_train, y__test = y_train', y__test'
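To make the layout change concrete, here is a short sketch on toy data (the array names and sizes are illustrative only):

```julia
# Toy data: 5 observations (rows) with 4 features (columns), as in a data frame.
X_rows = rand(Float32, 5, 4)
y_vec  = Float32[0, 1, 2, 3, 4]

# Transposing puts one observation in each column: (features, observations).
X_cols = X_rows'
y_row  = y_vec'

size(X_cols)                      # (4, 5)
size(y_row)                       # (1, 5)
X_cols[:, 2] == X_rows[2, :]      # true: column 2 is observation 2
```

The `'` operator returns a lazy wrapper rather than a copy; `permutedims` can be used instead when a plain `Matrix` is preferred.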
Loading the machine learning libraries:
using Flux, Optimisers;
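Defining the structure of the neural network:
model = Chain(
    Dense(784, 15, elu),      # hidden layer: 784 pixels -> 15 units
    Dense(15, 10, sigmoid),   # output layer: 15 units -> 10 classes
    softmax                   # normalizes the 10 outputs so they sum to 1
)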
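To show what this chain computes, the following sketch writes the forward pass out by hand in plain Julia. The weights are random placeholders with the same shapes as the two `Dense` layers, and the function names `elu_fn`, `sigmoid_fn`, and `softmax_fn` are hypothetical; Flux supplies its own implementations.

```julia
# Activation functions written out by hand (Flux provides its own versions).
elu_fn(x) = x > 0 ? x : exp(x) - 1             # ELU with alpha = 1
sigmoid_fn(x) = 1 / (1 + exp(-x))
function softmax_fn(z)
    e = exp.(z .- maximum(z))                  # subtract the maximum for numerical stability
    return e ./ sum(e)
end

# Placeholder random weights with the same shapes as the two Dense layers.
W1, b1 = 0.01f0 .* randn(Float32, 15, 784), zeros(Float32, 15)
W2, b2 = 0.01f0 .* randn(Float32, 10, 15),  zeros(Float32, 10)

x = rand(Float32, 784)                         # stand-in for one flattened 28x28 image
h = elu_fn.(W1 * x .+ b1)                      # hidden layer, 15 values
p = softmax_fn(sigmoid_fn.(W2 * h .+ b2))      # 10 class scores that sum to 1

# Number of trainable parameters: 784*15 + 15 + 15*10 + 10
n_params = length(W1) + length(b1) + length(W2) + length(b2)
```

One consequence of this design: because the sigmoid limits every input of the softmax to the interval (0, 1), no class score can exceed e / (e + 9) ≈ 0.23, so the outputs can never match a one-hot target exactly.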
Test recognition result (before training the model):
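The source does not include the code for this step. A minimal sketch of what such a check might look like, assuming Flux is installed; the model is rebuilt here and a random vector stands in for a test image so that the snippet runs on its own:

```julia
using Flux

# Same layer sizes as the model above, rebuilt so this snippet is self-contained.
untrained = Chain(Dense(784 => 15, elu), Dense(15 => 10, sigmoid), softmax)

x = rand(Float32, 784, 1)                  # stand-in for one test image (assumption)
probs = untrained(x)                       # 10x1 matrix of class scores
predicted_digit = argmax(vec(probs)) - 1   # row index 1..10 mapped to digit 0..9
```

With randomly initialized weights, the predicted digit is essentially arbitrary, which gives a baseline to compare against after training.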
Defining the training parameters (learning rate, optimizer, and optimizer state):
learning_rate = 0.01f0
opt = Optimisers.Adam(learning_rate)
state = Optimisers.setup(opt, model)
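For intuition about what the Adam optimizer does with each gradient, here is a sketch of the standard update rule for a single scalar parameter. It is written from the published algorithm, not taken from the Optimisers.jl source, and `adam_step` and `minimise` are hypothetical names:

```julia
# One Adam step for a scalar parameter; returns the updated parameter and moments.
function adam_step(theta, g, m, v, t; eta=0.01, beta1=0.9, beta2=0.999, eps=1e-8)
    m = beta1 * m + (1 - beta1) * g        # running mean of gradients
    v = beta2 * v + (1 - beta2) * g^2      # running mean of squared gradients
    m_hat = m / (1 - beta1^t)              # bias correction for early steps
    v_hat = v / (1 - beta2^t)
    theta = theta - eta * m_hat / (sqrt(v_hat) + eps)
    return theta, m, v
end

# Minimise f(theta) = (theta - 3)^2, whose gradient is 2 * (theta - 3).
function minimise(steps; eta=0.1)
    theta, m, v = 0.0, 0.0, 0.0
    for t in 1:steps
        g = 2 * (theta - 3)
        theta, m, v = adam_step(theta, g, m, v, t; eta=eta)
    end
    return theta
end

theta_final = minimise(500)   # ends close to the minimum at 3
```

In the example itself, `Optimisers.setup` creates the equivalent of `m` and `v` for every model parameter, and that is what the `state` variable holds.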
Defining the loss function:
function loss(model, x, y)
    y_oh = Flux.onehotbatch(y, 0:9)   # size (10, 1, N), because y is a 1×N row
    y_pred = model(x)                 # size (10, N)
    # Add a dimension so that the shape matches y_oh
    y_pred_reshaped = Flux.unsqueeze(y_pred, dims=2)   # now (10, 1, N)
    return Flux.mse(y_pred_reshaped, y_oh)
end
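The loss combines two steps: one-hot encoding of the labels and the mean squared error. The sketch below reproduces both in plain Julia on two-dimensional arrays; `onehot_digits` and `mse` are hypothetical helpers written for illustration, not the Flux functions.

```julia
using Statistics: mean

# One-hot encode integer labels 0..9 as columns of a 10xN matrix.
function onehot_digits(labels)
    Y = zeros(Float32, 10, length(labels))
    for (j, label) in enumerate(labels)
        Y[Int(label) + 1, j] = 1      # digit d goes to row d + 1
    end
    return Y
end

# Mean squared error over all entries.
mse(pred, target) = mean(abs2, pred .- target)

labels = Float32[3, 0]
Y = onehot_digits(labels)

# Hypothetical predictions: a uniform score of 0.1 for every class.
P = fill(0.1f0, 10, 2)
loss_uniform = mse(P, Y)     # (9 * 0.1^2 + 0.9^2) / 10 = 0.09
```

A uniform prediction therefore gives a loss of about 0.09, a useful reference point when reading the training log.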
Defining a function to calculate the accuracy of the model:
function accuracy(model, X, y)
    correct = 0
    for i in 1:length(y)
        # Prepare the input: add a batch dimension
        x_input = reshape(X[:, i], :, 1)   # (features, 1)
        # Model prediction
        probs = model(x_input)             # size (10, 1)
        # Convert the row index (1..10) to a digit (0..9)
        predicted_digit = argmax(probs)[1] - 1
        # Compare with the true label
        if predicted_digit == y[i]
            correct += 1
        end
    end
    return correct / length(y)
end
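The loop above calls the model once per observation. Since the model also accepts a whole matrix (the loss function calls it that way), the scores for all observations can be obtained in one call and the accuracy computed from the resulting matrix. A sketch of that second step, with a hypothetical helper `accuracy_from_scores` and made-up scores:

```julia
using Statistics: mean

# scores: (classes, observations) matrix; labels: digits 0..9.
function accuracy_from_scores(scores::AbstractMatrix, labels)
    # Row index of the largest score in each column, shifted to a digit 0..9.
    predicted = [argmax(col) - 1 for col in eachcol(scores)]
    return mean(predicted .== vec(labels))
end

# Made-up scores for 3 observations and 10 classes.
scores = zeros(Float32, 10, 3)
scores[8, 1] = 1   # predicts digit 7
scores[1, 2] = 1   # predicts digit 0
scores[3, 3] = 1   # predicts digit 2

acc = accuracy_from_scores(scores, Float32[7, 0, 5])   # 2 of 3 correct
```

In the context of this example, the call would presumably look like `accuracy_from_scores(model(X_test), y__test)`.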
Training the model iteratively:
loss_history = Float32[]
epochs = 100
for epoch in 1:epochs
    global state, model   # needed when this loop runs at the top level of a script
    # Compute the gradients
    grads = gradient(model) do m
        loss(m, X_train, y_train)
    end
    # Update the model and the optimizer state
    state, model = Optimisers.update(state, model, grads[1])
    # Compute and store the loss
    current_loss = loss(model, X_train, y_train)
    push!(loss_history, current_loss)
    # Compute the accuracy on the test sample
    acc = accuracy(model, X_test, y__test) * 100
    # Logging (every epoch; raise the divisor to log less often)
    if epoch == 1 || epoch % 1 == 0
        println("Epoch $epoch: Training Loss = $current_loss, Accuracy = $acc%")
    end
end
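The loop follows a common pattern: compute the gradient on the full training set, update the parameters, record the loss. The sketch below shows the same pattern on a toy one-parameter model with a hand-derived gradient. It uses plain gradient descent rather than Adam, and all names in it are hypothetical.

```julia
using Statistics: mean

# Toy linear model y = w * x with mean squared error, trained on the full batch.
xs = Float64[1, 2, 3, 4]
ys = 2 .* xs                                     # the true weight is 2

toy_loss(w) = mean(abs2, w .* xs .- ys)
toy_grad(w) = mean(2 .* (w .* xs .- ys) .* xs)   # d(loss)/dw, derived by hand

function train(w, lr, epochs)
    history = Float64[]
    for epoch in 1:epochs
        w -= lr * toy_grad(w)          # update step (plain gradient descent, not Adam)
        push!(history, toy_loss(w))    # record the loss after the update
    end
    return w, history
end

w_final, history = train(0.0, 0.01, 100)
```

Wrapping the loop in a function keeps the updated parameter local, which sidesteps the scoping rules that apply to loops at the top level of a Julia script.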
Visualizing the change in the loss function over the training steps:
plot((1:epochs), loss_history, title="Change in the loss function", xlabel="Training step", ylabel="Loss function")
Displaying the results:
number = 3000
test_img = Vector(df[56000+number,1:784])
test_img = reshape(test_img, 28, 28) / 256
result = model(X_test[:,number])
println("Known object class: ", df[56000+number,785], "\n ", "Digit recognized by the neural network: ", argmax(result) - 1)
plot(Gray.(test_img'))
Conclusion
In this example, the pixel brightness data was preprocessed, and the neural network architecture, optimizer parameters, and loss function were defined.
The model was trained and classified the digits fairly accurately, though not perfectly. Recognition quality can be improved by changing the layer architecture and enlarging the training sample.