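Building and Training a Neural Network for Handwritten Digit Recognition
This example walks through data preparation and training for image classification with a neural network model. The observations come from the MNIST dataset, which contains 70,000 labeled images of handwritten digits. The example uses a **.csv** file in which each image is flattened into a table row holding the brightness value of every pixel.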
Load the libraries used for data processing:
In [ ]:
import Pkg
Pkg.add(["Colors", "CSV", "DataFrames", "Flux", "Optimisers", "Plots"])
In [ ]:
using CSV, DataFrames
Load the data into a variable:
In [ ]:
df = DataFrame(CSV.File("$(@__DIR__)/mnist_784.csv"));
Display the first five rows of the DataFrame:
In [ ]:
first(df,5)
Out[0]:
Display the first five rows of the last few columns; the final column holds the class that each observation belongs to:
In [ ]:
df[1:5,780:785]
Out[0]:
Split the dataset into training and test samples in an 8:2 ratio:
In [ ]:
X_train, y_train = Matrix(df[1:56000,1:784]), df[1:56000,785]
X_test, y__test = Matrix(df[56001:end,1:784]), df[56001:end,785]
Out[0]:
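The split above takes the first 56,000 rows for training and the remaining 14,000 for testing. If the rows of the file were ordered in some systematic way, a shuffled split would be a safer choice. The sketch below is one possible way to do that; `split_indices` is a hypothetical helper that is not part of the original notebook, and the seed value is arbitrary.

```julia
using Random

# Hypothetical helper (not from the original notebook): returns shuffled
# train/test row indices for `n` observations.
function split_indices(n::Integer, train_fraction::Real; rng=Random.default_rng())
    perm = randperm(rng, n)                    # random ordering of 1:n
    n_train = round(Int, train_fraction * n)   # number of training rows
    return perm[1:n_train], perm[n_train+1:end]
end

# 70,000 rows split 8:2, with a fixed seed for reproducibility.
train_idx, test_idx = split_indices(70_000, 0.8; rng=MersenneTwister(42))
println(length(train_idx), " training rows, ", length(test_idx), " test rows")
```

The resulting index vectors could then be used in place of the ranges, for example `df[train_idx, 1:784]`.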
Convert the samples to a format suitable for processing by the neural network:
In [ ]:
X_train, X_test = convert(Matrix{Float32}, X_train), convert(Matrix{Float32}, X_test)
y_train, y__test = convert(Vector{Float32}, y_train), convert(Vector{Float32}, y__test)
Out[0]:
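The conversion above changes only the element type; the intensities keep their raw range. Assuming the pixels are stored as 8-bit values from 0 to 255 (the usual MNIST encoding), a common optional step, which the original notebook does not perform, is to scale them to the interval [0, 1]. `scale_pixels` is a hypothetical helper name used for illustration only.

```julia
# Hypothetical helper: maps 8-bit pixel intensities (0 to 255) to Float32
# values in [0, 1]. Assumes the input really is in the 0 to 255 range.
scale_pixels(X::AbstractArray{<:Real}) = Float32.(X) ./ 255f0

pixels = [0 128 255; 64 32 16]   # a tiny stand-in for a block of pixel data
scaled = scale_pixels(pixels)
println(extrema(scaled))
```

Whether scaling helps depends on the model and optimizer settings, so it is best treated as something to try rather than a required step.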
Load the library used for data visualization:
In [ ]:
using Plots
Display an observation and its class:
In [ ]:
test_img = Vector(df[60000,1:784])
test_img = (reshape(test_img, 28, 28)) / 256
using Colors
println("Object class: ", df[60000,785])
plot(Gray.(test_img'))  # transpose so the digit is shown upright, as in the final cell
Out[0]:
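The orientation of the displayed digit depends on how the 784 values are laid out. Julia's `reshape` fills columns first, so if each CSV row stores the image row by row (an assumption about the file, not something the notebook states), the reshaped matrix is the transpose of the image. A small sketch with a 2×3 stand-in image:

```julia
# A tiny 2×3 "image" standing in for a 28×28 digit.
img = [1 2 3;
       4 5 6]

# Row-by-row flattening, the order assumed here for each CSV row.
flat = vec(permutedims(img))

# reshape fills columns first, so swap the dimensions ...
cols_first = reshape(flat, 3, 2)      # this is the transpose of img

# ... and transpose to recover the original orientation.
restored = permutedims(cols_first)
println(restored == img)
```

For the square 28×28 digits the dimensions are equal, so only the final transpose is needed.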
Final transformation of the data for the neural network (transpose so that each column holds one observation):
In [ ]:
X_train, X_test = X_train', X_test'
y_train, y__test = y_train', y__test'
Out[0]:
Load the machine learning libraries:
In [ ]:
using Flux, Optimisers;
Define the neural network architecture:
In [ ]:
model = Chain(
    Dense(784, 15, elu),
    Dense(15, 10, sigmoid),
    softmax
)
Out[0]:
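To make the data flow through this architecture concrete, the sketch below reproduces the same sequence of operations in plain Julia with random weights. It is an illustration only, not Flux's implementation; all `ref_`-prefixed names, the weight scale, and the seed are assumptions introduced here.

```julia
using Random

# Plain-Julia stand-ins for the activations used in the model.
ref_elu(x) = x > 0 ? x : exp(x) - 1           # ELU with alpha = 1
ref_sigmoid(x) = 1 / (1 + exp(-x))
function ref_softmax(z)
    e = exp.(z .- maximum(z))                 # subtract the max for numerical stability
    return e ./ sum(e)
end

# One dense layer: activation applied elementwise to W*x + b.
ref_dense(W, b, act, x) = act.(W * x .+ b)

rng = MersenneTwister(0)
W1, b1 = 0.05f0 .* randn(rng, Float32, 15, 784), zeros(Float32, 15)
W2, b2 = 0.05f0 .* randn(rng, Float32, 10, 15), zeros(Float32, 10)

x = rand(rng, Float32, 784)                          # random stand-in for one image
h = ref_dense(W1, b1, ref_elu, x)                    # hidden layer, 15 values
probs = ref_softmax(ref_dense(W2, b2, ref_sigmoid, h))  # 10 class scores
println(length(probs), " scores, sum = ", sum(probs))
```

One consequence of this design: the sigmoid limits the inputs of the softmax to the interval (0, 1), so no class score can exceed e / (e + 9) ≈ 0.232. The predicted class is still well defined through the largest score, but the scores cannot approach a one-hot target closely.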
Check the recognition result (before training the model):
Define the training parameters and the loss function:
In [ ]:
learning_rate = 0.01f0
opt = Optimisers.Adam(learning_rate)
state = Optimisers.setup(opt, model)
function loss(model, x, y)
    y_oh = Flux.onehotbatch(y, 0:9)   # size (10, 1, N), because y is a 1×N row
    y_pred = model(x)                 # size (10, N)
    # Add a dimension so the prediction matches the shape of y_oh
    y_pred_reshaped = Flux.unsqueeze(y_pred, dims=2)   # now (10, 1, N)
    return Flux.mse(y_pred_reshaped, y_oh)
end
Out[0]:
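The loss compares the network output with a one-hot encoding of the true digit and averages the squared differences. A minimal plain-Julia sketch of the same idea for a single image follows; `onehot_digit` and `ref_mse` are hypothetical helpers written for illustration, not the Flux functions themselves.

```julia
# Hypothetical helper: one-hot vector for a digit label in 0:9.
onehot_digit(label) = Float32.((0:9) .== label)

# Mean squared error over all entries (a mean, as Flux.mse computes by default).
ref_mse(pred, target) = sum(abs2, pred .- target) / length(target)

target = onehot_digit(3)          # 1 at position 4, since digit 0 sits at index 1
uniform = fill(0.1f0, 10)         # a prediction that spreads the score evenly

println("loss of a uniform guess: ", ref_mse(uniform, target))
println("loss of a perfect guess: ", ref_mse(target, target))
```

Digit `d` maps to index `d + 1`, which is why the accuracy code later subtracts 1 from the position of the largest score.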
Define a function that computes the model's accuracy:
In [ ]:
function accuracy(model, X, y)
    correct = 0
    for i in 1:length(y)
        # Prepare the input: add a batch dimension
        x_input = reshape(X[:, i], :, 1)   # (features, 1)
        # Model prediction
        probs = model(x_input)             # size (10, 1)
        # Convert the position of the largest score to a digit (index 1 is digit 0)
        predicted_digit = argmax(probs)[1] - 1
        # Compare with the true label
        if predicted_digit == y[i]
            correct += 1
        end
    end
    return correct / length(y)
end
Out[0]:
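The loop above calls the model once per image. Since the model accepts a whole matrix of observations, the same measure can be computed from a single 10×N score matrix. The sketch below shows that calculation on a hand-made score matrix; `batch_accuracy` is a hypothetical helper and the numbers are made up for illustration.

```julia
using Statistics

# Hypothetical helper: `probs` is a 10×N matrix of class scores (one column
# per image) and `labels` holds the N true digits.
function batch_accuracy(probs::AbstractMatrix, labels::AbstractVector)
    predicted = [argmax(col) - 1 for col in eachcol(probs)]  # row index 1 is digit 0
    return mean(predicted .== labels)
end

# Three made-up score columns that favor the digits 7, 0 and 4.
probs = zeros(Float32, 10, 3)
probs[8, 1] = 1
probs[1, 2] = 1
probs[5, 3] = 1

labels = [7, 0, 3]                 # the third prediction is wrong
println(batch_accuracy(probs, labels))
```

With the notebook's variables this would presumably be called as `batch_accuracy(model(X_test), vec(y__test))`, since the labels are stored as a 1×N row.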
Train the model iteratively:
In [ ]:
loss_history = Float32[]
epochs = 100
for epoch in 1:epochs
    # Compute the gradients of the loss with respect to the model parameters
    grads = gradient(model) do m
        loss(m, X_train, y_train)
    end
    # Update the optimizer state and the model
    state, model = Optimisers.update(state, model, grads[1])
    # Compute and store the training loss
    current_loss = loss(model, X_train, y_train)
    push!(loss_history, current_loss)
    # Compute the accuracy on the test sample, in percent
    acc = accuracy(model, X_test, y__test) * 100
    # Log every epoch
    println("Epoch $epoch: Training Loss = $current_loss, Accuracy = $acc%")
end
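Each pass of the loop hands the gradients to the Adam optimizer, which keeps running estimates of the gradient and of its square and uses them to scale the step. The sketch below applies the textbook form of the Adam update to a one-parameter problem. It is not the Optimisers.jl implementation; `adam_step` and `minimize_quadratic` are hypothetical helpers, and the default values shown are the commonly quoted ones, taken here as assumptions.

```julia
# Hypothetical helper: one textbook Adam update for a scalar parameter `theta`
# with gradient `g`; `m` and `v` are running moment estimates, `t` is the step.
function adam_step(theta, g, m, v, t; lr=0.01, beta1=0.9, beta2=0.999, epsilon=1e-8)
    m = beta1 * m + (1 - beta1) * g          # running mean of the gradient
    v = beta2 * v + (1 - beta2) * g^2        # running mean of the squared gradient
    m_hat = m / (1 - beta1^t)                # bias correction for early steps
    v_hat = v / (1 - beta2^t)
    theta -= lr * m_hat / (sqrt(v_hat) + epsilon)
    return theta, m, v
end

# Minimize f(theta) = (theta - 3)^2, whose gradient is 2 * (theta - 3).
function minimize_quadratic(steps)
    theta, m, v = 0.0, 0.0, 0.0
    for t in 1:steps
        g = 2 * (theta - 3)
        theta, m, v = adam_step(theta, g, m, v, t)
    end
    return theta
end

theta_final = minimize_quadratic(3000)
println("theta after 3000 steps: ", theta_final)
```

In the first step the update has magnitude `lr` regardless of the gradient size, which is why the learning rate directly bounds how fast the parameters can move early in training.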
Plot how the loss function changes at each training step:
In [ ]:
plot(1:epochs, loss_history, title="Loss function over training", xlabel="Training step", ylabel="Loss")
Out[0]:
Display the result:
In [ ]:
number = 3000
test_img = Vector(df[56000+number,1:784])
test_img = (reshape(test_img, 28, 28)) / 256
using Colors
result = model(X_test[:,number])
println("Known object class: ", df[56000+number,785], "\n ", "Digit recognized by the network: ", argmax(result) - 1)
plot(Gray.(test_img'))
Out[0]:
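Conclusion
In this example, the pixel brightness data were preprocessed, and the neural network architecture, the optimizer parameters, and the loss function were defined.
The trained model separates the classes fairly accurately, but not perfectly. Recognition quality can be improved by changing the layer architecture of the network and by enlarging the training sample.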