Using a pre-trained ResNet neural network for image classification
Neural networks are a convenient, flexible kind of algorithm that can be trained to perform computations. Once trained successfully, the same neural network can be reused in many different tasks. A well-known example is the ResNet family of neural networks, created for image classification. In whole or in part, such networks are used in a wide variety of tasks that require numerical representations of images, from image databases to style transfer.
In this example, we will run a shallow ResNet neural network (18 layers) for image classification. We will show the entire chain of data preparation and output processing, which for any image gives us a more or less suitable text label describing the object in the picture.
Preparatory work
The Tape object from the Umlaut library is currently the main container for computations into which a neural network can be unpacked from the ONNX format. This loading mechanism is likely to change in the near future, since the ONNX library is currently being updated.
import Pkg; Pkg.add(["Umlaut", "ONNX"], io=devnull)
import Umlaut: Tape, play!
Of course, we will also need the libraries built into Engee: one for working with the ONNX format, in which pre-trained neural networks are often stored, and one for working with images.
using ONNX
using Images
Set the working folder
cd( @__DIR__ )
Classes of objects in ImageNet
The names of all classes of objects that our pre-trained neural network can recognize are represented as an ordered vector and loaded from the following file.
include( "imagenet_classes.jl" );
Input/output functions
Let's write three simple auxiliary functions for feeding data to the neural network and processing the results:
- Image loading: we scale the image to 224×224 (normalizing it and cropping the edges while preserving the aspect ratio would be a welcome addition).
- Prediction sorting: the neural network returns a vector of numbers that indicate how likely it is that each class from the ImageNet dataset is present in the image. We select the k most likely classes.
- A wrapper around these functions that loads an image and outputs the k most likely predictions in a single call.
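The normalization mentioned in the first item can be sketched as follows. This is a minimal sketch, not part of the original example: the helper name `normalize_imagenet` and its keyword arguments are hypothetical, the default mean and standard deviation are the commonly used ImageNet channel statistics, and it is an assumption that this particular model file expects them. The input is assumed to be a width × height × channel array with values in [0, 1], which is the layout the image-loading function produces.

```julia
# Hypothetical helper, not part of the original example.
# Subtracts a per-channel mean and divides by a per-channel standard deviation.
# Defaults are the commonly used ImageNet statistics in RGB order (assumed
# to match what the model was trained with).
function normalize_imagenet(x::AbstractArray{Float32,3};
                            channel_mean=Float32[0.485, 0.456, 0.406],
                            channel_std=Float32[0.229, 0.224, 0.225])
    # Reshape to 1×1×3 so the statistics broadcast along the channel dimension
    m = reshape(channel_mean, 1, 1, 3)
    s = reshape(channel_std, 1, 1, 3)
    return (x .- m) ./ s
end

# Toy input with the same layout as a loaded image: W×H×C, values in [0, 1]
x_demo = rand(Float32, 224, 224, 3)
x_norm = normalize_imagenet(x_demo)
```

If used, it would be applied to the array right after the image is loaded and before the batch dimension is added.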
# Load an image from a file
function imread(path::AbstractString; sz=(224,224))
    img = Images.load(path);
    img = imresize(img, sz);
    x = convert(Array{Float32}, channelview(img))
    # Reorder the dimensions: CHW -> WHC
    x = permutedims(x, (3, 2, 1))
    return x
end
# Return the indices (paired with their values) of the top k predictions
function maxk(a, k)
    b = partialsortperm(a, 1:k, rev=true)
    return collect(zip(b, a[b]))
end
# Load an image and return the ten most likely classes in descending order
function test_image(tape::Tape, path::AbstractString)
    x = imread(path)
    x = reshape(x, size(x)..., 1)   # add the batch dimension
    y = play!(tape, x)
    y = reshape(y, size(y, 1))
    top = maxk(y, 10)
    classes = String[]
    for (i, (idx, val)) in enumerate(top)
        # idx - 1 assumes the class table in imagenet_classes.jl is keyed from 0
        name = IMAGENET_CLASSES[idx - 1]
        push!(classes, "$i: $name ($val)")
    end
    return join(classes, "\n")
end
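The numbers printed next to each class by `test_image` are the raw network outputs. If this model file returns unnormalized scores rather than probabilities (an assumption, since the model's output layer is not inspected here), a softmax turns them into values that sum to one without changing the ranking. A minimal sketch on a toy vector; `softmax_scores` is a hypothetical helper that is not part of the original example:

```julia
# Hypothetical helper, not part of the original example:
# numerically stable softmax that converts raw scores into probabilities.
function softmax_scores(y::AbstractVector{<:Real})
    e = exp.(y .- maximum(y))   # subtract the maximum to avoid overflow in exp
    return e ./ sum(e)
end

# Toy scores standing in for the network output
scores = Float32[2.0, 0.5, -1.0, 3.0]
probs = softmax_scores(scores)

# Indices of the two largest entries, computed the same way as in maxk
top_scores = partialsortperm(scores, 1:2, rev=true)
top_probs = partialsortperm(probs, 1:2, rev=true)
```

Because softmax is monotonic, `top_scores` and `top_probs` select the same classes; only the values shown next to the labels would change.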
Loading the ResNet18 neural network
We will use a neural network stored in a file with the *.onnx extension. There are libraries that let you create and load this network with even higher-level commands (for example, the Metalhead.jl library from the FluxML collection), but for now we will manage without additional libraries.
The pre-trained neural network is already in the specified directory, so the command below runs without downloading it again.
path = "resnet18.onnx"
if !isfile(path)
    download("https://github.com/onnx/models/raw/main/vision/classification/resnet/model/resnet18-v1-7.onnx", path)
end
# Create a placeholder array with the shape of the input image
img = rand( Float32, 224, 224, 3, 1 )
# Load the model as an Umlaut.Tape object
resnet = ONNX.load( path, img );
Download some images
If they have already been downloaded, they will not be downloaded again.
path = "data/"
mkpath(path)  # make sure the target folder exists
# Download a file only if it is not already present; return its local path
fetch_image(url, file) = isfile(file) ? file : download(url, file)
goose_path = fetch_image( "https://upload.wikimedia.org/wikipedia/commons/3/3f/Snow_goose_2.jpg", path*"goose.jpg");
dog_path = fetch_image( "https://farm4.staticflickr.com/1301/4694470234_6f27a4f602_o.jpg", path*"dog.jpg");
plane_path = fetch_image( "https://upload.wikimedia.org/wikipedia/commons/thumb/c/c9/Rossiya%2C_RA-89043%2C_Sukhoi_Superjet_100-95B_%2851271265892%29.jpg/1024px-Rossiya%2C_RA-89043%2C_Sukhoi_Superjet_100-95B_%2851271265892%29.jpg", path*"plane.jpg");
Classifying images
display( load(plane_path)[1:5:end, 1:5:end] )
print( test_image( resnet, plane_path ))
display( load(goose_path)[1:5:end, 1:5:end] )
print( test_image( resnet, goose_path ))
display( load(dog_path)[1:5:end, 1:5:end] )
print( test_image( resnet, dog_path ))
Conclusion
We have shown that it is easy to load a neural network in Engee and use it to perform computations.
This mechanism lets you build a complex information processing pipeline out of high-level components and, in particular, entrust some stages of the processing to pre-trained neural networks.
Bibliography: https://github.com/FluxML/ONNX.jl


