Engee documentation
Notebook

Finding key points with the ORB detector

The ORB detector is used for video stabilisation, panorama creation, object detection, 3D reconstruction, camera motion estimation, image alignment, and localisation for augmented reality. It is also used to recognise locations, such as buildings or landmarks, by matching against a keypoint database.

Theoretical part

ORB (Oriented FAST and Rotated BRIEF) is a fast keypoint detector and descriptor. It combines the FAST algorithm (for keypoint detection) with the BRIEF algorithm (for keypoint description).

The FAST (Features from Accelerated Segment Test) algorithm finds corners in an image. The Harris corner measure is then used to select the most significant points, reducing the number of weak keypoints. Each keypoint is also assigned an orientation, which makes ORB invariant to image rotation.

The BRIEF (Binary Robust Independent Elementary Features) algorithm creates binary descriptors. To compare points between images, ORB matches these binary descriptors pairwise using the Hamming metric.
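To make the Hamming metric concrete, here is a minimal sketch of the distance between two toy binary descriptors (real ORB descriptors are 256 bits long):

```julia
# Toy 8-bit binary descriptors (real ORB descriptors are 256 bits).
d1 = Bool[1, 0, 1, 1, 0, 0, 1, 0]
d2 = Bool[1, 1, 1, 0, 0, 0, 1, 1]

# Hamming distance: the number of bit positions where the descriptors differ.
hamming(a, b) = count(a .!= b)

hamming(d1, d2)  # → 3
```

Because the distance is just a bit count, matching thousands of descriptor pairs is very cheap, which is a key reason ORB is fast.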

This example covers the classical use of the ORB detector: finding keypoints in two images and matching them. Based on the computed homography, the images are then aligned and merged into a panorama, demonstrating how two images are combined into one.

Connecting the necessary packages for working with images

In [ ]:
import Pkg; 
Pkg.add("Images")
Pkg.add("ImageFeatures")
Pkg.add("ImageDraw")
Pkg.add("ImageProjectiveGeometry")
Pkg.add("LinearAlgebra")
Pkg.add("OffsetArrays")
Pkg.add("ImageTransformations")
Pkg.add("StaticArrays");
In [ ]:
using Pkg, Images, ImageFeatures, ImageDraw
using ImageProjectiveGeometry
using LinearAlgebra, OffsetArrays
using ImageTransformations, StaticArrays

Loading images

Specify the paths to the images, then load and display them:

In [ ]:
path_to_img_1 = "$(@__DIR__)/1.jpg"   
path_to_img_2 = "$(@__DIR__)/2.jpg"   

Image_1 = load(path_to_img_1)
Image_2 = load(path_to_img_2)
Out[0]:

ORB application

Initialise the detector

In [ ]:
orb_params = ORB(num_keypoints = 1000);

Let's define a function that uses the detector to find keypoints in an image

In [ ]:
function find_features(img::AbstractArray, orb_params)
    desc, ret_features = create_descriptor(Gray.(img), orb_params)
end;

Let's define a function that searches for common keypoints in two images. First, the keypoints and descriptors are computed for each image; then the descriptors are matched using a given threshold.

In [ ]:
function match_points(img1::AbstractArray, img2::AbstractArray, orb_params, threshold::Float64=0.1)
    desc_1, ret_features_1 = find_features(img1, orb_params)
    desc_2, ret_features_2 = find_features(img2, orb_params)
    matches = match_keypoints(ret_features_1, ret_features_2, desc_1, desc_2, threshold);
end;

Let's find the matching key points

In [ ]:
matches = match_points(Image_1, Image_2, orb_params, 0.35);

Let's define a function that visualises the keypoint correspondences. The pad_display function places the two images side by side on one canvas; draw_matches draws the lines connecting the matched keypoints.

In [ ]:
function pad_display(img1, img2)
    img1h = length(axes(img1, 1))
    img2h = length(axes(img2, 1))
    mx = max(img1h, img2h);

    hcat(vcat(img1, zeros(RGB{Float64},
                max(0, mx - img1h), length(axes(img1, 2)))),
        vcat(img2, zeros(RGB{Float64},
                max(0, mx - img2h), length(axes(img2, 2)))))
end

function draw_matches(img1, img2, matches)
    grid = pad_display(parent(img1), parent(img2));
    offset = CartesianIndex(0, size(img1, 2));
    for m in matches
        draw!(grid, LineSegment(m[1], m[2] + offset))
    end
    grid
end;

Let's look at the result

In [ ]:
draw_matches(Image_1, Image_2, matches)
Out[0]:

Building a panorama

Calculating homography (geometric transformation) requires the coordinates of points that correspond to each other in two images.

Here, the point coordinates for the two images are extracted from the list of matched keypoints matches: x1 holds the coordinates from the first image, x2 the coordinates from the second.

In [ ]:
x1 = hcat([Float64[m[1].I[1], m[1].I[2]] for m in matches]...)  # 2xN
x2 = hcat([Float64[m[2].I[1], m[2].I[2]] for m in matches]...);  # 2xN

Here, the RANSAC method is used to find the homography matrix H, which aligns the two groups of points x1 and x2 as closely as possible.

In [ ]:
t = 0.01

# Compute the homography with RANSAC
H, inliers = ransacfithomography(x1, x2, t);

t is a threshold that determines how much the points may differ after applying the homography and still be considered "correct" (inliers).
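As an illustration of what this threshold means, here is a simplified sketch of the inlier test RANSAC performs; the exact error measure inside ransacfithomography may differ (it works on normalised coordinates), so this is an assumption for illustration only:

```julia
using LinearAlgebra

# Simplified inlier test (illustrative): map p1 through a candidate
# homography and measure how far it lands from its match p2. The pair
# is an inlier when this error is below the threshold t.
function transfer_error(H::AbstractMatrix, p1::AbstractVector, p2::AbstractVector)
    q = H * [p1; 1.0]    # homogeneous mapping into image 2
    q = q[1:2] / q[3]    # projective normalisation
    norm(q - p2)         # Euclidean reprojection error
end

H_toy = [1.0 0 2; 0 1 -1; 0 0 1]  # toy homography: shift by (2, -1)
transfer_error(H_toy, [3.0, 4.0], [5.0, 3.0])  # → 0.0, so an inlier for any t > 0
```

RANSAC repeats this test for many randomly sampled candidate homographies and keeps the one with the most inliers.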

The Homography structure represents the homography matrix. It wraps a 3x3 matrix that defines the transformation between the two images.

In [ ]:
struct Homography{T}
    m::SMatrix{3, 3, T, 9}
end

Homography(m::AbstractMatrix{T}) where {T} = Homography{T}(SMatrix{3, 3, T, 9}(m))
function (trans::Homography{M})(x::SVector{3}) where M
    out = trans.m * x
    out = out / out[end]
    SVector{2}(out[1:2])
end
function (trans::Homography{M})(x::SVector{2}) where M
    trans(SVector{3}([x[1], x[2], 1.0]))
end
function (trans::Homography{M})(x::CartesianIndex{2}) where M
    trans(SVector{3}([collect(x.I)..., 1]))
end
function (trans::Homography{M})(x::Tuple{Int, Int}) where M
    trans(CartesianIndex(x))
end
function (trans::Homography{M})(x::Array{CartesianIndex{2}, 1}) where M
    CartesianIndex{2}.([tuple(y...) for y in trunc.(Int, collect.(trans.(x)))])
end
function Base.inv(trans::Homography{M}) where M
    i = inv(trans.m)
    Homography(i ./ i[end])
end
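To make the projective normalisation performed by these callables concrete, here is a standalone sketch (not reusing the struct above) that applies a toy translation homography to a single point in homogeneous coordinates:

```julia
using StaticArrays

# Toy homography: pure translation by (5, 10), chosen for illustration.
H = SMatrix{3,3}([1.0 0 5; 0 1 10; 0 0 1])

p = SVector(2.0, 3.0, 1.0)  # point (2, 3) in homogeneous coordinates
q = H * p
q = q / q[end]              # projective normalisation (divide by w)
q[1:2]                      # → [7.0, 13.0], the mapped 2D point
```

A general homography produces w ≠ 1, so the division by the last component is what turns the linear 3x3 result back into 2D image coordinates.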

Use the warp function to apply the homography matrix H to Image_1, producing the transformed image new_img.

In [ ]:
new_img = warp(Image_1, Homography(H))
Out[0]:

The function below combines the two images, the original and the transformed one, onto a common canvas, with the new image overlaid on top of the old one.

In [ ]:
function merge_images(img1, new_img)
    # Compute the canvas dimensions
    axis1_size = max(last(axes(new_img, 1)), size(img1, 1)) - min(first(axes(new_img, 1)), 1) + 1
    axis2_size = max(last(axes(new_img, 2)), size(img1, 2)) - min(first(axes(new_img, 2)), 1) + 1

    # Create an OffsetArray for the combined image
    combined_image = OffsetArray(
        zeros(RGB{N0f8}, axis1_size, axis2_size), (
            min(0, first(axes(new_img, 1))),
            min(0, first(axes(new_img, 2)))))

    # Copy the first image onto the common canvas
    combined_image[1:size(img1, 1), 1:size(img1, 2)] .= img1

    # Copy the second image onto the common canvas
    for i in axes(new_img, 1)
        for j in axes(new_img, 2)
            if new_img[i, j] != colorant"black"  # Skip black pixels
                combined_image[i, j] = new_img[i, j]
            end
        end
    end

    combined_image
end
Out[0]:
merge_images (generic function with 1 method)
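A short standalone sketch of why OffsetArrays is needed here: after warping, an image can occupy axes that start at negative indices, and an OffsetArray lets the combined canvas keep those indices so pixels from both images land at their true positions.

```julia
using OffsetArrays

# A 3x3 canvas whose axes run from -1 to 1 instead of 1 to 3.
A = OffsetArray(zeros(Int, 3, 3), -1:1, -1:1)

A[-1, -1] = 7        # valid: the first axis starts at -1
first(axes(A, 1))    # → -1
```

This is why merge_images inspects first(axes(new_img, 1)) rather than assuming the warped image starts at index 1.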
In [ ]:
panorama = merge_images(Image_1, new_img)
Out[0]:

Conclusions

This example demonstrated the use of the ORB detector to automatically find and match keypoints in two images. The computed homography allowed the images to be aligned, and merging them produced a panorama.