Types and transformation of images

This demonstration shows how images are represented and introduces the basic principles of image transformation, with an emphasis on affine transformations.

In [ ]:
import Pkg
Pkg.add(["TestImages", "ImageShow"])
In [ ]:
using Images # Image processing library
using ImageShow # Image display library
using TestImages # Test image library

Types of colour spaces

An image is simply an array of pixel objects. Julia Images treats pixels as first-class objects, so each element of the array carries its own colour type: for example, Gray pixels for grayscale images, RGB pixels for colour images, and Lab pixels for images in the Lab colour space.

Let's start with the RGB format (an abbreviation of red, green, blue). RGB is an additive colour model that encodes a colour as a mixture of three colours, commonly called the primary colours. The choice of primaries follows from the physiology of colour perception by the retina of the human eye.

In [ ]:
img_rgb = [RGB(1.0, 0.0, 0.0), RGB(0.0, 1.0, 0.0), RGB(0.0, 0.0, 1.0)]
In [ ]:
dump(img_rgb)
Array{RGB{Float64}}((3,))
  1: RGB{Float64}
    r: Float64 1.0
    g: Float64 0.0
    b: Float64 0.0
  2: RGB{Float64}
    r: Float64 0.0
    g: Float64 1.0
    b: Float64 0.0
  3: RGB{Float64}
    r: Float64 0.0
    g: Float64 0.0
    b: Float64 1.0
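
To make the additive model concrete, here is a minimal sketch that reads the components of the three primaries and mixes two of them. It assumes that `using Images` makes `RGB`, `red`, `green` and `blue` available; the variable names `primaries`, `components` and `yellow` are illustrative only.

```julia
using Images  # assumed to re-export RGB, red, green and blue

# The three primaries from the cell above
primaries = [RGB(1.0, 0.0, 0.0), RGB(0.0, 1.0, 0.0), RGB(0.0, 0.0, 1.0)]

# Read the individual components of each pixel as a tuple
components = [(red(c), green(c), blue(c)) for c in primaries]

# Additive mixing: summing the components of red and green gives yellow
yellow = RGB(red(primaries[1]) + red(primaries[2]),
             green(primaries[1]) + green(primaries[2]),
             blue(primaries[1]) + blue(primaries[2]))
```

The components are summed by hand here so that the sketch does not depend on colour arithmetic being defined for RGB values.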

Gray is a single-channel pixel type that describes images in shades of grey. Images loaded from a file typically use 8-bit encoding, whereas rand(Gray, ...) produces Float64 values, as the dump below shows.

In [ ]:
img_gray = rand(Gray, 3, 3)
In [ ]:
dump(img_gray)
Array{Gray{Float64}}((3, 3))
  1: Gray{Float64}
    val: Float64 0.20548107200013277
  2: Gray{Float64}
    val: Float64 0.7190843096024139
  3: Gray{Float64}
    val: Float64 0.7675684128936745
  4: Gray{Float64}
    val: Float64 0.4840881004197154
  5: Gray{Float64}
    val: Float64 0.21892864471268947
  6: Gray{Float64}
    val: Float64 0.5776559337062981
  7: Gray{Float64}
    val: Float64 0.855915975791712
  8: Gray{Float64}
    val: Float64 0.7152103286181891
  9: Gray{Float64}
    val: Float64 0.9976538325486705

Lab is an abbreviation used for two different (though similar) colour spaces. The better known and more common one is CIELAB (more precisely, CIE 1976 L*a*b*); the other is Hunter Lab (more precisely, Hunter L, a, b). Lab is thus an informal abbreviation that does not unambiguously identify a colour space. In Engee, Lab refers to CIELAB.

In [ ]:
img_lab = rand(Lab, 3, 3)
In [ ]:
dump(img_lab)

Conversion between pixel types

In [ ]:
Gray.(img_rgb) # RGB => Gray
In [ ]:
RGB.(img_gray) # Gray => RGB
In [ ]:
RGB.(img_lab) # Lab => RGB
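
The RGB to Gray conversion collapses three channels into one brightness value. The sketch below compares the library result with a hand-computed weighted sum. The weights 0.299, 0.587 and 0.114 (Rec. 601 luma) are an assumption about how the conversion is implemented, which is why the comparison uses a tolerance; `c`, `manual` and `converted` are illustrative names.

```julia
using Images

c = RGB(0.2, 0.6, 0.4)

# Weighted sum of the channels (Rec. 601 luma weights, assumed to match the library)
manual = 0.299 * red(c) + 0.587 * green(c) + 0.114 * blue(c)

# gray() extracts the numeric value stored in a Gray pixel
converted = gray(Gray(c))

isapprox(manual, converted; atol = 1e-3)
```

Green contributes most to the result because the eye is most sensitive to it, so equal RGB components do not contribute equally to perceived brightness.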

Image transformation

Let's start by loading an image from a .jpg file.

In [ ]:
img = load( "$(@__DIR__)/4028965.jpg" )

Let's increase the contrast of the loaded image. The adjust_histogram(img, Equalization(...)) function accepts different types of input data, and the returned image has the same type as the input. For colour images, the input is converted to the YIQ colour space, the Y channel is equalised, and the result is recombined with the I and Q channels.

In [ ]:
alg = Equalization(nbins = 256)
img_adjusted = adjust_histogram(img, alg)
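
The idea behind histogram equalisation can be sketched in plain Julia without any packages. This is a simplified illustration for integer intensities in the range 0 to nbins-1, not the implementation used by adjust_histogram; the function name `equalize_sketch` and the sample matrix are hypothetical.

```julia
# Simplified histogram equalisation for an integer-valued grayscale matrix
function equalize_sketch(img::AbstractMatrix{<:Integer}; nbins::Int = 256)
    hist = zeros(Int, nbins)
    for v in img
        hist[v + 1] += 1                 # intensities 0..nbins-1 map to bins 1..nbins
    end
    cdf = cumsum(hist) ./ length(img)    # cumulative distribution, ends at 1.0
    # Map each intensity through the CDF and rescale to 0..nbins-1
    return [round(Int, cdf[v + 1] * (nbins - 1)) for v in img]
end

low_contrast = [100 101; 102 103]        # four nearly identical intensities
equalized = equalize_sketch(low_contrast)
```

The four input values span only 3 intensity levels, while the output values are spread across most of the 0 to 255 range, which is the contrast gain the method is meant to provide.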

Now reduce the image to a quarter of its original size along each dimension. imresize can scale by a ratio relative to the source image, as in the example below, or to an explicit output size, for example imresize(img, (400, 400)).

In [ ]:
img_small = imresize(img_adjusted, ratio=1/4)
In [ ]:
print(size(img_adjusted), " --> ", size(img_small))
(1148, 1243) --> (287, 311)
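
The two calling styles of imresize can be compared on a small random image. This is a minimal sketch; `img_demo`, `by_ratio` and `by_size` are illustrative names, and the 8×12 size is chosen only so that the ratio divides evenly.

```julia
using Images

img_demo = rand(Gray, 8, 12)                 # hypothetical 8×12 test image

by_ratio = imresize(img_demo, ratio = 1/4)   # scale both dimensions by 1/4
by_size  = imresize(img_demo, (4, 6))        # request an explicit output size

size(by_ratio), size(by_size)
```

In the output above, 1243 / 4 = 310.75 became 311, which suggests that fractional sizes are rounded up when a ratio is used.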

An affine transformation (from the Latin affinis, "touching, close, adjacent") is a mapping of a plane or space onto itself under which parallel lines remain parallel, intersecting lines remain intersecting, and skew lines remain skew. Basic image transformations operate on a grid of pixel indices. The transformation is defined by a transformation matrix, following the principle illustrated in the accompanying figure (image.png).
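
The property that parallel lines stay parallel can be checked numerically. The sketch below applies a 2×3 affine matrix to points in homogeneous coordinates; the matrix `theta_demo`, the helper `apply_affine` and the sample points are hypothetical and chosen only for illustration.

```julia
# Apply a 2×3 affine matrix to a 2D point using homogeneous coordinates
apply_affine(theta, p) = theta * [p[1], p[2], 1.0]

theta_demo = [2.0 0.3 0.5; -0.3 2.0 -1.0]   # hypothetical matrix with a translation column

# Two parallel segments: same direction vector, different origins
a0, a1 = [0.0, 0.0], [1.0, 1.0]
b0, b1 = [0.0, 2.0], [1.0, 3.0]

# Direction vectors of the transformed segments
da = apply_affine(theta_demo, a1) - apply_affine(theta_demo, a0)
db = apply_affine(theta_demo, b1) - apply_affine(theta_demo, b0)

# A zero 2D cross product means the transformed directions are still parallel
cross2d = da[1] * db[2] - da[2] * db[1]
```

The translation column cancels out when the two endpoints are subtracted, so only the 2×2 linear part affects directions.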

In [ ]:
# Helper function that keeps indices within the valid range [1, max_val - 1]
function C_B_V(x, max_val)
    x[x .> max_val - 1] .= max_val - 1
    x[x .< 1] .= 1
    return x
end
Out[0]:
C_B_V (generic function with 1 method)
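
For comparison, the same clipping can be written with the built-in clamp function. The sketch repeats the helper so that it runs on its own; `idx`, `with_helper` and `with_clamp` are illustrative names.

```julia
# Same helper as above, repeated so the snippet is self-contained
function C_B_V(x, max_val)
    x[x .> max_val - 1] .= max_val - 1
    x[x .< 1] .= 1
    return x
end

idx = [-3.0, 0.0, 1.0, 5.0, 9.0, 12.0]

with_helper = C_B_V(copy(idx), 10)     # copy, because the helper mutates its argument
with_clamp  = clamp.(idx, 1, 10 - 1)   # non-mutating equivalent from Base

with_helper == with_clamp
```

Note that the helper modifies the array passed to it, while the broadcast clamp returns a new array and leaves its input unchanged.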

Next, let us declare an affine image transformation function in which:

  1. theta is the 2×3 transformation matrix;
  2. img is the input image;
  3. out_size is the dimensions of the output image;
  4. grid is the internal pixel indexing grid, built in normalised coordinates from -1 to 1.
In [ ]:
function transform(theta, img, out_size)
    # Build the sampling grid in homogeneous coordinates, normalised to [-1, 1]
    grid = zeros(3, out_size[1]*out_size[2])
    grid[1, :] = reshape(((-1:2/(out_size[1]-1):1)*ones(1,out_size[2])), 1, size(grid,2))
    grid[2, :] = reshape((ones(out_size[1],1)*(-1:2/(out_size[2]-1):1)'), 1, size(grid,2))
    grid[3, :] = ones(Int, size(grid, 2))

    # Multiply theta by the grid
    T_g = theta * grid

    # Compute the x, y coordinates
    x = (T_g[1, :] .+ 1) .* (out_size[2]) / 2
    y = (T_g[2, :] .+ 1) .* (out_size[1]) / 2

    # Round the coordinates to neighbouring integer positions
    x0 = ceil.(x)
    x1 = x0 .+ 1
    y0 = ceil.(y)
    y1 = y0 .+ 1

    # Clip x0, x1, y0, y1 to the valid range
    x0 = C_B_V(x0, out_size[2])
    x1 = C_B_V(x1, out_size[2])
    y0 = C_B_V(y0, out_size[1])
    y1 = C_B_V(y1, out_size[1])

    # Compute the base offsets into the flattened image
    base_y0 = y0 .* out_size[1]
    base_y1 = y1 .* out_size[1]

    # Flatten the image into a vector
    im_flat = reshape(img, :)

    # Weight the four neighbouring pixels
    A = (x1 .- x) .* (y1 .- y) .* im_flat[Int.(base_y0 .+ x0 .+ 1)]
    B = (x1 .- x) .* (y .- y0) .* im_flat[Int.(base_y1 .+ x0 .+ 1)]
    C = (x .- x0) .* (y1 .- y) .* im_flat[Int.(base_y0 .+ x1 .+ 1)]
    D = (x .- x0) .* (y .- y0) .* im_flat[Int.(base_y1 .+ x1 .+ 1)]

    # Assemble the result
    result = reshape((A .+ B .+ C .+ D), (out_size[1], out_size[2]))
    return result
end
Out[0]:
transform (generic function with 1 method)
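
The A, B, C, D terms in the function combine four neighbouring pixels with distance-based weights. For reference, the sketch below shows the textbook form of bilinear interpolation at a single fractional position. It is not identical to the function above, which uses ceil for rounding and clips indices; the name `bilinear_sketch` and the sample `patch` are hypothetical.

```julia
# Textbook bilinear interpolation at a fractional position (r, c) of a matrix.
# Assumes 1 <= r < size(img, 1) and 1 <= c < size(img, 2).
function bilinear_sketch(img::AbstractMatrix, r::Real, c::Real)
    r0, c0 = floor(Int, r), floor(Int, c)    # upper-left neighbour
    r1, c1 = r0 + 1, c0 + 1                  # lower-right neighbour
    wr, wc = r - r0, c - c0                  # fractional offsets in [0, 1)
    return (1 - wr) * (1 - wc) * img[r0, c0] +
           (1 - wr) * wc       * img[r0, c1] +
           wr       * (1 - wc) * img[r1, c0] +
           wr       * wc       * img[r1, c1]
end

patch = [0.0 1.0; 2.0 3.0]
center = bilinear_sketch(patch, 1.5, 1.5)    # equal weights on all four pixels
```

At the exact centre of the patch each neighbour gets a weight of 0.25, so the result is the mean of the four values.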

First, let's apply this function to a greyscale image.

In [ ]:
img_sg = Gray.(img_small)

As the output below shows, each pixel of the grayscale image is stored as an 8-bit value (N0f8), and the image array has only two dimensions: height and width.

In [ ]:
dump(img_sg[1])
Gray{N0f8}
  val: N0f8
    i: UInt8 0x0e
In [ ]:
size(img_sg)
Out[0]:
(287, 311)
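
The N0f8 type stores a raw byte and interprets it as a fraction of 255. The sketch below decodes the byte 0x0e shown in the dump above. It assumes that `using Images` makes `N0f8` available; `raw`, `as_fixed` and `as_float` are illustrative names.

```julia
using Images  # assumed to re-export N0f8

raw = 0x0e                          # the raw byte shown in the dump above
as_fixed = reinterpret(N0f8, raw)   # same bits, read as a fraction of 255
as_float = Float64(as_fixed)        # 14 / 255

isapprox(as_float, 14 / 255; atol = 1e-6)
```

This encoding keeps the memory footprint of one byte per pixel while letting the value behave like a number between 0 and 1.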

Let's set the transformation matrix for this image.

In [ ]:
theta = [2 0.3 0; -0.3 2 0]
Out[0]:
2×3 Matrix{Float64}:
  2.0  0.3  0.0
 -0.3  2.0  0.0

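Let's apply our function to the image. The picture is shrunk to roughly half its size and rotated, because theta maps output grid coordinates to source coordinates: a scale factor of about 2 in theta samples the source twice as coarsely.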

In [ ]:
img_transfor = transform(theta, img_sg, [size(img_sg,1),size(img_sg,2)])
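
The scale and rotation encoded in theta can be read off its 2×2 linear part. This is a small sketch using only Base functions; it treats the linear part as a uniform scale combined with a rotation, which holds for this particular matrix, and the names `A`, `scale` and `angle_deg` are illustrative.

```julia
theta = [2 0.3 0; -0.3 2 0]
A = theta[:, 1:2]                              # linear part without the translation column

scale = sqrt(A[1, 1]^2 + A[2, 1]^2)            # length of the first column, about 2.02
angle_deg = rad2deg(atan(A[1, 2], A[1, 1]))    # rotation magnitude, about 8.53 degrees
```

A scale factor of about 2 applied to the sampling grid explains why the picture appears at roughly half size.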

To obtain the inverse transformation, we invert the 2×2 linear part of the transformation matrix, append a small translation column, and round the result to four decimal places.

In [ ]:
theta_inv = hcat(inv(theta[1:2,1:2]), [-0.1;0.1])
theta_inv = round.(theta_inv.*10^4)./10^4
Out[0]:
2×3 Matrix{Float64}:
 0.489   -0.0733  -0.1
 0.0733   0.489    0.1
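
Rounding the inverse introduces a small error, which can be estimated by multiplying the rounded inverse by the original linear part and comparing with the identity matrix. This is a minimal sketch using the LinearAlgebra standard library; the variable names are illustrative, and `round.(…, digits = 4)` is used here as an equivalent of the scaling trick in the cell above.

```julia
using LinearAlgebra  # standard library, provides the identity I

A = [2 0.3; -0.3 2]
A_inv = inv(A)
A_inv_rounded = round.(A_inv, digits = 4)

# Largest deviation from the identity matrix in each case
exact_error   = maximum(abs.(A_inv * A - I))
rounded_error = maximum(abs.(A_inv_rounded * A - I))
```

The rounded inverse reproduces the identity only approximately, so the restored image is close to the original rather than an exact copy; interpolation and the added translation contribute further differences.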
In [ ]:
img_sg_new = transform(theta_inv, img_transfor, [size(img_transfor,1),size(img_transfor,2)])

Now let's apply this function to an RGB image. First, we convert the RGB image to its channel representation, and then look at what this representation makes possible.

In [ ]:
img_CHW = channelview(img_small);
print(size(img_small), " --> ", size(img_CHW))
(287, 311) --> (3, 287, 311)
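
The channel representation and its inverse can be tried on a tiny image. This sketch assumes that `channelview` and `colorview` are available after `using Images`; the 2×2 image `img_demo` and the other names are hypothetical.

```julia
using Images

img_demo = [RGB(1.0, 0.0, 0.0) RGB(0.0, 1.0, 0.0);
            RGB(0.0, 0.0, 1.0) RGB(1.0, 1.0, 1.0)]

chw = channelview(img_demo)        # 3×2×2 numeric view: channel, row, column
red_plane = chw[1, :, :]           # red component of every pixel
restored = colorview(RGB, chw)     # back to a 2×2 array of RGB pixels

size(chw), restored == img_demo
```

The channel index comes first, so each colour plane is obtained by fixing the first index and taking all rows and columns.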

Select the red channel of the image and display it on its own.

In [ ]:
RGB.(img_CHW[1,:,:], 0.0, 0.0) # red

If all three channels carry the same values, the result is an image in shades of grey, since none of the primary colours dominates over the others.

In [ ]:
RGB.(img_CHW[1,:,:], img_CHW[1,:,:], img_CHW[1,:,:]) # Gray

Building on the channel representation, we can implement an affine transformation for an RGB image by passing each of its three channels through our function as a separate two-dimensional matrix and then combining the results.

In [ ]:
img_CHW_new = zeros(size(img_CHW))

for i in 1:size(img_CHW,1)
   img_CHW_new[i,:,:] = transform(theta, img_CHW[i,:,:], [size(img_CHW,2),size(img_CHW,3)])
end 

RGB.(img_CHW_new[1,:,:], img_CHW_new[2,:,:], img_CHW_new[3,:,:])

Conclusion

In this demonstration we looked at several pixel types and at the channel representation of images, and used them to explore some of the image processing capabilities available in Engee, including affine transformation.