Types and transformation of images

This demonstration shows how images can be represented and introduces the basic principles of image transformation, with an emphasis on affine transformations.

import Pkg
Pkg.add(["TestImages", "ImageShow"])

using Images # Image processing library
using ImageShow # Image display library
using TestImages # Test image library

Types of colour spaces

An image is simply an array of pixel objects. Julia Images treats pixels as first-class objects with their own types: for example, Gray pixels for grayscale images, RGB pixels for colour images, and Lab pixels for the Lab colour space.

Let's start with the RGB format (red, green, blue), an additive colour model that encodes a colour as a combination of three colours, commonly called primary colours. The choice of primary colours follows from the physiology of colour perception in the retina of the human eye.

img_rgb = [RGB(1.0, 0.0, 0.0), RGB(0.0, 1.0, 0.0), RGB(0.0, 0.0, 1.0)]

dump(img_rgb)

Array{RGB{Float64}}((3,))
  1: RGB{Float64}
    r: Float64 1.0
    g: Float64 0.0
    b: Float64 0.0
  2: RGB{Float64}
    r: Float64 0.0
    g: Float64 1.0
    b: Float64 0.0
  3: RGB{Float64}
    r: Float64 0.0
    g: Float64 0.0
    b: Float64 1.0

Gray is a single-channel pixel type that describes an image in shades of grey. Images loaded from files typically use 8-bit storage per pixel, whereas the random sample below is generated with Float64 values.

img_gray = rand(Gray, 3, 3)

dump(img_gray)

Array{Gray{Float64}}((3, 3))
  1: Gray{Float64}
    val: Float64 0.20548107200013277
  2: Gray{Float64}
    val: Float64 0.7190843096024139
  3: Gray{Float64}
    val: Float64 0.7675684128936745
  4: Gray{Float64}
    val: Float64 0.4840881004197154
  5: Gray{Float64}
    val: Float64 0.21892864471268947
  6: Gray{Float64}
    val: Float64 0.5776559337062981
  7: Gray{Float64}
    val: Float64 0.855915975791712
  8: Gray{Float64}
    val: Float64 0.7152103286181891
  9: Gray{Float64}
    val: Float64 0.9976538325486705

Lab is an abbreviation used for two different (though similar) colour spaces. The better known and more common one is CIELAB (more precisely, CIE 1976 L*a*b*); the other is Hunter Lab (more precisely, Hunter L, a, b). Lab is therefore an informal name that does not unambiguously identify a colour space. In Engee, Lab refers to CIELAB.

img_lab = rand(Lab, 3, 3)

dump(img_lab)

Conversion between object types

Gray.(img_rgb) # RGB => Gray

RGB.(img_gray) # Gray => RGB

RGB.(img_lab) # Lab => RGB
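To make the RGB => Gray conversion less of a black box, here is a minimal sketch of a weighted-sum luminance conversion. It assumes the Rec. 601 luma weights (0.299, 0.587, 0.114), which, as far as we know, are the ones Colors.jl applies; the function name luma601 is hypothetical and not part of any package.

```julia
# Hypothetical helper (not part of Images.jl): weighted-sum grayscale conversion.
# Assumption: Rec. 601 luma weights.
luma601(r, g, b) = 0.299 * r + 0.587 * g + 0.114 * b

# The three primaries from img_rgb map to three different grey levels
primaries = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
grays = [luma601(p...) for p in primaries]
println(grays)
```

Under this assumption, pure green comes out brighter than pure red or pure blue, which matches how the eye perceives the three primaries.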

Image transformation

Let's start by loading an image from a .jpg file.

img = load(joinpath(@__DIR__, "4028965.jpg"))

Let's increase the contrast of the loaded image. The adjust_histogram function with the Equalization() algorithm accepts different types of input data, and the returned image has the same type as the input. For colour images, the input is converted to YIQ, the Y channel is equalised, and the result is recombined with the I and Q channels.

alg = Equalization(nbins = 256)
img_adjusted = adjust_histogram(img, alg)
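To illustrate the idea behind equalisation, here is a minimal sketch for a grayscale image stored as Float64 values in [0, 1]. The function name equalize_sketch is hypothetical, and the code is a simplified illustration of the principle, not the algorithm that adjust_histogram runs internally.

```julia
# Minimal sketch of histogram equalisation for a grayscale image stored as
# Float64 values in [0, 1]. This is NOT the library implementation.
function equalize_sketch(img::AbstractMatrix{<:Real}; nbins::Int = 256)
    # Map every intensity to a bin index in 1:nbins
    bins = clamp.(floor.(Int, img .* nbins) .+ 1, 1, nbins)

    # Histogram: count the pixels that fall into each bin
    hist = zeros(Int, nbins)
    for b in bins
        hist[b] += 1
    end

    # Normalised cumulative distribution function, ending at 1.0
    cdf = cumsum(hist) ./ length(img)

    # Replace each pixel by the CDF value of its bin
    return cdf[bins]
end

low_contrast = [0.40 0.42; 0.44 0.46]   # values squeezed into a narrow band
equalized = equalize_sketch(low_contrast)
println(extrema(low_contrast), " --> ", extrema(equalized))
```

The narrow band of input intensities is spread over a much wider range, which is what we perceive as increased contrast.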

Now reduce the image to a quarter of its original size along each dimension. imresize accepts either a ratio relative to the source image, as in the example below, or explicit dimensions for the new image, for example:

imresize(img, (400, 400))

img_small = imresize(img_adjusted, ratio=1/4)

print(size(img_adjusted), " --> ", size(img_small))

(1148, 1243) --> (287, 311)

An affine transformation (from Latin affinis, "touching, close, adjacent") is a mapping of a plane or space into itself under which parallel lines remain parallel, intersecting lines remain intersecting, and skew lines remain skew. Basic image transformations operate on a grid of pixel indices. The transformation is defined by a transformation matrix, according to the principle shown in the picture below.
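The line-preserving property can be checked numerically. The following is a minimal sketch that applies a 2×3 affine matrix to points written in homogeneous form; the matrix values and the helper names apply_affine and cross2 are chosen for illustration only.

```julia
# A 2×3 affine matrix acts on points written in homogeneous form [x, y, 1]
A = [2.0 0.3 0.5;
     -0.3 2.0 -1.0]   # illustrative values, chosen for this sketch

apply_affine(A, p) = A * [p[1], p[2], 1.0]

# Two parallel segments: same direction, different starting points
d = [1.0, 2.0]
p1, p2 = [0.0, 0.0], [3.0, -1.0]
d1 = apply_affine(A, p1 .+ d) .- apply_affine(A, p1)
d2 = apply_affine(A, p2 .+ d) .- apply_affine(A, p2)

# The 2D cross product of the mapped directions is zero when they are parallel
cross2(u, v) = u[1] * v[2] - u[2] * v[1]
println(cross2(d1, d2))
```

The translation column shifts both segments but cancels out of the direction vectors, so the mapped segments stay parallel.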

# Helper function that clips indices to the valid range
function C_B_V(x, max_val)
    x[x .> max_val - 1] .= max_val - 1
    x[x .< 1] .= 1
    return x
end

C_B_V (generic function with 1 method)

Next, let us declare an affine image transformation function in which:

theta is the 2×3 transformation matrix;
img is the input image;
out_size is the dimensions of the output image;
grid is the grid of normalised pixel coordinates built inside the function.

function transform(theta, img, out_size)
    # Grid of normalised coordinates in homogeneous form (3 × number of pixels)
    grid = zeros(3, out_size[1]*out_size[2])
    grid[1, :] = reshape(((-1:2/(out_size[1]-1):1)*ones(1,out_size[2])), 1, size(grid,2))
    grid[2, :] = reshape((ones(out_size[1],1)*(-1:2/(out_size[2]-1):1)'), 1, size(grid,2))
    grid[3, :] = ones(Int, size(grid, 2))

    # Multiply theta by the grid
    T_g = theta * grid

    # Compute the x and y coordinates
    x = (T_g[1, :] .+ 1) .* (out_size[2]) / 2
    y = (T_g[2, :] .+ 1) .* (out_size[1]) / 2

    # Round the coordinates
    x0 = ceil.(x)
    x1 = x0 .+ 1
    y0 = ceil.(y)
    y1 = y0 .+ 1

    # Clip x0, x1, y0, y1 to the valid range
    x0 = C_B_V(x0, out_size[2])
    x1 = C_B_V(x1, out_size[2])
    y0 = C_B_V(y0, out_size[1])
    y1 = C_B_V(y1, out_size[1])

    # Compute the base offsets
    base_y0 = y0 .* out_size[1]
    base_y1 = y1 .* out_size[1]

    # Flatten the image into a vector
    im_flat = reshape(img, :)

    # Weight the four neighbouring pixels
    A = (x1 .- x) .* (y1 .- y) .* im_flat[Int.(base_y0 .+ x0 .+ 1)]
    B = (x1 .- x) .* (y .- y0) .* im_flat[Int.(base_y1 .+ x0 .+ 1)]
    C = (x .- x0) .* (y1 .- y) .* im_flat[Int.(base_y0 .+ x1 .+ 1)]
    D = (x .- x0) .* (y .- y0) .* im_flat[Int.(base_y1 .+ x1 .+ 1)]

    # Assemble the result
    result = reshape((A .+ B .+ C .+ D), (out_size[1], out_size[2]))
    return result
end

transform (generic function with 1 method)
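The four weighted terms A, B, C and D in the function above follow the pattern of bilinear interpolation. As a standalone illustration, here is a minimal sketch of textbook bilinear sampling at a single fractional position. The function name bilinear_sample is hypothetical, and the sketch uses floor for the lower neighbour, so it is not a line-by-line restatement of transform.

```julia
# Hypothetical helper: sample a matrix at fractional (row, col) coordinates
# using textbook bilinear interpolation (1-based indices, edges clamped).
# Assumes the matrix has at least 2 rows and 2 columns.
function bilinear_sample(m::AbstractMatrix{<:Real}, r::Real, c::Real)
    h, w = size(m)
    # Lower neighbour, clamped so that the upper neighbour stays in bounds
    r0 = clamp(floor(Int, r), 1, h - 1)
    c0 = clamp(floor(Int, c), 1, w - 1)
    r1, c1 = r0 + 1, c0 + 1
    # Fractional offsets inside the cell, limited to [0, 1]
    fr = clamp(r - r0, 0.0, 1.0)
    fc = clamp(c - c0, 0.0, 1.0)
    # Weighted sum of the four surrounding pixels
    return (1 - fr) * (1 - fc) * m[r0, c0] +
           fr * (1 - fc) * m[r1, c0] +
           (1 - fr) * fc * m[r0, c1] +
           fr * fc * m[r1, c1]
end

patch = [0.0 1.0;
         2.0 3.0]
println(bilinear_sample(patch, 1.5, 1.5))   # sample at the centre of the four pixels
```

The four weights always sum to 1, so a constant image stays constant after resampling.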

First, let's apply this function to a greyscale image.

img_sg = Gray.(img_small)

As the output below shows, the grayscale image stores each pixel with 8 bits, and its dimensions are just height and width.

dump(img_sg[1])

Gray{N0f8}
  val: N0f8
    i: UInt8 0x0e

size(img_sg)

(287, 311)

Let's set the transformation matrix for this image.

theta = [2 0.3 0; -0.3 2 0]

2×3 Matrix{Float64}:
  2.0  0.3  0.0
 -0.3  2.0  0.0
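To see what this matrix does geometrically, we can split its 2×2 linear part into a uniform scale and a rotation angle. This is a minimal sketch that assumes the linear part has the form s * [cos(a) sin(a); -sin(a) cos(a)], which holds for the values chosen here; the variable names lin, s and a are illustrative.

```julia
theta = [2 0.3 0; -0.3 2 0]

# Linear (2×2) part of the affine matrix
lin = theta[1:2, 1:2]

# For a matrix of the form s * [cos(a) sin(a); -sin(a) cos(a)]
# the scale is the length of the first row and the angle follows from atan
s = hypot(lin[1, 1], lin[1, 2])
a = atan(lin[1, 2], lin[1, 1])

println("scale ≈ ", round(s; digits = 3), ", angle ≈ ", round(rad2deg(a); digits = 2), "°")
```

Because transform multiplies theta by the output grid to obtain sampling positions in the source, a scale of about 2 means the source appears at roughly half its size in the output.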

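Let's apply our function to the image. Since theta maps output coordinates to sampling positions in the source, the scale factor of about 2 makes the image appear at roughly half its size, and the off-diagonal terms rotate it.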

img_transfor = transform(theta, img_sg, [size(img_sg,1),size(img_sg,2)])

To obtain the inverse transformation, we invert the 2×2 linear part of the transformation matrix, append a small translation column [-0.1; 0.1], and round the result to four decimal places.

theta_inv = hcat(inv(theta[1:2,1:2]), [-0.1;0.1])
theta_inv = round.(theta_inv; digits=4)

2×3 Matrix{Float64}:
 0.489   -0.0733  -0.1
 0.0733   0.489    0.1
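As a quick sanity check, the rounded inverse can be multiplied by the original linear part; the product should be close to the identity matrix. This is a minimal sketch that looks only at the 2×2 linear parts and leaves out the translation column; the names linear_inv and product are illustrative.

```julia
using LinearAlgebra

theta = [2 0.3 0; -0.3 2 0]
linear_inv = round.(inv(theta[1:2, 1:2]); digits = 4)

# Composing the rounded inverse with the original linear part should give
# a matrix close to the identity; rounding leaves a small residual
product = linear_inv * theta[1:2, 1:2]
println(product)
```

The residual comes only from rounding to four decimal places, so for display purposes the round trip is very close to exact.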

img_sg_new = transform(theta_inv, img_transfor, [size(img_transfor,1),size(img_transfor,2)])

Now let's apply this function to an RGB image. First, we convert the RGB image to its channel representation and look at what this representation makes possible.

img_CHW = channelview(img_small);
print(size(img_small), " --> ", size(img_CHW))

(287, 311) --> (3, 287, 311)

Select the red channel of the image and draw only the red channel.

RGB.(img_CHW[1,:,:], 0.0, 0.0) # red

If all three channels contain the same values, the result is an image in shades of grey, since none of the primary colours dominates the others.

RGB.(img_CHW[1,:,:], img_CHW[1,:,:], img_CHW[1,:,:]) # Gray

Building on the channel representation, we can implement the affine transformation for an RGB image by passing each channel through our function as a separate two-dimensional matrix and then combining the results.

img_CHW_new = zeros(size(img_CHW))

for i in 1:size(img_CHW,1)
    img_CHW_new[i,:,:] = transform(theta, img_CHW[i,:,:], [size(img_CHW,2),size(img_CHW,3)])
end

RGB.(img_CHW_new[1,:,:], img_CHW_new[2,:,:], img_CHW_new[3,:,:])
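Instead of passing the three channels to RGB one by one, the channel array can also be reinterpreted as a colour image with colorview, the counterpart of channelview. The sketch below uses a small random array as a stand-in for img_CHW_new and assumes the Images package is available; the variable names are illustrative.

```julia
using Images

# Small random stand-in for img_CHW_new: 3 channels × 4 rows × 5 columns
chw = rand(3, 4, 5)

# colorview interprets the first dimension as the colour channels
img_from_view = colorview(RGB, chw)

# Element-wise construction, as in the channel-by-channel example
img_from_broadcast = RGB.(chw[1, :, :], chw[2, :, :], chw[3, :, :])

println(size(img_from_view), " ", img_from_view == img_from_broadcast)
```

Both approaches describe the same pixels; colorview avoids slicing each channel into a separate copy first.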

Conclusion

In this demonstration we have looked at colour types and channel representations of images, and explored some of the image processing capabilities available in Engee.