Engee documentation

Technical background

Introduction

An image filter can be represented by a function

where (i = 1,2). It is common to define and , where and are integers, which ensures that the filter dimensions are of odd size. Typically, equals and so, dropping the subscripts, one speaks of a filter. Since the domain of the filter represents a grid of spatial coordinates, the filter is often called a mask and is visualized as a grid.

For example, a mask can be portrayed as follows:

The values of are referred to as filter coefficients.

Convolution versus correlation

There are two fundamental and closely related operations that one regularly performs on an image with a filter. The operations are called discrete correlation and convolution.

The correlation operation, denoted by the symbol , is given in two dimensions by the expression

whereas the comparable convolution operation, denoted by the symbol , is given in two dimensions by

Since a digital image is of finite extent, both of these operations are undefined at the borders of the image. In particular, for an image of size , the function is only defined for and . In practice one addresses this problem by artificially expanding the domain of the image. For example, one can pad the image with zeros. Other padding strategies are possible, and they are discussed in more detail in the Options section of this documentation.

One-dimensional illustration

The difference between correlation and convolution is best understood with recourse to a one-dimensional example adapted from [1]. Suppose that a filter has coefficients

Consider a discrete unit impulse function that has been padded with zeros. The function can be visualised as an image

The correlation operation can be interpreted as sliding along the image and computing the sum of products at each location. For example,

yields the output , which when visualized as a digital image, is equal to

The interpretation of the convolution operation is analogous to correlation, except that the filter has been rotated by 180 degrees. In particular,

yields the output equal to

Instead of rotating the filter mask, one could instead rotate and still obtained the same convolution result. In fact, the conventional notation for convolution indicates that is flipped and not . If is symmetric, then convolution and correlation give the same outcome.

Two-dimensional illustration

For a two-dimensional example, suppose the filter has coefficients

and consider a two-dimensional discrete unit impulse function

that has been padded with zeros:

The correlation operation yields the output

whereas the convolution operation produces

Convolution and correlation as matrix multiplication

Discrete convolution and correlation operations can also be formulated as a matrix multiplication, where one of the inputs is converted to a Toeplitz matrix, and the other is represented as a column vector. For example, consider a function and a filter . Then the matrix multiplication

is equivalent to the convolution assuming that the border of has been padded with zeros.

To represent multidimensional convolution as matrix multiplication one reshapes the multidimensional arrays into column vectors and proceeds in an analogous manner. Naturally, the result of the matrix multiplication will need to be reshaped into an appropriate multidimensional array.