Fixed-Point Arithmetic (Fixed-Point) in Engee

In numerical computing, the vast majority of problems are solved with floating-point numbers (Float32, Float64). However, in real-world systems such as microcontrollers, DSPs, FPGAs or ASICs, floating-point types may be undesirable or unavailable. For example, the base models of the STM32 family of microcontrollers have no hardware floating-point support, so floating-point operations are much slower than integer operations. In such cases, fixed-point arithmetic is used: fractional values are encoded as integers with a predetermined length of the fractional part.

Fixed-point is a way of representing fractional values using ordinary integers and a predetermined scale (the number of bits in the fractional part). Instead of resource-intensive floating-point arithmetic (Float32, Float64), it uses a simple integer format in which the binary point is shifted by a given number of bits.

For example, if the length of the fractional part is $f = 2$, then the number $6.25$ is stored as the integer value $25$, because the formula for converting the internal value (stored_integer) to the real value (real_value), $\text{real\_value} = \text{stored\_integer} \cdot 2^{-f}$, gives $25 \cdot 2^{-2} = 6.25$. The formula shows that the internal value is stored as $25$, but during calculations it is interpreted as $6.25$.
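A minimal sketch of this conversion in plain Julia, assuming the usual scaling real_value = stored_integer * 2^(-f). The helper names `to_real` and `to_stored` are hypothetical and used only for illustration; they are not part of EngeeFixedPoint.jl.

```julia
# Illustrative helpers (hypothetical names, not part of any package).

# Interpret a stored integer as a real value with f fractional bits.
to_real(stored_integer::Integer, f::Integer) = stored_integer * 2.0^(-f)

# Encode a real value as a stored integer, rounding to the nearest step.
to_stored(real_value::Real, f::Integer) = round(Int, real_value * 2.0^f)

stored = to_stored(6.25, 2)   # 6.25 * 2^2 = 25
println(stored)               # 25
println(to_real(stored, 2))   # 6.25
```

The same two functions also work for a negative f, where one step of the stored integer is larger than 1.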

Such arithmetic has the following advantages:

Lower resource consumption (relevant for microcontrollers, FPGAs and ASICs);
Controllable accuracy and range of values;
Predictable rounding^[1] and overflow^[2] behaviour;
Support for code generation in Verilog (HDL) and C.

Read more about fixed-point arithmetic in the article guide/hdl-fixed-point-arithmetic.adoc#fixed-point-arithmetic.

Fixed Point Arithmetic in Engee

To work with fixed-point numbers, Engee uses its own package, EngeeFixedPoint.jl, which replaces the standard Julia package FixedPointNumbers.jl. Unlike the classic package, EngeeFixedPoint.jl provides advanced features and precise control over the representation and behaviour of fixed-point numbers, which is especially important in resource-constrained systems, when moving computations to HDL, and in problems with strict precision requirements.

The EngeeFixedPoint.jl package is a standard Engee package and is included in the user environment by default, so it does not need to be loaded explicitly (via import/using) in code.

In Engee, the type of a fixed-point number is as follows:

Fixed{S, W, f, T} <: FixedPoint

Where:

S - signedness (1 - signed^[3], 0 - unsigned^[4]);
W - word length (total number of bits allocated to the number);
f - length of the fractional part (number of fractional bits, i.e. the scale);
T - integer type used to store the fixed-point number (Int32, UInt64, etc.).

This format lets you specify exactly how a number is stored, interpreted and used in calculations, with type safety and well-defined behaviour at every stage of data processing.

For convenience, EngeeFixedPoint.jl offers several ways to specify a fixed-point type, from a full manual specification to automatic inference.

S, W, f, T = 1, 25, 10, Int32

dt1 = Fixed{S, W, f, T}
dt2 = fixdt(S, W, f)
dt3 = fixdt(Fixed{S, W, f})
dt4 = fixdt(dt2)

println(dt1 == dt2 == dt3 == dt4)  # true

Where:

dt1 = Fixed{S, W, f, T} - full manual description;
dt2 = fixdt(S, W, f) - simplified creation, the storage type is selected automatically;
dt3 = fixdt(Fixed{S, W, f}) - obtaining a type from an existing description;
dt4 = fixdt(dt2) - reuse, creates the type from an existing one.

All these options create the same type, Fixed{1, 25, 10, Int32}, and can be chosen depending on the task:

The full description (dt1) is useful when control over all parameters is needed;
The simplified form (dt2) suits typical cases and shortens the code;
Getting a type from a type (dt3) is useful when generating code or annotating data with types;
Reuse (dt4) helps when working with parameterised structures without re-entering the parameters.
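The storage types printed in this article (for example Int32 for a 25-bit signed word) are consistent with choosing the smallest standard integer type that holds W bits. The sketch below illustrates that rule only; `storage_type` is a hypothetical helper, and the actual selection logic of EngeeFixedPoint.jl is an assumption here and may differ.

```julia
# Hypothetical helper: pick the smallest standard integer type with at least W bits.
# This is an assumed rule inferred from the printed types, not the package's code.
function storage_type(S::Integer, W::Integer)
    signed_types   = (Int8, Int16, Int32, Int64, Int128)
    unsigned_types = (UInt8, UInt16, UInt32, UInt64, UInt128)
    candidates = S == 1 ? signed_types : unsigned_types
    for T in candidates
        8 * sizeof(T) >= W && return T
    end
    error("word length $W exceeds 128 bits")
end

println(storage_type(1, 25))  # Int32
println(storage_type(0, 5))   # UInt8
println(storage_type(1, 62))  # Int64
```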

Constructors of type Fixed

Next, let’s look at specific scenarios for working with fixed-point numbers in Engee.

For example, you can directly set the type and pass the value:

x = Fixed{1, 15, 2}(25)

Output:

fi(6.25, 1, 15, 2)

This means that $25$ is the integer representation (stored_integer), and the real value (real_value) equals $6.25$ according to the formula $\text{real\_value} = \text{stored\_integer} \cdot 2^{-f} = 25 \cdot 2^{-2} = 6.25$.

Fixed{S, W, f}(i::T)

The constructor Fixed{S, W, f}(i::T) creates a fixed-point number from its integer representation. It takes:

Format parameters: S (signedness), W (width in bits), f (fractional part);
Integer value i of type T (internal representation).

S, W, f = 1, 15, 2  # signed, 15 bits, 2 fractional bits
i = 25
x = Fixed{S, W, f}(i) # create from the integer representation

Output:

fi(6.25, 1, 15, 2)  # equivalent representation

Fixed{S, W, f, T1}(i::T2)

This constructor is similar to the previous one but accepts an explicit storage type T1. The actual storage type is nevertheless selected automatically from the parameters S, W, f, regardless of the specified T1.

T = Int128
x = Fixed{S, W, f, T}(i) # with an explicit storage type

Output:

fi(6.25, 1, 15, 2)  # the result is identical

Constructors from FixedPointNumbers.jl

Although Engee uses the new EngeeFixedPoint.jl package, it remains compatible with a number of constructors from FixedPointNumbers.jl. Only signed types are supported.

Supported:

Fixed{T, f}(i::Integer, _) - constructor from the integer representation; takes the type T and the parameter f;
Fixed{T, f}(value) - constructor from the real value (float).

Example:

T = Int32
x1 = Fixed{T, f}(i, nothing) # from the integer representation
x2 = Fixed{T, f}(i)          # from the real value

Output:

6.25    # result of the first constructor
25.0    # result of the second constructor

Auxiliary methods fi

The most convenient way to create fixed-point numbers is through the fi auxiliary methods. Unlike the constructors, they can determine the representation parameters automatically.

x1 = fi(3.37, 0, 63, 4)        # full form with explicit parameters
x2 = fi(3.37, fixdt(0, 63, 4)) # via a data type
x3 = fi(3.37, 0, 63)           # fraction length determined automatically
x4 = fi(100, 1, 8, 5)          # demonstrates overflow handling

Output:

3.375     # value after rounding
true      # x1 and x2 are identical
3.37      # with automatic fraction length
3.96875   # saturated result on overflow
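The values 3.375 and 3.96875 can be reproduced by hand with a scale, round and saturate step. The following is a sketch in plain Julia under the assumption that ties round up and out-of-range values saturate; `quantize` is a hypothetical helper, not the implementation of fi, and it is intended for word lengths up to 64 bits.

```julia
# Hypothetical helper: quantize a real value to the format (S, W, f).
# Returns the stored integer and the real value it represents.
function quantize(value::Real, S::Integer, W::Integer, f::Integer)
    # Range of the stored integer (Int128 avoids overflow for W up to 64).
    lo = S == 1 ? -(Int128(2)^(W - 1)) : Int128(0)
    hi = S == 1 ? Int128(2)^(W - 1) - 1 : Int128(2)^W - 1
    # Scale by 2^f and round to the nearest integer, ties toward +Inf.
    scaled = round(value * 2.0^f, RoundNearestTiesUp)
    # Saturate instead of wrapping around when the value does not fit.
    stored = scaled <= lo ? lo : (scaled >= hi ? hi : Int128(scaled))
    return stored, Float64(stored) * 2.0^(-f)
end

println(quantize(3.37, 0, 63, 4)[2])  # 3.375   (stored integer 54)
println(quantize(100, 1, 8, 5)[2])    # 3.96875 (stored integer saturates at 127)
```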

Complex numbers

Complex fixed-point numbers are fully supported and are created with the same fi methods:

s, w, f = 1, 62, 7;
v = 2.5 - 3.21im
x1 = fi(v, s, w, f)
x2 = fi(v, fixdt(s, w, f))
x3 = fi(v, s, w)
println(x1)
println(x1 == x2)
println(x3)
println()

Output:

fi(2.5, 1, 62, 7) - fi(3.2109375, 1, 62, 7)*im
true
fi(2.5, 1, 62, 59) - fi(3.21, 1, 62, 59)*im

Working with arrays and matrices

The library fully supports vector and matrix operations on fixed-point numbers. All operations preserve the element type and automatically apply the specified precision parameters to every array element.

Vectors

One-dimensional arrays are created in the same way. The fixed-point parameters are applied to all elements:

s, w, f = 1, 62, 7  # signed type, 62 bits, 7 fractional bits
v = [1, 2, 3]       # source vector

# Different ways of creating:
x1 = fi(v, s, w, f)        # with explicit parameters
x2 = fi(v, fixdt(s, w, f)) # via a data type
x3 = fi(v, s, w)           # fraction length determined automatically

println(x1)
println(x1 == x2)
println(x3)

Output:

Fixed{1, 62, 7}[1.0, 2.0, 3.0]
true
Fixed{1, 62, 59}[1.0, 2.0, 3.0]
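The fraction length 59 in the last line is what results from giving the fractional part every bit not needed for the sign and the integer part of the largest element. The sketch below reproduces that number; `best_fraction_length` is a hypothetical helper, and the rule is inferred from the printed output rather than taken from the package documentation.

```julia
# Hypothetical helper: largest fraction length that still leaves room for the
# integer part of the largest magnitude in `values` (plus a sign bit if S == 1).
# Assumed rule, inferred from the output shown above.
function best_fraction_length(values, S::Integer, W::Integer)
    m = maximum(abs, values)
    # Bits needed for the integer part; values below 1 need none.
    integer_bits = m < 1 ? 0 : floor(Int, log2(m)) + 1
    return W - S - integer_bits
end

println(best_fraction_length([1, 2, 3], 1, 62))     # 59
println(best_fraction_length((2.5, -3.21), 1, 62))  # 59
```

The second call uses the real and imaginary parts of 2.5 - 3.21im from the complex-number example, which is also printed with a fraction length of 59.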

Complex matrices

Complex numbers are also fully supported in multidimensional arrays:

s, w, f = 1, 62, 7
m = [im 2.5; -1.2im 25-im]

# Supported ways of creating:
x1 = fi(m, s, w, f)        # with explicit parameters
x2 = fi(m, fixdt(s, w, f)) # via a data type

println(x1)
println(x1 == x2)

Output:

Complex{Fixed{1, 62, 7, Int64}}[fi(0.0, 1, 62, 7) + fi(1.0, 1, 62, 7)*im fi(2.5, 1, 62, 7) + fi(0.0, 1, 62, 7)*im; fi(0.0, 1, 62, 7) - fi(1.203125, 1, 62, 7)*im fi(25.0, 1, 62, 7) - fi(1.0, 1, 62, 7)*im]
true

Basic operations and methods

This section describes methods for working with fixed-point numbers that let you determine the allowable range of values and basic properties.

Boundary values

The typemax and typemin methods return the maximum and minimum representable values of a particular fixed-point type.

dt = fixdt(0, 25, -2)  # unsigned type with 25 bits and fraction length -2
x = fi(1.5, dt)        # create a fixed-point number
println(typemax(x))    # 1.34217724e8 - maximum representable value
println(typemin(x))    # 0.0 - minimum value for an unsigned type
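The boundary values follow directly from S, W and f: the stored integer spans the usual unsigned or two's-complement range, and every step is worth 2^(-f). A sketch in plain Julia, where `fixed_range` is a hypothetical helper and not a package function:

```julia
# Hypothetical helper: real-valued bounds and step size of the format (S, W, f).
function fixed_range(S::Integer, W::Integer, f::Integer)
    lsb = 2.0^(-f)   # real value of one step of the stored integer
    min_stored = S == 1 ? -(Int128(2)^(W - 1)) : Int128(0)
    max_stored = S == 1 ? Int128(2)^(W - 1) - 1 : Int128(2)^W - 1
    return (lo = Float64(min_stored) * lsb, hi = Float64(max_stored) * lsb, lsb = lsb)
end

r = fixed_range(0, 25, -2)
println(r.hi)   # 1.34217724e8, i.e. (2^25 - 1) * 4
println(r.lo)   # 0.0
println(r.lsb)  # 4.0
```

With a negative fraction length the step is larger than 1 (here 4.0), so the 25-bit word covers a wider range at coarser resolution.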

Mathematical operations

The system automatically selects the optimal format for the result of operations, maintaining accuracy and preventing overflow. All basic arithmetic operations (addition, subtraction, multiplication, division) are supported:

x1 = fi(1.5, 0, 15, 3)
x2 = fi(1.5, 1, 25, 14)
y1 = x1+x2
y2 = x1-x2
y3 = x1*x2
y4 = x1/x2
println(y1)
println(y2)
println(y3)
println(y4)
println(typeof(y1))
println(typeof(y2))
println(typeof(y3))
println(typeof(y4))

println(x1 == x2)
println(x1 <= x2)
println(x1 > x2)

Output:

3.0
0.0
2.25
0.0
Fixed{1, 28, 14, Int32}
Fixed{1, 28, 14, Int32}
Fixed{1, 40, 17, Int64}
Fixed{1, 25, -11, Int32}
true
true
false
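The result types printed above can be reproduced with simple bit-growth rules. The sketch below works on plain (S, W, f) tuples; the rules are inferred from this output and are an assumption, not the documented algorithm of EngeeFixedPoint.jl, and the function names are hypothetical.

```julia
# Formats are (S, W, f) tuples. Assumed rules, inferred from the printed types.

# Number of integer bits, excluding the sign bit.
int_bits((S, W, f)) = W - f - S

# Sum or difference: keep the finer resolution and add one carry bit.
function sum_format(a, b)
    S = max(a[1], b[1])                     # signed if either operand is signed
    f = max(a[3], b[3])                     # finer of the two resolutions
    i = max(int_bits(a), int_bits(b)) + 1   # one extra bit for the carry
    return (S, S + i + f, f)
end

# Product: word lengths and fraction lengths add up.
product_format(a, b) = (max(a[1], b[1]), a[2] + b[2], a[3] + b[3])

# Quotient: widest word, difference of the fraction lengths.
quotient_format(a, b) = (max(a[1], b[1]), max(a[2], b[2]), a[3] - b[3])

x1, x2 = (0, 15, 3), (1, 25, 14)
println(sum_format(x1, x2))       # (1, 28, 14)
println(product_format(x1, x2))   # (1, 40, 17)
println(quotient_format(x1, x2))  # (1, 25, -11)
```

A fraction length of -11 means one step equals 2^11 = 2048, which is consistent with the quotient 1.5 / 1.5 being printed as 0.0: the exact result 1.0 is far below one step of the result format.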

Rounding

Various rounding strategies allow you to control the accuracy of your calculations. By default, RoundNearestTiesUp rounding is used.

x = fi(1.5, 1, 14, 3)  # signed, 14 bits, 3 fractional bits
println(round(x))      # 2.0 - round to the nearest integer (1.5 → 2)
println(trunc(x))      # 1.0 - discard the fractional part
println(ceil(x))       # 2.0 - round up to the next integer
println(floor(x))      # 1.0 - round down to the previous integer

Where:

round - banker's rounding (ties at 0.5 go to the nearest even integer);
trunc - discards the fractional part;
ceil - always rounds up;
floor - always rounds down.
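The two tie-breaking rules mentioned in this section differ only for values that lie exactly halfway between two integers. The sketch below compares them on ordinary Float64 values using rounding modes from Julia Base; it does not call EngeeFixedPoint.jl, and `compare_rounding` is a hypothetical helper.

```julia
# Returns (ties-to-even result, ties-up result) for a plain Float64 value.
# Both rounding modes are part of Julia Base.
compare_rounding(v::Float64) = (round(v), round(v, RoundNearestTiesUp))

for v in (0.5, 1.5, 2.5, -2.5)
    println(v, " -> ", compare_rounding(v))
end
# 0.5 -> (0.0, 1.0)
# 1.5 -> (2.0, 2.0)
# 2.5 -> (2.0, 3.0)
# -2.5 -> (-2.0, -2.0)
```

For 1.5, the value used in the example above, both rules give 2.0, so that example alone does not show which rule is in effect.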

Type conversion

Conversion to standard data types is useful when interacting with other libraries. The rounding rules are applied during conversion.

x = fi(1.5, 1, 12, 4)
y1 = Int64(x)
y2 = UInt8(x)
y3 = Float64(x)
y4 = convert(fixdt(0, 5, 2), x)
println(y1)
println(y2)
println(y3)
println(y4)
println(typeof(y1))
println(typeof(y2))
println(typeof(y3))
println(typeof(y4))

Output:

1
1
1.5
1.5
Int64
UInt8
Float64
Fixed{0, 5, 2, UInt8}
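Converting between two fixed-point formats amounts to rescaling the stored integer and checking it against the new range. The following sketch shows the idea for the convert call above; `requantize` is a hypothetical helper that truncates dropped bits and saturates, which is a simplifying assumption and may differ from the rounding the package applies.

```julia
# Hypothetical helper: move a stored integer from f_in to f_out fractional bits
# and saturate it to a word of W_out bits (intended for W_out below 63).
function requantize(stored::Integer, f_in::Integer,
                    S_out::Integer, W_out::Integer, f_out::Integer)
    shift = f_in - f_out
    # Dropping fractional bits is a right shift (truncation toward -Inf);
    # adding fractional bits is a left shift.
    shifted = shift >= 0 ? stored >> shift : stored << (-shift)
    lo = S_out == 1 ? -(1 << (W_out - 1)) : 0
    hi = S_out == 1 ? (1 << (W_out - 1)) - 1 : (1 << W_out) - 1
    return clamp(shifted, lo, hi)
end

stored_in  = 24                                  # 1.5 with 4 fractional bits: 1.5 * 2^4
stored_out = requantize(stored_in, 4, 0, 5, 2)   # target format (0, 5, 2)
println(stored_out)                              # 6
println(stored_out * 2.0^(-2))                   # 1.5
```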

Conclusion

In summary, the EngeeFixedPoint.jl package provides the following benefits:

Expanded type system:
- Full support for both signed and unsigned numbers;
- Arbitrary word length (any number of bits, not just 8/16/32/64/128);
- Flexible fraction length (including negative values and cases where $f > W$, i.e. the fractional part is longer than the word).
Improved type inference rules:
- Automatic format selection for the results of mathematical operations (+, -, *, /);
- No restrictions on automatic type inheritance in the Divide/Add/Divide/Gain blocks.
Platform-dependent code generation:
- Different type inheritance rules for the target platforms (C or Verilog);
- Predictable behaviour on overflow of the 128-bit boundary (unlike analogues).
Expanded functionality:
- Optimised handling of arrays and matrices (see Working with arrays and matrices);
- Full support for complex numbers (see Complex numbers);
- Efficient rounding methods (round/trunc/ceil/floor, see Rounding);
- Support for basic methods (zero/one/typemin/typemax, see Boundary values).

Useful links

Fixed point data types.

1. Rounding is the process of bringing a value to a representable form given a limited fractional part (f). Since fixed-point numbers cannot represent every fractional value exactly, operations round the result according to a given strategy (e.g., to the nearest value or by truncation).

2. Overflow occurs when the result of a computation exceeds the limits allowed for a given word length (W) and signedness (S). In such cases, an overflow handling strategy is applied: saturation, where the value is limited to the maximum/minimum allowed, truncation, or raising an error.

3. Signed - can store both positive and negative values. Example: Int8, Int16, Fixed{1, 16, 4}.

4. Unsigned - can only store positive values and zero. Example: UInt8, UInt16, Fixed{0, 16, 4}.
