Fixed-Point arithmetic in Engee
In numerical computing, the vast majority of tasks are solved using floating-point numbers (Float32, Float64). However, in real systems such as microcontrollers, DSPs, FPGAs, or ASICs, using Float types may be undesirable or impossible. For example, the base models of the STM32 family of microcontrollers have no hardware support for float, and floating-point operations on them run much slower than integer ones. In such cases, fixed-point arithmetic (Fixed-Point) is used, in which fractional values are encoded in an integer format with a predefined length of the fractional part.
Fixed-point numbers (Fixed-Point) are a way of representing fractional values using ordinary integers and a predefined scale (the number of bits for the fractional part). Instead of resource-intensive floating-point arithmetic (Float32, Float64), a simple integer format is used, where the binary point is shifted by a set number of bits.
For example, if the length of the fractional part is f = 2, the number 6.25 is stored as the integer value 25, because the formula that converts the internal value (stored_integer) to the real value (real_value), real_value = stored_integer × 2^(-f), gives 25 × 2^(-2) = 6.25. The internal value stored is 25, and in calculations it is interpreted as 6.25.
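The encoding can be tried out in plain Julia, independently of any fixed-point package. The sketch below is illustrative only: the helper names to_stored and to_real are hypothetical and are not part of EngeeFixedPoint.jl. It assumes the usual relation real_value = stored_integer × 2^(-f) and rounds to the nearest integer when encoding.

```julia
# Plain-Julia sketch of the fixed-point encoding; no packages required.
# to_stored and to_real are hypothetical helpers, not EngeeFixedPoint.jl functions.

# Encode a real value as a stored integer with f fractional bits
# (round uses Julia's default mode: nearest, ties to even).
to_stored(x::Real, f::Integer) = round(Int, x * 2.0^f)

# Interpret a stored integer with f fractional bits as a real value.
to_real(stored::Integer, f::Integer) = stored * 2.0^(-f)

stored = to_stored(6.25, 2)
println(stored)               # 25
println(to_real(stored, 2))   # 6.25
```

A value that is not a multiple of 2^(-f) is moved to the nearest representable one, so to_real(to_stored(3.37, 4), 4) gives 3.375.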
Such arithmetic has the following advantages:
- Lower resource consumption (relevant for microcontrollers, FPGAs, and ASICs);
- Controlled accuracy and range of values;
- Support for code generation in Verilog (HDL) and C.
Read more about fixed-point arithmetic in the article guide/hdl-fixed-point-arithmetic.adoc#fixed-point-arithmetic.
Working with fixed-point numbers in Engee
Engee uses its own package, EngeeFixedPoint.jl, to work with fixed-point numbers; it replaces the standard Julia package FixedPointNumbers.jl. Unlike the classic package, EngeeFixedPoint.jl provides advanced features and precise control over the representation and behavior of fixed-point numbers, which is especially important in resource-constrained systems, when transferring calculations to HDL, and in precision-critical tasks.
The EngeeFixedPoint.jl package is a standard Engee package and is included in the user environment by default, so it does not need to be loaded explicitly (via import/using) in the code.
In Engee, the type of a fixed-point number looks like this:
Fixed{S, W, f, T} <: FixedPoint
where:
- S: signedness (1 for signed, 0 for unsigned; an unsigned type can store only zero and positive values, for example UInt8, UInt16, Fixed{0, 16, 4});
- W: word length (the total number of bits allocated to the number);
- f: length of the fractional part (the number of fractional bits, that is, the scale);
- T: the integer type used for the internal representation of the fixed-point number (Int32, UInt64, and so on).
This format lets you specify exactly how a number is stored, interpreted, and used in calculations, thanks to strict typing and well-defined behavior at every stage of data processing.
For convenience, EngeeFixedPoint.jl offers several ways to specify a fixed-point type, from fully manual specification to automatic inference.
S, W, f, T = 1, 25, 10, Int32
dt1 = Fixed{S, W, f, T}
dt2 = fixdt(S, W, f)
dt3 = fixdt(Fixed{S, W, f})
dt4 = fixdt(dt2)
println(dt1 == dt2 == dt3 == dt4) # true
where:
- dt1 = Fixed{S, W, f, T}: full manual description;
- dt2 = fixdt(S, W, f): simplified creation; the storage type T is selected automatically;
- dt3 = fixdt(Fixed{S, W, f}): obtaining a type from an existing description;
- dt4 = fixdt(dt2): reuse; creates a copy of an existing type.
All of these options create the same type, Fixed{1, 25, 10, Int32}, and can be used depending on the task:
- The full description (dt1) is convenient when you need control over all parameters;
- The simplified form (dt2) suits typical cases and shortens the code;
- Getting a type from a type (dt3) is useful when generating code or typing data;
- Reuse (dt4) helps when working with parameterized structures without re-entering parameters.
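How a storage type might be selected automatically can be sketched in plain Julia. This is an assumption inferred from the types printed in this document (for example, a signed 25-bit word is shown with Int32), not the documented selection rule of EngeeFixedPoint.jl; storage_type is a hypothetical helper.

```julia
# Hypothetical helper: pick the narrowest built-in integer type that can hold
# W bits, signed when S == 1 and unsigned when S == 0.
# This mirrors the types printed in the document; the package's real rule may differ.
function storage_type(S::Integer, W::Integer)
    signed   = (Int8, Int16, Int32, Int64, Int128)
    unsigned = (UInt8, UInt16, UInt32, UInt64, UInt128)
    for (st, ut) in zip(signed, unsigned)
        if 8 * sizeof(st) >= W
            return S == 1 ? st : ut
        end
    end
    error("word length $W exceeds 128 bits")
end

println(storage_type(1, 25))   # Int32
println(storage_type(0, 5))    # UInt8
println(storage_type(1, 62))   # Int64
```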
Fixed type constructors
Next, let’s look at specific scenarios for working with fixed-point numbers in Engee.
You can specify the type directly and pass the value:
x = Fixed{1, 15, 2}(25)
Output:
fi(6.25, 1, 15, 2)
This means that 25 is the integer representation (stored_integer), and the real value (real_value) equals 6.25 according to the formula real_value = stored_integer × 2^(-f) = 25 × 2^(-2).
Fixed{S, W, f}(i::T)
The constructor Fixed{S, W, f}(i::T) creates a fixed-point number from its integer representation. It accepts:
- Format parameters: S (signedness), W (width in bits), f (length of the fractional part);
- An integer value i of type T (the internal representation).
S, W, f = 1, 15, 2 # signed, 15 bits, 2 fractional bits
i = 25
x = Fixed{S, W, f}(i) # creation from an integer
Output:
fi(6.25, 1, 15, 2) # equivalent representation
Fixed{S, W, f, T1}(i::T2)
This constructor is similar to the previous one but lets you state the storage type explicitly. Note that the storage type is still selected automatically from the parameters S, W, f, regardless of the specified T1.
T = Int128
x = Fixed{S, W, f, T}(i) # indicating the type of storage
Output:
fi(6.25, 1, 15, 2) # the result is identical
Constructors from FixedPointNumbers.jl
Although Engee uses the new EngeeFixedPoint.jl package, it remains compatible with FixedPointNumbers.jl for a number of constructors. Only signed types are supported.
Supported:
- Fixed{T, f}(i::Integer, _): constructor from the integer representation; accepts the type T and the parameter f;
- Fixed{T, f}(value): constructor from the real-world value (float).
Example:
T = Int32
x1 = Fixed{T, f}(i, nothing) # from the integer representation (i = 25, f = 2 as above)
x2 = Fixed{T, f}(i) # from the real-world value
Output:
6.25 # result of the first constructor
25.0 # result of the second constructor
fi helper methods
The most convenient way to create fixed-point numbers is through the fi helper methods. Unlike constructors, they can determine the representation parameters automatically.
x1 = fi(3.37, 0, 63, 4) # Full format with explicit parameters
x2 = fi(3.37, fixdt(0, 63, 4)) # Via the data type
x3 = fi(3.37, 0, 63) # With automatic fractional part detection
x4 = fi(100, 1, 8, 5) # Demonstration of overflow handling
Output:
3.375 # value adjusted for rounding
true # x1 and x2 are identical
3.37 # with automatic selection
3.96875 # saturation result at overflow
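The rounding and saturation visible in these results can be reproduced with ordinary integers. The sketch below is not the fi implementation: quantize is a hypothetical helper, and rounding half up is an assumption based on the RoundNearestTiesUp default mentioned in this document.

```julia
# Hypothetical helper: quantize x to signedness S, word length W (up to 127 bits)
# and f fractional bits, saturating at the limits of the format.
function quantize(x::Real, S::Integer, W::Integer, f::Integer)
    lo = S == 1 ? -(Int128(2)^(W - 1)) : Int128(0)
    hi = S == 1 ? Int128(2)^(W - 1) - 1 : Int128(2)^W - 1
    scaled = floor(Int128, x * 2.0^f + 0.5)   # round half up (assumed strategy)
    stored = clamp(scaled, lo, hi)            # saturate on overflow
    return stored * 2.0^(-f)
end

println(quantize(3.37, 0, 63, 4))   # 3.375
println(quantize(100, 1, 8, 5))     # 3.96875
```

In the second call the scaled value 3200 does not fit a signed 8-bit word, so it is clamped to 127, which is 127 / 2^5 = 3.96875.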
Complex numbers
Fixed-point complex numbers are fully supported and are created with the same fi methods:
s, w, f = 1, 62, 7;
v = 2.5 - 3.21im
x1 = fi(v, s, w, f)
x2 = fi(v, fixdt(s, w, f))
x3 = fi(v, s, w)
println(x1)
println(x1 == x2)
println(x3)
println()
Output:
fi(2.5, 1, 62, 7) - fi(3.2109375, 1, 62, 7)*im
true
fi(2.5, 1, 62, 59) - fi(3.21, 1, 62, 59)*im
Working with arrays and matrices
The library fully supports vector and matrix operations on fixed-point numbers. All operations preserve the element type and automatically apply the specified precision parameters to every element of the array.
Vectors
Creating and working with one-dimensional arrays. Fixed point parameters are applied to all elements:
s, w, f = 1, 62, 7 # signed type, 62 bits, 7 bits of fractional part
v = [1, 2, 3] # the original vector
# Different ways to create:
x1 = fi(v, s, w, f) # with explicit parameters specified
x2 = fi(v, fixdt(s, w, f)) # via the data type
x3 = fi(v, s, w) # with automatic fractional part detection
println(x1)
println(x1 == x2)
println(x3)
Output:
Fixed{1, 62, 7}[1.0, 2.0, 3.0]
true
Fixed{1, 62, 59}[1.0, 2.0, 3.0]
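The automatically detected fractional length of 59 is consistent with a "best precision" rule: reserve the sign bit and just enough integer bits for the largest magnitude, and give the rest of the word to the fraction. The sketch below is an assumption inferred from the printed results, not the package's documented algorithm; best_fraction_length is a hypothetical helper.

```julia
# Hypothetical helper: fractional length that leaves just enough integer bits
# for the largest magnitude in `values` (assumed rule, inferred from the output).
function best_fraction_length(values, S::Integer, W::Integer)
    m = maximum(abs, values)
    int_bits = m < 1 ? 0 : floor(Int, log2(m)) + 1
    return W - S - int_bits
end

println(best_fraction_length([1, 2, 3], 1, 62))     # 59
println(best_fraction_length([2.5, -3.21], 1, 62))  # 59
```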
Complex matrices
Full support for complex numbers in multidimensional arrays:
s, w, f = 1, 62, 7
m = [im 2.5; -1.2im 25-im]
# Supported ways to create:
x1 = fi(m, s, w, f) # with explicit parameters specified
x2 = fi(m, fixdt(s, w, f)) # via the data type
println(x1)
println(x1 == x2)
Output:
Complex{Fixed{1, 62, 7, Int64}}[fi(0.0, 1, 62, 7) + fi(1.0, 1, 62, 7)*im fi(2.5, 1, 62, 7) + fi(0.0, 1, 62, 7)*im; fi(0.0, 1, 62, 7) - fi(1.203125, 1, 62, 7)*im fi(25.0, 1, 62, 7) - fi(1.0, 1, 62, 7)*im]
true
Basic operations and methods
This section describes methods for working with fixed-point numbers that let you determine the allowed range of values and basic properties.
Boundary values
The typemax and typemin methods return the maximum and minimum possible values for a specific fixed-point type.
dt = fixdt(0, 25, -2) # an unsigned type with 25 bits and a fractional part of -2
x = fi(1.5, dt) # creating a fixed point number
println(typemax(x)) # 1.34217724e8 is the maximum representable value
println(typemin(x)) # 0.0 is the minimum value for an unsigned type.
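These limits follow directly from the format parameters. The sketch below computes them in plain Julia under the usual two's-complement assumption; fixed_max and fixed_min are hypothetical helpers, and Float64 is exact here only for word lengths up to 53 bits.

```julia
# Hypothetical helpers: range of a format with signedness S, word length W
# and f fractional bits (two's complement assumed for signed formats).
fixed_max(S, W, f) = (2.0^(W - S) - 1) * 2.0^(-f)
fixed_min(S, W, f) = S == 1 ? -2.0^(W - 1) * 2.0^(-f) : 0.0

println(fixed_max(0, 25, -2))   # 1.34217724e8
println(fixed_min(0, 25, -2))   # 0.0
println(fixed_max(1, 8, 5))     # 3.96875
println(fixed_min(1, 8, 5))     # -4.0
```

With a negative fractional length such as f = -2, the step between representable values is 2^2 = 4, so the maximum is (2^25 - 1) × 4.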
Mathematical operations
The system automatically selects the optimal format for the result of operations, maintaining accuracy and preventing overflow. All basic arithmetic operations (addition, subtraction, multiplication, division) are supported:
x1 = fi(1.5, 0, 15, 3)
x2 = fi(1.5, 1, 25, 14)
y1 = x1+x2
y2 = x1-x2
y3 = x1*x2
y4 = x1/x2
println(y1)
println(y2)
println(y3)
println(y4)
println(typeof(y1))
println(typeof(y2))
println(typeof(y3))
println(typeof(y4))
println(x1 == x2)
println(x1 <= x2)
println(x1 > x2)
Output:
3.0
0.0
2.25
0.0
Fixed{1, 28, 14, Int32}
Fixed{1, 28, 14, Int32}
Fixed{1, 40, 17, Int64}
Fixed{1, 25, -11, Int32}
true
true
false
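The fractional lengths in the printed result types (14 for the sum, 17 for the product) match the standard full-precision rules, which can be checked on the stored integers directly. The sketch below uses plain Julia integers and assumes those standard rules; it does not model how the package chooses the word length.

```julia
# Operands as (stored integer, fractional bits); both encode 1.5.
a, fa = 12, 3        # 12 / 2^3
b, fb = 24576, 14    # 24576 / 2^14

# Multiplication: multiply the stored integers, add the fractional lengths.
p, fp = a * b, fa + fb
println(fp)                # 17
println(p * 2.0^(-fp))     # 2.25

# Addition: align both operands to the larger fractional length first.
fs = max(fa, fb)
s = (a << (fs - fa)) + (b << (fs - fb))
println(fs)                # 14
println(s * 2.0^(-fs))     # 3.0
```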
Rounding
Various rounding strategies allow you to control the accuracy of calculations. By default, RoundNearestTiesUp is used.
x = fi(1.5, 1, 14, 3) # signed, 14 bits, 3 fractional bits
println(round(x)) # 2.0 – rounding to the nearest integer (1.5 → 2)
println(trunc(x)) # 1.0 – dropping the fractional part
println(ceil(x)) # 2.0 – rounding up to a larger integer
println(floor(x)) # 1.0 – rounding down to a smaller integer
where:
- round: banker's rounding (a tie at 0.5 goes to the nearest even number);
- trunc: discards the fractional part;
- ceil: always rounds up;
- floor: always rounds down.
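The value 1.5 rounds to 2 under both tie-breaking strategies named here, so it does not show how they differ. The sketch below uses Base Julia on ordinary Float64 values (no fixed-point package involved) to show where RoundNearestTiesUp and banker's rounding diverge, and how trunc, floor, and ceil treat negative values.

```julia
# Base Julia on Float64 values; this illustrates the rounding modes themselves,
# not the behavior of EngeeFixedPoint.jl.
println(round(2.5))                       # 2.0  (ties to even, Julia's default)
println(round(2.5, RoundNearestTiesUp))   # 3.0  (ties toward +Inf)
println(round(3.5))                       # 4.0
println(trunc(-1.5))                      # -1.0 (toward zero)
println(floor(-1.5))                      # -2.0 (toward -Inf)
println(ceil(-1.5))                       # -1.0 (toward +Inf)
```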
Type conversion
Conversion to standard data types is useful when interacting with other libraries. When converting, the rounding rules are taken into account.
x = fi(1.5, 1, 12, 4)
y1 = Int64(x)
y2 = UInt8(x)
y3 = Float64(x)
y4 = convert(fixdt(0, 5, 2), x)
println(y1)
println(y2)
println(y3)
println(y4)
println(typeof(y1))
println(typeof(y2))
println(typeof(y3))
println(typeof(y4))
Output:
1
1
1.5
1.5
Int64
UInt8
Float64
Fixed{0, 5, 2, UInt8}
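Conversion between two fixed-point formats amounts to rescaling the stored integer and checking that it fits the target word. The sketch below shows this for the last conversion above using plain integers; rescale is a hypothetical helper, and dropping bits with a right shift is only one possible rounding choice.

```julia
# Hypothetical helper: move a stored integer from f_from to f_to fractional bits.
# Dropping bits with >> rounds toward negative infinity; other strategies exist.
function rescale(stored::Integer, f_from::Integer, f_to::Integer)
    shift = f_to - f_from
    return shift >= 0 ? stored << shift : stored >> (-shift)
end

stored = 24                          # 1.5 with 4 fractional bits (24 / 2^4)
converted = rescale(stored, 4, 2)
println(converted)                   # 6
println(converted * 2.0^(-2))        # 1.5
println(0 <= converted <= 2^5 - 1)   # true: fits an unsigned 5-bit word
```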
Conclusion
Thus, the EngeeFixedPoint.jl package provides the following advantages:
- Extended type system:
  - Full support for both signed and unsigned numbers;
  - Arbitrary bit depth (any bit size, not only 8/16/32/64/128);
  - Flexible adjustment of the fractional part (including negative values and cases where f, the length of the fractional part, exceeds the word length W).
- Improved type inference rules:
  - Platform-dependent code generation;
  - Different type inheritance rules for target platforms (C or Verilog);
  - Predictable behavior when overflowing the 128-bit boundary (unlike analogues).
- Advanced functionality:
  - Optimized work with arrays and matrices (see Working with arrays and matrices);
  - Full support for complex numbers (see Complex numbers);
  - Effective rounding methods (round/trunc/ceil/floor; for more information, see the section on rounding);
  - Support for basic methods (zero/one/typemin/typemax); see Boundary values.
Rounding depends on the length of the fractional part (f). Since fixed-point numbers cannot represent every fractional value exactly, the result of an operation is rounded according to a given strategy (for example, to the nearest value, or truncated).
Overflow depends on the word length (W) and signedness (S). In such cases, an overflow handling strategy is used: saturation, in which the value is limited to the maximum or minimum allowed, truncation, or raising an error.