Integers and floating point numbers

Integer and floating-point values are the basic building blocks of arithmetic and computation. The built-in representations of such values are called numeric primitives, and the representations of integers and floating-point numbers as direct values in the code are called numeric literals. For example, 1 is an integer literal, and 1.0 is a floating-point literal. Their binary representations in memory as objects are numeric primitives.

Julia provides a broad range of primitive numeric types, and a full complement of arithmetic and bitwise operators as well as standard mathematical functions are defined over them. These map directly onto numeric types and operations that are natively supported on modern computers, which allows Julia to take full advantage of computational resources. Additionally, Julia provides software support for arbitrary precision arithmetic, which can handle operations on numeric values that cannot be represented effectively in native hardware representations, but at the cost of relatively slower performance.

The following are Julia’s primitive numeric types.

Integer types

Type	Signed?	Number of bits	Smallest value	Largest value
`Int8`	✓	8	-2^7	2^7 - 1
`UInt8`		8	0	2^8 - 1
`Int16`	✓	16	-2^15	2^15 - 1
`UInt16`		16	0	2^16 - 1
`Int32`	✓	32	-2^31	2^31 - 1
`UInt32`		32	0	2^32 - 1
`Int64`	✓	64	-2^63	2^63 - 1
`UInt64`		64	0	2^64 - 1
`Int128`	✓	128	-2^127	2^127 - 1
`UInt128`		128	0	2^128 - 1
`Bool`	N/A	8	`false` (0)	`true` (1)

Floating point types

Type	Precision	Number of bits
`Float16`	https://en.wikipedia.org/wiki/Half-precision_floating-point_format [half]	16
`Float32`	https://en.wikipedia.org/wiki/Single_precision_floating-point_format [single]	32
`Float64`	https://en.wikipedia.org/wiki/Double_precision_floating-point_format [double]	64

Additionally, full support for complex and rational numbers is built on top of these primitive numeric types. All numeric types interoperate naturally without explicit casting, thanks to a flexible, user-extensible type promotion system.
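As a small illustration of this promotion behavior, the sketch below mixes integer, floating-point, rational, and complex operands. The variable names are arbitrary and chosen only for this example.

```julia
# Mixed-type arithmetic: operands are promoted to a common type automatically.
a = 1 + 2.5            # Int + Float64 -> Float64
b = 1//2 + 1           # Rational{Int} + Int -> Rational{Int}
c = (1 + 2im) * 0.5    # Complex{Int} * Float64 -> Complex{Float64}

println(a, " :: ", typeof(a))
println(b, " :: ", typeof(b))
println(c, " :: ", typeof(c))
```

No explicit conversion call appears anywhere; the result type is chosen by the promotion rules.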

Integers

Literal integers are represented in a standard way.

julia> 1
1

julia> 1234
1234

The default type for an integer literal depends on which architecture the target system has — 32-bit or 64-bit:

# 32-bit system:
julia> typeof(1)
Int32

# 64-bit system:
julia> typeof(1)
Int64

The Julia internal variable Sys.WORD_SIZE indicates whether the target system is 32-bit or 64-bit:

# 32-bit system:
julia> Sys.WORD_SIZE
32

# 64-bit system:
julia> Sys.WORD_SIZE
64

Julia also defines the types Int and UInt, which are aliases for the system’s native signed and unsigned integer types, respectively:

# 32-bit system:
julia> Int
Int32
julia> UInt
UInt32

# 64-bit system:
julia> Int
Int64
julia> UInt
UInt64

Larger integer literals that cannot be represented using only 32 bits but can be represented in 64 bits always create 64-bit integers, regardless of the system type:

# 32-bit or 64-bit system:
julia> typeof(3000000000)
Int64
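The relationship between Sys.WORD_SIZE, the Int and UInt aliases, and the default literal type can be checked in a portable way. The following is a minimal sketch; the names native_int and native_uint are made up for this example.

```julia
# Int and UInt follow the platform word size reported by Sys.WORD_SIZE.
native_int  = Sys.WORD_SIZE == 64 ? Int64  : Int32
native_uint = Sys.WORD_SIZE == 64 ? UInt64 : UInt32

println(Int === native_int)
println(UInt === native_uint)
println(typeof(1) === Int)              # small literals use the native type
println(typeof(3000000000) === Int64)   # too large for 32 bits on any system
```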

Unsigned integers are input and output using the 0x prefix and hexadecimal (base 16) digits 0-9a-f (the capitalized digits A-F also work for input). The size of the unsigned value is determined by the number of hex digits used:

julia> x = 0x1
0x01

julia> typeof(x)
UInt8

julia> x = 0x123
0x0123

julia> typeof(x)
UInt16

julia> x = 0x1234567
0x01234567

julia> typeof(x)
UInt32

julia> x = 0x123456789abcdef
0x0123456789abcdef

julia> typeof(x)
UInt64

julia> x = 0x11112222333344445555666677778888
0x11112222333344445555666677778888

julia> typeof(x)
UInt128

This behavior is based on the observation that when one uses unsigned hex literals for integer values, one is typically using them to represent a fixed numeric byte sequence, rather than just an integer value.

Binary and octal literals are also supported.

julia> x = 0b10
0x02

julia> typeof(x)
UInt8

julia> x = 0o010
0x08

julia> typeof(x)
UInt8

julia> x = 0x00000000000000001111222233334444
0x00000000000000001111222233334444

julia> typeof(x)
UInt128

As with hexadecimal literals, binary and octal literals produce unsigned integer types. The size of the binary data item is the minimal needed size if the leading digit of the literal is not 0. In the case of leading zeros, the size is determined by the minimal needed size for a literal that has the same length but whose leading digit is 1. This means that:

0x1 and 0x12 are UInt8 literals;
0x123 and 0x1234 are UInt16 literals;
0x12345 and 0x12345678 are UInt32 literals;
0x123456789 and 0x1234567890adcdef are UInt64 literals, etc.

Even if there are leading zero digits that do not contribute to the value, they count toward determining the storage size of a literal. So 0x01 is a UInt8, while 0x0001 is a UInt16.

This allows users to control the size.

Unsigned literals (starting with 0x) that encode integers too large to be represented as UInt128 values construct BigInt values instead. BigInt is not an unsigned type, but it is the only built-in type big enough to represent such large integer values.
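The sizing rules above can be confirmed directly with typeof. This is a short sketch; hex_types, bin_types, and huge are names invented for the example.

```julia
# The digit count, including leading zeros, selects the unsigned type.
hex_types = [typeof(0x1), typeof(0x01), typeof(0x001), typeof(0x0001)]
bin_types = [typeof(0b1), typeof(0b00000001), typeof(0b000000001)]

# 33 hex digits do not fit in UInt128, so the literal becomes a BigInt.
huge = 0x100000000000000000000000000000000

println(hex_types)
println(bin_types)
println(typeof(huge))
```

Note that 0x001 and 0b000000001 have small values but are wider than 8 bits as written, so both are UInt16.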

Binary, octal, and hexadecimal literals may be signed by a - immediately preceding the unsigned literal. They produce an unsigned integer of the same size as the unsigned literal would, holding the two’s complement of the value:

julia> -0x2
0xfe

julia> -0x0002
0xfffe
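To make the two's complement relationship concrete, the sketch below compares a negated literal with the bit-inverted value plus one. The variable names are illustrative only.

```julia
x = 0x02
neg = -0x02                    # still a UInt8

println(neg)
println(neg === ~x + 0x01)     # two's complement: invert the bits, then add one
println(x + neg)               # the sum wraps around to zero
```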

The minimum and maximum representable values of primitive numeric types such as integers are given by the typemin and typemax functions:

julia> (typemin(Int32), typemax(Int32))
(-2147483648, 2147483647)

julia> for T in [Int8,Int16,Int32,Int64,Int128,UInt8,UInt16,UInt32,UInt64,UInt128]
           println("$(lpad(T,7)): [$(typemin(T)),$(typemax(T))]")
       end
   Int8: [-128,127]
  Int16: [-32768,32767]
  Int32: [-2147483648,2147483647]
  Int64: [-9223372036854775808,9223372036854775807]
 Int128: [-170141183460469231731687303715884105728,170141183460469231731687303715884105727]
  UInt8: [0,255]
 UInt16: [0,65535]
 UInt32: [0,4294967295]
 UInt64: [0,18446744073709551615]
UInt128: [0,340282366920938463463374607431768211455]

The values returned by typemin and typemax are always of the given argument type. (The above expression uses several features that have yet to be introduced, including for loops, strings, and interpolation, but it should be easy enough to understand for users with some programming experience.)

Overflow behavior

In Julia, exceeding the maximum representable value of a given type results in wraparound behavior:

julia> x = typemax(Int64)
9223372036854775807

julia> x + 1
-9223372036854775808

julia> x + 1 == typemin(Int64)
true

Arithmetic operations with Julia’s integer types inherently perform https://en.wikipedia.org/wiki/Modular_arithmetic [modular arithmetic], mirroring the characteristics of integer arithmetic on modern computer hardware. In scenarios where overflow is possible, it is crucial to explicitly check for the wraparound effects that can result. The Base.Checked module provides a suite of arithmetic operations equipped with overflow checks, which throw errors if an overflow occurs. For use cases where overflow cannot be tolerated under any circumstances, using the BigInt type, as detailed in Arbitrary precision arithmetic, is advisable.

Below is an example of overflow behavior and possible ways to resolve it.

julia> 10^19
-8446744073709551616

julia> big(10)^19
10000000000000000000
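The overflow-checked operations mentioned above can be used as in the following sketch. It relies on checked_add and add_with_overflow from Base.Checked and on widen from Base; the variable names are chosen for illustration.

```julia
using Base.Checked: checked_add, add_with_overflow

x = typemax(Int64)

wrapped = x + 1                               # silently wraps to typemin(Int64)
result, overflowed = add_with_overflow(x, 1)  # wrapped result plus an overflow flag

# checked_add throws an OverflowError instead of wrapping.
err = try
    checked_add(x, 1)
    nothing
catch e
    e
end

widened = widen(x) + 1                        # Int128 has room for the true result

println(wrapped)
println(overflowed)
println(typeof(err))
println(widened)
```

Which option fits depends on the use case: the flag-returning form suits hot loops, the throwing form suits validation, and widening or BigInt avoids the overflow altogether.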

Division errors

Integer division (the div function) has two exceptional cases: dividing by zero, and dividing the lowest negative number (typemin) by -1. Both of these cases throw a DivideError. The remainder and modulus functions (rem and mod) throw a DivideError when their second argument is zero.
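These cases can be exercised with a small helper, sketched below. The function throws_divide_error is a hypothetical name introduced only for this example.

```julia
# Hypothetical helper: returns true if calling f throws a DivideError.
function throws_divide_error(f)
    try
        f()
        return false
    catch e
        return e isa DivideError
    end
end

println(throws_divide_error(() -> div(1, 0)))                # division by zero
println(throws_divide_error(() -> div(typemin(Int64), -1)))  # result not representable
println(throws_divide_error(() -> rem(1, 0)))
println(throws_divide_error(() -> mod(1, 0)))
println(throws_divide_error(() -> div(7, 2)))                # ordinary division
```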

Floating point numbers

Literal floating point numbers are represented in standard formats using, if necessary, https://en.wikipedia.org/wiki/Scientific_notation#E_notation [exponential notation]:

julia> 1.0
1.0

julia> 1.
1.0

julia> 0.5
0.5

julia> .5
0.5

julia> -1.23
-1.23

julia> 1e10
1.0e10

julia> 2.5e-4
0.00025

The above results are all Float64 values. Literal Float32 values can be entered by writing an f in place of e:

julia> x = 0.5f0
0.5f0

julia> typeof(x)
Float32

julia> 2.5f-4
0.00025f0

Values can be converted to Float32 easily:

julia> x = Float32(-1.5)
-1.5f0

julia> typeof(x)
Float32

Hexadecimal floating-point literals are also valid, but only as Float64 values, with p preceding the base-2 exponent:

julia> 0x1p0
1.0

julia> 0x1.8p3
12.0

julia> x = 0x.4p-1
0.125

julia> typeof(x)
Float64

Half-precision floating-point numbers are also supported (Float16), but they are implemented in software and use Float32 for calculations:

julia> sizeof(Float16(4.))
2

julia> 2*Float16(4.)
Float16(8.0)

The underscore _ can be used as a digit separator:

julia> 10_000, 0.000_000_005, 0xdead_beef, 0b1011_0010
(10000, 5.0e-9, 0xdeadbeef, 0xb2)

Floating-point zero

Floating-point numbers have https://en.wikipedia.org/wiki/Signed_zero [two zeros], positive zero and negative zero. They are equal to each other but have different binary representations, as can be seen using the bitstring function:

julia> 0.0 == -0.0
true

julia> bitstring(0.0)
"0000000000000000000000000000000000000000000000000000000000000000"

julia> bitstring(-0.0)
"1000000000000000000000000000000000000000000000000000000000000000"

Special floating point values

There are three specified standard floating-point values that do not correspond to any point on the real number line:

`Float16`	`Float32`	`Float64`	Name	Description
`Inf16`	`Inf32`	`Inf`	positive infinity	a value greater than all finite floating-point values
`-Inf16`	`-Inf32`	`-Inf`	negative infinity	a value less than all finite floating-point values
`NaN16`	`NaN32`	`NaN`	not a number	a value not `==` to any floating-point value (including itself)

For further discussion of how these non-finite floating-point values are ordered with respect to each other and other floats, see Numerical comparisons. By the https://en.wikipedia.org/wiki/IEEE_754-2008 [IEEE 754 standard], these floating-point values are the results of certain arithmetic operations:

julia> 1/Inf
0.0

julia> 1/0
Inf

julia> -5/0
-Inf

julia> 0.000001/0
Inf

julia> 0/0
NaN

julia> 500 + Inf
Inf

julia> 500 - Inf
-Inf

julia> Inf + Inf
Inf

julia> Inf - Inf
NaN

julia> Inf * Inf
Inf

julia> Inf / Inf
NaN

julia> 0 * Inf
NaN

julia> NaN == NaN
false

julia> NaN != NaN
true

julia> NaN < NaN
false

julia> NaN > NaN
false

The typemin and typemax functions also apply to floating-point types:

julia> (typemin(Float16),typemax(Float16))
(-Inf16, Inf16)

julia> (typemin(Float32),typemax(Float32))
(-Inf32, Inf32)

julia> (typemin(Float64),typemax(Float64))
(-Inf, Inf)
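Because NaN does not compare equal to itself, code that needs to detect the special values usually relies on predicate functions rather than ==. The sketch below uses isnan, isinf, isfinite, and isequal from Base; the vals array is just sample data.

```julia
vals = [Inf, -Inf, NaN, 1.0]

println(isnan.(vals))        # only the third element is NaN
println(isinf.(vals))        # the first two elements are infinite
println(isfinite.(vals))     # only the last element is finite

println(NaN == NaN)          # false: IEEE comparison
println(isequal(NaN, NaN))   # true: isequal treats NaN as equal to itself
println(1 / -0.0)            # -Inf: the sign of zero carries through division
```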

Machine epsilon

Most real numbers cannot be represented exactly with floating-point numbers, so for many purposes it is important to know the distance between two adjacent representable floating-point numbers, which is often known as https://en.wikipedia.org/wiki/Machine_epsilon [machine epsilon].

Julia provides eps, which gives the distance between 1.0 and the next larger representable floating-point value:

julia> eps(Float32)
1.1920929f-7

julia> eps(Float64)
2.220446049250313e-16

julia> eps() # same as eps(Float64)
2.220446049250313e-16

These values are 2.0^-23 and 2.0^-52 as Float32 and Float64 values, respectively. The eps function can also take a floating-point value as an argument, and gives the absolute difference between that value and the next representable floating-point value. That is, eps(x) yields a value of the same type as x such that x + eps(x) is the next representable floating-point value larger than x:

julia> eps(1.0)
2.220446049250313e-16

julia> eps(1000.)
1.1368683772161603e-13

julia> eps(1e-27)
1.793662034335766e-43

julia> eps(0.0)
5.0e-324

The distance between two adjacent representable floating-point numbers is not constant, but is smaller for smaller values and larger for larger values. In other words, the representable floating-point numbers are densest on the real number line near zero, and grow sparser exponentially as one moves farther away from zero. By definition, eps(1.0) is the same as eps(Float64), since 1.0 is a 64-bit floating-point value.
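The properties of eps described above can be checked numerically. The following sketch uses a few sample positive values chosen for illustration.

```julia
println(eps(Float64) == 2.0^-52)
println(eps(Float32) == 2.0f0^-23)

# For these positive sample values, x + eps(x) lands exactly on the next float.
for x in (1.0, 1000.0, 1e-27)
    println(x + eps(x) == nextfloat(x))
end

# Adding less than the spacing may not change the value at all.
println(1.0 + eps() / 2 == 1.0)
```

The last line is a common source of surprise: increments smaller than the local spacing can be lost entirely.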

Julia also provides the nextfloat and prevfloat functions, which return the next larger or next smaller representable floating-point number relative to the argument, respectively:

julia> x = 1.25f0
1.25f0

julia> nextfloat(x)
1.2500001f0

julia> prevfloat(x)
1.2499999f0

julia> bitstring(prevfloat(x))
"00111111100111111111111111111111"

julia> bitstring(x)
"00111111101000000000000000000000"

julia> bitstring(nextfloat(x))
"00111111101000000000000000000001"

This example highlights the general principle that adjacent representable floating point numbers also have adjacent binary integer representations.
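The same principle can be verified by reinterpreting the bits as an integer and stepping by one. This is a sketch for a positive, finite Float32 value; the ordering is reversed for negative values, since the sign bit is stored separately.

```julia
x = 1.25f0
bits = reinterpret(UInt32, x)   # raw bit pattern viewed as an unsigned integer

# Stepping the integer representation by one moves to the adjacent float.
up   = reinterpret(Float32, bits + one(UInt32))
down = reinterpret(Float32, bits - one(UInt32))

println(up === nextfloat(x))
println(down === prevfloat(x))
println(string(bits, base = 2, pad = 32) == bitstring(x))
```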

Rounding modes

If a number does not have an exact floating-point representation, it must be rounded to an appropriate representable value. However, the manner in which this rounding is done can be changed if required, according to the rounding modes presented in the https://en.wikipedia.org/wiki/IEEE_754-2008 [IEEE 754 standard].

The default mode used is always RoundNearest, which rounds to the nearest representable value, with ties rounded towards the nearest value with an even least significant bit.
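The tie-breaking rule can be observed with ordinary Float64 addition by constructing sums that fall exactly halfway between two representable values. This is an illustrative sketch; half_ulp, a, and b are names invented for the example.

```julia
half_ulp = 2.0^-53                # half the spacing between floats just above 1.0

# Exactly halfway between 1.0 and nextfloat(1.0): rounds down to 1.0,
# whose least significant bit is even.
a = 1.0 + half_ulp

# Exactly halfway between nextfloat(1.0) and 1.0 + 2 * eps(): rounds up,
# because the upper neighbor is the one with an even least significant bit.
b = nextfloat(1.0) + half_ulp

println(a == 1.0)
println(b == 1.0 + 2 * eps())
```

One tie rounds down and the other rounds up, which is what keeps round-to-nearest-even free of systematic bias.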

Background and References

Floating-point arithmetic is characterized by many subtleties that may surprise users who are not familiar with the details of the low-level implementation. These features are described in detail in most books on scientific computing, as well as in the following reference materials.

The definitive guide to floating-point arithmetic is the https://standards.ieee.org/standard/754-2008.html [IEEE 754-2008 standard]; however, it is not available for free online.
For a brief but lucid presentation of how floating-point numbers are represented, see John D. Cook’s https://www.johndcook.com/blog/2009/04/06/anatomy-of-a-floating-point-number/ [article] on the subject, as well as his https://www.johndcook.com/blog/2009/04/06/numbers-are-a-leaky-abstraction/ [introduction] to some of the issues arising from how this representation differs in behavior from the idealized abstraction of real numbers.
Also recommended is https://randomascii.wordpress.com/2012/05/20/thats-not-normalthe-performance-of-odd-floats/ [Bruce Dawson’s series of blog posts on floating-point numbers].
For an excellent, in-depth discussion of floating-point numbers and the issues of numerical accuracy encountered when computing with them, see David Goldberg’s paper https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.22.6768&rep=rep1&type=pdf [What Every Computer Scientist Should Know About Floating-Point Arithmetic].
For even more extensive documentation of the history of, rationale for, and issues with floating-point numbers, as well as discussion of many other topics in numerical computing, see the https://people.eecs.berkeley.edu/~wkahan/ [collected writings] of https://en.wikipedia.org/wiki/William_Kahan [William Kahan], commonly known as the "Father of Floating-Point". Of particular interest may be https://people.eecs.berkeley.edu/~wkahan/ieee754status/754story.html [An Interview with the Old Man of Floating-Point].

Arbitrary precision arithmetic

To allow computations with arbitrary-precision integers and floating-point numbers, Julia wraps the https://gmplib.org [GNU Multiple Precision Arithmetic Library (GMP)] and the https://www.mpfr.org [GNU MPFR Library], respectively. The BigInt and BigFloat types are available in Julia for arbitrary-precision integers and floating-point numbers, respectively.

Constructors exist to create these types from primitive numeric types, and the string literal @big_str or parse can be used to construct them from an AbstractString. BigInt values can also be input as integer literals when they are too big for other built-in integer types. Note that because there is no unsigned arbitrary-precision integer type in Base (BigInt is sufficient in most cases), hexadecimal, octal, and binary literals can be used in addition to decimal literals.

Once created, they participate in arithmetic with all other numeric types thanks to Julia’s type promotion and conversion mechanism:

julia> BigInt(typemax(Int64)) + 1
9223372036854775808

julia> big"123456789012345678901234567890" + 1
123456789012345678901234567891

julia> parse(BigInt, "123456789012345678901234567890") + 1
123456789012345678901234567891

julia> string(big"2"^200, base=16)
"100000000000000000000000000000000000000000000000000"

julia> 0x100000000000000000000000000000000-1 == typemax(UInt128)
true

julia> 0x000000000000000000000000000000000
0

julia> typeof(ans)
BigInt

julia> big"1.23456789012345678901"
1.234567890123456789010000000000000000000000000000000000000000000000000000000004

julia> parse(BigFloat, "1.23456789012345678901")
1.234567890123456789010000000000000000000000000000000000000000000000000000000004

julia> BigFloat(2.0^66) / 3
2.459565876494606882133333333333333333333333333333333333333333333333333333333344e+19

julia> factorial(BigInt(40))
815915283247897734345611269596115894272000000000

However, type promotion between the primitive types above and BigInt/BigFloat is not automatic and must be explicitly stated:

julia> x = typemin(Int64)
-9223372036854775808

julia> x = x - 1
9223372036854775807

julia> typeof(x)
Int64

julia> y = BigInt(typemin(Int64))
-9223372036854775808

julia> y = y - 1
-9223372036854775809

julia> typeof(y)
BigInt
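In practice, the explicit step is usually a call to big or a BigInt constructor on at least one operand before the arithmetic that could overflow. A minimal sketch, with illustrative variable names:

```julia
x = typemin(Int64)

wrapped = x - 1          # both operands are Int64, so the result wraps
exact   = big(x) - 1     # convert first, then subtract in BigInt arithmetic
mixed   = x + big(1)     # a single BigInt operand promotes the whole operation

println(wrapped)
println(exact)
println(typeof(mixed))
```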

The default precision (in number of bits of the significand) and rounding mode of BigFloat operations can be changed globally by calling setprecision and setrounding, and all further calculations will take these changes into account. Alternatively, the precision or the rounding can be changed only within the execution of a particular block of code by using the same functions with a do block:

julia> setrounding(BigFloat, RoundUp) do
           BigFloat(1) + parse(BigFloat, "0.1")
       end
1.100000000000000000000000000000000000000000000000000000000000000000000000000003

julia> setrounding(BigFloat, RoundDown) do
           BigFloat(1) + parse(BigFloat, "0.1")
       end
1.099999999999999999999999999999999999999999999999999999999999999999999999999986

julia> setprecision(40) do
           BigFloat(1) + parse(BigFloat, "0.1")
       end
1.1000000000004
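The scoped form restores the previous setting when the block exits, which can be confirmed with the precision function. The sketch below assumes nothing about the current default other than that it is read before and after the block; the variable names are made up for the example.

```julia
default_bits = precision(BigFloat)        # precision in effect before the block

inside_bits = setprecision(BigFloat, 64) do
    precision(BigFloat)                   # 64 while the block runs
end

after_bits = precision(BigFloat)          # restored once the block exits

# A value keeps the precision it was created with.
third_64 = setprecision(BigFloat, 64) do
    BigFloat(1) / 3
end

println(inside_bits)
println(after_bits == default_bits)
println(precision(third_64))
```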

The relation between setprecision or setrounding and @big_str, the macro used for big string literals (such as big"0.3"), might not be intuitive, as a consequence of the fact that @big_str is a macro. See the @big_str documentation for details.

Numerical literal coefficients

To make common numeric formulae and expressions clearer, Julia allows variables to be immediately preceded by a numeric literal, implying multiplication. This makes writing polynomial expressions much cleaner:

julia> x = 3
3

julia> 2x^2 - 3x + 1
10

julia> 1.5x^2 - .5x + 1
13.0

It also makes writing exponential functions more elegant:

julia> 2^2x
64

The precedence of numeric literal coefficients is slightly lower than that of unary operators such as negation. So -2x is parsed as (-2) * x and √2x is parsed as (√2) * x. However, numeric literal coefficients parse similarly to unary operators when combined with exponentiation. For example, 2^3x is parsed as 2^(3x), and 2x^3 is parsed as 2*(x^3).

Numeric literals also work as coefficients to parenthesized expressions:

julia> 2(x-1)^2 - 3(x-1) + 1
3

The precedence of numeric literal coefficients used for implicit multiplication is higher than that of other binary operators such as multiplication (*) and division (/, \, and //). This means, for example, that 1 / 2im equals -0.5im and 6 // 2(2 + 1) equals 1 // 1.
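The parsing rules for literal coefficients can be spot-checked by comparing each juxtaposed form with its fully parenthesized equivalent. A small sketch, using x = 3 as in the examples above:

```julia
x = 3

checks = [
    -2x == (-2) * x,           # coefficient binds slightly looser than unary minus
    2^3x == 2^(3x),            # coefficient in an exponent groups with its variable
    2x^3 == 2 * (x^3),         # exponentiation binds tighter than the coefficient
    1 / 2im == -0.5im,         # 2im is grouped before the division
    6 // 2(2 + 1) == 1 // 1,   # 2(2 + 1) is grouped before //
]

println(all(checks))
```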

Additionally, parenthesized expressions can be used as coefficients to variables, implying multiplication of the expression by the variable:

julia> (x-1)x
6

Neither juxtaposition of two parenthesized expressions, nor placing a variable before a parenthesized expression, however, can be used to imply multiplication:

julia> (x-1)(x+1)
ERROR: MethodError: objects of type Int64 are not callable

julia> x(x+1)
ERROR: MethodError: objects of type Int64 are not callable

Both expressions are interpreted as function application: any expression that is not a numeric literal, when immediately followed by a parenthetical, is interpreted as a function applied to the values in parentheses (see Functions for more about functions). Thus, in both of these cases, an error occurs since the left-hand value is not a function.

The above syntactic enhancements significantly reduce the visual noise incurred when writing common mathematical formulae. Note that no whitespace may come between a numeric literal coefficient and the identifier or parenthesized expression that it multiplies.

Syntactic conflicts

Juxtaposed literal coefficient syntax may conflict with some numeric literal syntaxes: hexadecimal, octal, and binary integer literals and engineering notation for floating-point literals. Here are some situations where syntactic conflicts arise:

The hexadecimal integer literal expression 0xff could be interpreted as the numeric literal 0 multiplied by the variable xff. Similar ambiguities arise with octal and binary literals such as 0o777 or 0b01001010.
The floating-point literal expression 1e10 could be interpreted as the numeric literal 1 multiplied by the variable e10, and similarly with the equivalent E form.
The 32-bit floating-point literal expression 1.5f22 could be interpreted as the numeric literal 1.5 multiplied by the variable f22.

In all cases the ambiguity is resolved in favor of interpretation as numeric literals:

Expressions starting with 0x/0o/0b are always hexadecimal/octal/binary literals.
Expressions starting with a numeric literal followed by e or E are always floating-point literals.
Expressions starting with a numeric literal followed by an f are always 32-bit floating-point literals.

Unlike E, which is equivalent to e in numeric literals for historical reasons, F is just another letter and does not behave like f in numeric literals. Hence, expressions starting with a numeric literal followed by F are interpreted as the numeric literal multiplied by a variable, which means that, for example, 1.5F22 is equal to 1.5 * F22.
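The resolution rules can be demonstrated by defining variables whose names collide with the literal syntax. In the sketch below, e10, xff, and F22 are hypothetical variables introduced only to show that the literal interpretation wins except in the F case.

```julia
e10 = 5.0    # hypothetical variable; 1e10 is still a Float64 literal
xff = 7      # hypothetical variable; 0xff is still a hexadecimal literal
F22 = 2.0    # hypothetical variable; 1.5F22 means 1.5 * F22

lit_e = 1e10
lit_x = 0xff
lit_f = 1.5f2
prod_F = 1.5F22

println(lit_e)
println(lit_x)
println(lit_f)
println(prod_F)
```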

Literal zero and one

Julia provides functions which return literal 0 and 1 corresponding to a specified type or the type of a given variable.

Function	Description
`zero(x)`	Literal zero of type `x` or type of variable `x`
`one(x)`	Literal one of type `x` or type of variable `x`

These functions are useful in numeric comparisons to avoid overhead from unnecessary type conversion. Examples:

julia> zero(Float32)
0.0f0

julia> zero(1.0)
0.0

julia> one(Int32)
1

julia> one(BigFloat)
1.0
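A typical use of zero is to initialize an accumulator in generic code so that its type matches the data from the start. The function typed_sum below is a hypothetical helper written for this illustration, not part of Base.

```julia
# Hypothetical helper: sums a collection using an accumulator of the element type.
function typed_sum(xs)
    acc = zero(eltype(xs))   # e.g. 0.0f0 for Float32, 0//1 for Rational{Int}
    for x in xs
        acc += x
    end
    return acc
end

println(typed_sum(Float32[1, 2, 3]))
println(typed_sum(Int8[1, 2, 3]))
println(typed_sum([1//2, 1//3]))
```

Starting from a plain 0 would also work in most cases, but the accumulator would then change type during the loop, which is the kind of unnecessary conversion that zero avoids.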