Functions
In the Julia language, a function is an object that maps a tuple of argument values to a return value. Julia functions are not truly mathematical functions, because they can change the global state of a program, and the global state of a program can affect them. The basic syntax for defining functions in Julia is as follows.
julia> function f(x, y)
           x + y
       end
f (generic function with 1 method)
This function takes two arguments, x and y, and returns the value of the last expression evaluated, which is x + y.
There is also a second, more concise syntax for defining a function in Julia. The traditional function declaration syntax shown above is equivalent to the following compact "assignment form".
julia> f(x, y) = x + y
f (generic function with 1 method)
In the assignment form, the body of the function must be a single expression, although it can be a compound expression (see Compound Expressions). Short, simple function definitions are common in Julia. The short function syntax is accordingly quite idiomatic, considerably reducing both typing and visual noise.
A function is called using the traditional parenthesis syntax:
julia> f(2, 3)
5
Without parentheses, the expression f refers to the function object, and can be passed around like any other value:
julia> g = f;
julia> g(2, 3)
5
As with variables, Unicode characters can also be used in function names:
julia> ∑(x, y) = x + y
∑ (generic function with 1 method)
julia> ∑(2, 3)
5
Behavior when passing arguments
Julia function arguments follow a convention sometimes called "pass-by-sharing", which means that values are not copied when they are passed to functions. Function arguments themselves act as new variable bindings (new "names" that can refer to values), much like the assignment argument_name = argument_value, so that the objects they refer to are identical to the passed values. Modifications to mutable values (such as Array) made within a function will be visible to the caller. (This is the same behavior found in Scheme, most Lisps, Python, Ruby, and Perl, among other dynamic languages.)
For example, in the function
function f(x, y)
    x[1] = 42    # mutates x
    y = 7 + y    # new binding for y, no mutation
    return y
end
The statement x[1] = 42 mutates the object x, and hence this change will be visible in the array passed by the caller for this argument. The assignment y = 7 + y, on the other hand, changes the binding ("name") y to refer to a new value 7 + y, rather than mutating the original object referred to by y, and hence does not change the corresponding argument passed by the caller. This can be seen if we call f(x, y):
julia> a = [4, 5, 6]
3-element Vector{Int64}:
4
5
6
julia> b = 3
3
julia> f(a, b) # returns 7 + b == 10
10
julia> a # a[1] is changed to 42 by f
3-element Vector{Int64}:
42
5
6
julia> b # not changed
3
By a common convention in Julia (not a syntactic requirement), such a function would typically be named f!(x, y) rather than f(x, y), as a visual reminder at the call site that at least one of the arguments (often the first one) is being mutated.
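The naming convention can be illustrated with a small sketch. The names scale! and scale below are hypothetical and chosen only for illustration; they are not part of Base.

```julia
# Hypothetical in-place helper: the trailing `!` signals that `v` is mutated.
function scale!(v::AbstractVector, factor)
    for i in eachindex(v)
        v[i] *= factor
    end
    return v
end

# Hypothetical non-mutating counterpart: it works on a copy of the input.
scale(v::AbstractVector, factor) = scale!(copy(v), factor)

data = [1, 2, 3]
scaled = scale(data, 2)   # `data` is left untouched
scale!(data, 10)          # `data` itself is modified
```

Pairing a mutating f! with a non-mutating f built on copy is a common pattern, though nothing in the language enforces it.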
!!! warning "Shared memory between arguments" The behavior of a mutating function can be unexpected when a mutated argument shares memory with another argument, a situation known as aliasing (for example, when one argument is a view of the other). Unless the function's docstring explicitly states that aliasing produces the expected result, it is the caller's responsibility to ensure proper behavior on such inputs.
Declarations of argument types
You can declare the types of function arguments by appending ::TypeName to the argument name, as usual for Type Declarations in Julia. For example, the following function computes https://en.wikipedia.org/wiki/Fibonacci_number [Fibonacci numbers] recursively:
fib(n::Integer) = n ≤ 2 ? one(n) : fib(n-1) + fib(n-2)
and the ::Integer specification means that it will only be callable when the type of n is a subtype of the abstract Integer type.
Argument-type declarations normally have no impact on performance: regardless of what argument types (if any) are declared, Julia compiles a specialized version of the function for the actual argument types passed by the caller. For example, calling fib(1) will trigger the compilation of a specialized version of fib optimized specifically for Int arguments, which is then reused if fib(7) or fib(15) are called. (There are rare exceptions when an argument-type declaration can trigger additional compiler specializations; see Be aware of when Julia avoids specializing.) The most common reasons to declare argument types in Julia are, instead, the following.
- Dispatch: As explained in Methods, you can have different versions ("methods") of a function for different argument types, in which case the argument types are used to determine which implementation is called for which arguments. For example, you might implement a completely different algorithm fib(x::Number) = ... that works for any Number type by using https://en.wikipedia.org/wiki/Fibonacci_number#Binet%27s_formula [Binet's formula] to extend it to non-integer values.
- Correctness: Type declarations can be useful if your function only returns correct results for certain argument types. For example, if we omitted argument types and wrote fib(n) = n ≤ 2 ? one(n) : fib(n-1) + fib(n-2), then fib(1.5) would silently give us the nonsensical answer 1.0.
- Clarity: Type declarations can serve as a form of documentation about the expected arguments.
However, it is a common mistake to overly restrict the argument types, which can unnecessarily limit the applicability of the function and prevent it from being reused in circumstances you did not anticipate. For example, the fib(n::Integer) function above works equally well for Int arguments (machine integers) and BigInt arbitrary-precision integers (see BigFloats and BigInts), which is especially useful because Fibonacci numbers grow exponentially and will quickly overflow any fixed-precision type like Int (see Overflow behavior). If we had declared our function as fib(n::Int), however, the application to BigInt would have been prevented for no reason. In general, you should use the most general applicable abstract types for arguments, and when in doubt, omit the argument types. You can always add argument-type specifications later if they become necessary, and you do not sacrifice performance or functionality by omitting them.
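The trade-off described above can be checked with a short sketch. It reuses the fib definition from this section; the variable names are illustrative, and the inputs are kept small because this recursive definition is exponentially slow.

```julia
fib(n::Integer) = n ≤ 2 ? one(n) : fib(n - 1) + fib(n - 2)

small = fib(10)        # Int argument, Int result
large = fib(big(20))   # BigInt argument, BigInt result from the same method

# A Float64 argument is rejected because Float64 is not a subtype of Integer.
rejected = try
    fib(1.5)
    false
catch err
    err isa MethodError
end
```

Because one(n) returns a value of the same type as n, the single ::Integer method serves both Int and BigInt without any extra code.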
The return keyword
The value returned by a function is the value of the last expression evaluated, which, by default, is the last expression in the body of the function definition. In the example function f from the previous section, this is the value of the expression x + y. Alternatively, as in many other languages, the return keyword causes a function to return immediately, providing an expression whose value is returned:
function g(x, y)
    return x * y
    x + y
end
Since function definitions can be entered into interactive sessions, it is easy to compare these definitions:
julia> f(x, y) = x + y
f (generic function with 1 method)
julia> function g(x, y)
           return x * y
           x + y
       end
g (generic function with 1 method)
julia> f(2, 3)
5
julia> g(2, 3)
6
Of course, in a purely linear function body like g, the use of return is pointless: the expression x + y is never evaluated, and we could simply make x * y the last expression in the function and omit the return. In conjunction with other control flow, however, return is of real use. Here, for example, is a function that computes the hypotenuse length of a right triangle with sides of length x and y, avoiding overflow:
julia> function hypot(x, y)
           x = abs(x)
           y = abs(y)
           if x > y
               r = y/x
               return x*sqrt(1 + r*r)
           end
           if y == 0
               return zero(x)
           end
           r = x/y
           return y*sqrt(1 + r*r)
       end
hypot (generic function with 1 method)
julia> hypot(3, 4)
5.0
There are three possible points of return from this function, returning the values of three different expressions, depending on the values of x and y. The return on the last line could be omitted since it is the last expression.
The type of the returned value
A return type can be specified in the function declaration using the :: operator. This converts the return value to the specified type.
julia> function g(x, y)::Int8
           return x * y
       end;
julia> typeof(g(1, 2))
Int8
This function will always return an Int8 regardless of the types of x and y. See Type Declarations for more on return types.
Return type declarations are rarely used in Julia: in general, you should instead write "type-stable" functions in which Julia's compiler can automatically infer the return type. For more information, see the Performance Tips chapter.
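A declared return type performs a conversion, so it can also fail at run time. The following sketch uses a hypothetical function product8 to show both outcomes.

```julia
# Hypothetical function with a declared return type.
function product8(x, y)::Int8
    return x * y
end

ok = product8(1.0, 2.0)   # the Float64 result 2.0 is converted to an Int8

# 100 * 2 == 200 does not fit in an Int8, so the conversion throws.
overflowed = try
    product8(100, 2)
    false
catch err
    err isa InexactError
end
```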
Returning nothing
For functions that do not need to return a value (functions used only for some side effects), the Julia convention is to return the value nothing:
function printx(x)
    println("x = $x")
    return nothing
end
This is a convention in the sense that nothing is not a Julia keyword but only a singleton object of type Nothing. Also, you may notice that the printx function example above is contrived, because println already returns nothing, so that the return line is redundant.
There are two possible shortened forms for the return nothing expression. On the one hand, the return keyword implicitly returns nothing, so it can be used alone. On the other hand, since functions implicitly return the last expression evaluated, nothing can be used alone when it is the last expression. The preference for the expression return nothing as opposed to return or nothing alone is a matter of coding style.
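The three spellings can be compared side by side. This is a minimal sketch; the functions record_a, record_b, record_c and the messages vector are hypothetical. push! is used as the side effect because it returns the collection, so the explicit nothing actually changes what the function returns.

```julia
messages = String[]   # side-effect target used only for this sketch

function record_a(msg)
    push!(messages, msg)
    return nothing
end

function record_b(msg)
    push!(messages, msg)
    return            # a bare `return` yields `nothing`
end

function record_c(msg)
    push!(messages, msg)
    nothing           # the last expression is `nothing`
end

results = (record_a("a"), record_b("b"), record_c("c"))
```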
Operators are functions
In Julia, most operators are just functions with support for special syntax. (The exceptions are operators with special evaluation semantics like && and ||. These operators cannot be functions since short-circuit evaluation requires that their operands are not evaluated before evaluation of the operator.) Accordingly, you can also apply them using parenthesized argument lists, just as you would any other function:
julia> 1 + 2 + 3
6
julia> +(1, 2, 3)
6
The infix form is exactly equivalent to the function application form; in fact, the former is parsed to produce the function call internally. This also means that you can assign and pass around operators such as + and * just as you would other function values:
julia> f = +;
julia> f(1, 2, 3)
6
Under the name f, however, the function does not support infix notation.
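Because operators are ordinary function values, they can be handed directly to higher-order functions. A brief sketch using reduce, foldl, and map from Base (the variable names are illustrative):

```julia
total   = reduce(+, [1, 2, 3])          # same as 1 + 2 + 3
product = foldl(*, 1:4)                 # ((1 * 2) * 3) * 4
negated = map(-, [1, 2, 3])             # unary minus applied to each element
sums    = map(+, [1, 2], [10, 20])      # binary plus applied pairwise
```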
Operators with special names
Several special expressions correspond to function calls with non-obvious names. They are listed below.
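Expression | Calls |
---|---|
[A B C ...] | hcat |
[A; B; C; ...] | vcat |
[A B; C D; ...] | hvcat |
[A; B;; C; D;; ...] | hvncat |
A' | adjoint |
A[i] | getindex |
A[i] = x | setindex! |
A.n | getproperty |
A.n = x | setproperty! |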
Note that expressions similar to [A; B;; C; D;; ...] but with more than two consecutive ; also correspond to hvncat calls.
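The correspondence between special syntax and function calls can be checked directly. This is a minimal sketch using the Base functions hcat, vcat, hvcat, adjoint, getindex, and setindex!; the variable names are illustrative.

```julia
A = [1 2; 3 4]

same_hcat  = [1 2 3] == hcat(1, 2, 3)
same_vcat  = [1; 2; 3] == vcat(1, 2, 3)
same_hvcat = A == hvcat((2, 2), 1, 2, 3, 4)   # two rows of two elements each
same_adj   = A' == adjoint(A)
same_index = A[1, 2] == getindex(A, 1, 2)

B = copy(A)
setindex!(B, 10, 1, 2)                        # equivalent to B[1, 2] = 10
```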
Anonymous functions
Functions in Julia are https://en.wikipedia.org/wiki/First-class_citizen [first-class objects]: they can be assigned to variables, and called using the standard function call syntax from the variable they have been assigned to. They can be used as arguments, and they can be returned as values. They can also be created anonymously, without being given a name, using either of these syntaxes:
julia> x -> x^2 + 2x - 1
#1 (generic function with 1 method)
julia> function (x)
           x^2 + 2x - 1
       end
#3 (generic function with 1 method)
Each statement creates a function taking one argument x and returning the value of the polynomial x^2 + 2x - 1 at that value. Notice that the result is a generic function, but with a compiler-generated name based on consecutive numbering.
The primary use for anonymous functions is passing them to functions which take other functions as arguments. A classic example is map, which applies a function to each value of an array and returns a new array containing the resulting values:
julia> map(round, [1.2, 3.5, 1.7])
3-element Vector{Float64}:
1.0
4.0
2.0
This is fine if a named function effecting the transform already exists to pass as the first argument to map. Often, however, a ready-to-use, named function does not exist. In these situations, the anonymous function construct allows easy creation of a single-use function object without needing a name:
julia> map(x -> x^2 + 2x - 1, [1, 3, -1])
3-element Vector{Int64}:
2
14
-2
An anonymous function accepting multiple arguments can be written using the syntax (x,y,z)->2x+y-z.
Argument-type declarations for anonymous functions work as for named functions, for example x::Integer->2x. The return type of an anonymous function cannot be specified.
A zero-argument anonymous function can be written as ()->2+2. The idea of a function with no arguments may seem strange, but it is useful in cases where a result cannot (or should not) be precomputed. For example, Julia has a zero-argument time function that returns the current time in seconds, and thus seconds = ()->round(Int, time()) is an anonymous function that returns this time rounded to the nearest integer, assigned to the variable seconds. Each time this anonymous function is called as seconds(), the current time is calculated and returned.
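The forms described above can be sketched together as follows; the variable names are illustrative.

```julia
# Two-argument anonymous function mapped over two arrays in lockstep.
combined = map((x, y) -> 2x + y, [1, 2, 3], [10, 20, 30])

# Anonymous predicate passed to `filter`.
evens = filter(n -> iseven(n), 1:10)

# Zero-argument anonymous function: the body runs only when it is called.
deferred = () -> 2 + 2
value = deferred()
```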
Tuples
Julia has a built-in data structure called a tuple that is closely related to function arguments and return values. A tuple is a fixed-length container that can hold any values, but cannot be modified (it is immutable). Tuples are constructed with commas and parentheses, and can be accessed via indexing:
julia> (1, 1+1)
(1, 2)
julia> (1,)
(1,)
julia> x = (0.0, "hello", 6*7)
(0.0, "hello", 42)
julia> x[2]
"hello"
Note that a length-1 tuple must be written with a comma, (1,), since (1) would just be a parenthesized value. () represents the empty (length-0) tuple.
Named tuples
Alternatively, the components of tuples can be named, in which case a named tuple is constructed:
julia> x = (a=2, b=1+2)
(a = 2, b = 3)
julia> x[1]
2
julia> x.a
2
The fields of named tuples can be accessed by name using dot syntax (x.a) in addition to the regular indexing syntax (x[1] or x[:a]).
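The access forms can be compared in a short sketch, together with the Base functions keys and merge, which also accept named tuples. The variable names are illustrative.

```julia
point = (x = 1, y = 2)

by_name  = point.x
by_index = point[2]
by_key   = point[:y]

field_names = keys(point)            # tuple of the field names
extended = merge(point, (z = 3,))    # builds a new named tuple; `point` is unchanged
```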
Destructuring assignment and multiple return values
A comma-separated list of variables (optionally wrapped in parentheses) can appear on the left side of an assignment: the value on the right side is destructured by iterating over it and assigning to each variable in turn:
julia> (a, b, c) = 1:3
1:3
julia> b
2
The value on the right should be an iterator (see Iteration interface) at least as long as the number of variables on the left (any excess elements of the iterator are ignored).
This can be used to return multiple values from functions by returning a tuple or other iterable value. For example, the following function returns two values:
julia> function foo(a, b)
           a+b, a*b
       end
foo (generic function with 1 method)
If you call it in an interactive session without assigning the return value anywhere, you will see the tuple returned:
julia> foo(2, 3)
(5, 6)
A destructuring assignment extracts each value into a variable:
julia> x, y = foo(2, 3)
(5, 6)
julia> x
5
julia> y
6
Another common use is for swapping variables:
julia> y, x = x, y
(5, 6)
julia> x
6
julia> y
5
If only a subset of the elements of the iterator are required, a common convention is to assign ignored elements to a variable consisting of only underscores _ (which is an otherwise invalid variable name, see Allowed Variable Names):
julia> _, _, _, d = 1:10
1:10
julia> d
4
Other valid left-hand side expressions can be used as elements of the assignment list, which will call setindex! or setproperty!, or recursively destructure individual elements of the iterator:
julia> X = zeros(3);
julia> X[1], (a, b) = (1, (2, 3))
(1, (2, 3))
julia> X
3-element Vector{Float64}:
1.0
0.0
0.0
julia> a
2
julia> b
3
Compatibility: Julia 1.6
Using ... in an assignment requires at least Julia 1.6.
If the last symbol in the assignment list is suffixed by ... (known as slurping), then it will be assigned a collection or lazy iterator of the remaining elements of the right-hand side iterator:
julia> a, b... = "hello"
"hello"
julia> a
'h': ASCII/Unicode U+0068 (category Ll: Letter, lowercase)
julia> b
"ello"
julia> a, b... = Iterators.map(abs2, 1:4)
Base.Generator{UnitRange{Int64}, typeof(abs2)}(abs2, 1:4)
julia> a
1
julia> b
Base.Iterators.Rest{Base.Generator{UnitRange{Int64}, typeof(abs2)}, Int64}(Base.Generator{UnitRange{Int64}, typeof(abs2)}(abs2, 1:4), 1)
See Base.rest for details on the precise handling and customization for specific iterators.
Compatibility: Julia 1.9
Using ... in a non-final position of an assignment requires at least Julia 1.9.
Slurping in assignments can also occur in any other position. As opposed to slurping the end of a collection, however, this will always be eager.
julia> a, b..., c = 1:5
1:5
julia> a
1
julia> b
3-element Vector{Int64}:
2
3
4
julia> c
5
julia> front..., tail = "Hi!"
"Hi!"
julia> front
"Hi"
julia> tail
'!': ASCII/Unicode U+0021 (category Po: Punctuation, other)
This is implemented in terms of the function Base.split_rest.
Note that for varargs function definitions, slurping is still only allowed in the final position. This does not apply to destructuring of a single argument, though, as that does not affect method dispatch:
julia> f(x..., y) = x
ERROR: syntax: invalid "..." on non-final argument
Stacktrace:
[...]
julia> f((x..., y)) = x
f (generic function with 1 method)
julia> f((1, 2, 3))
(1, 2)
Destructuring properties
Instead of destructuring based on iteration, the right side of assignments can also be destructured using property names. This follows the syntax for NamedTuples, and works by assigning to each variable on the left a property of the right side of the assignment with the same name using getproperty:
julia> (; b, a) = (a=1, b=2, c=3)
(a = 1, b = 2, c = 3)
julia> a
1
julia> b
2
Destructuring arguments
The destructuring feature can also be used within a function argument. If a function argument name is written as a tuple (for example, (x, y)) instead of just a symbol, then an assignment (x, y) = argument will be inserted for you:
julia> minmax(x, y) = (y < x) ? (y, x) : (x, y)
julia> gap((min, max)) = max - min
julia> gap(minmax(10, 2))
8
Note the extra set of parentheses in the definition of gap. Without those, gap would be a two-argument function, and this example would not work.
Similarly, property destructuring can also be used for function arguments:
julia> foo((; x, y)) = x + y
foo (generic function with 1 method)
julia> foo((x=1, y=2))
3
julia> struct A
           x
           y
       end
julia> foo(A(3, 4))
7
For anonymous functions, destructuring a single argument requires an extra comma:
julia> map(((x, y),) -> x + y, [(1, 2), (3, 4)])
2-element Array{Int64,1}:
 3
 7
Functions with a variable number of arguments (Vararg)
It is often convenient to be able to write functions taking an arbitrary number of arguments. Such functions are traditionally known as "varargs" functions, which is short for "variable number of arguments". You can define a varargs function by following the last positional argument with an ellipsis:
julia> bar(a, b, x...) = (a, b, x)
bar (generic function with 1 method)
The variables a and b are bound to the first two argument values as usual, and the variable x is bound to an iterable collection of the zero or more values passed to bar after its first two arguments:
julia> bar(1, 2)
(1, 2, ())
julia> bar(1, 2, 3)
(1, 2, (3,))
julia> bar(1, 2, 3, 4)
(1, 2, (3, 4))
julia> bar(1, 2, 3, 4, 5, 6)
(1, 2, (3, 4, 5, 6))
In all these cases, x is bound to a tuple of the trailing values passed to bar.
It is possible to constrain the number of values passed as a variable argument; this is discussed later in Parametrically-constrained Varargs methods.
On the flip side, it is often handy to "splat" the values contained in an iterable collection into a function call as individual arguments. To do this, one also uses ... but in the function call instead:
julia> x = (3, 4)
(3, 4)
julia> bar(1, 2, x...)
(1, 2, (3, 4))
In this case a tuple of values is spliced into a varargs call precisely where the variable number of arguments go. This need not be the case, however:
julia> x = (2, 3, 4)
(2, 3, 4)
julia> bar(1, x...)
(1, 2, (3, 4))
julia> x = (1, 2, 3, 4)
(1, 2, 3, 4)
julia> bar(x...)
(1, 2, (3, 4))
Furthermore, the iterable object splatted into a function call need not be a tuple:
julia> x = [3, 4]
2-element Vector{Int64}:
3
4
julia> bar(1, 2, x...)
(1, 2, (3, 4))
julia> x = [1, 2, 3, 4]
4-element Vector{Int64}:
1
2
3
4
julia> bar(x...)
(1, 2, (3, 4))
Also, the function that arguments are splatted into need not be a varargs function (although it often is):
julia> baz(a, b) = a + b;
julia> args = [1, 2]
2-element Vector{Int64}:
1
2
julia> baz(args...)
3
julia> args = [1, 2, 3]
3-element Vector{Int64}:
1
2
3
julia> baz(args...)
ERROR: MethodError: no method matching baz(::Int64, ::Int64, ::Int64)
The function `baz` exists, but no method is defined for this combination of argument types.
Closest candidates are:
baz(::Any, ::Any)
@ Main none:1
Stacktrace:
[...]
As you can see, if the wrong number of elements are in the splatted container, then the function call will fail, just as it would if too many arguments were given explicitly.
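Slurping in a definition and splatting in a call can be seen together in a small sketch. The function count_args is hypothetical; it only reports how many arguments it received.

```julia
# Hypothetical varargs function: `args` is always a tuple.
count_args(args...) = length(args)

none        = count_args()
three       = count_args(1, "a", :b)
from_vector = count_args([10, 20, 30]...)   # splats three elements
from_string = count_args("hi"...)           # splats the characters 'h' and 'i'
mixed       = count_args(0, (1, 2)..., 3)   # splatting may appear in any position of a call
```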
Optional arguments
It is often possible to provide sensible default values for function arguments. This can save users from having to pass every argument on every call. For example, the function Date(y, [m, d]) from the Dates module constructs a Date type for a given year y, month m and day d. However, the m and d arguments are optional and their default value is 1. This behavior can be expressed concisely as follows.
julia> using Dates
julia> function date(y::Int64, m::Int64=1, d::Int64=1)
           err = Dates.validargs(Date, y, m, d)
           err === nothing || throw(err)
           return Date(Dates.UTD(Dates.totaldays(y, m, d)))
       end
date (generic function with 3 methods)
Note that this definition calls another method of the Date function that takes one argument of type UTInstant{Day}.
With this definition, the function can be called with either one, two or three arguments, and 1 is automatically passed when only one or two of the arguments are specified:
julia> date(2000, 12, 12)
2000-12-12
julia> date(2000, 12)
2000-12-01
julia> date(2000)
2000-01-01
Optional arguments are actually just a convenient syntax for writing multiple method definitions with different numbers of arguments (see Note on Optional and keyword Arguments). This can be checked for our date function example by calling the methods function:
julia> methods(date)
# 3 methods for generic function "date":
[1] date(y::Int64) in Main at REPL[1]:1
[2] date(y::Int64, m::Int64) in Main at REPL[1]:1
[3] date(y::Int64, m::Int64, d::Int64) in Main at REPL[1]:1
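The same expansion into several methods can be observed on any function with defaults. A minimal sketch, using a hypothetical greet function that has no dependency on Dates:

```julia
# Hypothetical function with two optional positional arguments.
greet(name, greeting = "Hello", punctuation = "!") = string(greeting, ", ", name, punctuation)

one_arg   = greet("Ada")
two_args  = greet("Ada", "Hi")
all_args  = greet("Ada", "Hi", "?")

# One definition produced three methods, one per arity.
method_count = length(methods(greet))
```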
Keyword arguments
Some functions need a large number of arguments, or have a large number of behaviors. Remembering how to call such functions can be difficult. Keyword arguments (arguments passed by name) can make these complex interfaces easier to use and extend by allowing arguments to be identified by name instead of only by position.
For example, consider a function plot that plots a line. This function might have many options for controlling line style, width, color, and so on. If it accepts keyword arguments, a possible call might look like plot(x, y, width=2), where we have chosen to specify only line width. Notice that this serves two purposes. The call is easier to read, since we can label an argument with its meaning. It also becomes possible to pass any subset of a large number of arguments, in any order.
Functions with keyword arguments are defined using a semicolon in the signature:
function plot(x, y; style="solid", width=1, color="black")
###
end
When the function is called, the semicolon is optional: one can either call plot(x, y, width=2) or plot(x, y; width=2), but the former style is more common. An explicit semicolon is required only for passing varargs or computed keywords as described below.
Keyword argument default values are evaluated only when necessary (when a corresponding keyword argument is not passed), and in left-to-right order. Therefore default expressions may refer to prior keyword arguments.
The types of keyword arguments can be made explicit as follows:
function f(; x::Int=1)
###
end
Keyword arguments can also be used in varargs functions:
function plot(x...; style="solid")
###
end
Extra keyword arguments can be collected using ..., as in varargs functions:
function f(x; y=0, kwargs...)
###
end
Inside f, kwargs will be an immutable key-value iterator over a named tuple. Named tuples (as well as dictionaries with keys of Symbol, and other iterators yielding two-value collections with a symbol as the first value) can be passed as keyword arguments using a semicolon in a call, for example f(x, z=1; kwargs...).
If a keyword argument is not assigned a default value in the method definition, then it is required: an UndefKeywordError exception will be thrown if the caller does not assign it a value:
function f(x; y)
###
end
f(3, y=5) # ok, y is assigned
f(3)      # throws UndefKeywordError(:y)
One can also pass key => value expressions after a semicolon. For example, plot(x, y; :width => 2) is equivalent to plot(x, y, width=2). This is useful in situations where the keyword name is computed at runtime.
When a bare identifier or dot expression occurs after a semicolon, the keyword argument name is implied by the identifier or field name. For example, plot(x, y; width) is equivalent to plot(x, y; width=width), and plot(x, y; options.width) is equivalent to plot(x, y; width=options.width).
The nature of keyword arguments makes it possible to specify the same argument more than once. For example, in the call plot(x, y; options..., width=2) it is possible that the options structure also contains a value for width. In such a case the rightmost occurrence takes precedence; in this example, width is certain to have the value 2. However, explicitly specifying the same keyword argument multiple times, for example plot(x, y, width=2, width=3), is not allowed and results in a syntax error.
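The calling conventions in this section can be exercised with a small sketch. The functions describe and needs_y are hypothetical stand-ins for plot; describe simply returns what it received so the effect of each call form is visible.

```julia
# Hypothetical function: two keywords with defaults, the rest collected in `kwargs`.
function describe(x; style = "solid", width = 1, kwargs...)
    return (x = x, style = style, width = width, extra = collect(keys(kwargs)))
end

options = (width = 5, color = "red")

splatted   = describe(1; options...)             # keywords taken from a named tuple
overridden = describe(1; options..., width = 2)  # the rightmost `width` wins
computed   = describe(1; :style => "dashed")     # keyword given as a key => value pair

width = 7
shorthand = describe(1; width)                   # same as describe(1; width = width)

# A keyword without a default is required.
needs_y(x; y) = x + y
missing_keyword = try
    needs_y(3)
    false
catch err
    err isa UndefKeywordError
end
```

Note that color, which describe does not name explicitly, ends up in kwargs rather than causing an error; without the trailing kwargs... an unknown keyword would be rejected.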
Do-block syntax for function arguments
Passing functions as arguments to other functions is a powerful technique, but the syntax for it is not always convenient. Such calls are especially awkward to write when the function argument requires multiple lines. As an example, consider calling map on a function with several cases:
map(x->begin
           if x < 0 && iseven(x)
               return 0
           elseif x == 0
               return 1
           else
               return x
           end
       end,
    [A, B, C])
Julia provides a reserved word do for rewriting this code more clearly:
map([A, B, C]) do x
    if x < 0 && iseven(x)
        return 0
    elseif x == 0
        return 1
    else
        return x
    end
end
The do x syntax creates an anonymous function with argument x and passes it as the first argument to the "outer" function, map in this example. Similarly, do a,b would create a two-argument anonymous function. Note that do (a,b) would create a one-argument anonymous function, whose argument is a tuple to be deconstructed. A plain do would declare that what follows is an anonymous function of the form () -> ... .
How these arguments are initialized depends on the "outer" function; here, map will sequentially set x to A, B, C, calling the anonymous function on each, just as would happen in the syntax map(func, [A, B, C]).
This syntax makes it easier to use functions to effectively extend the language, since calls look like normal code blocks. There are many possible uses quite different from map, such as managing system state. For example, there is a version of open that runs code ensuring that the opened file is eventually closed:
open("outfile", "w") do io
    write(io, data)
end
This is accomplished by the following definition:
function open(f::Function, args...)
    io = open(args...)
    try
        f(io)
    finally
        close(io)
    end
end
Here, open first opens the file for writing and then passes the resulting output stream to the anonymous function you defined in the do ... end block. After your function exits, open will make sure that the stream is properly closed, regardless of whether your function exited normally or threw an exception. (The try/finally construct will be described in Control Flow.)
With the do block syntax, it helps to check the documentation or implementation to know how the arguments of the user function are initialized.
A do block, like any other inner function, can "capture" variables from its enclosing scope. For example, the variable data in the above example of open...do is captured from the outer scope. Captured variables can create performance challenges as discussed in Performance Tips.
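Any function that takes a function as its first argument can be called with a do block. The sketch below mirrors the open pattern without touching the file system; with_resource and the events vector are hypothetical names used only for illustration.

```julia
# Hypothetical helper modelled on the open(f, args...) pattern shown above.
function with_resource(f::Function, events::Vector{String})
    push!(events, "acquire")
    try
        return f(events)
    finally
        push!(events, "release")   # runs on normal return and on exceptions
    end
end

events = String[]
result = with_resource(events) do ev
    push!(ev, "work")
    42                             # the value of the block is what f returns
end
```

The caller sees only a block of code, while acquisition and cleanup stay inside the helper.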
Function composition and pipelining
Functions in Julia can be combined together by composition or pipelining (chaining).
Function composition is when you combine functions together and apply the resulting composition to arguments. You use the function composition operator (∘) to compose the functions, so (f ∘ g)(args...; kw...) is the same as f(g(args...; kw...)).
You can type the composition operator at the REPL and in suitably configured editors using \circ<tab>.
For example, the sqrt and + functions can be composed like this:
julia> (sqrt ∘ +)(3, 6)
3.0
This code adds up the numbers first, and then finds the square root of the result.
The next example composes three functions and maps the result over an array of strings:
julia> map(first ∘ reverse ∘ uppercase, split("you can compose functions like this"))
6-element Vector{Char}:
'U': ASCII/Unicode U+0055 (category Lu: Letter, uppercase)
'N': ASCII/Unicode U+004E (category Lu: Letter, uppercase)
'E': ASCII/Unicode U+0045 (category Lu: Letter, uppercase)
'S': ASCII/Unicode U+0053 (category Lu: Letter, uppercase)
'E': ASCII/Unicode U+0045 (category Lu: Letter, uppercase)
'S': ASCII/Unicode U+0053 (category Lu: Letter, uppercase)
Function chaining (sometimes called "piping" or "using a pipe" to send data to a subsequent function) is when you apply a function to the previous function's output:
julia> 1:10 |> sum |> sqrt
7.416198487095663
Here, the total produced by sum is passed to the sqrt function. The equivalent composition would be:
julia> (sqrt ∘ sum)(1:10)
7.416198487095663
The pipe operator can also be used with broadcasting, as .|>, to provide a useful combination of the chaining/piping and dot vectorization syntax (described below).
julia> ["a", "list", "of", "strings"] .|> [uppercase, reverse, titlecase, length]
4-element Vector{Any}:
"A"
"tsil"
"Of"
7
When combining pipes with anonymous functions, parentheses must be used if subsequent pipes are not to be parsed as part of the anonymous function's body. Compare:
julia> 1:3 .|> (x -> x^2) |> sum |> sqrt
3.7416573867739413
julia> 1:3 .|> x -> x^2 |> sum |> sqrt
3-element Vector{Float64}:
1.0
2.0
3.0
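Composition and piping express the same computation in opposite reading orders. A minimal sketch with illustrative names, using only length, split, and sum from Base:

```julia
sentence = "you can compose functions"

# Composition reads right to left: split first, then length.
count_words = length ∘ split
n_composed = count_words(sentence)

# Piping reads left to right and gives the same result.
n_piped = sentence |> split |> length

# Parenthesized anonymous function inside a broadcast pipe, then a plain pipe.
squares_total = [1, 2, 3] .|> (x -> x^2) |> sum
```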
Dot syntax for vectorizing functions
In technical-computing languages, it is common to have "vectorized" versions of functions, which simply apply a given function f(x) to each element of an array A to yield a new array via f(A). This kind of syntax is convenient for data processing, but in other languages vectorization is also often required for performance: if loops are slow, the "vectorized" version of a function can call fast library code written in a low-level language. In Julia, vectorized functions are not required for performance, and indeed it is often beneficial to write your own loops (see Performance Tips), but they can still be convenient. Therefore, any Julia function f can be applied elementwise to any array (or other collection) with the syntax f.(A). For example, sin can be applied to all elements in the vector A like so:
julia> A = [1.0, 2.0, 3.0]
3-element Vector{Float64}:
1.0
2.0
3.0
julia> sin.(A)
3-element Vector{Float64}:
0.8414709848078965
0.9092974268256817
0.1411200080598672
Of course, you can omit the dot if you write a specialized "vector" method of f
, for example via f(A::AbstractArray) = map(f, A)
, and this is just as efficient as f.(A)
. The advantage of the f.(A)
syntax is that the library author does not need to decide in advance which functions are vectorizable.
More generally, f.(args...)
is actually equivalent to broadcast(f, args...)
, which allows you to operate on multiple arrays (even of different shapes), or a mix of arrays and scalars (see Broadcasting). For example, if you have f(x, y) = 3x + 4y
, then f.(pi, A)
will return a new array consisting of f(pi,a)
for each a
in A
, and f.(vector1, vector2)
will return a new vector consisting of f(vector1[i], vector2[i])
for each index i
(throwing an exception if the vectors have different lengths).
julia> f(x, y) = 3x + 4y;
julia> A = [1.0, 2.0, 3.0];
julia> B = [4.0, 5.0, 6.0];
julia> f.(pi, A)
3-element Vector{Float64}:
13.42477796076938
17.42477796076938
21.42477796076938
julia> f.(A, B)
3-element Vector{Float64}:
19.0
26.0
33.0
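The following sketch shows the "different shapes" case and the length-mismatch error. The function g has the same body as f above and is renamed only to keep the example separate; col, row, and table are illustrative names.

```julia
g(x, y) = 3x + 4y          # same body as f above, renamed for this sketch

col = [1.0, 2.0, 3.0]      # 3-element vector (a column)
row = [10.0 20.0]          # 1×2 matrix (a row)

# A column and a row broadcast to a 3×2 matrix: table[i, j] == g(col[i], row[j]).
table = g.(col, row)
println(size(table))       # prints (3, 2)

# Vectors of different lengths cannot be broadcast together.
mismatch = try
    g.(col, [1.0, 2.0])
catch err
    err
end
println(mismatch isa DimensionMismatch)   # prints true
```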
Keyword arguments are not broadcasted over, but are simply passed through to each call of the function. For example, round.(x, digits=3)
is equivalent to broadcast(x -> round(x, digits=3), x)
.
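A short sketch of this equivalence, using arbitrary sample values:

```julia
x = [3.14159, 2.71828, 1.41421]

dotted   = round.(x, digits=3)                     # the keyword goes to every call
explicit = broadcast(v -> round(v, digits=3), x)   # the equivalent spelled out

println(dotted == explicit)   # prints true
```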
Moreover, nested f.(args...)
calls are fused into a single broadcast
loop. For example, sin.(cos.(X))
is equivalent to broadcast(x -> sin(cos(x)), X)
, similar to [sin(cos(x)) for x in X]
: there is only a single loop over X
, and a single array is allocated for the result. [In contrast, sin(cos(X))
in a typical "vectorized" language would first allocate one temporary array for tmp=cos(X)
, and then compute sin(tmp)
in a separate loop, allocating a second array.] This loop fusion is not a compiler optimization that may or may not occur; it is a syntactic guarantee whenever nested f.(args...)
calls are encountered. Technically, the fusion stops as soon as a "non-dot" function call is encountered; for example, in sin.(sort(cos.(X)))
the sin
and cos
loops cannot be merged because of the intervening sort
function.
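The sketch below checks that the fused form produces the same values as the explicit forms mentioned above, and writes out by hand the two separate loops that result when sort sits in the middle. It only compares results; it does not measure allocations.

```julia
X = [0.0, 1.0, 2.0, 3.0]

fused         = sin.(cos.(X))                    # one loop, one result array
as_broadcast  = broadcast(x -> sin(cos(x)), X)   # what the fused form means
comprehension = [sin(cos(x)) for x in X]         # same values via a comprehension

println(fused == as_broadcast == comprehension)  # prints true

# sort is an ordinary (non-dot) call, so the expression splits into two loops:
unfused = sin.(sort(cos.(X)))
tmp     = cos.(X)             # first loop, allocates tmp
by_hand = sin.(sort(tmp))     # sort allocates again, then a second loop runs
println(unfused == by_hand)   # prints true
```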
Finally, the maximum efficiency is typically achieved when the output array of a vectorized operation is pre-allocated, so that repeated calls do not allocate new arrays over and over again for the results (see the section Pre-allocating outputs). A convenient syntax for this is X .= ...
, which is equivalent to broadcast!(identity, X, ...)
, except that, as above, the broadcast!
loop is fused with any nested dot calls. For example, X .= sin.(Y)
is equivalent to broadcast!(sin, X, Y)
, overwriting X
with sin.(Y)
in place. If the left-hand side is an array-indexing expression, for example X[begin+1:end] .= sin.(Y)
, then it translates to broadcast!
on a view
, for example broadcast!(sin, view(X, firstindex(X)+1:lastindex(X)), Y)
, so that the left-hand side is updated in place.
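As a sketch of both forms, the example below first assigns into part of an array through an indexed left-hand side, and then checks with === that a whole-array dot assignment reuses the existing array. The names Z and same_before are illustrative.

```julia
Y = [0.0, 1.0, 2.0]
X = zeros(4)

# Indexed left-hand side: only X[2:end] is overwritten, in place.
X[begin+1:end] .= sin.(Y)
println(X[1] == 0.0)            # prints true: the first element is untouched
println(X[2:end] == sin.(Y))    # prints true

# Whole-array form: Z is reused rather than replaced by a new array.
Z = similar(Y)
same_before = Z
Z .= sin.(Y)
println(Z === same_before)      # prints true: still the same array object
```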
Since adding dots to many operations and function calls in an expression can be tedious and lead to code that is difficult to read, the macro @.
is provided to convert every function call, operation, and assignment in an expression into the "dotted" version.
julia> Y = [1.0, 2.0, 3.0, 4.0];
julia> X = similar(Y); # pre-allocate output array
julia> @. X = sin(cos(Y)) # equivalent to X .= sin.(cos.(Y))
4-element Vector{Float64}:
0.5143952585235492
-0.4042391538522658
-0.8360218615377305
-0.6080830096407656
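The macro also dots operators, not only function calls. A minimal sketch, with an arbitrary expression chosen for illustration:

```julia
Y = [1.0, 2.0, 3.0, 4.0]

# @. adds a dot to every call and operator in the expression...
with_macro = @. sin(cos(Y)) + 2 * Y

# ...so it matches the version with the dots written by hand.
with_dots = sin.(cos.(Y)) .+ 2 .* Y

println(with_macro == with_dots)   # prints true
```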
Binary (or unary) operators like .+
are handled with the same mechanism: they are equivalent to broadcast
calls and are fused with other nested dot calls. X .+= Y
and similar operators are equivalent to X .= X .+ Y
and result in a fused in-place assignment; see also the section Dot operators.
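The practical difference between the dotted and undotted updating operators can be seen through a second name for the same array. In this sketch, alias is an illustrative name bound to the same array as X.

```julia
X = [1.0, 2.0, 3.0]
Y = [10.0, 20.0, 30.0]
alias = X              # a second name for the same array

X .+= Y                # in place: equivalent to X .= X .+ Y
println(alias)         # prints [11.0, 22.0, 33.0]: alias sees the update

X += Y                 # no dot: allocates a new array and rebinds X
println(alias === X)   # prints false: alias still refers to the old array
```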
You can also combine dot operations with function chaining using |>
, as in this example:
julia> 1:5 .|> [x->x^2, inv, x->2*x, -, isodd]
5-element Vector{Real}:
1
0.5
6
-4
true
All functions in the fused broadcast are always called for every element of the result. Thus X .+ σ .* randn.()
will add a mask of independent and identically sampled random values to each element of the array X
, but X .+ σ .* randn()
will add the same random sample to each element. In cases where the fused computation is constant along one or more axes of the broadcast iteration, it may be possible to leverage a space-time tradeoff and allocate intermediate values to reduce the number of computations. See more in the section Performance Tips.
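A sketch of the randn distinction, using an array of zeros so that the added noise is directly visible. The values of X and σ are arbitrary, and the random values differ from run to run.

```julia
X = zeros(5)
σ = 0.1

per_element = X .+ σ .* randn.()   # randn is called once for every element
shared      = X .+ σ .* randn()    # randn is called once, before broadcasting

println(all(==(shared[1]), shared))   # prints true: one sample reused everywhere
println(per_element)                  # five independent samples, different each run
```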
For further reading
It should be mentioned here that this is far from a complete picture of defining functions. Julia has a sophisticated type system and allows multiple dispatch on argument types. None of the examples given here provide any type annotations on their arguments, which means that they apply to all types of arguments. The type system is described in the section Types, and defining a function in terms of methods chosen by multiple dispatch on run-time argument types is described in Methods.