Engee documentation

Metaprogramming

The strongest legacy of Lisp in Julia is its metaprogramming support. Like Lisp, Julia represents its own code as a data structure of the language itself. Since code is represented by objects that can be created and manipulated from within the language, a program can transform and generate its own code. This allows sophisticated code generation without extra build steps, and also allows true Lisp-style macros operating at the level of https://en.wikipedia.org/wiki/Abstract_syntax_tree [abstract syntax trees]. In contrast, preprocessor "macro" systems, like that of C and C++, perform textual manipulation and substitution before any actual parsing or interpretation occurs. Because all data types and code in Julia are represented by Julia data structures, powerful https://en.wikipedia.org/wiki/Reflection_%28computer_programming%29 [reflection] capabilities are available to explore the internals of a program and its types just like any other data.

Metaprogramming is a powerful tool, but it introduces complexity that can make code harder to understand. For example, it can be surprisingly hard to get scope rules correct. Metaprogramming should typically be used only when other approaches, such as higher-order functions and https://en.wikipedia.org/wiki/Closure_(computer_programming) [closures], cannot be applied.

eval and defining new macros should typically be used as a last resort. It is almost never a good idea to use Meta.parse or to convert an arbitrary string into Julia code. To manipulate Julia code, use the Expr data structure directly to avoid the complexity of how Julia syntax is parsed.
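
As a small illustration of this advice, the sketch below builds the same expression three ways. The variable names are chosen only for this example; quoting and interpolation with $ are described later in this chapter.

```julia
op = :*    # an operator chosen programmatically

# Discouraged: assemble source text and hand it to the parser
ex_from_string = Meta.parse("x " * string(op) * " 2")

# Preferred: build the expression tree directly
ex_direct = Expr(:call, op, :x, 2)

# Equivalent, using quoting with interpolation
ex_quoted = :($op(x, 2))

println(ex_from_string == ex_direct == ex_quoted)
```

All three values are the same Expr, but only the first one depends on how the text happens to be parsed.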

The best uses of metaprogramming often implement most of their functionality in runtime helper functions and strive to minimize the amount of code they generate.

Program representation

Each Julia program starts its life as a string:

julia> prog = "1 + 1"
"1 + 1"
What happens next?

The next step is to https://en.wikipedia.org/wiki/Parsing#Computer_languages [parse] each string into an object called an expression, represented by the Julia type Expr:

julia> ex1 = Meta.parse(prog)
:(1 + 1)

julia> typeof(ex1)
Expr

Expr objects contain two parts:

  • a Symbol identifying the kind of expression (the head):

julia> ex1.head
:call

  • the expression arguments, which may be symbols, other expressions, or literal values:

julia> ex1.args
3-element Vector{Any}:
 :+
 1
 1

Expressions can also be constructed directly in https://en.wikipedia.org/wiki/Polish_notation [prefix notation]:

julia> ex2 = Expr(:call, :+, 1, 1)
:(1 + 1)

The two expressions constructed above, by parsing and by direct construction, are equivalent:

julia> ex1 == ex2
true

The key point here is that Julia code is internally represented as a data structure that is accessible from the language itself.

The dump function provides an indented and annotated display of Expr objects:

julia> dump(ex2)
Expr
  head: Symbol call
  args: Array{Any}((3,))
    1: Symbol +
    2: Int64 1
    3: Int64 1

Expr objects can also be nested:

julia> ex3 = Meta.parse("(4 + 4) / 2")
:((4 + 4) / 2)

Another way to view expressions is with Meta.show_sexpr, which displays the https://en.wikipedia.org/wiki/S-expression [S-expression] form of a given Expr, which may look very familiar to users of Lisp. Here is an example illustrating the display of a nested Expr:

julia> Meta.show_sexpr(ex3)
(:call, :/, (:call, :+, 4, 4), 2)
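
Because a nested Expr is an ordinary tree, it can be traversed with ordinary recursive functions. The following is a minimal sketch; count_calls is a hypothetical helper written for this illustration, not part of Julia.

```julia
# Leaves (symbols, numbers, and other non-Expr values) contain no calls
count_calls(x) = 0

# An Expr contributes 1 if it is a call, plus the calls found in its arguments
function count_calls(ex::Expr)
    own = ex.head === :call ? 1 : 0
    return own + sum(count_calls, ex.args; init = 0)
end

ex3 = Meta.parse("(4 + 4) / 2")
println(count_calls(ex3))   # the `/` call and the nested `+` call
```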

Symbols

The : character has two syntactic purposes in Julia. The first form creates a Symbol, an https://en.wikipedia.org/wiki/String_interning [interned string] used as one building block of expressions, from valid identifiers:

julia> s = :foo
:foo

julia> typeof(s)
Symbol

The Symbol constructor takes any number of arguments and creates a new symbol by concatenating their string representations:

julia> :foo === Symbol("foo")
true

julia> Symbol("1foo") # `:1foo` would not work, as `1foo` is not a valid identifier
Symbol("1foo")

julia> Symbol("func",10)
:func10

julia> Symbol(:var,'_',"sym")
:var_sym

In the context of an expression, symbols are used to indicate access to variables; when an expression is evaluated, a symbol is replaced with the value bound to that symbol in the appropriate scope.

Sometimes extra parentheses around the argument to : are needed to avoid ambiguity in parsing:

julia> :(:)
:(:)

julia> :(::)
:(::)

Expressions and evaluation

Quoting

The second syntactic purpose of the : character is to create expression objects without using the explicit Expr constructor. This is referred to as quoting. The : character, followed by paired parentheses around a single statement of Julia code, produces an Expr object based on the enclosed code. Here is an example of the short form used to quote an arithmetic expression:

julia> ex = :(a+b*c+1)
:(a + b * c + 1)

julia> typeof(ex)
Expr

(to view the structure of this expression, try ex.head and ex.args, or use dump as above or Meta.@dump)

Note that equivalent expressions may be constructed using Meta.parse or the direct Expr form:

julia>      :(a + b*c + 1)       ==
       Meta.parse("a + b*c + 1") ==
       Expr(:call, :+, :a, Expr(:call, :*, :b, :c), 1)
true

Expressions provided by the parser generally have only symbols, other expressions, and literal values as their args, whereas expressions constructed by Julia code can have arbitrary run-time values without literal forms as args. In this specific example, + and a are symbols, *(b,c) is a subexpression, and 1 is a literal 64-bit signed integer.
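
To make the point about run-time values concrete, the following sketch places a function object and a vector directly into an Expr. Neither has a literal form that the parser would produce; the variable names are illustrative only.

```julia
data = [1, 2, 3]

# The arguments are the function object `sum` and the vector itself,
# not the symbols :sum and :data
ex = Expr(:call, sum, data)

println(ex.args[1] === sum)    # the function object is stored as-is
println(ex.args[2] === data)   # so is the vector (same object, not a copy)
println(eval(ex))
```

Since no names need to be looked up, evaluating such an expression does not depend on what sum or data refer to in the evaluating module.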

There is a second syntactic form of quoting for multiple expressions: blocks of code enclosed in quote ... end.

julia> ex = quote
           x = 1
           y = 2
           x + y
       end
quote
    #= none:2 =#
    x = 1
    #= none:3 =#
    y = 2
    #= none:4 =#
    x + y
end

julia> typeof(ex)
Expr
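
The structure of such a block can be inspected like any other Expr. The sketch below assumes only what the display above shows: the block interleaves LineNumberNode objects with the statements. Base.remove_linenums! is an unexported helper in Base, used here on a copy so that the original expression is left unchanged.

```julia
ex = quote
    x = 1
    y = 2
    x + y
end

println(ex.head)                                    # the head of a quote block
println(count(a -> a isa LineNumberNode, ex.args))  # one line node per statement

# Strip the line number nodes from a copy to see only the statements
clean = Base.remove_linenums!(copy(ex))
println(clean.args)
```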

Interpolation

Direct construction of Expr objects with value arguments is powerful, but Expr constructors can be tedious compared to "normal" Julia syntax. As an alternative, Julia allows interpolation of literals or expressions into quoted expressions. Interpolation is indicated by a prefix $.

In this example, the value of the variable a is interpolated:

julia> a = 1;

julia> ex = :($a + b)
:(1 + b)

Interpolating into an unquoted expression is not supported and will cause a compile-time error:

julia> $a + b
ERROR: syntax: "$" expression outside quote

In this example, the tuple (1,2,3) is interpolated as an expression into a conditional test:

julia> ex = :(a in $:((1,2,3)) )
:(a in (1, 2, 3))

The use of $ for expression interpolation is intentionally reminiscent of string interpolation and command interpolation. Expression interpolation allows convenient, readable programmatic construction of complex Julia expressions.
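
As an illustration of programmatic construction, the sketch below uses interpolation in a loop to build a polynomial in Horner form. The function horner_expr and its argument names are hypothetical and exist only for this example.

```julia
# Build the expression c1 + x*(c2 + x*(c3 + ...)) for coefficients c1, c2, c3, ...
function horner_expr(x::Symbol, coeffs::AbstractVector)
    ex = coeffs[end]
    for c in reverse(coeffs[1:end-1])
        ex = :($c + $x * $ex)   # interpolate the coefficient, the variable, and the tail
    end
    return ex
end

ex = horner_expr(:t, [1, 2, 3])
println(ex)

# Evaluate with t bound locally, so no global variable is needed
println(eval(:(let t = 2; $ex end)))
```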

Splatting interpolation

Note that the $ interpolation syntax allows only one expression to be inserted into the enclosing expression. In some cases, you have an array of expressions, and you need them all to become arguments to the surrounding expression. This can be done using the syntax $(xs...). For example, the following code creates a function call in which the number of arguments is programmatically determined:

julia> args = [:x, :y, :z];

julia> :(f(1, $(args...)))
:(f(1, x, y, z))
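
The same technique works when the argument list is itself computed. In this sketch, max_call is a hypothetical helper that generates n variable names and splats them into a call to max.

```julia
# Build the call max(x1, x2, ..., xn) for a given n
function max_call(n::Integer)
    vars = [Symbol("x", i) for i in 1:n]
    return :(max($(vars...)))
end

println(max_call(3))
```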

Nested quote

Naturally, it is possible for quote expressions to contain other quote expressions. Understanding how interpolation works in these cases can be a bit tricky. Consider this example:

julia> x = :(1 + 2);

julia> e = quote quote $x end end
quote
    #= none:1 =#
    $(Expr(:quote, quote
    #= none:1 =#
    $(Expr(:$, :x))
end))
end

Note that the result contains $x, which means that x has not been evaluated yet. In other words, the $ expression "belongs to" the inner quote expression, and so its argument is only evaluated when the inner quote expression is:

julia> eval(e)
quote
    #= none:1 =#
    1 + 2
end

However, the outer quote expression is able to interpolate values inside the $ in the inner quote. This is done with multiple $s:

julia> e = quote quote $$x end end
quote
    #= none:1 =#
    $(Expr(:quote, quote
    #= none:1 =#
    $(Expr(:$, :(1 + 2)))
end))
end

Notice that (1 + 2) now appears in the result instead of the symbol x. Evaluating this expression yields an interpolated 3:

julia> eval(e)
quote
    #= none:1 =#
    3
end

The intuition behind this behavior is that x is evaluated once for each $: one $ works similarly to eval(:x), giving the value of x, while two $s do the equivalent of eval(eval(:x)).

QuoteNode

The usual representation of a quote form in an AST is an Expr with head :quote:

julia> dump(Meta.parse(":(1+2)"))
Expr
  head: Symbol quote
  args: Array{Any}((1,))
    1: Expr
      head: Symbol call
      args: Array{Any}((3,))
        1: Symbol +
        2: Int64 1
        3: Int64 2

As we have seen, such expressions support interpolation with $. However, in some situations it is necessary to quote code without performing interpolation. This kind of quoting does not yet have syntax, but is represented internally as an object of type QuoteNode:

julia> eval(Meta.quot(Expr(:$, :(1+2))))
3

julia> eval(QuoteNode(Expr(:$, :(1+2))))
:($(Expr(:$, :(1 + 2))))

The parser yields QuoteNodes for simple quoted items like symbols:

julia> dump(Meta.parse(":x"))
QuoteNode
  value: Symbol x

QuoteNode can also be used for certain advanced metaprogramming tasks.
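
One common situation, sketched below under illustrative names, is building a call that must receive a symbol as data rather than as a variable reference. Wrapping the symbol in a QuoteNode produces the same tree the parser yields for a quoted symbol such as :re.

```julia
field = :re

# getfield(z, :re): the symbol is passed as a value
ex = Expr(:call, :getfield, :z, QuoteNode(field))

# Without the QuoteNode the tree means getfield(z, re), a variable lookup
unquoted = Expr(:call, :getfield, :z, field)

println(ex == :(getfield(z, :re)))
println(ex == unquoted)
println(eval(:(let z = 3 + 4im; $ex end)))
```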

Evaluating expressions

Given an expression object, one can cause Julia to evaluate (execute) it at global scope using eval:

julia> ex1 = :(1 + 2)
:(1 + 2)

julia> eval(ex1)
3

julia> ex = :(a + b)
:(a + b)

julia> eval(ex)
ERROR: UndefVarError: `b` not defined in `Main`
[...]

julia> a = 1; b = 2;

julia> eval(ex)
3

Every module has its own eval function that evaluates expressions in its global scope. Expressions passed to eval are not limited to returning values; they can also have side effects that alter the state of the enclosing module's environment:

julia> ex = :(x = 1)
:(x = 1)

julia> x
ERROR: UndefVarError: `x` not defined in `Main`

julia> eval(ex)
1

julia> x
1

Here, the evaluation of an expression object causes a value to be assigned to the global variable x.
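
The per-module nature of evaluation can be seen by evaluating into a module other than the current one. This sketch assumes a freshly created module named Sandbox (a name invented for the example) and uses Core.eval, which takes the target module as its first argument.

```julia
# A new, empty module used only for this demonstration
Sandbox = Module(:Sandbox)

Core.eval(Sandbox, :(counter = 10))   # defines a global inside Sandbox
Core.eval(Sandbox, :(counter += 1))   # modifies it, again inside Sandbox

println(Core.eval(Sandbox, :counter))
println(isdefined(@__MODULE__, :counter))   # the current module is unaffected
```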

Since expressions are just Expr objects that can be constructed programmatically and then evaluated, it is possible to dynamically generate arbitrary code that can then be run using eval. Here is a simple example:

julia> a = 1;

julia> ex = Expr(:call, :+, a, :b)
:(1 + b)

julia> a = 0; b = 2;

julia> eval(ex)
3

The value of a is used to construct the expression ex, which applies the + function to the value 1 and the variable b. Note the important distinction between the way a and b are used:

  • The value of the variable a at expression construction time is used as an immediate value in the expression. Thus, the value of a when the expression is evaluated no longer matters: the value in the expression is already 1, independent of whatever the value of a might be.

  • On the other hand, the symbol :b is used in the expression construction, so the value of the variable b at that time is irrelevant; :b is just a symbol, and the variable b need not even be defined. At expression evaluation time, however, the value of the symbol :b is resolved by looking up the value of the variable b.

Functions on Expressions

As hinted above, one extremely useful feature of Julia is the capability to generate and manipulate Julia code within Julia itself. We have already seen one example of a function returning Expr objects: the Meta.parse function, which takes a string of Julia code and returns the corresponding Expr. A function can also take one or more Expr objects as arguments and return another Expr. Here is a simple, motivating example:

julia> function math_expr(op, op1, op2)
           expr = Expr(:call, op, op1, op2)
           return expr
       end
math_expr (generic function with 1 method)

julia> ex = math_expr(:+, 1, Expr(:call, :*, 4, 5))
:(1 + 4 * 5)

julia> eval(ex)
21

As another example, here is a function that doubles any numeric argument, but leaves expressions alone:

julia> function make_expr2(op, opr1, opr2)
           opr1f, opr2f = map(x -> isa(x, Number) ? 2*x : x, (opr1, opr2))
           retexpr = Expr(:call, op, opr1f, opr2f)
           return retexpr
       end
make_expr2 (generic function with 1 method)

julia> make_expr2(:+, 1, 2)
:(2 + 4)

julia> ex = make_expr2(:+, 1, Expr(:call, :*, 5, 8))
:(2 + 5 * 8)

julia> eval(ex)
42
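
Functions on expressions are often recursive, since expressions nest. The following sketch replaces every occurrence of one symbol with another value throughout a tree; substitute is a hypothetical helper, not a Base function.

```julia
# Non-Expr leaves: replace the leaf if it is the symbol we are looking for
substitute(x, from::Symbol, to) = x === from ? to : x

# Expr nodes: rebuild the node with every argument processed recursively
function substitute(ex::Expr, from::Symbol, to)
    return Expr(ex.head, (substitute(a, from, to) for a in ex.args)...)
end

ex = substitute(:(a * x + b * x^2), :x, :(y - 1))
println(ex)
```

The input expression is not modified; a new tree is returned.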

Macros

Macros provide a mechanism to include generated code in the final body of a program. A macro maps a tuple of arguments to a returned expression, and the resulting expression is compiled directly rather than requiring a runtime eval call. Macro arguments may include expressions, literal values, and symbols.

The basics

Here is an extremely simple macro:

julia> macro sayhello()
           return :( println("Hello, world!") )
       end
@sayhello (macro with 1 method)

Macros have a dedicated character in Julia's syntax: the @ (at-sign), followed by the unique name declared in a macro NAME ... end block. In this example, the compiler will replace all instances of @sayhello with:

:( println("Hello, world!") )

When @sayhello is entered in the REPL, the expression executes immediately, so we only see the evaluation result:

julia> @sayhello()
Hello, world!

Now, consider a slightly more complex macro:

julia> macro sayhello(name)
           return :( println("Hello, ", $name) )
       end
@sayhello (macro with 1 method)

This macro takes one argument: name. When @sayhello is encountered, the quoted expression is expanded to interpolate the value of the argument into the final expression:

julia> @sayhello("human")
Hello, human

We can view the quoted return expression using the function macroexpand (important note: this is an extremely useful tool for debugging macros):

julia> ex = macroexpand(Main, :(@sayhello("human")) )
:(Main.println("Hello, ", "human"))

julia> typeof(ex)
Expr

You can see that the literal "human" is interpolated into the expression.

There is also a macro @macroexpand that is perhaps a bit more convenient than the macroexpand function:

julia> @macroexpand @sayhello "human"
:(println("Hello, ", "human"))

Wait a minute: why macros?

We have already seen a function f(::Expr...) -> Expr in a previous section. In fact, macroexpand is also such a function. So, why do macros exist?

Macros are necessary because they execute when code is parsed; therefore, macros allow the programmer to generate and include fragments of customized code before the full program is run. To illustrate the difference, consider the following example:

julia> macro twostep(arg)
           println("I execute at parse time. The argument is: ", arg)
           return :(println("I execute at runtime. The argument is: ", $arg))
       end
@twostep (macro with 1 method)

julia> ex = macroexpand(Main, :(@twostep :(1, 2, 3)) );
I execute at parse time. The argument is: :((1, 2, 3))

The first call to println is executed when macroexpand is called. The resulting expression contains only the second println:

julia> typeof(ex)
Expr

julia> ex
:(println("I execute at runtime. The argument is: ", $(Expr(:copyast, :($(QuoteNode(:((1, 2, 3)))))))))

julia> eval(ex)
I execute at runtime. The argument is: (1, 2, 3)
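
A practical consequence is that a macro can do work once, during expansion, and leave only the result in the program. The sketch below uses a hypothetical macro @static_factorial that accepts an integer literal and returns the computed value, so the expanded code contains a constant rather than a call.

```julia
macro static_factorial(n::Integer)
    return factorial(n)   # runs while the macro is expanded, not at run time
end

# After expansion, the body of this function is just a constant
ten_factorial() = @static_factorial 10

expanded = @macroexpand @static_factorial 5

println(ten_factorial())
println(expanded)
```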

Calling a macro

Macros are called using the following general syntax:

@name expr1 expr2 ...
@name(expr1, expr2, ...)

Note the distinctive @ before the macro name, the absence of commas between the argument expressions in the first form, and the absence of a space after the @name in the second. These two styles should not be mixed. For example, the following syntax differs from the examples above; it passes the tuple (expr1, expr2, ...) as a single argument to the macro:

@name (expr1, expr2, ...)

An alternative way to invoke a macro over an array literal (or comprehension) is to juxtapose both without using parentheses. In this case, the array will be the only expression fed to the macro. The following syntax is equivalent (and different from @name [a b] * v):

@name[a b] * v
@name([a b]) * v
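
The difference between these call forms can be checked by counting the arguments a macro receives. In this sketch, @nargs is a hypothetical macro that returns the number of arguments it was given; the symbols a and b are never evaluated, so they do not need to be defined.

```julia
macro nargs(args...)
    return length(args)
end

n_space = @nargs a b      # two space-separated arguments
n_paren = @nargs(a, b)    # two arguments in call form
n_tuple = @nargs (a, b)   # a single tuple expression
n_array = @nargs[a b]     # a single array expression

println((n_space, n_paren, n_tuple, n_array))
```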

It is important to emphasize that macros receive their arguments as expressions, literals, or symbols. One way to explore macro arguments is to call the show function within the macro body:

julia> macro showarg(x)
           show(x)
           # ... remainder of macro, returning an expression
       end
@showarg (macro with 1 method)

julia> @showarg(a)
:a

julia> @showarg(1+1)
:(1 + 1)

julia> @showarg(println("Yo!"))
:(println("Yo!"))

julia> @showarg(1)        # Numeric literal
1

julia> @showarg("Yo!")    # String literal
"Yo!"

julia> @showarg("Yo! $("hello")")    # An interpolated string is an expression, not a string
:("Yo! $("hello")")

In addition to the given argument list, every macro is passed extra arguments named __source__ and __module__.

The __source__ argument provides information (in the form of a LineNumberNode object) about the parser location of the @ sign from the macro invocation. This allows macros to include better error diagnostic information, and is commonly used by logging, string-parser macros, and docs, for example, as well as to implement the @__LINE__, @__FILE__, and @__DIR__ macros.

Location information can be accessed by referring to __source__.line and __source__.file:

julia> macro __LOCATION__(); return QuoteNode(__source__); end
@__LOCATION__ (macro with 1 method)

julia> dump(
            @__LOCATION__(
       ))
LineNumberNode
  line: Int64 2
  file: Symbol none

The __module__ argument provides information (in the form of a Module object) about the expansion context of the macro invocation. This allows macros to look up contextual information, such as existing bindings, or to insert the value as an extra argument to a runtime function call doing self-reflection in the current module.
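
Both hidden arguments can be used together. The following sketch defines a hypothetical macro @whereami that expands to a tuple holding the calling module and the line of the call site; the result variable names are illustrative.

```julia
macro whereami()
    # Interpolate the Module object and the line number into a tuple expression
    return :(($__module__, $(__source__.line)))
end

where_mod, where_line = @whereami()

println(where_mod)
println(where_line)
```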

Building an advanced macro

Here is a simplified definition of Julia's @assert macro:

julia> macro assert(ex)
           return :( $ex ? nothing : throw(AssertionError($(string(ex)))) )
       end
@assert (macro with 1 method)

This macro can be used as follows:

julia> @assert 1 == 1.0

julia> @assert 1 == 0
ERROR: AssertionError: 1 == 0

In place of the written syntax, the macro call is expanded at parse time to its returned result. This is equivalent to writing:

1 == 1.0 ? nothing : throw(AssertionError("1 == 1.0"))
1 == 0 ? nothing : throw(AssertionError("1 == 0"))

That is, in the first call, the expression :(1 == 1.0) is spliced into the test condition slot, while the value of string(:(1 == 1.0)) is spliced into the assertion message slot. The entire expression, thus constructed, is placed into the syntax tree where the @assert macro call occurs. Then at execution time, if the test expression evaluates to true, nothing is returned, whereas if the test is false, an error is raised indicating that the asserted expression was false. Note that it would not be possible to write this as a function, since only the value of the condition is available, and it would be impossible to display the expression that computed it in the error message.

The actual definition of @assert in Julia Base is more complicated. It allows the user to optionally specify their own error message, instead of just printing the failed expression. Just like in functions with a variable number of arguments (Vararg), this is specified with an ellipsis following the last argument:

julia> macro assert(ex, msgs...)
           msg_body = isempty(msgs) ? ex : msgs[1]
           msg = string(msg_body)
           return :($ex ? nothing : throw(AssertionError($msg)))
       end
@assert (macro with 1 method)

Now @assert has two modes of operation, depending upon the number of arguments it receives! If there is only one argument, the tuple of expressions captured by msgs will be empty and it will behave the same as the simpler definition above. But now if the user specifies a second argument, it is printed in the message body instead of the failing expression. You can inspect the result of a macro expansion with the aptly named @macroexpand macro:

julia> @macroexpand @assert a == b
:(if Main.a == Main.b
        Main.nothing
    else
        Main.throw(Main.AssertionError("a == b"))
    end)

julia> @macroexpand @assert a==b "a should equal b!"
:(if Main.a == Main.b
        Main.nothing
    else
        Main.throw(Main.AssertionError("a should equal b!"))
    end)

There is yet another case that the actual @assert macro handles: what if, in addition to printing "a should equal b", we wanted to print their values? One might naively try to use string interpolation in the custom message, e.g., @assert a==b "a ($a) should equal b ($b)!", but this won't work as expected with the above macro. Can you see why? Recall from the section on string interpolation that an interpolated string is rewritten to a call to string. Compare:

julia> typeof(:("a should equal b"))
String

julia> typeof(:("a ($a) should equal b ($b)!"))
Expr

julia> dump(:("a ($a) should equal b ($b)!"))
Expr
  head: Symbol string
  args: Array{Any}((5,))
    1: String "a ("
    2: Symbol a
    3: String ") should equal b ("
    4: Symbol b
    5: String ")!"

So now, instead of getting a plain string in msg_body, the macro is receiving a full expression that will need to be evaluated in order to display as expected. This can be spliced directly into the returned expression as an argument to the string call; see https://github.com/JuliaLang/julia/blob/master/base/error.jl [error.jl] for the complete implementation.
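
A minimal sketch of that idea follows. It is not the implementation in error.jl; the macro name @check is hypothetical, and esc, which makes the user's expressions resolve in the calling context, is explained in the Hygiene section below.

```julia
macro check(ex, msg)
    # Splice the message expression into a call to string, so that any
    # interpolation inside it is evaluated at run time in the caller's context
    return :( $(esc(ex)) ? nothing : throw(AssertionError(string($(esc(msg))))) )
end

a, b = 1, 2
err = try
    @check a == b "a ($a) should equal b ($b)!"
    nothing
catch e
    e
end

println(err.msg)
```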

The @assert macro makes great use of splicing into quoted expressions to simplify the manipulation of expressions inside the macro body.

Hygiene

An issue that arises in more complex macros is that of https://en.wikipedia.org/wiki/Hygienic_macro [hygiene]. In short, macros must ensure that the variables they introduce in their returned expressions do not accidentally clash with existing variables in the surrounding code they expand into. Conversely, the expressions that are passed into a macro as arguments are often expected to evaluate in the context of the surrounding code, interacting with and modifying the existing variables. Another concern arises from the fact that a macro may be called in a different module from where it was defined. In this case we need to ensure that all global variables are resolved to the correct module. Julia already has a major advantage over languages with textual macro expansion (like C) in that it only needs to consider the returned expression. All the other variables (such as msg in @assert above) follow the normal scoping block behavior.

To demonstrate these issues, let us consider writing a @time macro that takes an expression as its argument, records the time, evaluates the expression, records the time again, prints the difference between the before and after times, and then has the value of the expression as its final value. The macro might look like this:

macro time(ex)
    return quote
        local t0 = time_ns()
        local val = $ex
        local t1 = time_ns()
        println("elapsed time: ", (t1-t0)/1e9, " seconds")
        val
    end
end

Here, we want t0, t1, and val to be private temporary variables, and we want time_ns to refer to the time_ns function in Julia Base, not to any time_ns variable the user might have (the same applies to println). Imagine the problems that could occur if the user expression ex also contained assignments to a variable called t0, or defined its own time_ns variable. We might get errors, or mysteriously incorrect behavior.

Julia's macro expander solves these problems in the following way. First, variables within a macro result are classified as either local or global. A variable is considered local if it is assigned to (and not declared global), declared local, or used as a function argument name. Otherwise, it is considered global. Local variables are then renamed to be unique (using the gensym function, which generates new symbols), and global variables are resolved within the macro definition environment. Therefore both of the above concerns are handled; the macro's locals will not conflict with any user variables, and time_ns and println will refer to the Julia Base definitions.

However, one problem remains. Consider the following usage of this macro:

module MyModule
import Base.@time

time_ns() = ... # compute something

@time time_ns()
end

Here the user expression ex is a call to time_ns, but not the same time_ns function that the macro uses. It clearly refers to MyModule.time_ns. Therefore we must arrange for the code in ex to be resolved in the macro call environment. This is done by "escaping" the expression with esc:

macro time(ex)
    ...
    local val = $(esc(ex))
    ...
end

An expression wrapped in this way remains untouched by the macro expander and is simply inserted verbatim into the output. Therefore, it will be resolved in the macro invocation environment.

This escaping mechanism can be used to "violate" hygiene when necessary, in order to introduce or manipulate user variables. For example, the following macro sets x to zero in the call environment:

julia> macro zerox()
           return esc(:(x = 0))
       end
@zerox (macro with 1 method)

julia> function foo()
           x = 1
           @zerox
           return x # is zero
       end
foo (generic function with 1 method)

julia> foo()
0

This kind of manipulation of variables should be used judiciously, but is occasionally quite handy.
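
The effect of hygiene and of esc can be compared side by side. In this sketch, the macros @set_tmp and @set_tmp_escaped and the function demo are hypothetical names: the first macro assigns to a renamed, private tmp, while the second assigns to the caller's tmp.

```julia
macro set_tmp()
    return :(tmp = 42)        # hygienic: `tmp` is renamed to a unique symbol
end

macro set_tmp_escaped()
    return esc(:(tmp = 42))   # escaped: refers to the caller's `tmp`
end

function demo()
    tmp = 1
    @set_tmp
    after_hygienic = tmp      # the caller's variable is untouched
    @set_tmp_escaped
    return (after_hygienic, tmp)
end

println(demo())
```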

Getting the hygiene rules right can be a formidable challenge. Before using a macro, you might want to consider whether a function closure would be sufficient. Another useful strategy is to defer as much work as possible to runtime. For example, many macros simply wrap their arguments in a QuoteNode or other similar Expr. Some examples of this include @task body, which simply returns schedule(Task(() -> $body)), and @eval expr, which simply returns eval(QuoteNode(expr)).

To demonstrate, we could rewrite the @time example above as follows:

macro time(expr)
    return :(timeit(() -> $(esc(expr))))
end
function timeit(f)
    t0 = time_ns()
    val = f()
    t1 = time_ns()
    println("elapsed time: ", (t1-t0)/1e9, " seconds")
    return val
end

However, we don't do this for a good reason: wrapping the expr in a new scope block (the anonymous function) also slightly changes the meaning of the expression (the scope of any variables in it), while we want @time to be usable with minimum impact on the wrapped code.

Macros and dispatch

Macros, just like Julia functions, are generic. This means they can also have multiple method definitions, thanks to multiple dispatch:

julia> macro m end
@m (macro with 0 methods)

julia> macro m(args...)
           println("$(length(args)) arguments")
       end
@m (macro with 1 method)

julia> macro m(x,y)
           println("Two arguments")
       end
@m (macro with 2 methods)

julia> @m "asd"
1 arguments

julia> @m 1 2
Two arguments

However, keep in mind that macro dispatch is based on the types of the AST that are handed to the macro, not the types that the AST evaluates to at runtime:

julia> macro m(::Int)
           println("An Integer")
       end
@m (macro with 3 methods)

julia> @m 2
An Integer

julia> x = 2
2

julia> @m x
1 arguments
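
Dispatching on AST types gives a compact way to treat different kinds of argument differently. The sketch below defines a hypothetical macro @kind with one method per kind of syntax object it may receive.

```julia
macro kind(::Symbol)
    return "symbol"
end

macro kind(::Expr)
    return "expression"
end

macro kind(::Any)          # anything else, e.g. numbers and strings
    return "literal"
end

kinds = (@kind(x), @kind(x + 1), @kind(42))
println(kinds)
```

Note that x does not need to be defined: the macro only sees the symbol :x and never evaluates it.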

Code generation

When a significant amount of repetitive boilerplate code is required, it is common to generate it programmatically to avoid redundancy. In most languages, this requires an extra build step and a separate program to generate the repetitive code. In Julia, expression interpolation and eval allow such code generation to take place in the normal course of program execution. For example, consider the following custom type:

struct MyNumber
    x::Float64
end

for which we want to add a number of methods. We can do this programmatically in the following loop:

for op = (:sin, :cos, :tan, :log, :exp)
    eval(quote
        Base.$op(a::MyNumber) = MyNumber($op(a.x))
    end)
end

and we can now use those functions with our custom type:

julia> x = MyNumber(π)
MyNumber(3.141592653589793)

julia> sin(x)
MyNumber(1.2246467991473532e-16)

julia> cos(x)
MyNumber(-1.0)

In this manner, Julia acts as its own https://en.wikipedia.org/wiki/Preprocessor [preprocessor], and allows code generation from inside the language. The above code could be written slightly more tersely using the : prefix quoting form:

for op = (:sin, :cos, :tan, :log, :exp)
    eval(:(Base.$op(a::MyNumber) = MyNumber($op(a.x))))
end

This sort of in-language code generation using the eval(quote(...)) pattern is common enough, however, that Julia comes with a macro to abbreviate it:

for op = (:sin, :cos, :tan, :log, :exp)
    @eval Base.$op(a::MyNumber) = MyNumber($op(a.x))
end

The @eval macro rewrites this call to be precisely equivalent to the longer versions above. For longer blocks of generated code, the expression argument given to @eval can be a block:

@eval begin
    # multiple lines
end
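
As a sketch of the block form, the loop below generates a pair of conversion functions for each unit. The unit names, factors, and the to_/from_ naming scheme are invented for this illustration.

```julia
for (unit, factor) in ((:km, 1000.0), (:cm, 0.01))
    to_name   = Symbol("to_", unit)     # e.g. :to_km
    from_name = Symbol("from_", unit)   # e.g. :from_km
    @eval begin
        # Convert a length in meters to the given unit
        $to_name(meters::Real) = meters / $factor
        # Convert a length in the given unit back to meters
        $from_name(value::Real) = value * $factor
    end
end

println(to_km(1500))
println(from_cm(250))
```

Both the generated function names and the numeric factor are interpolated, so each generated method contains its factor as a constant.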

Non-standard string literals

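Recall from the section Strings that string literals prefixed by an identifier are called non-standard string literals, and can have different semantics than un-prefixed string literals. For example:

  • r"^\s*(?:#|$)" produces a regular expression object rather than a string

  • b"DATA\xff\u2200" is a byte array literal for [68,65,84,65,255,226,136,128]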

Perhaps surprisingly, these behaviors are not hard-coded into the Julia parser or compiler. Instead, they are custom behaviors provided by a general mechanism that anyone can use: prefixed string literals are parsed as calls to specially named macros. For example, the regular expression macro is just the following:

macro r_str(p)
    Regex(p)
end

That's all. This macro says that the literal contents of the string literal r"^\s*(?:#|$)" should be passed to the @r_str macro and the result of that expansion should be placed in the syntax tree where the string literal occurs. In other words, the expression r"^\s*(?:#|$)" is equivalent to placing the following object directly into the syntax tree:

Regex("^\\s*(?:#|\$)")

Not only is the string literal form shorter and far more convenient, but it is also more efficient: since the regular expression is compiled and the Regex object is actually created when the code is compiled, the compilation occurs only once, rather than every time the code is executed. Consider a regular expression that occurs in a loop:

for line = lines
    m = match(r"^\s*(?:#|$)", line)
    if m === nothing
        # non-comment
    else
        # comment
    end
end

Since the regular expression r"^\s*(?:#|$)" is compiled and inserted into the syntax tree when this code is parsed, the expression is only compiled once instead of each time the loop is executed. In order to accomplish this without macros, one would have to write this loop like this:

re = Regex("^\\s*(?:#|\$)")
for line = lines
    m = match(re, line)
    if m === nothing
        # non-comment
    else
        # comment
    end
end

Moreover, if the compiler could not determine that the regex object was constant over all loops, certain optimizations might not be possible, making this version still less efficient than the more convenient literal form above. Of course, there are still situations where the non-literal form is more convenient: if one needs to interpolate a variable into the regular expression, one must take this more verbose approach; in cases where the regular expression pattern itself is dynamic, potentially changing upon each loop iteration, a new regular expression object must be constructed on each iteration. In the vast majority of use cases, however, regular expressions are not constructed based on run-time data. In this majority of cases, the ability to write regular expressions as compile-time values is invaluable.
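
The same mechanism is available to user code. As a sketch, the hypothetical macro @hex_str below turns hex"..." literals into integers; the parsing happens once, during macro expansion, so the running program only contains the resulting number.

```julia
macro hex_str(s)
    # `s` is the raw text between the quotes; parse it as a base-16 integer
    return parse(Int, s; base = 16)
end

value = hex"ff"
println(value)

# The expansion is already a plain integer
println(@macroexpand hex"1F")
```

An invalid literal such as hex"zz" would raise an error when the macro is expanded, before the surrounding code runs.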

The mechanism for user-defined string literals is extremely powerful. Not only are Julia's non-standard literals implemented with it, but the command literal syntax (`echo "Hello, $person"`) is also implemented using the following innocuous-looking macro:

macro cmd(str)
    :(cmd_gen($(shell_parse(str)[1])))
end

Of course, much of the complexity lies in the functions used in this macro definition, but they are just functions written entirely in Julia. You can read their source code and see exactly what they are doing, and all they are doing is constructing expression objects to insert into the syntax tree of your program.

Like string literals, command literals can also be prefixed by an identifier to form what are called non-standard command literals. These command literals are parsed as calls to specially named macros. For example, the syntax custom`literal` is parsed as @custom_cmd "literal". Julia itself does not contain any non-standard command literals, but packages can make use of this syntax. Aside from the different syntax and the _cmd suffix instead of the _str suffix, non-standard command literals behave exactly like non-standard string literals.
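As a minimal sketch of a non-standard command literal, the macro below defines a hypothetical shout prefix (the name is invented for illustration and does not exist in Julia or in any package referenced here). It receives the literal's contents as a string and returns an upper-cased copy.

```julia
# Hypothetical macro: shout`...` is parsed as @shout_cmd "...".
macro shout_cmd(str)
    # str is the raw text between the backticks; the returned String
    # is placed into the syntax tree as a constant.
    return uppercase(str)
end

loud = shout`hello world`
```

Because the macro runs when the code is parsed and expanded, the upper-casing happens once, not each time the surrounding code executes.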

If two modules provide non-standard string or command literals with the same name, it is possible to qualify the string or command literal using the module name. For example, if both Foo and Bar provide a non-standard string literal @x_str, then you can write Foo.x"literal" or Bar.x"literal" to distinguish them.

Another way to define a macro is as follows:

macro foo_str(str, flag)
    # do something
end

This macro can then be called using the following syntax:

foo"str"flag

In the syntax above, flag is a String containing whatever trails immediately after the closing quote of the string literal.
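A small sketch of the flag mechanism follows. The macro name tagged_str is hypothetical and chosen only for illustration; it returns both of its arguments so that the trailing flag can be inspected.

```julia
# Hypothetical macro: tagged"..."flag is parsed as @tagged_str "..." "flag".
macro tagged_str(str, flag)
    # Both arguments arrive as Strings; build a tuple expression from them.
    return :(($str, $flag))
end

pair = tagged"hello"xyz
```

Note that this sketch defines only the two-argument form, so writing tagged"hello" without a trailing flag would fail to find a matching macro method.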

Generated functions

A very special macro is @generated, which allows you to define so-called generated functions. These can generate specialized code depending on the types of their arguments, with more flexibility and/or less code than can be achieved with multiple dispatch. While macros work with expressions at parse time and cannot access the types of their inputs, a generated function is expanded at a time when the types of the arguments are known but the function has not yet been compiled.

Instead of performing some calculation or action, a generated function declaration returns a quoted expression, which then forms the body of the method corresponding to the types of the arguments. When a generated function is called, the expression it returns is compiled and then run. To make this efficient, the result is usually cached. To make it inferable, only a limited subset of the language may be used. Thus, generated functions provide a flexible way to move work from run time to compile time, at the expense of greater restrictions on the allowed constructs.

Generated functions differ from regular functions in five main ways:

  1. The function declaration is annotated with the @generated macro. This adds some information to the AST that lets the compiler know that this is a generated function.

  2. In the body of the generated function, you only have access to the types of the arguments, not their values.

  3. Instead of calculating something or performing some action, you return a quoted expression which, when evaluated, does what you want.

  4. Generated functions are only permitted to call functions that were defined before the definition of the generated function. (Failure to follow this may result in MethodErrors referring to functions from a future world age.)

  5. Generated functions must not mutate or observe any non-constant global state (including, for example, I/O, locks, non-local dictionaries, or the use of hasmethod). This means they can only read global constants and cannot have any side effects. In other words, they must be completely pure. Due to an implementation limitation, this also means that they currently cannot define a closure or generator.

The easiest way to illustrate this is with an example. We can declare the generated function foo as

julia> @generated function foo(x)
           Core.println(x)
           return :(x * x)
       end
foo (generic function with 1 method)

Note that the body returns a quoted expression, namely :(x * x), rather than just the value of x * x.

From the caller's perspective, this is identical to a regular function; in fact, you don't have to know whether you are calling a regular or a generated function. Let's see how foo behaves:

julia> x = foo(2); # note: output is from the println() statement in the body
Int64

julia> x           # now we print x
4

julia> y = foo("bar");
String

julia> y
"barbar"

So, we see that in the body of the generated function, x is the type of the passed argument, and the value returned by the generated function is the result of evaluating the quoted expression we returned from the definition, now with the value of x.
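To make the "x is a type" point concrete without relying on printing, here is a minimal sketch. The function name static_fieldcount is hypothetical; it queries the argument type with fieldcount while the code is being generated and returns the answer as a constant method body.

```julia
@generated function static_fieldcount(x)
    # Here x is the argument's type (for example Tuple{Int64, Int64, Int64}),
    # so type-level queries such as fieldcount are available.
    n = fieldcount(x)
    # The method body for this argument type becomes the literal constant n.
    return :($n)
end

n_tuple   = static_fieldcount((1, 2, 3))
n_complex = static_fieldcount(1 + 2im)
```

The generator reads only type information and has no side effects, which keeps it within the rules listed earlier.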

What happens if we evaluate foo again with a type that we have already used?

julia> foo(4)
16

Note that Int64 is not printed this time. The body of the generated function was executed only once here, for this specific set of argument types, and the result was cached. After that, in this example, the expression returned from the generated function on the first invocation was reused as the method body. However, the actual caching behavior is an implementation-defined performance optimization, so it is invalid to depend too closely on this behavior.

A generated function may be generated only once, but it may also be generated more often, or appear not to be generated at all. Consequently, you should never write a generated function with side effects, because when and how often the side effects occur is undefined. (This is true for macros too, and, just as with macros, the use of eval in a generated function is a sign that you are doing something the wrong way.) Unlike macros, however, the runtime system cannot correctly handle a call to eval, so it is disallowed.

It is also important to see how @generated functions interact with method redefinition. Following the principle that a correct @generated function must not observe any mutable state or cause any mutation of global state, we see the following behavior. Observe that the generated function cannot call any method that was not defined before the definition of the generated function itself.

Initially, f(x) has one definition:

julia> f(x) = "original definition";

Define other operations that use f(x):

julia> g(x) = f(x);

julia> @generated gen1(x) = f(x);

julia> @generated gen2(x) = :(f(x));

Now we add some new definitions for f(x):

julia> f(x::Int) = "definition for Int";

julia> f(x::Type{Int}) = "definition for Type{Int}";

and compare how these results differ:

julia> f(1)
"definition for Int"

julia> g(1)
"definition for Int"

julia> gen1(1)
"original definition"

julia> gen2(1)
"definition for Int"

Each method of a generated function has its own view of the defined functions:

julia> @generated gen1(x::Real) = f(x);

julia> gen1(1)
"definition for Type{Int}"

The generated function foo above did not do anything that a normal function foo(x) = x * x could not do (except printing the type on the first invocation and incurring higher overhead). However, the real power of a generated function lies in its ability to compute different quoted expressions depending on the types passed to it:

julia> @generated function bar(x)
           if x <: Integer
               return :(x ^ 2)
           else
               return :(x)
           end
       end
bar (generic function with 1 method)

julia> bar(4)
16

julia> bar("baz")
"baz"

(although, of course, this contrived example would be more easily implemented using multiple dispatch...)

Abusing this will corrupt the runtime system and cause undefined behavior:

julia> @generated function baz(x)
           if rand() < .9
               return :(x^2)
           else
               return :("boo!")
           end
       end
baz (generic function with 1 method)

Since the body of the generated function is non-deterministic, its behavior, and the behavior of all subsequent code, is undefined.

Don't copy these examples!

We hope that these examples have proved useful in illustrating how generated functions work, both in terms of their definition and invocation; however, do not copy them for the following reasons:

  • the foo function has side effects (the call to Core.println), and it is undefined exactly when, how often, or how many times these side effects will occur;

  • the bar function solves a problem that is better solved with multiple dispatch: defining bar(x) = x and bar(x::Integer) = x^2 does the same thing, but is both simpler and faster;

  • the baz function is pathological.
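For comparison, the multiple dispatch alternative mentioned for bar can be written as two ordinary methods. The name bar_dispatch is used here only to avoid clashing with the generated bar defined earlier.

```julia
# Fallback method: return the argument unchanged.
bar_dispatch(x) = x

# More specific method: square any Integer.
bar_dispatch(x::Integer) = x^2

squared   = bar_dispatch(4)
unchanged = bar_dispatch("baz")
```

Dispatch selects the method from the argument type, so no code generation is needed to obtain the same behavior.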

Note that the set of operations that should not be attempted in a generated function is unbounded, and the runtime system can currently detect only a subset of the invalid operations. There are many other operations that will simply corrupt the runtime system without notification, usually in subtle ways that are not obviously connected to the bad definition. Because the function generator is run during inference, it must respect all of the limitations of that code.

Here are some operations that should not be attempted.

  1. Caching of native pointers.

  2. Interacting with the contents or methods of Core.Compiler in any way.

  3. Observing any mutable state.

    • Inference on the generated function may be run at any time, including while your code is attempting to observe or mutate this state.

  4. Taking any locks: C code you call out to may use locks internally (for example, calling malloc is not a problem even though most implementations require locks internally), but do not attempt to hold or acquire any locks while executing Julia code.

  5. Calling any function that is defined after the body of the generated function. This condition is relaxed for incrementally loaded precompiled modules, which are allowed to call any function in the module.

Good. Now that you have a better understanding of how generated functions work, let's use them to build some more advanced (and valid) functionality...

An advanced example

Julia's base library has an internal function sub2ind that calculates a linear index into an n-dimensional array from a set of n multilinear indices; in other words, it calculates the index i that can be used to index into an array A using A[i] instead of A[x,y,z,...]. One possible implementation is the following:

julia> function sub2ind_loop(dims::NTuple{N}, I::Integer...) where N
           ind = I[N] - 1
           for i = N-1:-1:1
               ind = I[i]-1 + dims[i]*ind
           end
           return ind + 1
       end;

julia> sub2ind_loop((3, 5), 1, 2)
4

The same can be done using recursion:

julia> sub2ind_rec(dims::Tuple{}) = 1;

julia> sub2ind_rec(dims::Tuple{}, i1::Integer, I::Integer...) =
           i1 == 1 ? sub2ind_rec(dims, I...) : throw(BoundsError());

julia> sub2ind_rec(dims::Tuple{Integer, Vararg{Integer}}, i1::Integer) = i1;

julia> sub2ind_rec(dims::Tuple{Integer, Vararg{Integer}}, i1::Integer, I::Integer...) =
           i1 + dims[1] * (sub2ind_rec(Base.tail(dims), I...) - 1);

julia> sub2ind_rec((3, 5), 1, 2)
4

Both of these implementations, although different from each other, do pretty much the same thing: a run-time loop over the dimensions of the array, collecting the offset in each dimension into the final index.
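As a sanity check on the index arithmetic, the loop version can be compared against Base's LinearIndices over every position of a small array shape. This is only a sketch: the 3 by 5 by 2 shape is an arbitrary choice, and sub2ind_loop is repeated from above so that the block is self-contained.

```julia
# Same loop-based implementation as in the text above.
function sub2ind_loop(dims::NTuple{N}, I::Integer...) where N
    ind = I[N] - 1
    for i = N-1:-1:1
        ind = I[i] - 1 + dims[i] * ind
    end
    return ind + 1
end

# Compare with LinearIndices for every Cartesian position of the shape.
dims = (3, 5, 2)
lin = LinearIndices(dims)
all_match = all(sub2ind_loop(dims, Tuple(c)...) == lin[c]
                for c in CartesianIndices(dims))
```

If the two disagree for any position, all_match is false, which makes this a convenient regression check when rewriting the function in generated form.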

However, all the information we need for the loop is embedded in the type information of the arguments. This allows the compiler to move the iteration to compile time and eliminate the run-time loops altogether. We can use generated functions to achieve a similar effect; in compiler parlance, we use generated functions to manually unroll the loop. The body becomes almost identical, but instead of calculating the linear index, we build up an expression that calculates the index:

julia> @generated function sub2ind_gen(dims::NTuple{N}, I::Integer...) where N
           ex = :(I[$N] - 1)
           for i = (N - 1):-1:1
               ex = :(I[$i] - 1 + dims[$i] * $ex)
           end
           return :($ex + 1)
       end;

julia> sub2ind_gen((3, 5), 1, 2)
4

What code will be generated?

An easy way to find out is to extract the body into another (regular) function:

julia> function sub2ind_gen_impl(dims::Type{T}, I...) where T <: NTuple{N,Any} where N
           length(I) == N || return :(error("partial indexing is unsupported"))
           ex = :(I[$N] - 1)
           for i = (N - 1):-1:1
               ex = :(I[$i] - 1 + dims[$i] * $ex)
           end
           return :($ex + 1)
       end;

julia> @generated function sub2ind_gen(dims::NTuple{N}, I::Integer...) where N
           return sub2ind_gen_impl(dims, I...)
       end;

julia> sub2ind_gen((3, 5), 1, 2)
4

Now we can execute sub2ind_gen_impl and examine the expression it returns:

julia> sub2ind_gen_impl(Tuple{Int,Int}, Int, Int)
:(((I[1] - 1) + dims[1] * (I[2] - 1)) + 1)

So, the method body that will be used here does not include a loop at all, just indexing into the two tuples, multiplication, and addition or subtraction. All the looping is performed at compile time, and we avoid looping during execution entirely. Thus, we loop only once per type, in this case once per N (except in edge cases where the function is generated more than once; see the disclaimer above).
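The same unrolling pattern applies to other tuple operations. The sketch below uses hypothetical names (tuple_sum_expr and tuple_sum) and sums the elements of a tuple by building a chain of additions from the tuple length alone; keeping the expression builder in a regular function makes the generated code easy to inspect.

```julia
# Hypothetical helper: builds the unrolled expression for a tuple of length n.
function tuple_sum_expr(n::Int)
    n == 0 && return :(0)
    ex = :(t[1])
    for i in 2:n
        ex = :($ex + t[$i])   # append one more term on each iteration
    end
    return ex
end

# The generated function only forwards the length N taken from the type.
@generated function tuple_sum(t::NTuple{N,Any}) where N
    return tuple_sum_expr(N)
end

expr3 = tuple_sum_expr(3)     # inspect the code for a 3-tuple
total = tuple_sum((1, 2, 3))
```

As with sub2ind_gen, the loop runs while the expression is being built, and the resulting method body contains only indexing and additions.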

Optionally generated functions

Generated functions can achieve high efficiency at run time, but they come with a compile-time cost: a new function body must be generated for every combination of concrete argument types. Julia can typically compile "generic" versions of functions that work for any arguments, but this is not possible with generated functions. This means that programs making heavy use of generated functions may be impossible to compile statically.

To solve this problem, the language provides syntax for writing normal, non-generated alternative implementations of generated functions. Applied to the sub2ind example above, it looks like this:

julia> function sub2ind_gen_impl(dims::Type{T}, I...) where T <: NTuple{N,Any} where N
           ex = :(I[$N] - 1)
           for i = (N - 1):-1:1
               ex = :(I[$i] - 1 + dims[$i] * $ex)
           end
           return :($ex + 1)
       end;

julia> function sub2ind_gen_fallback(dims::NTuple{N}, I) where N
           ind = I[N] - 1
           for i = (N - 1):-1:1
               ind = I[i] - 1 + dims[i]*ind
           end
           return ind + 1
       end;

julia> function sub2ind_gen(dims::NTuple{N}, I::Integer...) where N
           length(I) == N || error("partial indexing is unsupported")
           if @generated
               return sub2ind_gen_impl(dims, I...)
           else
               return sub2ind_gen_fallback(dims, I)
           end
       end;

julia> sub2ind_gen((3, 5), 1, 2)
4

Internally, this code creates two implementations of the function: a generated one, which uses the first block of if @generated, and a normal one, which uses the else block. Inside the then part of the if @generated block, code has the same semantics as in other generated functions: argument names refer to types, and the code must return an expression. There may be several if @generated blocks, in which case the generated implementation uses all of the then blocks and the alternative implementation uses all of the else blocks.

Note that we added an error check at the top of the function. This code is common to both versions and is run-time code in both versions (it will be quoted and returned as an expression from the generated version). This means that the values and types of local variables are not available at code generation time; the code-generation code can only see the types of the arguments.

In this style of definition, the code generation feature is essentially an optional optimization. The compiler will use it if convenient, but may otherwise choose to use the normal implementation instead. This style is preferred because it allows the compiler to make more decisions and compile programs in more ways, and because normal code is more readable than code-generating code. However, which implementation is used depends on compiler implementation details, so it is essential that the two implementations behave identically.
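A compact sketch of the same style, using a hypothetical tuple_prod function, shows both branches side by side. The then branch builds an unrolled expression from the tuple length, and the else branch is an ordinary loop that must compute the same result.

```julia
function tuple_prod(t::NTuple{N,Any}) where N
    if @generated
        # Code-generation branch: t names the argument type here,
        # and an expression must be returned.
        ex = :(1)
        for i in 1:N
            ex = :($ex * t[$i])
        end
        return ex
    else
        # Normal branch: t is the actual tuple value.
        p = 1
        for x in t
            p *= x
        end
        return p
    end
end

prod_ints  = tuple_prod((2, 3, 4))
prod_mixed = tuple_prod((1.5, 2))
```

Both branches start from the integer 1 and multiply from left to right, so they should agree whichever implementation the compiler selects.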