`EscapeAnalysis`

Core.Compiler.EscapeAnalysis is a compiler utility module designed to analyze output information for IR SSA is a form of Julia, also known as `IRCode'.

The purpose of this escape analysis is as follows:

using Julia’s high-level semantics, understanding the actions of outputs and the assignment of aliases using interprocedural calls;
be versatile enough to be used for various optimizations, including https://github.com/JuliaLang/julia/pull/43888 [SROA with alias support], https://github.com/JuliaLang/julia/pull/44056 [pre-insert finalize], https://github.com/JuliaLang/julia/pull/42465 [building an ImmutableArray without copying], allocating a stack of mutable objects, etc.;
performing a simple implementation based on the implementation of a completely reverse data flow analysis, as well as a new lattice design combining the properties of an orthogonal lattice.

Try it!

You can try to perform an escape analysis by downloading the script EAUtils.jl, which identifies suitable entries code_escapes and `@code_escapes' for testing and debugging.

julia> let JULIA_DIR = normpath(Sys.BINDIR, "..", "share", "julia")
           # загружает модуль `EscapeAnalysis` для определения основного кода анализа
           include(normpath(JULIA_DIR, "base", "compiler", "ssair", "EscapeAnalysis", "EscapeAnalysis.jl"))
           using .EscapeAnalysis
           # загружает `EAUtils` для определения служебных программ
           include(normpath(JULIA_DIR, "test", "compiler", "EscapeAnalysis", "EAUtils.jl"))
           using .EAUtils
       end


julia> mutable struct SafeRef{T}
           x::T
       end


julia> Base.getindex(x::SafeRef) = x.x;


julia> Base.setindex!(x::SafeRef, v) = x.x = v;


julia> Base.isassigned(x::SafeRef) = true;


julia> get′(x) = isassigned(x) ? x[] : throw(x);


julia> result = code_escapes((String,String,String,String)) do s1, s2, s3, s4
           r1 = Ref(s1)
           r2 = Ref(s2)
           r3 = SafeRef(s3)
           try
               s1 = get′(r1)
               ret = sizeof(s1)
           catch err
               global GV = err # will definitely exit the `r1`
end
           s2 = get'(r2) # still not fully exiting `r2`
s3 = get'(r3) # still not fully exiting `r3`
s4 = sizeof(s4) # Argument `s4` does not exit here
           return s2, s3, s4
       end
#1(X s1::String, ↑ s2::String, ↑ s3::String, ✓ s4::String) in Main at REPL[7]:2
X  1 ── %1  = %new(Base.RefValue{String}, _2)::Base.RefValue{String}
*′ │    %2  = %new(Base.RefValue{String}, _3)::Base.RefValue{String}
✓′ └─── %3  = %new(Main.SafeRef{String}, _4)::Main.SafeRef{String}
✓′ 2 ── %4  = ϒ (%3)::Main.SafeRef{String}
*′ │    %5  = ϒ (%2)::Base.RefValue{String}
✓  │    %6  = ϒ (_5)::String
◌  └─── %7  = enter #8
◌  3 ── %8  = Base.isdefined(%1, :x)::Bool
◌  └───       goto #5 if not %8
X  4 ──       Base.getfield(%1, :x)::String
◌  └───       goto #6
◌  5 ──       Main.throw(%1)::Union{}
◌  └───       unreachable
◌  6 ──       $(Expr(:leave, :(%7)))
◌  7 ──       goto #11
✓′ 8 ┄─ %16 = φᶜ (%4)::Main.SafeRef{String}
*′ │    %17 = φᶜ (%5)::Base.RefValue{String}
✓  │    %18 = φᶜ (%6)::String
X  └─── %19 = $(Expr(:the_exception))::Any
◌  9 ──       nothing::Nothing
◌  10 ─       (Main.GV = %19)::Any
◌  └───       $(Expr(:pop_exception, :(%7)))::Core.Const(nothing)
✓′ 11 ┄ %23 = φ (#7 => %3, #10 => %16)::Main.SafeRef{String}
*′ │    %24 = φ (#7 => %2, #10 => %17)::Base.RefValue{String}
✓  │    %25 = φ (#7 => _5, #10 => %18)::String
◌  │    %26 = Base.isdefined(%24, :x)::Bool
◌  └───       goto #13 if not %26
↑  12 ─ %28 = Base.getfield(%24, :x)::String
◌  └───       goto #14
◌  13 ─       Main.throw(%24)::Union{}
◌  └───       unreachable
↑  14 ─ %32 = Base.getfield(%23, :x)::String
◌  │    %33 = Core.sizeof(%25)::Int64
↑′ │    %34 = Core.tuple(%28, %32, %33)::Tuple{String, String, Int64}
◌  └───       return %34

The symbols on the side of each SSA call argument and operators have the following meanings.

◌ (monochrome): this value is not analyzed because its output information will not be used (for example, when the object is of type `isbitstype').
✓' (green or blue): this value never exits (contains `has_no_escape(result.state[x])) is colored blue if it also has an argument output (`has_arg_escape(result.state[x])`saved).
↑ (blue or yellow): this value can be returned to the caller via a return (contains has_return_escape(result.state[x])) is colored yellow if it also has a raw generated output (contains has_thrown_escape(result.state[x])).
'X` (red): this value can go to any place that escape analysis cannot judge, for example, go to global memory (contains has_all_escape(result.state[x])).
* ( bold font): the output state of this value is between ReturnEscape and AllEscape in partial order. EscapeInfo, is colored yellow if it also has a raw generated output (contains has_thrown_escape(result.state[x])).
′: this value has additional information about the object field or array element in its AliasInfo property.

Information about the output of each call argument and the SSA value can be checked programmatically, as shown below.

julia> result.state[Core.Argument(3)] # Получить EscapeInfo для `s2`
ReturnEscape

julia> result.state[Core.SSAValue(3)] # Получить EscapeInfo для `r3`
NoEscape′

The structure of the analysis

Grid design

EscapeAnalysis is implemented as https://en.wikipedia.org/wiki/Data-flow_analysis [data flow analysis] operating in a grid x::EscapeInfo, which consists of the following properties.

x.Analyzed::Bool: is not formally part of the lattice, only indicates whether x has been analyzed.
'x.ReturnEscape::BitSet`: records SSA statements where x can pass to the caller via return.
x.ThrownEscape::BitSet: records SSA statements where x can be called as an exception (used for exception handling described below).
`x.AliasInfo': stores all possible values that can be aliases of fields or elements of `x' (used for alias analysis described below).
'x.ArgEscape::Int` (not implemented yet): indicates that it will reach the caller via `setfield!`in the arguments.

These attributes can be combined to create a finite-height partial lattice, given the invariant that the input program has a finite number of operators due to Julia semantics. The most interesting thing about this lattice design is that it makes it easier to implement lattice operations by allowing each lattice property to be handled separately.:LatticeDesign[In our implementation of type inference, an alternative approach is used, where each property of the lattice is represented by a special object of the type of the lattice element. It turned out that he began to complicate the implementation of lattice operations mainly because he often requires transformation rules between all objects of the lattice element type. And we are working on https://github.com/JuliaLang/julia/pull/42596 [upgrading the implementation of the type inference grid] with an EscapeInfo-like grid design.].

Propagation of reverse output

This implementation of escape analysis is based on the data flow algorithm described in ^[1]. The analysis works in the EscapeInfo grid and proceeds through the grid elements from bottom to top until each grid element converges at a fixed point, preserving a (conceptual) working set that contains program counters corresponding to the remaining SSA operators to be analyzed. The analysis manages a single global state that tracks the EscapeInfo of each SSA argument and statement. Also note that some dependence on the stream is encoded in the form of program counters recorded in the ReturnEscape property of EscapeInfo, which can later be combined with dominance analysis to talk about dependence on the stream, if necessary.

A distinctive feature of this escape analysis is that it is completely reversible, meaning the output information goes from usage to definitions. For example, in the code snippet below, EA first analyzes the return %1 operator and overlays ReturnEscape on %1 (corresponding to obj), and then analyzes %1 = %new(Base.RefValue{String, _2})) and extends the ReturnEscape superimposed on %1 to the call argument _2 (corresponding to s).

julia> code_escapes((String,)) do s
           obj = Ref(s)
           return obj
       end
#3(↑ s::String) in Main at REPL[1]:2
↑′ 1 ─ %1 = %new(Base.RefValue{String}, _2)::Base.RefValue{String}
◌  └──      return %1

An important note is that this reverse analysis allows the output information to naturally follow the "use-definition" chain, rather than the order of execution.:BackandForth[Our type inference algorithm, on the contrary, is implemented as a direct analysis, since type information usually goes from definition to use, therefore it is more natural and effective to distribute such information in a direct way.As a result, this scheme provides a simple implementation of escape analysis. For example, a `PhiNode' can be processed by simply extending the output information in relation to a `PhiNode' to the values of its predecessor.

julia> code_escapes((Bool, String, String)) do cnd, s, t
           if cnd
               obj = Ref(s)
           else
               obj = Ref(t)
           end
           return obj
       end
#5(✓ cnd::Bool, ↑ s::String, ↑ t::String) in Main at REPL[1]:2
◌  1 ─      goto #3 if not _2
↑′ 2 ─ %2 = %new(Base.RefValue{String}, _3)::Base.RefValue{String}
◌  └──      goto #4
↑′ 3 ─ %4 = %new(Base.RefValue{String}, _4)::Base.RefValue{String}
↑′ 4 ┄ %5 = φ (#2 => %2, #3 => %4)::Base.RefValue{String}
◌  └──      return %5

Alias Analysis

EscapeAnalysis implements a reverse field analysis in order to reason with a certain precision about the outputs entered in relation to the fields of the object; for this, there is a property x.AliasInfo in x::EscapeInfo. It records all possible values that can be assigned to the fields x at the places of use, and then the information about the output of these recorded values is distributed to the actual values of the fields at the places of definition. More specifically, the analysis records a value that can be assigned as an alias to an object field when analyzing the getfield call, and then distributes its output information to the fields when analyzing the expression %new(...) or calling setfield! ^[2].

julia> code_escapes((String,)) do s
           obj = SafeRef("init")
           obj[] = s
           v = obj[]
           return v
       end
#7(↑ s::String) in Main at REPL[1]:2
◌  1 ─     return _2

In the example above, ReturnEscape overlaid on %3 (corresponding to v) does not extend directly to %1 (corresponding to obj), but rather, ReturnEscape extends only to _2 (corresponding to s). Here, %3 is written to the AliasInfo property for %1, since it can be assigned as an alias to the first field %1, and then, when analyzing Base.setfield!(%1, :x, _2)::String, this output information is distributed to _2, but not by `%1'.

This is how EscapeAnalysis tracks which IR elements can be aliased in the chain getfield-%new/setfield!to analyze the outputs of the object’s fields, but in fact this alias analysis should be generalized to handle other IR elements as well. This is because in IR Julia, the same object is sometimes represented by different IR elements, so it must be ensured that these different IR elements, which can actually represent the same object, have the same output information. IR elements that return the same object as their operands, such as `PiNode and typeassert, can lead to aliasing at the IR level, and therefore it is required that the output information regarding any such aliased value be shared between them. More interestingly, it is also necessary for proper understanding of the changes in the PhiNode. Consider the following example.

julia> code_escapes((Bool, String,)) do cond, x
           if cond
               ϕ2 = ϕ1 = SafeRef("foo")
           else
               ϕ2 = ϕ1 = SafeRef("bar")
           end
           ϕ2[] = x
           y = ϕ1[]
           return y
       end
#9(✓ cond::Bool, ↑ x::String) in Main at REPL[1]:2
◌  1 ─      goto #3 if not _2
✓′ 2 ─ %2 = %new(Main.SafeRef{String}, "foo")::Main.SafeRef{String}
◌  └──      goto #4
✓′ 3 ─ %4 = %new(Main.SafeRef{String}, "bar")::Main.SafeRef{String}
✓′ 4 ┄ %5 = φ (#2 => %2, #3 => %4)::Main.SafeRef{String}
✓′ │   %6 = φ (#2 => %2, #3 => %4)::Main.SafeRef{String}
◌  │        Base.setfield!(%5, :x, _3)::String
↑  │   %8 = Base.getfield(%6, :x)::String
◌  └──      return %8

ϕ1 = %5 and ϕ2 = %6 are assigned aliases, so ReturnEscape overlaid on %8 = Base.getfield(%6, :x)::String (corresponding to y = ϕ1[]), should be extended to Base.setfield!(%5, :x, _3)::String (corresponding to ϕ2[] = x). In order for such output information to be distributed correctly, the analysis must understand that ϕ1 and ϕ2 can also be assigned aliases, and equalize their output information.

An interesting property of such information about the assignment of aliases is that it is unknown at the place of use, but can only be obtained at the place of definition (since aliases are conceptually equivalent to assignment); therefore, it is not suitable for reverse analysis. To efficiently distribute output information between related values, EscapeAnalysis.jl uses an approach based on the escape analysis algorithm described in an old paper on JVMfootnote:JVM05[Escape Analysis in the Context of Dynamic Compilation and Deoptimization. Thomas Kotzmann and Hanspeter Mössenböck, 2005, June. https://dl.acm.org/doi/10.1145/1064979.1064996 That is, in addition to controlling the elements of the exit grid, the analysis also supports a set of equivalent aliases — an unrelated set of SSA arguments and operators with assigned aliases. A set of aliases controls the values that can be assigned as aliases to each other, and allows them to align the output information superimposed on any such value with an alias.

Array analysis

The analysis of aliases for object fields described above can also be generalized to analyze operations with arrays. 'EscapeAnalysis` implements processing for various primitive array operations so that it can distribute outputs through the "use-definition" chain of arrayref-`arrayset' and does not exit allocated arrays too conservatively.

julia> code_escapes((String,)) do s
           ary = Any[]
           push!(ary, SafeRef(s))
           return ary[1], length(ary)
       end
#11(X s::String) in Main at REPL[1]:2
X  1 ── %1  = Core.getproperty(Memory{Any}, :instance)::Memory{Any}
X  │    %2  = invoke Core.memoryref(%1::Memory{Any})::MemoryRef{Any}
X  │    %3  = %new(Vector{Any}, %2, (0,))::Vector{Any}
X  │    %4  = %new(Main.SafeRef{String}, _2)::Main.SafeRef{String}
X  │    %5  = Base.getfield(%3, :ref)::MemoryRef{Any}
X  │    %6  = Base.getfield(%5, :mem)::Memory{Any}
X  │    %7  = Base.getfield(%6, :length)::Int64
X  │    %8  = Base.getfield(%3, :size)::Tuple{Int64}
◌  │    %9  = $(Expr(:boundscheck, true))::Bool
X  │    %10 = Base.getfield(%8, 1, %9)::Int64
◌  │    %11 = Base.add_int(%10, 1)::Int64
◌  │    %12 = Base.memoryrefoffset(%5)::Int64
X  │    %13 = Core.tuple(%11)::Tuple{Int64}
◌  │          Base.setfield!(%3, :size, %13)::Tuple{Int64}
◌  │    %15 = Base.add_int(%12, %11)::Int64
◌  │    %16 = Base.sub_int(%15, 1)::Int64
◌  │    %17 = Base.slt_int(%7, %16)::Bool
◌  └───       goto #3 if not %17
X  2 ── %19 = %new(Base.var"#133#134"{Vector{Any}, Int64, Int64, Int64, Int64, Int64, Memory{Any}, MemoryRef{Any}}, %3, %16, %12, %11, %10, %7, %6, %5)::Base.var"#133#134"{Vector{Any}, Int64, Int64, Int64, Int64, Int64, Memory{Any}, MemoryRef{Any}}
✓  └───       invoke %19()::MemoryRef{Any}
◌  3 ┄─       goto #4
X  4 ── %22 = Base.getfield(%3, :size)::Tuple{Int64}
◌  │    %23 = $(Expr(:boundscheck, true))::Bool
X  │    %24 = Base.getfield(%22, 1, %23)::Int64
X  │    %25 = Base.getfield(%3, :ref)::MemoryRef{Any}
X  │    %26 = Base.memoryrefnew(%25, %24, false)::MemoryRef{Any}
X  │          Base.memoryrefset!(%26, %4, :not_atomic, false)::Main.SafeRef{String}
◌  └───       goto #5
◌  5 ── %29 = $(Expr(:boundscheck, true))::Bool
◌  └───       goto #9 if not %29
◌  6 ── %31 = Base.sub_int(1, 1)::Int64
◌  │    %32 = Base.bitcast(Base.UInt, %31)::UInt64
X  │    %33 = Base.getfield(%3, :size)::Tuple{Int64}
◌  │    %34 = $(Expr(:boundscheck, true))::Bool
X  │    %35 = Base.getfield(%33, 1, %34)::Int64
◌  │    %36 = Base.bitcast(Base.UInt, %35)::UInt64
◌  │    %37 = Base.ult_int(%32, %36)::Bool
◌  └───       goto #8 if not %37
◌  7 ──       goto #9
◌  8 ── %40 = Core.tuple(1)::Tuple{Int64}
✓  │          invoke Base.throw_boundserror(%3::Vector{Any}, %40::Tuple{Int64})::Union{}
◌  └───       unreachable
X  9 ┄─ %43 = Base.getfield(%3, :ref)::MemoryRef{Any}
X  │    %44 = Base.memoryrefnew(%43, 1, false)::MemoryRef{Any}
X  │    %45 = Base.memoryrefget(%44, :not_atomic, false)::Any
◌  └───       goto #10
X  10 ─ %47 = Base.getfield(%3, :size)::Tuple{Int64}
◌  │    %48 = $(Expr(:boundscheck, true))::Bool
X  │    %49 = Base.getfield(%47, 1, %48)::Int64
↑′ │    %50 = Core.tuple(%45, %49)::Tuple{Any, Int64}
◌  └───       return %50

In the example above, EscapeAnalysis' understands that `%20 and %2 (corresponding to the selected object SafeRef(s)) aliases are assigned via the arrayset-arrayref chain, and overlays ReturnEscape on them, but does not overlay it on the allocated array %1 (corresponding to ary). 'EscapeAnalysis` still overlays ThrownEscape on ary because it also needs to consider potential exits via BoundsError. Also note that such raw `ThrownEscape' can often be ignored when optimizing the allocation of `ary'.

Moreover, in cases where information about the array index, as well as its dimensions, can be precisely known, EscapeAnalysis can even talk about piecemeal assignment of aliases through the arrayref-arrayset chain, since EscapeAnalysis analyzes the assignment of aliases for each field for objects.

julia> code_escapes((String,String)) do s, t
           ary = Vector{Any}(undef, 2)
           ary[1] = SafeRef(s)
           ary[2] = SafeRef(t)
           return ary[1], length(ary)
       end
#13(X s::String, X t::String) in Main at REPL[1]:2
X  1 ── %1  = $(Expr(:foreigncall, :(:jl_alloc_genericmemory), Ref{Memory{Any}}, svec(Any, Int64), 0, :(:ccall), Memory{Any}, 2, 2))::Memory{Any}
X  │    %2  = Core.memoryrefnew(%1)::MemoryRef{Any}
X  │    %3  = %new(Vector{Any}, %2, (2,))::Vector{Any}
X  │    %4  = %new(Main.SafeRef{String}, _2)::Main.SafeRef{String}
◌  │    %5  = $(Expr(:boundscheck, true))::Bool
◌  └───       goto #5 if not %5
◌  2 ── %7  = Base.sub_int(1, 1)::Int64
◌  │    %8  = Base.bitcast(Base.UInt, %7)::UInt64
X  │    %9  = Base.getfield(%3, :size)::Tuple{Int64}
◌  │    %10 = $(Expr(:boundscheck, true))::Bool
X  │    %11 = Base.getfield(%9, 1, %10)::Int64
◌  │    %12 = Base.bitcast(Base.UInt, %11)::UInt64
◌  │    %13 = Base.ult_int(%8, %12)::Bool
◌  └───       goto #4 if not %13
◌  3 ──       goto #5
◌  4 ── %16 = Core.tuple(1)::Tuple{Int64}
✓  │          invoke Base.throw_boundserror(%3::Vector{Any}, %16::Tuple{Int64})::Union{}
◌  └───       unreachable
X  5 ┄─ %19 = Base.getfield(%3, :ref)::MemoryRef{Any}
X  │    %20 = Base.memoryrefnew(%19, 1, false)::MemoryRef{Any}
X  │          Base.memoryrefset!(%20, %4, :not_atomic, false)::Main.SafeRef{String}
◌  └───       goto #6
X  6 ── %23 = %new(Main.SafeRef{String}, _3)::Main.SafeRef{String}
◌  │    %24 = $(Expr(:boundscheck, true))::Bool
◌  └───       goto #10 if not %24
◌  7 ── %26 = Base.sub_int(2, 1)::Int64
◌  │    %27 = Base.bitcast(Base.UInt, %26)::UInt64
X  │    %28 = Base.getfield(%3, :size)::Tuple{Int64}
◌  │    %29 = $(Expr(:boundscheck, true))::Bool
X  │    %30 = Base.getfield(%28, 1, %29)::Int64
◌  │    %31 = Base.bitcast(Base.UInt, %30)::UInt64
◌  │    %32 = Base.ult_int(%27, %31)::Bool
◌  └───       goto #9 if not %32
◌  8 ──       goto #10
◌  9 ── %35 = Core.tuple(2)::Tuple{Int64}
✓  │          invoke Base.throw_boundserror(%3::Vector{Any}, %35::Tuple{Int64})::Union{}
◌  └───       unreachable
X  10 ┄ %38 = Base.getfield(%3, :ref)::MemoryRef{Any}
X  │    %39 = Base.memoryrefnew(%38, 2, false)::MemoryRef{Any}
X  │          Base.memoryrefset!(%39, %23, :not_atomic, false)::Main.SafeRef{String}
◌  └───       goto #11
◌  11 ─ %42 = $(Expr(:boundscheck, true))::Bool
◌  └───       goto #15 if not %42
◌  12 ─ %44 = Base.sub_int(1, 1)::Int64
◌  │    %45 = Base.bitcast(Base.UInt, %44)::UInt64
X  │    %46 = Base.getfield(%3, :size)::Tuple{Int64}
◌  │    %47 = $(Expr(:boundscheck, true))::Bool
X  │    %48 = Base.getfield(%46, 1, %47)::Int64
◌  │    %49 = Base.bitcast(Base.UInt, %48)::UInt64
◌  │    %50 = Base.ult_int(%45, %49)::Bool
◌  └───       goto #14 if not %50
◌  13 ─       goto #15
◌  14 ─ %53 = Core.tuple(1)::Tuple{Int64}
✓  │          invoke Base.throw_boundserror(%3::Vector{Any}, %53::Tuple{Int64})::Union{}
◌  └───       unreachable
X  15 ┄ %56 = Base.getfield(%3, :ref)::MemoryRef{Any}
X  │    %57 = Base.memoryrefnew(%56, 1, false)::MemoryRef{Any}
X  │    %58 = Base.memoryrefget(%57, :not_atomic, false)::Any
◌  └───       goto #16
X  16 ─ %60 = Base.getfield(%3, :size)::Tuple{Int64}
◌  │    %61 = $(Expr(:boundscheck, true))::Bool
X  │    %62 = Base.getfield(%60, 1, %61)::Int64
↑′ │    %63 = Core.tuple(%58, %62)::Tuple{Any, Int64}
◌  └───       return %63

Note that ReturnEscape only applies to %2 (corresponding to SafeRef(s)), but not to %4 (corresponding to SafeRef(t)). This is because the dimension of the allocated array and the indexes involved in all arrayref/arrayset operations are available as permanent information, and EscapeAnalysis can understand that %6 has been assigned the alias %2, but will never be assigned the alias %4. In this case, subsequent optimization passes will be able to replace Base.arrayref(true, %1, 1)::Any with %2 (the so-called load forwarding) and eventually completely eliminate the allocation of the array %1 (the so-called scalar substitution).

Compared to the analysis of an object’s field, where access to the field can be trivially analyzed using type information obtained by inference, the dimension of the array is not encoded as type information, so we need additional analysis to obtain this information. At this point, EscapeAnalysis first performs an additional simple linear scan to analyze the measurements of the allocated arrays before running the main analysis procedure, so that subsequent escape analysis can accurately analyze operations with these arrays.

However, such an accurate piecemeal analysis of pseudonyms is often difficult. In fact, the main difficulty inherent in an array is that the dimension and index of the array are often inconsistent.

A loop often produces non-permanent array indexes that are mutable in the loop.
It is typical for vectors: changing the size of an array changes the dimension of the array and violates its constancy.

Let’s discuss these difficulties using concrete examples.

In the following example, EscapeAnalysis fails an accurate alias analysis because the index is in Base.arrayset(false, %4, %8, %6)::Vector{Any} is not (trivially) constant. In particular, Any[nothing, nothing] forms a loop and calls the arrayset operation in the loop, where %6 is represented as the value of the ϕ node (the value of which depends on the execution order). As a result, ReturnEscape ends up overlaid on both %23 (corresponding to SafeRef(s)) and %25 (corresponding to SafeRef(t)'), although ideally it should be overlaid only on `%23, but not on %25.

julia> code_escapes((String,String)) do s, t
           ary = Any[nothing, nothing]
           ary[1] = SafeRef(s)
           ary[2] = SafeRef(t)
           return ary[1], length(ary)
       end
#15(X s::String, X t::String) in Main at REPL[1]:2
X  1 ── %1  = $(Expr(:foreigncall, :(:jl_alloc_genericmemory), Ref{Memory{Any}}, svec(Any, Int64), 0, :(:ccall), Memory{Any}, 2, 2))::Memory{Any}
X  │    %2  = Core.memoryrefnew(%1)::MemoryRef{Any}
X  └─── %3  = %new(Vector{Any}, %2, (2,))::Vector{Any}
◌  2 ┄─ %4  = φ (#1 => 1, #6 => %14)::Int64
◌  │    %5  = φ (#1 => 1, #6 => %15)::Int64
X  │    %6  = Base.getfield(%3, :ref)::MemoryRef{Any}
X  │    %7  = Base.memoryrefnew(%6, %4, false)::MemoryRef{Any}
◌  │          Base.memoryrefset!(%7, nothing, :not_atomic, false)::Nothing
◌  │    %9  = (%5 === 2)::Bool
◌  └───       goto #4 if not %9
◌  3 ──       goto #5
◌  4 ── %12 = Base.add_int(%5, 1)::Int64
◌  └───       goto #5
◌  5 ┄─ %14 = φ (#4 => %12)::Int64
◌  │    %15 = φ (#4 => %12)::Int64
◌  │    %16 = φ (#3 => true, #4 => false)::Bool
◌  │    %17 = Base.not_int(%16)::Bool
◌  └───       goto #7 if not %17
◌  6 ──       goto #2
◌  7 ──       goto #8
X  8 ── %21 = %new(Main.SafeRef{String}, _2)::Main.SafeRef{String}
◌  │    %22 = $(Expr(:boundscheck, true))::Bool
◌  └───       goto #12 if not %22
◌  9 ── %24 = Base.sub_int(1, 1)::Int64
◌  │    %25 = Base.bitcast(Base.UInt, %24)::UInt64
X  │    %26 = Base.getfield(%3, :size)::Tuple{Int64}
◌  │    %27 = $(Expr(:boundscheck, true))::Bool
X  │    %28 = Base.getfield(%26, 1, %27)::Int64
◌  │    %29 = Base.bitcast(Base.UInt, %28)::UInt64
◌  │    %30 = Base.ult_int(%25, %29)::Bool
◌  └───       goto #11 if not %30
◌  10 ─       goto #12
◌  11 ─ %33 = Core.tuple(1)::Tuple{Int64}
✓  │          invoke Base.throw_boundserror(%3::Vector{Any}, %33::Tuple{Int64})::Union{}
◌  └───       unreachable
X  12 ┄ %36 = Base.getfield(%3, :ref)::MemoryRef{Any}
X  │    %37 = Base.memoryrefnew(%36, 1, false)::MemoryRef{Any}
X  │          Base.memoryrefset!(%37, %21, :not_atomic, false)::Main.SafeRef{String}
◌  └───       goto #13
X  13 ─ %40 = %new(Main.SafeRef{String}, _3)::Main.SafeRef{String}
◌  │    %41 = $(Expr(:boundscheck, true))::Bool
◌  └───       goto #17 if not %41
◌  14 ─ %43 = Base.sub_int(2, 1)::Int64
◌  │    %44 = Base.bitcast(Base.UInt, %43)::UInt64
X  │    %45 = Base.getfield(%3, :size)::Tuple{Int64}
◌  │    %46 = $(Expr(:boundscheck, true))::Bool
X  │    %47 = Base.getfield(%45, 1, %46)::Int64
◌  │    %48 = Base.bitcast(Base.UInt, %47)::UInt64
◌  │    %49 = Base.ult_int(%44, %48)::Bool
◌  └───       goto #16 if not %49
◌  15 ─       goto #17
◌  16 ─ %52 = Core.tuple(2)::Tuple{Int64}
✓  │          invoke Base.throw_boundserror(%3::Vector{Any}, %52::Tuple{Int64})::Union{}
◌  └───       unreachable
X  17 ┄ %55 = Base.getfield(%3, :ref)::MemoryRef{Any}
X  │    %56 = Base.memoryrefnew(%55, 2, false)::MemoryRef{Any}
X  │          Base.memoryrefset!(%56, %40, :not_atomic, false)::Main.SafeRef{String}
◌  └───       goto #18
◌  18 ─ %59 = $(Expr(:boundscheck, true))::Bool
◌  └───       goto #22 if not %59
◌  19 ─ %61 = Base.sub_int(1, 1)::Int64
◌  │    %62 = Base.bitcast(Base.UInt, %61)::UInt64
X  │    %63 = Base.getfield(%3, :size)::Tuple{Int64}
◌  │    %64 = $(Expr(:boundscheck, true))::Bool
X  │    %65 = Base.getfield(%63, 1, %64)::Int64
◌  │    %66 = Base.bitcast(Base.UInt, %65)::UInt64
◌  │    %67 = Base.ult_int(%62, %66)::Bool
◌  └───       goto #21 if not %67
◌  20 ─       goto #22
◌  21 ─ %70 = Core.tuple(1)::Tuple{Int64}
✓  │          invoke Base.throw_boundserror(%3::Vector{Any}, %70::Tuple{Int64})::Union{}
◌  └───       unreachable
X  22 ┄ %73 = Base.getfield(%3, :ref)::MemoryRef{Any}
X  │    %74 = Base.memoryrefnew(%73, 1, false)::MemoryRef{Any}
X  │    %75 = Base.memoryrefget(%74, :not_atomic, false)::Any
◌  └───       goto #23
X  23 ─ %77 = Base.getfield(%3, :size)::Tuple{Int64}
◌  │    %78 = $(Expr(:boundscheck, true))::Bool
X  │    %79 = Base.getfield(%77, 1, %78)::Int64
↑′ │    %80 = Core.tuple(%75, %79)::Tuple{Any, Int64}
◌  └───       return %80

The following example shows how changing the size of a vector makes it difficult to accurately analyze aliases. A significant difficulty lies in the fact that the dimension of the allocated array %1 is first initialized as 0, but then changed by two calls to :jl_array_grow_end'. Currently, `EscapeAnalysis simply refuses to accurately analyze aliases when faced with array resizing operations, so ReturnEscape is overlaid as %2 (corresponding to SafeRef(s)), and by %20 (corresponding to SafeRef(t)).

julia> code_escapes((String,String)) do s, t
           ary = Any[]
           push!(ary, SafeRef(s))
           push!(ary, SafeRef(t))
           ary[1], length(ary)
       end
#17(X s::String, X t::String) in Main at REPL[1]:2
X  1 ── %1  = Core.getproperty(Memory{Any}, :instance)::Memory{Any}
X  │    %2  = invoke Core.memoryref(%1::Memory{Any})::MemoryRef{Any}
X  │    %3  = %new(Vector{Any}, %2, (0,))::Vector{Any}
X  │    %4  = %new(Main.SafeRef{String}, _2)::Main.SafeRef{String}
X  │    %5  = Base.getfield(%3, :ref)::MemoryRef{Any}
X  │    %6  = Base.getfield(%5, :mem)::Memory{Any}
X  │    %7  = Base.getfield(%6, :length)::Int64
X  │    %8  = Base.getfield(%3, :size)::Tuple{Int64}
◌  │    %9  = $(Expr(:boundscheck, true))::Bool
X  │    %10 = Base.getfield(%8, 1, %9)::Int64
◌  │    %11 = Base.add_int(%10, 1)::Int64
◌  │    %12 = Base.memoryrefoffset(%5)::Int64
X  │    %13 = Core.tuple(%11)::Tuple{Int64}
◌  │          Base.setfield!(%3, :size, %13)::Tuple{Int64}
◌  │    %15 = Base.add_int(%12, %11)::Int64
◌  │    %16 = Base.sub_int(%15, 1)::Int64
◌  │    %17 = Base.slt_int(%7, %16)::Bool
◌  └───       goto #3 if not %17
X  2 ── %19 = %new(Base.var"#133#134"{Vector{Any}, Int64, Int64, Int64, Int64, Int64, Memory{Any}, MemoryRef{Any}}, %3, %16, %12, %11, %10, %7, %6, %5)::Base.var"#133#134"{Vector{Any}, Int64, Int64, Int64, Int64, Int64, Memory{Any}, MemoryRef{Any}}
✓  └───       invoke %19()::MemoryRef{Any}
◌  3 ┄─       goto #4
X  4 ── %22 = Base.getfield(%3, :size)::Tuple{Int64}
◌  │    %23 = $(Expr(:boundscheck, true))::Bool
X  │    %24 = Base.getfield(%22, 1, %23)::Int64
X  │    %25 = Base.getfield(%3, :ref)::MemoryRef{Any}
X  │    %26 = Base.memoryrefnew(%25, %24, false)::MemoryRef{Any}
X  │          Base.memoryrefset!(%26, %4, :not_atomic, false)::Main.SafeRef{String}
◌  └───       goto #5
X  5 ── %29 = %new(Main.SafeRef{String}, _3)::Main.SafeRef{String}
X  │    %30 = Base.getfield(%3, :ref)::MemoryRef{Any}
X  │    %31 = Base.getfield(%30, :mem)::Memory{Any}
X  │    %32 = Base.getfield(%31, :length)::Int64
X  │    %33 = Base.getfield(%3, :size)::Tuple{Int64}
◌  │    %34 = $(Expr(:boundscheck, true))::Bool
X  │    %35 = Base.getfield(%33, 1, %34)::Int64
◌  │    %36 = Base.add_int(%35, 1)::Int64
◌  │    %37 = Base.memoryrefoffset(%30)::Int64
X  │    %38 = Core.tuple(%36)::Tuple{Int64}
◌  │          Base.setfield!(%3, :size, %38)::Tuple{Int64}
◌  │    %40 = Base.add_int(%37, %36)::Int64
◌  │    %41 = Base.sub_int(%40, 1)::Int64
◌  │    %42 = Base.slt_int(%32, %41)::Bool
◌  └───       goto #7 if not %42
X  6 ── %44 = %new(Base.var"#133#134"{Vector{Any}, Int64, Int64, Int64, Int64, Int64, Memory{Any}, MemoryRef{Any}}, %3, %41, %37, %36, %35, %32, %31, %30)::Base.var"#133#134"{Vector{Any}, Int64, Int64, Int64, Int64, Int64, Memory{Any}, MemoryRef{Any}}
✓  └───       invoke %44()::MemoryRef{Any}
◌  7 ┄─       goto #8
X  8 ── %47 = Base.getfield(%3, :size)::Tuple{Int64}
◌  │    %48 = $(Expr(:boundscheck, true))::Bool
X  │    %49 = Base.getfield(%47, 1, %48)::Int64
X  │    %50 = Base.getfield(%3, :ref)::MemoryRef{Any}
X  │    %51 = Base.memoryrefnew(%50, %49, false)::MemoryRef{Any}
X  │          Base.memoryrefset!(%51, %29, :not_atomic, false)::Main.SafeRef{String}
◌  └───       goto #9
◌  9 ── %54 = $(Expr(:boundscheck, true))::Bool
◌  └───       goto #13 if not %54
◌  10 ─ %56 = Base.sub_int(1, 1)::Int64
◌  │    %57 = Base.bitcast(Base.UInt, %56)::UInt64
X  │    %58 = Base.getfield(%3, :size)::Tuple{Int64}
◌  │    %59 = $(Expr(:boundscheck, true))::Bool
X  │    %60 = Base.getfield(%58, 1, %59)::Int64
◌  │    %61 = Base.bitcast(Base.UInt, %60)::UInt64
◌  │    %62 = Base.ult_int(%57, %61)::Bool
◌  └───       goto #12 if not %62
◌  11 ─       goto #13
◌  12 ─ %65 = Core.tuple(1)::Tuple{Int64}
✓  │          invoke Base.throw_boundserror(%3::Vector{Any}, %65::Tuple{Int64})::Union{}
◌  └───       unreachable
X  13 ┄ %68 = Base.getfield(%3, :ref)::MemoryRef{Any}
X  │    %69 = Base.memoryrefnew(%68, 1, false)::MemoryRef{Any}
X  │    %70 = Base.memoryrefget(%69, :not_atomic, false)::Any
◌  └───       goto #14
X  14 ─ %72 = Base.getfield(%3, :size)::Tuple{Int64}
◌  │    %73 = $(Expr(:boundscheck, true))::Bool
X  │    %74 = Base.getfield(%72, 1, %73)::Int64
↑′ │    %75 = Core.tuple(%70, %74)::Tuple{Any, Int64}
◌  └───       return %75

To solve these problems, the output needs to be aware of the array dimensions and distribute them in an order-dependent way^[3], and also determined the correct representation of the loop’s variable values.

At this point, EscapeAnalysis quickly switches to a more imprecise analysis that does not track accurate information about indexes in cases where array dimensions or indexes are trivially variable. Switching can be naturally implemented as a grid connection operation of the EscapeInfo.AliasInfo property as part of data flow analysis.

Exception handling

It is also worth paying attention to how EscapeAnalysis handles possible exits through exceptions. It is naive to believe that it is sufficient to extend the output information superimposed on the object :the_exception to all values that may appear in the corresponding 'try` block. In fact, Julia has several other ways to access the exception object, such as Base.current_exceptions and rethrow'. For example, escape analysis should take into account the potential output of `r in the example below.

julia> const GR = Ref{Any}();


julia> @noinline function rethrow_escape!()
           try
               rethrow()
           catch err
               GR[] = err
           end
       end;


julia> get′(x) = isassigned(x) ? x[] : throw(x);


julia> code_escapes() do
           r = Ref{String}()
           local t
           try
               t = get′(r)
           catch err
               t = typeof(err) # `err` (whose alias is `r`) does not work here
               rethrow_escape!() # But here comes the `r`
           end
           return t
       end
#19() in Main at REPL[4]:2
X  1 ─ %1  = %new(Base.RefValue{String})::Base.RefValue{String}
◌  2 ─ %2  = enter #8
◌  3 ─ %3  = Base.isdefined(%1, :x)::Bool
◌  └──       goto #5 if not %3
X  4 ─ %5  = Base.getfield(%1, :x)::String
◌  └──       goto #6
◌  5 ─       Main.throw(%1)::Union{}
◌  └──       unreachable
◌  6 ─       $(Expr(:leave, :(%2)))
◌  7 ─       goto #9
✓  8 ┄ %11 = $(Expr(:the_exception))::Any
X  │   %12 = Main.typeof(%11)::DataType
✓  │         invoke Main.rethrow_escape!()::Any
◌  └──       $(Expr(:pop_exception, :(%2)))::Core.Const(nothing)
X  9 ┄ %15 = φ (#7 => %5, #8 => %12)::Union{DataType, String}
◌  └──       return %15

To properly understand all possible exits through existing exception interfaces, a global analysis is required. At the moment, we always conservatively distribute the most important information about the output to all objects of potentially occurring exceptions, since such additional analysis may not be advisable, given that exception handling and error path should usually not depend too much on performance. In addition, optimizing error paths can be very inefficient, as they are often not even intentionally optimized for reasons of delay.

Property x::EscapeInfo`x.ThrownEscape records SSA statements in which x can be invoked as an exception. Using this information, EscapeAnalysis can propagate possible exits through exceptions, limited only to those that can occur in each region of the try.

julia> result = code_escapes((String,String)) do s1, s2
           r1 = Ref(s1)
           r2 = Ref(s2)
           local ret
           try
               s1 = get′(r1)
               ret = sizeof(s1)
           catch err
               global GV = err # обязательно выйдет из `r1`
           end
           s2 = get′(r2)       # все еще не полностью выходит из `r2`
           return s2
       end
#21(X s1::String, ↑ s2::String) in Main at REPL[1]:2
X  1 ── %1  = %new(Base.RefValue{String}, _2)::Base.RefValue{String}
*′ └─── %2  = %new(Base.RefValue{String}, _3)::Base.RefValue{String}
*′ 2 ── %3  = ϒ (%2)::Base.RefValue{String}
◌  └─── %4  = enter #8
◌  3 ── %5  = Base.isdefined(%1, :x)::Bool
◌  └───       goto #5 if not %5
X  4 ──       Base.getfield(%1, :x)::String
◌  └───       goto #6
◌  5 ──       Main.throw(%1)::Union{}
◌  └───       unreachable
◌  6 ──       $(Expr(:leave, :(%4)))
◌  7 ──       goto #11
*′ 8 ┄─ %13 = φᶜ (%3)::Base.RefValue{String}
X  └─── %14 = $(Expr(:the_exception))::Any
◌  9 ──       nothing::Nothing
◌  10 ─       (Main.GV = %14)::Any
◌  └───       $(Expr(:pop_exception, :(%4)))::Core.Const(nothing)
*′ 11 ┄ %18 = φ (#7 => %2, #10 => %13)::Base.RefValue{String}
◌  │    %19 = Base.isdefined(%18, :x)::Bool
◌  └───       goto #13 if not %19
↑  12 ─ %21 = Base.getfield(%18, :x)::String
◌  └───       goto #14
◌  13 ─       Main.throw(%18)::Union{}
◌  └───       unreachable
◌  14 ─       return %21

Using Analysis

The analyze_escapes is the entry point for analyzing information about the output of SSA-IR elements.

Most optimizations such as SROA (sroa_pass!), are more effective when applied to an optimized source that has been simplified during the embedding process (ssa_inlining_pass!) by resolving interprocedural calls and extending the sources of the called object. Accordingly, analyze_escapes can also analyze the results after IR embedding and collect information about the output, which is useful for certain memory-related optimizations.

However, since some optimization procedures, such as embedding, can change the order of execution and eliminate non-working code, they can disrupt the interprocedural reliability of the output information. In particular, in order to collect acceptable interprocedural information about the output, it is necessary to analyze the pre-embedding of the IR.

For this reason, the function "analyze_escapes" can analyze "IRCode" at any stage of optimization at the Julia level, and, in particular, it is supposed to be used in the next two stages.

'EA IPO': analysis of the pre-embedding of IR to create a cache of valid interprocedural output information
`Local EA': Analyzing the results after IR embedding to collect locally valid output information

The output information obtained using the IPO EA is converted into the ArgEscapeCache' data structure and cached globally. By passing the appropriate `get_escape_cache callback to the analyze_escapes function, escape analysis can improve the accuracy of the analysis by using cached interprocedural information about non-embedded callable objects that was obtained during the previous 'IPO EA` analysis. More interesting is the fact that it is also acceptable to use information about the output of IPO EA for type inference. For example, the accuracy of the output can be improved by forming a Const/PartialStruct/`MustAlias' mutable object.

# Core.Compiler.EscapeAnalysis.analyze_escapes — Function

analyze_escapes(ir::IRCode, nargs::Int, get_escape_cache) -> estate::EscapeState

Analyzes information about the output to the ir:

nargs: the number of actual arguments of the analyzed call
get_escape_cache(::MethodInstance) -> Union{Bool,ArgEscapeCache}: gets cached information about the output of the argument

# Core.Compiler.EscapeAnalysis.EscapeState — Type

estate::EscapeState

An extended grid that maps SSA arguments and values to output information presented as EscapeInfo. The output information related to the IR SSA element x can be obtained using `estate[x]'.

# Core.Compiler.EscapeAnalysis.EscapeInfo — Type

x::EscapeInfo

A grid for exit information that has the following properties:

x.Analyzed::Bool: is not formally part of the lattice, only indicates whether x has been analyzed
'x.ReturnEscape::Bool`: indicates that x can pass to the caller via return
x.ThrownEscape::BitSet: records the SSA operator numbers in which x can be called as an exception:
- isempty(x.ThrownEscape): x will never be called in this call frame (bottom)
- pc ∈ x.ThrownEscape: x can be called in the SSA operator in pc
- -1 ∈ x.ThrownEscape: x' can be called at arbitrary points in this call frame (upper part). This information will be used by `escape_exception! to propagate potential exits through exceptions.
x.AliasInfo::Union{Bool,IndexableFields,IndexableElements,Unindexable}: stores all possible values that can be aliases of fields or elements of the array x:
- x.AliasInfo === false indicates that the fields or elements of x have not yet been analyzed
- x.AliasInfo === true indicates that the fields or elements of x cannot be analyzed, for example, the type of x is unknown or unspecified, and therefore its fields or elements cannot be accurately known
- x.AliasInfo::IndexableFields records all possible values that can be aliases of the fields of the object x, with accurate information about the indexes
- x.AliasInfo::IndexableElements records all possible values that can be aliases of array elements x, with accurate information about indexes
- x.AliasInfo::The 'Unindexable records all possible values, which can be aliases of fields or elements of x, with accurate information about the indexes.
x.Liveness::BitSet: records SSA operator numbers where x can be located, for example, used as a call argument, returned to the caller, or stored for :foreigncall:
- isempty(x.Liveness): x will never be used in this frame (bottom part)
- 0 ∈ x.Liveness also has a special meaning, which is that it is the call argument of the current analyzed call frame (and therefore it is immediately visible to the caller).
- pc ∈ x.Liveness: x' can be used in the SSA operator in `pc
- -1 ∈ x.Liveness: `x' can be used at arbitrary points in this call frame (upper part)

There are utility constructors for creating common EscapeInfo types. For example:

NoEscape(): the bottom element of this grid, which means that it will not go anywhere.
AllEscape(): the topmost element of this grid, which means it will go out anywhere

analyze_escapes will translate these elements from bottom to top in the same direction as Julia’s own type inference routine. The abstract state will be initialized by the lower elements:

The call arguments are initialized as ArgEscape(), the Liveness property of which includes 0 to indicate that it is passed as the call argument and is immediately visible from the calling object.;
the remaining states are initialized as NotAnalyzed(), which is a special element of the lattice that is slightly lower than NoEscape, but at the same time does not represent any meaning, except that it has not yet been analyzed (thus, formally it is not part of the lattice).

1. A Graph-Free approach to Data-Flow Analysis. Markas Mohnen, 2002, April. https://api.semanticscholar.org/CorpusID:28519618 .

2. However, in some cases, the fields of objects cannot be accurately analyzed. For example, an object may go to a place where EscapeAnalysis cannot take into account the possible impact of memory on it, or the fields of objects may simply be unknown due to the lack of type information. In such cases, the AliasInfo property rises to the topmost element within its own lattice order, which leads to a more conservative subsequent field analysis, and information about the output in relation to the fields of the non-analyzed object is distributed to the object itself.

3. Otherwise, we will need another direct data flow analysis based on escape analysis.