Engee documentation
Notebook

Views as a way to improve code performance

This script discusses the use of views, a mechanism that allows accessing array elements without creating copies of them.
Topics will be covered:

  • the difference between a slice copy (slicing) and a view (view)
  • the use of macros @view and @views and their differences.

In order to test the effectiveness of using views, we will connect the benchmark libraries.

In [ ]:
import Pkg; Pkg.add("BenchmarkTools")
   Resolving package versions...
  No Changes to `~/.project/Project.toml`
  No Changes to `~/.project/Manifest.toml`

The problem of copying

When we use syntax b = a[1:5], then b becomes a copy of the first five elements a, rather than "linking by address" to the elements a.

In [ ]:
a = collect(1:10)
b = a[1:5]           # [1, 2, 3, 4, 5] 
println(pointer(a))
println(pointer(b))
b .= 0               # [0, 0, 0, 0, 0]
a'                   # как видим, матрица не поменялась, что и логично
Ptr{Int64} @0x00007fb4fa4125f0
Ptr{Int64} @0x00007fb548a22980
Out[0]:
1×10 adjoint(::Vector{Int64}) with eltype Int64:
 1  2  3  4  5  6  7  8  9  10

To change our original vector, we have to do an extra action.:

In [ ]:
a[1:5] .= b[1:5]
a'
Out[0]:
1×10 adjoint(::Vector{Int64}) with eltype Int64:
 0  0  0  0  0  6  7  8  9  10

The view function

Representations just allow you to use the usual syntax, but not to create copies, but to access directly to the "memory cells" of arrays.
To do this, you can use the function view

In [ ]:
a = collect(1:10000)
# '÷' не то же, что и '/' (÷ = div()) 
view_of_a = view(a,1:length(a)÷2) # end здесь не сработает
view_of_a .= 0
a
Out[0]:
10000-element Vector{Int64}:
     0
     0
     0
     0
     0
     0
     0
     0
     0
     0
     0
     0
     0
     ⋮
  9989
  9990
  9991
  9992
  9993
  9994
  9995
  9996
  9997
  9998
  9999
 10000
In [ ]:
pointer(view_of_a) == pointer(a)
Out[0]:
true

Let's make sure that using views allows us to avoid allocating extra memory for copies by using @allocated, showing the number of allocated bytes.

In [ ]:
println(@allocated (subarray_of_a = a[1:end÷2]))
println(@allocated (view_of_a = view(a,1:length(a)÷2)))
40112
112

@view

But using the function view it does not correspond to the above statement about the "familiar interface", since we could not use, for example, the keyword end.
You can use a macro to solve this problem. @view:

In [ ]:
a = repeat(1:10,inner=3)
b = @view a[end-3:end]
b .= 0
a'
Out[0]:
1×30 adjoint(::Vector{Int64}) with eltype Int64:
 1  1  1  2  2  2  3  3  3  4  4  4  5  …  7  7  7  8  8  8  9  9  0  0  0  0

But the question may arise: why do we need extra variables when you can just do

a = repeat(1:10,inner=3)
a[end-3,3] .= 0
`` The
answer to this can be formulated as follows:

> Views are needed as a combination of efficient use of resources and maintaining code readability.

Suppose there is a task to output and calculate the sum of a triplet of elements.

```julia
for i in 0:(length(a)÷3-1)
    println("sum of $(a[3i+1:3i+3]) -> $(sum(a[3i+1:3i+3]))")
end

It can be seen that there are duplicate elements, and it is also easy to make a mistake in one of the indexes.

In [ ]:
for i in 0:(length(a)÷3-1)
    triplet = a[3i+1:3i+3]
    println("sum of $triplet -> $(sum(triplet))")
end
sum of [1, 1, 1] -> 3
sum of [2, 2, 2] -> 6
sum of [3, 3, 3] -> 9
sum of [4, 4, 4] -> 12
sum of [5, 5, 5] -> 15
sum of [6, 6, 6] -> 18
sum of [7, 7, 7] -> 21
sum of [8, 8, 8] -> 24
sum of [9, 9, 0] -> 18
sum of [0, 0, 0] -> 0

In addition, we can see that a function that uses representations will allocate much less memory and run significantly faster.
To do this, we will use the macro @btime, showing the execution time of the function and the memory allocated at the same time, running the function several times and averaging the values.

(The output to the console was removed from the functions so as not to clog up the console during multiple function calls)

In [ ]:
using BenchmarkTools

function tripletssum_subarray(v)
for i in 0:(length(v)÷3-1)
    triplet = v[3i+1:3i+3]
end
end

function tripletssum_view(v)
for i in 0:(length(v)÷3-1)
    triplet = @view v[3i+1:3i+3]
end
end

 
a = repeat(1:10000,inner=3)

println(@btime tripletssum_subarray(a))
println(@btime tripletssum_view(a))
  442.980 μs (10000 allocations: 781.25 KiB)
nothing
  11.152 μs (0 allocations: 0 bytes)
nothing

@views

Consider the following example

In [ ]:
Pkg.add("LinearAlgebra")
using LinearAlgebra
@btime dot( a[1:end÷2], a[end÷2+1:end])
   Resolving package versions...
  No Changes to `~/.project/Project.toml`
  No Changes to `~/.project/Manifest.toml`
  51.050 μs (13 allocations: 234.64 KiB)
Out[0]:
312575002500

And it would seem that we know how to improve this code.:

In [ ]:
try
# ОСНОВНОЙ КОД
#------------------------------------------------------
  @btime dot(@view a[1:end÷2], @view a[end÷2+1:end])  
#------------------------------------------------------
# ОБРАБОТКА ИСКЛЮЧЕНИЯ
catch e
	io = IOBuffer();
	showerror(io, e)
	error_msg = String(take!(io))
end
Out[0]:
"LoadError: ArgumentError: Invalid use of @view macro: argument must be a reference expression A[...].\nin expression starting at In[53]:4"

The error says that we are using the macro incorrectly. @view.

Although it seems to be our expression a[1:end÷2] corresponds to the expression A[...].

The problem is that we used the macro incorrectly.

In order to fix this situation, we put the vectors to which we want to apply the representation in parentheses.:

In [ ]:
  @btime dot(@view(a[1:end÷2]) ,@view(a[end÷2+1:end]))  
  4.153 μs (11 allocations: 272 bytes)
Out[0]:
312575002500

But in order not to write for each slicing @view we can use a macro @views

In [ ]:
  @btime @views dot((a[1:end÷2]), (a[end÷2+1:end]))  
  3.801 μs (11 allocations: 272 bytes)
Out[0]:
312575002500

@views you can insert it before defining a function so that slices inside it occur using views.

In [ ]:
@views function tripletssum_views(v)
    for i in 0:(length(v)÷3-1)
        triplet = v[3i+1:3i+3]
    end
end
a = repeat(1:10000,inner=3)

println(@btime tripletssum_views(a))
  11.047 μs (0 allocations: 0 bytes)
nothing

In what cases should representations be used?

Representations should be used where:

  • it improves readability
  • it affects performance
  • do you understand the difference between working with a copy and with the original
In [ ]:
a = rand(1000)
println(@allocated sum(a))
println(@allocated sum(a[1:end]))
println(@allocated sum(copy(a[1:end])))
println(@allocated sum(@view a[1:end]))
16
8192
17152
2824
In [ ]:
import Pkg; Pkg.add(["OrdinaryDiffEq","Plots"])
using OrdinaryDiffEq
using Plots
gr()
function lotka(du, u, p, t) 
    du[1] = p[1] * u[1] - p[2] * u[1] * u[2]
    du[2] = p[4] * u[1] * u[2] - p[3] * u[2]  
end
α = 1; β = 0.01; γ = 1; δ = 0.02;
p = [α, β, γ, δ]
tspan = (0.0, 6.5)
u0 = [50; 50]
prob = ODEProblem(lotka, u0, tspan, p)
sol = solve(prob,saveat=0.001)
   Resolving package versions...
  No Changes to `~/.project/Project.toml`
  No Changes to `~/.project/Manifest.toml`
Out[0]:
retcode: Success
Interpolation: 1st order linear
t: 6501-element Vector{Float64}:
 0.0
 0.001
 0.002
 0.003
 0.004
 0.005
 0.006
 0.007
 0.008
 0.009
 0.01
 0.011
 0.012
 ⋮
 6.489
 6.49
 6.491
 6.492
 6.493
 6.494
 6.495
 6.496
 6.497
 6.498
 6.499
 6.5
u: 6501-element Vector{Vector{Float64}}:
 [50.0, 50.0]
 [50.02500624847936, 50.000012502345534]
 [50.05002498979869, 50.00005001769738]
 [50.07505621775588, 50.00011255855343]
 [50.1000999261168, 50.000200137444956]
 [50.125156108615315, 50.00031276693659]
 [50.1502247589533, 50.00045045962633]
 [50.17530587080062, 50.000613228145575]
 [50.20039943779515, 50.00080108515906]
 [50.22550545354273, 50.001014043364925]
 [50.25062391161723, 50.00125211549466]
 [50.27575480556051, 50.001515314313124]
 [50.30089812888241, 50.00180365261858]
 ⋮
 [50.05479160200564, 49.98705194268289]
 [50.079830802201556, 49.987119920075195]
 [50.104882567590614, 49.987212877939285]
 [50.12994689359235, 49.987330827635944]
 [50.15502377561346, 49.98747378054721]
 [50.18011320904711, 49.98764174807645]
 [50.20521518927415, 49.98783474164827]
 [50.23032971166218, 49.988052772708585]
 [50.2554567715658, 49.98829585272456]
 [50.28059636432637, 49.98856399318467]
 [50.305748485272886, 49.98885720559867]
 [50.330913129720685, 49.98917550149757]

When drawing graphs, time-dependent variables will be used. and .

In [ ]:
plot(sol)
Out[0]:

But if we want to draw a dependency then we will have to use slices . sol[1,:] and sol[2,:]. Which just reminds us of our aforementioned problem.

In [ ]:
@btime plot(sol[1,:],sol[2,:])
  582.602 μs (494 allocations: 451.46 KiB)
Out[0]:

Which we can now solve using representations:

In [ ]:
@btime @views plot(sol[1,:],sol[2,:])
  488.246 μs (492 allocations: 102.46 KiB)
Out[0]:

Conclusions

After getting acquainted with the concept of representations, practical ways to improve the performance of functions that do not require any significant changes to the code were considered.