Views as a way to improve code performance¶

This script discusses views, a mechanism that lets you access array elements without creating copies of them. The following topics are covered:

the difference between a copy made by slicing and a view
the @view and @views macros and how they differ

To measure how much views help, let's install the BenchmarkTools package:

import Pkg; Pkg.add("BenchmarkTools")

   Resolving package versions...
  No Changes to `~/.project/Project.toml`
  No Changes to `~/.project/Manifest.toml`

Copying problem¶

When we write b = a[1:5], b becomes a copy of the first five elements of a, stored in its own memory, rather than a reference to the elements of a.

a = collect(1:10)
b = a[1:5]           # [1, 2, 3, 4, 5] 
println(pointer(a))
println(pointer(b))
b .= 0               # [0, 0, 0, 0, 0]
a'                   # as we can see, the original vector did not change, as expected

Ptr{Int64} @0x00007fb4fa4125f0
Ptr{Int64} @0x00007fb548a22980

1×10 adjoint(::Vector{Int64}) with eltype Int64:
 1  2  3  4  5  6  7  8  9  10

To change our original vector, we are forced to do an extra action:

a[1:5] .= b[1:5]
a'

1×10 adjoint(::Vector{Int64}) with eltype Int64:
 0  0  0  0  0  6  7  8  9  10

The view function¶

Views give us direct access to the memory of an array instead of creating a copy, ideally while keeping the familiar indexing syntax. One way to create a view is the view function:

a = collect(1:10000)
# '÷' is not the same as '/' (÷ is div(), integer division)
view_of_a = view(a,1:length(a)÷2) # end does not work here
view_of_a .= 0
a

10000-element Vector{Int64}:
     0
     0
     0
     0
     0
     0
     0
     0
     0
     0
     0
     0
     0
     ⋮
  9989
  9990
  9991
  9992
  9993
  9994
  9995
  9996
  9997
  9998
  9999
 10000

pointer(view_of_a) == pointer(a)

true
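The relationship between a view and its array can also be inspected directly. The following is a minimal sketch using only Base functions (parent and parentindices); the variable names are illustrative and not part of the example above.

```julia
a = collect(1:10)
v = view(a, 1:5)

println(v isa SubArray)      # a view of an Array is a SubArray wrapper
println(parent(v) === a)     # its parent is the very same array object
println(parentindices(v))    # the indices of the parent that the view covers

v[2] = -1                    # writing through the view ...
println(a[2])                # ... changes the parent array
```

Because the view only stores a reference to the parent and the index range, creating it does not copy any elements.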

Let's make sure that using views allows us to avoid allocating extra memory for copies by using @allocated, which shows the number of bytes allocated.

println(@allocated (subarray_of_a = a[1:end÷2]))
println(@allocated (view_of_a = view(a,1:length(a)÷2)))

40112
112
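Measurements taken at global scope may include some overhead that is unrelated to the slice itself. A common way to reduce this, sketched below, is to wrap the operation in a function and call it once before measuring. The helper names copy_half and view_half are hypothetical, and the exact byte counts depend on the Julia version, so none are shown.

```julia
# Hypothetical helpers: sum the first half of a vector via a copy and via a view.
copy_half(x) = sum(x[1:end÷2])
view_half(x) = sum(@view x[1:end÷2])

a = collect(1:10000)

# Call each function once so that compilation is not counted in the measurement.
copy_half(a)
view_half(a)

bytes_copy = @allocated copy_half(a)
bytes_view = @allocated view_half(a)
println("copy: $bytes_copy bytes, view: $bytes_view bytes")
```

The copying version has to allocate room for 5000 Int64 values on every call, while the view version does not need that buffer.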

@view¶

However, the view function does not quite deliver on the promise of familiar syntax, because we could not use, for example, the keyword end. The @view macro solves this problem:

a = repeat(1:10,inner=3)
b = @view a[end-3:end]
b .= 0
a'

1×30 adjoint(::Vector{Int64}) with eltype Int64:
 1  1  1  2  2  2  3  3  3  4  4  4  5  …  7  7  7  8  8  8  9  9  0  0  0  0
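The same syntax carries over to multidimensional arrays, where end resolves separately in each dimension. A small sketch with an illustrative 3×4 matrix (the names A, last_col and tail are assumptions made for this example):

```julia
A = reshape(collect(1:12), 3, 4)   # 3×4 matrix, filled column by column

last_col = @view A[:, end]         # `end` is 4 here: the last column
last_col .= 0                      # zeroes the last column of A in place

tail = @view A[end, 2:end]         # last row, columns 2 to 4
println(A[:, 4])
println(collect(tail))
```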

But the question may arise: why do we need an extra variable when we can just write

a = repeat(1:10,inner=3)
a[end-3:end] .= 0

The answer to this can be formulated as follows:

Views are needed to combine efficient use of resources with readable code.

Suppose the task is to print each triplet of elements together with its sum.

for i in 0:(length(a)÷3-1)
    println("sum of $(a[3i+1:3i+3]) -> $(sum(a[3i+1:3i+3]))")
end

You can see that the same indexing expression is repeated, which makes it easy to get one of the copies wrong. Naming the slice removes the repetition:

for i in 0:(length(a)÷3-1)
    triplet = a[3i+1:3i+3]
    println("sum of $triplet -> $(sum(triplet))")
end

sum of [1, 1, 1] -> 3
sum of [2, 2, 2] -> 6
sum of [3, 3, 3] -> 9
sum of [4, 4, 4] -> 12
sum of [5, 5, 5] -> 15
sum of [6, 6, 6] -> 18
sum of [7, 7, 7] -> 21
sum of [8, 8, 8] -> 24
sum of [9, 9, 0] -> 18
sum of [0, 0, 0] -> 0

In addition, a function that uses views allocates much less memory and runs much faster. To show this, we use the @btime macro, which runs the function many times and reports the minimum execution time together with the memory allocated.

(We removed the output to the console from the functions to avoid clogging the console during multiple function calls).

using BenchmarkTools

function tripletssum_subarray(v)
    for i in 0:(length(v)÷3-1)
        triplet = v[3i+1:3i+3]          # slicing copies the three elements
    end
end

function tripletssum_view(v)
    for i in 0:(length(v)÷3-1)
        triplet = @view v[3i+1:3i+3]    # a view refers to the elements in place
    end
end

 
a = repeat(1:10000,inner=3)

println(@btime tripletssum_subarray(a))
println(@btime tripletssum_view(a))

  442.980 μs (10000 allocations: 781.25 KiB)
nothing
  11.152 μs (0 allocations: 0 bytes)
nothing
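The benchmarked functions only create the slices and discard them. As a sketch of the full task, the variants below also add up the triplet sums, so the slices are actually used. The function names are hypothetical, and no timings are given because they depend on the machine.

```julia
# Hypothetical variant that copies each triplet before summing it.
function triplet_sums_copy(v)
    total = zero(eltype(v))
    for i in 0:(length(v) ÷ 3 - 1)
        total += sum(v[3i+1:3i+3])
    end
    return total
end

# Hypothetical variant that sums each triplet through a view.
function triplet_sums_view(v)
    total = zero(eltype(v))
    for i in 0:(length(v) ÷ 3 - 1)
        total += sum(@view v[3i+1:3i+3])
    end
    return total
end

a = repeat(1:10000, inner=3)
println(triplet_sums_copy(a) == triplet_sums_view(a))   # both give the same total
```

Both functions can be passed to @btime in the same way as the ones above.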

@views¶

Let's consider the following example

Pkg.add("LinearAlgebra")
using LinearAlgebra
@btime dot( a[1:end÷2], a[end÷2+1:end])

   Resolving package versions...
  No Changes to `~/.project/Project.toml`
  No Changes to `~/.project/Manifest.toml`

  51.050 μs (13 allocations: 234.64 KiB)

312575002500

And it would seem that we know how we can improve this code:

try
# MAIN CODE
#------------------------------------------------------
  @btime dot(@view a[1:end÷2], @view a[end÷2+1:end])  
#------------------------------------------------------
# EXCEPTION HANDLING
catch e
    io = IOBuffer();
    showerror(io, e)
    error_msg = String(take!(io))
end

"LoadError: ArgumentError: Invalid use of @view macro: argument must be a reference expression A[...].\nin expression starting at In[53]:4"

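The error says that we used the @view macro incorrectly, although our expression a[1:end÷2] seems to match the required form A[...].

The problem is in how a macro call without parentheses is parsed: the macro takes everything up to the end of the expression as its argument. The first @view therefore receives the whole tuple (a[1:end÷2], @view a[end÷2+1:end]) instead of a single indexing expression.

To correct this, we call the macro with parentheses around each slice that we want to turn into a view: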

  @btime dot(@view(a[1:end÷2]) ,@view(a[end÷2+1:end]))

  4.153 μs (11 allocations: 272 bytes)

312575002500
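How the two spellings are parsed can be checked without running them, since Meta.parse returns the expression tree before any macro is expanded. This is only an illustrative sketch; the short index ranges are placeholders.

```julia
# Without parentheses, the macro receives everything after it as one argument.
bad = Meta.parse("@view a[1:2], @view a[3:4]")
println(bad.args[end].head)     # head of the argument passed to the first @view

# With parentheses, the argument is exactly one indexing expression.
good = Meta.parse("@view(a[1:2])")
println(good.args[end].head)
```

The first call prints tuple and the second prints ref, which is the A[...] form that @view expects.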

To avoid writing @view for every slice, we can use the @views macro:

  @btime @views dot((a[1:end÷2]), (a[end÷2+1:end]))

  3.801 μs (11 allocations: 272 bytes)

312575002500

@views can also be placed before a function definition, so that every slice inside the function becomes a view.

@views function tripletssum_views(v)
    for i in 0:(length(v)÷3-1)
        triplet = v[3i+1:3i+3]
    end
end
a = repeat(1:10000,inner=3)

println(@btime tripletssum_views(a))

  11.047 μs (0 allocations: 0 bytes)
nothing
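@views can also wrap a begin ... end block. It only changes slicing expressions written as array[...]; indexing with a single scalar still returns the element itself. A short sketch with illustrative variable names:

```julia
using LinearAlgebra

a = collect(1:10)

@views begin
    left  = a[1:end÷2]       # becomes a view of elements 1 to 5
    right = a[end÷2+1:end]   # becomes a view of elements 6 to 10
    first_elem = a[1]        # scalar indexing is unaffected: a plain Int
end

println(left isa SubArray)
println(first_elem)
println(dot(left, right))
```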

In what cases should we use views?¶

Views should be used where:

it improves readability
it improves performance
you understand the difference between working with the copy and the original

a = rand(1000)
println(@allocated sum(a))
println(@allocated sum(a[1:end]))
println(@allocated sum(copy(a[1:end])))
println(@allocated sum(@view a[1:end]))

16
8192
17152
2824
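The last point in the list above, the difference between working with a copy and with the original, matters when the array changes after the slice is taken. A minimal sketch with illustrative names:

```julia
a = collect(1:5)

snapshot = a[2:4]          # copy: keeps the values as they were at this moment
window   = @view a[2:4]    # view: always reflects the current contents of a

a[3] = 100                 # modify the original array afterwards

println(snapshot)          # the copy is unchanged
println(window)            # the view shows the new value
```

If later changes to the array must not affect the result, a copy is the safer choice even though it allocates.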

Consider an example: solving the Lotka-Volterra equations.

import Pkg; Pkg.add(["OrdinaryDiffEq","Plots"])
using OrdinaryDiffEq
using Plots
gr()
function lotka(du, u, p, t) 
    du[1] = p[1] * u[1] - p[2] * u[1] * u[2]
    du[2] = p[4] * u[1] * u[2] - p[3] * u[2]  
end
α = 1; β = 0.01; γ = 1; δ = 0.02;
p = [α, β, γ, δ]
tspan = (0.0, 6.5)
u0 = [50; 50]
prob = ODEProblem(lotka, u0, tspan, p)
sol = solve(prob,saveat=0.001)

   Resolving package versions...
  No Changes to `~/.project/Project.toml`
  No Changes to `~/.project/Manifest.toml`

retcode: Success
Interpolation: 1st order linear
t: 6501-element Vector{Float64}:
 0.0
 0.001
 0.002
 0.003
 0.004
 0.005
 0.006
 0.007
 0.008
 0.009
 0.01
 0.011
 0.012
 ⋮
 6.489
 6.49
 6.491
 6.492
 6.493
 6.494
 6.495
 6.496
 6.497
 6.498
 6.499
 6.5
u: 6501-element Vector{Vector{Float64}}:
 [50.0, 50.0]
 [50.02500624847936, 50.000012502345534]
 [50.05002498979869, 50.00005001769738]
 [50.07505621775588, 50.00011255855343]
 [50.1000999261168, 50.000200137444956]
 [50.125156108615315, 50.00031276693659]
 [50.1502247589533, 50.00045045962633]
 [50.17530587080062, 50.000613228145575]
 [50.20039943779515, 50.00080108515906]
 [50.22550545354273, 50.001014043364925]
 [50.25062391161723, 50.00125211549466]
 [50.27575480556051, 50.001515314313124]
 [50.30089812888241, 50.00180365261858]
 ⋮
 [50.05479160200564, 49.98705194268289]
 [50.079830802201556, 49.987119920075195]
 [50.104882567590614, 49.987212877939285]
 [50.12994689359235, 49.987330827635944]
 [50.15502377561346, 49.98747378054721]
 [50.18011320904711, 49.98764174807645]
 [50.20521518927415, 49.98783474164827]
 [50.23032971166218, 49.988052772708585]
 [50.2554567715658, 49.98829585272456]
 [50.28059636432637, 49.98856399318467]
 [50.305748485272886, 49.98885720559867]
 [50.330913129720685, 49.98917550149757]

Plotting the solution directly draws the time dependence of the variables $x(t)$ and $y(t)$.

plot(sol)

But if we want to draw the dependence $y(x)$, we have to use the slices sol[1,:] and sol[2,:], which brings back the copying problem described above.

@btime plot(sol[1,:],sol[2,:])

  582.602 μs (494 allocations: 451.46 KiB)

We can now solve it using views:

@btime @views plot(sol[1,:],sol[2,:])

  488.246 μs (492 allocations: 102.46 KiB)
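The same row slicing can be tried on an ordinary matrix without installing any packages. In the sketch below, M is a hypothetical stand-in for the solution (row 1 plays the role of $x(t)$, row 2 of $y(t)$); it is not a solver result.

```julia
# Hypothetical stand-in for a solution: one row per variable, one column per time step.
M = [50.0 50.1 50.2;
     50.0 50.3 50.6]

xs_copy = M[1, :]        # slicing a row allocates a new vector

xs = @view M[1, :]       # row views refer to the data inside M
ys = @view M[2, :]

M[1, 2] = -1.0           # a later change to M ...
println(xs[2])           # ... is visible through the view
println(xs_copy[2])      # ... but not in the copy
```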

Conclusions¶

We have introduced the concept of views and considered practical ways to improve the performance of functions without making any significant changes to the code.