Instrumentation of Julia using DTrace and bpftrace
The DTrace and bpftrace tools provide lightweight process instrumentation. Instrumentation can be enabled and disabled during the process, and its costs are minimal.
Compatibility: Julia 1.8
Probe support was added in Julia 1.8. |
This document is intended for Linux, but the content should be mostly true for Mac OS/Darwin and FreeBSD. |
Enabling support
On Linux, install the systemtap
package with the dtrace
version and create a Make.user
file with the following contents:
WITH_DTRACE=1
to enable the USDT probes.
Check
> readelf -n usr/lib/libjulia-internal.so.1 Displaying notes found in: .note.gnu.build-id Owner Data size Description GNU 0x00000014 NT_GNU_BUILD_ID (unique build ID bitstring) Build ID: 57161002f35548772a87418d2385c284ceb3ead8 Displaying notes found in: .note.stapsdt Owner Data size Description stapsdt 0x00000029 NT_STAPSDT (SystemTap probe descriptors) Provider: julia Name: gc__begin Location: 0x000000000013213e, Base: 0x00000000002bb4da, Semaphore: 0x0000000000346cac Arguments: stapsdt 0x00000032 NT_STAPSDT (SystemTap probe descriptors) Provider: julia Name: gc__stop_the_world Location: 0x0000000000132144, Base: 0x00000000002bb4da, Semaphore: 0x0000000000346cae Arguments: stapsdt 0x00000027 NT_STAPSDT (SystemTap probe descriptors) Provider: julia Name: gc__end Location: 0x000000000013214a, Base: 0x00000000002bb4da, Semaphore: 0x0000000000346cb0 Arguments: stapsdt 0x0000002d NT_STAPSDT (SystemTap probe descriptors) Provider: julia Name: gc__finalizer Location: 0x0000000000132150, Base: 0x00000000002bb4da, Semaphore: 0x0000000000346cb2 Arguments:
Adding probes to libjulia
The probes are declared in the dtraces format in the file src/uprobes.d
. The generated header file is included in the file src/julia_internal.h
: when adding probes, it is necessary to provide an empty implementation in it.
The header will contain the *_ENABLED
semaphore and the actual probe call. If it takes a lot of resources to calculate the probe arguments, you should first check if the probe is enabled, and then calculate the arguments and invoke the probe.
if (JL_PROBE_{PROBE}_ENABLED())
auto expensive_arg = ...;
JL_PROBE_{PROBE}(expensive_arg);
If the probe has no arguments, it is better not to enable the semaphore check. When the USDT probes are turned on, the semaphore puts a load on the memory, regardless of whether the probe is turned on.
#define JL_PROBE_GC_BEGIN_ENABLED() __builtin_expect (julia_gc__begin_semaphore, 0)
__extension__ extern unsigned short julia_gc__begin_semaphore __attribute__ ((unused)) __attribute__ ((section (".probes")));
Since the probe itself is an idle operation, control will be transferred to the probe handler.
Available probes
Garbage Collection probes
-
julia:gc__begin
: Garbage collection begins in one of the threads and initiates program execution suspension. -
julia:gc__stop_the_world
: All threads have reached a safe point, and garbage collection is underway. -
julia:gc__mark__begin
: the beginning of the marking phase. -
julia:gc__mark_end(scanned_bytes, perm_scanned)
: Marking phase is completed. -
julia:gc__sweep_begin(full)
: start of the cleaning phase. -
julia:gc__sweep_end
: The cleaning phase is completed. -
julia:gc__end
: Garbage collection is completed, other threads continue to run. -
julia:gc__finalizer
: The initial garbage collection thread has completed the execution of finalizers.
Task Environment Probes
-
julia:rt__run__task(task)
: Switches to thetask
task in the current thread. -
julia:rt__pause__task(task)
: Switches from thetask
task in the current thread. -
julia:rt__new__task(parent, child)
: theparent
task has created thechild
task in the current thread. -
julia:rt__start__task(task)
: Thetask
is running for the first time with a new stack. -
julia:rt__finish__task(task)
: Thetask
has completed and will no longer be executed. -
julia:rt__start__process__events(task)
: Thetask
task has started processing libuv events. -
julia:rt__finish__process__events(task)
: The `task' task has completed processing libuv events.
Task Queue Probes
-
julia:rt__taskq__insert(ptls, task)
: Theptls
thread tried to add thetask
task to the multiq PARTR. -
julia:rt__taskq__get(ptls, task)
: The 'ptls` thread extracted thetask
task from PARTR multiq.
Flow stop and wake-up probes
-
julia:rt__sleep__check__wake(ptls, old_state)
: the stream (PTLSptls
) is waking up, the previous state is `old_state'. -
julia:rt__sleep__check__wakeup(ptls)
: The stream (PTLSptls
) has awakened itself. -
julia:rt__sleep__check__sleep(ptls)
: The thread (PTLSptls
) is trying to switch to an inactive state. -
julia:rt__sleep__check__taskq__wake(ptls)
: The thread (PTLSptls
) failed to switch to an inactive state due to tasks in PARTR multiq. -
julia:rt__sleep__check__task__wake(ptls)
: The thread (PTLSptls
) failed to switch to an inactive state due to tasks in the Base work queue. -
julia:rt__sleep__check__uv__wake(ptls)
: The thread (PTLSptls
) failed to enter an inactive state due to libuv waking up.
Examples of using probes
Delay in suspending program execution during garbage collection
The 'contrib/gc_stop_the_world_latency.bt` provides an example of the bpftrace
script, which creates a histogram of the delay in reaching the safe point for all threads.
Run this Julia code with the julia -t 2
parameter.
using Base.Threads fib(x) = x <= 1 ? 1 : fib(x-1) + fib(x-2) beaver = @spawn begin while true fib(30) # Эта заданная вручную безопасная точка необходима, иначе данный # цикл может никогда не подвергаться сборке мусора. GC.safepoint() end end allocator = @spawn begin while true zeros(1024) end end wait(allocator)
In the second terminal:
> sudo contrib/bpftrace/gc_stop_the_world_latency.bt Attaching 4 probes... Tracing Julia GC Stop-The-World Latency... Hit Ctrl-C to end. ^C @usecs[1743412]: [4, 8) 971 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| [8, 16) 837 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ | [16, 32) 129 |@@@@@@ | [32, 64) 10 | | [64, 128) 1 | |
We can see the distribution of the delay in the program suspension phase for the Julia process.
Task Generation Monitor
Sometimes it can be useful to know when a task spawns other tasks. This is easily done using the `rt__new__task'. The first argument of the probe, `parent', is an existing task that creates another task. Thus, if you know the address of the task that needs to be tracked, you can easily find out which tasks it has generated. Let’s see how it’s done. First, let’s start the Julia session and get the PID and the address of the REPL task.
> julia _ _ _ _(_)_ | Documentation: https://docs.julialang.org (_) | (_) (_) | _ _ _| |_ __ _ | Type "?" for help, "]?" for Pkg help. | | | | | | |/ _` | | | | |_| | | | (_| | | Version 1.6.2 (2021-07-14) _/ |\__'_|_|_|\__'_| | Official https://julialang.org/ release |__/ | 1> getpid() 997825 2> current_task() Task (runnable) @0x00007f524d088010
Now we can run bpftrace and make the rt__new__task probe monitor only the parent task.
sudo bpftrace -p 997825 -e 'usdt:usr/lib/libjulia-internal.so:julia:rt__new__task /arg0==0x00007f524d088010/{ printf("Task: %x\n", arg0); }'
(Note that in the above code, the first argument, i.e. parent
, is arg0
.)
If we generate one task:
@async 1+1
then we will see that it has been created.:
Task: 4d088010
However, if we generate some more tasks from this new task:
@async for i in 1:10
@async 1+1
end
using bpftrace
we will see only one task.:
Task: 4d088010
And it’s still the same task that we’re tracking! Of course, we can easily remove this filter to see all the created tasks.
sudo bpftrace -p 997825 -e 'usdt:usr/lib/libjulia-internal.so:julia:rt__new__task { printf("Task: %x\n", arg0); }'
Task: 4d088010 Task: 4dc4e290 Task: 4dc4e290 Task: 4dc4e290 Task: 4dc4e290 Task: 4dc4e290 Task: 4dc4e290 Task: 4dc4e290 Task: 4dc4e290 Task: 4dc4e290 Task: 4dc4e290
We see our root task and the newly spawned task as the parent for ten more new tasks.
Detecting a mass awakening
Sometimes tasks face the problem of mass awakening (thundering herd): when an inactive task execution environment receives a request to perform any work, all threads can be awakened, even if only a part of them would be enough. Because of this, there may be an additional delay, as CPU resources are consumed in waking up all threads (and stopping them due to lack of sufficient work).
Using bpftrace
you can easily illustrate this problem. First, in one terminal, we run Julia with several threads (six in this example) and get the process PID.
> julia -t 6 _ _ _ _(_)_ | Documentation: https://docs.julialang.org (_) | (_) (_) | _ _ _| |_ __ _ | Type "?" for help, "]?" for Pkg help. | | | | | | |/ _` | | | | |_| | | | (_| | | Version 1.6.2 (2021-07-14) _/ |\__'_|_|_|\__'_| | Official https://julialang.org/ release |__/ | 1> getpid() 997825
In another terminal, we run bpftrace
to monitor the process, in particular using the probe rt__sleep__check__wake
.
sudo bpftrace -p 997825 -e 'usdt:usr/lib/libjulia-internal.so:julia:rt__sleep__check__wake { printf("Thread wake up! %x\n", arg0); }'
Now we will create and complete one task in Julia.
Threads.@spawn 1+1
In 'bpftrace`, the output will be something like this.
Thread wake up! 3f926100 Thread wake up! 3ebd5140 Thread wake up! 3f876130 Thread wake up! 3e2711a0 Thread wake up! 3e312190
Although we have created only one task (which can be handled by only one thread at a time), all threads have been awakened! In the future, perhaps this problem will be solved and the runtime will wake up only one thread (or not at all if there is enough spawning thread to complete the task).
Task monitoring using BPFnative.jl
BPFnative.jl can be connected to USDT sensing points in the same way as bpftrace. https://github.com/jpsamaroo/BPFnative.jl/blob/master/examples/task-runtime.jl [Here] you can find a demonstration of monitoring the task execution environment, garbage collection, as well as threads going to sleep and waking up.
Notes on using bpftrace
An example of a probe in the bpftrace format looks like this.
usdt:usr/lib/libjulia-internal.so:julia:gc__begin { @start[pid] = nsecs; }
The probe declaration takes the type usdt
, followed by either the library path or the PID, the vendor name julia
, and the probe name `gc__begin'. Note that the relative path to is used here. `libjulia-internal.so but in a production environment, an absolute path may be required.