Proper maintenance of multithreaded locks
The following strategies are used to ensure that the code is free of deadlocks (generally by addressing the 4th Coffman condition: circular wait).
Structure code so that only one lock needs to be acquired at a time.
Always acquire shared locks in the same order, as given by the table below.
Avoid constructs that are expected to need unrestricted recursion.
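The second strategy can be illustrated with a minimal sketch. It assumes plain POSIX mutexes rather than the runtime's own lock macros, and it uses address order as a stand-in for the level table; `guarded_t`, `lock_pair`, `unlock_pair`, and `transfer` are hypothetical names introduced only for this example.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example type: a value guarded by its own mutex. */
typedef struct {
    pthread_mutex_t lock;
    int value;
} guarded_t;

/* Lock two objects in one fixed global order (here: by address), so that two
 * threads calling this with swapped arguments cannot wait on each other. */
static void lock_pair(guarded_t *a, guarded_t *b)
{
    assert(a != NULL && b != NULL);
    if (a == b) {
        pthread_mutex_lock(&a->lock);
        return;
    }
    guarded_t *first = (uintptr_t)a < (uintptr_t)b ? a : b;
    guarded_t *second = (first == a) ? b : a;
    pthread_mutex_lock(&first->lock);
    pthread_mutex_lock(&second->lock);
}

/* Release both locks; unlock order does not matter for deadlock freedom. */
static void unlock_pair(guarded_t *a, guarded_t *b)
{
    pthread_mutex_unlock(&a->lock);
    if (a != b)
        pthread_mutex_unlock(&b->lock);
}

/* Move `amount` from one object to the other while holding both locks. */
static void transfer(guarded_t *from, guarded_t *to, int amount)
{
    lock_pair(from, to);
    from->value -= amount;
    to->value += amount;
    unlock_pair(from, to);
}
```

Calling `transfer(&x, &y, n)` on one thread and `transfer(&y, &x, m)` on another takes the two mutexes in the same order in both cases, which is what removes the circular wait.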
Locks
Below are all the locks that exist in the system, along with the mechanisms for using them that avoid potential deadlocks (the ostrich algorithm is not allowed here).
The following are definitely leaf locks (level 1) and must not try to acquire any other lock:
safepoint
Note that this lock is acquired implicitly by `JL_LOCK` and `JL_UNLOCK`. Use the `_NOGC` variants to avoid this for level 1 locks.
While holding this lock, the code must not perform any allocation or reach any safepoint. Note that there are safepoints when allocating, enabling or disabling garbage collection, entering or restoring exception frames, and taking or releasing locks.
shared_map
finalizers
pagealloc
gc_perm_lock
flisp
jl_in_stackwalk (Win32)
ResourcePool<?>::mutex
RLST_mutex
llvm_printing_mutex
jl_locked_stream::mutex
debuginfo_asyncsafe
inference_timing_mutex
ExecutionEngine::SessionLock
flisp itself is already thread-safe; this lock protects only the `jl_ast_context_list_t` pool. Likewise, the ResourcePool<?>::mutexes protect only the associated resource pool.
The following are leaf locks (level 2), which internally acquire only level 1 locks (safepoint):
global_roots_lock
Module->lock
JLDebuginfoPlugin::PluginMutex
newly_inferred_mutex
The following are level 3 locks, which can only acquire level 1 or level 2 locks internally:
Method->writelock
typecache
The following is a level 4 lock, which can only recurse to acquire level 1, 2, or 3 locks:
MethodTable->writelock
No Julia code may be called while holding any of the locks above this point.
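The numbered levels above lend themselves to a debug-time check. The following is a minimal sketch, assuming POSIX mutexes and C11 thread-local storage; `leveled_lock_t`, `may_acquire`, `leveled_lock`, and `leveled_unlock` are hypothetical helpers, not part of the runtime. It models only the rule that a thread holding a lock may acquire locks of a strictly lower level, and it ignores the unnumbered locks described later.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define MAX_HELD 16

/* Hypothetical wrapper: a mutex tagged with its level in the hierarchy. */
typedef struct {
    pthread_mutex_t mutex;
    int level; /* 1 = leaf lock; larger numbers are closer to the root */
} leveled_lock_t;

/* Levels of the locks currently held by this thread, in acquisition order. */
static _Thread_local int held_levels[MAX_HELD];
static _Thread_local int held_count = 0;

/* A lock may be taken only if its level is strictly below that of the most
 * recently acquired lock, which is also the lowest level currently held. */
static bool may_acquire(const leveled_lock_t *l)
{
    return held_count == 0 || l->level < held_levels[held_count - 1];
}

/* Returns false, without locking, if taking `l` would violate the hierarchy. */
static bool leveled_lock(leveled_lock_t *l)
{
    if (!may_acquire(l) || held_count == MAX_HELD)
        return false;
    pthread_mutex_lock(&l->mutex);
    held_levels[held_count++] = l->level;
    return true;
}

/* In this sketch, locks must be released in reverse order of acquisition. */
static void leveled_unlock(leveled_lock_t *l)
{
    assert(held_count > 0 && held_levels[held_count - 1] == l->level);
    held_count--;
    pthread_mutex_unlock(&l->mutex);
}
```

A thread that takes a level 4 lock, then a level 3 lock, then a level 1 lock passes the check; a thread that holds a level 1 lock and asks for anything else is refused. Returning `false` rather than aborting is a choice made for the example; a debug build might prefer to fail loudly.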
orc::ThreadSafeContext (TSCtx) locks occupy a special place in the locking hierarchy. They protect LLVM's global non-thread-safe state, but there may be an arbitrary number of them. By default, all of these locks may be treated as level 5 locks for the purpose of comparison with the rest of the hierarchy. A TSCtx should only be acquired from the JIT's pool of TSCtxs, and all locks on that TSCtx must be released before it is returned to the pool. If multiple TSCtx locks must be held at the same time (due to recursive compilation), they should be acquired in the order in which the TSCtxs were borrowed from the pool.
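The borrow-order rule can be sketched as follows. This is not the LLVM or JIT API: `ctx_t`, the fixed-size `pool`, and the `ctx_*` functions are hypothetical stand-ins built on POSIX mutexes, and the monotonically increasing `ticket` is an assumption introduced to record the order in which contexts were borrowed.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define POOL_SIZE 4

/* Hypothetical stand-in for a pooled context. */
typedef struct {
    pthread_mutex_t mutex;
    unsigned long ticket; /* borrow order; 0 means "currently in the pool" */
} ctx_t;

static ctx_t pool[POOL_SIZE] = {
    { PTHREAD_MUTEX_INITIALIZER, 0 }, { PTHREAD_MUTEX_INITIALIZER, 0 },
    { PTHREAD_MUTEX_INITIALIZER, 0 }, { PTHREAD_MUTEX_INITIALIZER, 0 },
};
static pthread_mutex_t pool_mutex = PTHREAD_MUTEX_INITIALIZER;
static unsigned long next_ticket = 1;

/* Borrow a free context and stamp it with the next ticket; NULL if none is free. */
static ctx_t *ctx_borrow(void)
{
    ctx_t *found = NULL;
    pthread_mutex_lock(&pool_mutex);
    for (int i = 0; i < POOL_SIZE && found == NULL; i++) {
        if (pool[i].ticket == 0) {
            pool[i].ticket = next_ticket++;
            found = &pool[i];
        }
    }
    pthread_mutex_unlock(&pool_mutex);
    return found;
}

/* Return a context; the caller must already have released its mutex. */
static void ctx_return(ctx_t *c)
{
    pthread_mutex_lock(&pool_mutex);
    c->ticket = 0;
    pthread_mutex_unlock(&pool_mutex);
}

/* qsort comparator: orders context pointers by ascending ticket. */
static int by_ticket(const void *a, const void *b)
{
    const ctx_t *x = *(ctx_t *const *)a;
    const ctx_t *y = *(ctx_t *const *)b;
    return (x->ticket > y->ticket) - (x->ticket < y->ticket);
}

/* Lock several borrowed contexts in borrow order (sorts `ctxs` in place). */
static void ctx_lock_all(ctx_t **ctxs, size_t n)
{
    qsort(ctxs, n, sizeof *ctxs, by_ticket);
    for (size_t i = 0; i < n; i++)
        pthread_mutex_lock(&ctxs[i]->mutex);
}

/* Unlock in reverse order; every lock is released before ctx_return. */
static void ctx_unlock_all(ctx_t **ctxs, size_t n)
{
    while (n > 0)
        pthread_mutex_unlock(&ctxs[--n]->mutex);
}
```

The caller passes whatever set of contexts it has borrowed; `ctx_lock_all` reorders the array so that the earliest-borrowed context is locked first, regardless of the order in which the caller listed them.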
The following is a level 5 lock:
JuliaOJIT::EmissionMutex
The following are level 6 locks, which can only recurse to acquire locks at lower levels:
codegen
jl_modules_mutex
The following is an almost-root lock (second-to-last level), meaning that only the root lock may be held when trying to acquire it:
typeinf
This one is perhaps one of the trickiest, since type inference can be invoked from many points.
Currently, this lock is merged with the codegen lock, since they call each other recursively.
The following locks synchronize I/O operations. Be aware that performing any I/O (for example, printing warning messages or debug information) while holding any other lock listed above may result in pernicious and hard-to-find deadlocks. BE VERY CAREFUL!
iolock
Individual ThreadSynchronizers locks
These may continue to be held after releasing the iolock, or be acquired without it, but be very careful never to attempt to acquire the iolock while holding them.
Libdl.LazyLibrary lock
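One way to respect the I/O warning above is to format a message while a lock is held but print it only after the lock has been released. The sketch below assumes a POSIX mutex standing in for any of the non-I/O locks; `counter_lock`, `bump_counter`, and `bump_and_report` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical shared state guarded by a non-I/O lock. */
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static int counter = 0;

/* Increment the counter. If it exceeds `limit`, format a warning into `msg`
 * while the lock is held, but do not print it here. Returns the length of the
 * formatted message, or 0 if there is nothing to report. */
static int bump_counter(int limit, char *msg, size_t msg_size)
{
    int len = 0;
    assert(msg != NULL && msg_size > 0);
    pthread_mutex_lock(&counter_lock);
    counter++;
    if (counter > limit)
        len = snprintf(msg, msg_size, "warning: counter %d exceeds limit %d",
                       counter, limit);
    pthread_mutex_unlock(&counter_lock);
    return len;
}

/* The actual I/O happens only after counter_lock has been released. */
static void bump_and_report(int limit)
{
    char msg[128];
    if (bump_counter(limit, msg, sizeof msg) > 0)
        fprintf(stderr, "%s\n", msg);
}
```

`snprintf` writes only to the caller's buffer, so nothing that takes an I/O lock runs inside the critical section; the `fprintf` call is made with no other lock held.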
The following is the root lock, meaning that no other lock may be held when trying to acquire it:
toplevel
This lock should be held while attempting a top-level action (such as creating a new type or defining a new method). Trying to acquire this lock inside a staged function will cause a deadlock!
Additionally, it is unclear whether any code can safely run in parallel with an arbitrary top-level expression, so it may be necessary for all threads to reach a safepoint first.
Broken locks
The following locks are broken:
- toplevel
Does not exist right now.
Fix: create it.
- Module->lock
This is vulnerable to deadlocks, since it cannot be certain that it is acquired in sequence. Some operations (such as `import_module`) are missing a lock.
Fix: replace with `jl_modules_mutex`?
- loading.jl: `require` and `register_root_module`
This file potentially has numerous problems.
Fix: needs locks.
Shared global data structures
These data structures each need locks because they are shared mutable global state. This is the inverse of the lock priority list above. It does not include level 1 leaf resources, because they are too simple.
MethodTable modifications (def, cache): MethodTable->writelock
Type declarations: toplevel lock
Type application: typecache lock
Global variable tables: Module->lock
Module serializer: toplevel lock
JIT and type inference: codegen lock
MethodInstance/CodeInstance updates: Method->writelock, codegen lock
These are set at construction and are immutable:
specTypes
sparam_vals
def
owner
These are set by `jl_type_infer` (while holding the codegen lock):
cache
rettype
inferred
valid ages
The `inInference` flag:
An optimization to quickly avoid recursing into `jl_type_infer` while it is already running.
The actual state (of setting `inferred`, then `fptr`) is protected by the codegen lock.
Function pointers:
These transition once, from `NULL` to a value, while the codegen lock is held.
The code generator cache (the contents of `functionObjectsDecls`):
These can transition multiple times, but only while the codegen lock is held.
It is valid to use an old version of this, or to block for new versions, so races are benign as long as the code is careful not to reference other data in the method instance (such as `rettype`) and assume that it is coordinated, unless it also holds the codegen lock.
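The one-time transition of a function pointer from `NULL` to a value can be sketched as below. This is an illustration under stated assumptions, not the runtime's implementation: `instance_t`, `codegen_lock_sketch`, `install_fptr`, and `lookup_fptr` are hypothetical, a POSIX mutex stands in for the codegen lock, and the lock-free read with C11 atomics is an assumption that the text above does not spell out.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

typedef int (*compiled_fn)(int);

/* Hypothetical stand-in for a method instance; only the pointer is modeled. */
typedef struct {
    _Atomic(compiled_fn) fptr; /* NULL until code has been generated */
} instance_t;

/* Stand-in for the codegen lock. */
static pthread_mutex_t codegen_lock_sketch = PTHREAD_MUTEX_INITIALIZER;

/* Two trivial "compiled" functions used as example values. */
static int add_one(int x) { return x + 1; }
static int add_two(int x) { return x + 2; }

/* Install `candidate` only if no pointer has been set yet. The write happens
 * under the lock, so the field changes at most once. Returns whichever pointer
 * ends up installed. */
static compiled_fn install_fptr(instance_t *mi, compiled_fn candidate)
{
    assert(candidate != NULL);
    pthread_mutex_lock(&codegen_lock_sketch);
    compiled_fn current = atomic_load_explicit(&mi->fptr, memory_order_relaxed);
    if (current == NULL) {
        atomic_store_explicit(&mi->fptr, candidate, memory_order_release);
        current = candidate;
    }
    pthread_mutex_unlock(&codegen_lock_sketch);
    return current;
}

/* Read without the lock: the result is either NULL or the final value. */
static compiled_fn lookup_fptr(instance_t *mi)
{
    return atomic_load_explicit(&mi->fptr, memory_order_acquire);
}
```

Because the field only ever moves from `NULL` to one value, a reader that sees a non-`NULL` pointer can keep using it. Reading other fields of the same instance and assuming they match would still require holding the lock.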
LLVMContext: codegen lock
Method: Method->writelock
- roots array (serializer and codegen)
- invoke / specializations / tfunc modifications