Evaluating Julia code
One of the hardest parts of learning how Julia runs code is understanding how all of the pieces work together to execute a block of code.
Each chunk of code typically passes through many steps with potentially unfamiliar names, such as (in no particular order): flisp, AST, C++, LLVM, `eval`, `typeinf`, `macroexpand`, sysimg (or system image), bootstrapping, compile, parse, execute, JIT, interpret, box, unbox, intrinsic function, and primitive function, before turning into the desired result (hopefully).
Definitions
Julia code Execution
The following is a general description of the process.
- The user starts `julia`.
- The C function `main()` from `cli/loader_exe.c` gets called. This function processes the command line arguments, filling in the `jl_options` struct and setting the variable `ARGS`. It then initializes Julia (by calling https://github.com/JuliaLang/julia/blob/master/src/init.c [`julia_init` in `init.c`], which may load a previously compiled system image, sysimg). Finally, it passes control to Julia by calling https://github.com/JuliaLang/julia/blob/master/base/client.jl [`Base._start()`].
- When `_start()` takes over, the subsequent sequence of commands depends on the command line arguments given. For example, if a file name was supplied, that file is executed. Otherwise, an interactive REPL is started.
- Skipping the details of how the REPL interacts with the user, let's just say that the program ends up with a block of code that it wants to run.
- If the block of code to run is in a file, https://github.com/JuliaLang/julia/blob/master/src/toplevel.c [`jl_load(char *filename)`] is invoked to load the file and parse it. Each fragment of code is then passed to `eval` for execution.
- Each fragment of code (or AST) is handed to `eval()` to be turned into a result.
- `eval()` takes each code fragment and tries to run it in `jl_toplevel_eval_flex()`.
- `jl_toplevel_eval_flex()` decides whether the code is a top-level action (such as `using` or `module`) that would be invalid inside a function. If so, it passes the code to the top-level interpreter.
- `jl_toplevel_eval_flex()` then expands the code to eliminate any macros and lowers the AST to make it simpler to execute.
- `jl_toplevel_eval_flex()` then uses some simple heuristics to decide whether to JIT compile the AST or to interpret it directly.
- The bulk of the work of interpreting code is handled by https://github.com/JuliaLang/julia/blob/master/src/interpreter.c [`eval` in `interpreter.c`].
- If the code is compiled instead, the bulk of the work is handled by `codegen.cpp`. Whenever a Julia function is called for the first time with a given set of argument types, type inference is run on that function. This information is used by the code generation step (codegen) to produce faster code.
- Eventually, the user quits the REPL or the end of the program is reached, and the `_start()` method returns.
- Just before exiting, `main()` calls https://github.com/JuliaLang/julia/blob/master/src/init.c [`jl_atexit_hook(exit_code)`]. This calls `Base._atexit()` (which calls any functions registered with `atexit()` inside Julia). It then calls https://github.com/JuliaLang/julia/blob/master/src/gc.c [`jl_gc_run_all_finalizers()`]. Finally, it gracefully cleans up all `libuv` handles and waits for them to flush and close.
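The load, parse, and eval steps in the list above can be imitated from within Julia. The following is only a rough sketch, not the runtime's actual code path: `eval_fragments` and the `Scratch` module are hypothetical names introduced for illustration, and the real work is done in C by `jl_load` and `jl_toplevel_eval_flex()`.

```julia
# Hypothetical helper: parse `src` one top-level expression at a time and
# evaluate each fragment in `mod`, returning the value of the last one.
function eval_fragments(mod::Module, src::AbstractString)
    pos = 1
    last_value = nothing
    while pos <= ncodeunits(src)
        # Meta.parse returns the next expression and the index just past it.
        ex, next_pos = Meta.parse(src, pos)
        if ex !== nothing              # `nothing`: only whitespace or comments were left
            last_value = Core.eval(mod, ex)
        end
        next_pos > pos || break        # guard against a parse that makes no progress
        pos = next_pos
    end
    return last_value
end

scratch = Module(:Scratch)             # fresh module, so Main is left untouched
src = """
x = 2
y = x + 3
y * 10
"""
result = eval_fragments(scratch, src)
println(result)
```

Each fragment is evaluated before the next one is parsed, which is why the second line can already see `x`. The built-in `include_string(mod, src)` offers the same behavior without the hand-written loop.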
Parsing
The Julia parser is a small lisp program written in femtolisp, the source code for which is distributed inside Julia in https://github.com/JuliaLang/julia/tree/master/src/flisp [src/flisp].
Its interface functions are defined mainly in https://github.com/JuliaLang/julia/blob/master/src/jlfrontend.scm [`jlfrontend.scm`]. The code in https://github.com/JuliaLang/julia/blob/master/src/ast.c [`ast.c`] handles this handoff on the Julia side.
The other relevant files at this stage are https://github.com/JuliaLang/julia/blob/master/src/julia-parser.scm [`julia-parser.scm`], which handles tokenizing Julia code and turning it into an AST, and https://github.com/JuliaLang/julia/blob/master/src/julia-syntax.scm [`julia-syntax.scm`], which handles transforming complex AST representations into simpler, lowered AST representations that are more suitable for analysis and execution.
If you want to test the parser without rebuilding Julia in its entirety, you can run the frontend on its own as follows.
$ cd src
$ flisp/flisp
> (load "jlfrontend.scm")
> (jl-parse-file "<filename>")
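The same two stages, parsing into an AST and lowering it, can also be observed from a running Julia session. This is a small sketch using `Meta.parse` and `Meta.lower`; the expressions are arbitrary examples, and the exact printed form of the lowered code varies between Julia versions.

```julia
# Parsing: source text becomes a surface AST (an Expr).
ex = Meta.parse("a + b * c")
dump(ex)                       # shows head :call with args :+, :a and the nested b * c

# Lowering: the surface AST is rewritten into a simpler form.
# Nothing is evaluated here, so `x` does not need to be defined.
lowered = Meta.lower(@__MODULE__, :(x > 0 ? sqrt(x) : zero(x)))
println(lowered)
```

The lowered result wraps a `CodeInfo` object in which the conditional has been turned into explicit branches, which is the kind of simplification the text attributes to `julia-syntax.scm`.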
Macro Expansion
When `eval()` encounters a macro, it expands that AST node before attempting to evaluate the expression. Macro expansion involves a handoff from `eval()` (in Julia) to the parser function `jl_macroexpand()` (written in `flisp`), to the Julia macro itself (written in Julia) via `fl_invoke_julia_macro()`, and back.
Macro expansion is typically invoked as the first step of a call to `Meta.lower()`/`jl_expand()`, although it can also be invoked directly by calling `macroexpand()`/`jl_macroexpand()`.
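To make the direct route concrete, here is a minimal sketch that calls `macroexpand()` on a quoted macro call. The macro `@twice`, the helper `has_macrocall`, and the variable `counter` are hypothetical names made up for this illustration.

```julia
# Hypothetical macro: emit the given expression two times.
macro twice(ex)
    # esc() keeps `ex` in the caller's scope rather than the macro's module.
    return quote
        $(esc(ex))
        $(esc(ex))
    end
end

# Hypothetical helper: does an AST still contain an unexpanded macro call?
has_macrocall(x) = false
has_macrocall(ex::Expr) = ex.head === :macrocall || any(has_macrocall, ex.args)

unexpanded = :(@twice counter[] += 1)
expanded = macroexpand(@__MODULE__, unexpanded)   # expand without evaluating

counter = Ref(0)
Core.eval(@__MODULE__, expanded)                  # now run the expanded code
println(counter[])
```

The quoted expression holds a `:macrocall` node until `macroexpand()` replaces it with the block returned by the macro; only the later `Core.eval` call actually runs the increments.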
Type inference
Type inference is implemented in Julia by https://github.com/JuliaLang/julia/blob/master/base/compiler/typeinfer.jl [`typeinf()` in `compiler/typeinfer.jl`]. Type inference is the process of examining a Julia function and determining bounds on the types of each of its variables, as well as bounds on the type of the function's return value. This enables many later optimizations, such as unboxing of known immutable values and compile-time hoisting of various run-time operations, such as computing field offsets and function pointers. Type inference may also include other steps such as constant propagation and inlining.
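The inferred bounds can be inspected with reflection helpers. The sketch below uses `Base.return_types` and `Base.code_typed`, which are introspection tools rather than the internal `typeinf()` entry point itself; `half` and `pick` are hypothetical example functions.

```julia
half(x) = x / 2                 # return type is fully determined by the argument type
pick(flag) = flag ? 1 : 2.0     # return type depends on a run-time value

# Inferred return type for each matching method, given the argument types.
println(Base.return_types(half, (Int,)))
println(Base.return_types(pick, (Bool,)))

# code_typed pairs the type-annotated code with the inferred return type.
typed = first(Base.code_typed(half, (Int,)))
println(typed.second)
```

For `pick`, inference cannot narrow the result to a single concrete type, so the bound it reports is a `Union` of the two possibilities.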
Even more definitions
JIT code generation
Code generation is the process of converting Julia’s AST into native machine code.
The JIT environment is initialized by an early call to https://github.com/JuliaLang/julia/blob/master/src/codegen.cpp [`jl_init_codegen` in `codegen.cpp`].
On demand, a Julia method is converted into a native function by the function `emit_function(jl_method_instance_t*)`. (Note that when using MCJIT (in LLVM v3.4 and later), each function must be JIT compiled into a new module.) This function recursively calls `emit_expr()` until the entire function has been emitted.
Much of the rest of `codegen.cpp` is devoted to various manual optimizations of specific code patterns. For example, `emit_known_call()` knows how to inline many of the primitive functions (defined in https://github.com/JuliaLang/julia/blob/master/src/builtins.c [`builtins.c`]) for various combinations of argument types.
Other parts of codegen are handled by various helper files:
- Handles backtraces for JIT functions
- Handles the `ccall` and `llvmcall` FFI, along with various `abi_*.cpp` files
- Handles the emission of various low-level intrinsic functions
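The output of code generation can be viewed from Julia with `code_llvm` from the `InteractiveUtils` standard library. This is a minimal sketch with a hypothetical function `add_one`; the exact IR text depends on the Julia and LLVM versions, so no particular output is shown.

```julia
using InteractiveUtils          # provides code_llvm (loaded automatically in the REPL)

add_one(x) = x + 1              # hypothetical example function

# Capture the LLVM IR generated for the (Int,) specialization.
io = IOBuffer()
code_llvm(io, add_one, (Int,))
ir = String(take!(io))
println(ir)
```

Requesting the IR for a different argument type, such as `(Float64,)`, yields a separate specialization, in line with the point that code is generated per set of argument types.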
Bootstrapping
The process of creating a new system image is called bootstrapping.
The word comes from the phrase "pulling oneself up by the bootstraps" and refers to the idea of starting from a very limited set of available functions and definitions and ending with a full-featured environment.
System Image
The system image is a precompiled archive of a set of Julia files. The `sys.ji` file distributed with Julia is one such system image, generated by executing the file https://github.com/JuliaLang/julia/blob/master/base/sysimg.jl [`sysimg.jl`] and serializing the resulting environment (including types, functions, modules, and all other defined values) into a file. Therefore, it contains a frozen version of the `Main`, `Core`, and `Base` modules (and whatever else was in the environment at the end of bootstrapping). This serializer/deserializer is implemented by https://github.com/JuliaLang/julia/blob/master/src/staticdata.c [`jl_save_system_image`/`jl_restore_system_image` in `staticdata.c`].
If there is no sysimg file (`jl_options.image_file == NULL`), this also implies that `--build` was given on the command line, so the final result should be a new sysimg file. During Julia initialization, minimal `Core` and `Main` modules are created. Then a file named `boot.jl` is evaluated from the current directory. Julia then evaluates any file given as a command line argument until it reaches the end. Finally, it saves the resulting environment to a sysimg file for use as a starting point for a future Julia run.
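A running session can report which system image it was started from. The sketch below reads the `image_file` field through `Base.JLOptions()`, which mirrors the `jl_options` struct; this is an internal, undocumented accessor, so treating it as stable is an assumption, and the file name it reports may differ from `sys.ji` depending on the Julia version and platform.

```julia
# Base.JLOptions() is an internal accessor and may change between versions.
opts = Base.JLOptions()

if opts.image_file != C_NULL
    # image_file is a C string pointer; copy it into a Julia String.
    image_path = unsafe_string(opts.image_file)
    println("system image: ", image_path)
else
    println("no system image loaded")
end

# The modules restored from the image are available immediately at startup.
println(isdefined(Main, :Base) && isdefined(Main, :Core))
```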