Overview of the machine code generation process
Representation of pointers
When emitting code to an object file, pointers are emitted as relocations. The deserialization code ensures that any object that pointed to one of these constants is re-created and contains the correct runtime pointer.
Otherwise, they are emitted as literal constants.
To emit one of these objects, call `literal_pointer_val`. It tracks the Julia value and the LLVM global, ensuring that they are valid both for the current runtime and after deserialization.
When emitted into the object file, these globals are stored as references in a large `gvals` table. This allows the deserializer to refer to them by index and to implement a custom manual mechanism, similar to a global offset table (GOT), for restoring them.
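The index-based restore described above can be pictured with a small model. This is only a sketch under stated assumptions: `Object`, `GlobalTable`, `reserve`, `restore`, and `load` are hypothetical names invented for illustration and are not part of the Julia runtime.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a runtime object (not Julia's jl_value_t).
struct Object { int id; };

// One slot per global referenced by compiled code. Code loads through the
// slot, so only the slot needs patching when the object file is loaded.
struct GlobalTable {
    std::vector<Object*> slots;

    // Emission time: reserve a slot and hand back its index.
    std::size_t reserve() {
        slots.push_back(nullptr);
        return slots.size() - 1;
    }

    // Load time: the deserializer re-creates the object and patches the
    // slot, addressing it purely by index.
    void restore(std::size_t index, Object* runtime_ptr) {
        assert(index < slots.size());
        slots[index] = runtime_ptr;
    }

    // What compiled code would do: an indirect load through the table.
    Object* load(std::size_t index) const {
        assert(index < slots.size());
        return slots[index];
    }
};
```

The point of the indirection is that the compiled code never embeds a raw address; it only embeds an index whose slot is filled in later.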
Function pointers are handled in a similar way. They are stored as values in a large `fvals` table. As with globals, this allows the deserializer to refer to them by index.
Note that `extern` functions are handled separately, by name, using the usual symbol resolution mechanism in the linker.
Note also that `ccall` functions are handled separately, using a manually managed GOT and procedure linkage table (PLT).
Representation of intermediate values
Values are passed around in a `jl_cgval_t` structure. It represents an R-value and contains enough information to determine how to assign it or pass it somewhere.
Values are created using one of the helper constructors, usually `mark_julia_type` (for immediate values) or `mark_julia_slot` (for pointers to values).
The `convert_julia_type` function can convert between any two types. It returns an R-value with `cgval.typ` set to `typ`. It casts the object to the requested representation, creating heap boxes, allocating stack copies, and computing tagged unions as needed to change the representation.
By contrast, `update_julia_type` changes `cgval.typ` to `typ` only if this can be done at zero cost (i.e. without emitting any code).
Representation of unions
Inferred union types may be allocated on the stack via a tagged type representation.
The following primitive routines need to be able to handle tagged unions.
- mark-type
- load-local
- store-local
- isa
- is
- emit_typeof
- emit_sizeof
- boxed
- unbox
- specialized cc-ret
Everything else should be possible to handle in inference by using these primitives to implement union splitting.
A tagged union is represented as a pair `< void* union, byte selector >`. The selector has a fixed size, `byte & 0x7f`, and tags the first 126 isbits types in the union. It records the one-based, depth-first count into the type union of the isbits objects inside. An index of zero indicates that `union*` is actually a tagged heap-allocated `jl_value_t*` and should be treated as a regular boxed object rather than as a tagged union.
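To make the one-based, depth-first numbering concrete, here is a minimal sketch over a toy type model. The `Type` struct and the functions `find_tag` and `union_tag` are hypothetical and exist only for this illustration; the sketch also assumes that non-isbits members do not consume an index.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical type model: a leaf type, or a union of member types.
struct Type {
    std::string name;
    bool isbits;                // meaningful for leaves only
    std::vector<Type> members;  // non-empty means this node is a union
};

// Walk the union depth-first, numbering isbits leaves starting from 1.
// Returns the index of `target`, or 0 if it has none (the value is boxed).
static unsigned find_tag(const Type& t, const std::string& target,
                         unsigned& counter) {
    if (!t.members.empty()) {
        for (const Type& m : t.members) {
            unsigned tag = find_tag(m, target, counter);
            if (tag != 0) return tag;
        }
        return 0;
    }
    if (!t.isbits) return 0;  // assumption: non-isbits leaves take no index
    ++counter;
    return t.name == target ? counter : 0;
}

unsigned union_tag(const Type& u, const std::string& target) {
    unsigned counter = 0;
    return find_tag(u, target, counter);
}
```

For a union shaped like `Union{Int64, Union{String, Float64}, Bool}`, this numbering gives `Int64` index 1, `Float64` index 2, and `Bool` index 3, while `String` gets index 0 and is handled as a boxed object.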
The high bit of the selector (`byte & 0x80`) can be tested to determine whether the `void*` is actually a heap-allocated (`jl_value_t*`) box, which avoids the cost of re-allocating the box while maintaining the ability to handle union splitting efficiently based on the low bits.
It is guaranteed that `byte & 0x7f` is an exact test for the type: if the value can be represented by a tag, it will never be marked `byte = 0x80`. When testing `isa`, there is no need to also check the type tag.
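The two masks on the selector byte can be written out as small helpers. This is a sketch only: the helper names `type_tag`, `is_heap_box`, and `has_tag` are hypothetical and chosen for illustration; only the masks `0x7f` and `0x80` come from the text.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint8_t kTagMask = 0x7f;   // low seven bits: index of the type
constexpr std::uint8_t kBoxedBit = 0x80;  // high bit: pointer is a heap box

// Index of the selected type; 0 means "treat as an ordinary boxed object".
inline unsigned type_tag(std::uint8_t selector) {
    return selector & kTagMask;
}

// True when the pointer already refers to a heap-allocated box.
inline bool is_heap_box(std::uint8_t selector) {
    return (selector & kBoxedBit) != 0;
}

// An isa-style test on the low bits alone; the high bit does not matter.
inline bool has_tag(std::uint8_t selector, unsigned tag) {
    assert(tag >= 1 && tag <= 126);
    return type_tag(selector) == tag;
}
```

Because the low bits are an exact test, `has_tag` gives the same answer for `0x02` and `0x82`: both select index 2, and the second merely records that a box already exists.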
The `union*` memory region may be allocated at any size. The only constraint is that it must be large enough to contain the data currently specified by `selector`. It might not be large enough to hold the union of all types that could be stored there according to the associated union type field. Take appropriate care when copying.
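One way to read the copying caveat: copy only as many bytes as the currently selected member occupies, never the size of the largest member. The sketch below assumes a hypothetical two-member union (index 1 is a 1-byte value, index 2 is an 8-byte value); `size_for_tag` and `copy_union_payload` are illustrative names, not runtime functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical size table: index 1 -> 1 byte, index 2 -> 8 bytes.
inline std::size_t size_for_tag(unsigned tag) {
    switch (tag) {
        case 1: return 1;
        case 2: return 8;
        default: return 0;  // index 0 (boxed) carries no inline payload here
    }
}

// Copy only the bytes of the member named by the selector. Copying the
// largest member's size instead could read past a small source region.
inline std::size_t copy_union_payload(void* dst, std::size_t dst_size,
                                      const void* src, std::uint8_t selector) {
    std::size_t n = size_for_tag(selector & 0x7f);
    assert(n <= dst_size);
    std::memcpy(dst, src, n);
    return n;
}
```

With a 1-byte source and selector 1, exactly one byte is read, even though the destination has room for the 8-byte member.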
Representation of the specialized calling convention signature
A `jl_returninfo_t` object describes the calling convention details of any callable.
If any of the arguments or the return type of a method can be represented unboxed, and the method is not varargs, it is given an optimized calling convention signature based on its `specTypes` and `rettype` fields.
The general principles are as follows.
- Primitive types are passed in integer or floating-point registers.
- Tuples of `VecElement` types are passed in vector registers.
- Structures are passed on the stack.
- Return values are handled similarly to arguments, with a size cutoff above which they are instead returned via a hidden `sret` argument.
The overall logic is implemented by `get_specsig_function` and `deserves_sret`.
Additionally, if the return type is a union, it may be returned as a pair of values (a pointer and a tag). If the union values can be stack-allocated, sufficient space to store them is also passed as a hidden first argument. It is up to the callee whether the returned pointer points to this space, a boxed object, or even other constant memory.
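The pair-return convention for unions can be modeled in a few lines. This is a sketch under stated assumptions, not the generated code: `UnionRet`, `callee`, and `read_as_double` are hypothetical names, and the union is assumed to hold either a 64-bit integer (index 1) or a double (index 2).

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Pair returned for a union-typed result: where the data lives, and which
// member it is.
struct UnionRet {
    const void* ptr;
    std::uint8_t tag;
};

static const double kConstant = 3.5;  // stands in for constant memory

// `scratch` models the hidden first argument: caller-provided stack space.
// The callee decides whether the result lives there or somewhere else.
UnionRet callee(void* scratch, int which) {
    if (which == 0) {
        std::int64_t v = 7;
        std::memcpy(scratch, &v, sizeof v);
        return {scratch, 1};  // result stored in the caller's space
    }
    return {&kConstant, 2};   // result points at constant memory instead
}

// The caller must follow the returned pointer; it cannot assume the data
// was written to its scratch space.
double read_as_double(const UnionRet& r) {
    if (r.tag == 1) {
        std::int64_t v;
        std::memcpy(&v, r.ptr, sizeof v);
        return static_cast<double>(v);
    }
    assert(r.tag == 2);
    double d;
    std::memcpy(&d, r.ptr, sizeof d);
    return d;
}
```

The design choice this illustrates is that the scratch space is an offer, not an obligation: the callee may skip the copy entirely when the value already exists elsewhere.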