Engee documentation

isbits Union Optimizations

In Julia, the Array type holds both "bits" values and heap-allocated "boxed" values. The distinction is whether the value itself is stored inline (in the directly allocated memory of the array), or whether the array's memory is simply a collection of pointers to objects allocated elsewhere. In terms of performance, accessing values stored inline is clearly an advantage over having to follow a pointer to the actual value. The definition of "isbits" generally means any Julia type with a fixed, determinate size, meaning it has no "pointer" fields; see `?isbitstype`.
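As a minimal sketch of the distinction, the snippet below checks a few types with `isbitstype`. The `Point` and `Counter` types are hypothetical names introduced only for this illustration.

```julia
# Hypothetical immutable struct with only plain-data fields.
struct Point
    x::Float64
    y::Float64
end

# Hypothetical mutable struct: instances live on the heap and are
# referenced through pointers.
mutable struct Counter
    n::Int
end

isbitstype(Point)     # true: fixed size, no pointer fields
isbitstype(Int32)     # true
isbitstype(Counter)   # false: mutable
isbitstype(String)    # false: variable size, heap-allocated

# An array of an isbits type stores its elements inline, so its data
# takes exactly length * sizeof(element) bytes.
points = Vector{Point}(undef, 4)
sizeof(points) == 4 * sizeof(Point)   # true
```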

Julia also supports Union types, quite literally the union of a set of types. Custom Union type definitions can be extremely handy for applications that want to cut across the nominal type system (i.e. explicit subtype relationships) and define methods or functionality on these otherwise unrelated sets of types. The challenge for the compiler, however, is deciding how to represent these Union types. The naive approach (and indeed what Julia itself did before version 0.7) is to simply make a "box" and then store in the box a pointer to the actual value, similar to the previously mentioned "boxed" values. This is unfortunate, however, because there are many small primitive "bits" types (for example, UInt8, Int32, Float64, etc.) that would easily fit inline in this box without requiring any indirection to access the value. As of Julia 0.7, there are two main ways this optimization is applied: isbits Union fields in types and isbits Union arrays.
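A short sketch of a custom Union that cuts across the type hierarchy. The alias `Small` and the function `describe` are hypothetical names for this example. `Base.isbitsunion` is an internal, unexported helper, so relying on it outside of experiments is an assumption that may not hold across Julia versions.

```julia
# UInt8 <: Unsigned and Float32 <: AbstractFloat sit in different branches
# of the type hierarchy; the Union lets one method cover both.
const Small = Union{UInt8, Float32}

# Hypothetical method defined once for every member of the Union.
describe(x::Small) = "small value of $(sizeof(x)) byte(s)"

describe(0x2a)      # "small value of 1 byte(s)"
describe(1.5f0)     # "small value of 4 byte(s)"

# Internal helper: true when every member of the Union is an isbits type,
# so values can be stored inline instead of boxed.
Base.isbitsunion(Small)                   # true
Base.isbitsunion(Union{Int16, String})    # false: String is not isbits
```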

isbits Union Structs

Julia now includes an optimization in which "isbits Union" fields in types (mutable struct, struct, etc.) are stored inline. This is accomplished by determining the "inline size" of the Union type (for example, `Union{UInt8, Int16}` has a size of 2 bytes, which is the size needed for the largest Union member, `Int16`) and allocating an additional "type tag byte" (`UInt8`), whose value signals the type of the actual value stored inline in the "Union bytes". The value of the type tag byte is the index of the actual value's type in the Union type's order of types. For example, a type tag value of `0x02` for a field of type `Union{Nothing, UInt8, Int16}` indicates that an `Int16` value is stored in the 16 bits of the field in the structure's memory. A value of `0x01` indicates that a `UInt8` value is stored in the first 8 of the 16 bits of the field's memory. Finally, a value of `0x00` signals that the value `nothing` will be returned for this field, even though, as a singleton type with a single instance, it technically has a size of 0. The type tag byte for a type's Union field is stored directly after the field's computed Union memory.
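The effect on struct layout can be observed with `sizeof`, as in the rough sketch below. `MaybeSmall`, `Tagged`, and `Boxed` are hypothetical names. The exact size of `Tagged` depends on alignment padding, so the example only checks that it has room for the 2 data bytes plus the tag byte.

```julia
const MaybeSmall = Union{Nothing, UInt8, Int16}

# Field is an isbits Union: data bytes and a type tag byte are stored inline.
struct Tagged
    x::MaybeSmall
end

# Field is a Union with a non-isbits member: stored as a pointer to a box.
struct Boxed
    x::Union{Int16, String}
end

# At least 2 bytes of Union data plus 1 tag byte (padding may add more).
sizeof(Tagged) >= sizeof(Int16) + 1     # true

# The boxed field occupies exactly one pointer.
sizeof(Boxed) == sizeof(Ptr{Cvoid})     # true

# Each member of the Union round-trips through the inline field with its
# concrete type preserved.
a = Tagged(nothing)
b = Tagged(0x07)
c = Tagged(Int16(-5))
(a.x, b.x, c.x)                         # (nothing, 0x07, -5)
```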

isbits Union Memory

Julia can now also store "isbits Union" values inline in a Memory, as opposed to requiring an indirection box. The optimization is accomplished by storing an extra "type tag memory" of bytes, one byte per element, alongside the bytes of the actual data. This type tag memory serves the same function as in the type field case: its value signals the type of the actual stored Union value. The type tag memory directly follows the regular data space. Thus, the formula for accessing the type tag bytes of an isbits Union array is `a->data + a->length * a->elsize`.
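The sketch below builds an isbits Union array and then models the tag offset arithmetic from the formula above in plain Julia. It does not read the runtime's actual memory: `elsize`, `tag_region_offset`, and `tag_offset` are illustrative names, and taking the element size as the largest member size is an assumption that ignores any alignment rounding the runtime may apply. `Base.isbitsunion` and `Base.uniontypes` are internal helpers.

```julia
const U = Union{UInt8, Int16}

# Elements are stored inline; each keeps its own concrete type.
arr = U[0x01, Int16(-2), 0x03]

Base.isbitsunion(eltype(arr))          # true
(typeof(arr[1]), typeof(arr[2]))       # (UInt8, Int16)

# Model of the layout: the data region holds length * elsize bytes and the
# tag bytes start right after it.
elsize = maximum(sizeof, Base.uniontypes(U))   # 2, the size of Int16
tag_region_offset = length(arr) * elsize       # 6 bytes past the data pointer

# Byte offset of the tag for the 1-based element index i in this model.
tag_offset(i) = tag_region_offset + (i - 1)

tag_offset(1)    # 6
tag_offset(3)    # 8
```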