Arrays with custom indexes

Traditionally, arrays in Julia are indexed starting from 1, while in some other languages numbering starts from 0 or even (for example, in Fortran) you can specify an arbitrary initial index. Although the standard numbering (for example, starting from 1 in Julia) has many advantages, some algorithms are significantly simplified if indexing beyond the range of 1:size(A,d) (or 0:size(A,d)-1) is possible. To simplify such calculations, Julia supports arrays with arbitrary indexes.

This page answers the question "What needs to be done to support such arrays in your own code?" First, let’s look at the simplest case: if you know that you will never have to deal with arrays with non-standard indexing in your code, then the answer is probably nothing. The old code with traditional arrays should work without any changes if it uses the exported Julia interfaces. If it seems more convenient to simply prohibit the use of non-traditional arrays with non-unity indexing, you can add:

Base.require_one_based_indexing(arrays...)

where arrays... is a list of array objects that need to be checked for indexing violations from one.

Universalizing existing code

The general procedure is as follows:

replace size with axes;
replace 1:length(A) with eachindex(A) or in some cases with LinearIndices(A);
replace explicit memory allocations, such as Array{Int}(undef, size(B)), to similar(Array{Int}, axes(B)).

More details about this are described below.

Possible hazards

Since many people may take for granted that all arrays are indexed starting from one, there is always a chance of errors when using arrays with non-standard indexing. The most annoying thing is getting incorrect results or the program crashes (Julia crashes). For example, consider the following function.

function mycopy!(dest::AbstractVector, src::AbstractVector)
    length(dest) == length(src) || throw(DimensionMismatch("vectors must match"))
    # Теперь можно спокойно использовать @inbounds, верно? (Уже нет!)
    for i = 1:length(src)
        @inbounds dest[i] = src[i]
    end
    dest
end

This code implicitly assumes that the vectors are indexed from one; if dest does not start at the same index as src, there is a chance of an emergency termination. (If this happens, then to find out the reason, try running Julia with the --check-bounds=yes parameter.)

Using `axes` to check boundaries and iterate through the loop

The axes(A) method (similar to size(A)) returns a tuple of AbstractUnitRange' objects{<:Integer}, defining the allowed range of indexes for each dimension of the array A'. If `A is indexed in a non-standard way, the ranges may not start with one. To get the range for a specific dimension d, use the axes(A, d) call.

The Base module implements a custom range type OneTo: OneTo(n) means the same as 1:n, but guarantees (through the type system) that the subscript is 1. For any new type AbstractArray an object of this type is returned by default by the axes function. This means that this type of array uses traditional indexing — starting from one.

Please note that there are special functions called "checkbounds" and "checkindex" for checking boundaries, which can sometimes simplify this task.

Linear indexing (`LinearIndices')

Some algorithms are most conveniently (or more efficiently) implemented using linear indexes A[i], even if A is a multidimensional array. No matter what the array’s own indexes are, linear indexes always have a range of 1:length(A)'. However, in the case of one-dimensional arrays (i.e. `AbstractVector) an ambiguity arises: does v[i] mean linear indexing or Cartesian indexing with its own array indexes?

For this reason, the optimal solution may be to iterate through the array using eachindex(A) or, if the indexes must be consecutive integers, to obtain a range of indexes by calling LinearIndices(A)'. As a result, `axes(A, 1) will be returned if A is an AbstractVector, or the equivalent of 1:length(A) otherwise.

According to this definition, one-dimensional arrays always use Cartesian indexing with their own array indexes. To confirm this fact, it is worth noting that the index conversion functions return an error if the shape of the array corresponds to a one-dimensional array with non-standard indexing (i.e. Tuple{UnitRange}, not the tuple `OneTo'). For arrays with standard indexing, these functions will work as usual.

Here is one way to rewrite mycopy! using axes and `LinearIndices'.

function mycopy!(dest::AbstractVector, src::AbstractVector)
    axes(dest) == axes(src) || throw(DimensionMismatch("vectors must match"))
    for i in LinearIndices(src)
        @inbounds dest[i] = src[i]
    end
    dest
end

Allocating storage space using `similar` generalizations

Storage space is often allocated using Array'.{Int}(undef, dims) or similar(A, args...). If the result must match the indexes of some other array, this may not always be enough. A universal replacement for such templates is the use of similar(storagetype, shape)'. `storagetype means the desired type of basic, standard behavior, for example, Array{Int}, BitArray or even +dims→zeros(Float32, dims)+' (in this case, an array of only zeros is placed in memory). `shape is a tuple of values Integer or `AbstractUnitRange', which defines the indexes to be used in the result. Please note: to get an array of only zeros corresponding to the indexes of A, it is convenient to simply use `zeros(A)'.

Let’s look at a couple of illustrative examples. First, if A has standard indexes, then similar(Array{Int}, axes(A)) will result in a call to Array{Int}(undef, size(A)) and returns the array. If A is an AbstractArray type with non-standard indexing, then call similar(Array{Int}, axes(A)) should return something similar in behavior to Array{Int}, but having a form (including indexes) corresponding to A. (The most obvious implementation is to place a Array' object in memory.{Int}(undef, size(A)) followed by its encapsulation in a type with a shift in indexes.)

Also keep in mind that calling similar(Array{Int}, (axes(A, 2),))+`will place an object in memory `+AbstractVector{Int} (that is, a one-dimensional array) corresponding to the column indexes `A'.

Writing custom array types with non-unity indexing

Most of the methods that will need to be defined are standard for any type of `AbstractArray'; see the section Abstract arrays. This section describes the steps required to identify non-standard indexing.

Custom `AbstractUnitRange` types

If you are creating an array type with non-unit indexing, you will need to specialize the axes function so that it returns a UnitRange object or (perhaps even better) a custom AbstractUnitRange object. The advantage of the user-defined type is that it tells functions such as similar about the type of memory allocation. When writing an array type indexed from zero, it is most likely better to start by creating a new type AbstractUnitRange — ZeroRange, where ZeroRange(n) is equivalent to `0:n-1'.

In general, you should not export ZeroRange from a package: there may be other packages that implement their own type of ZeroRange, and having several separate types of ZeroRange, oddly enough, is an advantage. ModuleA.ZeroRange indicates that the similar function should create ModuleA.ZeroArray', and `ModuleB.ZeroRange corresponds to the type ModuleB.ZeroArray. This approach ensures the peaceful coexistence of many different user-defined array types.

Please note: in order not to write your own ZeroRange type, you can sometimes use the Julia package. https://github.com/JuliaArrays/CustomUnitRanges.jl [CustomUnitRanges.jl].

Specialization of `axes`

After creating your own AbstractUnitRange type, you need to specialize the axes function.

Base.axes(A::ZeroArray) = map(n->ZeroRange(n), A.size)

Here it is assumed that the ZeroArray has a field named `size' (this can be implemented in other ways).

In some cases, the backup definition is `axes(A, d)'.

axes(A::AbstractArray{T,N}, d) where {T,N} = d <= N ? axes(A)[d] : OneTo(1)

It may not meet your needs.: you may need to specialize it so that it returns something other than OneTo(1) for d > ndims(A). Similarly, Base has a special function axes1, which is equivalent to axes(A, 1), but ignores the check. ndims(A) > 0 (at runtime). (Its purpose is solely to increase productivity.) It is defined as follows.

axes1(A::AbstractArray{T,0}) where {T} = OneTo(1)
axes1(A::AbstractArray) = axes(A)[1]

If the first case (a one-dimensional array) poses a problem for your custom array type, implement the appropriate specialization.

Specialization `similar`

For your own type ZeroRange, you must also add the following two specializations `similar'.

function Base.similar(A::AbstractArray, T::Type, shape::Tuple{ZeroRange,Vararg{ZeroRange}})
    # тело
end

function Base.similar(f::Union{Function,DataType}, shape::Tuple{ZeroRange,Vararg{ZeroRange}})
    # тело
end

Both of them should implement your custom array type.

'reshape` specialization

If necessary, define the following method.

Base.reshape(A::AbstractArray, shape::Tuple{ZeroRange,Vararg{ZeroRange}}) = ...

This will change the shape (reshape) of the array so that it has custom indexes.

Objects that mimic the behavior of AbstractArray, but are not its subtypes

The value of the has_offset_axes method depends on whether the axes method is defined for the objects for which it is called. If for some reason the axes method is not defined for the object, you can define the following method.

Base.has_offset_axes(obj::MyNon1IndexedArraylikeObject) = true

This will allow you to identify a problem in the code that assumes indexing from unity, and give a useful error message instead of returning incorrect results or crashing Julia.

Error interception

If the new array type causes errors somewhere else in the code, you can try commenting out the @boundscheck in your implementation of getindex and setindex! for debugging. This will make sure that when accessing each element, the boundaries are checked. Alternatively, you can restart Julia with the --check-bounds=yes parameter.

In some cases, it may be useful to temporarily disable the size and length functions for a new array type, as they are often used in code with incorrect assumptions.