Engee documentation

Working with WorkspaceArray in Engee

`WorkspaceArray` is a special data type in the Engee modeling environment for storing and processing time series. It is based on Julia's `AbstractArray` interface but, unlike standard arrays, loads its data lazily, which makes it a lazy array.

A lazy array (or lazily loaded array) is a structure that does not hold all of its data in RAM but loads it as needed, for example when iterating, accessing an element by index, or converting to a table. This makes it possible to work efficiently with very large datasets that do not fit in memory or that come from external sources (for example, a database or a file).
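The idea of loading on access can be illustrated with a minimal sketch in plain Julia. The `LazyVector` type, its `loader` function, and the `loads` counter are hypothetical names introduced only for this illustration; they are not part of Engee, and `WorkspaceArray` may be implemented quite differently.

```julia
# Hypothetical lazy vector: elements are produced by `loader` on access
# and are never stored inside the struct.
struct LazyVector{T,F} <: AbstractVector{T}
    loader::F   # function i -> element, standing in for a read from a file or database
    len::Int
end
LazyVector{T}(loader::F, len::Int) where {T,F} = LazyVector{T,F}(loader, len)

Base.size(v::LazyVector) = (v.len,)
Base.IndexStyle(::Type{<:LazyVector}) = IndexLinear()
function Base.getindex(v::LazyVector{T}, i::Int) where {T}
    @boundscheck checkbounds(v, i)
    return convert(T, v.loader(i))
end

loads = Ref(0)  # counts how many elements were actually loaded
lazy = LazyVector{Float64}(i -> (loads[] += 1; 0.1 * i), 1_000_000)

third = lazy[3]                        # loads exactly one element
loads_after_index = loads[]
first_five = [lazy[i] for i in 1:5]    # loads five more elements
```

Although `lazy` reports a length of one million, only the six requested elements are ever computed.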

`WorkspaceArray` in Engee is used in the following scenarios:

Thus, `WorkspaceArray` provides a single, user-friendly interface for accessing time-series data throughout the entire simulation, from preparing input data to analyzing results.

Creating a WorkspaceArray

Usually, a `WorkspaceArray` is created automatically during simulation, for example when signals are recorded through a To Workspace block or when results are saved to the simout variable.

However, it can also be created manually, for example to load external data or to test a model on prepared inputs. In this case, a `WorkspaceArray` is created in one of the following ways:

  • From a CSV file. To load data from a CSV file, just specify the path to the file:

    my_wa = WorkspaceArray("my_wa", "/user/path_to_csv.csv")

    Here "my_wa" is the name under which the array is stored in the Engee variables window. It is used to refer to the array later in models, scripts, and other applications.

    The CSV file must contain two columns: a required column named `time` and a second column with the signal values. Example of the CSV file structure:

    time,value
    0.1,1
    0.2,2
    0.3,3

  • From a `DataFrame` table. If you are already working with tabular data in Julia, you can create a `WorkspaceArray` directly from a `DataFrame` object. This is especially useful when input signals are generated programmatically or when data is being analyzed or transformed.

    `DataFrame` is a tabular structure from the `DataFrames.jl` package, similar to a spreadsheet in Excel or a DataFrame in Python (pandas). Each column has a name and a type, and the data is arranged in rows. Example:

    using DataFrames
    
    df = DataFrame(time = [0.1, 0.2, 0.3], value = [1, 2, 3])
    wa = WorkspaceArray("my_wa", df)

    The table must contain two columns: `time` (of type `Float64`) and `value` (a scalar, a vector, or an array of numbers).
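When an input signal is generated programmatically, the two columns can be prepared as ordinary Julia vectors first. The sketch below uses only Base Julia; the 1 Hz sine wave, the sampling step, and the `columns` name are illustrative assumptions, not something Engee prescribes.

```julia
# Sampling grid: 0.0, 0.1, ..., 1.0 seconds, stored as Float64,
# which matches the type expected for the time column.
time = collect(0.0:0.1:1.0)

# An illustrative signal: a 1 Hz sine wave sampled on that grid.
value = sin.(2π .* time)

# Both columns grouped under the names the table is expected to use.
columns = (time = time, value = value)
```

In Engee, these vectors could then be passed to the constructor form shown above, for example `DataFrame(time = columns.time, value = columns.value)`.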

Working with WorkspaceArray

`WorkspaceArray` implements the `AbstractArray` interface, so it can be used like a regular Julia array: indexed, iterated, sliced, and so on. In addition to this basic array behavior, `WorkspaceArray` has an internal structure and extra fields that link it to data sources, support lazy loading, and make it convenient to work with time series.

Each `WorkspaceArray` is not just a set of numbers but an object with metadata: it knows where its data is stored, how the data relates to time, where it is loaded from, and how it was obtained. These properties can be read directly by accessing the object's fields with dot syntax:

wa = WorkspaceArray("my_wa")
wa.time    # timestamps
wa.value   # values
wa.type    # array type: :pair, :time, or :value

The following fields are available:

  • `time` is a child `WorkspaceArray` containing only the timestamps. It has the type `:time`;

  • `value` is a child `WorkspaceArray` containing the signal values. It has the type `:value`;

  • `type` is the type of the current array: `:time`, `:value`, or `:pair` (timestamps and values);

  • `parent` is a reference to the parent `WorkspaceArray` if the current one is the result of a slice;

  • `range` is the index range from which the slice was created. It is used for lazy slicing;

  • `dimension` is the dimension of the data in the `value` field (for example, scalar, vector, or matrix);

  • `signal_id` is the unique identifier of the data source in Engee.

Most often, when working with a `WorkspaceArray`, the `time` and `value` fields are used directly, especially if you need to get the timestamps and values separately or convert them to a table.

Reuse

After you create a variable containing a `WorkspaceArray` (for example, `wa`), it is automatically associated with a `signal_id` (a unique identifier) in Engee. This lets you refer to it by name and reuse it in other parts of the project: in web applications based on the Genie framework (for more information, see Working with Genie in Engee), in other models, or in nested blocks such as Subsystem:

wa = WorkspaceArray("my_wa")

This call creates a new `WorkspaceArray` that refers to the data already registered under the name "my_wa".

Use with models

The To Workspace and From Workspace blocks are designed to interact with `WorkspaceArray` in a model:

The To Workspace block writes data from the model to a variable of type `WorkspaceArray` in the Engee workspace. The variable name is set by the Variable name parameter. Only numeric data (scalars and arrays) is supported.

The From Workspace block reads data from the workspace and feeds it to the model input. Only variables of the `WorkspaceArray` type can be used.

The To Workspace and From Workspace blocks work only with the `WorkspaceArray` type.

Usage example: From/To Workspace demo.

Working with CSV

Exporting a WorkspaceArray to CSV

To export a `WorkspaceArray` to CSV, use `CSV.write` with the required parameter `delim="\t"`:

using CSV

data_frame = collect(wa)  # load the WorkspaceArray into a DataFrame first
CSV.write("/user/workspacearray_csv.csv", data_frame; delim="\t")

Without the `delim="\t"` parameter, importing the file fails with a BoundsError and an empty file is created.

Importing a WorkspaceArray from CSV

wa = WorkspaceArray("my_wa", "/user/my_data.csv")

The file should contain two columns:

  • `time` is the required column name;

  • `value` is the signal value.

Loading data into memory

Full load into a DataFrame

To load all the data from a `WorkspaceArray` into a `DataFrame`, use `collect`:

df = collect(wa) # returns a DataFrame with time and value columns

You can also get individual columns from the `WorkspaceArray`:

collect(wa.time)   # timestamps only
collect(wa.value)  # values only

Using `collect` on very large data is not recommended, because it can exhaust memory.
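One way to avoid loading everything at once is to process the data in fixed-size index ranges. The sketch below shows the pattern on an ordinary Julia vector; `chunked_mean`, `signal`, and the chunk size are hypothetical names and values chosen for illustration. With a `WorkspaceArray`, each chunk would presumably be obtained with something like `collect(wa.value[lo:hi])`, which is an assumption rather than documented behavior.

```julia
# Stand-in for a large signal; in Engee this role would be played by wa.value.
signal = collect(1.0:1_000.0)

# Mean computed over fixed-size index ranges, so that only one chunk
# needs to be held in memory at a time.
function chunked_mean(data, chunk_size)
    total = 0.0
    n = length(data)
    for lo in 1:chunk_size:n
        hi = min(lo + chunk_size - 1, n)
        chunk = data[lo:hi]   # only this range is materialized
        total += sum(chunk)
    end
    return total / n
end

mean_value = chunked_mean(signal, 128)
```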

Lazy slices

One of the key advantages of `WorkspaceArray` is its support for lazy slices.

With regular Julia arrays, a slice such as `a[1:10]` creates a new array containing a copy of the data. `WorkspaceArray` works differently: slicing creates a view that stores only the index range (`range`) and a reference to the source object (`parent`), without loading the values into memory. Data is loaded from storage only on explicit request, for example via `collect`, during iteration, or on index access.

This is especially useful for:

  • Saving memory: slices do not copy data, they only point to the required elements;

  • High performance: you work only with the pieces of data you need;

  • Flexibility: lazy slices can be used as model inputs, in visualizations, or when exporting.
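The mechanism of a lazy slice, a reference to the parent plus an index range, can be sketched in plain Julia. The `LazySlice` type below is a hypothetical stand-in written for this illustration; it only borrows the field names `parent` and `range` and does not reproduce the internals of `WorkspaceArray`.

```julia
# Hypothetical lazy slice: keeps a reference to the parent data and an
# index range, and copies nothing until elements are requested.
struct LazySlice{T,P<:AbstractVector{T},R<:AbstractVector{Int}} <: AbstractVector{T}
    parent::P
    range::R
end

Base.size(s::LazySlice) = (length(s.range),)
Base.IndexStyle(::Type{<:LazySlice}) = IndexLinear()
Base.getindex(s::LazySlice, i::Int) = s.parent[s.range[i]]

data = collect(10:10:100)
slice = LazySlice(data, 1:2:length(data))  # every second element, no copy made

second = slice[2]                # reads one element from the parent
materialized = collect(slice)    # copies the selected elements into a new Vector
```

Creating `slice` costs the same regardless of how large `data` is, because only the reference and the range are stored.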

Example:

wa_slice = wa[1:2:end] # lazy view: every second element
collect(wa_slice)      # loads the data into memory (as a DataFrame)

Result:

Row │ time     value
    │ Float64  Int64
────┼───────────────
  1 │ 0.1      1
  2 │ 0.3      3
  3 │ 0.5      5

In addition to slices, you can access a single element by index:

wa[1] # returns the first record: (0.1, 1)

Slices of the time and value fields

The `.time` and `.value` fields of a `WorkspaceArray` are available separately, and lazy slices can be applied to them as well:

wa_values = wa.value[2:end] # slice of the values only
wa_times  = wa.time[2:end]  # slice of the timestamps only

The slice `wa[1:2:end]` returns an object of type `WorkspaceArray` with the same `signal_id` but with a new `range` field describing the slice. This tells you that you are looking at a lazy view, not a copy of the data.

If necessary, the data can be loaded into memory:

wa = WorkspaceArray("my_wa")

wa_slice = wa[1:2:end]     # create a lazy slice
df = collect(wa_slice)     # load the data into memory
println(df)                # prints a table with every second record

WorkspaceArray methods and interface

`WorkspaceArray` implements standard Julia interfaces and complements them with its own methods. Below is a brief overview of the most useful functions for working with time series:

  • wa[i] returns the element at index i as a tuple (time, value):

    wa[1] # returns a tuple such as (0.1, 42.0)

  • wa[start:stop] takes a slice by range and returns a lazy view. The data is not copied but loaded as needed:

    wa_slice = wa[1:10]
    collect(wa_slice) # loads only the selected range

  • wa[1:2:end] is a lazy slice with a step.

  • wa.time, wa.value give access to the child `WorkspaceArray` objects that hold the timestamps and the values separately. Slices can be applied to them as well:

    wa_times = wa.time[5:end]
    wa_values = wa.value[5:end]

  • collect(wa) converts a `WorkspaceArray` to a `DataFrame`. This is the main way to get the data explicitly:

    df = collect(wa)

  • copy(wa) / similar(wa) create a shallow copy of the object. In the current implementation, they behave identically.

  • wa1 == wa2 compares two `WorkspaceArray` objects by `signal_id` and `range`.
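The comparison rule in the last item, equality by source identifier and index range rather than by element values, can be sketched as follows. `SignalHandle` and the identifier "sig-1" are hypothetical and exist only for this illustration; the actual types of `signal_id` and `range` in Engee are not specified here.

```julia
# Hypothetical handle that identifies data by a source id and an index range.
struct SignalHandle
    signal_id::String
    range::StepRange{Int,Int}
end

# Two handles are equal when they refer to the same source and the same indexes;
# the underlying values are never read.
Base.:(==)(a::SignalHandle, b::SignalHandle) =
    a.signal_id == b.signal_id && a.range == b.range

# Keep hashing consistent with the custom equality.
Base.hash(h::SignalHandle, seed::UInt) = hash((h.signal_id, h.range), seed)

full = SignalHandle("sig-1", 1:1:10)
odd  = SignalHandle("sig-1", 1:2:10)   # same source, different range
same = SignalHandle("sig-1", 1:1:10)

full_equals_same = full == same
full_equals_odd  = full == odd
```

Under this rule, a slice and its parent compare as different objects even though they share the same data source.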

If you work with time series in Engee, `collect`, slices (`wa[1:10]`), and `iterrows` will most likely be enough for everyday tasks. The remaining methods are useful for optimization and fine-tuning.

Usage example

Let's look at a basic example of working with a `WorkspaceArray`, from creating it to taking a slice and loading the data. It covers the most typical uses of `WorkspaceArray`.

  1. First, create a table (`DataFrame`) with two columns: `time` and `value`. This is the required format for importing data into a `WorkspaceArray`:

    using DataFrames
    
    df = DataFrame(time = [0.1, 0.2, 0.3, 0.4, 0.5],
                   value = [1, 2, 3, 4, 5])

    • `time` is the moment in time (in seconds);

    • `value` is the signal value at that moment.

  2. Convert the table to a `WorkspaceArray`, specifying the name under which the array will be available in Engee:

    wa = WorkspaceArray("myarr", df)

    Here "myarr" is the name under which the array can be used in models and scripts.

  3. `WorkspaceArray` supports index access. For example, get the first record:

    println(wa[1])

    Result:

    (0.1, 1)

    This is a tuple: the first element is the time, the second is the signal value.

    The data is loaded only when it is accessed. Even with wa[1], only one element is loaded.

  4. Take a slice of the array that contains every second element (1, 3, 5). This creates a lazy view, not a copy:

    wa_slice = wa[1:2:end]

    To get the data as a table, use collect:

    println(collect(wa_slice))

    Result:

    3×2 DataFrame
     Row │ time     value
         │ Float64  Int64
    ─────┼────────────────
       1 │     0.1      1
       2 │     0.3      3
       3 │     0.5      5

    The slice `wa[1:2:end]` does not load any data by itself. Data is actually loaded from storage only on `collect`, iteration, or index access.