Working with WorkspaceArray in Engee

WorkspaceArray is a special data type in the Engee modelling environment, designed for storing and processing time series. It is based on the AbstractArray interface of the Julia language but, unlike standard arrays, it loads its data lazily, which makes it a lazy array.

A lazy array (or lazy loading array) is a structure that does not store data entirely in RAM, but loads it as needed: for example, when iterating, accessing by index, or converting to a table. This allows you to efficiently handle very large data sets that do not fit in memory or come from external sources (e.g., a database or file).
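The idea of loading on demand can be sketched in plain Julia. The block below is a toy illustration only, not Engee's implementation: `LazyVector`, its `loader` function and the `loads` counter are hypothetical names introduced here. It implements the minimal AbstractArray interface (`size` and `getindex`) and fetches an element only when it is indexed.

```julia
# Toy lazy vector (hypothetical, for illustration; not how WorkspaceArray is built).
struct LazyVector{T,F} <: AbstractVector{T}
    loader::F                   # function i -> value; stands in for a read from storage
    len::Int                    # number of elements the source holds
    loads::Base.RefValue{Int}   # counts how many elements were actually fetched
end

# Convenience constructor: only the element type has to be given explicitly.
LazyVector{T}(loader, len) where {T} =
    LazyVector{T,typeof(loader)}(loader, len, Ref(0))

# Minimal AbstractArray interface: size, index style and scalar getindex.
Base.size(v::LazyVector) = (v.len,)
Base.IndexStyle(::Type{<:LazyVector}) = IndexLinear()

function Base.getindex(v::LazyVector{T}, i::Int) where {T}
    @boundscheck checkbounds(v, i)
    v.loads[] += 1              # record that one element was fetched
    return convert(T, v.loader(i))
end

# A million-element series that occupies almost no memory until it is read.
lazy = LazyVector{Float64}(i -> 0.1 * i, 1_000_000)

first_value = lazy[1]               # fetches exactly one element
loads_after_index = lazy.loads[]    # 1
head = collect(lazy[1:3])           # fetches three more elements
```

Because the type subtypes AbstractVector, generic functions such as `length`, iteration and range indexing work without further code, which is the same property the text attributes to WorkspaceArray.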

The WorkspaceArray in Engee is used in the following scenarios:

As the result of a simulation, to store the recorded signals, for example in the variable simout (see Software processing of simulation results in Engee for details);
To transfer data between models via the blocks To Workspace and From Workspace;
To import custom time series from CSV files or tables (DataFrame), which can then be used in Engee models.

Thus, WorkspaceArray provides a single, convenient interface for accessing time series data throughout the simulation - from input data preparation to results analysis.

Creating a WorkspaceArray

Usually WorkspaceArray is generated automatically during the simulation process - for example, when writing signals via the To Workspace block or when saving the results to the simout variable.

However, it can also be created manually - for example, to load external data or to test the model with pre-prepared inputs. In this case, WorkspaceArray is created in one of the following ways:

* From a CSV file. To load data from a CSV file, just specify the path to the file:
```
wa = WorkspaceArray("my_wa", "/user/path_to_csv.csv")
```
"my_wa" is the name under which the array will be saved in the Engee Variables window. It is used to refer to the array later in models, scripts and other applications.
The CSV file must contain two columns: one must be named time, the other must contain the signal values. Example of CSV file structure:

time    value
0.1     1
0.2     2
0.3     3

* From a DataFrame table. If you are already working with tabular data in Julia, you can create a WorkspaceArray directly from a DataFrame object. This is especially useful when generating inputs programmatically, during data analysis or transformation.

DataFrame is a table structure from the DataFrames.jl package, similar to tables in Excel or DataFrames in Python (pandas). Each column has a name and a type, and each row holds one record. Example:

using DataFrames

df = DataFrame(time = [0.1, 0.2, 0.3], value = [1, 2, 3])
wa = WorkspaceArray("my_wa", df)

The table must contain two columns: time (type Float64) and value (can be a scalar, vector or array of numbers).
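As a sketch of generating such inputs programmatically, the two columns can be built in plain Julia before they are wrapped in a table. The sine signal, the step of 0.1 s and the name "sine_in" are assumptions chosen for illustration; the last two commented lines only repeat the constructor calls shown above and are meant to be run inside Engee.

```julia
# Timestamps as Float64, as the table format requires.
t = collect(0.0:0.1:1.0)

# One scalar signal value per timestamp (a sine wave, chosen arbitrarily).
v = sin.(2π .* t)

# Both columns must have the same length before they go into a table.
@assert length(t) == length(v)

# Inside Engee, with DataFrames loaded, the columns could then be wrapped:
# df = DataFrame(time = t, value = v)
# wa = WorkspaceArray("sine_in", df)   # "sine_in" is a hypothetical name
```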

Working with WorkspaceArray

WorkspaceArray implements the AbstractArray interface, which allows you to treat it like a normal array in Julia - index, iterate, slice, etc. However, in addition to the basic array behaviour, WorkspaceArray contains an internal structure and additional fields that provide its connection to data sources, support lazy loading and allow convenient work with time series.

Each WorkspaceArray is not just a set of numbers, but an object with "metadata": it knows where its data is stored, how it is related to time, where it is loaded from, and how it was obtained. These additional properties can be queried directly by accessing the object's fields with dot syntax:

wa = WorkspaceArray("my_wa")
wa.time    # access the timestamps
wa.value   # access the values
wa.type    # find out the array type: :pair, :time or :value

The following fields are available:

time - a child WorkspaceArray containing only timestamps. It has the :time type;
value - child WorkspaceArray containing signal values. It has type :value;
type - type of the current array: :time, :value or :pair (timestamps + values);
parent - reference to the parent WorkspaceArray if the current one is the result of a slice;
range - the index range on the basis of which the slice was created. It is used in case of lazy slicing;
dimension - dimension of the data in the value field (for example, scalar, vector, matrix);
signal_id - unique identifier of the data source in Engee.

Most often when working with WorkspaceArray the time and value fields are used directly, especially when it is necessary to get data separately or convert them into a table.

Repeated usage

Once created, a variable containing WorkspaceArray (e.g. named wa) is automatically linked to the signal_id (unique identifier) in Engee. This allows it to be referenced by name and reused in other parts of the project - for example, in web applications based on the Genie framework (see Working with Genie in Engee), in other models, or in nested blocks, e.g. Subsystem:

wa = WorkspaceArray("my_wa")

This call creates a new WorkspaceArray object that references the data already registered under the name "my_wa".

Usage with models

The To Workspace and From Workspace blocks are designed to interact with WorkspaceArray in a model:

The To Workspace block writes data from the model to a variable of type WorkspaceArray in the Engee workspace. The variable name is specified by the Variable name parameter. Only numeric data (scalars and arrays) is supported.

The From Workspace block reads data from the workspace and feeds it to the model input. Only variables of WorkspaceArray type are used.

The To Workspace and From Workspace blocks work only with the WorkspaceArray type.

Example usage: From/To Workspace demo.

Working with CSV

Export WorkspaceArray to CSV

To export WorkspaceArray to CSV, use CSV.write with the mandatory parameter delim="\t":

using CSV
data_frame = collect(wa)  # DataFrame with the time and value columns
CSV.write("/user/workspacearray_csv.csv", data_frame; delim="\t")  # tab delimiter is mandatory

Omitting the delim="\t" parameter causes a BoundsError on import and produces an empty file.

Importing WorkspaceArray from CSV

wa = WorkspaceArray("my_wa", "/user/my_data.csv")

The file must contain two columns:

time - mandatory column name;
value - the signal values.
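A file in this layout can also be produced without any packages. The sketch below writes the header and three rows with a tab between the columns; the tab delimiter is an assumption carried over from the export requirement above, and the temporary path is used only so that the example runs anywhere.

```julia
# Sample records: (time, value) pairs.
rows = [(0.1, 1), (0.2, 2), (0.3, 3)]

# Temporary location for the demo; in Engee a path such as /user/my_data.csv would be used.
path = joinpath(mktempdir(), "my_data.csv")

open(path, "w") do io
    # Header row: the first column must be named `time`.
    println(io, join(("time", "value"), '\t'))
    # One line per record, columns separated by a tab (assumed delimiter).
    for (t, v) in rows
        println(io, join((t, v), '\t'))
    end
end

# Read the file back to check its structure.
lines = readlines(path)
```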

Data unloading

Full unloading to DataFrame

To completely unload data from a WorkspaceArray into a DataFrame, use collect:

df = collect(wa) # returns a DataFrame with the time and value columns

You can also get individual columns from WorkspaceArray:

collect(wa.time)   # timestamps only
collect(wa.value)  # values only

Avoid calling collect on very large data sets: loading everything at once may exhaust the available memory.
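One way to stay within memory limits is to process the series range by range. The sketch below is written against any AbstractVector, so a plain vector stands in for the data here; `chunked_mean` and `chunk_size` are hypothetical names, and passing `wa.value` instead assumes that collect on a slice of values yields a numeric collection.

```julia
# Hypothetical helper: compute a mean chunk by chunk instead of loading everything.
function chunked_mean(values::AbstractVector, chunk_size::Integer)
    total = 0.0
    count = 0
    for start in firstindex(values):chunk_size:lastindex(values)
        stop = min(start + chunk_size - 1, lastindex(values))
        chunk = collect(values[start:stop])   # only this range is held in memory
        total += sum(chunk)
        count += length(chunk)
    end
    return total / count
end

# A plain vector stands in for the time series in this demo.
demo = collect(1.0:10.0)
result = chunked_mean(demo, 4)   # chunks 1:4, 5:8 and 9:10
```

The chunk size trades memory for the number of reads: larger chunks mean fewer accesses to storage but a larger temporary array.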

Lazy slices

One of the key advantages of WorkspaceArray is its support for lazy slices.

Unlike regular Julia arrays, where a slice (wa[1:10]) creates a new array with a copy of the data, WorkspaceArray works differently. A slice creates a view that stores only the index range (range) and a reference to the original object (parent) - without actually loading the values into memory. Data is loaded from storage only when explicitly queried, e.g. via collect, iteration or index access.

This is particularly useful for:

Memory savings - slices do not copy data, but only point to the required items;
Fast performance - you can work with only the necessary data slices;
Flexibility - lazy slices can be used as model inputs, in visualisations or exports.
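The difference between a copying slice and a view can be seen with ordinary Julia arrays. This sketch uses only Base Julia (`@view`, `parent`, `parentindices`) as an analogy; it does not show WorkspaceArray internals, although the roles of `parent` and the index range correspond to the parent and range fields described above.

```julia
a = collect(1:10)

copied = a[1:2:end]          # ordinary slice: allocates a new Vector with copied data
viewed = @view a[1:2:end]    # view: stores only the parent and the index range

a[1] = 100                   # change the parent after slicing

copy_first = copied[1]       # the copy is independent of the parent
view_first = viewed[1]       # the view reads through to the parent

same_parent = parent(viewed) === a          # the view keeps a reference, not a copy
view_range  = parentindices(viewed)[1]      # the index range the view was built from
```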

Example:

wa_slice = wa[1:2:end] # lazy view: every second element
collect(wa_slice)      # load the data into memory (as a DataFrame)

Result:

Row │ time     value
    │ Float64  Int64
────┼───────────────
  1 │ 0.1      1
  2 │ 0.3      3
  3 │ 0.5      5

In addition to slices, you can access an individual element by index:

wa[1] # access the first record; returns (0.1, 1)

Slices by `time` and `value` fields

Separate .time and .value fields are available in WorkspaceArray, to which you can also apply lazy slices:

wa_values = wa.value[2:end] # slice of the values only
wa_times  = wa.time[2:end]  # slice of the timestamps only

A wa[1:2:end] slice returns an object of type WorkspaceArray with the same signal_id, but with a new range field that describes the slice. This makes it possible to tell that you are working with a lazy view and not a copy of the data.

If necessary, the data can be loaded into memory:

wa = WorkspaceArray("my_wa")

wa_slice = wa[1:2:end]     # create a lazy slice
df = collect(wa_slice)     # load the data
println(df)                # print the table with the sliced values

WorkspaceArray methods and interface

WorkspaceArray implements the standard Julia interfaces and augments them with its own methods. Below is a brief overview of the most useful functions that can be used when working with time series:

wa[i] - access to element by index, returns tuple (time, value):
```
wa[1] # returns (0.1, 42.0)
```

wa[start:stop] - slice by range, returns a lazy view. Data is not copied, but loaded as needed:

wa_slice = wa[1:10]
collect(wa_slice) # loads only the selected range

wa[1:2:end] - lazy slice with a step.
wa.time, wa.value - access to child WorkspaceArrays containing time and values separately. Slices can be applied to them as well:
```
wa_times = wa.time[5:end]
wa_values = wa.value[5:end]
```
collect(wa) - converts a WorkspaceArray into a DataFrame. This is the main way to get the data explicitly:
```
df = collect(wa)
```

copy(wa) / similar(wa) - create a shallow copy of the object. In the current implementation they work the same way.
wa1 == wa2 - comparison of two WorkspaceArrays by signal_id and range.

If you work with time series in Engee, then collect, slices (wa[1:10]) and iterrows are probably enough for everyday tasks. The rest of the methods will be useful for optimising and fine-tuning behaviour.

Example usage

Let’s look at a basic example of working with WorkspaceArray - from creation to getting a slice and unloading data. This walkthrough covers the most typical usage scenarios of WorkspaceArray.

First, create a table (DataFrame) containing two columns: time and value. This is the mandatory format for importing data into WorkspaceArray:
```
using DataFrames

df = DataFrame(time = [0.1, 0.2, 0.3, 0.4, 0.5],
               value = [1, 2, 3, 4, 5])
```
- time - the moment of time (in seconds);
- value - value of the signal at this moment.
Convert the table to a WorkspaceArray, specifying the name under which the array will be available in Engee:
```
wa = WorkspaceArray("myarr", df)
```
Here "myarr" is the name of a variable to work with in models and scripts.
WorkspaceArray supports access by index. For example, get the first record:
```
println(wa[1])
```
Result:
```
(0.1, 1)
```
This is a tuple: the first element is time, the second element is the value of the signal.

The data is loaded only when it is accessed. That is, even with wa[1] only one element is loaded.
Make a slice of the array, get every other element (1, 3, 5). This will create a lazy representation, not a copy:
```
wa_slice = wa[1:2:end]
```
To get the data as a table, use collect:
```
println(collect(wa_slice))
```
Result:
```
3×2 DataFrame
 Row │ time     value
     │ Float64  Int64
─────┼────────────────
   1 │     0.1      1
   2 │     0.3      3
   3 │     0.5      5
```
The wa[1:2:end] slice does not load data by itself. The data is actually loaded from storage only on collect, iteration or index access.
