
Python neural networks and their integration with Engee models

In this demonstration, we will look at an example of training neural networks using the sklearn package.

To work with Python neural networks in Engee, we use the PyCall package and its Python-call commands.

First, let's install the sklearn and PyPlot libraries in Engee.

In [ ]:
import Pkg
# Pkg.add("ScikitLearn")
Pkg.add("PyPlot")

using PyCall
using PyPlot

# Import the required modules from Python
@pyimport sklearn.neural_network as nn
@pyimport sklearn.tree as tree
@pyimport numpy as np
   Resolving package versions...
  No Changes to `~/.project/Project.toml`
  No Changes to `~/.project/Manifest.toml`

Generating training data from a model in Engee

This model generates a vector of two sinusoidal values that together represent a single signal for the classifier. Class labels are assigned according to the amplitude range into which the output signal falls. The figure below shows the top level of the model.

image_2.png

In the figure below, you can see the data generation process.

image_2.png

The following shows how the class label is formed over three amplitude ranges (see the code sketch after the figure below):

  1. less than -0.7;
  2. greater than -0.7 and less than 0.7;
  3. greater than 0.7.

image.png
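
To make this rule concrete, here is a minimal Julia sketch of the thresholding logic; the class numbering and the test signal are illustrative assumptions, not taken from the model itself.

# Hypothetical reconstruction of the labelling rule (class numbers are illustrative)
label(s) = s < -0.7 ? 1 :   # range 1: less than -0.7
           s <= 0.7 ? 2 :   # range 2: between -0.7 and 0.7
                      3     # range 3: greater than 0.7

# Example: label a short sinusoidal fragment
s = sin.(0:0.1:2π)
labels = label.(s)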

Let's move on to data generation: we define an auxiliary function for running the model, run the model, and see what data has been logged to simout.

In [ ]:
function run_model(name_model)
    Path = (@__DIR__) * "/" * name_model * ".engee"

    if name_model in [m.name for m in engee.get_all_models()] # Check whether the model is already loaded into the kernel
        model = engee.open( name_model ) # Open the model
        model_output = engee.run( model, verbose=true ); # Run the model
    else
        model = engee.load( Path, force=true ) # Load the model
        model_output = engee.run( model, verbose=true ); # Run the model
        engee.close( name_model, force=true ); # Close the model
    end

    return model_output
end

run_model("PyDataGen")
sleep(1)
collect(simout)
Building...
Progress 0%
Progress 100%
Progress 100%
Out[0]:
2-element Vector{WorkspaceArray}:
 WorkspaceArray{Int64}("PyDataGen/target")
 WorkspaceArray{Vector{Float64}}("PyDataGen/Data")

Next, we extract the data from simout and convert it to numpy format so that it can be fed to the Python neural networks.

In [ ]:
target = simout["PyDataGen/target"];  # class labels logged by the model
target = collect(target);            # materialise the WorkspaceArray
target = np.array(target.value);

Data = simout["PyDataGen/Data"];      # two-channel signal logged by the model
Data = collect(Data);
Data = np.array(Data.value);

Multilayer perceptron

The first network we consider in this demonstration is a multilayer perceptron classifier that predicts group membership based on oscillation amplitude.

A multilayer perceptron (MLP) is a feed-forward artificial neural network that maps sets of input data to a set of corresponding outputs. An MLP consists of several layers, each fully connected to the next. The nodes of the layers are neurons with nonlinear activation functions, except for the nodes of the input layer. There may be one or more nonlinear hidden layers between the input layer and the output layer.

The figure below shows an MLP with one hidden layer with scalar output.

image.png
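
In formula form, such a network computes the following (the standard single-hidden-layer MLP definition; the notation is ours, not taken from the figure):

$$f(x) = w_2^{\top}\, g\left(W_1 x + b_1\right) + b_2,$$

where $W_1$ and $b_1$ are the hidden-layer weights and biases, $g$ is the nonlinear activation function (e.g. the logistic sigmoid or ReLU), and $w_2$, $b_2$ map the hidden activations to the scalar output.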

Let's move on to initialising and training the neural network in Python. MLPClassifier has several parameters to configure:

  1. hidden_layer_sizes - the sizes of the hidden layers;
  2. max_iter - the maximum allowed number of training iterations;
  3. alpha - the strength of the L2 regularisation term;
  4. solver - the solver that defines the algorithm for optimising the weights on the nodes;
  5. verbose - whether to output additional progress information;
  6. random_state - a seed to control random number generation;
  7. learning_rate_init - the initial learning rate.

In [ ]:
clf = nn.MLPClassifier(hidden_layer_sizes=(100,), max_iter=10, alpha=1e-4, solver="sgd", verbose=1, random_state=1, learning_rate_init=0.1)
clf[:fit](Data, target)
/home/engee/micromamba/lib/python3.11/site-packages/sklearn/neural_network/_multilayer_perceptron.py:691: ConvergenceWarning: Stochastic Optimizer: Maximum iterations (10) reached and the optimization hasn't converged yet.
  warnings.warn(
Out[0]:
MLPClassifier(learning_rate_init=0.1, max_iter=10, random_state=1, solver='sgd',
              verbose=1)

As the warning shows, the maximum number of iterations was reached and the optimisation did not converge, so we need to increase the iteration limit. Let's repeat the initialisation and training of the neural network with the new parameters.

In [ ]:
clf = nn.MLPClassifier(hidden_layer_sizes=(100,), max_iter=140, alpha=1e-4, solver="sgd", verbose=1, random_state=1, learning_rate_init=0.1)
clf[:fit](Data, target)
Out[0]:
MLPClassifier(learning_rate_init=0.1, max_iter=140, random_state=1,
              solver='sgd', verbose=1)

Now let's evaluate the quality of the model. As we can see, the neural network classifies the training data with high accuracy.

In [ ]:
accuracy = clf[:score](Data, target)
println("Accuracy: $accuracy")
Accuracy: 0.995004995004995
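
Note that this score is computed on the same data the network was trained on. As a fairer check, one could hold out part of the data for testing; a minimal sketch, assuming sklearn.model_selection is available in the same Python environment:

@pyimport sklearn.model_selection as ms

# Hold out 20% of the samples for testing (split parameters are illustrative)
X_train, X_test, y_train, y_test = ms.train_test_split(Data, target, test_size=0.2, random_state=1)

clf[:fit](X_train, y_train)  # retrain on the training part only
println("Test accuracy: ", clf[:score](X_test, y_test))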

Decision tree

Let's consider an example of training a classifier with a decision tree structure. This is a non-parametric supervised learning method used for classification and regression.

The method builds a model that predicts the value of a target variable by learning simple decision rules inferred from the data features.

The tree can be viewed as a piecewise constant approximation. We start by declaring and training the decision tree.

In [ ]:
dt_clf = tree.DecisionTreeClassifier()
dt_clf.fit(Data, target)
Out[0]:
DecisionTreeClassifier()

Let's plot the decision tree, increasing the figure size and DPI to make it easier to read.

In [ ]:
plt.figure(figsize=(13, 4), dpi=200)
tree.plot_tree(dt_clf, filled=true)
Out[0]:
7-element Vector{PyObject}:
 PyObject Text(0.6, 0.8333333333333334, 'x[1] <= 0.351\ngini = 0.573\nsamples = 1001\nvalue = [462, 83, 456]')
 PyObject Text(0.4, 0.5, 'x[0] <= -0.35\ngini = 0.261\nsamples = 539\nvalue = [0, 83, 456]')
 PyObject Text(0.5, 0.6666666666666667, 'True  ')
 PyObject Text(0.2, 0.16666666666666666, 'gini = 0.0\nsamples = 456\nvalue = [0, 0, 456]')
 PyObject Text(0.6, 0.16666666666666666, 'gini = 0.0\nsamples = 83\nvalue = [0, 83, 0]')
 PyObject Text(0.8, 0.5, 'gini = 0.0\nsamples = 462\nvalue = [462, 0, 0]')
 PyObject Text(0.7, 0.6666666666666667, '  False')
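
The learned split thresholds are visible directly in the text output above. A trained classifier can also be queried on new points; a small sketch, where the sample values are made up for illustration:

# Predict classes for three hypothetical two-channel samples
samples = [0.9 0.9; 0.0 0.0; -0.9 -0.9]
println(dt_clf.predict(samples))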

Let's create a new plot to display the classification result.

In [ ]:
plt.figure(figsize=(18, 6), dpi=80)  # increase the figure size and pixel density
plt.scatter(Data[:, 1], Data[:, 2], c=target, cmap=plt.cm.Paired, edgecolors="k")
plt.title("Neural Network")
Out[0]:
PyObject Text(0.5, 1.0, 'Neural Network')

As the resulting plot shows, all the generated values fall into three separate groups according to the sum of the input data.

Conclusion

Using concrete examples, we have shown how neural networks can be integrated into Engee and how model-based design can be combined with Python functionality.