Parameter estimation

If we think about creating models of a control object in terms of the data we have about the model, then there are often 3 approaches:
A white box model is a model in which we know all the basic dynamics of the system.
On the contrary, the black box model implies that we don't know the dynamics at all.
This example examines the intermediate gray box approach, when we know the structure of the model, but we lack some parameters.
The missing parameters can be selected based on experimental data. This is a rather complex algorithm, but thanks to the collaboration of scripts and models, even such a difficult task can be implemented in the code.

Loading data from a table

using DataFrames, CSV;

Import_data = "/user/start/examples/controls/parameter_estimation_webinar/parameter_estimation.csv" # @param {type:"filepicker"}
table = CSV.read(Import_data, DataFrame);
print(".csv file imported")
Показать_таблицу = true # @param {type:"boolean"}
if to show a table
    display(table)
end

.csv file imported

Столбец_времени = 1 # @param {type:"integer"}
Столбец_входа = 2 # @param {type:"integer"}
Столбец_выхода = 3 # @param {type:"integer"}
Начинать_со_строки = 1 # @param {type:"integer"}
exp_time_data = table[Start with_strings:end, Time Column_]; 
exp_in_data = table[Start with_strings:end, Entry Column_];
exp_out_data = table[Start with_strings:end, Exit Column_];

df_in = DataFrame(time=exp_time_data, value=exp_in_data)
df_in = Float64.(df_in);
wa_in = WorkspaceArray(string(rand()), df_in)

WorkspaceArray{Float64}("0.776689036071537")

In this example, the control object is a DC motor. The model contains all the known parameters from the engine documentation. But the friction is not given in the documentation, so now there are approximate values in the blocks of coefficients of viscous friction and sliding friction.

The next cell runs the model for simulation to make sure that the results of the model will be far from the experimental data.

Comparison of experimental data and model data

model_name = "parameter_estimation" # @param {type:"string"}
if model_name in [m.name for m in engee.get_all_models()]
    m = engee.open( model_name ) # loading the model
else
    m = engee.load( model_name * ".engee" )
end
logout = engee.run(m, verbose=false)
input = plot(exp_time_data, exp_in_data, title = "Entrance", xlabel="t, c", label = "experiment", legend=true)
output = plot(exp_time_data, exp_out_data, title = "Exit", xlabel="t, c", label = "experiment", legend=true)
plot!(logout["w"].time, logout["w"].value, label = "simulation", linewidth=2)
plot(input, output, layout = (2, 1))

Connecting libraries for evaluation

using DataFrames
try
    using NLopt
catch
    Pkg.add(["NLopt"])
    using NLopt
end
try
    using Interpolations
catch
    Pkg.add(["Interpolations"])
    using Interpolations
end

┌ Warning: attempting to remove probably stale pidfile
│   path = "/user/.packages/compiled/v1.12/NLopt/faRdv_P8JK9.ji.pidfile"
└ @ FileWatching.Pidfile /usr/local/julia/share/julia/stdlib/v1.12/FileWatching/src/pidfile.jl:247
[ Info: Precompiling NLopt [76087f3c-5699-56af-9a33-bf431cd00edd] (cache misses: wrong julia version (2))

SYSTEM: caught exception of type :MethodError while trying to print a failed Task notice; giving up
GKS: cannot open display - headless operation mode active
Warning: detected a stack overflow; program state may be corrupted, so further execution might be unreliable.
   Resolving package versions...
   Installed LineSearch ─────────────── v0.1.7
   Installed DiffEqCallbacks ────────── v4.14.0
   Installed ControlSystems ─────────── v1.15.2
   Installed DifferentiationInterface ─ v0.7.16
    Updating `~/.project/Project.toml`
⌅ [a98d9a8b] + Interpolations v0.15.1
    Updating `~/.project/Manifest.toml`
  [13072b0f] + AxisAlgorithms v1.1.0
⌅ [a98d9a8b] + Interpolations v0.15.1
  [6fe1bfb0] + OffsetArrays v1.17.0
  [c84ed2f1] + Ratios v0.4.5
  [efce3f68] + WoodburyMatrices v1.1.0
        Info Packages marked with ⌅ have new versions available but compatibility constraints restrict them from upgrading. To see why use `status --outdated -m`

# @markdown # Estimation of model parameters
# @markdown ## To start evaluating the model parameters, enter:
# @markdown ### Model name:
# Opening the model
model_name = "parameter_estimation" # @param {type:"string"}
if model_name in [m.name for m in engee.get_all_models()] # Checking the condition for loading a model into the kernel
    model = engee.open(model_name) # Open the model
    model = engee.gcm() 
else
    @error "The $model_name model is not open."
end
# @markdown ### Login (FromWorkspace type block):
input_name = "From the workspace" # @param {type:"string"}
input_path = engee.find_system(model; depth=0, blockparams=["BlockType" => "FromWorkspace","BlockName"=>input_name])
# @markdown ### Input value:
input_data = [exp_time_data exp_in_data] # @param {type:"raw"}
typeof(input_data) === Matrix{Float64} || throw(ArgumentError("The input signal must be a matrix containing Float64 numbers."))
size(input_data,2) == 2 || throw(ArgumentError("The input signal should have 2 columns - time and signal values."))

isempty(engee.find_system(model; depth=0, blockparams=["BlockType" => "FromWorkspace","BlockName"=>input_name])) && throw(ArgumentError("Блока $input_name не существует в корневой системе модели $(model.name)."))
pe_udata_workspace = WorkspaceArray("pe_udata_workspace", DataFrame(time = input_data[:,1], value = input_data[:,2]));
engee.set_param!("$(model.name)/$input_name", "VariableName" =>"pe_udata_workspace") 

# @markdown ### Output (logable signal):
output_name = "w" # @param {type:"string"}
output_name in keys(engee.run(model, verbose=false)) || throw(ArgumentError("Сигнал $output_name не найден среди логируемых сигналов."))

# @markdown ### The value of the output signal:
output_data = [exp_time_data exp_out_data] # @param {type:"raw"}
typeof(output_data) === Matrix{Float64} || throw(ArgumentError("The output signal must be a matrix containing Float64 numbers."))
size(output_data,2) == 2 || throw(ArgumentError("The output signal should have 2 columns - time and signal values."))

# @markdown ### Names of evaluated parameters:
# Creating a vector of variables, their initial values and constraints
p_names = ["b","c"] # @param {type:"raw"}
p_names isa Vector{String} || throw(ArgumentError("The names of the parameters must be entered as a vector of strings."))
parametrs_number = length(p_names)
parametrs_number>0 || throw(ArgumentError("The number of parameters must be greater than zero."))

# @markdown ### Initial parameter values:
p_0 = [0.01, 0.7] # @param {type:"raw"}
p_0 isa Vector{Float64} || throw(ArgumentError("The initial values of the parmeters must be a vector containing Float64 numbers."))
parametrs_number==length(p_0) || throw(ArgumentError("The number of initial values and parmeters must match."))

# @markdown ### Lower bound of parameter values:
p_lower_bounds = [0.0, 0.0] # @param {type:"raw"}
p_lower_bounds isa Vector{Float64} || throw(ArgumentError("The lower bound of the parameter values should be a vector containing Float64 numbers."))
parametrs_number==length(p_lower_bounds) || throw(ArgumentError("The number of values of the lower limit and the parmeters must match."))

# @markdown ### Upper bound of parameter values:
p_upper_bounds = [1.0, 10.0] # @param {type:"raw"}
p_upper_bounds isa Vector{Float64} || throw(ArgumentError("The upper bound of the parameter values should be a vector containing Float64 numbers."))
parametrs_number==length(p_0) || throw(ArgumentError("The number of initial values and parmeters must match."))

# @markdown ## Optimization Settings:
# @markdown ### Optimization algorithm (the "G" at the beginning of the algorithm name indicates that it is global, the "L" is local)
algoritm = :G_MLSL # @param [:GN_DIRECT,:GN_DIRECT_L,:GN_DIRECT_L_RAND,:GN_CRS2_LM,:G_MLSL,:GN_AGS,:GN_ISRES,:GN_ESCH,:LN_COBYLA,:LN_BOBYQA,:LN_PRAXIS,:LN_NELDERMEAD,:LN_SBPLX] {type:"raw"}
# @markdown ### Auxiliary algorithm (only for :G_MLSL)
local_algoritm = :LN_COBYLA # @param [:LN_COBYLA,:LN_BOBYQA,:LN_PRAXIS,:LN_NELDERMEAD,:LN_SBPLX] {type:"raw"}
# Configuring optimization parameters
# @markdown ### Minimum relative step change
xtol_rel = 1.0e-1 # @param {type:"number"}
xtol_rel>0 || throw(ArgumentError("The minimum relative step change should be positive."))
# @markdown ### Minimum absolute step change
xtol_abs = 1.0e-1 # @param {type:"number"}
xtol_rel>0 || throw(ArgumentError("The minimum absolute step change should be more than positive."))
# @markdown ### Minimal relative change in objective function
ftol_rel = 1.0e-1 # @param {type:"number"}
xtol_rel>0 || throw(ArgumentError("The minimum relative change in the objective function should be positive."))
# @markdown ### Minimum absolute change in objective function
ftol_abs = 1.0e-1 # @param {type:"number"}
xtol_rel>0 || throw(ArgumentError("The minimum absolute change in the objective function should be positive."))
# @markdown ### Maximum number of estimates of the objective function
maxeval = 30 # @param {type:"integer"}
xtol_rel>0 || throw(ArgumentError("The maximum number of estimates of the objective function should be greater than zero."))

# @markdown ## Displaying parameter evaluation results
# @markdown ### Graph of measured data and simulation results:
compare = true # @param {type:"boolean"}
# @markdown ### Error graph between measured data and simulation results:
reside = true # @param {type:"boolean"}
# @markdown ### Graph of parameter estimation values:
variables = true # @param {type:"boolean"}
# @markdown ### Graph of objective function values:
objective = true # @param {type:"boolean"}

# Graph comparing the output of the model and the measured data
function compare_plot(model, output_name::String, tdata::Vector{Float64}, udata::Vector{Float64}, ydata::Vector{Float64})
    logout = engee.run(model, verbose=false)
    out_plot = plot(
        logout[output_name].time, 
        logout[output_name].value, 
        label="simulation", 
        title = "Exit", 
        xlabel="t, with", 
        legend=:bottomright,
        linewidth=3
        )
    plot!(out_plot, tdata, ydata, label="experiment")
    in_plot = plot(tdata, udata, title = "Entrance", xlabel="t, c", legend = false)
    p = plot(out_plot,in_plot,layout = (2, 1), plot_title="Experiment and simulation")
end

# schedule of discrepancies
function residual_plot(model, output_name::String, tdata::Vector{Float64}, ydata::Vector{Float64})
    logout = engee.run(model, verbose=false)
    nearest = interpolate((logout[output_name].time,), logout[output_name].value, Gridded(Constant()))
    residals = ydata .- nearest.(tdata)
    reside = plot(tdata, residals, title = "Mistakes", xlabel="t, with", legend = false)
end

# Schedule of change of variables
function variable_plot(p_names::Vector{String}, parametrs_values)
    p_trace = plot(title = "Parameter values", xlabel="Number of estimates of the objective function",)
    parametrs_values_m = mapreduce(permutedims, vcat, parametrs_values)
    for i in 1:n
        plot!(p_trace, 1:length(parametrs_values_m[:,i]), parametrs_values_m[:,i], markers=:auto, label = p_names[i])
    end
    return p_trace
end

n = parametrs_number

# Vectors for logging the values of the objective function and variables
loss_function_vector = Vector{Float64}()
parametrs_values = Vector{Vector{Float64}}()

# loss function
function myfunc(x::Vector, grad::Vector, output_name::String, measure_time, measure_value, parametrs_name...)
    str = ""
    iterations_value = Vector{Float64}()
    for (value,p) in zip(x, parametrs_name)
        global parametr = Symbol(p)
        @eval(($parametr)=$(value))
        str = str*"$p = $value, "
        push!(iterations_value, value)
    end
    push!(parametrs_values, iterations_value)
    simulation_out = engee.run(model, verbose=false) |> x->collect(x[output_name])
    if measure_time[end]>simulation_out.time[end]  
        push!(simulation_out,[measure_time[end], simulation_out.time[end]])
    end
    nearest = interpolate(
        (simulation_out.time,), 
        simulation_out.value, 
        Gridded(Constant())
    )
    l = sum(abs2, measure_value .- nearest.(measure_time))
    str = "$(length(parametrs_values)). "*str*"loss = $l"
    println(str)
    push!(loss_function_vector, l)
    return l
end


opt = Opt(algoritm, n) # a gradient-free algorithm
algoritm === :G_MLSL && NLopt.local_optimizer!(opt, Opt(local_algoritm, n))
opt.lower_bounds = p_lower_bounds
opt.upper_bounds = p_upper_bounds
opt.xtol_rel = xtol_rel
opt.xtol_abs = xtol_abs
opt.ftol_rel = ftol_rel
opt.ftol_abs = ftol_rel
opt.maxeval =  maxeval

opt.min_objective = (x,g) -> myfunc(x,g,output_name,output_data[:,1],output_data[:,2],p_names...)

(minf,minx,ret) = optimize(opt,p_0)
numevals = opt.numevals 

str = "\Evaluation results:\n"
for (name, value) in zip(p_names, minx) 
    str = str*"$name = $value\n"
end
str = str*"\n Target function value = $minf.\n\Optimization is completed after $numevals of estimates of the target function with the status: $ret.\n"
print(str)

compare && display(compare_plot(model, output_name, input_data[:,1], input_data[:,2], output_data[:,2]))
println("")
reside && display(residual_plot(model, output_name, output_data[:,1], output_data[:,2]))
println("")
objective &&
    display(
        plot(1:numevals, loss_function_vector, title = "Target function", xlabel="Number of estimates of the objective function", legend = false, markers = :circle)
    )
println("")   
variables && display(variable_plot(p_names, parametrs_values))
println("")

1. b = 0.01, c = 0.7, loss = 782827.2305232552
2. b = 0.018087503815033434, c = 1.6015345394854297, loss = 10751.151224272222
3. b = 0.866082233489766, c = 4.0722066078109895, loss = 2.2261803894698005e6

Warning: detected a stack overflow; program state may be corrupted, so further execution might be unreliable.

4. b = 0.3162940889231156, c = 2.436950692944041, loss = 1.7393234184763955e6
5. b = 0.6520248188926385, c = 0.5744586910483873, loss = 1.7806868536939798e6
6. b = 0.018087503815033434, c = 1.6015345394854297, loss = 10751.151224272222
7. b = 0.03165313167630851, c = 1.6015345394854297, loss = 41093.85357200353
8. b = 0.018087503815033434, c = 2.802685444099502, loss = 256788.70855000004
9. b = 0.016427095185229786, c = 0.4094150027720766, loss = 590410.1647736904
10. b = 0.01725729950013161, c = 1.005474771128753, loss = 185323.31124133876
11. b = 0.017799113771523295, c = 1.9007345996891476, loss = 2339.5281199926985
12. b = 0.016876309227326317, c = 1.8956857264241798, loss = 776.3044984104805
13. b = 0.014777143064714567, c = 2.1315370069861075, loss = 8095.828968167254
14. b = 0.018339508283451707, c = 1.971568186106855, loss = 8960.157076689598
15. b = 0.016463902154722154, c = 1.8300932490469881, loss = 1265.681794168338
16. b = 0.016159324324936164, c = 1.9357545861408532, loss = 856.9200125547208
17. b = 0.017157302475527644, c = 1.9237913353787315, loss = 1980.0248528142763
18. b = 0.016769126425534724, c = 1.8794940554039552, loss = 522.5866018855496
19. b = 0.016670572976754266, c = 1.8628781264788314, loss = 517.7273584417683
20. b = 0.01673318313102868, c = 1.8610367137619854, loss = 510.90152989053706
21. b = 0.016896443710708992, c = 1.8490671676753345, loss = 532.9584938798203
22. b = 0.01662888667192703, c = 1.8593700209089208, loss = 557.5289774965057

Warning: detected a stack overflow; program state may be corrupted, so further execution might be unreliable.

23. b = 0.016784060300742858, c = 1.8597247976362825, loss = 505.31482464295294
24. b = 0.01680559641094343, c = 1.8640118252947393, loss = 490.658135054997
25. b = 0.01683818165159883, c = 1.8677118715051393, loss = 491.4433858887315
26. b = 0.01682188903127113, c = 1.8658618483999394, loss = 489.1417479288798
27. b = 0.016844694598450387, c = 1.8646676220612934, loss = 489.2103388894403
28. b = 0.01683434684998047, c = 1.8679323462990034, loss = 491.47310317880596
29. b = 0.016814813657318146, c = 1.8648701581503473, loss = 489.43293652011465
30. b = 0.016824573579231453, c = 1.866398020407165, loss = 489.3568382001504

Evaluation results:
b = 0.01682188903127113
c = 1.8658618483999394

The value of the objective function is 489.1417479288798.

Optimization is completed after 30 evaluations of the objective function with the status: MAXEVAL_REACHED.

Result:

Over 30 iterations, we have selected the following parameters:

print("Evaluation results: b = ", b, ", c = ", c)

Evaluation results: b = 0.016824573579231453, c = 1.866398020407165

We are quite satisfied with the results.

To improve the result, you can experiment with initial assumptions, optimization settings, and submit various input data to the experimental bench.

Row	time	Ступенчатая функция	Add
	Float64	Float64	Float64
1	0.0	0.0	0.197934
2	0.01	0.0	0.631221
3	0.02	0.0	-0.205922
4	0.03	0.0	-1.02133
5	0.04	0.0	0.48395
6	0.05	0.0	-0.200081
7	0.06	0.0	-0.234782
8	0.07	0.0	-0.52721
9	0.08	0.0	-0.632835
10	0.09	0.0	0.791541
11	0.1	0.0	-0.18815
12	0.11	0.0	-0.39816
13	0.12	0.0	-0.066747
⋮	⋮	⋮	⋮
990	9.89	12.0	50.9197
991	9.9	12.0	53.021
992	9.91	12.0	51.6726
993	9.92	12.0	51.0411
994	9.93	12.0	51.6731
995	9.94	12.0	51.402
996	9.95	12.0	51.2705
997	9.96	12.0	52.398
998	9.97	12.0	51.0498
999	9.98	12.0	52.5656
1000	9.99	12.0	52.7554
1001	10.0	12.0	51.5298