Optim.jl
Optim is a Julia package implementing various algorithms to perform univariate and multivariate optimization.
Installation: OptimizationOptimJL.jl
To use this package, install the OptimizationOptimJL package:
import Pkg;
Pkg.add("OptimizationOptimJL");
Methods
Optim.jl algorithms can be one of the following:
- Optim.NelderMead()
- Optim.SimulatedAnnealing()
- Optim.ParticleSwarm()
- Optim.ConjugateGradient()
- Optim.GradientDescent()
- Optim.BFGS()
- Optim.LBFGS()
- Optim.NGMRES()
- Optim.OACCEL()
- Optim.NewtonTrustRegion()
- Optim.Newton()
- Optim.KrylovTrustRegion()
- Optim.SAMIN()
Each optimizer also takes special arguments which are outlined in the sections below.
The following special keyword arguments, which are not covered by the common solve arguments, can be used with Optim.jl optimizers:
- x_tol: Absolute tolerance in changes of the input vector x, in infinity norm. Defaults to 0.0.
- g_tol: Absolute tolerance in the gradient, in infinity norm. Defaults to 1e-8. For gradient-free methods, this controls the main convergence tolerance, which is solver-specific.
- f_calls_limit: A soft upper limit on the number of objective calls. Defaults to 0 (unlimited).
- g_calls_limit: A soft upper limit on the number of gradient calls. Defaults to 0 (unlimited).
- h_calls_limit: A soft upper limit on the number of Hessian calls. Defaults to 0 (unlimited).
- allow_f_increases: Allow steps that increase the objective value. Defaults to false. Note that, when setting this to true, the last iterate will be returned as the minimizer even if the objective increased.
- store_trace: Should a trace of the optimization algorithm's state be stored? Defaults to false.
- show_trace: Should a trace of the optimization algorithm's state be shown on stdout? Defaults to false.
- extended_trace: Save additional information. Solver dependent. Defaults to false.
- trace_simplex: Include the full simplex in the trace for NelderMead. Defaults to false.
- show_every: Trace output is printed every show_every-th iteration.
For more extensive documentation of all the algorithms and options, please consult the Optim.jl documentation.
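As a minimal sketch of how the options above are passed, the example below forwards g_tol, f_calls_limit and show_trace as keyword arguments to solve, after the optimizer. The specific values are illustrative assumptions, not recommended settings, and option names may differ between Optim.jl versions.

```julia
using Optimization, OptimizationOptimJL

# Rosenbrock test function; p holds its two shape parameters.
rosenbrock(x, p) = (p[1] - x[1])^2 + p[2] * (x[2] - x[1]^2)^2
x0 = zeros(2)
p = [1.0, 100.0]
prob = Optimization.OptimizationProblem(rosenbrock, x0, p)

# Optim-specific options follow the optimizer as keyword arguments.
# The values below are illustrative only.
sol = solve(prob, Optim.NelderMead();
    g_tol = 1e-10,          # tighter than the 1e-8 default
    f_calls_limit = 10_000, # soft cap on objective evaluations
    show_trace = false)     # do not print the trace to stdout

println(sol.u)
```

A derivative-free optimizer is used here so that no automatic differentiation backend has to be chosen.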
Local Optimizer
Local Constraint
Optim.jl implements the following local constraint algorithms:
- Optim.IPNewton()
  - μ0 specifies the initial barrier penalty coefficient as either a number or :auto
  - show_linesearch is an option to turn on linesearch verbosity.
  - Defaults:
    - linesearch::Function = Optim.backtrack_constrained_grad
    - μ0::Union{Symbol,Number} = :auto
    - show_linesearch::Bool = false
The Rosenbrock function with constraints can be optimized using Optim.IPNewton() as follows:
using Optimization, OptimizationOptimJL
rosenbrock(x, p) = (p[1] - x[1])^2 + p[2] * (x[2] - x[1]^2)^2
cons = (res, x, p) -> res .= [x[1]^2 + x[2]^2]
x0 = zeros(2)
p = [1.0, 100.0]
optprob = OptimizationFunction(rosenbrock, Optimization.AutoForwardDiff(); cons = cons)
prob = Optimization.OptimizationProblem(optprob, x0, p, lcons = [-5.0], ucons = [10.0])
sol = solve(prob, IPNewton())
retcode: Success
u: 2-element Vector{Float64}:
0.9999999992669327
0.9999999985109471
See also in the Optim.jl documentation the Nonlinear constrained optimization example using IPNewton.
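As a small sanity check on the result above, the constraint can be evaluated at the reported minimizer and compared with lcons and ucons. This sketch uses plain Julia only; the minimizer and bounds are copied from the example, and passing nothing for p is an assumption that works because this constraint ignores p.

```julia
# Constraint from the example above, in the same in-place form.
cons = (res, x, p) -> res .= [x[1]^2 + x[2]^2]

# Minimizer and bounds copied from the example above.
u = [0.9999999992669327, 0.9999999985109471]
lcons, ucons = [-5.0], [10.0]

res = zeros(1)
cons(res, u, nothing)   # p is unused by this constraint

# The constraint value is close to 2, which lies strictly inside [-5, 10],
# so the constraint is inactive at the solution.
feasible = all(lcons .<= res .<= ucons)
println((res[1], feasible))
```

Because the constraint is inactive, the constrained minimizer coincides with the unconstrained one at (1, 1).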
Derivative-Free
Derivative-free optimizers can be used even when neither derivatives nor an automatic differentiation backend is specified. While they tend to be less efficient than derivative-based optimizers, they are easy to apply when defining derivatives is difficult. These methods do not support general constraints, but all support bound constraints via lb and ub in the Optimization.OptimizationProblem.
Optim.jl implements the following derivative-free algorithms:
- Optim.NelderMead(): Nelder-Mead optimizer
  - solve(problem, NelderMead(parameters, initial_simplex))
  - parameters = AdaptiveParameters() or parameters = FixedParameters()
  - initial_simplex = AffineSimplexer()
  - Defaults:
    - parameters = AdaptiveParameters()
    - initial_simplex = AffineSimplexer()
- Optim.SimulatedAnnealing(): Simulated Annealing
  - solve(problem, SimulatedAnnealing(neighbor, T, p))
  - neighbor is a mutating function of the current and proposed x
  - T is a function of the current iteration that returns a temperature
  - p is a function of the current temperature
  - Defaults:
    - neighbor = default_neighbor!
    - T = default_temperature
    - p = kirkpatrick
The Rosenbrock function can be optimized using Optim.NelderMead() as follows:
using Optimization, OptimizationOptimJL
rosenbrock(x, p) = (1 - x[1])^2 + 100 * (x[2] - x[1]^2)^2
x0 = zeros(2)
p = [1.0, 100.0]
prob = Optimization.OptimizationProblem(rosenbrock, x0, p)
sol = solve(prob, Optim.NelderMead())
retcode: Success
u: 2-element Vector{Float64}:
0.9999634355313174
0.9999315506115275
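The parameters and initial_simplex settings listed above can be set explicitly when constructing the optimizer. The following is a minimal sketch, assuming these are keyword arguments of the NelderMead constructor and that FixedParameters and AffineSimplexer are reached through the Optim module prefix.

```julia
using Optimization, OptimizationOptimJL

rosenbrock(x, p) = (p[1] - x[1])^2 + p[2] * (x[2] - x[1]^2)^2
prob = Optimization.OptimizationProblem(rosenbrock, zeros(2), [1.0, 100.0])

# Choose fixed simplex coefficients instead of the adaptive default,
# and state the default initial simplex explicitly.
opt = Optim.NelderMead(parameters = Optim.FixedParameters(),
    initial_simplex = Optim.AffineSimplexer())

sol = solve(prob, opt)
println(sol.u)
```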
Gradient-Based
Gradient-based optimizers use gradient information, obtained either from user-defined derivatives or from automatic differentiation.
Optim.jl implements the following gradient-based algorithms:
- Optim.ConjugateGradient(): Conjugate Gradient Descent
  - solve(problem, ConjugateGradient(alphaguess, linesearch, eta, P, precondprep))
  - alphaguess computes the initial step length (for more information, consult this source and this example)
    - available initial step length procedures:
      - InitialPrevious
      - InitialStatic
      - InitialHagerZhang
      - InitialQuadratic
      - InitialConstantChange
  - linesearch specifies the line search algorithm (for more information, consult this source and this example)
    - available line search algorithms:
      - HagerZhang
      - MoreThuente
      - BackTracking
      - StrongWolfe
      - Static
  - eta determines the next step direction
  - P is an optional preconditioner (for more information, see this source)
  - precondprep is used to update P as the state variable x changes
  - Defaults:
    - alphaguess = LineSearches.InitialHagerZhang()
    - linesearch = LineSearches.HagerZhang()
    - eta = 0.4
    - P = nothing
    - precondprep = (P, x) -> nothing
- Optim.GradientDescent(): Gradient Descent (a first-order method)
  - solve(problem, GradientDescent(alphaguess, linesearch, P, precondprep))
  - alphaguess computes the initial step length (for more information, consult this source and this example)
    - available initial step length procedures:
      - InitialPrevious
      - InitialStatic
      - InitialHagerZhang
      - InitialQuadratic
      - InitialConstantChange
  - linesearch specifies the line search algorithm (for more information, consult this source and this example)
    - available line search algorithms:
      - HagerZhang
      - MoreThuente
      - BackTracking
      - StrongWolfe
      - Static
  - P is an optional preconditioner (for more information, see this source)
  - precondprep is used to update P as the state variable x changes
  - Defaults:
    - alphaguess = LineSearches.InitialPrevious()
    - linesearch = LineSearches.HagerZhang()
    - P = nothing
    - precondprep = (P, x) -> nothing
- Optim.BFGS(): Broyden-Fletcher-Goldfarb-Shanno algorithm
  - solve(problem, BFGS(alphaguess, linesearch, initial_invH, initial_stepnorm, manifold))
  - alphaguess computes the initial step length (for more information, consult this source and this example)
    - available initial step length procedures:
      - InitialPrevious
      - InitialStatic
      - InitialHagerZhang
      - InitialQuadratic
      - InitialConstantChange
  - linesearch specifies the line search algorithm (for more information, consult this source and this example)
    - available line search algorithms:
      - HagerZhang
      - MoreThuente
      - BackTracking
      - StrongWolfe
      - Static
  - initial_invH specifies an optional initial matrix
  - initial_stepnorm determines that initial_invH is an identity matrix scaled by the value of initial_stepnorm multiplied by the sup-norm of the gradient at the initial point
  - manifold specifies a (Riemannian) manifold on which the function is to be minimized (for more information, consult this source)
    - available manifolds:
      - Flat
      - Sphere
      - Stiefel
      - meta-manifolds:
        - PowerManifold
        - ProductManifold
        - custom manifolds
  - Defaults:
    - alphaguess = LineSearches.InitialStatic()
    - linesearch = LineSearches.HagerZhang()
    - initial_invH = nothing
    - initial_stepnorm = nothing
    - manifold = Flat()
- Optim.LBFGS(): Limited-memory Broyden-Fletcher-Goldfarb-Shanno algorithm
  - m is the number of history points
  - alphaguess computes the initial step length (for more information, consult this source and this example)
    - available initial step length procedures:
      - InitialPrevious
      - InitialStatic
      - InitialHagerZhang
      - InitialQuadratic
      - InitialConstantChange
  - linesearch specifies the line search algorithm (for more information, consult this source and this example)
    - available line search algorithms:
      - HagerZhang
      - MoreThuente
      - BackTracking
      - StrongWolfe
      - Static
  - P is an optional preconditioner (for more information, see this source)
  - precondprep is used to update P as the state variable x changes
  - manifold specifies a (Riemannian) manifold on which the function is to be minimized (for more information, consult this source)
    - available manifolds:
      - Flat
      - Sphere
      - Stiefel
      - meta-manifolds:
        - PowerManifold
        - ProductManifold
        - custom manifolds
  - scaleinvH0: whether to scale the initial Hessian approximation
  - Defaults:
    - m = 10
    - alphaguess = LineSearches.InitialStatic()
    - linesearch = LineSearches.HagerZhang()
    - P = nothing
    - precondprep = (P, x) -> nothing
    - manifold = Flat()
    - scaleinvH0::Bool = true && (P isa Nothing)
The Rosenbrock function can be optimized using Optim.LBFGS() as follows:
using Optimization, OptimizationOptimJL
rosenbrock(x, p) = (1 - x[1])^2 + 100 * (x[2] - x[1]^2)^2
x0 = zeros(2)
p = [1.0, 100.0]
optprob = OptimizationFunction(rosenbrock, Optimization.AutoForwardDiff())
prob = Optimization.OptimizationProblem(optprob, x0, p, lb = [-1.0, -1.0], ub = [0.8, 0.8])
sol = solve(prob, Optim.LBFGS())
retcode: Success
u: 2-element Vector{Float64}:
0.799999998888889
0.639999998209689
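The alphaguess and linesearch settings described in this section are chosen when the optimizer is constructed. The sketch below swaps the default HagerZhang line search of BFGS for BackTracking on an unbounded problem. It assumes that LineSearches.jl is installed alongside Optim.jl and that these settings are keyword arguments; the explicit using ForwardDiff line is also an assumption, since some Optimization.jl versions need the AD package loaded.

```julia
using Optimization, OptimizationOptimJL
using ForwardDiff   # assumption: explicit load, needed by some Optimization.jl versions
using LineSearches  # assumption: installed as a separate package

rosenbrock(x, p) = (p[1] - x[1])^2 + p[2] * (x[2] - x[1]^2)^2
optf = OptimizationFunction(rosenbrock, Optimization.AutoForwardDiff())
prob = Optimization.OptimizationProblem(optf, zeros(2), [1.0, 100.0])

# Keep the default initial step guess, but replace the line search.
opt = Optim.BFGS(alphaguess = LineSearches.InitialStatic(),
    linesearch = LineSearches.BackTracking())

sol = solve(prob, opt)
println(sol.u)
```

The same two keywords are accepted by the other line-search-based optimizers listed above, so the pattern carries over with a different constructor name.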
Hessian-Based Second Order
Hessian-based optimization methods are second order optimization methods which use the direct computation of the Hessian. These can converge faster, but require fast and accurate methods for calculating the Hessian in order to be appropriate.
Optim.jl implements the following Hessian-based algorithms:
- Optim.NewtonTrustRegion(): Newton Trust Region method
  - initial_delta: The starting trust region radius
  - delta_hat: The largest allowable trust region radius
  - eta: When rho is at least eta, accept the step.
  - rho_lower: When rho is less than rho_lower, shrink the trust region.
  - rho_upper: When rho is greater than rho_upper, grow the trust region (though no greater than delta_hat).
  - Defaults:
    - initial_delta = 1.0
    - delta_hat = 100.0
    - eta = 0.1
    - rho_lower = 0.25
    - rho_upper = 0.75
- Optim.Newton(): Newton's method with line search
  - alphaguess computes the initial step length (for more information, consult this source and this example)
    - available initial step length procedures:
      - InitialPrevious
      - InitialStatic
      - InitialHagerZhang
      - InitialQuadratic
      - InitialConstantChange
  - linesearch specifies the line search algorithm (for more information, consult this source and this example)
    - available line search algorithms:
      - HagerZhang
      - MoreThuente
      - BackTracking
      - StrongWolfe
      - Static
  - Defaults:
    - alphaguess = LineSearches.InitialStatic()
    - linesearch = LineSearches.HagerZhang()
The Rosenbrock function can be optimized using Optim.Newton() as follows:
using Optimization, OptimizationOptimJL, ModelingToolkit
rosenbrock(x, p) = (1 - x[1])^2 + 100 * (x[2] - x[1]^2)^2
x0 = zeros(2)
p = [1.0, 100.0]
f = OptimizationFunction(rosenbrock, Optimization.AutoSymbolics())
prob = Optimization.OptimizationProblem(f, x0, p)
sol = solve(prob, Optim.Newton())
retcode: Success
u: 2-element Vector{Float64}:
0.9999999999999994
0.9999999999999989
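The trust region settings of Optim.NewtonTrustRegion() can be adjusted in the same way. The following is a minimal sketch, assuming they are keyword arguments of the constructor; the radii chosen here are illustrative only, and ForwardDiff is used for the Hessian instead of the symbolic backend from the example above.

```julia
using Optimization, OptimizationOptimJL
using ForwardDiff   # assumption: explicit load, needed by some Optimization.jl versions

rosenbrock(x, p) = (p[1] - x[1])^2 + p[2] * (x[2] - x[1]^2)^2
optf = OptimizationFunction(rosenbrock, Optimization.AutoForwardDiff())
prob = Optimization.OptimizationProblem(optf, zeros(2), [1.0, 100.0])

# Start with a smaller trust region and cap its growth at 10.0;
# eta, rho_lower and rho_upper keep their defaults.
opt = Optim.NewtonTrustRegion(initial_delta = 0.5, delta_hat = 10.0)

sol = solve(prob, opt)
println(sol.u)
```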
Hessian-Free Second Order
Hessian-free methods perform second order optimization by computing Hessian-vector products (Hv) directly, without constructing the full Hessian. As such, these methods can perform well on large second order optimization problems, but can require special care when considering the conditioning of the Hessian.
Optim.jl implements the following Hessian-free algorithms:
- Optim.KrylovTrustRegion(): A Newton-Krylov method with Trust Regions
  - initial_delta: The starting trust region radius
  - delta_hat: The largest allowable trust region radius
  - eta: When rho is at least eta, accept the step.
  - rho_lower: When rho is less than rho_lower, shrink the trust region.
  - rho_upper: When rho is greater than rho_upper, grow the trust region (though no greater than delta_hat).
  - Defaults:
    - initial_delta = 1.0
    - delta_hat = 100.0
    - eta = 0.1
    - rho_lower = 0.25
    - rho_upper = 0.75
The Rosenbrock function can be optimized using Optim.KrylovTrustRegion() as follows:
using Optimization, OptimizationOptimJL
rosenbrock(x, p) = (1 - x[1])^2 + 100 * (x[2] - x[1]^2)^2
x0 = zeros(2)
p = [1.0, 100.0]
optprob = OptimizationFunction(rosenbrock, Optimization.AutoForwardDiff())
prob = Optimization.OptimizationProblem(optprob, x0, p)
sol = solve(prob, Optim.KrylovTrustRegion())
retcode: Success
u: 2-element Vector{Float64}:
0.999999999999108
0.9999999999981819
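To illustrate what a Hessian-vector product is, the sketch below approximates Hv for the Rosenbrock function from two gradient evaluations and compares it with the product formed from the full analytic Hessian. It uses plain Julia and a central finite difference; this is a generic illustration, not a description of how Optim.KrylovTrustRegion() or an automatic differentiation backend computes Hv. The helper names grad, hess and hvp are hypothetical.

```julia
# Analytic gradient of f(x) = (1 - x1)^2 + 100 * (x2 - x1^2)^2.
grad(x) = [-2 * (1 - x[1]) - 400 * x[1] * (x[2] - x[1]^2),
           200 * (x[2] - x[1]^2)]

# Analytic Hessian, built here only as a reference for comparison.
function hess(x)
    h11 = 2 - 400 * x[2] + 1200 * x[1]^2
    h12 = -400 * x[1]
    return [h11 h12; h12 200.0]
end

# Central-difference Hessian-vector product: two gradient calls, no matrix.
hvp(x, v; ε = 1e-6) = (grad(x .+ ε .* v) .- grad(x .- ε .* v)) ./ (2ε)

x = [0.5, 0.5]
v = [1.0, -1.0]

approx = hvp(x, v)
exact = hess(x) * v   # equals [302.0, -400.0] at this point

println(approx)
println(exact)
```

For an n-dimensional problem the full Hessian has n^2 entries, while each product above only needs vectors of length n, which is why this style of method scales to large problems.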
Global Optimizer
Without Constraint Equations
The following method in Optim performs global optimization on problems with or without box constraints. Lower and upper bounds can optionally be set through lb and ub in the Optimization.OptimizationProblem.
- Optim.ParticleSwarm(): Particle Swarm Optimization
  - solve(problem, ParticleSwarm(lower, upper, n_particles))
  - lower/upper are vectors of lower/upper bounds respectively
  - n_particles is the number of particles in the swarm
  - defaults to: lower = [], upper = [], n_particles = 0
The Rosenbrock function can be optimized using Optim.ParticleSwarm() as follows:
using Optimization, OptimizationOptimJL
rosenbrock(x, p) = (p[1] - x[1])^2 + p[2] * (x[2] - x[1]^2)^2
x0 = zeros(2)
p = [1.0, 100.0]
f = OptimizationFunction(rosenbrock)
prob = Optimization.OptimizationProblem(f, x0, p, lb = [-1.0, -1.0], ub = [1.0, 1.0])
sol = solve(prob, Optim.ParticleSwarm(lower = prob.lb, upper = prob.ub, n_particles = 100))
retcode: Failure
u: 2-element Vector{Float64}:
1.0
1.0
With Constraint Equations
The following method in Optim performs global optimization on problems with box constraints.
- Optim.SAMIN(): Simulated Annealing with bounds
  - solve(problem, SAMIN(nt, ns, rt, neps, f_tol, x_tol, coverage_ok, verbosity))
  - Defaults:
    - nt = 5
    - ns = 5
    - rt = 0.9
    - neps = 5
    - f_tol = 1e-12
    - x_tol = 1e-6
    - coverage_ok = false
    - verbosity = 0
The Rosenbrock function can be optimized using Optim.SAMIN() as follows:
using Optimization, OptimizationOptimJL
rosenbrock(x, p) = (1 - x[1])^2 + 100 * (x[2] - x[1]^2)^2
x0 = zeros(2)
p = [1.0, 100.0]
f = OptimizationFunction(rosenbrock, Optimization.AutoForwardDiff())
prob = Optimization.OptimizationProblem(f, x0, p, lb = [-1.0, -1.0], ub = [1.0, 1.0])
sol = solve(prob, Optim.SAMIN())
retcode: Failure
u: 2-element Vector{Float64}:
0.9464678437495901
0.8933571780479223