qml.NesterovMomentumOptimizer
class NesterovMomentumOptimizer(stepsize=0.01, momentum=0.9)
Bases: pennylane.optimize.momentum.MomentumOptimizer
Gradient-descent optimizer with Nesterov momentum.
Nesterov momentum works like the Momentum optimizer, but shifts the current input by the momentum term when computing the gradient of the objective function:
\[a^{(t+1)} = m a^{(t)} + \eta \nabla f(x^{(t)} - m a^{(t)}).\]
The user-defined parameters are:
\(\eta\): the step size
\(m\): the momentum
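The update rule can be sketched in plain Python for a scalar parameter. This is an illustrative sketch rather than PennyLane's implementation: the function name `nesterov_step` and the explicit `grad_f` callable are hypothetical, and the final parameter update \(x^{(t+1)} = x^{(t)} - a^{(t+1)}\) is assumed here (the usual momentum update), since the formula above only defines the accumulation term.

```python
def nesterov_step(x, a, grad_f, stepsize=0.01, momentum=0.9):
    """One Nesterov momentum step for a scalar parameter (illustrative only)."""
    # Evaluate the gradient at the shifted point x - m * a.
    shifted = x - momentum * a
    # Accumulation update: a_new = m * a + eta * grad f(x - m * a).
    a_new = momentum * a + stepsize * grad_f(shifted)
    # Assumed parameter update: x_new = x - a_new.
    return x - a_new, a_new


# Minimize f(x) = x**2, whose gradient is 2 * x.
x, a = 1.0, 0.0
for _ in range(200):
    x, a = nesterov_step(x, a, lambda v: 2.0 * v, stepsize=0.1, momentum=0.9)
```

With these (arbitrarily chosen) hyperparameters the iterate spirals in towards the minimum at zero; a larger step size or momentum can make the same loop oscillate or diverge.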
- Parameters
stepsize (float) – user-defined hyperparameter \(\eta\)
momentum (float) – user-defined hyperparameter \(m\)
Note
When using the torch, tensorflow, or jax interfaces, refer to Gradients and training for suitable optimizers.
Methods
apply_grad(grad, args) – Update the trainable args to take a single optimization step.
compute_grad(objective_fn, args, kwargs[, …]) – Compute the gradient of the objective function at the shifted point \((x - m\times\text{accumulation})\) and return it along with the objective function forward pass (if available).
reset() – Reset the optimizer by erasing memory of past steps.
step(objective_fn, *args[, grad_fn]) – Update trainable arguments with one step of the optimizer.
step_and_cost(objective_fn, *args[, grad_fn]) – Update trainable arguments with one step of the optimizer and return the corresponding objective function value prior to the step.
apply_grad(grad, args)
Update the trainable args to take a single optimization step. Flattens and unflattens the inputs to maintain nested iterables as the parameters of the optimization.
- Parameters
grad (tuple [array]) – the gradient of the objective function at point \(x^{(t)}\): \(\nabla f(x^{(t)})\).
args (tuple) – the current value of the variables \(x^{(t)}\).
- Returns
the new values \(x^{(t+1)}\).
- Return type
list [array]
compute_grad(objective_fn, args, kwargs, grad_fn=None)
Compute the gradient of the objective function at the shifted point \((x - m\times\text{accumulation})\) and return it along with the objective function forward pass (if available).
- Parameters
objective_fn (function) – the objective function for optimization.
args (tuple) – tuple of NumPy arrays containing the current values for the objective function.
kwargs (dict) – keyword arguments for the objective function.
grad_fn (function) – optional gradient function of the objective function with respect to the variables x. If None, the gradient function is computed automatically. Must return the same shape of tuple [array] as the autograd derivative.
- Returns
the NumPy array containing the gradient \(\nabla f(x^{(t)})\) and the objective function output. If grad_fn is provided, the objective function will not be evaluated and None will be returned instead.
- Return type
tuple [array]
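The behaviour described for compute_grad (gradient taken at the shifted point, and None returned in place of the forward pass when grad_fn is supplied) can be sketched for a scalar parameter as follows. This is a sketch under stated assumptions, not the library code: `compute_grad_sketch`, its `accumulation` argument, and `eps` are hypothetical names, a central finite difference stands in for automatic differentiation, and the forward value is taken at the shifted point as a choice of this sketch.

```python
def compute_grad_sketch(objective_fn, x, accumulation, momentum=0.9,
                        grad_fn=None, eps=1e-6):
    """Return (gradient at the shifted point, forward value or None)."""
    # Shift the input by the momentum term before differentiating.
    shifted = x - momentum * accumulation
    if grad_fn is not None:
        # A user-supplied gradient is used directly; the objective is
        # not evaluated, so there is no forward value to report.
        return grad_fn(shifted), None
    # Stand-in for automatic differentiation (assumption of this sketch).
    forward = objective_fn(shifted)
    grad = (objective_fn(shifted + eps) - objective_fn(shifted - eps)) / (2 * eps)
    return grad, forward


# Without grad_fn: gradient and forward value are both returned.
g_auto, fwd_auto = compute_grad_sketch(lambda v: v**2, 1.0, 0.5)
# With grad_fn: the forward value is None.
g_user, fwd_user = compute_grad_sketch(lambda v: v**2, 1.0, 0.5,
                                       grad_fn=lambda v: 2.0 * v)
```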
reset()
Reset the optimizer by erasing memory of past steps.
step(objective_fn, *args, grad_fn=None, **kwargs)
Update trainable arguments with one step of the optimizer.
- Parameters
objective_fn (function) – the objective function for optimization
*args – variable length argument list for objective function
grad_fn (function) – optional gradient function of the objective function with respect to the variables x. If None, the gradient function is computed automatically. Must return a tuple[array] with the same number of elements as *args. Each array of the tuple should have the same shape as the corresponding argument.
**kwargs – variable length of keyword arguments for the objective function
- Returns
the new variable values \(x^{(t+1)}\). If a single arg is provided, list [array] is replaced by array.
- Return type
list [array]
step_and_cost(objective_fn, *args, grad_fn=None, **kwargs)
Update trainable arguments with one step of the optimizer and return the corresponding objective function value prior to the step.
- Parameters
objective_fn (function) – the objective function for optimization
*args – variable length argument list for objective function
grad_fn (function) – optional gradient function of the objective function with respect to the variables *args. If None, the gradient function is computed automatically. Must return a tuple[array] with the same number of elements as *args. Each array of the tuple should have the same shape as the corresponding argument.
**kwargs – variable length of keyword arguments for the objective function
- Returns
the new variable values \(x^{(t+1)}\) and the objective function output prior to the step. If a single arg is provided, list [array] is replaced by array.
- Return type
tuple[list [array], float]