qml.QNGOptimizer
class QNGOptimizer(stepsize=0.01, approx='block-diag', lam=0)

Bases: pennylane.optimize.gradient_descent.GradientDescentOptimizer
Optimizer with adaptive learning rate, via calculation of the diagonal or block-diagonal approximation to the Fubini-Study metric tensor. A quantum generalization of natural gradient descent.
The QNG optimizer uses a step- and parameter-dependent learning rate, with the learning rate dependent on the pseudo-inverse of the Fubini-Study metric tensor \(g\):
\[x^{(t+1)} = x^{(t)} - \eta g(f(x^{(t)}))^{-1} \nabla f(x^{(t)}),\]

where \(f(x^{(t)}) = \langle 0 | U(x^{(t)})^\dagger \hat{B} U(x^{(t)}) | 0 \rangle\) is an expectation value of some observable measured on the variational quantum circuit \(U(x^{(t)})\).
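As a minimal classical sketch of this update rule (assuming the metric tensor and gradient have already been evaluated on the device; the \(\lambda I\) term corresponds to the lam keyword argument documented below):

>>> import numpy as np
>>> def qng_step(params, grad, metric_tensor, stepsize=0.01, lam=0.0):
...     # regularize, pseudo-invert, and take the natural gradient step
...     g_reg = metric_tensor + lam * np.eye(len(params))
...     return params - stepsize * np.linalg.pinv(g_reg) @ grad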
Consider a quantum node represented by the variational quantum circuit

\[U(\mathbf{\theta}) = W(\theta_{i+1}, \dots, \theta_{N})X(\theta_{i}) V(\theta_1, \dots, \theta_{i-1}),\]

where all parametrized gates can be written in the form \(X(\theta_{i}) = e^{i\theta_i K_i}\). That is, the Hermitian operator \(K_i\) is the generator of the parametrized operation \(X(\theta_i)\) corresponding to the \(i\)-th parameter.
For each parametric layer \(\ell\) in the variational quantum circuit containing \(n\) parameters, the \(n\times n\) block-diagonal submatrix of the Fubini-Study tensor \(g_{ij}^{(\ell)}\) is calculated directly on the quantum device in a single evaluation:
\[g_{ij}^{(\ell)} = \langle \psi_\ell | K_i K_j | \psi_\ell \rangle - \langle \psi_\ell | K_i | \psi_\ell\rangle \langle \psi_\ell |K_j | \psi_\ell\rangle,\]

where \(|\psi_\ell\rangle = V(\theta_1, \dots, \theta_{i-1})|0\rangle\) (that is, \(|\psi_\ell\rangle\) is the quantum state prior to the application of parameterized layer \(\ell\)).
Combining the quantum natural gradient optimizer with the analytic parameter-shift rule to optimize a variational circuit with \(d\) parameters and \(L\) layers, a total of \(2d+L\) quantum evaluations are required per optimization step.
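For example, a variational circuit with \(d=4\) parameters arranged in \(L=2\) layers requires \(2\cdot 4 + 2 = 10\) circuit evaluations per step: \(2d = 8\) for the parameter-shift gradient, plus one evaluation per layer for the block-diagonal metric tensor.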
For more details, see:
James Stokes, Josh Izaac, Nathan Killoran, Giuseppe Carleo. “Quantum Natural Gradient.” Quantum 4, 269, 2020.
Note

The QNG optimizer supports using a single QNode as the objective function. Alternatively, the metric tensor can be provided directly to the step() method of the optimizer, using the metric_tensor_fn keyword argument.

Providing metric_tensor_fn may be useful in the following cases (a sketch follows this list):

- For hybrid classical-quantum models, the “mixed geometry” of the model makes it unclear which metric should be used for which parameter. For example, parameters of quantum nodes are better suited to one metric (such as the QNG), whereas others (e.g., parameters of classical nodes) are likely better suited to another metric.
- For multi-QNode models, we don’t know what geometry is appropriate if a parameter is shared amongst several QNodes.
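As a hedged sketch of the first case (not an official recipe; circuit refers to a QNode like the one defined in the examples below, and the split at n_q = 2 quantum parameters is an illustrative assumption), one could apply the Fubini-Study metric to the quantum block and the identity, i.e. plain gradient descent geometry, to the classical block:

>>> from pennylane import numpy as np
>>> def hybrid_metric_tensor_fn(params):
...     n_q = 2  # assumed: the first two entries feed the QNode
...     mt = np.eye(len(params))
...     mt[:n_q, :n_q] = qml.metric_tensor(circuit, approx="block-diag")(params[:n_q])
...     return mt
>>> # opt.step(hybrid_cost, params, metric_tensor_fn=hybrid_metric_tensor_fn)  # hybrid_cost is hypothetical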
Examples:
For VQE/VQE-like problems, the objective function for the optimizer can be realized as a QNode that returns the expectation value of a Hamiltonian.

>>> dev = qml.device("default.qubit", wires=(0, 1, "aux"))
>>> @qml.qnode(dev)
... def circuit(params):
...     qml.RX(params[0], wires=0)
...     qml.RY(params[1], wires=0)
...     return qml.expval(qml.X(0) + qml.X(1))
Once constructed, the cost function can be passed directly to the optimizer’s step() function:

>>> eta = 0.01
>>> init_params = np.array([0.011, 0.012])
>>> opt = qml.QNGOptimizer(eta)
>>> theta_new = opt.step(circuit, init_params)
>>> theta_new
tensor([ 0.01100528, -0.02799954], requires_grad=True)
An alternative function to calculate the metric tensor of the QNode can be provided to step() via the metric_tensor_fn keyword argument. For example, we can provide a function to calculate the metric tensor via the adjoint method.

>>> adj_metric_tensor = qml.adjoint_metric_tensor(circuit, circuit.device)
>>> opt.step(circuit, init_params, metric_tensor_fn=adj_metric_tensor)
tensor([ 0.01100528, -0.02799954], requires_grad=True)
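Similarly, as a hedged sketch (not part of the original example), the full, unapproximated metric tensor can be computed with qml.metric_tensor and approx=None and passed in the same way; evaluating its off-block-diagonal terms is the reason the device above was created with an auxiliary "aux" wire:

>>> full_metric_tensor_fn = qml.metric_tensor(circuit, approx=None)
>>> opt.step(circuit, init_params, metric_tensor_fn=full_metric_tensor_fn)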
Note

If the objective function takes multiple trainable arguments, QNGOptimizer applies the metric tensor for each argument individually. This means that “correlations” between parameters from different arguments are not taken into account. In order to take all correlations into account within the optimization, consider combining all parameters into one objective function argument, as sketched below.
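A minimal sketch of such a combination, assuming two previously separate trainable arguments are packed into a single array params (combined_circuit and the parameter values are hypothetical):

>>> @qml.qnode(dev)
... def combined_circuit(params):
...     qml.RX(params[0], wires=0)  # was the first trainable argument
...     qml.RY(params[1], wires=1)  # was the second trainable argument
...     return qml.expval(qml.Z(0) @ qml.Z(1))
>>> params = np.array([0.1, 0.2], requires_grad=True)
>>> opt.step(combined_circuit, params)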
See also

See the quantum natural gradient example for more details on the Fubini-Study metric tensor and this optimization class.
- Keyword Arguments
stepsize=0.01 (float) – the user-defined hyperparameter \(\eta\)
approx (str) – which approximation of the metric tensor to compute:
- If None, the full metric tensor is computed.
- If "block-diag", the block-diagonal approximation is computed, reducing the number of evaluated circuits significantly.
- If "diag", only the diagonal approximation is computed, slightly reducing the classical overhead but not the quantum resources (compared to "block-diag").
lam=0 (float) – metric tensor regularization \(G_{ij}+\lambda I\) to be applied at each optimization step (see the construction sketch below)
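For illustration, these keyword arguments can be combined when constructing the optimizer (the specific values here are arbitrary):

>>> opt = qml.QNGOptimizer(stepsize=0.05, approx="diag", lam=1e-8)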
Methods

apply_grad(grad, args) – Update the parameter array \(x\) for a single optimization step.
compute_grad(objective_fn, args, kwargs[, …]) – Compute the gradient of the objective function at the given point and return it along with the objective function forward pass (if available).
step(qnode, *args[, grad_fn, …]) – Update the parameter array \(x\) with one step of the optimizer.
step_and_cost(qnode, *args[, grad_fn, …]) – Update the parameter array \(x\) with one step of the optimizer and return the corresponding objective function value prior to the step.
apply_grad(grad, args)

Update the parameter array \(x\) for a single optimization step. Flattens and unflattens the inputs to maintain nested iterables as the parameters of the optimization.
- Parameters
grad (array) – The gradient of the objective function at point \(x^{(t)}\): \(\nabla f(x^{(t)})\)
args (array) – the current value of the variables \(x^{(t)}\)
- Returns
the new values \(x^{(t+1)}\)
- Return type
array
static compute_grad(objective_fn, args, kwargs, grad_fn=None)

Compute the gradient of the objective function at the given point and return it along with the objective function forward pass (if available).
- Parameters
objective_fn (function) – the objective function for optimization
args (tuple) – tuple of NumPy arrays containing the current parameters for the objective function
kwargs (dict) – keyword arguments for the objective function
grad_fn (function) – optional gradient function of the objective function with respect to the variables args. If None, the gradient function is computed automatically. Must return the same shape of tuple [array] as the autograd derivative.
- Returns
NumPy array containing the gradient \(\nabla f(x^{(t)})\) and the objective function output. If grad_fn is provided, the objective function will not be evaluated and instead None will be returned.
- Return type
tuple (array)
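A minimal usage sketch of this static method, reusing circuit and init_params from the examples above (note that args is a tuple and kwargs a dict; the returned forward value may be None depending on the gradient function used):

>>> grad, forward = qml.QNGOptimizer.compute_grad(circuit, (init_params,), {})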
step(qnode, *args, grad_fn=None, recompute_tensor=True, metric_tensor_fn=None, **kwargs)

Update the parameter array \(x\) with one step of the optimizer.
- Parameters
qnode (QNode) – the QNode for optimization
*args – variable length argument list for the qnode
grad_fn (function) – optional gradient function of the qnode with respect to the variables *args. If None, the gradient function is computed automatically. Must return a tuple[array] with the same number of elements as *args. Each array of the tuple should have the same shape as the corresponding argument.
recompute_tensor (bool) – whether or not the metric tensor should be recomputed; if not, the metric tensor from the previous optimization step is used
metric_tensor_fn (function) – optional metric tensor function with respect to the variables args. If None, the metric tensor function is computed automatically.
**kwargs – variable length of keyword arguments for the qnode
- Returns
the new variable values \(x^{(t+1)}\)
- Return type
array
step_and_cost(qnode, *args, grad_fn=None, recompute_tensor=True, metric_tensor_fn=None, **kwargs)

Update the parameter array \(x\) with one step of the optimizer and return the corresponding objective function value prior to the step.
- Parameters
qnode (QNode) – the QNode for optimization
*args – variable length argument list for the qnode
grad_fn (function) – optional gradient function of the qnode with respect to the variables *args. If None, the gradient function is computed automatically. Must return a tuple[array] with the same number of elements as *args. Each array of the tuple should have the same shape as the corresponding argument.
recompute_tensor (bool) – whether or not the metric tensor should be recomputed; if not, the metric tensor from the previous optimization step is used
metric_tensor_fn (function) – optional metric tensor function with respect to the variables args. If None, the metric tensor function is computed automatically.
**kwargs – variable length of keyword arguments for the qnode
- Returns
the new variable values \(x^{(t+1)}\) and the objective function output prior to the step
- Return type
tuple
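For illustration, a minimal usage sketch reusing circuit and init_params from the examples above:

>>> opt = qml.QNGOptimizer(0.01)
>>> new_params, prev_cost = opt.step_and_cost(circuit, init_params)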