qml.gradients.spsa_grad

spsa_grad(tape, argnum=None, h=1e-05, approx_order=2, n=1, strategy='center', f0=None, validate_params=True, shots=None, num_directions=1, sampler=<function _rademacher_sampler>, sampler_seed=None)
Transform a QNode to compute the SPSA gradient of all gate parameters with respect to its inputs. This estimator shifts all parameters simultaneously and approximates the gradient based on these shifts and a finite-difference method.
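The idea of shifting all parameters at once can be sketched on an ordinary classical function. The following is a minimal NumPy illustration, not PennyLane's implementation: the names `spsa_gradient` and `cost` are hypothetical, and it assumes Rademacher (+1/-1) directions combined with a second-order central difference.

```python
import numpy as np

def spsa_gradient(f, x, h=1e-5, num_directions=1, seed=None):
    """Toy SPSA estimator for a scalar function f of a 1D array x."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    estimate = np.zeros_like(x)
    for _ in range(num_directions):
        # Rademacher direction: every entry is +1 or -1.
        direction = rng.choice([-1.0, 1.0], size=x.shape)
        # Central difference along the direction: two evaluations of f,
        # regardless of how many parameters x has.
        slope = (f(x + h * direction) - f(x - h * direction)) / (2 * h)
        # For +/-1 entries, dividing by the direction equals multiplying by it.
        estimate += slope * direction
    # Average (not sum) the per-direction estimates.
    return estimate / num_directions

def cost(x):
    # Hypothetical smooth test function with a known gradient.
    return np.sin(x[0]) * np.cos(x[1]) + x[2] ** 2

x0 = np.array([0.1, 0.2, 0.3])
single = spsa_gradient(cost, x0, num_directions=1, seed=0)
averaged = spsa_gradient(cost, x0, num_directions=2000, seed=0)
print(single)    # all entries share one magnitude
print(averaged)  # fluctuates around the true gradient
```

With a single direction, every entry of the estimate has the same magnitude, because one finite-difference slope is distributed over all parameters. Averaging over many directions reduces the fluctuations.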
 Parameters
tape (pennylane.QNode or QuantumTape) – quantum tape or QNode to differentiate
argnum (int or list[int] or None) – Trainable parameter indices to differentiate with respect to. If not provided, the derivatives with respect to all trainable parameters are returned.
h (float or tensor_like[float]) – Step size for the finite-difference method underlying the SPSA. Can be a tensor-like object with as many entries as differentiated gate parameters.
approx_order (int) – The approximation order of the finite-difference method underlying the SPSA gradient.
n (int) – compute the \(n\)th derivative
strategy (str) – The strategy of the underlying finite-difference method. Must be one of "forward", "center", or "backward". For the "forward" strategy, the finite-difference shifts occur at the points \(x_0, x_0+h, x_0+2h,\dots\), where \(h\) is the step size h. The "backward" strategy is similar, but in reverse: \(x_0, x_0-h, x_0-2h, \dots\). Finally, the "center" strategy results in shifts symmetric around the unshifted point: \(\dots, x_0-2h, x_0-h, x_0, x_0+h, x_0+2h,\dots\).
f0 (tensor_like[float] or None) – Output of the evaluated input tape in tape. If provided, and the gradient recipe contains an unshifted term, this value is used, saving a quantum evaluation.
validate_params (bool) – Whether to validate the tape parameters or not. If True, the Operation.grad_method attribute and the circuit structure will be analyzed to determine if the trainable parameters support the finite-difference method, inferring that they support SPSA as well. If False, the SPSA gradient method will be applied to all parameters without checking.
shots (None, int, list[int], list[ShotTuple]) – The device shots that will be used to execute the tapes outputted by this transform. Note that this argument doesn’t influence the shots used for tape execution, but provides information about the shots.
num_directions (int) – Number of sampled simultaneous perturbation vectors. An estimate for the gradient is computed for each vector using the underlying finite-difference method, and afterwards all estimates are averaged.
sampler (callable) – Sampling method to obtain the simultaneous perturbation directions. The sampler should take the following arguments:
A Sequence[int] that contains the indices of those trainable tape parameters that will be perturbed, i.e. have non-zero entries in the output vector.
An int that indicates the total number of trainable tape parameters. The size of the output vector has to match this input.
An int indicating the iteration counter during the gradient estimation. A valid sampling method can, but does not have to, take this counter into account. In any case, sampler has to accept this third argument.
The keyword argument seed, expected to be None or an int. This argument should be passed to some method that seeds any randomness used in the sampler.
Note that the circuit evaluations in the various sampled directions are averaged, not simply summed up.
sampler_seed (int or None) – Seed passed to sampler. The seed is passed in each call to the sampler, so that only one unique direction is sampled even if num_directions>1.
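The sampler interface described in the parameter list can be sketched with plain NumPy. The function name `seeded_rademacher_sampler` is hypothetical, and this is not the built-in `_rademacher_sampler`; it only follows the four arguments listed above. As one possible design, it mixes the iteration counter into the seed, so that a fixed seed still yields a different direction in each iteration.

```python
import numpy as np

def seeded_rademacher_sampler(indices, num_params, idx_rep, seed=None):
    """Hypothetical sampler: +1/-1 at `indices`, 0 everywhere else."""
    # Combine seed and iteration counter so that repeated calls with the
    # same seed give reproducible but distinct directions per iteration.
    rng = np.random.default_rng(None if seed is None else [seed, idx_rep])
    direction = np.zeros(num_params)
    # Only the perturbed parameters receive non-zero entries.
    direction[list(indices)] = rng.choice([-1.0, 1.0], size=len(indices))
    return direction

# Perturb parameters 0 and 2 out of four trainable parameters.
direction = seeded_rademacher_sampler([0, 2], 4, 0, seed=42)
print(direction)
```

Following the signature above, such a function would be passed through the sampler keyword argument together with sampler_seed. Whether varying the direction with the counter is desirable depends on the use case.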
 Returns
If the input is a QNode, an object representing the Jacobian (function) of the QNode that can be executed to obtain the Jacobian. The type of the Jacobian returned is either a tensor, a tuple or a nested tuple depending on the nesting structure of the original QNode output.
If the input is a tape, a tuple containing a list of generated tapes, together with a postprocessing function to be applied to the results of the evaluated tapes in order to obtain the Jacobian.
 Return type
function or tuple[list[QuantumTape], function]
Example
This gradient transform can be applied directly to QNode objects:
>>> import pennylane as qml
>>> from pennylane import numpy as np
>>> dev = qml.device("default.qubit", wires=2)
>>> @qml.qnode(dev)
... def circuit(params):
...     qml.RX(params[0], wires=0)
...     qml.RY(params[1], wires=0)
...     qml.RX(params[2], wires=0)
...     return qml.expval(qml.PauliZ(0)), qml.var(qml.PauliZ(0))
>>> params = np.array([0.1, 0.2, 0.3], requires_grad=True)
>>> qml.gradients.spsa_grad(circuit)(params)
((tensor(0.19280803, requires_grad=True),
  tensor(0.19280803, requires_grad=True),
  tensor(0.19280803, requires_grad=True)),
 (tensor(0.34786926, requires_grad=True),
  tensor(0.34786926, requires_grad=True),
  tensor(0.34786926, requires_grad=True)))
Note that the SPSA gradient is a statistical estimator that uses a given number of function evaluations that does not depend on the number of parameters. While this bounds the cost of the estimation, it also implies that the returned values are not exact (even for devices with shots=None) and that they will fluctuate. See the usage details below for more information.
Usage Details
The number of directions in which the derivative is computed to estimate the gradient can be controlled with the keyword argument num_directions. For the QNode above, a more precise gradient estimation from num_directions=20 directions yields
>>> qml.gradients.spsa_grad(circuit, num_directions=20)(params)
((tensor(0.27362235, requires_grad=True),
  tensor(0.07219669, requires_grad=True),
  tensor(0.36369011, requires_grad=True)),
 (tensor(0.49367656, requires_grad=True),
  tensor(0.13025915, requires_grad=True),
  tensor(0.65617915, requires_grad=True)))
We may compare this to the more precise values obtained from finite differences:
>>> qml.gradients.finite_diff(circuit)(params)
((tensor(0.38751724, requires_grad=True),
  tensor(0.18884792, requires_grad=True),
  tensor(0.38355708, requires_grad=True)),
 (tensor(0.69916868, requires_grad=True),
  tensor(0.34072432, requires_grad=True),
  tensor(0.69202365, requires_grad=True)))
As we can see, the SPSA output is a rather coarse approximation to the true gradient, even though it used far more circuit evaluations: with num_directions=20 and two tapes per direction, SPSA runs 40 circuits, whereas the parameter-shift rule for three parameters uses just six. Consequently, SPSA is not necessarily useful for small circuits with few parameters, but it pays off for large circuits where other gradient estimators require infeasibly many circuit executions.
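The trade-off in circuit evaluations can be made concrete with a little arithmetic. The helper names below are hypothetical, and the sketch assumes two tapes per SPSA direction (the default "center" strategy) and a two-term parameter-shift rule with two evaluations per parameter.

```python
def spsa_evaluations(num_directions, tapes_per_direction=2):
    """Circuit evaluations for SPSA: independent of the parameter count."""
    return num_directions * tapes_per_direction

def parameter_shift_evaluations(num_params, shifts_per_param=2):
    """Circuit evaluations for a two-term parameter-shift rule."""
    return num_params * shifts_per_param

# Compare both methods for a small and a large number of parameters.
for num_params in (3, 1000):
    print(
        num_params,
        spsa_evaluations(num_directions=20),
        parameter_shift_evaluations(num_params),
    )
```

For three parameters, SPSA with 20 directions is the more expensive method. For a thousand parameters, its cost stays the same while the parameter-shift cost grows linearly.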
This quantum gradient transform can also be applied to low-level QuantumTape objects. This will result in no implicit quantum device evaluation. Instead, the processed tapes and the post-processing function, which together define the gradient, are directly returned:
>>> with qml.tape.QuantumTape() as tape:
...     qml.RX(params[0], wires=0)
...     qml.RY(params[1], wires=0)
...     qml.RX(params[2], wires=0)
...     qml.expval(qml.PauliZ(0))
...     qml.var(qml.PauliZ(0))
>>> gradient_tapes, fn = qml.gradients.spsa_grad(tape)
>>> gradient_tapes
[<QuantumTape: wires=[0], params=3>, <QuantumTape: wires=[0], params=3>]
This can be useful if the underlying circuits representing the gradient computation need to be analyzed. Here we see that for num_directions=1, the default, we obtain two tapes.
The output tapes can then be evaluated and post-processed to retrieve the gradient:
>>> dev = qml.device("default.qubit", wires=2)
>>> fn(qml.execute(gradient_tapes, dev, None))
((array(0.58222637), array(0.58222637), array(0.58222637)),
 (array(1.05046797), array(1.05046797), array(1.05046797)))
Devices that have a shot vector defined can also be used for execution, provided the shots argument was passed to the transform:
>>> shots = (10, 100, 1000)
>>> dev = qml.device("default.qubit", wires=2, shots=shots)
>>> @qml.qnode(dev)
... def circuit(params):
...     qml.RX(params[0], wires=0)
...     qml.RY(params[1], wires=0)
...     qml.RX(params[2], wires=0)
...     return qml.expval(qml.PauliZ(0)), qml.var(qml.PauliZ(0))
>>> params = np.array([0.1, 0.2, 0.3], requires_grad=True)
>>> qml.gradients.spsa_grad(circuit, shots=shots, h=1e-2)(params)
(((array(0.), array(0.), array(0.)),
  (array(0.), array(0.), array(0.))),
 ((array(1.4), array(1.4), array(1.4)),
  (array(2.548), array(2.548), array(2.548))),
 ((array(1.06), array(1.06), array(1.06)),
  (array(1.90588), array(1.90588), array(1.90588))))
The outermost tuple contains results corresponding to each element of the shot vector, as is also visible by the increasing precision. Note that the stochastic approximation and the fluctuations from the shot noise of the device accumulate, leading to a very coarse-grained estimate for the gradient.
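How shot noise feeds into a finite-difference estimate can be illustrated with a toy model in plain NumPy. This sketch does not use PennyLane: it assumes a single-parameter model in which the expectation value is cos(theta) and each shot returns +1 or -1, and the names `sampled_expval`, `noisy_central_difference` and `spread` are hypothetical.

```python
import numpy as np

def sampled_expval(theta, shots, rng):
    """Estimate an expectation value cos(theta) from +1/-1 outcomes."""
    p_plus = (1 + np.cos(theta)) / 2
    n_plus = rng.binomial(shots, p_plus)
    return (2 * n_plus - shots) / shots

def noisy_central_difference(theta, shots, h, rng):
    """Central difference of two shot-based estimates."""
    plus = sampled_expval(theta + h, shots, rng)
    minus = sampled_expval(theta - h, shots, rng)
    # Dividing by 2h amplifies the sampling noise of both estimates.
    return (plus - minus) / (2 * h)

def spread(shots, theta=0.3, h=1e-2, repeats=200, seed=0):
    """Standard deviation of repeated noisy derivative estimates."""
    rng = np.random.default_rng(seed)
    estimates = [
        noisy_central_difference(theta, shots, h, rng) for _ in range(repeats)
    ]
    return float(np.std(estimates))

for shots in (10, 100, 1000):
    print(shots, spread(shots))
```

In this model the spread of the derivative estimate shrinks as the number of shots grows, and it grows as the step size h shrinks. This is one reason to choose a larger step size for shot-based execution than for exact simulation.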