qml.gradients.batch_vjp¶
- batch_vjp(tapes, dys, gradient_fn, reduction='append', gradient_kwargs=None)[source]¶
Generate the gradient tapes and processing function required to compute the vector-Jacobian products of a batch of tapes.
Consider a function f(x). The Jacobian is given by
Jf(x)=(∂f1∂x1⋯∂f1∂xn⋮⋱⋮∂fm∂x1⋯∂fm∂xn).During backpropagation, the chain rule is applied. For example, consider the cost function h=y∘f:Rn→R, where y:Rm→R. The gradient is:
∇h(x)=∂y∂f∂f∂x=∂y∂fJf(x).Denote dy=∂y∂f; we can write this in the form of a matrix multiplication:
[∇h(x)]j=m∑i=0dyi Jij.Thus, we can see that the gradient of the cost function is given by the so-called vector-Jacobian product; the product of the row-vector dy, representing the gradient of subsequent components of the cost function, and J, the Jacobian of the current node of interest.
- Parameters
tapes (Sequence[QuantumTape]) – sequence of quantum tapes to differentiate
dys (Sequence[tensor_like]) – Sequence of gradient-output vectors
dy
. Must be the same length astapes
. Eachdy
tensor should have shape matching the output shape of the corresponding tape.gradient_fn (callable) – the gradient transform to use to differentiate the tapes
reduction (str) – Determines how the vector-Jacobian products are returned. If
append
, then the output of the function will be of the formList[tensor_like]
, with each element corresponding to the VJP of each input tape. Ifextend
, then the output VJPs will be concatenated.gradient_kwargs (dict) – dictionary of keyword arguments to pass when determining the gradients of tapes
- Returns
list of vector-Jacobian products.
None
elements corresponds to tapes with no trainable parameters.- Return type
List[tensor_like or None]
Example
Consider the following Torch-compatible quantum tapes:
import torch x = torch.tensor([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]], requires_grad=True, dtype=torch.float64) ops = [ qml.RX(x[0, 0], wires=0), qml.RY(x[0, 1], wires=1), qml.RZ(x[0, 2], wires=0), qml.CNOT(wires=[0, 1]), qml.RX(x[1, 0], wires=1), qml.RY(x[1, 1], wires=0), qml.RZ(x[1, 2], wires=1) ] measurements1 = [qml.expval(qml.Z(0)), qml.probs(wires=1)] tape1 = qml.tape.QuantumTape(ops, measurements1) measurements2 = [qml.expval(qml.Z(0) @ qml.Z(1))] tape2 = qml.tape.QuantumTape(ops, measurements2) tapes = [tape1, tape2]
Both tapes share the same circuit ansatz, but have different measurement outputs.
We can use the
batch_vjp
function to compute the vector-Jacobian product, given a list of gradient-output vectorsdys
per tape:>>> dys = [torch.tensor([1., 1., 1.], dtype=torch.float64), ... torch.tensor([1.], dtype=torch.float64)] >>> vjp_tapes, fn = qml.gradients.batch_vjp(tapes, dys, qml.gradients.param_shift)
Note that each
dy
has shape matching the output dimension of the tape (tape1
has 1 expectation and 2 probability values — 3 outputs — andtape2
has 1 expectation value).Executing the VJP tapes, and applying the processing function:
>>> dev = qml.device("default.qubit") >>> vjps = fn(qml.execute(vjp_tapes, dev, diff_method=qml.gradients.param_shift, interface="torch")) >>> vjps [tensor([-1.1562e-01, -1.3862e-02, -9.0841e-03, -1.5214e-16, -4.8217e-01, 2.1329e-17], dtype=torch.float64, grad_fn=<SumBackward1>), tensor([ 1.7393e-01, -1.6412e-01, -5.3983e-03, -2.9366e-01, -4.0083e-01, 2.1134e-17], dtype=torch.float64, grad_fn=<SqueezeBackward3>)]
We have two VJPs; one per tape. Each one corresponds to the number of parameters on the tapes (6).
The output VJPs are also differentiable with respect to the tape parameters:
>>> cost = torch.sum(vjps[0] + vjps[1]) >>> cost.backward() >>> x.grad tensor([[-0.4792, -0.9086, -0.2420], [-0.0930, -1.0772, 0.0000]], dtype=torch.float64)