Source code for pennylane.gradients.jvp

# Copyright 2018-2021 Xanadu Quantum Technologies Inc.

# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at

#     http://www.apache.org/licenses/LICENSE-2.0

# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""
This module contains functions for computing the Jacobian vector product
of tapes.
"""
from collections.abc import Sequence

import numpy as np

import pennylane as qml
from pennylane._device import _get_num_copies
from pennylane.measurements import ProbabilityMP


[docs]def compute_jvp_single(tangent, jac): r"""Convenience function to compute the Jacobian vector product for a given tangent vector and a Jacobian for a single measurement tape. Args: tangent (list, tensor_like): tangent vector jac (tensor_like, tuple): Jacobian matrix Returns: tensor_like: the Jacobian vector product **Examples** We start with a number of examples. A more complete, technical description is given further below. 1. For a single parameter and a single measurement without shape (e.g. ``expval``, ``var``): .. code-block:: pycon >>> tangent = np.array([1.0]) >>> jac = np.array(0.2) >>> qml.gradients.compute_jvp_single(tangent, jac) np.array(0.2) 2. For a single parameter and a single measurement with shape (e.g. ``probs``): .. code-block:: pycon >>> tangent = np.array([2.0]) >>> jac = np.array([0.3, 0.4]) >>> qml.gradients.compute_jvp_single(tangent, jac) np.array([0.6, 0.8]) 3. For multiple parameters (in this case 2 parameters) and a single measurement without shape (e.g. ``expval``, ``var``): .. code-block:: pycon >>> tangent = np.array([1.0, 2.0]) >>> jac = tuple([np.array(0.1), np.array(0.2)]) >>> qml.gradients.compute_jvp_single(tangent, jac) np.array(0.5) 4. For multiple parameters (in this case 2 parameters) and a single measurement with shape (e.g. ``probs``): .. code-block:: pycon >>> tangent = np.array([1.0, 0.5]) >>> jac = tuple([np.array([0.1, 0.3]), np.array([0.2, 0.4])]) >>> qml.gradients.compute_jvp_single(tangent, jac) np.array([0.2, 0.5]) .. details:: :title: Technical description :href: technical-description There are multiple case distinctions in this function, for particular examples see above. - The JVP may be for one **(A)** or multiple **(B)** parameters. We call the number of parameters ``k`` - The number ``R`` of tape return type dimensions may be between 0 and 3. We call the return type dimensions ``r_j`` - Each parameter may have an arbitrary number ``L_i>=0`` of dimensions In the following, ``(a, b)`` denotes a tensor_like of shape ``(a, b)`` and ``[(a,), (b,)]`` / ``((a,), (b,))`` denotes a ``list`` / ``tuple`` of tensors with the indicated shapes, respectively. Ignore the case of no trainable parameters, as it is filtered out in advance. For scenario **(A)**, the input shapes can be in .. list-table:: :widths: 30 40 30 :header-rows: 1 * - ``tangent`` shape - ``jac`` shape - Comment * - ``(1,)`` or ``[()]`` or ``(())`` - ``()`` - scalar return, scalar parameter * - ``(1,)`` or ``[()]`` or ``(())`` - ``(r_1,..,r_R)`` - tensor return, scalar parameter * - ``[(l_1,..,l_{L_1})]`` [1] - ``(l_1,..,l_{L_1})`` - scalar return, tensor parameter * - ``[(l_1,..,l_{L_1})]`` [1] - ``(r_1,..,r_R, l_1,..,l_{L_1})`` - tensor return, tensor parameter [1] Note that intuitively, ``tangent`` could be allowed to be a tensor of shape ``(l_1,..,l_{L_1})`` without an outer list. However, this is excluded in order to allow for the distinction from scenario **(B)**. Internally, this input shape for ``tangent`` never occurs for scenario **(A)**. In this scenario, the tangent is reshaped into a one-dimensional tensor with shape ``(tangent_size,)`` and the Jacobian is reshaped to have the dimensions ``(r_1, ... r_R, tangent_size)``. This is followed by a ``tensordot`` contraction over the ``tangent_size`` axis of both tensors. For scenario **(B)**, the input shapes can be in .. list-table:: :widths: 30 40 30 :header-rows: 1 * - ``tangent`` shape - ``jac`` shape - Comment * - ``(k,)`` or ``[(),..,()]`` or ``((),..,())`` - ``((),..,())`` (length ``k``) - scalar return, ``k`` scalar parameters * - ``(k,)`` or ``[(),..,()]`` or ``((),..,())`` - ``((r_1,..,r_R),..,(r_1,..,r_R))`` [1] - tensor return, ``k`` scalar parameters * - ``[(l_1,..,l_{L_1}),..,(l_1,..,l_{L_k})]`` - ``((l_1,..,l_{L_1}),..,(l_1,..,l_{L_k}))`` - scalar return, ``k`` tensor parameters * - ``[(l_1,..,l_{L_1}),..,(l_1,..,l_{L_k})]`` - ``((r_1,..,r_R, l_1,..,l_{L_1}),..,(r_1,..,r_R, l_1,..,l_{L_k}))`` [1] - tensor return, ``k`` tensor parameters [1] Note that the return type dimensions ``(r_1,..,r_R)`` are the same for all entries of ``jac``, whereas the dimensions of the entries in ``tanget``, and the according dimensions ``(l_1,..,l_{L_k})`` of the ``jac`` entries may differ. In this scenario, another case separation is used: If any of the parameters is a tensor (i.e. not a scalar), all tangent entries are reshaped into one-dimensional tensors with shapes ``(tangent_size_i,)`` and then stacked into one one-dimensional tensor. If there are no tensor parameters, the tangent is just stacked and reshaped. The Jacobians are reshaped to have the dimensions ``(r_1, ... r_R, tangent_size_i)`` and then are concatenated along their last (potentially mismatching) axis. This is followed by a tensordot contraction over the axes of size :math:`\sum_i` ``tangent_size_i``. """ if jac is None: return None single_param = not isinstance(jac, tuple) if (single_param and jac.shape == (0,)) or (not single_param and len(jac) == 0): # No trainable parameters return qml.math.zeros((1, 0)) if single_param: tangent = qml.math.stack(tangent) first_tangent_ndim = len(tangent.shape[1:]) tangent = qml.math.flatten(tangent) tangent_size = tangent.shape[0] shape = jac.shape new_shape = shape[: len(shape) - first_tangent_ndim] + (tangent_size,) jac = qml.math.cast(qml.math.convert_like(jac, tangent), tangent.dtype) jac = qml.math.reshape(jac, new_shape) return qml.math.tensordot(jac, tangent, [[-1], [0]]) tangent_ndims = [t.ndim for t in tangent] if isinstance(tangent, (tuple, list)) and any(t.ndim > 0 for t in tangent): # At least one tangent entry is not a scalar, requiring us to flatten them and hstack tangent = [qml.math.flatten(t) for t in tangent] tangent_sizes = [t.shape[0] for t in tangent] tangent = qml.math.hstack(tangent) else: # Only scalar tangent entries, no flattening required and we may use stack tangent_sizes = [1] * len(tangent) tangent = qml.math.stack(tangent) jac_shapes = [j.shape for j in jac] new_shapes = [ shape[: len(shape) - t_ndim] + (tsize,) for shape, t_ndim, tsize in zip(jac_shapes, tangent_ndims, tangent_sizes) ] jac = qml.math.concatenate([qml.math.reshape(j, s) for j, s in zip(jac, new_shapes)], axis=-1) jac = qml.math.cast(qml.math.convert_like(jac, tangent), tangent.dtype) return qml.math.tensordot(jac, tangent, [[-1], [0]])
[docs]def compute_jvp_multi(tangent, jac): """Convenience function to compute the Jacobian-vector product for a given vector of gradient outputs and a Jacobian for a tape with multiple measurements. Args: tangent (tensor_like, list): tangent vector jac (tensor_like, tuple): Jacobian matrix Returns: tensor_like: the Jacobian-vector product **Examples** 1. For a single parameter and multiple measurements (one without shape and one with shape, e.g. expval and probs): .. code-block:: pycon >>> tangent = np.array([2.0]) >>> jac = tuple([np.array([0.3]), np.array([0.2, 0.5])]) >>> qml.gradients.compute_jvp_multi(tangent, jac) (np.array([0.6]), np.array([0.4, 1. ])) 2. For multiple parameters (in this case 2 parameters) and multiple measurements (one without shape and one with shape, e.g. expval and probs): .. code-block:: pycon >>> tangent = np.array([1.0, 2.0]) >>> jac = tuple([tuple([np.array([0.3]), np.array([0.4])]), tuple([np.array([0.2, 0.5]), np.array([0.3, 0.8])]),]) >>> qml.gradients.compute_jvp_multi(tangent, jac) (np.array([1.1]), np.array([0.8, 2.1])) """ if jac is None: return None return tuple(compute_jvp_single(tangent, j) for j in jac)
[docs]def jvp(tape, tangent, gradient_fn, shots=None, gradient_kwargs=None): r"""Generate the gradient tapes and processing function required to compute the Jacobian vector product of a tape. This function only works with the new return type system on. Args: tape (.QuantumTape): quantum tape to differentiate tangent (tensor_like, list): Gradient-output vector. Must have shape matching the number of trainable parameters. gradient_fn (callable): the gradient transform to use to differentiate the tape shots (None, int, list[int]): The device shots that will be used to execute the tapes outputted by this gradient_kwargs (dict): dictionary of keyword arguments to pass when determining the gradients of tapes Returns: tensor_like or tuple or None: Jacobian vector product. Returns None if the tape has no trainable parameters. **Example** Consider the following quantum tape with Jax parameters: .. code-block:: python import jax qml.enable_return() x = jax.numpy.array([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]) with qml.tape.QuantumTape() as tape: qml.RX(x[0, 0], wires=0) qml.RY(x[0, 1], wires=1) qml.RZ(x[0, 2], wires=0) qml.CNOT(wires=[0, 1]) qml.RX(x[1, 0], wires=1) qml.RY(x[1, 1], wires=0) qml.RZ(x[1, 2], wires=1) qml.expval(qml.PauliZ(0)) qml.probs(wires=1) We can use the ``jvp`` function to compute the Jacobian vector product, given a tangent vector ``tangent``: >>> tangent = [jax.numpy.array(1.0), jax.numpy.array(1.0), jax.numpy.array(1.0), jax.numpy.array(1.0), jax.numpy.array(1.0), jax.numpy.array(1.0)] >>> jvp_tapes, fn = qml.gradients.jvp(tape, tangent, qml.gradients.param_shift) Note that ``tangent`` has six elements, matching the parameter dimension of the tape. Executing the JVP tapes, and applying the processing function: >>> dev = qml.device("default.qubit", wires=2) >>> jvp = fn(dev.batch_execute(jvp_tapes)) >>> jvp (Array(-0.62073976, dtype=float32), Array([-0.3259707 , 0.32597077], dtype=float32)) """ if len(tape.trainable_params) == 0: # The tape has no trainable parameters; the JVP # is simply none. return [], lambda _, num=None: None multi_m = len(tape.measurements) > 1 try: # if qml.math.allclose(qml.math.stack(tangent), 0): if qml.math.allclose(tangent, 0): # If the tangent vector is zero, then the # corresponding element of the JVP will be zero, # and we can avoid a quantum computation. def func(_): # pylint: disable=unused-argument # TODO: Update shape for CV variables and for qutrit simulations res = tuple(_single_measurement_zero(m, tangent) for m in tape.measurements) if not multi_m: res = res[0] return res return [], func except (AttributeError, TypeError): pass gradient_kwargs = gradient_kwargs or {} gradient_tapes, fn = gradient_fn(tape, shots=shots, **gradient_kwargs) def processing_fn(results): # postprocess results to compute the Jacobian jac = fn(results) _jvp_fn = compute_jvp_multi if multi_m else compute_jvp_single # Jacobian without shot vectors if not isinstance(shots, Sequence): return _jvp_fn(tangent, jac) # The jacobian is calculated for shot vectors len_shot_vec = _get_num_copies(shots) return tuple(_jvp_fn(tangent, jac[i]) for i in range(len_shot_vec)) return gradient_tapes, processing_fn
[docs]def batch_jvp(tapes, tangents, gradient_fn, shots=None, reduction="append", gradient_kwargs=None): r"""Generate the gradient tapes and processing function required to compute the Jacobian vector products of a batch of tapes. Args: tapes (Sequence[.QuantumTape]): sequence of quantum tapes to differentiate tangents (Sequence[tensor_like]): Sequence of gradient-output vectors ``dy``. Must be the same length as ``tapes``. Each ``dy`` tensor should have shape matching the output shape of the corresponding tape. gradient_fn (callable): the gradient transform to use to differentiate the tapes shots (None, int, list[int]): The device shots that will be used to execute the tapes outputted by this reduction (str): Determines how the Jacobian-vector products are returned. If ``append``, then the output of the function will be of the form ``List[tensor_like]``, with each element corresponding to the JVP of each input tape. If ``extend``, then the output JVPs will be concatenated. gradient_kwargs (dict): dictionary of keyword arguments to pass when determining the gradients of tapes Returns: List[tensor_like or None]: list of Jacobian vector products. ``None`` elements corresponds to tapes with no trainable parameters. **Example** .. code-block:: python import jax qml.enable_return() x = jax.numpy.array([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]) def ansatz(x): qml.RX(x[0, 0], wires=0) qml.RY(x[0, 1], wires=1) qml.RZ(x[0, 2], wires=0) qml.CNOT(wires=[0, 1]) qml.RX(x[1, 0], wires=1) qml.RY(x[1, 1], wires=0) qml.RZ(x[1, 2], wires=1) with qml.tape.QuantumTape() as tape1: ansatz(x) qml.expval(qml.PauliZ(0)) qml.probs(wires=1) with qml.tape.QuantumTape() as tape2: ansatz(x) qml.expval(qml.PauliZ(0) @ qml.PauliZ(1)) tapes = [tape1, tape2] Both tapes share the same circuit ansatz, but have different measurement outputs. We can use the ``batch_jvp`` function to compute the Jacobian vector product, given a list of tangents ``tangent``: >>> tangent_0 = [jax.numpy.array(1.0), jax.numpy.array(1.0), jax.numpy.array(1.0), jax.numpy.array(1.0), jax.numpy.array(1.0), jax.numpy.array(1.0)] >>> tangent_1 = [jax.numpy.array(1.0), jax.numpy.array(1.0), jax.numpy.array(1.0), jax.numpy.array(1.0), jax.numpy.array(1.0), jax.numpy.array(1.0)] >>> tangents = [tangent_0, tangent_1] Note that each ``tangents`` has shape matching the parameter dimension of the tape. Executing the JVP tapes, and applying the processing function: >>> jvp_tapes, fn = qml.gradients.batch_jvp(tapes, tangents, qml.gradients.param_shift) >>> dev = qml.device("default.qubit", wires=2) >>> jvps = fn(dev.batch_execute(jvp_tapes)) >>> jvps [(Array(-0.62073976, dtype=float32), Array([-0.3259707 , 0.32597077], dtype=float32)), Array(-0.6900841, dtype=float32)] We have two JVPs; one per tape. Each one corresponds to the shape of the output of their respective tape. """ # pylint: disable=too-many-arguments gradient_kwargs = gradient_kwargs or {} reshape_info = [] gradient_tapes = [] processing_fns = [] # Loop through the tapes and dys vector for tape, tangent in zip(tapes, tangents): g_tapes, fn = jvp(tape, tangent, gradient_fn, shots, gradient_kwargs) reshape_info.append(len(g_tapes)) processing_fns.append(fn) gradient_tapes.extend(g_tapes) def processing_fn(results): jvps = [] start = 0 for t_idx in range(len(tapes)): # extract the correct results from the flat list res_len = reshape_info[t_idx] res_t = results[start : start + res_len] start += res_len # postprocess results to compute the JVP jvp_ = processing_fns[t_idx](res_t) if jvp_ is None: if reduction == "append": jvps.append(None) continue if isinstance(reduction, str): getattr(jvps, reduction)(jvp_) elif callable(reduction): reduction(jvps, jvp_) return jvps return gradient_tapes, processing_fn
def _single_measurement_zero(m, tangent): """Aux function to create a zero tensor from a measurement.""" dim = 2 ** len(m.wires) if isinstance(m, ProbabilityMP) else () res = qml.math.convert_like(np.zeros(dim), tangent) res = qml.math.cast_like(res, tangent) return res