Compiling circuits

PennyLane offers multiple tools for compiling circuits. We use the term “compilation” here in a loose sense as the process of transforming one circuit into one or more differing circuits. A circuit could be either a quantum function or a sequence of operators. For example, such a transformation could replace a gate type with another, fuse gates, exploit mathematical relations that simplify an observable, or replace a large circuit by a number of smaller circuits.

Compilation functionality is mostly designed as transforms, which you can read up on in the section on inspecting circuits.

In addition to quantum circuit transforms, PennyLane also supports experimental just-in-time compilation, via Catalyst. This is more general, and supports full hybrid compilation: compiling both the classical and quantum components of your workflow into a binary that can be run close to the accelerators that you are using.

Simplifying Operators

PennyLane provides the simplify() function to simplify single operators, quantum functions, QNodes and tapes. This function has several purposes:

  • Reducing the arithmetic depth of the given operators to its minimum.

  • Grouping like terms in sums and products.

  • Resolving products of Pauli operators.

  • Combining identical rotation gates by summing their angles.

Here are some simple simplification routines:

>>> qml.simplify(qml.RX(4*np.pi+0.1, 0))
RX(0.09999999999999964, wires=[0])
>>> qml.simplify(qml.adjoint(qml.RX(1.23, 0)))
RX(11.336370614359172, wires=[0])
>>> qml.simplify(qml.ops.Pow(qml.RX(1, 0), 3))
RX(3.0, wires=[0])
>>> qml.simplify(qml.sum(qml.PauliY(3), qml.PauliY(3)))
2*(PauliY(wires=[3]))
>>> qml.simplify(qml.RX(1, 0) @ qml.RX(1, 0))
RX(2.0, wires=[0])
>>> qml.simplify(qml.prod(qml.PauliX(0), qml.PauliZ(0)))
-1j*(PauliY(wires=[0]))
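
The rotation results above follow from simple angle arithmetic: RX(θ) = exp(-iθX/2) repeats every 4π, taking the adjoint negates the angle, and an integer power multiplies it. The sketch below reproduces the first three angles in plain Python. The helper name wrap_angle is hypothetical and is not part of PennyLane; it only illustrates the arithmetic.

```python
import math

PERIOD = 4 * math.pi  # RX(theta) = exp(-i * theta * X / 2) repeats every 4*pi


def wrap_angle(theta):
    """Map a rotation angle into the interval [0, 4*pi). Hypothetical helper."""
    return theta % PERIOD


# RX(4*pi + 0.1) is the same gate as RX(0.1)
wrapped = wrap_angle(4 * math.pi + 0.1)

# adjoint(RX(1.23)) = RX(-1.23), which wraps to 4*pi - 1.23
adjoint_angle = wrap_angle(-1.23)

# RX(1) ** 3 = RX(3)
power_angle = wrap_angle(3 * 1.0)

print(wrapped, adjoint_angle, power_angle)
```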

Now let's simplify a nested operator:

>>> sum_op = qml.RX(1, 0) + qml.PauliX(0)
>>> prod1 = qml.PauliX(0) @ sum_op
>>> nested_op = prod1 @ qml.RX(1, 0)
>>> qml.simplify(nested_op)
(PauliX(wires=[0]) @ RX(2.0, wires=[0])) + RX(1.0, wires=[0])

Several simplification steps are happening here. First of all, the nested products are removed:

qml.prod(qml.PauliX(0), qml.sum(qml.RX(1, 0), qml.PauliX(0)), qml.RX(1, 0))

Then the product of sums is transformed into a sum of products:

qml.sum(qml.prod(qml.PauliX(0), qml.RX(1, 0), qml.RX(1, 0)), qml.prod(qml.PauliX(0), qml.PauliX(0), qml.RX(1, 0)))

And finally like terms in the obtained products are grouped together, removing all identities:

qml.sum(qml.prod(qml.PauliX(0), qml.RX(2, 0)), qml.RX(1, 0))
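
As a sanity check on these steps, the nested operator and its simplified form can be compared as plain 2x2 matrices. The sketch below uses only the standard library and the textbook matrix of RX; the helper names are illustrative and none of this is PennyLane code.

```python
import math


def matmul(a, b):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def add(a, b):
    """Add two 2x2 matrices entrywise."""
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]


def rx(theta):
    """Matrix of RX(theta) = exp(-i * theta * X / 2)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -1j * s], [-1j * s, c]]


X = [[0, 1], [1, 0]]

# Original nested operator: X @ (RX(1) + X) @ RX(1)
nested = matmul(matmul(X, add(rx(1.0), X)), rx(1.0))

# Simplified form: X @ RX(2) + RX(1), using X @ X = I and RX(1) @ RX(1) = RX(2)
simplified = add(matmul(X, rx(2.0)), rx(1.0))

max_diff = max(abs(nested[i][j] - simplified[i][j]) for i in range(2) for j in range(2))
print(max_diff)
```

The two matrices agree up to floating-point rounding, which is what the three rewriting steps predict.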

As mentioned earlier we can also simplify QNode objects to, for example, group rotation gates:

dev = qml.device("default.qubit", wires=3)

@qml.simplify
@qml.qnode(dev)
def circuit(x):
    (
        qml.RX(x[0], wires=0)
        @ qml.RY(x[1], wires=1)
        @ qml.RZ(x[2], wires=2)
        @ qml.RX(-1, wires=0)
        @ qml.RY(-2, wires=1)
        @ qml.RZ(2, wires=2)
    )
    return qml.probs([0, 1, 2])

>>> x = [1, 2, 3]
>>> print(qml.draw(circuit)(x))
0: ───────────┤ ╭Probs
1: ───────────┤ ├Probs
2: ──RZ(5.00)─┤ ╰Probs
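
The drawn output can be reproduced with a toy model of rotation merging: rotations of the same type on the same wire have their angles summed, and rotations whose angle ends up at zero are dropped. The function below is a simplified sketch written for this page (single-wire rotations only, gates stored as tuples); it is not how PennyLane implements simplify().

```python
def merge_rotations(gates, tol=1e-12):
    """Sum the angles of consecutive same-type rotations on the same wire.

    `gates` is a list of (name, wire, angle) tuples in application order.
    Toy model: every gate is assumed to be a single-wire rotation.
    """
    last_on_wire = {}  # wire -> index in `merged` of the latest gate on that wire
    merged = []
    for name, wire, angle in gates:
        idx = last_on_wire.get(wire)
        if idx is not None and merged[idx][0] == name:
            # Same rotation type directly before on this wire: add the angles
            merged[idx] = (name, wire, merged[idx][2] + angle)
        else:
            merged.append((name, wire, angle))
            last_on_wire[wire] = len(merged) - 1
    # Rotations by a zero angle are identities and can be removed
    return [gate for gate in merged if abs(gate[2]) > tol]


x = [1, 2, 3]
gates = [
    ("RX", 0, x[0]), ("RY", 1, x[1]), ("RZ", 2, x[2]),
    ("RX", 0, -1), ("RY", 1, -2), ("RZ", 2, 2),
]
result = merge_rotations(gates)
print(result)
```

Only the RZ rotation on wire 2 survives, with angle 3 + 2 = 5, matching the circuit drawing.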

Compilation transforms for circuit optimization

PennyLane includes multiple transforms that take quantum functions and return new quantum functions of optimized circuits:


  • cancel_inverses(): Quantum function transform to remove any operations that are applied next to their (self-)inverses or adjoint.

  • commute_controlled(): Quantum function transform to move commuting gates past control and target qubits of controlled operations.

  • merge_amplitude_embedding(): Quantum function transform to combine amplitude embedding templates that act on different qubits.

  • merge_rotations(): Quantum function transform to combine rotation gates of the same type that act sequentially.

  • pattern_matching(): Function that applies the pattern matching algorithm and returns the list of maximal matches.

  • remove_barrier(): Quantum function transform to remove Barrier gates.

  • single_qubit_fusion(): Quantum function transform to fuse together groups of single-qubit operations into a general single-qubit unitary operation (Rot).

  • undo_swaps(): Quantum function transform to remove SWAP gates by running from right to left through the circuit changing the position of the qubits accordingly.
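
To illustrate the idea behind removing operations that sit next to their own inverse, here is a minimal sketch on a list of gate tuples. It assumes a small hand-picked set of self-inverse gates and treats two gates as "next to each other" when no gate in between touches any of the same wires. The names SELF_INVERSE and cancel_adjacent_inverses are hypothetical; the real PennyLane transform handles more cases (for example adjoints and symmetric wire orderings).

```python
# Gates that are their own inverse (illustrative subset)
SELF_INVERSE = {"H", "X", "Y", "Z", "CNOT", "CZ"}


def cancel_adjacent_inverses(gates):
    """Remove pairs of identical self-inverse gates with nothing between them.

    `gates` is a list of (name, wires) tuples in application order.
    """
    kept = []
    for gate in gates:
        name, wires = gate
        # Find the most recent kept gate that shares a wire with this one
        idx = next(
            (i for i in range(len(kept) - 1, -1, -1) if set(kept[i][1]) & set(wires)),
            None,
        )
        if idx is not None and kept[idx] == gate and name in SELF_INVERSE:
            del kept[idx]  # the pair multiplies to the identity
        else:
            kept.append(gate)
    return kept


cancelled = cancel_adjacent_inverses(
    [("H", (0,)), ("CNOT", (0, 1)), ("X", (2,)), ("CNOT", (0, 1)), ("H", (0,))]
)
# An S gate between the two Hadamards blocks the cancellation
unchanged = cancel_adjacent_inverses([("H", (0,)), ("S", (0,)), ("H", (0,))])
print(cancelled, unchanged)
```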


Most compilation transforms support just-in-time compilation with jax.jit.

The compile() transform allows you to chain together sequences of quantum function transforms into custom circuit optimization pipelines.

For example, take the following decorated quantum function:

dev = qml.device('default.qubit', wires=[0, 1, 2])

@qml.qnode(dev)
@qml.compile()
def qfunc(x, y, z):
    qml.Hadamard(wires=0)
    qml.Hadamard(wires=1)
    qml.Hadamard(wires=2)
    qml.RZ(z, wires=2)
    qml.CNOT(wires=[2, 1])
    qml.RX(z, wires=0)
    qml.CNOT(wires=[1, 0])
    qml.RX(x, wires=0)
    qml.CNOT(wires=[1, 0])
    qml.RZ(-z, wires=2)
    qml.RX(y, wires=2)
    qml.PauliY(wires=2)
    qml.CZ(wires=[1, 2])
    return qml.expval(qml.PauliZ(wires=0))

The default behaviour of compile() applies a sequence of three transforms: commute_controlled(), cancel_inverses(), and then merge_rotations().

>>> print(qml.draw(qfunc)(0.2, 0.3, 0.4))
0: ──H──RX(0.60)─────────────────┤  <Z>
1: ──H─╭X─────────────────────╭●─┤
2: ──H─╰●─────────RX(0.30)──Y─╰Z─┤
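
The single RX(0.60) on wire 0 can be checked by hand. In qfunc, both RX gates on wire 0 act on the target of CNOT(wires=[1, 0]); an X rotation commutes with the target of a CNOT, so it can be pushed through, the two CNOTs then cancel, and the rotations merge. The sketch below verifies this with explicit matrices using only the standard library. It is a numerical check of the underlying identities, not a reproduction of the PennyLane transforms.

```python
import math


def matmul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def kron(a, b):
    """Kronecker product of two 2x2 matrices; `a` acts on the first qubit."""
    return [
        [a[i][j] * b[k][l] for j in range(2) for l in range(2)]
        for i in range(2)
        for k in range(2)
    ]


def rx(theta):
    """Matrix of RX(theta) = exp(-i * theta * X / 2)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -1j * s], [-1j * s, c]]


def max_abs_diff(a, b):
    """Largest entrywise difference between two square matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(len(a)) for j in range(len(a)))


I2 = [[1, 0], [0, 1]]
# CNOT with the first qubit as control and the second as target
CNOT = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]

x, z = 0.2, 0.4
rx_x = kron(I2, rx(x))  # RX(x) on the target qubit
rx_z = kron(I2, rx(z))  # RX(z) on the target qubit

# An X rotation on the target commutes with CNOT
commutator_gap = max_abs_diff(matmul(CNOT, rx_x), matmul(rx_x, CNOT))

# Circuit order RX(z), CNOT, RX(x), CNOT; matrices multiply right to left
sequence = matmul(CNOT, matmul(rx_x, matmul(CNOT, rx_z)))
merged_gap = max_abs_diff(sequence, kron(I2, rx(x + z)))
print(commutator_gap, merged_gap)
```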

The compile() transform is flexible and accepts a custom pipeline of quantum function transforms (you can even write your own!). For example, if we wanted to only push single-qubit gates through controlled gates and cancel adjacent inverses, we could do:

from pennylane.transforms import commute_controlled, cancel_inverses
pipeline = [commute_controlled, cancel_inverses]

@qml.qnode(dev)
@qml.compile(pipeline=pipeline)
def qfunc(x, y, z):
    qml.Hadamard(wires=0)
    qml.Hadamard(wires=1)
    qml.Hadamard(wires=2)
    qml.RZ(z, wires=2)
    qml.CNOT(wires=[2, 1])
    qml.RX(z, wires=0)
    qml.CNOT(wires=[1, 0])
    qml.RX(x, wires=0)
    qml.CNOT(wires=[1, 0])
    qml.RZ(-z, wires=2)
    qml.RX(y, wires=2)
    qml.PauliY(wires=2)
    qml.CZ(wires=[1, 2])
    return qml.expval(qml.PauliZ(wires=0))

>>> print(qml.draw(qfunc)(0.2, 0.3, 0.4))
0: ──H──RX(0.40)──RX(0.20)────────────────────────────┤  <Z>
1: ──H─╭X──────────────────────────────────────────╭●─┤
2: ──H─╰●─────────RZ(0.40)──RZ(-0.40)──RX(0.30)──Y─╰Z─┤


The Barrier operator can be used to prevent blocks of code from being merged during compilation.

For more details on compile() and the available compilation transforms, visit the compilation documentation.

Custom decompositions

PennyLane decomposes gates unknown to the device into other, “lower-level” gates. As a user, you may want to fine-tune this mechanism. For example, you may wish your circuit to use different fundamental gates.

Suppose we would like to implement the following QNode:

def circuit(weights):
    qml.BasicEntanglerLayers(weights, wires=[0, 1, 2])
    return qml.expval(qml.PauliZ(0))

original_dev = qml.device("default.qubit", wires=3)
original_qnode = qml.QNode(circuit, original_dev)
>>> weights = np.array([[0.4, 0.5, 0.6]])
>>> print(qml.draw(original_qnode, expansion_strategy="device")(weights))
0: ──RX(0.40)─╭●────╭X─┤  <Z>
1: ──RX(0.50)─╰X─╭●─│──┤
2: ──RX(0.60)────╰X─╰●─┤

Now, let’s replace PennyLane’s default handling of the CNOT gate with a custom decomposition into CZ and Hadamard gates. We define the custom decompositions like so, and pass them to a device:

def custom_cnot(wires):
    return [
        qml.Hadamard(wires=wires[1]),
        qml.CZ(wires=[wires[0], wires[1]]),
        qml.Hadamard(wires=wires[1]),
    ]

custom_decomps = {qml.CNOT: custom_cnot}

decomp_dev = qml.device("default.qubit", wires=3, custom_decomps=custom_decomps)
decomp_qnode = qml.QNode(circuit, decomp_dev)

Now when we draw or run a QNode on this device, the gates will be expanded according to our specifications:

>>> print(qml.draw(decomp_qnode, expansion_strategy="device")(weights))
0: ──RX(0.40)────╭●──H───────╭Z──H─┤  <Z>
1: ──RX(0.50)──H─╰Z──H─╭●────│─────┤
2: ──RX(0.60)──H───────╰Z──H─╰●────┤
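
The custom rule relies on the standard identity CNOT = (I ⊗ H) · CZ · (I ⊗ H), with the Hadamards acting on the target wire. A quick numerical check with plain nested lists (a sketch independent of PennyLane, with illustrative helper names) looks like this:

```python
import math


def matmul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def kron(a, b):
    """Kronecker product of two 2x2 matrices; `a` acts on the first qubit."""
    return [
        [a[i][j] * b[k][l] for j in range(2) for l in range(2)]
        for i in range(2)
        for k in range(2)
    ]


s = 1 / math.sqrt(2)
I2 = [[1, 0], [0, 1]]
H = [[s, s], [s, -s]]

# First qubit is the control, second qubit is the target
CNOT = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]
CZ = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, -1]]

h_on_target = kron(I2, H)
decomposed = matmul(h_on_target, matmul(CZ, h_on_target))

gap = max(abs(decomposed[i][j] - CNOT[i][j]) for i in range(4) for j in range(4))
print(gap)
```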


If the custom decomposition is only supposed to be used in a specific code context, a separate context manager set_decomposition() can be used.

Circuit cutting

Circuit cutting allows you to replace a circuit with N wires by a set of circuits with fewer than N wires (see also Peng et al.). Of course this comes with a cost: the smaller circuits require a greater number of device executions to be evaluated.

In PennyLane, circuit cutting can be activated by positioning WireCut operators at the desired cut locations, and by decorating the QNode with the cut_circuit() transform.

The example below shows how a three-wire circuit can be run on a two-wire device:

dev = qml.device("default.qubit", wires=2)

@qml.cut_circuit
@qml.qnode(dev)
def circuit(x):
    qml.RX(x, wires=0)
    qml.RY(0.9, wires=1)
    qml.RX(0.3, wires=2)

    qml.CZ(wires=[0, 1])
    qml.RY(-0.4, wires=0)

    qml.WireCut(wires=1)

    qml.CZ(wires=[1, 2])

    return qml.expval(qml.pauli.string_to_pauli_word("ZZZ"))

Instead of being executed directly, the circuit will be partitioned into smaller fragments according to the WireCut locations, and each fragment will be executed multiple times. PennyLane automatically combines the results of the fragment executions to recover the expected output of the original uncut circuit.
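
The recombination step rests on a simple identity used in the approach of Peng et al.: any single-qubit state on the cut wire can be expanded in the Pauli basis as ρ = ½ Σ_P Tr(Pρ) P with P ∈ {I, X, Y, Z}. One fragment supplies the expectation values Tr(Pρ) and the other fragment is run with the corresponding operators fed in. The sketch below only checks this identity numerically for an arbitrary state; it is not PennyLane's cut_circuit() implementation, and the chosen angles are illustrative.

```python
import cmath
import math

PAULIS = {
    "I": [[1, 0], [0, 1]],
    "X": [[0, 1], [1, 0]],
    "Y": [[0, -1j], [1j, 0]],
    "Z": [[1, 0], [0, -1]],
}


def trace_of_product(a, b):
    """Return Tr(a @ b) for 2x2 matrices."""
    return sum(a[i][k] * b[k][i] for i in range(2) for k in range(2))


# Arbitrary single-qubit pure state on the cut wire (illustrative angles)
theta, phi = 0.9, 0.3
psi = [complex(math.cos(theta / 2)), cmath.exp(1j * phi) * math.sin(theta / 2)]
rho = [[psi[i] * psi[j].conjugate() for j in range(2)] for i in range(2)]

# Expectation values from the "measurement" side, re-inserted on the "preparation" side
reconstructed = [[0j, 0j], [0j, 0j]]
for pauli in PAULIS.values():
    weight = trace_of_product(pauli, rho) / 2
    for i in range(2):
        for j in range(2):
            reconstructed[i][j] += weight * pauli[i][j]

max_diff = max(abs(reconstructed[i][j] - rho[i][j]) for i in range(2) for j in range(2))
print(max_diff)
```

Each cut therefore multiplies the number of terms to evaluate, which is the source of the extra device executions mentioned above.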

>>> x = np.array(0.531, requires_grad=True)
>>> circuit(x)
0.47165198882111165

Circuit cutting support is also differentiable:

>>> qml.grad(circuit)(x)
-0.276982865449393
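
For this particular circuit, the value that the cut circuit should reproduce can be worked out by hand. Z observables commute with both CZ gates, and the RY(-0.4) rotation on wire 0 turns Z into cos(0.4) Z + sin(0.4) X; the X contribution vanishes because RX(x) applied to |0⟩ has no X component. This gives ⟨ZZZ⟩ = cos(0.4) cos(x) cos(0.9) cos(0.3). The sketch below evaluates this hand-derived formula and its derivative; treat it as a sanity check under that derivation rather than as a reference implementation.

```python
import math


def expected_zzz(x):
    """Hand-derived closed form of <Z0 Z1 Z2> for the example circuit."""
    return math.cos(0.4) * math.cos(x) * math.cos(0.9) * math.cos(0.3)


def expected_gradient(x):
    """Derivative of expected_zzz with respect to x."""
    return -math.cos(0.4) * math.sin(x) * math.cos(0.9) * math.cos(0.3)


value = expected_zzz(0.531)
gradient = expected_gradient(0.531)

# Central finite difference as an independent check of the derivative
eps = 1e-6
finite_diff = (expected_zzz(0.531 + eps) - expected_zzz(0.531 - eps)) / (2 * eps)
print(value, gradient, finite_diff)
```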


Simulated quantum circuits that produce samples can be cut using the cut_circuit_mc() transform, which is based on the Monte Carlo method.

Groups of commuting Pauli words

Mutually commuting Pauli words can be measured simultaneously on a quantum computer. Finding groups of mutually commuting observables can therefore reduce the number of circuit executions, and is an example of how observables can be “compiled”.

PennyLane contains different functionalities for this purpose, ranging from higher-level transforms acting on QNodes to lower-level functions acting on operators.

An example of a transform manipulating QNodes is split_non_commuting(). It turns a QNode that measures non-commuting observables into a QNode that internally uses multiple circuit executions with qubit-wise commuting groups. The transform is used by devices to make such measurements possible.

On a lower level, the group_observables() function can be used to split lists of observables and coefficients:

>>> obs = [qml.PauliY(0), qml.PauliX(0) @ qml.PauliX(1), qml.PauliZ(1)]
>>> coeffs = [1.43, 4.21, 0.97]
>>> obs_groupings, coeffs_groupings = qml.pauli.group_observables(obs, coeffs, 'anticommuting', 'lf')
>>> obs_groupings
[[PauliZ(wires=[1]), PauliX(wires=[0]) @ PauliX(wires=[1])],
 [PauliY(wires=[0])]]
>>> coeffs_groupings
[[0.97, 4.21], [1.43]]
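
The relations behind such groupings are easy to check on Pauli words written as strings. Two Pauli words commute exactly when the number of wires on which they act with different non-identity Paulis is even; they commute qubit-wise when there is no such wire at all. The helper functions below are an illustrative sketch with hypothetical names and are not the functions of the pauli module.

```python
def clashes(word_a, word_b):
    """Count wires where both words act with different non-identity Paulis."""
    return sum(1 for a, b in zip(word_a, word_b) if a != "I" and b != "I" and a != b)


def commute(word_a, word_b):
    """Pauli words commute iff they clash on an even number of wires."""
    return clashes(word_a, word_b) % 2 == 0


def qubit_wise_commute(word_a, word_b):
    """Stricter relation: the words must not clash on any wire."""
    return clashes(word_a, word_b) == 0


# Observables from the example above, written on wires (0, 1)
y0, x0x1, z1 = "YI", "XX", "IZ"

print(commute(z1, x0x1))              # False: Z1 and X0 X1 anticommute
print(commute(y0, x0x1))              # False: Y0 and X0 X1 anticommute
print(commute(y0, z1))                # True: they act on different wires
print(commute("XX", "ZZ"))            # True, although not qubit-wise
print(qubit_wise_commute("XX", "ZZ"))  # False
```

This is consistent with the 'anticommuting' grouping shown above: PauliZ(1) and PauliX(0) @ PauliX(1) anticommute and share a group, while PauliY(0) commutes with PauliZ(1) and therefore ends up in a group of its own.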

This and more logic to manipulate Pauli observables is found in the pauli module.

Just-in-time compilation with Catalyst

In addition to quantum circuit transformations, PennyLane also supports full hybrid just-in-time (JIT) compilation via Catalyst. Catalyst allows you to compile the entire quantum-classical workflow, including any optimization loops, which improves performance and lets the entire workflow run on accelerator devices as appropriate.

Currently, Catalyst must be installed separately, and only supports the JAX interface and lightning.qubit. Check out the Catalyst documentation for installation instructions.

Using Catalyst with PennyLane is as simple as using the @qjit decorator to compile your hybrid workflows:

from catalyst import qjit
from jax import numpy as jnp

dev = qml.device("lightning.qubit", wires=2, shots=1000)

@qjit
@qml.qnode(dev)
def cost(params):
    qml.RX(jnp.sin(params[0]) ** 2, wires=1)
    qml.CRY(params[0], wires=[0, 1])
    qml.RX(jnp.sqrt(params[1]), wires=1)
    return qml.expval(qml.PauliZ(1))

The qjit decorator can also be used on hybrid cost functions – that is, cost functions that include both QNodes and classical processing. We can even JIT compile the full optimization loop, for example when training models:

import jax
import jaxopt

@qjit
def optimization():
    # initial parameter
    params = jnp.array([0.54, 0.3154])

    # define the optimizer
    opt = jaxopt.GradientDescent(cost, stepsize=0.4)
    update = lambda i, args: tuple(opt.update(*args))

    # perform optimization loop
    state = opt.init_state(params)
    (params, _) = jax.lax.fori_loop(0, 100, update, (params, state))

    return params

Finally, Catalyst provides additional features to PennyLane, such as classical control of quantum operations that are JIT-enabled, via the functions catalyst.for_loop and catalyst.cond. It also enables arbitrary post-processing of mid-circuit measurements.

For more details, see the Catalyst documentation and tutorials.