Release notes
This page contains the release notes for Catalyst.
Release 0.14.0 (development release)
New features since last release
Improvements 🛠
The decompose-lowering MLIR pass now supports qml.MultiRZ with an arbitrary number of wires. This decomposition is performed at MLIR when both capture and graph-decomposition are enabled. (#2160)
A new option use_nameloc has been added to qjit() that embeds variable names from Python into the compiler IR, which can make it easier to read when debugging programs. (#2054)
Passes registered under qml.transform can now take in options when used with qjit() with program capture enabled. (#2154)
Pytree inputs can now be used when program capture is enabled. (#2165)
Breaking changes 💔
Deprecations 👋
Bug fixes 🐛
Fixes the issue with capturing unutilized abstracted adjoint and controlled rules by the graph in the new decomposition framework. (#2160)
Fixes the translation of plxpr control flow for edge cases where the consts were being reordered. (#2128) (#2133)
Fixes the translation of QubitUnitary and GlobalPhase ops when they are modified by adjoint or control. (#2158)
Fixes the translation of a workflow with different transforms applied to different qnodes. (#2167)
Internal changes ⚙️
Refactor Catalyst pass registering so that it’s no longer necessary to manually add new passes at registerAllCatalystPasses. (#1984)
Split from_plxpr.py into two files. (#2142)
Re-work DataView to avoid an axis of size 0 possibly triggering a segfault via an underflow error, as discovered in this comment. (#1621)
Documentation 📝
A typo in the code example for ppr_to_ppm() has been corrected. (#2136)
Fix catalyst.qjit and catalyst.CompileOptions docs rendering. (#2156)
Contributors ✍️
This release contains contributions from (in alphabetical order):
Ali Asadi, Christina Lee, River McCubbin, Roberto Turrado, Paul Haochen Wang.
Release 0.13.0 (current release)
New features since last release
Catalyst now supports qml.specs, meaning that users can use the qml.specs function to track the exact resources of programs compiled with qjit()! This new feature is currently only supported when using level="device". (#2033) (#2055)
This is made possible by leveraging resource-tracking capabilities using the null.qubit device under the hood, which gathers circuit information via mock execution. This makes getting exact resources from large circuits extremely performant. For example, the circuit below has 100 qubits and its device-level resources can be calculated in around 1 minute!

    from functools import partial

    import pennylane as qml

    gateset = {qml.H, qml.S, qml.CNOT, qml.T, qml.RX, qml.RY, qml.RZ}

    @qml.qjit
    @partial(qml.transforms.decompose, gate_set=gateset)
    @qml.qnode(qml.device("null.qubit", wires=100))
    def circuit():
        qml.QFT(wires=range(100))
        qml.Hadamard(wires=0)
        qml.CNOT(wires=[0, 1])
        qml.OutAdder(x_wires=range(10), y_wires=range(10, 20), output_wires=range(20, 31))
        return qml.expval(qml.Z(0) @ qml.Z(1))

    circ_specs = qml.specs(circuit, level="device")()

    >>> print(circ_specs['resources'])
    num_wires: 100
    num_gates: 138134
    depth: 90142
    shots: Shots(total=None)
    gate_types:
    {'CNOT': 55313, 'RZ': 82698, 'Hadamard': 123}
    gate_sizes:
    {2: 55313, 1: 82821}
Note that there are certain limitations to specs support. For example, while loops might not terminate when executing on the null.qubit device due to the quantum execution being mocked out.
The graph-based decomposition system, enabled with the global toggle qml.decomposition.enable_graph(), is now supported with Catalyst with PennyLane program capture enabled (qml.capture.enable()). This provides qjit() compatibility to defining custom decomposition rules and access to the many decomposition rules for templates and operators in PennyLane that have been added over the past few release cycles. (#1820) (#2099) (#2091) (#2029) (#2001) (#2115)

    from functools import partial

    import pennylane as qml

    qml.decomposition.enable_graph()
    qml.capture.enable()

    @qml.register_resources({qml.H: 2, qml.CZ: 1})
    def my_cnot1(wires):
        qml.H(wires=wires[1])
        qml.CZ(wires=wires)
        qml.H(wires=wires[1])

    @qml.qjit
    @partial(
        qml.transforms.decompose,
        gate_set={"H", "CZ", "GlobalPhase"},
        alt_decomps={qml.CNOT: [my_cnot1]},
    )
    @qml.qnode(qml.device("lightning.qubit", wires=2))
    def circuit():
        qml.H(0)
        qml.CNOT(wires=[0, 1])
        return qml.state()

    >>> circuit()
    Array([0.70710678+0.j, 0.        +0.j, 0.        +0.j, 0.70710678+0.j], dtype=complex128)

Similar to PennyLane’s behaviour, this feature will fall back to the old system whenever the graph cannot find decomposition rules for all unsupported operators in the program, and a UserWarning is raised.
For more information, please consult the PennyLane decomposition module.
Catalyst now supports dynamic wire allocation with qml.allocate() and qml.deallocate() when program capture is enabled, unlocking qjit-able applications like decompositions of gates that require temporary auxiliary wires and logical patterns in subroutines that benefit from having dynamic wire management. (#2002) (#2075)
Two new functions, qml.allocate() and qml.deallocate(), have been added to PennyLane to support dynamic wire allocation. With Catalyst, these features can be accessed on lightning.qubit, lightning.kokkos, and lightning.gpu.
Dynamic wire allocation refers to the allocation of wires in the middle of a circuit, as opposed to the static allocation during device initialization. For example:

    import pennylane as qml
    from catalyst import qjit

    qml.capture.enable()

    @qjit
    @qml.qnode(qml.device("lightning.qubit", wires=2))  # 2 initial qubits
    def circuit():
        qml.X(0)  # |10>

        with qml.allocate(1) as q:  # |10> and |0>, 1 dynamically allocated qubit
            qml.X(q[0])  # |10> and |1>
            qml.CNOT(wires=[q[0], 1])  # |11> and |1>

        return qml.probs(wires=[0, 1])

    >>> print(circuit())
    [0. 0. 0. 1.]
In the above program, 2 qubits are allocated during device initialization, and 1 additional qubit is allocated inside the circuit with qml.allocate(1).
For more information on what qml.allocate() and qml.deallocate() do, please consult the PennyLane v0.43 release notes.
There are some notable differences between the behaviour of these features with qjit versus without. For details, please see the relevant sections in the Catalyst sharp bits page.
A new quantum compilation pass called reduce_t_depth() has been added, which reduces the depth and count of non-Clifford Pauli product rotations (PPRs) in circuits. This compilation pass works by commuting non-Clifford PPRs (those requiring a T-state to implement) in adjacent layers and merging compatible ones. More details can be found in Figure 6 of A Game of Surface Codes. (#1975) (#2048) (#2085)
The impact of the reduce_t_depth() pass can be measured using ppm_specs() to compare the circuit depth before and after applying the pass. Consider the following circuit:

    import pennylane as qml
    from catalyst import qjit, measure
    from catalyst.passes import ppm_specs

    pips = [("pipe", ["enforce-runtime-invariants-pipeline"])]

    no_reduce_T = {
        "to_ppr": {},
        "commute_ppr": {},
        "merge_ppr_ppm": {},
    }

    reduce_T = {
        "to_ppr": {},
        "commute_ppr": {},
        "merge_ppr_ppm": {},
        "reduce_t_depth": {},
    }

    for pipeline in [reduce_T, no_reduce_T]:

        @qjit(pipelines=pips, target="mlir", circuit_transform_pipeline=pipeline)
        @qml.qnode(qml.device("null.qubit", wires=3))
        def circuit():
            n = 3
            for i in range(n):
                qml.H(wires=i)
                qml.S(wires=i)
                qml.CNOT(wires=[i, (i + 1) % n])
                qml.T(wires=i)
                qml.H(wires=i)
                qml.T(wires=i)

            return [measure(wires=i) for i in range(n)]

        print(ppm_specs(circuit))

    {'circuit_0': {'depth_pi8_ppr': 3, 'depth_ppm': 1, 'logical_qubits': 3, 'max_weight_pi8': 3, 'num_of_ppm': 3, 'pi8_ppr': 6}}
    {'circuit_0': {'depth_pi8_ppr': 4, 'depth_ppm': 1, 'logical_qubits': 3, 'max_weight_pi8': 3, 'num_of_ppm': 3, 'pi8_ppr': 6}}

After performing the to_ppr(), commute_ppr(), and merge_ppr_ppm() passes, the circuit has a non-Clifford PPR depth of four (depth_pi8_ppr). Subsequently applying the reduce_t_depth() pass will move PPRs around via commutation, resulting in a circuit with a smaller PPR depth of three.
Catalyst now handles more types of hybrid workflows by supporting returning classical and MCM values with the dynamic one-shot MCM method. (#2004) (#2090)
For example, the code below will generate 10 values, with an equal probability of 42 and 43 appearing.

    import pennylane as qml
    from catalyst import qjit, measure

    @qjit(autograph=True)
    @qml.qnode(qml.device("lightning.qubit", wires=1), mcm_method="one-shot", shots=10)
    def circuit():
        qml.Hadamard(wires=0)
        m = measure(0)
        if m:
            return 42, m
        else:
            return 43, m

    >>> print(circuit())
    (Array([42, 43, 42, 42, 43, 42, 42, 43, 42, 42], dtype=int64),
     Array([ True, False,  True,  True, False,  True,  True, False,  True,  True], dtype=bool))
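
As a rough classical picture of what a one-shot execution does (this is not Catalyst's implementation), the circuit body can be thought of as being re-run once per shot, with each run following the branch selected by its own measurement outcome. The helper names run_single_shot and one_shot are hypothetical, and the Hadamard-then-measure step is mocked with a fair coin flip:

```python
import random


def run_single_shot(rng):
    """Mock one execution: Hadamard followed by a measurement is a fair coin flip."""
    m = rng.random() < 0.5
    # Each shot takes the branch chosen by its own measurement outcome.
    return (42, m) if m else (43, m)


def one_shot(shots, seed=0):
    """Repeat the single-shot execution `shots` times and stack the results."""
    rng = random.Random(seed)
    results = [run_single_shot(rng) for _ in range(shots)]
    values = [value for value, _ in results]
    outcomes = [m for _, m in results]
    return values, outcomes


values, outcomes = one_shot(shots=10)
print(values)
print(outcomes)
```

Every entry in the first list is paired with the measurement outcome that produced it, which mirrors the shape of the two arrays in the output above.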
The default mid-circuit measurement method in Catalyst has been changed from "single-branch-statistics" to "one-shot" when MCMs are present in the program, which provides a more sensible experience overall when using finite shots. (#2017) (#2019)
The main differentiator is that "one-shot" explores all branches of the decision tree when probabilistic elements are present in the program, such as mid-circuit measurements, device noise, or other sources of randomness. The cost is that simulation / device execution is repeated shots number of times.
Catalyst now provides native support for qml.SingleExcitation, qml.DoubleExcitation, and qml.PCPhase on compatible devices (e.g., Lightning simulators). This enhancement avoids unnecessary gate decomposition, leading to reduced compilation time and improved overall performance. (#1980) (#1987)
Improvements 🛠
Adjoint differentiation is used by default when executing on lightning devices, which significantly reduces gradient computation time. (#1961)
The ppm_specs() function now tracks the non-Clifford and Clifford PPR depth and the overall PPM depth. (#2014)
For example:

    import pennylane as qml
    from catalyst import qjit, measure
    from catalyst.passes import to_ppr, commute_ppr, reduce_t_depth, merge_ppr_ppm, ppm_specs

    pips = [("pipe", ["enforce-runtime-invariants-pipeline"])]

    circuit_transforms = {
        "to_ppr": {},
        "commute_ppr": {},
        "merge_ppr_ppm": {},
    }

    @qjit(pipelines=pips, target="mlir", circuit_transform_pipeline=circuit_transforms)
    @qml.qnode(qml.device("null.qubit", wires=3))
    def circuit():
        n = 3
        for i in range(n):
            qml.H(wires=i)
            qml.S(wires=i)
            qml.CNOT(wires=[i, (i + 1) % n])
            qml.T(wires=i)
            qml.H(wires=i)
            qml.T(wires=i)

        return [measure(wires=i) for i in range(n)]

    >>> print(ppm_specs(circuit))
    {'circuit_0': {'depth_pi8_ppr': 3, 'depth_ppm': 1, 'logical_qubits': 3, 'max_weight_pi8': 3, 'num_of_ppm': 3, 'pi8_ppr': 6}}
pennylane.QubitUnitary is no longer favoured in the decomposition of controlled operators when the operator is not natively supported by the device, but the device supports pennylane.QubitUnitary. Instead, conversion to pennylane.QubitUnitary only happens if the operator does not define another decomposition. The previous behaviour was the cause of performance issues when dealing with large controlled operators, as their matrix representation could be embedded as dense constant data into the program. The performance difference can span multiple orders of magnitude. (#2100)
Conditional operators, such as cond() or pennylane.cond(), now allow the target and branch functions to use arguments in their call signature. Previously, one had to supply all values via closure, but this is now done automatically under the hood. (#2096)
Improvements have been made to the catalyst.from_plxpr.from_plxpr feature set. (#1844) (#1850) (#1903) (#1896) (#1889) (#1973) (#1983) (#2041)
It now supports:
qml.adjoint and qml.ctrl operations and transforms,
operator arithmetic observables and qml.Hermitian observables,
qml.for_loop, qml.cond and qml.while_loop outside of QNodes,
qml.cond with elif branches,
dynamic-value shots and dynamically-settable shots,
and the qml.counts measurement process.
Parallelization is now considered in the IR. As part of that, Catalyst can represent parallel layers, compute depth, and optimize depth.
Two changes were made as part of this overall improvement to the IR:
A new pass, accessible with --partition-layers in the Catalyst CLI, has been added to group PPR and PPM operations into qec.layer operations based on qubit interactivity and commutativity, enabling circuit analysis and potential support for parallel execution. (#1951)
The qec.layer and qec.yield operations have been added to the QEC dialect to represent a group of QEC operations. The main use case is to analyze the depth of a circuit. Also, this is a preliminary step towards supporting parallel execution of QEC layers. (#1917)
Utility functions for modifying an existing compilation pipeline have been added to the pipelines module. (#1941)
These functions provide a simple interface to insert passes and stages into a compilation pipeline. The available functions are insert_pass_after, insert_pass_before, insert_stage_after, and insert_stage_before. For example,

    >>> from catalyst.pipelines import insert_pass_after
    >>> pipeline = ["pass1", "pass2"]
    >>> insert_pass_after(pipeline, "new_pass", ref_pass="pass1")
    >>> pipeline
    ['pass1', 'new_pass', 'pass2']
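
To make the in-place semantics of the example above concrete, here is a minimal pure-Python sketch of how such helpers could behave on a flat list of pass names. The names sketch_insert_pass_after and sketch_insert_pass_before are hypothetical and this is not the code in catalyst.pipelines; in particular, raising ValueError for a missing reference pass is an assumption.

```python
def _find_pass(pipeline, ref_pass):
    """Return the index of ref_pass, or raise if it is absent (assumed behaviour)."""
    try:
        return pipeline.index(ref_pass)
    except ValueError:
        raise ValueError(f"Reference pass '{ref_pass}' not found in pipeline") from None


def sketch_insert_pass_after(pipeline, new_pass, ref_pass):
    """Insert new_pass directly after the first occurrence of ref_pass, in place."""
    pipeline.insert(_find_pass(pipeline, ref_pass) + 1, new_pass)


def sketch_insert_pass_before(pipeline, new_pass, ref_pass):
    """Insert new_pass directly before the first occurrence of ref_pass, in place."""
    pipeline.insert(_find_pass(pipeline, ref_pass), new_pass)


pipeline = ["pass1", "pass2"]
sketch_insert_pass_after(pipeline, "new_pass", ref_pass="pass1")
print(pipeline)
```

Mutating the list in place matches the session above, where the call returns nothing and the original pipeline variable reflects the change.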
A new pass called detensorize-function-boundary has been added, which removes scalar tensors across function boundaries and enables the symbol-dce pass to remove dead functions, reducing the number of instructions for compilation and thus improving performance. (#1904)
The error message for unsupported mid-circuit measurements in measurement processes when using mcm_method="single-branch-statistics" has been improved. (#2105)
Catalyst’s native control flow functions (for_loop(), while_loop() and cond()) now raise an error if used with PennyLane program capture (i.e., qml.capture.enable() is present). (#1945)
The Catalyst CLI now prints the Catalyst version when invoked with catalyst --version or quantum-opt --version. (#1922)
A runtime error is now raised when the qubits provided to a quantum gate are not distinct (i.e. overlap). (#2006)
The Pauli product optimization pass that commutes Clifford rotations (\(\frac{\pi}{4}\)) past non-Clifford rotations (\(\frac{\pi}{8}\)) now also supports \(\frac{\pi}{2}\) angles. (#1966)
The default value for the decompose_method parameter in the ppr_to_ppm() compilation pass is now "pauli-corrected", an improved decomposition of non-Clifford PPRs into two PPMs, instead of two PPMs and a Clifford correction. This decomposition is based on Figure 13(a) in arXiv:2211.15465. (#2043) (#2047)
In the Pauli-based compilation pipeline, identity operations (qml.Identity) are now accepted in the input program and converted to a corresponding PPR gate. Additionally, internal validation was improved across PPR/PPM passes. (#2058)
Using the keep_intermediate='pass' option now prints the whole module scope of a program to the intermediate files instead of just the pass scope. (#2051)
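
The distinct-qubit check listed among the improvements above (#2006) amounts to rejecting a gate call whose wire list contains a repeated entry. A minimal sketch of that idea follows; the helper name check_distinct_wires is hypothetical, and Python's ValueError stands in for the runtime error that Catalyst raises.

```python
def check_distinct_wires(gate_name, wires):
    """Return the wires as a list, or raise if any wire appears more than once."""
    wires = list(wires)
    if len(set(wires)) != len(wires):
        raise ValueError(f"{gate_name} received overlapping wires: {wires}")
    return wires


print(check_distinct_wires("CNOT", [0, 1]))  # distinct wires are accepted

try:
    check_distinct_wires("CNOT", [0, 0])  # the same wire used twice is rejected
except ValueError as exc:
    print(exc)
```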
Breaking changes 💔
The get_ppm_specs function has been renamed to ppm_specs(). (#2031)
The shots property has been removed from OQDDevice. The number of shots for a QNode execution is now set directly on the QNode via qml.qnode(..., shots=N), or via the decorator qml.set_shots. (#1988)
The JAX version used by Catalyst has been updated to 0.6.2. (#1897)
(Device implementers only) The ReleaseAllQubits device interface function has been replaced with ReleaseQubits. (#1996)
Instead of releasing all currently active qubits, the new interface function ReleaseQubits explicitly takes in an array of qubit IDs to be released.
For devices without dynamic allocation support, it is expected that this function only succeeds if the ID array contains the same values as those produced by the initial AllocateQubits call; otherwise, the device is encouraged to raise an error.
(Compiler integrators only) The version of LLVM and Enzyme used by Catalyst has been updated and the mlir-hlo dependency has been replaced with stablehlo. (#1916) (#1921)
The LLVM version has been updated to commit f8cb798.
The stablehlo version has been updated to commit 69d6dae.
The Enzyme version has been updated to v0.0.186.
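
For device implementers, the expected ReleaseQubits behaviour on a device without dynamic allocation can be pictured with a small mock. The real device interface is not written in Python; the class MockStaticDevice, its method names, and the use of RuntimeError are all hypothetical and only illustrate the contract described above.

```python
class MockStaticDevice:
    """Toy device that allocates all qubits once and cannot allocate dynamically."""

    def __init__(self):
        self._allocated = []

    def allocate_qubits(self, num_qubits):
        # Stand-in for the initial AllocateQubits call.
        self._allocated = list(range(num_qubits))
        return list(self._allocated)

    def release_qubits(self, qubit_ids):
        # Only succeed when the IDs match those handed out at allocation time.
        if sorted(qubit_ids) != sorted(self._allocated):
            raise RuntimeError("This device can only release all initially allocated qubits")
        self._allocated = []


device = MockStaticDevice()
ids = device.allocate_qubits(3)

try:
    device.release_qubits([0])  # a partial release is rejected
except RuntimeError as exc:
    print(exc)

device.release_qubits(ids)  # releasing exactly the allocated IDs succeeds
```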
Deprecations 👋
Usage of the Device.shots property, along with setting device(..., shots=...), has been deprecated. Please set the shots at the QNode level with qml.qnode(..., shots=...) or using the decorator qml.set_shots. (#1952)
Bug fixes 🐛
Fixed an issue with PennyLane program capture and static argnums on the QNode where the same lowering was being used no matter if the static arguments changed. The lowering to MLIR is no longer cached if there are static argnums. (#2053)
Fixed a bug where applying a quantum transform after a QNode could produce incorrect results or errors in certain cases. This resolves issues related to transforms operating on QNodes with classical outputs and improves compatibility with measurement transforms. (#2081)
Fixed a bug with incorrect type promotion on conditional branches, which was giving inconsistent output types from qjit’d QNodes. (#1977)
Snake case keyword arguments supplied to apply_pass() are now correctly converted to the kebab case used for pass options in MLIR. (#1954)
For example:

    @qjit(target="mlir")
    @catalyst.passes.apply_pass("some-pass", "an-option", maxValue=1, multi_word_option=1)
    @qml.qnode(qml.device("null.qubit", wires=1))
    def example():
        return qml.state()

The pass application instruction will look like the following in MLIR:

    %0 = transform.apply_registered_pass "some-pass" with options = {"an-option" = true, "maxValue" = 1 : i64, "multi-word-option" = 1 : i64}

Fixed incorrect handling of partitioned shots in the decomposition pass of measurements_from_samples. (#1981)
Fixed a compiler error that occurred when qml.prod was used together with other operator transforms (e.g., qml.adjoint) when Autograph was enabled. (#1910) (#2083)
A bug in the NullQubit::ReleaseQubit() method that prevented the deallocation of individual qubits on the "null.qubit" device has been fixed. (#1926)
Stacked Python decorators for built-in Catalyst passes are now applied in the correct order when PennyLane program capture is enabled. (#2027)
Various issues in the OQC device plugin have been fixed:
Fixed a mistake in the gate sequence generated by the ppr_to_ppm compilation pass when decompose_method="auto-corrected" is used. (#2043)
static_argnums is now correctly propagated when tracing the target functions of certain transformations and decorators, like the one used in the dynamic-one-shot mcm method. (#2056)
Fixed a bug where deallocating the auxiliary qubit in ppr_to_ppm with decompose_method="clifford-corrected" was deallocating the wrong auxiliary qubit. (#2039)
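
Going by the apply_pass() example in this list of fixes, where multi_word_option becomes multi-word-option while maxValue is left untouched, the snake-case to kebab-case conversion can be pictured as a plain underscore-to-hyphen replacement on option names. The helper to_kebab_case_options below is hypothetical and is not Catalyst's own code:

```python
def to_kebab_case_options(**options):
    """Replace underscores with hyphens in option names; values are left unchanged."""
    return {name.replace("_", "-"): value for name, value in options.items()}


converted = to_kebab_case_options(maxValue=1, multi_word_option=1)
print(converted)
```

Names without underscores, such as camelCase ones, pass through unchanged under this rule, which is consistent with the MLIR shown in the example.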
Internal changes ⚙️
The NullQubit device now provides the resource-tracking filename to allow for cleanup. (#1861)
The type of the number_original_arg attribute in CustomCallOp has been changed from a dense array to an integer. (#2022)
QregManager has been renamed to QubitHandler and has been extended to manage converting PLxPR wire indices into Catalyst JAXPR qubits. This is especially useful for lowering subroutines that take in qubits as arguments, like in decomposition rules. (#1820)
The error message for using a quantum subroutine that was defined outside of a QNode scope has been improved. (#1932)
The usage of qml.transforms.dynamic_one_shot.parse_native_mid_circuit_measurements in Catalyst’s dynamic_one_shot implementation was updated to use its new call signature. (#1953)
When capture is enabled with qml.capture.enable(), @qml.qjit(autograph=True) will use PennyLane’s autograph implementation instead of Catalyst’s. (#1960)
The extract_backend_info helper function for the QJITDevice no longer has a redundant capabilities argument. (#1956)
A warning is now raised when subroutines are used without PennyLane program capture enabled (qml.capture.enable()). (#1930)
Import paths for noise transforms have been updated from pennylane.transforms to pennylane.noise. (#1918) (#2020)
Conversion patterns for the single-qubit quantum.alloc_qb and quantum.dealloc_qb operations have been added for lowering to the LLVM dialect. These conversion patterns allow for execution of programs containing these operations. (#1920)
The default compilation pipeline is now available as catalyst.pipelines.default_pipeline(). The function catalyst.pipelines.get_stages() has also been removed, as it was not used and duplicated the CompileOptions.get_stages() method. (#1941)
A new built-in compilation pipeline for experimental MBQC workloads called catalyst.ftqc.mbqc_pipeline() has been added. (#1942)
The output of this function can be used directly as input to the pipelines argument of qjit(). For example:

    from catalyst.ftqc import mbqc_pipeline

    @qjit(pipelines=mbqc_pipeline())
    @qml.qnode(dev)
    def workload():
        ...
The mbqc.graph_state_prep operation has been added to the MBQC dialect. This operation prepares a graph state with arbitrary qubit connectivity, specified by an input adjacency-matrix operand, for use in MBQC workloads. (#1965)
catalyst.accelerate, catalyst.debug.callback, catalyst.pure_callback, catalyst.debug.print, and catalyst.debug.print_memref now work when PennyLane program capture is enabled with qml.capture.enable(). (#1902)
The merge rotation pass in Catalyst (merge_rotations()) now also considers qml.Rot and qml.CRot. (#1955)
Catalyst now supports array-backed registers, meaning that quantum.insert operations can be configured to allow for the insertion of a qubit into an arbitrary position within a register. (#2000)
This feature is disabled by default. To enable it, configure the pass pipeline to set the use-array-backed-registers option of the convert-quantum-to-llvm pass to true. For example:

    catalyst --tool=opt --pass-pipeline="builtin.module(convert-quantum-to-llvm{use-array-backed-registers=true})" <input file>

The NoMemoryEffect trait has been removed from the quantum.alloc operation, which allowed for supporting the dynamic wire allocation feature. (#2044)
Validation in the ppm_specs function has been improved to prevent unnecessary duplication in the pipeline configuration. (#2049)
A new compilation pass called ppr_to_mbqc() has been added to lower qec.ppr and qec.ppm instructions into MBQC-style instructions. (#2057)
This pass is part of a bottom-of-stack MBQC execution pathway, with a small separation between the PPR/PPM and MBQC layers to enable end-to-end compilation on a mocked backend.

    import pennylane as qml
    from catalyst import qjit, measure
    from catalyst.passes import ppr_to_mbqc, to_ppr

    pipeline = [("pipe", ["enforce-runtime-invariants-pipeline"])]

    @qjit(target="mlir", pipelines=pipeline)
    @ppr_to_mbqc
    @to_ppr
    @qml.qnode(qml.device("lightning.qubit", wires=2))
    def circuit():
        qml.CNOT(wires=[0, 1])
        qml.T(0)
        return measure(0)

    print(circuit.mlir_opt)

    ...
    %out_qubits = quantum.custom "Hadamard"() %2 : !quantum.bit
    %out_qubits_2:2 = quantum.custom "CNOT"() %out_qubits, %1 : !quantum.bit, !quantum.bit
    %out_qubits_3 = quantum.custom "RZ"(%cst_1) %out_qubits_2#1 : !quantum.bit
    %out_qubits_4:2 = quantum.custom "CNOT"() %out_qubits_2#0, %out_qubits_3 : !quantum.bit, !quantum.bit
    %out_qubits_5 = quantum.custom "Hadamard"() %out_qubits_4#0 : !quantum.bit
    %out_qubits_6 = quantum.custom "RZ"(%cst_0) %out_qubits_4#1 : !quantum.bit
    %out_qubits_7 = quantum.custom "Hadamard"() %out_qubits_5 : !quantum.bit
    %out_qubits_8 = quantum.custom "RZ"(%cst_0) %out_qubits_7 : !quantum.bit
    %out_qubits_9 = quantum.custom "Hadamard"() %out_qubits_8 : !quantum.bit
    %out_qubits_10 = quantum.custom "RZ"(%cst) %out_qubits_6 : !quantum.bit
    %mres, %out_qubit = quantum.measure %out_qubits_10 : i1, !quantum.bit
    ...
Note that in an MBQC gate set, the RotXZX gate cannot yet be executed on available backends.
A new JAX primitive qdealloc_qb_p is available for single qubit deallocations, which may be useful for the development of new features. (#2005)
Documentation 📝
Typos were fixed and supplemental information was added to the docstrings for ppm_compilation, to_ppr, commute_ppr, ppr_to_ppm, merge_ppr_ppm, and ppm_specs. (#2050)
The Catalyst Command Line Interface documentation incorrectly stated that the catalyst executable is available in the catalyst/bin/ directory relative to the environment’s installation directory when installed via pip. The documentation has been updated to point to the correct location, which is the bin/ directory relative to the environment’s installation directory. (#2030)
A handful of typos were fixed in the sharp bits page and transforms API. (#2046)
Links to demos were updated and corrected to point to relevant, up-to-date demos. (#2042)
Contributors ✍️
This release contains contributions from (in alphabetical order):
Ali Asadi, Joey Carter, Yushao Chen, Isaac De Vlugt, Sengthai Heng, David Ittah, Jeffrey Kam, Christina Lee, Joseph Lee, Andrija Paurevic, Justin Pickering, Ritu Thombre, Roberto Turrado, Paul Haochen Wang, Jake Zaia, Hongsheng Zheng.
Release 0.12.0
New features since last release
A new compilation pass called ppm_compilation() has been added to Catalyst to transform Clifford+T gates into Pauli Product Measurements (PPMs) using just one transform, allowing for exploring representations of programs in a new paradigm in logical quantum compilation. (#1750)
Based on arXiv:1808.02892, this new compilation pass simplifies circuit transformations and optimizations by combining multiple sub-passes into a single compilation pass, where Clifford+T gates are compiled down to Pauli product rotations (PPRs, \(\exp(-iP_{\{x, y, z\}} \theta)\)) and PPMs:
to_ppr(): converts Clifford+T gates into PPRs.
commute_ppr(): commutes PPRs past non-Clifford PPRs.
merge_ppr_ppm(): merges Clifford PPRs into PPMs.
ppr_to_ppm(): decomposes both non-Clifford PPRs (\(\theta = \tfrac{\pi}{8}\)), consuming a magic state in the process, and Clifford PPRs (\(\theta = \tfrac{\pi}{4}\)) into PPMs. (#1664)

    import catalyst
    import pennylane as qml
    from catalyst.passes import ppm_compilation

    pipeline = [("pipe", ["enforce-runtime-invariants-pipeline"])]

    @qml.qjit(pipelines=pipeline, target="mlir")
    @ppm_compilation(decompose_method="clifford-corrected", avoid_y_measure=True, max_pauli_size=2)
    @qml.qnode(qml.device("null.qubit", wires=2))
    def circuit():
        qml.CNOT([0, 1])
        qml.CNOT([1, 0])
        qml.adjoint(qml.T)(0)
        qml.T(1)
        return catalyst.measure(0), catalyst.measure(1)

    >>> print(circuit.mlir_opt)
    ...
    %m, %out:3 = qec.ppm ["Z", "Z", "Z"] %1, %2, %4 : !quantum.bit, !quantum.bit, !quantum.bit
    %m_0, %out_1:2 = qec.ppm ["Z", "Y"] %3, %out#2 : !quantum.bit, !quantum.bit
    %m_2, %out_3 = qec.ppm ["X"] %out_1#1 : !quantum.bit
    %m_4, %out_5 = qec.select.ppm(%m, ["X"], ["Z"]) %out_1#0 : !quantum.bit
    %5 = arith.xori %m_0, %m_2 : i1
    %6:2 = qec.ppr ["Z", "Z"](2) %out#0, %out#1 cond(%5) : !quantum.bit, !quantum.bit
    quantum.dealloc_qb %out_5 : !quantum.bit
    quantum.dealloc_qb %out_3 : !quantum.bit
    %7 = quantum.alloc_qb : !quantum.bit
    %8 = qec.fabricate magic_conj : !quantum.bit
    %m_6, %out_7:2 = qec.ppm ["Z", "Z"] %6#1, %8 : !quantum.bit, !quantum.bit
    %m_8, %out_9:2 = qec.ppm ["Z", "Y"] %7, %out_7#1 : !quantum.bit, !quantum.bit
    %m_10, %out_11 = qec.ppm ["X"] %out_9#1 : !quantum.bit
    %m_12, %out_13 = qec.select.ppm(%m_6, ["X"], ["Z"]) %out_9#0 : !quantum.bit
    %9 = arith.xori %m_8, %m_10 : i1
    %10 = qec.ppr ["Z"](2) %out_7#0 cond(%9) : !quantum.bit
    quantum.dealloc_qb %out_13 : !quantum.bit
    quantum.dealloc_qb %out_11 : !quantum.bit
    %m_14, %out_15:2 = qec.ppm ["Z", "Z"] %6#0, %10 : !quantum.bit, !quantum.bit
    %from_elements = tensor.from_elements %m_14 : tensor<i1>
    %m_16, %out_17 = qec.ppm ["Z"] %out_15#1 : !quantum.bit
    ...
A new function called get_ppm_specs() has been added for acquiring statistics after PPM compilation. (#1794)
After compiling a workflow with any combination of to_ppr(), commute_ppr(), merge_ppr_ppm(), ppr_to_ppm(), or ppm_compilation(), use get_ppm_specs() to track useful statistics of the compiled workflow, including:
num_pi4_gates: number of Clifford PPRs
num_pi8_gates: number of non-Clifford PPRs
num_pi2_gates: number of classical PPRs
max_weight_pi4: maximum weight of Clifford PPRs
max_weight_pi8: maximum weight of non-Clifford PPRs
max_weight_pi2: maximum weight of classical PPRs
num_logical_qubits: number of logical qubits
num_of_ppm: number of PPMs

    import pennylane as qml
    from catalyst import qjit, measure
    from catalyst.passes import get_ppm_specs, to_ppr, merge_ppr_ppm, commute_ppr

    pipe = [("pipe", ["enforce-runtime-invariants-pipeline"])]

    @qjit(pipelines=pipe, target="mlir", autograph=True)
    def test_convert_clifford_to_ppr_workflow():

        device = qml.device("lightning.qubit", wires=2)

        @merge_ppr_ppm
        @commute_ppr(max_pauli_size=2)
        @to_ppr
        @qml.qnode(device)
        def f():
            qml.CNOT([0, 2])
            qml.T(0)
            return measure(0), measure(1)

        @merge_ppr_ppm(max_pauli_size=1)
        @commute_ppr
        @to_ppr
        @qml.qnode(device)
        def g():
            qml.CNOT([0, 2])
            qml.T(0)
            qml.T(1)
            qml.CNOT([0, 1])
            for i in range(10):
                qml.Hadamard(0)
            return measure(0), measure(1)

        return f(), g()

    >>> ppm_specs = get_ppm_specs(test_convert_clifford_to_ppr_workflow)
    >>> print(ppm_specs)
    {
    'f_0': {'max_weight_pi8': 1, 'num_logical_qubits': 2, 'num_of_ppm': 2, 'num_pi8_gates': 1},
    'g_0': {'max_weight_pi4': 2, 'max_weight_pi8': 1, 'num_logical_qubits': 2, 'num_of_ppm': 2, 'num_pi4_gates': 36, 'num_pi8_gates': 2}
    }
Catalyst now supports qml.Snapshot, which captures quantum states at any point in a circuit. (#1741)
For example, the code below captures two snapshot’d states, all within a qjit’d circuit:

    NUM_QUBITS = 2
    dev = qml.device("lightning.qubit", wires=NUM_QUBITS)

    @qjit
    @qml.qnode(dev)
    def circuit():
        wires = list(range(NUM_QUBITS))
        qml.Snapshot("Initial state")

        for wire in wires:
            qml.Hadamard(wires=wire)

        qml.Snapshot("After applying Hadamard gates")

        return qml.probs()

    >>> results = circuit()
    >>> print(results)
    ([Array([1.+0.j, 0.+0.j, 0.+0.j, 0.+0.j], dtype=complex128), Array([0.5+0.j, 0.5+0.j, 0.5+0.j, 0.5+0.j], dtype=complex128)], Array([0.25, 0.25, 0.25, 0.25], dtype=float64))
    >>> snapshots, probs = results
    >>> print(snapshots)
    [Array([1.+0.j, 0.+0.j, 0.+0.j, 0.+0.j], dtype=complex128), Array([0.5+0.j, 0.5+0.j, 0.5+0.j, 0.5+0.j], dtype=complex128)]
    >>> print(probs)
    Array([0.25, 0.25, 0.25, 0.25], dtype=float64)
Catalyst now supports automatic qubit management, meaning that the number of wires does not need to be specified during device initialization. (#1788)

    @qjit
    def workflow():
        dev = qml.device("lightning.qubit")  # no wires here!

        @qml.qnode(dev)
        def circuit():
            qml.PauliX(wires=2)
            return qml.probs()

        return circuit()

    print(workflow())

    [0. 1. 0. 0. 0. 0. 0. 0.]

While this feature adds a lot of convenience, it may also reduce performance on devices where reallocating resources can be expensive, such as statevector simulators.
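
The eight-entry output above can be reproduced by hand. The sketch below assumes that the device ends up with wires 0 through the largest index used (three wires here, which is consistent with the eight probabilities shown) and uses PennyLane's convention that wire 0 is the most significant bit of the basis-state index. The helper basis_state_probs is hypothetical and only tracks X gates classically:

```python
def basis_state_probs(flipped_wires):
    """Probabilities after applying X to each wire in flipped_wires, starting from |0...0>."""
    # Assumption: wires 0..max(flipped_wires) are allocated.
    num_wires = max(flipped_wires) + 1
    index = 0
    for wire in flipped_wires:
        # Wire 0 is the most significant bit of the basis-state index.
        index ^= 1 << (num_wires - 1 - wire)
    probs = [0.0] * (2 ** num_wires)
    probs[index] = 1.0
    return probs


probs = basis_state_probs([2])
print(probs)
```

Flipping wire 2 gives the bitstring 001, which is index 1 of the eight basis states.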
Two new peephole-optimization compilation passes called disentangle_cnot() and disentangle_swap() have been added. Each compilation pass replaces SWAP or CNOT instructions with other equivalent elementary gates. (#1823)
As an example, disentangle_cnot() applied to the circuit below will replace the CNOT gate with an X gate.

    import catalyst
    import pennylane as qml

    dev = qml.device("lightning.qubit", wires=2)

    @qml.qjit(keep_intermediate=True)
    @catalyst.passes.disentangle_cnot
    @qml.qnode(dev)
    def circuit():
        # first qubit in |1>
        qml.X(0)
        # second qubit in |0>
        # current state : |10>
        qml.CNOT([0, 1])
        # state after CNOT : |11>
        return qml.state()

    >>> from catalyst.debug import get_compilation_stage
    >>> print(get_compilation_stage(circuit, stage="QuantumCompilationPass"))
    ...
    %out_qubits = quantum.custom "PauliX"() %1 : !quantum.bit
    %2 = quantum.extract %0[ 1] : !quantum.reg -> !quantum.bit
    %out_qubits_0 = quantum.custom "PauliX"() %2 : !quantum.bit
    ...
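
The replacement in the disentangle_cnot() example is valid because the control qubit is known to be in |1> when the CNOT is applied, so the CNOT acts as a plain X on the target. A small pure-Python check of that equivalence for this specific input state is sketched below; it is an illustration of the reasoning, not of how the pass itself is implemented.

```python
def apply(matrix, state):
    """Multiply a square matrix (list of rows) by a state vector (list)."""
    return [sum(row[c] * state[c] for c in range(len(state))) for row in matrix]


# Basis ordering |q0 q1>: |00>, |01>, |10>, |11>
X_ON_0 = [[0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, 0]]   # X on wire 0
X_ON_1 = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]   # X on wire 1
CNOT_01 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]  # control 0, target 1

initial = [1, 0, 0, 0]                # |00>
after_x = apply(X_ON_0, initial)      # |10>
with_cnot = apply(CNOT_01, after_x)   # original circuit
with_x = apply(X_ON_1, after_x)       # circuit after the replacement

print(with_cnot, with_x)
```

Both circuits end in |11>, matching the state annotated in the example's comments.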
Improvements 🛠
The qml.measure operation for mid-circuit measurements can now be used in qjit-compiled circuits with program capture enabled. (#1766)
Note that the simulation behaviour of mid-circuit measurements can differ between PennyLane and Catalyst, depending on the chosen mcm_method. Please see the Functionality differences from PennyLane section in the sharp bits and debugging tips page for additional information.
The behaviour of measurement processes executed on null.qubit with qjit is now more consistent with their behaviour on null.qubit without qjit. (#1598)
Previously, measurement processes like qml.sample, qml.counts, qml.probs, etc., returned values from uninitialized memory when executed on null.qubit with qjit. This change ensures that measurement processes on null.qubit always return the value 0 or the result corresponding to the ‘0’ state, depending on the context.
The package name of the Catalyst distribution has been updated to be consistent with PyPA standards, from PennyLane-Catalyst to pennylane_catalyst. This change is not expected to affect users as tools in the Python ecosystem (e.g. pip) already handle both versions through normalization. (#1817)
The commute_ppr() and merge_ppr_ppm() passes now accept an optional max_pauli_size argument, which limits the size of the Pauli strings generated by the passes through commutation or absorption rules. (#1719)
The to_ppr() pass is now more efficient, with added support for the direct conversion of Pauli gates (qml.X, qml.Y, qml.Z), the adjoint of the qml.S gate, and the adjoint of the qml.T gate. (#1738)
The keep_intermediate argument in the qjit decorator now accepts a new value that allows for saving intermediate files after each pass. The updated possible options for this argument are:
False or 0 or None: No intermediate files are kept.
True or 1 or "pipeline": Intermediate files are saved after each pipeline.
2 or "pass": Intermediate files are saved after each pass.
The default value is False. (#1791)
The static_argnums keyword argument in the qjit decorator is now compatible with PennyLane program capture enabled (qml.capture.enable). (#1810)
Catalyst is compatible with the new qml.set_shots transform introduced in PennyLane v0.42. (#1784)
null.qubit can now support an optional track_resources keyword argument, which allows it to record which gates are executed. (#1619)

import json
import glob

import pennylane as qml

dev = qml.device("null.qubit", wires=2, track_resources=True)

@qml.qjit
@qml.qnode(dev)
def circuit():
    for _ in range(5):
        qml.H(0)
    qml.CNOT([0, 1])
    return qml.probs()

circuit()
pattern = "./__pennylane_resources_data_*"
filepath = glob.glob(pattern)[0]
with open(filepath) as f:
    resources = json.loads(f.read())

>>> print(resources)
{'num_qubits': 2, 'num_gates': 6, 'gate_types': {'CNOT': 1, 'Hadamard': 5}}
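The recorded dictionary can be sanity-checked with plain Python. The sketch below assumes only the three keys shown in the output above (num_qubits, num_gates, gate_types); the helper name summarize_resources is hypothetical and is not part of Catalyst or PennyLane.

```python
import json


def summarize_resources(resources: dict) -> str:
    """Summarize a resources dictionary and check that its totals agree.

    Hypothetical helper: it assumes the keys shown in the example output.
    """
    gate_types = resources["gate_types"]
    total = sum(gate_types.values())
    if total != resources["num_gates"]:
        raise ValueError(f"gate count mismatch: {total} != {resources['num_gates']}")
    breakdown = ", ".join(f"{name} x{count}" for name, count in sorted(gate_types.items()))
    return f"{resources['num_qubits']} qubits, {total} gates ({breakdown})"


# Same content as the example output, round-tripped through JSON to mimic
# reading the resources file from disk.
example = json.loads(
    '{"num_qubits": 2, "num_gates": 6, "gate_types": {"CNOT": 1, "Hadamard": 5}}'
)
summary = summarize_resources(example)
print(summary)
```

A check like this can be useful in tests that want to assert on gate counts without running a simulator.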
Breaking changes 💔
Support for Mac x86 has been removed. This includes Macs running on Intel processors. (#1716)
This is because JAX has also dropped support for it since 0.5.0, with the rationale being that such machines are becoming increasingly scarce.
If support for Mac x86 platforms is still desired, please install Catalyst v0.11.0, PennyLane v0.41.0, PennyLane-Lightning v0.41.0, and JAX v0.4.28.
(Device Developers Only) The QuantumDevice interface in the Catalyst Runtime plugin system has been modified, which requires recompiling plugins for binary compatibility. (#1680)
As announced in the 0.10.0 release, the shots argument has been removed from the Sample and Counts methods in the interface, since it unnecessarily duplicated this information. Additionally, shots will no longer be supplied by Catalyst through the kwargs parameter of the device constructor. The shot value must now be obtained through the SetDeviceShots method.
Further, the documentation for the interface has been overhauled and now describes the expected behaviour of each method in detail. A quality of life improvement is that optional methods are now clearly marked as such and also come with a default implementation in the base class, so device plugins need only override the methods they wish to support.
Finally, the PrintState and the One/Zero utility functions have been removed, since they did not serve a convincing purpose.
(Frontend Developers Only) Some Catalyst primitives for JAX have been renamed, and the qubit deallocation primitive has been split into deallocation and a separate device release primitive. (#1720)
qunitary_p is now unitary_p (unchanged)
qmeasure_p is now measure_p (unchanged)
qdevice_p is now device_init_p (unchanged)
qdealloc_p no longer releases the device, thus it can be used at any point of a quantum execution scope
device_release_p is a new primitive that must be used to mark the end of a quantum execution scope, which will release the quantum device
Catalyst has removed the experimental_capture keyword from the qjit decorator in favour of unified behaviour with PennyLane. (#1657)
Instead of enabling program capture with Catalyst via qjit(experimental_capture=True), program capture can be enabled via the global toggle qml.capture.enable():

import pennylane as qml
from catalyst import qjit

dev = qml.device("lightning.qubit", wires=2)

qml.capture.enable()

@qjit
@qml.qnode(dev)
def circuit(x):
    qml.Hadamard(0)
    qml.CNOT([0, 1])
    return qml.expval(qml.Z(0))

circuit(0.1)

Disabling program capture can be done with qml.capture.disable().
The ppr_to_ppm pass functionality has been moved to a new pass called merge_ppr_ppm. The ppr_to_ppm functionality now handles direct decomposition of PPRs into PPMs. (#1688)
The version of JAX used by Catalyst has been updated to v0.6.0. (#1652) (#1729)
Several internal changes were made for this update.
LAPACK kernels are updated to adhere to the new JAX lowering rules for external functions. (#1685)
The trace stack is removed and replaced with a tracing context manager. (#1662)
A new debug_info argument is added to Jaxpr, the make_jaxpr functions, and jax.extend.linear_util.wrap_init. (#1670) (#1671) (#1681)
The version of LLVM, mlir-hlo, and Enzyme used by Catalyst has been updated to track those in JAX v0.6.0. (#1752)
The LLVM version has been updated to commit a8513158. The mlir-hlo version has been updated to commit e30c22d1. The Enzyme version has been updated to v0.0.180.
(Device developers only) Device parameters which are forwarded by the Catalyst runtime to plugin devices as a string may not contain nested dictionaries. Previously, these would be parsed incorrectly, and instead will now raise an error. (#1843) (#1846)
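The rule above can be illustrated with a small check. This is a sketch only: check_flat_device_kwargs is a hypothetical helper, not the Catalyst runtime's parser, the parameter names are made up for illustration, and it assumes the device parameters are available as a Python dictionary before they are forwarded as a string.

```python
def check_flat_device_kwargs(kwargs: dict) -> dict:
    """Reject nested dictionaries in device parameters (hypothetical helper)."""
    for key, value in kwargs.items():
        if isinstance(value, dict):
            raise ValueError(
                f"device parameter {key!r} is a nested dictionary, which is not supported"
            )
    return kwargs


# Flat parameters pass through unchanged (names are illustrative only).
flat = check_flat_device_kwargs({"precision": "double", "seed": 42})

# A nested dictionary is rejected instead of being parsed incorrectly.
try:
    check_flat_device_kwargs({"options": {"seed": 42}})
    nested_rejected = False
except ValueError:
    nested_rejected = True
```

Device plugins that currently rely on nested options would need to flatten them before passing them on, for example by using one top-level key per option.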
Deprecations 👋
Python 3.10 is now deprecated and will not be supported in Catalyst v0.13. Please upgrade to a newer Python version.
Bug fixes 🐛
Fixed Boolean arguments/results not working with the debugging functions debug.get_cmain and debug.compile_executable. (#1687)
Fixed AutoGraph fallback for valid iteration targets with constant data but no length, for example itertools.product(range(2), repeat=2). (#1665)
Catalyst now correctly supports qml.StatePrep() and qml.BasisState() operations in the experimental PennyLane program capture pipeline. (#1631)
make all now correctly compiles the standalone plugin with the same compiler used to compile LLVM and MLIR. (#1768)
Stacked Python decorators for built-in Catalyst passes are now applied in the correct order. (#1798)
MLIR plugins can now be specified via lists and tuples, not just sets. (#1812)
Fixed the conversion of PLxPR to JAXPR with quantum primitives when using control flow. (#1809)
Fixed a bug in the internal simplification of qubit chains in the compiler, which manifested in certain transformations like cancel_inverses and led to incorrect results. (#1840)
Fixed the conversion of PLxPR to JAXPR with quantum primitives when using dynamic wires. (#1842)
Internal changes ⚙️
The clang-format and clang-tidy versions used by Catalyst have been updated to v20. (#1721)
The Sphinx version has been updated to v8.1. (#1734)
Integration with PennyLane’s experimental Python compiler based on xDSL has been added. This allows developers and users to write xDSL transformations that can be used with Catalyst. (#1715)
An xDSL MLIR plugin has been added to denote whether to use xDSL to execute compilation passes. (#1707)
The function dataclasses.replace is now used to update ExecutionConfig and MCMConfig rather than mutating properties. (#1814)
A function has been added that allows developers to register an equivalent MLIR transform for a given PLxPR transform. (#1705)
The num_wires property of HybridOp is no longer overridden when the operator can exist on AnyWires. This allows the deprecation of WiresEnum in PennyLane. (#1667) (#1676)
Catalyst now includes an experimental mbqc dialect for representing measurement-based quantum-computing protocols in MLIR. (#1663) (#1679)
The Catalyst Runtime C-API now includes a stub for the experimental mbqc.measure_in_basis operation, __catalyst__mbqc__measure_in_basis(), allowing for mock execution of MBQC workloads containing parameterized arbitrary-basis measurements. (#1674)
This runtime stub is currently for mock execution only and should be treated as a placeholder operation. Internally, it functions just as a computational-basis measurement instruction.
Support for quantum subroutines was added. This feature is expected to improve compilation times for large quantum programs. (#1774) (#1828)
PennyLane’s arbitrary-basis measurement operations, such as qml.ftqc.measure_arbitrary_basis, are now qjit-compatible with PennyLane program capture enabled. (#1645) (#1710)
The utility function EnsureFunctionDeclaration has been refactored into the Utils of the Catalyst dialect instead of being duplicated in each individual dialect. (#1683)
The assembly format for some MLIR operations now includes adjoint. (#1695)
Improved the definition of YieldOp in the quantum dialect by removing AnyTypeOf. (#1696)
The assembly format of MeasureOp in the Quantum dialect and MeasureInBasisOp in the MBQC dialect now contains the postselect attribute. (#1732)
The bufferization of custom Catalyst dialects has been migrated to the new one-shot bufferization interface in MLIR. The new MLIR bufferization interface is required by JAX v0.4.29 or higher. (#1027) (#1686) (#1708) (#1740) (#1751) (#1769)
The redundant OptionalAttr has been removed from the adjoint argument in the QuantumOps.td TableGen file. (#1746)
ValueRange has been replaced with TypeRange for creating CustomOp in IonsDecompositionPatterns.cpp to match the build constructors. (#1749)
The unused helper function genArgMapFunction in the --lower-gradients pass has been removed. (#1753)
Base components of QFuncPLxPRInterpreter have been moved into a base class called SubroutineInterpreter. This is intended to reduce code duplication. (#1787)
An argument (openapl_file_name) has been added to the OQDDevice constructor to specify the name of the output OpenAPL file. (#1763)
The OQD device TOML file has been modified to only include gates that are decomposable to the OQD device target gate set. (#1763)
The quantum-to-ion pass has been renamed to gates-to-pulses. (#1818)
The runtime CAPI function __catalyst__rt__num_qubits now has a corresponding JAX primitive num_qubits_p and quantum dialect operation NumQubitsOp. (#1793)
Measurements whose shapes depend on the number of qubits now properly retrieve the number of qubits through this new operation when it is dynamic.
The PPR/PPM pass names have been renamed from snake-case to kebab-case in MLIR to align with MLIR conventions. Class names and tests were updated accordingly. Example: --to_ppr is now --to-ppr. (#1802)
A new internal Python module called catalyst.from_plxpr has been created to better organize the code for plxpr integration. (#1813)
A new from_plxpr.QregManager has been created to handle converting plxpr wire index semantics into Catalyst qubit value semantics. (#1813)
Documentation 📝
The header (logo+title) images in the README and in the overview on ReadTheDocs have been updated, reflecting that Catalyst is now beyond beta 🎉! (#1718)
The API section in the documentation has been simplified. The Catalyst ‘Runtime Device Interface’ page has been updated to point directly to the documented QuantumDevice struct, and the ‘QIR C-API’ page has been removed due to limited utility. (#1739)
Contributors ✍️
This release contains contributions from (in alphabetical order):
Runor Agbaire, Joey Carter, Isaac De Vlugt, Sengthai Heng, David Ittah, Tzung-Han Juang, Christina Lee, Mehrdad Malekmohammadi, Anton Naim Ibrahim, Erick Ochoa Lopez, Ritu Thombre, Raul Torres, Paul Haochen Wang, Jake Zaia.
Release 0.11.0
New features since last release
A novel optimization technique is implemented in Catalyst that performs quantum peephole optimizations across loop boundaries. The technique has been added to the existing optimizations cancel_inverses and merge_rotations to increase their effectiveness in structured programs. (#1476)
A frequently occurring pattern is operations at the beginning and end of a loop that cancel each other out. With loop boundary analysis, the cancel_inverses optimization can eliminate these redundant operations and thus reduce quantum circuit depth.
For example,

import catalyst
import pennylane as qml

dev = qml.device("lightning.qubit", wires=2)

@qml.qjit
@catalyst.passes.cancel_inverses
@qml.qnode(dev)
def circuit():
    for i in range(3):
        qml.Hadamard(0)
        qml.CNOT([0, 1])
        qml.Hadamard(0)
    return qml.expval(qml.Z(0))

Here, the Hadamard gate pairs which are consecutive across two iterations are eliminated, leaving behind only two unpaired Hadamard gates, from the first and last iteration, without unrolling the for loop. For more details on loop-boundary optimization, see the PennyLane Compilation entry.
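The effect described above can be mimicked on a plain list of gates. The following is a toy model only, not Catalyst's MLIR implementation: the function name hoist_boundary_inverses and the tuple representation of gates are assumptions made for illustration. If a loop body starts and ends with the same self-inverse gate on the same wires, the copies that meet across iteration boundaries cancel, so one copy can be placed before the loop and one after it.

```python
def hoist_boundary_inverses(body):
    """Peel a self-inverse gate that both opens and closes a loop body.

    `body` is a list of (gate_name, wires) tuples. Returns a tuple
    (prologue, new_body, epilogue) describing an equivalent program:
    prologue, then the loop over new_body, then epilogue.
    """
    self_inverse = {"Hadamard", "PauliX", "PauliY", "PauliZ", "CNOT"}
    if len(body) >= 2 and body[0] == body[-1] and body[0][0] in self_inverse:
        return [body[0]], body[1:-1], [body[-1]]
    return [], list(body), []


# Loop body from the example above: Hadamard, CNOT, Hadamard.
body = [("Hadamard", (0,)), ("CNOT", (0, 1)), ("Hadamard", (0,))]
prologue, new_body, epilogue = hoist_boundary_inverses(body)

# Unroll three iterations only to count gates; the rewrite itself
# never needs to unroll the loop.
unrolled = prologue + new_body * 3 + epilogue
hadamards = sum(1 for name, _ in unrolled if name == "Hadamard")
print(hadamards)
```

In this toy model the six Hadamard gates of the original three iterations reduce to two, matching the description above.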
A new intermediate representation and compilation framework has been added to Catalyst to describe and manipulate programs in the Pauli product measurement (PPM) representation. As part of this framework, three new passes are now available to convert Clifford + T gates to Pauli product measurements as described in arXiv:1808.02892. (#1499) (#1551) (#1563) (#1564) (#1577)
Note that programs in the PPM representation cannot yet be executed on available backends. The passes currently exist for analysis, but PPM programs may become executable in the future when a suitable backend is available.
The following new compilation passes can be accessed from the passes module or in pipeline():
catalyst.passes.to_ppr: Clifford + T gates are converted into Pauli product rotations (PPRs) (\(\exp(iP \theta)\), where \(P\) is a tensor product of Pauli operators):
H gate → 3 rotations with \(P_1 = Z, P_2 = X, P_3 = Z\) and \(\theta = \tfrac{\pi}{4}\)
S gate → 1 rotation with \(P = Z\) and \(\theta = \tfrac{\pi}{4}\)
T gate → 1 rotation with \(P = Z\) and \(\theta = \tfrac{\pi}{8}\)
CNOT gate → 3 rotations with \(P_1 = (Z \otimes X), P_2 = (-Z \otimes \mathbb{1}), P_3 = (-\mathbb{1} \otimes X)\) and \(\theta = \tfrac{\pi}{4}\)
catalyst.passes.commute_ppr: Commute Clifford PPR operations (PPRs with \(\theta = \tfrac{\pi}{4}\)) to the end of the circuit, past non-Clifford PPRs (PPRs with \(\theta = \tfrac{\pi}{8}\))
catalyst.passes.ppr_to_ppm: Absorb Clifford PPRs into terminal Pauli product measurements (PPMs).
For more information on PPMs, please refer to our PPM documentation page.
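As a worked check of the Hadamard entry listed for to_ppr, the sketch below multiplies three rotations of the form exp(iθP) with P = Z, X, Z and θ = π/4, and compares the product with the Hadamard matrix up to a global phase. It uses only the Python standard library; the helper names matmul and ppr are hypothetical. Only the Hadamard case is checked because, being symmetric in Z and X, it holds for either sign convention of the rotation angle.

```python
import cmath
import math

I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]


def matmul(a, b):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def ppr(pauli, theta):
    """exp(i * theta * P) = cos(theta) * I + i * sin(theta) * P for a Pauli P."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c * I2[i][j] + 1j * s * pauli[i][j] for j in range(2)] for i in range(2)]


theta = math.pi / 4
product = matmul(ppr(Z, theta), matmul(ppr(X, theta), ppr(Z, theta)))

r = 1 / math.sqrt(2)
hadamard = [[r, r], [r, -r]]

# Divide out the global phase using the top-left entry, then compare entrywise.
phase = product[0][0] / hadamard[0][0]
matches = all(
    cmath.isclose(product[i][j], phase * hadamard[i][j], abs_tol=1e-12)
    for i in range(2)
    for j in range(2)
)
print(matches, phase)
```

The product equals the Hadamard matrix multiplied by a global phase of i, which has no observable effect.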
Catalyst now supports qubit number-invariant compilation. That is, programs can be compiled without specifying the number of qubits to allocate ahead of time. Instead, the device can be supplied with a dynamic program variable as the number of wires. (#1549) (#1553) (#1565) (#1574)
For example, the following toy workflow is now supported, where the number of qubits, n, is provided as an argument to a qjit’d function:

import catalyst
import pennylane as qml

@catalyst.qjit(autograph=True)
def f(n):
    device = qml.device("lightning.qubit", wires=n, shots=10)

    @qml.qnode(device)
    def circuit():
        for i in range(n):
            qml.RX(1.5, wires=i)
        return qml.counts()

    return circuit()

>>> f(3)
(Array([0, 1, 2, 3, 4, 5, 6, 7], dtype=int64),
 Array([0, 0, 3, 2, 3, 1, 1, 0], dtype=int64))
>>> f(4)
(Array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15], dtype=int64),
 Array([0, 0, 1, 1, 2, 0, 0, 0, 0, 0, 1, 1, 2, 1, 0, 1], dtype=int64))
Catalyst better integrates with PennyLane program capture, supporting PennyLane-native control flow operations and providing more efficient transform handling when both Catalyst and PennyLane support a transform. (#1468) (#1509) (#1521) (#1544) (#1561) (#1567) (#1578)
Using PennyLane’s program capture mechanism involves setting experimental_capture=True in the qjit decorator. With this present, the following control flow functions in PennyLane are now usable with qjit:
Support for qml.cond:

import pennylane as qml
from catalyst import qjit

dev = qml.device("lightning.qubit", wires=1)

@qjit(experimental_capture=True)
@qml.qnode(dev)
def circuit(x: float):

    def ansatz_true():
        qml.RX(x, wires=0)
        qml.Hadamard(wires=0)

    def ansatz_false():
        qml.RY(x, wires=0)

    qml.cond(x > 1.4, ansatz_true, ansatz_false)()

    return qml.expval(qml.Z(0))

>>> circuit(0.1)
Array(0.99500417, dtype=float64)

Support for qml.for_loop:

dev = qml.device("lightning.qubit", wires=2)

@qjit(experimental_capture=True)
@qml.qnode(dev)
def circuit(x: float):

    @qml.for_loop(10)
    def loop(i):
        qml.H(wires=1)
        qml.RX(x, wires=0)
        qml.CNOT(wires=[0, 1])

    loop()
    return qml.expval(qml.Z(0))

>>> circuit(0.1)
Array(0.97986841, dtype=float64)

Support for qml.while_loop:

@qjit(experimental_capture=True)
@qml.qnode(dev)
def circuit(x: float):

    f = lambda c: c < 5

    @qml.while_loop(f)
    def loop(c):
        qml.H(wires=1)
        qml.RX(x, wires=0)
        qml.CNOT(wires=[0, 1])
        return c + 1

    loop(0)
    return qml.expval(qml.Z(0))

>>> circuit(0.1)
Array(0.97526892, dtype=float64)

Additionally, Catalyst can now apply its own compilation passes when equivalent transforms are provided by PennyLane (e.g., cancel_inverses and merge_rotations). In cases where Catalyst does not have its own analogous implementation of a transform available in PennyLane, the transform will be expanded according to rules provided by PennyLane.
For example, consider this workflow that contains two PennyLane transforms: cancel_inverses and single_qubit_fusion. Catalyst has its own implementation of cancel_inverses in the passes module, and will invoke that implementation instead. Conversely, Catalyst does not have its own implementation of single_qubit_fusion, and will therefore resort to PennyLane’s implementation of the transform.

dev = qml.device("lightning.qubit", wires=1)

@qjit(experimental_capture=True)
def func(r1, r2):

    @qml.transforms.cancel_inverses
    @qml.transforms.single_qubit_fusion
    @qml.qnode(dev)
    def circuit(r1, r2):
        qml.Rot(*r1, wires=0)
        qml.Rot(*r2, wires=0)
        qml.RZ(r1[0], wires=0)
        qml.RZ(r2[0], wires=0)
        qml.Hadamard(wires=0)
        qml.Hadamard(wires=0)
        return qml.expval(qml.PauliZ(0))

    return circuit(r1, r2)

>>> import jax.numpy as jnp
>>> r1 = jnp.array([0.1, 0.2, 0.3])
>>> r2 = jnp.array([0.4, 0.5, 0.6])
>>> func(r1, r2)
Array(0.7872403, dtype=float64)
Improvements 🛠
Several changes have been made to reduce compile time:
Catalyst now decomposes non-differentiable gates when differentiating through workflows. Additionally, with diff_method=parameter-shift, circuits are now verified to be fully compatible with Catalyst’s parameter-shift implementation before compilation. (#1562) (#1568) (#1569) (#1604)
Gates that are constant, such as when all parameters are Python or NumPy data types, are not decomposed when this is allowable. For the adjoint differentiation method, this is allowable for the StatePrep, BasisState, and QubitUnitary operations. For the parameter-shift method, this is allowable for all operations.
An mlir_opt property has been added to qjit to access the optimized MLIR representation of a compiled function. This is the representation of the program after running everything in the MLIR stage of the entire pipeline. (#1579) (#1637)

from catalyst import qjit

@qjit
def f(x):
    return x**2

>>> f(2)
Array(4, dtype=int64)
>>> print(f.mlir_opt)
module @f {
  llvm.func @__catalyst__rt__finalize()
  llvm.func @__catalyst__rt__initialize(!llvm.ptr)
  llvm.func @_mlir_memref_to_llvm_alloc(i64) -> !llvm.ptr
  llvm.func @jit_f(%arg0: !llvm.ptr, %arg1: !llvm.ptr, %arg2: i64) -> !llvm.struct<(ptr, ptr, i64)> attributes {llvm.copy_memref, llvm.emit_c_interface}
  ...
  llvm.func @teardown() {
    llvm.call @__catalyst__rt__finalize() : () -> ()
    llvm.return
  }
}

The error messages that indicate invalid scale_factors in catalyst.mitigate_with_zne have been improved to be formatted properly. (#1603)
Bug fixes 🐛
Fixed the argnums parameter of grad and value_and_grad being ignored. (#1478)
All dialects are loaded preemptively. This allows third-party plugins to load their dialects. (#1584)
Fixed an issue where Catalyst could give incorrect results for circuits containing qml.StatePrep. (#1491)
Fixed an issue where using autograph in conjunction with catalyst passes caused a crash. (#1541)
Fixed an issue where using autograph in conjunction with catalyst pipeline caused a crash. (#1576)
Fixed an issue where using chained catalyst passes decorators caused a crash. (#1576)
Specialized handling for pipelines was added. (#1599)
Fixed an issue where using autograph with control/adjoint functions used on operator objects caused a crash. (#1605)
Fixed an issue where using pytrees inside a loop with autograph caused falling back to Python. (#1601)
For example, the following code will now be captured and executed properly with AutoGraph enabled:

from catalyst import qjit

def updateList(x):
    return [x[0]+1, x[1]+2]

@qjit(autograph=True)
def fn(x):
    for i in range(4):
        x = updateList(x)
    return x

>>> fn([1, 2])
[Array(5, dtype=int64), Array(10, dtype=int64)]
Closure variables are now supported with grad and value_and_grad. (#1613)
Internal changes ⚙️
Pattern rewriting in the quantum-to-ion lowering pass has been changed to use MLIR’s dialect conversion infrastructure. (#1442)
Updated the call signature for the plxpr qnode_prim primitive. (#1538)
Updated deprecated access to QNode.execute_kwargs["mcm_config"]. postselect_mode and mcm_method should be accessed instead. (#1452)
from_plxpr now uses the qml.capture.PlxprInterpreter class for reduced code duplication. (#1398)
Improved the error message for invalid measurement in adjoint() or ctrl() region. (#1425)
Replaced ValueRange with ResultRange and Value with OpResult to better align with the semantics of **QubitResult() functions like getNonCtrlQubitResults(). This change ensures clearer intent and usage. Also, the matchAndRewrite function has been improved by using replaceAllUsesWith instead of a for loop. (#1426)
Several changes for experimental support of trapped-ion OQD devices have been made, including:
The get_c_interface method has been added to the OQD device, which enables retrieval of the C++ implementation of the device from Python. This allows qjit to accept an instance of the device and connect to its runtime. (#1420)
The ion dialect has been improved to reduce redundant code generated, a string attribute label has been added to Level, and the levels of a transition have changed from LevelAttr to string. (#1471)
The region of a ParallelProtocolOp is now always terminated with an ion::YieldOp with explicitly yielded SSA values. This ensures the op is well-formed, and improves readability. (#1475)
Added a new pass called convert-ion-to-llvm which lowers the Ion dialect to llvm dialect. This pass introduces oqd device specific stubs that will be implemented in oqd runtime including: @ __catalyst__oqd__pulse, @ __catalyst__oqd__ParallelProtocol. (#1466)
The OQD device can now generate OpenAPL JSON specs during runtime. The oqd stubs @ __catalyst__oqd__pulse, and @ __catalyst__oqd__ParallelProtocol, which are called in the llvm dialect after the aforementioned lowering ((#1466)), are defined to produce JSON specs that OpenAPL expects. (#1516)
The OQD device has been moved from frontend/catalyst/third_party/oqd to runtime/lib/backend/oqd. An overall switch, ENABLE_OQD, is added to control the OQD build system from a single entry point. The switch is OFF by default, and OQD can be built from source via make all ENABLE_OQD=ON, or make runtime ENABLE_OQD=ON. (#1508)
Ion dialect now supports phonon modes using ion.modes operation. (#1517)
Rotation angles are normalized to avoid negative duration for pulses during ion dialect lowering. (#1517)
Catalyst now generates OpenAPL programs for PennyLane circuits of up to two qubits using the OQD device. (#1517)
The end-to-end compilation pipeline for OQD devices is available as an API function. (#1545)
The source code has been updated to comply with changes requested by black v25.1.0 (#1490)
Reverted StaticCustomOp in favour of adding the helper functions isStatic() and getStaticParams() to CustomOp, which preserves the same functionality. More specifically, this reverts [#1387] and [#1396], and modifies [#1489]. (#1558) (#1555)
Updated the C++ standard in the MLIR layer from 17 to 20. (#1229)
Documentation 📝
Added more details to the JAX integration documentation regarding the use of .at with multiple indices. (#1595)
Contributors ✍️
This release contains contributions from (in alphabetical order):
Joey Carter, Yushao Chen, Isaac De Vlugt, Zach Goldthorpe, Sengthai Heng, David Ittah, Rohan Nolan Lasrado, Christina Lee, Mehrdad Malekmohammadi, Erick Ochoa Lopez, Andrija Paurevic, Raul Torres, Paul Haochen Wang.
Release 0.10.0
New features since last release
Catalyst can now load and apply local MLIR plugins from the PennyLane frontend. (#1287) (#1317) (#1361) (#1370)
Custom compilation passes and dialects in MLIR can be specified for use in Catalyst via a shared object (*.so or *.dylib on macOS) that implements the pass. Details on creating your own plugin can be found in our compiler plugin documentation. At a high level, there are three ways to use a plugin once it’s properly specified:
apply_pass() can be used on QNodes when there is a Python entry point defined for the plugin. In that case, the plugin and pass should both be specified and separated by a period.

@catalyst.passes.apply_pass("plugin_name.pass_name")
@qml.qnode(qml.device("lightning.qubit", wires=1))
def qnode():
    return qml.state()

@qml.qjit
def module():
    return qnode()

apply_pass_plugin() can be used on QNodes when the plugin did not define an entry point. In that case the full filesystem path must be specified in addition to the pass name.

from pathlib import Path

@catalyst.passes.apply_pass_plugin(Path("path_to_plugin"), "pass_name")
@qml.qnode(qml.device("lightning.qubit", wires=1))
def qnode():
    return qml.state()

@qml.qjit
def module():
    return qnode()

Alternatively, one or more dialect and pass plugins can be specified in advance in the qjit() decorator, via the pass_plugins and dialect_plugins keyword arguments. The apply_pass() function can then be used without specifying the plugin.

from pathlib import Path

plugin = Path("shared_object_file.so")

@catalyst.passes.apply_pass("pass_name")
@qml.qnode(qml.device("lightning.qubit", wires=1))
def qnode():
    qml.Hadamard(wires=0)
    return qml.state()

@qml.qjit(pass_plugins=[plugin], dialect_plugins=[plugin])
def module():
    return qnode()

For more information on usage, visit our compiler plugin documentation.
Improvements 🛠
The Catalyst CLI, a command line interface for debugging and dissecting different stages of compilation, is now available under the catalyst command after installing Catalyst with pip. Even though the tool was first introduced in v0.9, it was not yet included in binary distributions of Catalyst (wheels). The full usage instructions are available in the Catalyst CLI documentation. (#1285) (#1368) (#1405)
Lightning devices now support finite-shot expectation values of qml.Hermitian when used with Catalyst. (#451)
The PennyLane state preparation template qml.CosineWindow is now compatible with Catalyst. (#1166)
A development distribution of Python with dynamic linking support (libpython.so) is no longer needed in order to use compile_executable() to generate standalone executables of compiled programs. (#1305)
In Catalyst v0.9 the output of the compiler instrumentation (instrumentation()) had inadvertently been made more verbose by printing timing information for each run of each pass. This change has been reverted. Instead, the qjit() option verbose=True will now instruct the instrumentation to produce this more detailed output. (#1343)
Two additional circuit optimizations have been added to Catalyst: disentangle-CNOT and disentangle-SWAP. The optimizations are available via the passes module. (#1154) (#1407)
The optimizations use a finite state machine to propagate limited qubit state information through the circuit to turn CNOT and SWAP gates into cheaper instructions. The pass is based on the work by J. Liu, L. Bello, and H. Zhou, Relaxed Peephole Optimization: A Novel Compiler Optimization for Quantum Circuits, 2020, arXiv:2012.07711.
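The state-propagation idea can be sketched in a few lines of Python. This is a much-simplified toy, not the Catalyst pass: it tracks only three per-qubit states (|0>, |1>, and unknown) and handles only CNOT, whereas the referenced paper and the actual passes cover more cases. The function name disentangle_cnot_toy and the tuple representation of gates are assumptions made for illustration.

```python
def disentangle_cnot_toy(ops, num_wires):
    """Rewrite CNOTs whose control qubit is known to be |0> or |1>.

    `ops` is a list of (gate_name, wires) tuples. Each qubit is tracked as
    "0", "1", or "?" (unknown), starting from |0> on every wire.
    """
    flip = {"0": "1", "1": "0"}
    state = ["0"] * num_wires
    out = []
    for name, wires in ops:
        if name == "PauliX":
            (w,) = wires
            state[w] = flip.get(state[w], "?")
            out.append((name, wires))
        elif name == "CNOT":
            control, target = wires
            if state[control] == "0":
                # Control is |0>: the CNOT acts as the identity and is dropped.
                continue
            if state[control] == "1":
                # Control is |1>: the CNOT acts as an X gate on the target.
                state[target] = flip.get(state[target], "?")
                out.append(("PauliX", (target,)))
            else:
                # Control is unknown: keep the CNOT and forget the target state.
                state[target] = "?"
                out.append((name, wires))
        else:
            # Any other gate: conservatively mark its wires as unknown.
            for w in wires:
                state[w] = "?"
            out.append((name, wires))
    return out


optimized = disentangle_cnot_toy([("PauliX", (0,)), ("CNOT", (0, 1))], num_wires=2)
print(optimized)
```

For the circuit X(0) followed by CNOT(0, 1), the toy rewrite produces two PauliX gates, one on each wire, and a CNOT whose control is still in |0> is removed entirely.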
Breaking changes 💔
The minimum supported PennyLane version has been updated to v0.40; backwards compatibility in either direction is not maintained. (#1308)
(Device Developers Only) The way the shots parameter is initialized in C++ device backends is changing. (#1310)
The previous method of including the shot number in the kwargs argument of the device constructor is deprecated and will be removed in the next release (v0.11). Instead, the shots value will be specified exclusively via the existing SetDeviceShots function called at the beginning of a quantum execution. Device developers are encouraged to update their device implementations between this and the next release while both methods are supported.
Similarly, the Sample and Counts functions (and their Partial* equivalents) will no longer provide a shots argument, since it is redundant. The signatures of these functions will be updated in the next release.
(Device Developers Only) The toml-based device schemas have been integrated with PennyLane and updated to a new version schema = 3. (#1275)
Devices with existing TOML schema = 2 will not be compatible with the current release of Catalyst until updated. A summary of the most important changes is listed here:
operators.gates.native renamed to operators.gates
operators.gates.decomp and operators.gates.matrix are removed and no longer necessary
condition property is renamed to conditions
Entries in the measurement_processes section now expect the full PennyLane class name as opposed to the deprecated mp.return_type shorthand (e.g. ExpectationMP instead of Expval).
The mid_circuit_measurements field has been replaced with supported_mcm_methods, which expects a list of mcm methods that the device is able to work with (or empty if unsupported).
A new field has been added, overlapping_observables, which indicates whether a device supports multiple measurements during one execution on overlapping wires.
The options section has been removed. Instead, the Python device class should define a device_kwargs field holding the name and values of C++ device constructor kwargs.
See the Custom Devices page for the most up-to-date information on integrating your device with Catalyst and PennyLane.
Bug fixes 🐛
Fixed a bug introduced in Catalyst v0.8 that breaks nested invocations of qml.adjoint and qml.ctrl (e.g. qml.adjoint(qml.adjoint(qml.H(0)))). (#1301)
Fixed a bug in compile_executable() when using non-64bit arrays as input to the compiled function, due to incorrectly computed stride information. (#1338)
Fixed a bug in the Catalyst CLI where using checkpoint-stage would cause save-ir-after-each to not work properly. (#1405)
Internal changes ⚙️
Starting with Python 3.12, Catalyst’s binary distributions (wheels) will now follow Python’s Stable ABI, eliminating the need for a separate wheel per minor Python version. To enable this, the following changes have been made:
Stable ABI wheels are now generated for Python 3.12 and up. (#1357) (#1385)
Pybind11 has been replaced with nanobind for C++/Python bindings across all components. (#1173) (#1293) (#1391) (#624)
Nanobind has been developed as a natural successor to the pybind11 library and offers a number of advantages like its ability to target Python’s Stable ABI.
Python C-API calls have been replaced with functions from Python’s Limited API. (#1354)
The QuantumExtension module for MLIR Python bindings, which relies on pybind11, has been removed. The module was never included in the distributed wheels and could not be converted to nanobind easily due to its dependency on upstream MLIR code. Pybind11 does not support the Python Stable ABI. (#1187)
Catalyst no longer depends on or pins the scipy package. Instead, OpenBLAS is sourced directly from scipy-openblas32 or Accelerate is used. (#1322) (#1328)
The Catalyst plugin for the lightning.qubit device has been migrated from the Catalyst repo to the Lightning repository. This reduces the size of Catalyst’s binary distributions and the build time of the project, by avoiding re-compilation of the lightning source code. (#1227) (#1307) (#1312)
The AutoGraph exception mechanism (allowlist parameter) has been streamlined to only be used in places where it’s required. (#1332) (#1337)
Each QNode now has its own transformation schedule. Instead of relying on the name of the QNode, each QNode now has a transformation module, which denotes the transformation schedule, embedded in its MLIR representation. (#1323)
The
apply_registered_pass_pprimitive has been removed and the API for scheduling passes to run using the transform dialect has been refactored. In particular, passes are appended to a tuple as they are being registered and they will be run in order. If there are no local passes, the globalpass_pipelineis scheduled. Furthermore, this commit also reworks the caching mechanism for primitives, which is important as qnodes and functions are primitives and now that we can apply passes to them, they are distinct based on which passes have been scheduled to run on them. (#1317)The Catalyst infrastructure has been upgraded to support a dynamic
shotsparameter for quantum execution. Previously, this value had to be a static compile-time constant, and could not be changed once the program was compiled. Upcoming UI changes will make the feature accessible to users. (#1360)Several changes for experimental support of trapped-ion OQD devices have been made, including:
An experimental
iondialect has been added for Catalyst programs targeting OQD trapped-ion quantum devices. (#1260) (#1372)The
iondialect defines the set of physical properties of the device, such as the ion species and their atomic energy levels, as well as the operations to manipulate the qubits in the trapped-ion system, such as laser pulse durations, polarizations, detuning frequencies, etc.A new pass,
--quantum-to-ion, has also been added to convert logical gate-based circuits in the Catalystquantumdialect to laser pulse operations in theiondialect. This pass accepts logical quantum gates from the set{RX, RY, MS}, whereMSis the Mølmer–Sørensen gate. Doing so enables the insertion of physical device parameters into the IR, which will be necessary when lowering to OQD’s backend calls. The physical parameters, which are typically obtained from hardware-calibration runs, are read in from TOML files during the--quantum-to-ionconversion. The TOML filepaths are taken in as pass options.A plugin and device backend for OQD trapped-ion quantum devices has been added. (#1355) (#1403)
An MLIR transformation has been added to decompose
{T, S, Z, Hadamard, RZ, PhaseShift, CNOT}gates into the set{RX, RY, MS}. (#1226)
Support for OQD devices is still under development, therefore OQD modules are currently not included in binary distributions (wheels) of Catalyst.
The Catalyst IR has been extended to support literal values as opposed to SSA Values for static parameters of quantum gates by adding a new gate called
StaticCustomOp, with eventual lowering to the regularCustomOpoperation. (#1387) (#1396)Code readability in the
catalyst.pipelinesmodule has been improved, in particular for pipelines with conditionally included passes. (#1194)
Documentation 📝
A new tutorial going through how to write a new MLIR pass is available. The tutorial writes an empty pass that prints
hello world. The code for the tutorial is located in a separate github branch. (#872)The
verbose parameter of qjit() was incorrectly listed as verbosity in the API documentation. This is now fixed. (#1440)
More details have been added to the catalyst-cli documentation, specifying the available options for checkpoint-stage and the default pipelines. (#1405)
Contributors ✍️
This release contains contributions from (in alphabetical order):
Astral Cai, Joey Carter, David Ittah, Erick Ochoa Lopez, Mehrdad Malekmohammadi, William Maxwell, Romain Moyard, Shuli Shu, Ritu Thombre, Raul Torres, Paul Haochen Wang.
Release 0.9.0¶
New features
Catalyst now supports the specification of shot-vectors when used with
qml.sample measurements on the lightning.qubit device. (#1051)
Shot-vectors allow shots to be specified as a list of shots,
[20, 1, 100], or as a tuple of the form ((num_shots, repetitions), ...) such that ((20, 3), (1, 100)) is equivalent to shots=[20, 20, 20, 1, 1, ..., 1].
This can result in more efficient quantum execution, as a single job representing the total number of shots is executed on the quantum device, with the measurement post-processing then coarse-grained with respect to the shot-vector.
For example,
dev = qml.device("lightning.qubit", wires=1, shots=((5, 2), 7)) @qjit @qml.qnode(dev) def circuit(): qml.Hadamard(0) return qml.sample()
>>> circuit() (Array([[0], [1], [0], [1], [1]], dtype=int64), Array([[0], [1], [1], [0], [1]], dtype=int64), Array([[1], [0], [1], [1], [0], [1], [0]], dtype=int64))
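The shot-vector bookkeeping described above can be illustrated in plain Python. This is a minimal sketch, not Catalyst's implementation: the helper names `expand_shot_vector` and `partition_samples` are hypothetical, and the code only shows how a mixed specification such as `((5, 2), 7)` expands into per-measurement shot counts and how a single batch of samples could then be split accordingly.

```python
from typing import Iterable, List, Sequence, Tuple, Union

# One entry is either a plain shot count or a (num_shots, repetitions) pair.
ShotSpec = Union[int, Tuple[int, int]]


def expand_shot_vector(shots: Iterable[ShotSpec]) -> List[int]:
    """Expand (num_shots, repetitions) pairs into a flat list of shot counts."""
    expanded: List[int] = []
    for entry in shots:
        if isinstance(entry, int):
            expanded.append(entry)
        else:
            num_shots, repetitions = entry
            expanded.extend([num_shots] * repetitions)
    return expanded


def partition_samples(samples: Sequence, shots: Iterable[ShotSpec]) -> List[Sequence]:
    """Split one batch of samples into consecutive chunks, one per shot count."""
    sizes = expand_shot_vector(shots)
    if sum(sizes) != len(samples):
        raise ValueError("number of samples must equal the total number of shots")
    chunks, start = [], 0
    for size in sizes:
        chunks.append(samples[start:start + size])
        start += size
    return chunks


flat = expand_shot_vector(((20, 3), (1, 100)))
chunks = partition_samples(list(range(17)), ((5, 2), 7))
print(flat[:4], len(flat))        # [20, 20, 20, 1] 103
print([len(c) for c in chunks])   # [5, 5, 7]
```

The chunk sizes `[5, 5, 7]` match the shapes of the three arrays returned by the `circuit()` example above.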
Note that other measurement types, such as
expvalandprobs, currently do not support shot-vectors.A new function
catalyst.pipelineallows the quantum-circuit-transformation pass pipeline for QNodes within a qjit-compiled workflow to be configured. (#1131) (#1240)import pennylane as qml from catalyst import pipeline, qjit my_passes = { "cancel_inverses": {}, "my_circuit_transformation_pass": {"my-option" : "my-option-value"}, } dev = qml.device("lightning.qubit", wires=2) @pipeline(my_passes) @qml.qnode(dev) def circuit(x): qml.RX(x, wires=0) return qml.expval(qml.PauliZ(0)) @qjit def fn(x): return jnp.sin(circuit(x ** 2))
pipelinecan also be used to specify different pass pipelines for different parts of the same qjit-compiled workflow:my_pipeline = { "cancel_inverses": {}, "my_circuit_transformation_pass": {"my-option" : "my-option-value"}, } my_other_pipeline = {"cancel_inverses": {}} @qjit def fn(x): circuit_pipeline = pipeline(my_pipeline)(circuit) circuit_other = pipeline(my_other_pipeline)(circuit) return jnp.abs(circuit_pipeline(x) - circuit_other(x))
The pass pipeline order and options can be configured globally for a qjit-compiled function, by using the
circuit_transform_pipelineargument of theqjit()decorator.my_passes = { "cancel_inverses": {}, "my_circuit_transformation_pass": {"my-option" : "my-option-value"}, } @qjit(circuit_transform_pipeline=my_passes) def fn(x): return jnp.sin(circuit(x ** 2))
Global and local (via
@pipeline) configurations can coexist; however, local pass pipelines always take precedence over global pass pipelines.
The available MLIR passes are listed and documented in the passes module documentation.
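The precedence rule between local and global pipelines can be pictured with a small plain-Python sketch. The function `effective_pipeline` is hypothetical and is not part of Catalyst. It also assumes that "take precedence" means the local pipeline replaces the global one for that QNode rather than being merged with it.

```python
from typing import Dict, Optional

PassPipeline = Dict[str, Dict[str, str]]


def effective_pipeline(
    local: Optional[PassPipeline], global_pipeline: Optional[PassPipeline]
) -> PassPipeline:
    """Pick the pipeline for one QNode: the local one wins whenever it is set."""
    if local is not None:
        return local
    if global_pipeline is not None:
        return global_pipeline
    return {}


global_passes = {
    "cancel_inverses": {},
    "my_circuit_transformation_pass": {"my-option": "my-option-value"},
}
local_passes = {"cancel_inverses": {}}

# A QNode with a local pipeline ignores the global configuration (assumption).
print(list(effective_pipeline(local_passes, global_passes)))
# A QNode without a local pipeline falls back to the global one.
print(list(effective_pipeline(None, global_passes)))
```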
A peephole merge rotations pass, which acts similarly to the Python-based PennyLane merge rotations transform, is now available in MLIR and can be applied to QNodes within a qjit-compiled function. (#1162) (#1205) (#1206)
The
merge_rotationspass can be provided to thecatalyst.pipelinedecorator:from catalyst import pipeline, qjit my_passes = { "merge_rotations": {} } dev = qml.device("lightning.qubit", wires=1) @qjit @pipeline(my_passes) @qml.qnode(dev) def g(x: float): qml.RX(x, wires=0) qml.RX(x, wires=0) qml.Hadamard(wires=0) return qml.expval(qml.PauliX(0))
It can also be applied directly to qjit-compiled QNodes via the
catalyst.passes.merge_rotationsPython decorator:from catalyst.passes import merge_rotations @qjit @merge_rotations @qml.qnode(dev) def g(x: float): qml.RX(x, wires=0) qml.RX(x, wires=0) qml.Hadamard(wires=0) return qml.expval(qml.PauliX(0))
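To give an idea of what a peephole merge of rotations does, here is a minimal plain-Python sketch that operates on a list of gates. It is not the MLIR pass: the gate representation, the `ROTATIONS` set and the `merge_rotations_sketch` name are assumptions made for illustration, and only directly neighbouring gates in the list are considered.

```python
from typing import List, Tuple

# (gate name, wires, angle); non-rotation gates carry an unused angle of 0.0.
Gate = Tuple[str, Tuple[int, ...], float]

ROTATIONS = {"RX", "RY", "RZ"}  # illustrative subset, not the pass's real list


def merge_rotations_sketch(gates: List[Gate]) -> List[Gate]:
    """Fuse neighbouring rotations of the same kind on the same wires."""
    merged: List[Gate] = []
    for name, wires, angle in gates:
        if merged and name in ROTATIONS:
            prev_name, prev_wires, prev_angle = merged[-1]
            if prev_name == name and prev_wires == wires:
                # Two rotations about the same axis compose by adding angles.
                merged[-1] = (name, wires, prev_angle + angle)
                continue
        merged.append((name, wires, angle))
    return merged


circuit = [("RX", (0,), 0.25), ("RX", (0,), 0.25), ("Hadamard", (0,), 0.0)]
print(merge_rotations_sketch(circuit))
# [('RX', (0,), 0.5), ('Hadamard', (0,), 0.0)]
```

The input mirrors the `g(x)` example above, where the two `qml.RX(x, wires=0)` calls would become a single rotation by `2 * x`.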
Static arguments of a qjit-compiled function can now be indicated by name via a
static_argnamesargument to theqjitdecorator. (#1158)Specified static argument names will be treated as compile-time static values, allowing any hashable Python object to be passed to this function argument during compilation.
>>> @qjit(static_argnames="y") ... def f(x, y): ... print(f"Compiling with y={y}") ... return x + y >>> f(0.5, 0.3) Compiling with y=0.3
The function will only be re-compiled if the hash values of the static arguments change. Otherwise, re-using previous static argument values will result in no re-compilation:
Array(0.8, dtype=float64) >>> f(0.1, 0.3) # no re-compilation occurs Array(0.4, dtype=float64) >>> f(0.1, 0.4) # y changes, re-compilation Compiling with y=0.4 Array(0.5, dtype=float64)
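The recompilation behaviour shown above can be modelled with a cache keyed on the static argument. The following is a toy sketch in plain Python, not how `qjit` is implemented: the class `StaticArgCache` is hypothetical, and "compilation" is simulated by specialising a closure on the static value.

```python
from typing import Any, Callable, Dict, Hashable


class StaticArgCache:
    """Toy model of recompilation keyed on a hashable static argument."""

    def __init__(self, fn: Callable[[float, Any], float]):
        self.fn = fn
        self.compiled: Dict[Hashable, Callable[[float], float]] = {}
        self.compile_count = 0

    def __call__(self, x: float, y: Hashable) -> float:
        if y not in self.compiled:
            # "Compile": specialise the function on the static value of y.
            self.compile_count += 1
            self.compiled[y] = lambda x_, y_=y: self.fn(x_, y_)
        return self.compiled[y](x)


f = StaticArgCache(lambda x, y: x + y)
f(0.5, 0.3)  # first call: compiles for y=0.3
f(0.1, 0.3)  # same static value: cache hit, no recompilation
f(0.1, 0.4)  # new static value: compiles again
print(f.compile_count)  # 2
```

A Python dictionary compares keys by hash and equality, which is why any hashable object can play the role of the static argument in this model.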
Catalyst Autograph now supports updating a single index or a slice of JAX arrays using Python’s array assignment operator syntax. (#769) (#1143)
Using operator assignment syntax in favor of
at...op expressions is now possible for the following operations:
x[i] += y in favor of x.at[i].add(y)
x[i] -= y in favor of x.at[i].add(-y)
x[i] *= y in favor of x.at[i].multiply(y)
x[i] /= y in favor of x.at[i].divide(y)
x[i] **= y in favor of x.at[i].power(y)
@qjit(autograph=True) def f(x): first_dim = x.shape[0] result = jnp.copy(x) for i in range(first_dim): result[i] *= 2 # This is now supported return result
>>> f(jnp.array([1, 2, 3])) Array([2, 4, 6], dtype=int64)
Catalyst now has a standalone compiler tool called
catalyst-clithat quantum-compiles MLIR input files into an object file independent of the Python frontend. (#1208) (#1255)This compiler tool combines three stages of compilation:
quantum-opt: Performs the MLIR-level optimizations and lowers the input dialect to the LLVM dialect.
mlir-translate: Translates the input in the LLVM dialect into LLVM IR.
llc: Performs lower-level optimizations and creates the object file.
catalyst-cliruns all three stages under the hood by default, but it also has the ability to run each stage individually. For example:# Creates both the optimized IR and an object file catalyst-cli input.mlir -o output.o # Only performs MLIR optimizations catalyst-cli --tool=opt input.mlir -o llvm-dialect.mlir # Only lowers LLVM dialect MLIR input to LLVM IR catalyst-cli --tool=translate llvm-dialect.mlir -o llvm-ir.ll # Only performs lower-level optimizations and creates object file catalyst-cli --tool=llc llvm-ir.ll -o output.o
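When scripting the stages, it can be convenient to build the command lines programmatically. The sketch below only assembles argument lists from the flags shown in the shell example above and does not execute anything. The helper `catalyst_cli_command` is hypothetical and not part of Catalyst.

```python
import shlex
from typing import List, Optional


def catalyst_cli_command(
    input_file: str, output_file: str, tool: Optional[str] = None
) -> List[str]:
    """Build a catalyst-cli invocation; tool=None runs all three stages."""
    if tool not in (None, "opt", "translate", "llc"):
        raise ValueError(f"unknown tool: {tool}")
    cmd = ["catalyst-cli"]
    if tool is not None:
        cmd.append(f"--tool={tool}")
    cmd += [input_file, "-o", output_file]
    return cmd


# The three stages chained by hand, each consuming the previous output.
stages = [
    catalyst_cli_command("input.mlir", "llvm-dialect.mlir", tool="opt"),
    catalyst_cli_command("llvm-dialect.mlir", "llvm-ir.ll", tool="translate"),
    catalyst_cli_command("llvm-ir.ll", "output.o", tool="llc"),
]
for cmd in stages:
    print(shlex.join(cmd))
```

Each list could be passed to `subprocess.run` on a machine where Catalyst has been built from source.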
Note that
catalyst-cliis only available when Catalyst is built from source, and is not included when installing Catalyst via pip or from wheels.Experimental integration of the PennyLane capture module is available. It currently only supports quantum gates, without control flow. (#1109)
To trigger the PennyLane pipeline for capturing the program as a Jaxpr, simply set
experimental_capture=True in the qjit decorator.
import pennylane as qml
from catalyst import qjit

dev = qml.device("lightning.qubit", wires=2)

@qjit(experimental_capture=True)
@qml.qnode(dev)
def circuit():
    qml.Hadamard(0)
    qml.CNOT([0, 1])
    return qml.expval(qml.Z(0))
Improvements
Multiple
qml.samplecalls can now be returned from the same program, and can be structured using Python containers. For example, a program can return a dictionary of the formreturn {"first": qml.sample(), "second": qml.sample()}. (#1051)Catalyst now ships with
null.qubit, a Catalyst runtime plugin that mocks out all functions in the QuantumDevice interface. This device is provided as a convenience for testing and benchmarking purposes. (#1179)
dev = qml.device("null.qubit", wires=1)

@qml.qjit
@qml.qnode(dev)
def g(x):
    qml.RX(x, wires=0)
    return qml.probs(wires=[0])
Setting the
seedargument in theqjitdecorator will now seed sampled results, in addition to mid-circuit measurement results. (#1164)dev = qml.device("lightning.qubit", wires=1, shots=10) @qml.qnode(dev) def circuit(x): qml.RX(x, wires=0) m = catalyst.measure(0) if m: qml.Hadamard(0) return qml.sample() @qml.qjit(seed=37, autograph=True) def workflow(x): return jnp.squeeze(jnp.stack([circuit(x) for i in range(4)]))
>>> workflow(1.8) Array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [1, 1, 0, 0, 1, 1, 0, 0, 1, 0], [0, 0, 1, 0, 1, 1, 0, 0, 1, 1], [1, 1, 1, 0, 0, 1, 1, 0, 1, 1]], dtype=int64) >>> workflow(1.8) Array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [1, 1, 0, 0, 1, 1, 0, 0, 1, 0], [0, 0, 1, 0, 1, 1, 0, 0, 1, 1], [1, 1, 1, 0, 0, 1, 1, 0, 1, 1]], dtype=int64)
Note that statistical measurement processes such as
expval,var, andprobsare currently not affected by seeding when shot noise is present.The
cancel_inversesMLIR compilation pass (-remove-chained-self-inverse) now supports cancelling all Hermitian gates, as well as adjoints of arbitrary unitary operations. (#1136) (#1186) (#1211)For the full list of supported Hermitian gates please see the
cancel_inversesdocumentation incatalyst.passes.Support is expanded for backend devices that exclusively return samples in the measurement basis. Pre- and post-processing now allows
qjitto be used on these devices withqml.expval,qml.varandqml.probsmeasurements in addition toqml.sample, using themeasurements_from_samplestransform. (#1106)Scalar tensors are eliminated from control flow operations in the program, and are replaced with bare scalars instead. This improves compilation time and memory usage at runtime by avoiding heap allocations and reducing the amount of instructions. (#1075)
Compiling QNodes to asynchronous functions will no longer print to
stderrin case of an error. (#645)Gradient computations have been made more efficient, as calling gradients twice (with the same gradient parameters) will now only lower to a single MLIR function. (#1172)
qml.sample()andqml.counts()onlightning.qubit/kokkoscan now be seeded withqjit(seed=...). (#1164) (#1248)The compiler pass
-remove-chained-self-inversecan now also cancel adjoints of arbitrary unitary operations (in addition to the named Hermitian gates). (#1186) (#1211)Add Lightning-GPU support to Catalyst docs and update tests. (#1254)
Breaking changes
The
static_sizefield in theAbstractQregclass has been removed. (#1113)This reverts a previous breaking change.
Nesting QNodes within one another now raises an error. (#1176)
The
debug.compile_from_mlirfunction has been removed; please usedebug.replace_irinstead. (#1181)The
compiler.last_compiler_outputfunction has been removed; please usecompiler.get_output_of("last", workspace)instead. (#1208)
Bug fixes
Fixes a bug where the second execution of a function with abstracted axes failed. (#1247)
Fixes a bug in
catalyst.mitigate_with_znethat would lead to incorrectly extrapolated results. (#1213)Fixes a bug preventing the target of
qml.adjointandqml.ctrlcalls from being transformed by AutoGraph. (#1212)Resolves a bug where
mitigate_with_znedoes not work properly with shots and devices supporting only counts and samples (e.g., Qrack). (#1165)Resolves a bug in the
vmapfunction when passing shapeless values to the target. (#1150)Fixes a bug that resulted in an error message when using
qml.condon callables with arguments. (#1151)Fixes a bug that prevented taking the gradient of nested accelerate callbacks. (#1156)
Fixes some small issues with scatter lowering: (#1216) (#1217)
Registers the func dialect as a requirement for running the scatter lowering pass.
Emits error if
%input,%updateand%resultare not of length 1 instead of segfaulting.
Fixes a performance issue with
catalyst.vmap, where the root cause was in the lowering of the scatter operation. (#1214)
Fixes a bug where a single gate passed directly to qml.cond could not be used in qjit, e.g.
qml.cond(x > 1, qml.Hadamard)(wires=0). (#1232)
Internal changes
Removes deprecated PennyLane code across the frontend. (#1168)
Updates Enzyme to version
v0.0.149. (#1142)Adjoint canonicalization is now available in MLIR for
CustomOpandMultiRZOp. It can be used with the--canonicalizepass inquantum-opt. (#1205)Removes the
MemMemCpyOptPassin llvm O2 (applied for Enzyme), which reduces bugs when running gradient-like functions. (#1063)Bufferization of
gradient.ForwardOpandgradient.ReverseOpnow requires three steps:gradient-preprocessing,gradient-bufferize, andgradient-postprocessing.gradient-bufferizehas a new rewrite forgradient.ReturnOp. (#1139)A new MLIR pass
detensorize-scfis added that works in conjunction with the existinglinalg-detensorizepass to detensorize input programs. The IR generated by JAX wraps all values in the program in tensors, including scalars, leading to unnecessary memory allocations for programs compiled to CPU via the MLIR-to-LLVM pipeline. (#1075)Importing Catalyst will now pollute less of JAX’s global variables by using
LoweringParameters. (#1152)
Cached primitive lowerings are now used instead of a custom cache structure. (#1159)
Functions with multiple tapes are now split with a new MLIR pass
--split-multiple-tapes, with one tape per function. The reset routine, which made a measurement between tapes and inserted an X gate if the result was one, is no longer used. (#1017) (#1130)
Prefer creating new
qml.devices.ExecutionConfigobjects over using the globalqml.devices.DefaultExecutionConfig. Doing so helps avoid unexpected bugs and test failures in case theDefaultExecutionConfigobject becomes modified from its original state. (#1137)Remove the old
QJITDeviceAPI. (#1138)The device-capability loading mechanism has been moved into the
QJITDeviceconstructor. (#1141)Several functions related to device capabilities have been refactored. (#1149)
In particular, the signatures of
get_device_capability,catalyst_decompose,catalyst_acceptance, andQJITDevice.__init__have changed, and thepennylane_operation_setfunction has been removed entirely.Catalyst now generates nested modules denoting quantum programs. (#1144)
Similar to MLIR’s
gpu.launch_kernel function, Catalyst now supports a call_function_in_module. This allows Catalyst to call functions in modules and have modules denote a quantum kernel. This will allow for device-specific optimizations and compilation pipelines.
At the moment, nothing uses this feature; it is only the scaffolding needed to support device-specific transformations. As such, the module is inlined to preserve current semantics. In the future, however, we will explore lowering this nested module into other IRs/binary formats and lowering
call_function_in_module to something that can dispatch calls to another runtime/VM.
Contributors
This release contains contributions from (in alphabetical order):
Joey Carter, Spencer Comin, Amintor Dusko, Lillian M.A. Frederiksen, Sengthai Heng, David Ittah, Mehrdad Malekmohammadi, Vincent Michaud-Rioux, Romain Moyard, Erick Ochoa Lopez, Daniel Strano, Raul Torres, Paul Haochen Wang.
Release 0.8.0¶
New features
JAX-compatible functions that run on classical accelerators, such as GPUs, via
catalyst.acceleratenow support autodifferentiation. (#920)For example,
import jax.numpy as jnp
import jax.scipy.linalg
import catalyst
from catalyst import qjit, grad

@qjit
@grad
def f(x):
    expm = catalyst.accelerate(jax.scipy.linalg.expm)
    return jnp.sum(expm(jnp.sin(x)) ** 2)
>>> x = jnp.array([[0.1, 0.2], [0.3, 0.4]]) >>> f(x) Array([[2.80120452, 1.67518663], [1.61605839, 4.42856163]], dtype=float64)
Assertions can now be raised at runtime via the
catalyst.debug_assertfunction. (#925)Python-based exceptions (via
raise) and assertions (viaassert) will always be evaluated at program capture time, before certain runtime information may be available.Use
debug_assertto instead raise assertions at runtime, including assertions that depend on values of dynamic variables.For example,
from catalyst import debug_assert @qjit def f(x): debug_assert(x < 5, "x was greater than 5") return x * 8
>>> f(4) Array(32, dtype=int64) >>> f(6) RuntimeError: x was greater than 5
Assertions can be disabled globally for a qjit-compiled function via the
disable_assertionskeyword argument:@qjit(disable_assertions=True) def g(x): debug_assert(x < 5, "x was greater than 5") return x * 8
>>> g(6) Array(48, dtype=int64)
Mid-circuit measurement results when using
lightning.qubit and lightning.kokkos can now be seeded via the new seed argument of the qjit decorator. (#936)
The seed argument accepts an unsigned 32-bit integer, which is used to initialize the pseudo-random state at the beginning of each execution of the compiled function. Therefore, different
qjitobjects with the same seed (including repeated calls to the sameqjit) will always return the same sequence of mid-circuit measurement results.dev = qml.device("lightning.qubit", wires=1) @qml.qnode(dev) def circuit(x): qml.RX(x, wires=0) m = measure(0) if m: qml.Hadamard(0) return qml.probs() @qjit(seed=37, autograph=True) def workflow(x): return jnp.stack([circuit(x) for i in range(4)])
Repeatedly calling the
workflowfunction above will always result in the same values:>>> workflow(1.8) Array([[1. , 0. ], [1. , 0. ], [1. , 0. ], [0.5, 0.5]], dtype=float64) >>> workflow(1.8) Array([[1. , 0. ], [1. , 0. ], [1. , 0. ], [0.5, 0.5]], dtype=float64)
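The reason repeated calls agree is that the pseudo-random state is re-initialized from the seed at the start of every execution. The sketch below illustrates that idea with Python's `random` module as a stand-in for the device's generator. The function `run_once` is hypothetical and is not related to Catalyst's runtime code.

```python
import random
from typing import List


def run_once(seed: int, num_measurements: int) -> List[int]:
    """Simulate one execution: seed first, then draw measurement outcomes."""
    if not 0 <= seed < 2**32:
        raise ValueError("seed must be an unsigned 32-bit integer")
    rng = random.Random(seed)  # fresh state at the start of every execution
    return [rng.randint(0, 1) for _ in range(num_measurements)]


first = run_once(37, 4)
second = run_once(37, 4)
print(first == second)  # True
```

Because the state is rebuilt from the same seed each time, the sequence of outcomes is identical across executions, not merely drawn from the same distribution.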
Note that setting the seed will not avoid shot-noise stochasticity in terminal measurement statistics such as
sampleorexpval:dev = qml.device("lightning.qubit", wires=1, shots=10) @qml.qnode(dev) def circuit(x): qml.RX(x, wires=0) m = measure(0) if m: qml.Hadamard(0) return qml.expval(qml.PauliZ(0)) @qjit(seed=37, autograph=True) def workflow(x): return jnp.stack([circuit(x) for i in range(4)])
>>> workflow(1.8) Array([1. , 1. , 1. , 0.4], dtype=float64) >>> workflow(1.8) Array([ 1. , 1. , 1. , -0.2], dtype=float64)
Exponential fitting is now a supported method of zero-noise extrapolation when performing error mitigation in Catalyst using
mitigate_with_zne. (#953)This new functionality fits the data from noise-scaled circuits with an exponential function, and returns the zero-noise value:
from pennylane.transforms import exponential_extrapolate from catalyst import mitigate_with_zne dev = qml.device("lightning.qubit", wires=2, shots=100000) @qml.qnode(dev) def circuit(weights): qml.StronglyEntanglingLayers(weights, wires=[0, 1]) return qml.expval(qml.PauliZ(0) @ qml.PauliZ(1)) @qjit def workflow(weights, s): zne_circuit = mitigate_with_zne(circuit, scale_factors=s, extrapolate=exponential_extrapolate) return zne_circuit(weights)
>>> weights = jnp.ones([3, 2, 3]) >>> scale_factors = jnp.array([1, 2, 3]) >>> workflow(weights, scale_factors) Array(-0.19946598, dtype=float64)
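The idea behind exponential extrapolation can be sketched in plain Python. This is a simplified illustration and not the `exponential_extrapolate` implementation: it assumes the model `value = a * exp(b * scale)` with a zero asymptote and values that all share one sign, fits it by linear least squares on the logarithm, and returns `a` as the zero-noise estimate. The function name `exponential_zero_noise` is hypothetical.

```python
import math
from typing import Sequence


def exponential_zero_noise(
    scale_factors: Sequence[float], values: Sequence[float]
) -> float:
    """Fit values ~ a * exp(b * scale) and return a, the value at scale 0."""
    if len(scale_factors) != len(values) or len(values) < 2:
        raise ValueError("need at least two (scale factor, value) pairs")
    sign = 1.0 if values[0] > 0 else -1.0
    if any(v * sign <= 0 for v in values):
        raise ValueError("all values must be non-zero and share one sign")
    xs = list(scale_factors)
    ys = [math.log(v * sign) for v in values]  # log turns the fit into a line
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return sign * math.exp(intercept)


# Synthetic, noise-free data generated from a known exponential.
scale_factors = [1.0, 2.0, 3.0]
values = [-0.2 * math.exp(-0.3 * s) for s in scale_factors]
estimate = exponential_zero_noise(scale_factors, values)
print(round(estimate, 6))  # -0.2
```

The data here is synthetic and only checks that the fit recovers the known prefactor; it is not related to the numerical result shown above.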
A new module is available,
catalyst.passes, which provides Python decorators for enabling and configuring Catalyst MLIR compiler passes. (#911) (#1037)The first pass available is
catalyst.passes.cancel_inverses, which enables the -remove-chained-self-inverse MLIR pass that cancels two neighbouring Hadamard gates.
from catalyst.debug import get_compilation_stage
from catalyst.passes import cancel_inverses

dev = qml.device("lightning.qubit", wires=1)

@qml.qnode(dev)
def circuit(x: float):
    qml.RX(x, wires=0)
    qml.Hadamard(wires=0)
    qml.Hadamard(wires=0)
    return qml.expval(qml.PauliZ(0))

@qjit(keep_intermediate=True)
def workflow(x):
    optimized_circuit = cancel_inverses(circuit)
    return circuit(x), optimized_circuit(x)
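Conceptually, cancelling neighbouring self-inverse gates is a peephole rewrite over the gate sequence. The following plain-Python sketch shows the idea on a list of gates. It is not the MLIR pass itself: the gate representation and the `cancel_inverses_sketch` name are assumptions, and only gates that are direct neighbours in the list are cancelled.

```python
from typing import FrozenSet, List, Tuple

Gate = Tuple[str, Tuple[int, ...]]  # (gate name, wires); parameters omitted


def cancel_inverses_sketch(
    gates: List[Gate], self_inverse: FrozenSet[str] = frozenset({"Hadamard"})
) -> List[Gate]:
    """Remove pairs of identical neighbouring self-inverse gates."""
    kept: List[Gate] = []
    for gate in gates:
        name, _ = gate
        if kept and name in self_inverse and kept[-1] == gate:
            kept.pop()  # the pair multiplies to the identity, drop both
        else:
            kept.append(gate)
    return kept


circuit_gates = [("RX", (0,)), ("Hadamard", (0,)), ("Hadamard", (0,))]
print(cancel_inverses_sketch(circuit_gates))  # [('RX', (0,))]
```

Using a stack means that removing one pair can expose another, so a run of four Hadamard gates on the same wire is removed entirely.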
Catalyst now has debug functions
get_compilation_stageandreplace_irto acquire and recompile the IR from a given pipeline pass for functions compiled withkeep_intermediate=True. (#981)For example, consider the following function:
@qjit(keep_intermediate=True) def f(x): return x**2
>>> f(2.0) 4.0
Here we use
get_compilation_stageto acquire the IR, and then modify%2 = arith.mulf %in, %in_0 : f64to turn the square function into a cubic one viareplace_ir:from catalyst.debug import get_compilation_stage, replace_ir old_ir = get_compilation_stage(f, "HLOLoweringPass") new_ir = old_ir.replace( "%2 = arith.mulf %in, %in_0 : f64\n", "%t = arith.mulf %in, %in_0 : f64\n %2 = arith.mulf %t, %in_0 : f64\n" ) replace_ir(f, "HLOLoweringPass", new_ir)
The recompilation starts after the given checkpoint stage:
>>> f(2.0) 8.0
The two functions can also be used independently of each other. Note that
get_compilation_stagereplaces theprint_compilation_stagefunction; please see the Breaking Changes section for more details.Catalyst now supports generating executables from compiled functions for the native host architecture using
catalyst.debug.compile_executable. (#1003)>>> @qjit ... def f(x): ... y = x * x ... catalyst.debug.print_memref(y) ... return y >>> f(5) MemRef: base@ = 0x31ac22580 rank = 0 offset = 0 sizes = [] strides = [] data = 25 Array(25, dtype=int64)
We can use
compile_executableto compile this function to a binary:>>> from catalyst.debug import compile_executable >>> binary = compile_executable(f, 5) >>> print(binary) /path/to/executable
Executing this function from a shell environment:
$ /path/to/executable MemRef: base@ = 0x64fc9dd5ffc0 rank = 0 offset = 0 sizes = [] strides = [] data = 25
Improvements
Catalyst has been updated to work with JAX v0.4.28 (exact version match required). (#931) (#995)
Catalyst now supports keyword arguments for qjit-compiled functions. (#1004)
>>> @qjit ... @grad ... def f(x, y): ... return x * y >>> f(3., y=2.) Array(2., dtype=float64)
Note that the
static_argnumsargument to theqjitdecorator is not supported when passing argument values as keyword arguments.Support has been added for the
jax.numpy.argsortfunction within qjit-compiled functions. (#901)Autograph now supports in-place array assignments with static slices. (#843)
For example,
@qjit(autograph=True) def f(x, y): y[1:10:2] = x return y
>>> f(jnp.ones(5), jnp.zeros(10)) Array([0., 1., 0., 1., 0., 1., 0., 1., 0., 1.], dtype=float64)
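The slice `1:10:2` selects indices 1, 3, 5, 7 and 9, which is why five values fill every other slot of the ten-element array. The same slice semantics can be checked with ordinary Python lists. The helper `assign_static_slice` below is hypothetical and copies its input to mimic a functional update.

```python
from typing import List


def assign_static_slice(
    target: List[float], start: int, stop: int, step: int, values: List[float]
) -> List[float]:
    """Return a copy of target with target[start:stop:step] set to values."""
    result = list(target)  # copy, so the caller's list is left untouched
    result[start:stop:step] = values
    return result


out = assign_static_slice([0.0] * 10, 1, 10, 2, [1.0] * 5)
print(out)  # [0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
```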
Autograph now works when
qjitis applied to a function decorated withvmap,cond,for_looporwhile_loop. Previously, stacking the autograph-enabled qjit decorator directly on top of other Catalyst decorators would lead to errors. (#835) (#938) (#942)from catalyst import vmap, qjit dev = qml.device("lightning.qubit", wires=2) @qml.qnode(dev) def circuit(x): qml.RX(x, wires=0) return qml.expval(qml.PauliZ(0))
>>> x = jnp.array([0.1, 0.2, 0.3]) >>> qjit(vmap(circuit), autograph=True)(x) Array([0.99500417, 0.98006658, 0.95533649], dtype=float64)
Runtime memory usage and compilation complexity have been reduced by eliminating some scalar tensors from the IR. This has been done by adding a
linalg-detensorizepass at the end of the HLO lowering pipeline. (#1010)Program verification is extended to confirm that the measurements included in QNodes are compatible with the specified device and settings. (#945) (#962)
>>> dev = qml.device("lightning.qubit", wires=2, shots=None) >>> @qjit ... @qml.qnode(dev) ... def circuit(params): ... qml.RX(params[0], wires=0) ... qml.RX(params[1], wires=1) ... return { ... "sample": qml.sample(wires=[0, 1]), ... "expval": qml.expval(qml.PauliZ(0)) ... } >>> circuit([0.1, 0.2]) CompileError: Sample-based measurements like sample(wires=[0, 1]) cannot work with shots=None. Please specify a finite number of shots.
On devices that support it, initial state preparation routines
qml.StatePrepandqml.BasisStateare no longer decomposed when using Catalyst, improving compilation and runtime performance. (#955) (#1047) (#1062) (#1073)Improved type validation and error messaging has been added to both the
catalyst.jvpandcatalyst.vjpfunctions to ensure that the (co)tangent and parameter types are compatible. (#1020) (#1030) (#1031)For example, providing an integer tangent for a function with float64 parameters will result in an error:
>>> f = lambda x: (2 * x, x * x) >>> f_jvp = lambda x: catalyst.jvp(f, params=(x,), tangents=(1,)) >>> qjit(f_jvp)(0.5) TypeError: function params and tangents arguments to catalyst.jvp do not match; dtypes must be equal. Got function params dtype float64 and so expected tangent dtype float64, but got tangent dtype int64 instead.
Ensuring that the types match will resolve the error:
>>> f_jvp = lambda x: catalyst.jvp(f, params=(x,), tangents=(1.0,)) >>> qjit(f_jvp)(0.5) ((Array(1., dtype=float64), Array(0.25, dtype=float64)), (Array(2., dtype=float64), Array(1., dtype=float64)))
Add a script for setting up a Frontend-Only Development Environment that does not require compilation, as it uses the TestPyPI wheel shared libraries. (#1022)
Breaking changes
The
argnumkeyword argument in thegrad,jacobian,value_and_grad,vjp, andjvpfunctions has been renamed toargnumsto better match JAX. (#1036)Return values of qjit-compiled functions that were previously
numpy.ndarrayare now of typejax.Arrayinstead. This should have minimal impact, but code that depends on the output of qjit-compiled function being NumPy arrays will need to be updated. (#895)The
print_compilation_stagefunction has been renamedget_compilation_stage. It no longer prints the IR to the standard output, instead it simply returns the IR as a string. (#981)>>> @qjit(keep_intermediate=True) ... def func(x: float): ... return x >>> print(get_compilation_stage(func, "HLOLoweringPass")) module @func { func.func public @jit_func(%arg0: tensor<f64>) -> tensor<f64> attributes {llvm.emit_c_interface} { return %arg0 : tensor<f64> } func.func @setup() { quantum.init return } func.func @teardown() { quantum.finalize return } }
Support for TOML files in Schema 1 has been disabled. (#960)
The
mitigate_with_znefunction no longer accepts adegreeparameter for polynomial fitting and instead accepts a callable to perform extrapolation. Any qjit-compatible extrapolation function is valid. Keyword arguments can be passed to this function using theextrapolate_kwargskeyword argument inmitigate_with_zne. (#806)The QuantumDevice API has now added the functions
SetStateandSetBasisStatefor simulators that may benefit from instructions that directly set the state. Implementing these methods is optional, and device support can be indicated via theinitial_state_prepflag in the TOML configuration file. (#955)
Bug fixes
Catalyst no longer silently converts complex parameters to floats where floats are expected; instead, an error is raised. (#1008)
Fixes a bug where dynamic one-shot did not work when no mid-circuit measurements are present and when the return type is an iterable. (#1060)
Fixes a bug in finding the quantum function jaxpr when using quantum primitives with dynamic one-shot. (#1041)
Fixes a bug where the number of shots of a LegacyDevice was not correctly extracted when using the LegacyDeviceFacade. (#1035)
Catalyst no longer generates a
QubitUnitary operation during decomposition if a device doesn’t support it. Instead, the operation that would lead to a QubitUnitary is either decomposed or raises an error. (#1002)
Catalyst now correctly raises an error when the user uses
qml.density_matrix. (#1118)
Catalyst now preserves output PyTrees in QNodes executed with
mcm_method="one-shot". (#957)For example:
dev = qml.device("lightning.qubit", wires=1, shots=20) @qml.qjit @qml.qnode(dev, mcm_method="one-shot") def func(x): qml.RX(x, wires=0) m_0 = catalyst.measure(0, postselect=1) return {"hi": qml.expval(qml.Z(0))}
>>> func(0.9) {'hi': Array(-1., dtype=float64)}
Fixes a bug where scatter did not work correctly with list indices. (#982)
A = jnp.ones([3, 3]) * 2 def update(A): A = A.at[[0, 1], :].set(jnp.ones([2, 3]), indices_are_sorted=True, unique_indices=True) return A
>>> print(update(A))
[[1. 1. 1.]
 [1. 1. 1.]
 [2. 2. 2.]]
Static arguments can now be passed through a QNode when specified with the
static_argnumskeyword argument. (#932)dev = qml.device("lightning.qubit", wires=1) @qjit(static_argnums=(1,)) @qml.qnode(dev) def circuit(x, c): print("Inside QNode:", c) qml.RY(c, 0) qml.RX(x, 0) return qml.expval(qml.PauliZ(0))
When executing the qjit-compiled function above,
cwill be a static variable with value known at compile time:>>> circuit(0.5, 0.5) "Inside QNode: 0.5" Array(0.77015115, dtype=float64)
Changing the value of
cwill result in re-compilation:>>> circuit(0.5, 0.8) "Inside QNode: 0.8" Array(0.61141766, dtype=float64)
Fixes a bug where Catalyst would fail to apply quantum transforms and preserve QNode configuration settings when Autograph was enabled. (#900)
pure_callbackwill no longer cause a crash in the compiler if the return type signature is declared incorrectly and the callback function is differentiated. (#916)Instead, this is caught early and a useful error message returned:
@catalyst.pure_callback def callback_fn(x) -> jax.ShapeDtypeStruct((2,), jnp.float32): return np.array([np.sin(x), np.cos(x)]) callback_fn.fwd(lambda x: (callback_fn(x), x)) callback_fn.bwd(lambda x, dy: (jnp.array([jnp.cos(x), -jnp.sin(x)]) @ dy,)) @qjit @catalyst.grad def f(x): return jnp.sum(callback_fn(jnp.sin(x)))
>>> f(0.54) TypeError: Callback callback_fn expected type ShapedArray(float32[2]) but observed ShapedArray(float64[2]) in its return value
AutoGraph will now correctly convert conditional statements where the condition is a non-boolean static value. (#944)
Internally, statically known non-boolean predicates (such as
1) will be converted tobool:@qml.qjit(autograph=True) def workflow(x): n = 1 if n: y = x ** 2 else: y = x return y
value_and_grad will now correctly differentiate functions with multiple arguments. Previously, attempting to differentiate functions with multiple arguments, or pass the argnums argument, would result in an error. (#1034)
@qjit
def g(x, y, z):
    def f(x, y, z):
        return x * y ** 2 * jnp.sin(z)
    return catalyst.value_and_grad(f, argnums=[1, 2])(x, y, z)
>>> g(0.4, 0.2, 0.6)
(Array(0.00903428, dtype=float64), (Array(0.0903428, dtype=float64), Array(0.01320537, dtype=float64)))
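The values returned above can be checked by hand, because f has simple analytic partial derivatives. The following plain-Python sketch is independent of Catalyst, and the helper name f_value_and_partials is hypothetical. It evaluates the function together with its derivatives with respect to y and z, the two arguments selected by argnums=[1, 2]:

```python
import math

def f_value_and_partials(x, y, z):
    """Analytic value of f = x * y**2 * sin(z) and its partials w.r.t. y and z."""
    value = x * y ** 2 * math.sin(z)
    df_dy = 2 * x * y * math.sin(z)   # derivative with respect to argument 1
    df_dz = x * y ** 2 * math.cos(z)  # derivative with respect to argument 2
    return value, (df_dy, df_dz)

value, (df_dy, df_dz) = f_value_and_partials(0.4, 0.2, 0.6)
print(value, df_dy, df_dz)
```

The printed numbers should agree with the qjit-compiled result up to floating-point rounding.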
A bug is fixed in
catalyst.debug.get_cmainto support multi-dimensional arrays as function inputs. (#1003)Bug fixed when parameter annotations return strings. (#1078)
In certain cases, jax.scipy.linalg.expm may return incorrect numerical results when used within a qjit-compiled function. A warning will now be raised when jax.scipy.linalg.expm is used, to inform users of this issue.
In the meantime, we strongly recommend using the catalyst.accelerate function within qjit-compiled functions to call jax.scipy.linalg.expm directly.
@qjit
def f(A):
    B = catalyst.accelerate(jax.scipy.linalg.expm)(A)
    return B
Note that this PR doesn’t actually fix the aforementioned numerical errors, and just raises a warning. (#1082)
Documentation
A page has been added to the documentation, listing devices that are Catalyst compatible. (#966)
Internal changes
Adds
catalyst.from_plxpr.from_plxprfor converting a PennyLane variant jaxpr into a Catalyst variant jaxpr. (#837)Catalyst now uses Enzyme
v0.0.130(#898)When memrefs have no identity layout, memrefs copy operations are replaced by the linalg copy operation. It does not use a runtime function but instead lowers to scf and standard dialects. It also ensures a better compatibility with Enzyme. (#917)
LLVM’s O2 optimization pipeline and Enzyme’s AD transformations are now only run in the presence of gradients, significantly improving compilation times for programs without derivatives. Similarly, LLVM’s coroutine lowering passes only run when
async_qnodesis enabled in the QJIT decorator. (#968)The function
inactive_callbackwas renamed__catalyst_inactive_callback. (#899)The function
__catalyst_inactive_callbackhas the nofree attribute. (#898)catalyst.dynamic_one_shotusespostselect_mode="pad-invalid-samples"in favour ofinterface="jax"when processing results. (#956)Callbacks now have nicer identifiers in their MLIR representation. The identifiers include the name of the Python function being called back into. (#919)
Fix tracing of SProd operations to bring Catalyst in line with PennyLane v0.38. (#935)
After some changes in PennyLane, SProd.terms() returns the terms as leaves instead of a tree. This means that we need to manually trace each term and finally multiply it with the coefficients to create a Hamiltonian.
The function mitigate_with_zne accommodates a folding input argument for specifying the type of circuit folding technique to be used by the error-mitigation routine (only the global value is supported to date). (#946)
Catalyst’s implementation of the Lightning Kokkos plugin has been removed in favor of Lightning’s one. (#974)
The
validate_device_capabilitiesfunction is considered obsolete. Hence, it has been removed. (#1045)
Contributors
This release contains contributions from (in alphabetical order):
Joey Carter, Alessandro Cosentino, Lillian M. A. Frederiksen, David Ittah, Josh Izaac, Christina Lee, Kunwar Maheep Singh, Mehrdad Malekmohammadi, Romain Moyard, Erick Ochoa Lopez, Mudit Pandey, Nate Stemen, Raul Torres, Tzung-Han Juang, Paul Haochen Wang.
Release 0.7.0¶
New features
Add support for accelerating classical processing via JAX with
catalyst.accelerate. (#805)Classical code that can be just-in-time compiled with JAX can now be seamlessly executed on GPUs or other accelerators with
catalyst.accelerate, right inside of QJIT-compiled functions.@accelerate(dev=jax.devices("gpu")[0]) def classical_fn(x): return jnp.sin(x) ** 2 @qjit def hybrid_fn(x): y = classical_fn(jnp.sqrt(x)) # will be executed on a GPU return jnp.cos(y)
Available devices can be retrieved via
jax.devices(). If not provided, the default value ofjax.devices()[0]as determined by JAX will be used.Catalyst callback functions, such as
pure_callback,debug.callback, anddebug.print, now all support auto-differentiation. (#706) (#782) (#822) (#834) (#882) (#907)When using callbacks that do not return any values, such as
catalyst.debug.callbackandcatalyst.debug.print, these functions are marked as ‘inactive’ and do not contribute to or affect the derivative of the function:import logging log = logging.getLogger(__name__) log.setLevel(logging.INFO) @qml.qjit @catalyst.grad def f(x): y = jnp.cos(x) catalyst.debug.print("Debug print: y = {0:.4f}", y) catalyst.debug.callback(lambda _: log.info("Value of y = %s", _))(y) return y ** 2
>>> f(0.54) INFO:__main__:Value of y = 0.8577086813638242 Debug print: y = 0.8577 array(-0.88195781)
Callbacks that do return values and may affect the qjit-compiled function’s computation, such as pure_callback, may have custom derivatives manually registered with the Catalyst compiler in order to support differentiation.
This can be done via the pure_callback.fwd and pure_callback.bwd methods, to specify how the forwards and backwards pass (the vector-Jacobian product) of the callback should be computed:
@catalyst.pure_callback
def callback_fn(x) -> float:
    return np.sin(x[0]) * x[1]

@callback_fn.fwd
def callback_fn_fwd(x):
    # returns the evaluated function as well as residual
    # values that may be useful for the backwards pass
    return callback_fn(x), x

@callback_fn.bwd
def callback_fn_vjp(res, dy):
    # Accepts residuals from the forward pass, as well
    # as (one or more) cotangent vectors dy, and returns
    # a tuple of VJPs corresponding to each input parameter.

    def vjp(x, dy) -> (jax.ShapeDtypeStruct((2,), jnp.float64),):
        return (np.array([np.cos(x[0]) * dy * x[1], np.sin(x[0]) * dy]),)

    # The VJP function can also be a pure callback
    return catalyst.pure_callback(vjp)(res, dy)

@qml.qjit
@catalyst.grad
def f(x):
    y = jnp.array([jnp.cos(x[0]), x[1]])
    return jnp.sin(callback_fn(y))
>>> x = jnp.array([0.1, 0.2]) >>> f(x) array([-0.01071923, 0.82698717])
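To make the roles of the forward pass, the residuals, and the vector-Jacobian product more concrete, the same gradient can be reproduced with the chain rule in plain Python. This is only an illustrative sketch: the names callback_fwd, callback_bwd, and grad_f are hypothetical, and nothing here uses Catalyst or JAX.

```python
import math

def callback_fwd(y):
    # Forward pass: return the value and the residuals kept for the backward pass.
    return math.sin(y[0]) * y[1], y

def callback_bwd(res, dy):
    # Backward pass: VJP of sin(y[0]) * y[1] with respect to each entry of y.
    return [math.cos(res[0]) * dy * res[1], math.sin(res[0]) * dy]

def grad_f(x):
    # f(x) = sin(callback(y)) with y = [cos(x[0]), x[1]]
    y = [math.cos(x[0]), x[1]]
    out, res = callback_fwd(y)
    cotangent = math.cos(out)  # derivative of the outer sin(out)
    dy0, dy1 = callback_bwd(res, cotangent)
    # Chain rule through y[0] = cos(x[0]) and y[1] = x[1]
    return [dy0 * -math.sin(x[0]), dy1]

g = grad_f([0.1, 0.2])
print(g)
```

The two entries of g should match the array returned by f(x) above up to rounding.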
Catalyst now supports the ‘dynamic one shot’ method for simulating circuits with mid-circuit measurements, which compared to other methods, may be advantageous for circuits with many mid-circuit measurements executed for few shots. (#5617) (#798)
The dynamic one shot method evaluates dynamic circuits by executing them one shot at a time via
catalyst.vmap, sampling a dynamic execution path for each shot. This method only works for a QNode executing with finite shots, and it requires the device to support mid-circuit measurements natively.This new mode can be specified by using the
mcm_methodargument of the QNode:dev = qml.device("lightning.qubit", wires=5, shots=20) @qml.qjit(autograph=True) @qml.qnode(dev, mcm_method="one-shot") def circuit(x): for i in range(10): qml.RX(x, 0) m = catalyst.measure(0) if m: qml.RY(x ** 2, 1) x = jnp.sin(x) return qml.expval(qml.Z(1))
Catalyst’s existing method for simulating mid-circuit measurements remains available via mcm_method="single-branch-statistics".
When using mcm_method="one-shot", the postselect_mode keyword argument can also be used to specify whether the returned result should include shots-number of postselected measurements ("fill-shots"), or whether results should include all results, including invalid postselections ("hw-like"):
@qml.qjit
@qml.qnode(dev, mcm_method="one-shot", postselect_mode="hw-like")
def func(x):
    qml.RX(x, wires=0)
    m_0 = catalyst.measure(0, postselect=1)
    return qml.sample(wires=0)
>>> res = func(0.9) >>> res array([-2147483648, -2147483648, 1, -2147483648, -2147483648, -2147483648, -2147483648, 1, -2147483648, -2147483648, -2147483648, -2147483648, 1, -2147483648, -2147483648, -2147483648, -2147483648, -2147483648, -2147483648, -2147483648]) >>> jnp.delete(res, jnp.where(res == np.iinfo(np.int32).min)[0]) Array([1, 1, 1], dtype=int64)
Note that invalid shots will not be discarded, but will be replaced by
np.iinfo(np.int32).min. They will not be used for processing final results (like expectation values), but they will appear in the output of QNodes that return samples directly.For more details, see the dynamic quantum circuit documentation.
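As a rough illustration of how the placeholder value can be handled downstream, the sketch below filters invalid shots out of a plain Python list. It assumes only what is stated above, namely that invalid shots carry the minimum 32-bit integer; the helper drop_invalid_shots is hypothetical and not part of Catalyst.

```python
INT32_MIN = -2 ** 31  # same value as np.iinfo(np.int32).min

def drop_invalid_shots(samples):
    """Keep only the shots that are not marked with the invalid-shot placeholder."""
    return [s for s in samples if s != INT32_MIN]

# Mirror the 20-shot result shown above: three valid shots, the rest invalid.
samples = [INT32_MIN] * 20
for idx in (2, 7, 12):
    samples[idx] = 1

valid = drop_invalid_shots(samples)
print(valid)
```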
Catalyst now has support for returning
qml.sample(m)wheremis the result of a mid-circuit measurement. (#731)When used with
mcm_method="one-shot", this will return an array with one measurement result for each shot:dev = qml.device("lightning.qubit", wires=2, shots=10) @qml.qjit @qml.qnode(dev, mcm_method="one-shot") def func(x): qml.RX(x, wires=0) m = catalyst.measure(0) qml.RX(x ** 2, wires=0) return qml.sample(m), qml.expval(qml.PauliZ(0))
>>> func(0.9) (array([0, 1, 0, 0, 0, 0, 1, 0, 0, 0]), array(0.4))
In
mcm_method="single-branch-statistics"mode, it will be equivalent to returningmdirectly from the quantum function — that is, it will return a single boolean corresponding to the measurement in the branch selected:@qml.qjit @qml.qnode(dev, mcm_method="single-branch-statistics") def func(x): qml.RX(x, wires=0) m = catalyst.measure(0) qml.RX(x ** 2, wires=0) return qml.sample(m), qml.expval(qml.PauliZ(0))
>>> func(0.9) (array(False), array(0.8))
A new function,
catalyst.value_and_grad, returns both the result of a function and its gradient with a single forward and backwards pass. (#804) (#859)This can be more efficient, and reduce overall quantum executions, compared to separately executing the function and then computing its gradient.
For example:
dev = qml.device("lightning.qubit", wires=3) @qml.qnode(dev) def circuit(x): qml.RX(x, wires=0) qml.CNOT(wires=[0, 1]) qml.RX(x, wires=2) return qml.probs() @qml.qjit @catalyst.value_and_grad def cost(x): return jnp.sum(jnp.cos(circuit(x)))
>>> cost(0.543) (array(7.64695856), array(0.33413963))
Autograph now supports single index JAX array assignments (#717)
When using Autograph, syntax of the form x[i] = y, where i is a single integer, will now be automatically converted to the JAX equivalent of x = x.at[i].set(y):
@qml.qjit(autograph=True)
def f(array):
    result = jnp.ones(array.shape, dtype=array.dtype)
    for i, x in enumerate(array):
        result[i] = result[i] + x * 3
    return result
>>> f(jnp.array([-0.1, 0.12, 0.43, 0.54])) array([0.7 , 1.36, 2.29, 2.62])
Catalyst now supports dynamically-shaped arrays in control-flow primitives. Arrays with dynamic shapes can now be used with for_loop, while_loop, and cond primitives. (#775) (#777) (#830)
@qjit
def f(shape):
    a = jnp.ones([shape], dtype=float)

    @for_loop(0, 10, 2)
    def loop(i, a):
        return a + i

    return loop(a)
>>> f(3) array([21., 21., 21.])
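To see where the value 21 comes from, the loop can be mirrored in plain Python. The sketch below is not Catalyst code and the name reference_loop is hypothetical. It assumes that for_loop(0, 10, 2) visits the same indices as range(0, 10, 2), so each entry starts at 1 and accumulates 0 + 2 + 4 + 6 + 8:

```python
def reference_loop(shape):
    """Plain-Python counterpart of the loop above, using lists instead of arrays."""
    a = [1.0] * shape
    for i in range(0, 10, 2):
        a = [value + i for value in a]
    return a

result = reference_loop(3)
print(result)
```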
Support has been added for disabling Autograph for specific functions. (#705) (#710)
The decorator
catalyst.disable_autographallows one to disable Autograph from auto-converting specific external functions when called within a qjit-compiled function withautograph=True:def approximate_e(n): num = 1. fac = 1. for i in range(1, n + 1): fac *= i num += 1. / fac return num @qml.qjit(autograph=True) def g(x: float, N: int): for i in range(N): x = x + catalyst.disable_autograph(approximate_e)(10) / x ** i return x
>>> g(0.1, 10) array(4.02997319)
Note that for Autograph to be disabled, the decorated function must be defined outside the qjit-compiled function. If it is defined within the qjit-compiled function, it will continue to be converted with Autograph.
In addition, Autograph can also be disabled for all externally defined functions within a qjit-compiled function via the context manager syntax:
@qml.qjit(autograph=True) def g(x: float, N: int): for i in range(N): with catalyst.disable_autograph: x = x + approximate_e(10) / x ** i return x
Support for including a list of (sub)modules to be allowlisted for autograph conversion. (#725)
Although library code is not meant to be targeted by Autograph conversion, it sometimes makes sense to enable it for specific submodules that might benefit from such conversion:
@qjit(autograph=True, autograph_include=["excluded_module.submodule"])
def f(x):
    return excluded_module.submodule.func(x)
For example, this might be useful when importing functionality from PennyLane (such as a transform or decomposition) whose control flow you would like Autograph to capture and convert.
Controlled operations that do not have a matrix representation defined are now supported via applying PennyLane’s decomposition. (#831)
@qjit @qml.qnode(qml.device("lightning.qubit", wires=2)) def circuit(): qml.Hadamard(0) qml.ctrl(qml.TrotterProduct(H, time=2.4, order=2), control=[1]) return qml.state()
Catalyst is now officially supported on Linux aarch64, with pre-built binaries available on PyPI; simply pip install pennylane-catalyst on Linux aarch64 systems. (#767)
Improvements
Validation is now performed for observables and operations to ensure that provided circuits are compatible with the devices for execution. (#626) (#783)
dev = qml.device("lightning.qubit", wires=2, shots=10000) @qjit @qml.qnode(dev) def circuit(x): qml.Hadamard(wires=0) qml.CRX(x, wires=[0, 1]) return qml.var(qml.PauliZ(1))
>>> circuit(0.43) DifferentiableCompileError: Variance returns are forbidden in gradients
Catalyst’s adjoint and ctrl methods are now fully compatible with the PennyLane equivalent when applied to a single Operator. This should lead to improved compatibility with PennyLane library code, as well as when reusing quantum functions with both Catalyst and PennyLane. (#768) (#771) (#802)
Controlled operations defined via specialized classes (like
ToffoliorControlledQubitUnitary) are now implemented as controlled versions of their base operation if the device supports it. In particular,MultiControlledXis no longer executed as aQubitUnitarywith Lightning. (#792)The Catalyst frontend now supports Python logging through PennyLane’s
qml.loggingmodule. For more details, please see the logging documentation. (#660)Catalyst now performs a stricter validation of the wire requirements for devices. In particular, only integer, continuous wire labels starting at 0 are allowed. (#784)
Catalyst no longer disallows quantum circuits with 0 qubits. (#784)
Added support for IsingZZ as a native gate in Catalyst. Previously, the IsingZZ gate would be decomposed into CNOT and RZ gates, even if a device supported it. (#730)
All decorators in Catalyst, including vmap, qjit, mitigate_with_zne, as well as gradient decorators grad, jacobian, jvp, and vjp, can now be used both with and without keyword arguments as a decorator without the need for functools.partial: (#758) (#761) (#762) (#763)
@qjit
@grad(method="fd")
def fn1(x):
    return x ** 2

@qjit(autograph=True)
@grad
def fn2(x):
    return jnp.sin(x)
>>> fn1(0.43) array(0.8600001) >>> fn2(0.12) array(0.99280864)
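For readers curious how a decorator can accept both forms, one common Python pattern is sketched below. This is not Catalyst's implementation: fd_grad is a hypothetical forward finite-difference decorator, and the step size h is an assumed parameter chosen for illustration.

```python
import math

def fd_grad(fn=None, *, h=1e-7):
    """Hypothetical finite-difference gradient decorator usable with or without keywords."""
    if fn is None:
        # Used as @fd_grad(h=...): return a decorator that still expects the function.
        return lambda f: fd_grad(f, h=h)

    def wrapper(x):
        # Forward finite difference approximation of the derivative at x.
        return (fn(x + h) - fn(x)) / h

    return wrapper

@fd_grad(h=1e-6)  # with keyword arguments
def fn1(x):
    return x ** 2

@fd_grad  # without arguments
def fn2(x):
    return math.sin(x)

print(fn1(0.43), fn2(0.12))
```

The key design choice is making the function argument optional and every option keyword-only, so the two call styles can be told apart unambiguously.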
The built-in instrumentation with detailed output will no longer report the cumulative time for MLIR pipelines, since the cumulative time was being reported as just another step alongside individual timings for each pipeline. (#772)
Raise a better error message when no shots are specified and qml.sample or qml.counts is used. (#786)
The finite difference method for differentiation is now always allowed, even on functions with mid-circuit measurements, callbacks without custom derivatives, or other operations that cannot be differentiated via traditional autodiff. (#789)
A
non_commuting_observablesflag has been added to the device TOML schema, indicating whether or not the device supports measuring non-commuting observables. Iffalse, non-commuting measurements will be split into multiple executions. (#821)The underlying PennyLane
Operationobjects forcond,for_loop, andwhile_loopcan now be accessed directly viabody_function.operation. (#711)This can be beneficial when, among other things, writing transforms without using the queuing mechanism:
@qml.transform def my_quantum_transform(tape): ops = tape.operations.copy() @for_loop(0, 4, 1) def f(i, sum): qml.Hadamard(0) return sum+1 res = f(0) ops.append(f.operation) # This is now supported! def post_processing_fn(results): return results modified_tape = qml.tape.QuantumTape(ops, tape.measurements) print(res) print(modified_tape.operations) return [modified_tape], post_processing_fn @qml.qjit @my_quantum_transform @qml.qnode(qml.device("lightning.qubit", wires=2)) def main(): qml.Hadamard(0) return qml.probs()
>>> main() Traced<ShapedArray(int64[], weak_type=True)>with<DynamicJaxprTrace(level=2/1)> [Hadamard(wires=[0]), ForLoop(tapes=[[Hadamard(wires=[0])]])] (array([0.5, 0. , 0.5, 0. ]),)
Breaking changes
Binary distributions for Linux are now based on manylinux_2_28 instead of manylinux_2014. As a result, Catalyst will only be compatible with systems with glibc versions 2.28 and above (e.g., Ubuntu 20.04 and above). (#663)
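One way to check whether a given system meets this requirement is to query the C library version from Python's standard library. The sketch below is only a convenience check, not part of Catalyst; the helper name meets_glibc_requirement is hypothetical, and platform.libc_ver() may return empty strings on systems that do not use glibc.

```python
import platform

def meets_glibc_requirement(version, minimum=(2, 28)):
    """Return True if a 'major.minor' glibc version string is at least `minimum`."""
    major_minor = tuple(int(part) for part in version.split(".")[:2])
    return major_minor >= minimum

lib, version = platform.libc_ver()
if lib == "glibc" and version:
    print(f"glibc {version} detected, compatible: {meets_glibc_requirement(version)}")
else:
    print("glibc was not detected on this system")
```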
Bug fixes
Functions that have been annotated with return type annotations will now correctly compile with
@qjit. (#751)An issue in the Lightning backend for the Catalyst runtime has been fixed that would only compute approximate probabilities when implementing mid-circuit measurements. As a result, low shot numbers would lead to unexpected behaviours or projections on zero probability states. Probabilities for mid-circuit measurements are now always computed analytically. (#801)
The Catalyst runtime now raises an error if a qubit is accessed out of bounds from the allocated register. (#784)
jax.scipy.linalg.expmis now supported within qjit-compiled functions. (#733) (#752)This required correctly linking openblas routines necessary for
jax.scipy.linalg.expm. In this bug fix, four openblas routines were newly linked and are now discoverable bystablehlo.custom_call@<blas_routine>. They areblas_dtrsm,blas_ztrsm,lapack_dgetrf,lapack_zgetrf.Fixes a bug where QNodes that contained
QubitUnitarywith a complex matrix would error during gradient computation. (#778)Callbacks can now return types which can be flattened and unflattened. (#812)
catalyst.qjitandcatalyst.gradnow work correctly on functions that have been wrapped withfunctools.partial. (#820)
Internal changes
Catalyst uses the
collapsemethod of Lightning simulators inMeasureto select a state vector branch and normalize. (#801)Measurement process primitives for Catalyst’s JAXPR representation now have a standardized call signature so that
shotsandshapecan both be provided as keyword arguments. (#790)The
QCtrlclass in Catalyst has been renamed toHybridCtrl, indicating its capability to contain a nested scope of both quantum and classical operations. Usingctrlon a single operation will now directly dispatch to the equivalent PennyLane class. (#771)The
Adjointclass in Catalyst has been renamed toHybridAdjoint, indicating its capability to contain a nested scope of both quantum and classical operations. Usingadjointon a single operation will now directly dispatch to the equivalent PennyLane class. (#768) (#802)Add support to use a locally cloned PennyLane Lightning repository with the runtime. (#732)
The
qjit_device.pyandpreprocessing.pymodules have been refactored into the sub-packagecatalyst.device. (#721)The
ag_autograph.pyandautograph.pymodules have been refactored into the sub-packagecatalyst.autograph. (#722)Callback refactoring. This refactoring creates the classes
FlatCallable and MemrefCallable. (#742)
The FlatCallable class is a Callable that is initialized by providing some parameters and kwparameters that match the expected shapes that will be received at the callsite. Instead of taking shaped *args and **kwargs, it receives flattened arguments. The flattened arguments are unflattened with the shapes with which the function was initialized. The FlatCallable return values will always be flattened before returning to the caller.
The MemrefCallable is a subclass of FlatCallable. It takes a result type parameter during initialization that corresponds to the expected return type. This class is expected to be called only from the Catalyst runtime. It expects all arguments to be void* to memrefs. These void* are cast to MemrefStructDescriptors using ctypes, numpy arrays, and finally jax arrays. These flat jax arrays are then sent to the FlatCallable. MemrefCallable is again expected to be called only from within the Catalyst runtime, and the return values match those expected by the Catalyst runtime.
This separation allows for a better separation of concerns, provides a nicer interface and allows for multiple MemrefCallable to be defined for a single callback, which is necessary for custom gradient of pure_callbacks.
A new catalyst::gradient::GradientOpInterface is available when querying the gradient method in the MLIR C++ API. (#800)
catalyst::gradient::GradOp, ValueAndGradOp, JVPOp, and VJPOp now inherit traits in this new GradientOpInterface. The supported attributes are now getMethod(), getCallee(), getDiffArgIndices(), getDiffArgIndicesAttr(), getFiniteDiffParam(), and getFiniteDiffParamAttr().
There are operations that could potentially be used as
GradOp,ValueAndGradOp,JVPOporVJPOp. When trying to get the gradient method, instead of doingauto gradOp = dyn_cast<GradOp>(op); auto jvpOp = dyn_cast<JVPOp>(op); auto vjpOp = dyn_cast<VJPOp>(op); llvm::StringRef MethodName; if (gradOp) MethodName = gradOp.getMethod(); else if (jvpOp) MethodName = jvpOp.getMethod(); else if (vjpOp) MethodName = vjpOp.getMethod();
to identify which op it actually is and protect against segfaults (calling
nullptr.getMethod()), in the new interface we just doauto gradOpInterface = cast<GradientOpInterface>(op); llvm::StringRef MethodName = gradOpInterface.getMethod();
Another advantage is that any concrete gradient operation object can behave like a
GradientOpInterface:GradOp op; // or ValueAndGradOp op, ... auto foo = [](GradientOpInterface op){ llvm::errs() << op.getCallee(); }; foo(op); // this works!
Finally, concrete op specific methods can still be called by “reinterpret”-casting the interface back to a concrete op (provided the concrete op type is correct):
auto foo = [](GradientOpInterface op){ size_t numGradients = cast<ValueAndGradOp>(&op)->getGradients().size(); }; ValueAndGradOp op; foo(op); // this works!
Contributors
This release contains contributions from (in alphabetical order):
Ali Asadi, Lillian M.A. Frederiksen, David Ittah, Christina Lee, Erick Ochoa, Haochen Paul Wang, Lee James O’Riordan, Mehrdad Malekmohammadi, Vincent Michaud-Rioux, Mudit Pandey, Raul Torres, Sergei Mironov, Tzung-Han Juang.
Release 0.6.0¶
New features
Catalyst now supports externally hosted callbacks with parameters and return values within qjit-compiled code. This provides the ability to insert native Python code into any qjit-compiled function, allowing for the capability to include subroutines that do not yet support qjit-compilation and enhancing the debugging experience. (#540) (#596) (#610) (#650) (#649) (#661) (#686) (#689)
The following two callback functions are available:
catalyst.pure_callbacksupports callbacks of pure functions. That is, functions with no side-effects that accept parameters and return values. However, the return type and shape of the function must be known in advance, and is provided as a type signature.@pure_callback def callback_fn(x) -> float: # here we call non-JAX compatible code, such # as standard NumPy return np.sin(x) @qjit def fn(x): return jnp.cos(callback_fn(x ** 2))
>>> fn(0.654) array(0.9151995)
catalyst.debug.callbacksupports callbacks of functions with no return values. This makes it an easy entry point for debugging, for example via printing or logging at runtime.@catalyst.debug.callback def callback_fn(y): print("Value of y =", y) @qjit def fn(x): y = jnp.sin(x) callback_fn(y) return y ** 2
>>> fn(0.54) Value of y = 0.5141359916531132 array(0.26433582) >>> fn(1.52) Value of y = 0.998710143975583 array(0.99742195)
Note that callbacks do not currently support differentiation, and cannot be used inside functions that
catalyst.gradis applied to.More flexible runtime printing through support for format strings. (#621)
The catalyst.debug.print function has been updated to support Python-like format strings:
@qjit
def cir(a, b, c):
    debug.print("{c} {b} {a}", a=a, b=b, c=c)
>>> cir(1, 2, 3) 3 2 1
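For comparison, the same named-placeholder template behaves as follows with Python's built-in str.format. This is only an analogy, on the assumption that the named placeholders shown above follow the usual Python conventions:

```python
template = "{c} {b} {a}"

# Named placeholders are filled by keyword, regardless of argument order.
formatted = template.format(a=1, b=2, c=3)
print(formatted)
```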
Note that previous functionality of the print function to print out memory reference information of variables has been moved to
catalyst.debug.print_memref.Catalyst now supports QNodes that execute on Oxford Quantum Circuits (OQC) superconducting hardware, via OQC Cloud. (#578) (#579) (#691)
To use OQC Cloud with Catalyst, simply ensure your credentials are set as environment variables, and load the oqc.cloud device to be used within your qjit-compiled workflows.
import os
os.environ["OQC_EMAIL"] = "your_email"
os.environ["OQC_PASSWORD"] = "your_password"
os.environ["OQC_URL"] = "oqc_url"

dev = qml.device("oqc.cloud", backend="lucy", shots=2012, wires=2)

@qjit
@qml.qnode(dev)
def circuit(a: float):
    qml.Hadamard(0)
    qml.CNOT(wires=[0, 1])
    qml.RX(a, wires=0)  # the rotation angle is the QNode argument
    return qml.counts(wires=[0, 1])

print(circuit(0.2))
Catalyst now ships with an instrumentation feature that lets users explore which steps are run during compilation and execution, and how long each takes. (#528) (#597)
Instrumentation can be enabled from the frontend with the
catalyst.debug.instrumentationcontext manager:>>> @qjit ... def expensive_function(a, b): ... return a + b >>> with debug.instrumentation("session_name", detailed=False): ... expensive_function(1, 2) [DIAGNOSTICS] Running capture walltime: 3.299 ms cputime: 3.294 ms programsize: 0 lines [DIAGNOSTICS] Running generate_ir walltime: 4.228 ms cputime: 4.225 ms programsize: 14 lines [DIAGNOSTICS] Running compile walltime: 57.182 ms cputime: 12.109 ms programsize: 121 lines [DIAGNOSTICS] Running run walltime: 1.075 ms cputime: 1.072 ms
The results will be appended to the provided file if the
filenameattribute is set, and printed to the console otherwise. The flagdetaileddetermines whether individual steps in the compiler and runtime are instrumented, or whether only high-level steps like “program capture” and “compilation” are reported.Measurements currently include wall time, CPU time, and (intermediate) program size.
Improvements
AutoGraph now supports return statements inside conditionals in qjit-compiled functions. (#583)
For example, the following pattern is now supported, as long as all return values have the same type:
@qjit(autograph=True) def fn(x): if x > 0: return jnp.sin(x) return jnp.cos(x)
>>> fn(0.1) array(0.09983342) >>> fn(-0.1) array(0.99500417)
This support extends to quantum circuits:
dev = qml.device("lightning.qubit", wires=1) @qjit(autograph=True) @qml.qnode(dev) def f(x: float): qml.RX(x, wires=0) m = catalyst.measure(0) if not m: return m, qml.expval(qml.PauliZ(0)) qml.RX(x ** 2, wires=0) return m, qml.expval(qml.PauliZ(0))
>>> f(1.4) (array(False), array(1.)) >>> f(1.4) (array(True), array(0.37945176))
Note that returning results with different types or shapes within the same function, such as different observables or differently shaped arrays, is not possible.
Errors are now raised at compile time if the gradient of an unsupported function is requested. (#204)
At the moment,
CompileErrorexceptions will be raised if at compile time it is found that code reachable from the gradient operation contains either a mid-circuit measurement, a callback, or a JAX-style custom call (which happens through the mitigation operation as well as certain JAX operations).Catalyst now supports devices built from the new PennyLane device API. (#565) (#598) (#599) (#636) (#638) (#664) (#687)
When using the new device API, Catalyst will discard the preprocessing from the original device, replacing it with Catalyst-specific preprocessing based on the TOML file provided by the device. Catalyst also requires that provided devices specify their wires upfront.
A new compiler optimization that removes redundant chains of self inverse operations has been added. This is done within a new MLIR pass called
remove-chained-self-inverse. Currently we only match redundant Hadamard operations, but the list of supported operations can be expanded. (#630)The
catalyst.measureoperation is now more lenient in the accepted type for thewiresparameter. In addition to a scalar, a 1D array is also accepted as long as it only contains one element. (#623)For example, the following is now supported:
catalyst.measure(wires=jnp.array([0]))
The compilation & execution of
@qjitcompiled functions can now be aborted using an interrupt signal (SIGINT). This includes usingCTRL-Cfrom a command line and theInterruptbutton in a Jupyter Notebook. (#642)The Catalyst Amazon Braket support has been updated to work with the latest version of the Amazon Braket PennyLane plugin (v1.25.0) and Amazon Braket Python SDK (v1.73.3) (#620) (#672) (#673)
Note that with this update, all declared qubits in a submitted program will always be measured, even if specific qubits were never used.
An updated quantum device specification format, TOML schema v2, is now supported by Catalyst. This allows device authors to specify properties such as native quantum control support, gate invertibility, and differentiability on a per-operation level. (#554)
For more details on the new TOML schema, please refer to the custom devices documentation.
An exception is now raised when OpenBLAS cannot be found by Catalyst during compilation. (#643)
Breaking changes
qml.sample and qml.counts now produce integer arrays for the sample array and basis state array when used without observables. (#648)
The endianness of counts in Catalyst now matches the convention of PennyLane. (#601)
catalyst.debug.print no longer supports the memref keyword argument. Please use catalyst.debug.print_memref instead. (#621)
Bug fixes
The QNode argument
diff_method=Noneis now supported for QNodes within a qjit-compiled function. (#658)A bug has been fixed where the C++ compiler driver was incorrectly being triggered twice. (#594)
Programs with
jnp.reshapeno longer fail. (#592)A bug in the quantum adjoint routine in the compiler has been fixed, which didn’t take into account control wires on operations in all instances. (#591)
A bug in the test suite causing stochastic autograph test failures has been fixed. (#652)
Running Catalyst tests should no longer raise
ResourceWarningfrom the use oftempfile.TemporaryDirectory. (#676)Raises an exception if the user has an incompatible CUDA Quantum version installed. (#707)
Internal changes
The deprecated
@qfuncdecorator, in use mainly by the LIT test suite, has been removed. (#679)Catalyst now publishes a revision string under
catalyst.__revision__, in addition to the existingcatalyst.__version__string. The revision contains the Git commit hash of the repository at the time of packaging, or for editable installations the active commit hash at the time of package import. (#560)The Python interpreter is now a shared resource across the runtime. (#615)
This change allows any part of the runtime to start executing Python code through pybind.
Contributors
This release contains contributions from (in alphabetical order):
Ali Asadi, David Ittah, Romain Moyard, Sergei Mironov, Erick Ochoa Lopez, Lee James O’Riordan, Muzammiluddin Syed.
Release 0.5.0¶
New features
Catalyst now provides a QJIT compatible
catalyst.vmapfunction, which makes it even easier to modify functions to map over inputs with additional batch dimensions. (#497) (#569)When working with tensor/array frameworks in Python, it can be important to ensure that code is written to minimize usage of Python for loops (which can be slow and inefficient), and instead push as much of the computation through to the array manipulation library, by taking advantage of extra batch dimensions.
For example, consider the following QNode:
dev = qml.device("lightning.qubit", wires=1) @qml.qnode(dev) def circuit(x, y): qml.RX(jnp.pi * x[0] + y, wires=0) qml.RY(x[1] ** 2, wires=0) qml.RX(x[1] * x[2], wires=0) return qml.expval(qml.PauliZ(0))
>>> circuit(jnp.array([0.1, 0.2, 0.3]), jnp.pi) Array(-0.93005586, dtype=float64)
We can use catalyst.vmap to introduce additional batch dimensions to our input arguments, without needing to use a Python for loop:
>>> x = jnp.array([[0.1, 0.2, 0.3],
...                [0.4, 0.5, 0.6],
...                [0.7, 0.8, 0.9]])
>>> y = jnp.array([jnp.pi, jnp.pi / 2, jnp.pi / 4])
>>> qjit(vmap(circuit))(x, y)
array([-0.93005586, -0.97165424, -0.6987465 ])
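Conceptually, mapping over a leading batch dimension gives the same results as calling the function once per row and collecting the outputs. The plain-Python sketch below illustrates only these semantics, not how catalyst.vmap is implemented; loop_vmap and toy_fn are hypothetical names.

```python
def loop_vmap(fn):
    """Apply fn to matching entries along the leading axis of every argument."""
    def batched(*args):
        return [fn(*row) for row in zip(*args)]
    return batched

def toy_fn(x, y):
    # Stand-in for a function of a vector x and a scalar y.
    return sum(x) + y

batched_result = loop_vmap(toy_fn)([[1, 2], [3, 4]], [10, 20])
print(batched_result)
```

Unlike this sketch, a compiled batching transform avoids running the loop in the Python interpreter, which is where the performance benefit comes from.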
catalyst.vmap() has been implemented to match the behaviour of jax.vmap, so it should be a drop-in replacement in most cases. Under the hood, it automatically inserts Catalyst-compatible for loops, which are compiled and executed outside of Python for increased performance.
Catalyst now supports compiling and executing QJIT-compiled QNodes using the CUDA Quantum compiler toolchain. (#477) (#536) (#547)
Simply import the CUDA Quantum
@cudaqjitdecorator to use this functionality:from catalyst.cuda import cudaqjit
Or, if using Catalyst from PennyLane, simply specify
@qml.qjit(compiler="cuda_quantum").The following devices are available when compiling with CUDA Quantum:
softwareq.qpp: a modern C++ state-vector simulatornvidia.custatevec: The NVIDIA CuStateVec GPU simulator (with support for multi-gpu)nvidia.cutensornet: The NVIDIA CuTensorNet GPU simulator (with support for matrix product state)
For example:
dev = qml.device("softwareq.qpp", wires=2) @cudaqjit @qml.qnode(dev) def circuit(x): qml.RX(x[0], wires=0) qml.RY(x[1], wires=1) qml.CNOT(wires=[0, 1]) return qml.expval(qml.PauliY(0))
>>> circuit(jnp.array([0.5, 1.4])) -0.47244976756708373
Note that CUDA Quantum compilation currently does not have feature parity with Catalyst compilation; in particular, AutoGraph, control flow, differentiation, and various measurement statistics (such as probabilities and variance) are not yet supported. Classical code support is also limited.
Catalyst now supports just-in-time compilation of static (compile-time constant) arguments. (#476) (#550)
The @qjit decorator takes a new argument static_argnums, which specifies which positional arguments of the decorated function should be treated as compile-time static arguments.
This allows any hashable Python object to be passed to the function during compilation; the function will only be re-compiled if the hash value of the static arguments changes. Otherwise, re-using previous static argument values will result in no re-compilation.
@qjit(static_argnums=(1,))
def f(x, y):
    print(f"Compiling with y={y}")
    return x + y

>>> f(0.5, 0.3)
Compiling with y=0.3
array(0.8)
>>> f(0.1, 0.3)  # no re-compilation occurs
array(0.4)
>>> f(0.1, 0.4)  # y changes, re-compilation
Compiling with y=0.4
array(0.5)
This functionality can be used to support passing arbitrary Python objects to QJIT-compiled functions, as long as they are hashable:
from dataclasses import dataclass

@dataclass
class MyClass:
    val: int

    def __hash__(self):
        return hash(str(self))

@qjit(static_argnums=(1,))
def f(x: int, y: MyClass):
    return x + y.val

>>> f(1, MyClass(5))
array(6)
>>> f(1, MyClass(6))  # re-compilation
array(7)
>>> f(2, MyClass(5))  # no re-compilation
array(7)
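The re-compilation rule can be pictured as a cache keyed on the hash of the static arguments. The sketch below is a toy model of that idea in plain Python. `toy_jit` and its `compile_log` attribute are hypothetical names, and the sketch does not reflect how Catalyst stores compiled functions internally.

```python
import functools

def toy_jit(static_argnums=()):
    """Toy decorator: 'compile' once per distinct hash of the static arguments."""
    def decorator(fn):
        cache = {}

        @functools.wraps(fn)
        def wrapper(*args):
            # Only the static arguments take part in the cache key.
            key = tuple(hash(args[i]) for i in static_argnums)
            if key not in cache:
                wrapper.compile_log.append(key)  # stands in for a real compilation
                cache[key] = fn
            return cache[key](*args)

        wrapper.compile_log = []
        return wrapper
    return decorator

@toy_jit(static_argnums=(1,))
def f(x, y):
    return x + y

f(0.5, 0.3)  # first call: "compiles" for y=0.3
f(0.1, 0.3)  # same static value: cache hit
f(0.1, 0.4)  # new static value: "compiles" again
print(len(f.compile_log))
```

Because only the hash is consulted, two distinct objects with equal hashes would share a cache entry in this toy model.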
Mid-circuit measurements now support post-selection and qubit reset when used with the Lightning simulators. (#491) (#507)
To specify post-selection, simply pass the postselect argument to the catalyst.measure function:
dev = qml.device("lightning.qubit", wires=1)

@qjit
@qml.qnode(dev)
def f():
    qml.Hadamard(0)
    m = measure(0, postselect=1)
    return qml.expval(qml.PauliZ(0))

Likewise, to reset a wire after mid-circuit measurement, simply specify reset=True:
dev = qml.device("lightning.qubit", wires=1)

@qjit
@qml.qnode(dev)
def f():
    qml.Hadamard(0)
    m = measure(0, reset=True)
    return qml.expval(qml.PauliZ(0))
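The effect of the two options on these single-qubit examples can be checked by hand. The following sketch tracks the two amplitudes of the qubit in plain Python. `measure_toy` is a hypothetical helper that takes the measurement outcome from the caller and assumes that a reset returns the wire to |0>; it models only the state update, not Catalyst's `measure` function.

```python
import math

def measure_toy(state, outcome, reset=False):
    """Collapse a single-qubit state (a0, a1) onto `outcome`; optionally reset to |0>."""
    amp = state[outcome]
    if abs(amp) == 0:
        raise ValueError("outcome has zero probability")
    phase = amp / abs(amp)  # renormalise the surviving amplitude
    collapsed = (0.0, phase) if outcome == 1 else (phase, 0.0)
    return (1.0, 0.0) if reset else collapsed

def expval_z(state):
    a0, a1 = state
    return abs(a0) ** 2 - abs(a1) ** 2

plus = (1 / math.sqrt(2), 1 / math.sqrt(2))  # state after a Hadamard on |0>

post_selected = expval_z(measure_toy(plus, outcome=1))            # state is |1>
after_reset = expval_z(measure_toy(plus, outcome=1, reset=True))  # state is |0>
print(post_selected, after_reset)
```

Under these assumptions, post-selecting on 1 gives an expectation value of -1, while resetting the wire gives +1 regardless of the outcome.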
Improvements
Catalyst now supports Python 3.12 (#532)
The JAX version used by Catalyst has been updated to
v0.4.23. (#428)Catalyst now supports the
qml.GlobalPhaseoperation. (#563)Native support for
qml.PSWAPandqml.ISWAPgates on Amazon Braket devices has been added. (#458)Specifically, a circuit like
dev = qml.device("braket.local.qubit", wires=2, shots=100) @qjit @qml.qnode(dev) def f(x: float): qml.Hadamard(0) qml.PSWAP(x, wires=[0, 1]) qml.ISWAP(wires=[1, 0]) return qml.probs()
would no longer decompose the PSWAP and ISWAP gates.
Add support for the GlobalPhase gate in the runtime. (#563)
The
qml.BlockEncodeoperator is now supported with Catalyst. (#483)Catalyst no longer relies on a TensorFlow installation for its AutoGraph functionality. Instead, the standalone
diastatic-maltpackage is used and automatically installed as a dependency. (#401)The
@qjitdecorator will remember previously compiled functions when the PyTree metadata of arguments changes, in addition to also remembering compiled functions when static arguments change. (#522)The following example will no longer trigger a third compilation:
@qjit
def func(x):
    print("compiling")
    return x

>>> func([1,]);  # list
compiling
>>> func((2,));  # tuple
compiling
>>> func([3,]);  # list
Note however that in order to keep overheads low, changing the argument type or shape (in a promotion incompatible way) may override a previously stored function (with identical PyTree metadata and static argument values):
@qjit
def func(x):
    print("compiling")
    return x

>>> func(jnp.array(1));  # scalar
compiling
>>> func(jnp.array([2.]));  # 1-D array
compiling
>>> func(jnp.array(3));  # scalar
compiling
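One way to picture both examples is a cache with a single slot per PyTree structure: a new structure adds a slot, while a new shape under an existing structure overwrites it. The sketch below is a toy model consistent with the two examples above. `Arr`, `structure`, `leaf_shapes` and `ToyCache` are hypothetical names, type promotion is ignored, and the actual caching logic in Catalyst may differ.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Arr:
    """Stand-in for an array leaf; only its shape matters here."""
    shape: tuple = ()

def structure(arg):
    """Container structure of an argument, ignoring leaves (the 'PyTree metadata')."""
    if isinstance(arg, (list, tuple)):
        return (type(arg).__name__, tuple(structure(a) for a in arg))
    return "*"

def leaf_shapes(arg):
    """Shapes of all leaves, in traversal order."""
    if isinstance(arg, (list, tuple)):
        return tuple(s for a in arg for s in leaf_shapes(a))
    return (getattr(arg, "shape", ()),)

class ToyCache:
    def __init__(self):
        self.entries = {}       # structure -> leaf shapes of the stored function
        self.compilations = 0

    def call(self, arg):
        key = structure(arg)
        if self.entries.get(key) != leaf_shapes(arg):
            self.compilations += 1               # compile, overriding any stored entry
            self.entries[key] = leaf_shapes(arg)

containers = ToyCache()
for arg in ([1], (2,), [3]):                     # list, tuple, list
    containers.call(arg)

shapes = ToyCache()
for arg in (Arr(()), Arr((1,)), Arr(())):        # scalar, 1-D array, scalar
    shapes.call(arg)

print(containers.compilations, shapes.compilations)
```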
Catalyst gradient functions (grad, jacobian, vjp, and jvp) now support being applied to functions that use (nested) container types as inputs and outputs. This includes lists and dictionaries, as well as any data structure implementing the PyTree protocol. (#500) (#501) (#508) (#549)
dev = qml.device("lightning.qubit", wires=1)

@qml.qnode(dev)
def circuit(phi, psi):
    qml.RY(phi, wires=0)
    qml.RX(psi, wires=0)
    return [{"expval0": qml.expval(qml.PauliZ(0))}, qml.expval(qml.PauliZ(0))]

psi = 0.1
phi = 0.2

>>> qjit(jacobian(circuit, argnum=[0, 1]))(psi, phi)
[{'expval0': (array(-0.0978434), array(-0.19767681))}, (array(-0.0978434), array(-0.19767681))]
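The values in this output can be cross-checked analytically: for this circuit the expectation value is cos(a) * cos(b), where a and b are the first and second positional arguments. The plain-Python sketch below evaluates the two partial derivatives; the helper name `partials` is hypothetical.

```python
import math

def partials(a, b):
    """Partial derivatives of cos(a) * cos(b) with respect to a and b."""
    return (-math.sin(a) * math.cos(b), -math.cos(a) * math.sin(b))

# Same positional values as the call above: first argument 0.1, second 0.2.
d_first, d_second = partials(0.1, 0.2)
print(d_first, d_second)
```

Note that the call passes the value named psi into the parameter named phi, so the first entry of each tuple is the derivative with respect to the RY angle evaluated at 0.1.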
Support has been added for linear algebra functions which depend on computing the eigenvalues of symmetric matrices, such as
qml.math.sqrt_matrix(). (#488)
For example, you can compile
qml.math.sqrt_matrix:@qml.qjit def workflow(A): B = qml.math.sqrt_matrix(A) return B @ A
Internally, this involves support for lowering the eigenvectors/values computation lapack method
lapack_dsyevdviastablehlo.custom_call.Additional debugging functions are now available in the
catalyst.debug module. (#529) (#522)
This includes:
filter_static_args(args, static_argnums)to remove static values from arguments using the provided index list.get_cmain(fn, *args)to return a C program that calls a jitted function with the provided arguments.print_compilation_stage(fn, stage)to print one of the recorded compilation stages for a JIT-compiled function.
For more details, please see the
catalyst.debugdocumentation.Remove redundant copies of TOML files for
lightning.kokkosandlightning.qubit. (#472)lightning.kokkosandlightning.qubitnow ship with their own TOML file. As such, we use the TOML file provided by them.Capturing quantum circuits with many gates prior to compilation is now quadratically faster (up to a factor), by removing
qextract_pandqinst_pfrom forced-order primitives. (#469)Update
AllocateQubitandAllocateQubitsinLightningKokkosSimulatorto preserve the current state-vector before qubit re-allocations in the runtime dynamic qubits management. (#479)The PennyLane custom compiler entry point name convention has changed, necessitating a change to the Catalyst entry points. (#493)
Breaking changes
Catalyst gradient functions now match the Jax convention for the returned axes of gradients, Jacobians, VJPs, and JVPs. As a result, the returned tensor shape from various Catalyst gradient functions may differ compared to previous versions of Catalyst. (#500) (#501) (#508)
The Catalyst Python frontend has been partially refactored. The impact on user-facing functionality is minimal, but the location of certain classes and methods used by the package may have changed. (#529) (#522)
The following changes have been made:
Some debug methods and features on the QJIT class have been turned into free functions and moved to the
catalyst.debug module, which will now appear in the public documentation. This includes compiling a program from IR, obtaining a C program that invokes a compiled function, and printing fine-grained MLIR compilation stages.
The
compilation_pipelines.pymodule has been renamed tojit.py, and certain functionality has been moved out (see following items).A new module
compiled_functions.pynow manages low-level access to compiled functions.A new module
tracing/type_signatures.py handles functionality related to managing arguments and type signatures during the tracing process.
The
contexts.pymodule has been moved fromutilsto the newtracingsub-module.
Internal changes
Changes to the runtime QIR API and dependencies, to avoid symbol conflicts with other libraries that utilize QIR. (#464) (#470)
The existing Catalyst runtime implements QIR as a library that can be linked against a QIR module. This works great when Catalyst is the only implementor of QIR, however it may generate symbol conflicts when used alongside other QIR implementations.
To avoid this, two changes were necessary:
The Catalyst runtime now has a different API from QIR instructions.
The runtime has been modified such that QIR instructions are lowered to functions where the
__quantum__part of the function name is replaced with__catalyst__. This prevents the possibility of symbol conflicts with other libraries that implement QIR as a library.The Catalyst runtime no longer depends on QIR runner’s stdlib.
We no longer depend on or link against QIR runner’s stdlib. Linking against it left some definitions in place that may differ from those used by third-party implementors. To prevent symbol conflicts, QIR runner’s stdlib was removed. As a result, the following functions are now defined and implemented in Catalyst’s runtime:
int64_t __catalyst__rt__array_get_size_1d(QirArray *)
int8_t *__catalyst__rt__array_get_element_ptr_1d(QirArray *, int64_t)
and the following functions were removed since the frontend does not generate them:
QirString *__catalyst__rt__qubit_to_string(QUBIT *)
QirString *__catalyst__rt__result_to_string(RESULT *)
Fix an issue when no qubit number was specified for the
qinstprimitive. The primitive now correctly deduces the number of qubits when no gate parameters are present. This change is not user facing. (#496)
Bug fixes
Fixed a bug where differentiation of sliced arrays would result in an error. (#552)
def f(x):
    return jax.numpy.sum(x[::2])

x = jax.numpy.array([0.1, 0.2, 0.3, 0.4])

>>> catalyst.qjit(catalyst.grad(f))(x)
[1. 0. 1. 0.]
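The expected gradient can be confirmed independently of Catalyst with a central finite difference. The sketch below uses plain Python lists in place of JAX arrays; `finite_difference_grad` is a hypothetical helper written for this check.

```python
def f(x):
    # Sum of every second element, mirroring jax.numpy.sum(x[::2]).
    return sum(x[::2])

def finite_difference_grad(fn, x, eps=1e-6):
    """Central finite-difference gradient of fn at the list x."""
    grad = []
    for i in range(len(x)):
        up = x[:i] + [x[i] + eps] + x[i + 1:]
        down = x[:i] + [x[i] - eps] + x[i + 1:]
        grad.append((fn(up) - fn(down)) / (2 * eps))
    return grad

g = finite_difference_grad(f, [0.1, 0.2, 0.3, 0.4])
print([round(v, 6) for v in g])
```

Only the elements selected by the slice contribute to the sum, so their derivatives are 1 and the others are 0.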
Fixed a bug where quantum control applied to a subcircuit was not correctly mapping wires, and the wires in the nested region remained unchanged. (#555)
Catalyst will no longer print a warning that recompilation is triggered when a
@qjit decorated function with no arguments is invoked without having been compiled first, for example via the use of target="mlir". (#522)
Fixes a bug in the configuration of dynamically shaped arrays that would cause certain programs to error with
TypeError: cannot unpack non-iterable ShapedArray object. (#526)This is fixed by replacing the code which updates the
JAX_DYNAMIC_SHAPESoption with atransient_jax_config()context manager which temporarily sets the value ofJAX_DYNAMIC_SHAPESto True and then restores the original configuration value following the yield. The context manager is used bytrace_to_jaxpr()andlower_jaxpr_to_mlir().Exceptions encountered in the runtime when using the
@qjit option async_qnodes=True will now be properly propagated to the frontend. (#447) (#510)
This is done by:
changing llvm.call to llvm.invoke
setting async runtime tokens and values to be errors
deallocating live tokens and values
Fixes a bug when computing gradients with indexing/slicing, by fixing the scatter operation lowering when
updatedWindowsDimis empty. (#475)Fix the issue in
LightningKokkos::AllocateQubitswith allocating too many qubit IDs on qubit re-allocation. (#473)Fixed an issue where wires was incorrectly set as
<Wires = [<WiresEnum.AnyWires: -1>]>when usingcatalyst.adjointandcatalyst.ctrl, by adding awiresproperty to these operations. (#480)Fix the issue with multiple lapack symbol definitions in the compiled program by updating the
stablehlo.custom_callconversion pass. (#488)
Contributors
This release contains contributions from (in alphabetical order):
Mikhail Andrenkov, Ali Asadi, David Ittah, Tzung-Han Juang, Erick Ochoa Lopez, Romain Moyard, Raul Torres, Haochen Paul Wang.
Release 0.4.1¶
Improvements
Catalyst wheels are now packaged with OpenMP and ZStd, which avoids installing additional requirements separately in order to use pre-packaged Catalyst binaries. (#457) (#478)
Note that OpenMP support for the
lightning.kokkosbackend has been disabled on macOS x86_64, due to memory issues in the computation of Lightning’s adjoint-jacobian in the presence of multiple OMP threads.
Bug fixes
Resolve an infinite recursion in the decomposition of the
Controlledoperator whenever computing a Unitary matrix for the operator fails. (#468)Resolve a failure to generate gradient code for specific input circuits. (#439)
In this case,
jnp.modwas used to compute wire values in a for loop, which prevented the gradient architecture from fully separating quantum and classical code. The following program is now supported:@qjit @grad @qml.qnode(dev) def f(x): def cnot_loop(j): qml.CNOT(wires=[j, jnp.mod((j + 1), 4)]) for_loop(0, 4, 1)(cnot_loop)() return qml.expval(qml.PauliZ(0))
Resolve unpredictable behaviour when importing libraries that share Catalyst’s LLVM dependency (e.g. TensorFlow). In some cases, both packages exporting the same symbols from their shared libraries can lead to process crashes and other unpredictable behaviour, since the wrong functions can be called if both libraries are loaded in the current process. The fix involves building shared libraries with hidden (macOS) or protected (linux) symbol visibility by default, exporting only what is necessary. (#465)
Resolve a failure to find the SciPy OpenBLAS library when running Catalyst, due to a different SciPy version being used to build Catalyst than to run it. (#471)
Resolve a memory leak in the runtime stemming from missing calls to device destructors at the end of programs. (#446)
Contributors
This release contains contributions from (in alphabetical order):
Ali Asadi, David Ittah.
Release 0.4.0¶
New features
Catalyst is now accessible directly within the PennyLane user interface, once Catalyst is installed, allowing easy access to Catalyst just-in-time functionality.
Through the use of the
qml.qjitdecorator, entire workflows can be JIT compiled down to a machine binary on first-function execution, including both quantum and classical processing. Subsequent calls to the compiled function will execute the previously-compiled binary, resulting in significant performance improvements.import pennylane as qml dev = qml.device("lightning.qubit", wires=2) @qml.qjit @qml.qnode(dev) def circuit(theta): qml.Hadamard(wires=0) qml.RX(theta, wires=1) qml.CNOT(wires=[0, 1]) return qml.expval(qml.PauliZ(wires=1))
>>> circuit(0.5) # the first call, compilation occurs here array(0.) >>> circuit(0.5) # the precompiled quantum function is called array(0.)
Currently, PennyLane supports the Catalyst hybrid compiler with the
qml.qjitdecorator, which directly aliases Catalyst’scatalyst.qjit.In addition to the above
qml.qjit integration, the following native PennyLane functions can now be used with the qjit decorator: qml.adjoint, qml.ctrl, qml.grad, qml.jacobian, qml.vjp, qml.jvp, qml.while_loop, qml.for_loop, and qml.cond. These will alias to the corresponding Catalyst functions when used within a qjit context.
For more details on these functions, please refer to the PennyLane compiler documentation and compiler module documentation.
Just-in-time compiled functions now support asynchronous execution of QNodes. (#374) (#381) (#420) (#424) (#433)
Simply specify
async_qnodes=Truewhen using the@qjitdecorator to enable the async execution of QNodes. Currently, asynchronous execution is only supported bylightning.qubitandlightning.kokkos.Asynchronous execution will be most beneficial for just-in-time compiled functions that contain — or generate — multiple QNodes.
For example,
dev = qml.device("lightning.qubit", wires=2)

@qml.qnode(device=dev)
def circuit(params):
    qml.RX(params[0], wires=0)
    qml.RY(params[1], wires=1)
    qml.CNOT(wires=[0, 1])
    return qml.expval(qml.PauliZ(wires=0))

@qjit(async_qnodes=True)
def multiple_qnodes(params):
    x = jnp.sin(params)
    y = jnp.cos(params)
    z = jnp.array([circuit(x), circuit(y)])  # will be executed in parallel
    return circuit(z)

>>> multiple_qnodes(jnp.array([1.0, 2.0]))
1.0
Here, the first two circuit executions will occur in parallel across multiple threads, as their execution can occur independently.
Preliminary support for PennyLane transforms has been added. (#280)
@qjit @qml.transforms.split_non_commuting @qml.qnode(dev) def circuit(x): qml.RX(x,wires=0) return [qml.expval(qml.PauliY(0)), qml.expval(qml.PauliZ(0))]
>>> circuit(0.4) [array(-0.51413599), array(0.85770868)]
Currently, most PennyLane transforms will work with Catalyst as long as:
The circuit does not include any Catalyst-specific features, such as Catalyst control flow or measurement,
The QNode returns only lists of measurement processes,
AutoGraph is disabled, and
The transformation does not require or depend on the numeric value of dynamic variables.
Catalyst now supports just-in-time compilation of dynamically-shaped arrays. (#366) (#386) (#390) (#411)
The
@qjit decorator can now be used to compile functions that accept or contain tensors whose dimensions are not known at compile time; runtime execution with different shapes is supported without recompilation.
In addition, standard tensor initialization functions
jax.numpy.ones,jnp.zeros, andjnp.emptynow accept dynamic variables (where the value is only known at runtime).@qjit def func(size: int): return jax.numpy.ones([size, size], dtype=float)
>>> func(3) [[1. 1. 1.] [1. 1. 1.] [1. 1. 1.]]
When passing tensors as arguments to compiled functions, the
abstracted_axeskeyword argument to the@qjitdecorator can be used to specify which axes of the input arguments should be treated as abstract (and thus avoid recompilation).For example, without specifying
abstracted_axes, the followingsumfunction would recompile each time an array of different size is passed as an argument:>>> @qjit >>> def sum_fn(x): >>> return jnp.sum(x) >>> sum_fn(jnp.array([1])) # Compilation happens here. >>> sum_fn(jnp.array([1, 1])) # And here!
By passing
abstracted_axes, we can specify that the first axis of the first argument is to be treated as dynamic during initial compilation:
>>> @qjit(abstracted_axes={0: "n"})
>>> def sum_fn(x):
>>>     return jnp.sum(x)
>>> sum_fn(jnp.array([1]))     # Compilation happens here.
>>> sum_fn(jnp.array([1, 1]))  # No need to recompile.
Note that dynamic arrays in control-flow primitives (such as loops) are not yet supported.
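The effect of abstracted_axes on recompilation can be illustrated by treating the argument shape as a cache key in which abstracted axes are replaced by their symbolic name. This is a simplified sketch: `signature` and `count_compilations` are hypothetical helpers, and Catalyst's real cache key involves more than shapes.

```python
def signature(shape, abstracted_axes=None):
    """Replace abstracted axes by their symbolic name to form a compilation key."""
    abstracted_axes = abstracted_axes or {}
    return tuple(abstracted_axes.get(axis, dim) for axis, dim in enumerate(shape))

def count_compilations(shapes, abstracted_axes=None):
    """Number of distinct keys (and hence compilations) a sequence of calls needs."""
    return len({signature(s, abstracted_axes) for s in shapes})

calls = [(1,), (2,)]  # shapes of jnp.array([1]) and jnp.array([1, 1])
without_axes = count_compilations(calls)
with_axes = count_compilations(calls, {0: "n"})
print(without_axes, with_axes)
```

Without abstracted axes each distinct size produces its own key, whereas with axis 0 abstracted both calls map to the single key ("n",).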
Error mitigation using the zero-noise extrapolation method is now available through the
catalyst.mitigate_with_znetransform. (#324) (#414)For example, given a noisy device (such as noisy hardware available through Amazon Braket):
dev = qml.device("noisy.device", wires=2)

@qml.qnode(device=dev)
def circuit(x, n):

    @for_loop(0, n, 1)
    def loop_rx(i):
        qml.RX(x, wires=0)

    loop_rx()
    qml.Hadamard(wires=0)
    qml.RZ(x, wires=0)
    loop_rx()
    qml.RZ(x, wires=0)
    qml.CNOT(wires=[1, 0])
    qml.Hadamard(wires=1)
    return qml.expval(qml.PauliY(wires=0))

@qjit
def mitigated_circuit(args, n):
    s = jax.numpy.array([1, 2, 3])
    return mitigate_with_zne(circuit, scale_factors=s)(args, n)

>>> mitigated_circuit(0.2, 5)
0.5655341100116512
In addition, a mitigation dialect has been added to the MLIR layer of Catalyst. It contains a Zero Noise Extrapolation (ZNE) operation, with a lowering to a global folded circuit.
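As a rough illustration of the extrapolation step, the sketch below fits the unique polynomial through the results measured at each scale factor and evaluates it at zero noise. The notes above do not state which extrapolation method Catalyst applies, so polynomial extrapolation is an assumption here. The `noisy` model and the `extrapolate_to_zero` helper are likewise hypothetical.

```python
def extrapolate_to_zero(scale_factors, values):
    """Evaluate at 0 the unique polynomial through the points (scale, value)."""
    total = 0.0
    for i, (si, vi) in enumerate(zip(scale_factors, values)):
        weight = 1.0
        for j, sj in enumerate(scale_factors):
            if j != i:
                weight *= (0.0 - sj) / (si - sj)  # Lagrange basis polynomial at 0
        total += vi * weight
    return total

def noisy(scale):
    # Made-up noise model: the ideal value 0.6 degraded smoothly with the noise scale.
    return 0.6 - 0.05 * scale + 0.004 * scale ** 2

scales = [1, 2, 3]
estimate = extrapolate_to_zero(scales, [noisy(s) for s in scales])
print(estimate)
```

With three scale factors the fit is exact for any quadratic noise model, so the estimate recovers the ideal value of this toy model.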
Improvements
The three backend devices provided with Catalyst,
lightning.qubit,lightning.kokkos, andbraket.aws, are now dynamically loaded at runtime. (#343) (#400)This takes advantage of the new backend plugin system provided in Catalyst v0.3.2, and allows the devices to be packaged separately from the runtime CAPI. Provided backend devices are now loaded at runtime, instead of being linked at compile time.
For more details on the backend plugin system, see the custom devices documentation.
Finite-shot measurement statistics (expval, var, and probs) are now supported for the lightning.qubit and lightning.kokkos devices. Previously, exact statistics were returned even when finite shots were specified. (#392) (#410)
>>> dev = qml.device("lightning.qubit", wires=2, shots=100)
>>> @qjit
>>> @qml.qnode(dev)
>>> def circuit(x):
>>>     qml.RX(x, wires=0)
>>>     return qml.probs(wires=0)
>>> circuit(0.54)
array([0.94, 0.06])
>>> circuit(0.54)
array([0.93, 0.07])
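The run-to-run variation in the output above is what sampling predicts. A plain-Python sketch of the same experiment follows, assuming the exact probability of measuring 1 after RX(x) on |0> is sin²(x/2). `sample_probs` is a hypothetical helper and uses Python's `random` module rather than a Lightning device.

```python
import math
import random

def sample_probs(theta, shots, seed=0):
    """Estimate [p0, p1] for RX(theta)|0> from `shots` simulated measurements."""
    p1 = math.sin(theta / 2) ** 2          # exact probability of measuring 1
    rng = random.Random(seed)
    ones = sum(rng.random() < p1 for _ in range(shots))
    return [(shots - ones) / shots, ones / shots]

exact_p0 = math.cos(0.54 / 2) ** 2
estimate = sample_probs(0.54, shots=100)
print(exact_p0, estimate)
```

With 100 shots the standard error on each probability is roughly 0.03, which is consistent with estimates such as 0.94 and 0.93 scattering around the exact value of about 0.929.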
Catalyst gradient functions
grad,jacobian,jvp, andvjpcan now be invoked from outside a@qjitcontext. (#375)This simplifies the process of writing functions where compilation can be turned on and off easily by adding or removing the decorator. The functions dispatch to their JAX equivalents when the compilation is turned off.
dev = qml.device("lightning.qubit", wires=2)

@qml.qnode(dev)
def circuit(x):
    qml.RX(x, wires=0)
    return qml.expval(qml.PauliZ(0))

>>> grad(circuit)(0.54)  # dispatches to jax.grad
Array(-0.51413599, dtype=float64, weak_type=True)
>>> qjit(grad(circuit))(0.54)  # differentiates using Catalyst
array(-0.51413599)
New
lightning.qubitconfiguration options are now supported via theqml.deviceloader, including Markov Chain Monte Carlo sampling support. (#369)dev = qml.device("lightning.qubit", wires=2, shots=1000, mcmc=True) @qml.qnode(dev) def circuit(x): qml.RX(x, wires=0) return qml.expval(qml.PauliZ(0))
>>> circuit(0.54) array(0.856)
Improvements have been made to the runtime and quantum MLIR dialect in order to support asynchronous execution.
The runtime now supports multiple active devices managed via a device pool. The new
RTDevicedata-class andRTDeviceStatusalong with thethread_localdevice instance pointer enable the runtime to better scope the lifetime of device instances concurrently. With these changes, one can create multiple active devices and execute multiple programs in a multithreaded environment. (#381)The ability to dynamically release devices has been added via
DeviceReleaseOpin the Quantum MLIR dialect. This is lowered to the__quantum__rt__device_release()runtime instruction, which updates the status of the device instance fromActivetoInactive. The runtime will reuse this deactivated instance instead of creating a new one automatically at runtime in a multi-QNode workflow when another device with identical specifications is requested. (#381)The
DeviceOpdefinition in the Quantum MLIR dialect has been updated to lower a tuple of device information('lib', 'name', 'kwargs')to a single device initialization call__quantum__rt__device_init(int8_t *, int8_t *, int8_t *). This allows the runtime to initialize device instances without keeping partial information of the device (#396)
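The device pool behaviour described in the items above (initialize from a single tuple of device information, mark released devices as Inactive, and reuse an Inactive instance with identical specifications) can be sketched as follows. `ToyDevicePool` is a hypothetical Python model and the library, name and kwargs strings are placeholders; the actual runtime is implemented in C++ and may differ in detail.

```python
class ToyDevicePool:
    """Toy pool that reuses released devices with identical specifications."""

    def __init__(self):
        self.devices = []  # each entry: {"spec": (lib, name, kwargs), "status": ...}

    def init_device(self, lib, name, kwargs):
        spec = (lib, name, kwargs)
        for device in self.devices:
            if device["status"] == "Inactive" and device["spec"] == spec:
                device["status"] = "Active"   # reuse instead of creating a new instance
                return device
        device = {"spec": spec, "status": "Active"}
        self.devices.append(device)
        return device

    def release_device(self, device):
        device["status"] = "Inactive"

pool = ToyDevicePool()
spec = ("libdevice.so", "lightning.qubit", "{'shots': 0}")  # placeholder values
first = pool.init_device(*spec)
pool.release_device(first)
second = pool.init_device(*spec)   # same spec, released instance is reused
third = pool.init_device(*spec)    # no inactive instance left, a new one is created
print(len(pool.devices))
```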
The quantum adjoint compiler routine has been extended to support function calls that affect the quantum state within an adjoint region. Note that the function may only provide a single result consisting of the quantum register. By itself this provides no user-facing changes, but compiler pass developers may now generate quantum adjoint operations around a block of code containing function calls as well as quantum operations and control flow operations. (#353)
The allocation and deallocation operations in MLIR (
AllocOp,DeallocOp) now follow simple value semantics for qubit register values, instead of modelling memory in the MLIR trait system. Similarly, the frontend generates proper value semantics by deallocating the final register value.The change enables functions at the MLIR level to accept and return quantum register values, which would otherwise not be correctly identified as aliases of existing register values by the bufferization system. (#360)
Breaking changes
Third party devices must now provide a configuration TOML file, in order to specify their supported operations, measurements, and features for Catalyst compatibility. For more information please visit the Custom Devices section in our documentation. (#369)
Bug fixes
Resolves a bug in the compiler’s differentiation engine that results in a segmentation fault when attempting to differentiate non-differentiable quantum operations. The fix ensures that all existing quantum operation types are removed during gradient passes that extract classical code from a QNode function. It also adds a verification step that will raise an error if a gradient pass cannot successfully eliminate all quantum operations for such functions. (#397)
Resolves a bug that caused unpredictable behaviour when printing string values with the
debug.printfunction. The issue was caused by non-null-terminated strings. (#418)
Contributors
This release contains contributions from (in alphabetical order):
Ali Asadi, David Ittah, Romain Moyard, Sergei Mironov, Erick Ochoa Lopez, Shuli Shu.
Release 0.3.2¶
New features
The experimental AutoGraph feature now supports Python
whileloops, allowing native Python loops to be captured and compiled with Catalyst. (#318)dev = qml.device("lightning.qubit", wires=4) @qjit(autograph=True) @qml.qnode(dev) def circuit(n: int, x: float): i = 0 while i < n: qml.RX(x, wires=i) i += 1 return qml.expval(qml.PauliZ(0))
>>> circuit(4, 0.32) array(0.94923542)
This feature extends the existing AutoGraph support for Python
forloops andifstatements introduced in v0.3. Note that TensorFlow must be installed for AutoGraph support.For more details, please see the AutoGraph guide.
In addition to loops and conditional branches, AutoGraph now supports native Python
and,orandnotoperators in Boolean expressions. (#325)dev = qml.device("lightning.qubit", wires=1) @qjit(autograph=True) @qml.qnode(dev) def circuit(x: float): if x >= 0 and x < jnp.pi: qml.RX(x, wires=0) return qml.probs()
>>> circuit(0.43) array([0.95448287, 0.04551713]) >>> circuit(4.54) array([1., 0.])
Note that logical Boolean operators will only be captured by AutoGraph if all operands are dynamic variables (that is, a value known only at runtime, such as a measurement result or function argument). For other use cases, it is recommended to use the
jax.numpy.logical_*set of functions where appropriate.Debug compiled programs and print dynamic values at runtime with
debug.print. (#279) (#356)
You can now print arbitrary values from your running program, whether they are arrays, constants, strings, or arbitrary Python objects. Note that while non-array Python objects will be printed at runtime, their string representation is captured at compile time, and thus will always be the same regardless of program inputs. The output for arrays optionally includes a descriptor for how the data is stored in memory (“memref”).
@qjit def func(x: float): debug.print(x, memref=True) debug.print("exit")
>>> func(jnp.array(0.43)) MemRef: base@ = 0x5629ff2b6680 rank = 0 offset = 0 sizes = [] strides = [] data = 0.43 exit
Catalyst now officially supports macOS X86_64 devices, with macOS binary wheels available for both AARCH64 and X86_64. (#347) (#313)
It is now possible to dynamically load third-party Catalyst compatible devices directly into a pre-installed Catalyst runtime on Linux. (#327)
To take advantage of this, third-party devices must implement the
Catalyst::Runtime::QuantumDeviceinterface, in addition to defining the following method:extern "C" Catalyst::Runtime::QuantumDevice* getCustomDevice() { return new CustomDevice(); }
This support can also be integrated into existing PennyLane Python devices that inherit from the
QuantumDeviceclass, by defining theget_c_interfacestatic method.For more details, see the custom devices documentation.
Improvements
Return values of conditional functions no longer need to be of exactly the same type. Type promotion is automatically applied to branch return values if their types don’t match. (#333)
@qjit def func(i: int, f: float): @cond(i < 3) def cond_fn(): return i @cond_fn.otherwise def otherwise(): return f return cond_fn()
>>> func(1, 4.0) array(1.0)
Automatic type promotion across conditional branches also works with AutoGraph:
@qjit(autograph=True) def func(i: int, f: float): if i < 3: i = i else: i = f return i
>>> func(1, 4.0) array(1.0)
AutoGraph now supports converting functions even when they are invoked through functional wrappers such as
adjoint,ctrl,grad,jacobian, etc. (#336)For example, the following should now succeed:
def inner(n): for i in range(n): qml.T(i) @qjit(autograph=True) @qml.qnode(dev) def f(n: int): adjoint(inner)(n) return qml.state()
To prepare for Catalyst’s frontend being integrated with PennyLane, the appropriate plugin entry point interface has been added to Catalyst. (#331)
For any compiler packages seeking to be registered in PennyLane, the
entry_points metadata under the group name pennylane.compilers must be added, with the following entry points:
context: Path to the compilation evaluation context manager. This context manager should have the method context.is_tracing(), which returns True if called within a program that is being traced or captured.
ops: Path to the compiler operations module. This operations module may contain compiler-specific versions of PennyLane operations. Within a JIT context, PennyLane operations may dispatch to these.
qjit: Path to the JIT compiler decorator provided by the compiler. This decorator should have the signature qjit(fn, *args, **kwargs), where fn is the function to be compiled.
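As an illustration, a compiler package could declare these entry points in its setup script along the following lines. The package name `my_compiler` and its module paths are hypothetical; only the group name and the three entry point names are taken from the description above.

```python
# Fragment of a setup.py for a hypothetical package `my_compiler`.
ENTRY_POINTS = {
    "pennylane.compilers": [
        "context = my_compiler.contexts:EvaluationContext",  # context manager class
        "ops = my_compiler.ops",                              # operations module
        "qjit = my_compiler.jit:qjit",                        # JIT decorator
    ]
}

# The dictionary would be passed on as: setuptools.setup(..., entry_points=ENTRY_POINTS)
names = [entry.split("=")[0].strip() for entry in ENTRY_POINTS["pennylane.compilers"]]
print(names)
```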
The compiler driver diagnostic output has been improved, and now includes failing IR as well as the names of failing passes. (#349)
The scatter operation in the Catalyst dialect now uses an SCF for loop to avoid ballooning the compiled code. (#307)
The
CopyGlobalMemRefPasspass of our MLIR processing pipeline now supports dynamically shaped arrays. (#348)The Catalyst utility dialect is now included in the Catalyst MLIR C-API. (#345)
Fix an issue with the AutoGraph conversion system that would prevent the fallback to Python from working correctly in certain instances. (#352)
The following type of code is now supported:
@qjit(autograph=True) def f(): l = jnp.array([1, 2]) for _ in range(2): l = jnp.kron(l, l) return l
Catalyst now supports
jax.numpy.polyfitinside a qjitted function. (#367)Catalyst now supports custom calls (including the one from HLO). We added support in MLIR (operation, bufferization and lowering). In the
lib_custom_calls, developers can then implement their custom calls and use external functions directly (e.g. LAPACK). The OpenBLAS library is taken from SciPy and linked in Catalyst, so any function from it can be used. (#367)
Breaking changes
The axis ordering for
catalyst.jacobianis updated to matchjax.jacobian. Assuming we have parameters of shape[a,b]and results of shape[c,d], the returned Jacobian will now have shape[c, d, a, b]instead of[a, b, c, d]. (#283)
Bug fixes
An upstream change in the PennyLane-Lightning project was addressed to prevent compilation issues in the
StateVectorLQubitDynamicclass in the runtime. The issue was introduced in #499. (#322)The
requirements.txtfile to build Catalyst from source has been updated with a minimum pip version,>=22.3. Previous versions of pip are unable to perform editable installs when the system-wide site-packages are read-only, even when the--userflag is provided. (#311)The frontend has been updated to make it compatible with PennyLane
MeasurementProcessobjects now being PyTrees in PennyLane version 0.33. (#315)
Contributors
This release contains contributions from (in alphabetical order):
Ali Asadi, David Ittah, Sergei Mironov, Romain Moyard, Erick Ochoa Lopez.
Release 0.3.1¶
New features
The experimental AutoGraph feature now supports Python
forloops, allowing native Python loops to be captured and compiled with Catalyst. (#258)dev = qml.device("lightning.qubit", wires=n) @qjit(autograph=True) @qml.qnode(dev) def f(n): for i in range(n): qml.Hadamard(wires=i) return qml.expval(qml.PauliZ(0))
This feature extends the existing AutoGraph support for Python if statements introduced in v0.3. Note that TensorFlow must be installed for AutoGraph support.
The quantum control operation can now be used in conjunction with Catalyst control flow, such as loops and conditionals, via the new catalyst.ctrl function. (#282)
Similar in behaviour to the qml.ctrl control modifier from PennyLane, catalyst.ctrl can additionally wrap around quantum functions which contain control flow, such as the Catalyst cond, for_loop, and while_loop primitives.

@qjit
@qml.qnode(qml.device("lightning.qubit", wires=4))
def circuit(x):

    @for_loop(0, 3, 1)
    def repeat_rx(i):
        qml.RX(x / 2, wires=i)

    catalyst.ctrl(repeat_rx, control=3)()

    return qml.expval(qml.PauliZ(0))

>>> circuit(0.2)
array(1.)
Catalyst now supports JAX’s array.at[index] notation for array element assignment and updating. (#273)

@qjit
def add_multiply(l: jax.core.ShapedArray((3,), dtype=float), idx: int):
    res = l.at[idx].multiply(3)
    res2 = l.at[idx].add(2)
    return res + res2

res = add_multiply(jnp.array([0, 1, 2]), 2)

>>> res
[0, 2, 10]
For more details on available methods, see the JAX documentation.
Improvements
The Lightning backend device has been updated to work with the new PL-Lightning monorepo. (#259) (#277)
A new compiler driver has been implemented in C++. This improves compile-time performance by avoiding round-tripping, which is when the entire program being compiled is dumped to a textual form and re-parsed by another tool.
This is also a requirement for providing custom metadata at the LLVM level, which is necessary for better integration with tools like Enzyme. Finally, this makes it more natural to improve error messages originating from C++ when compared to the prior subprocess-based approach. (#216)
Support the braket.devices.Devices enum class and s3_destination_folder device options for AWS Braket remote devices. (#278)
Improvements have been made to the build process, including avoiding unnecessary processes such as removing opt and downloading the wheel. (#298)
Remove a linker warning about duplicate rpaths when Catalyst wheels are installed on macOS. (#314)
Bug fixes
Fix incompatibilities with GCC on Linux introduced in v0.3.0 when compiling user programs. Due to these, Catalyst v0.3.0 only works when clang is installed in the user environment.
Remove undocumented package dependency on the zlib/zstd compression library. (#308)
Fix filesystem issue when compiling multiple functions with the same name and keep_intermediate=True. (#306)
Add support for applying the adjoint operation to QubitUnitary gates. QubitUnitary could not be adjointed when the variable holding the unitary matrix might change. This can happen, for instance, inside of a for loop. To solve this issue, the unitary matrix is stored in the array list via push and pop operations. The unitary matrix is later reconstructed from the array list, and QubitUnitary can be executed in the adjointed context. (#304) (#310)
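The push/pop mechanism can be pictured with a small NumPy sketch. It is a conceptual illustration only, not Catalyst's implementation: `rx_matrix`, `forward`, and `adjoint` are hypothetical helpers, and a plain Python list stands in for the array list.

```python
import numpy as np

def rx_matrix(theta):
    # 2x2 matrix of a single-qubit X rotation.
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -1j * s], [-1j * s, c]])

def forward(state, angles, tape):
    # Apply a different unitary on each iteration and push it onto the tape.
    for theta in angles:
        u = rx_matrix(theta)
        tape.append(u)
        state = u @ state
    return state

def adjoint(state, tape):
    # Pop the stored unitaries in reverse order and apply their conjugate transposes.
    while tape:
        u = tape.pop()
        state = u.conj().T @ state
    return state

tape = []
initial = np.array([1.0 + 0j, 0.0 + 0j])
evolved = forward(initial, [0.1, 0.2, 0.3], tape)
recovered = adjoint(evolved, tape)
```

Because the matrices are popped in last-in, first-out order, the adjoint pass undoes the loop iterations in reverse, which is why `recovered` matches `initial` up to floating-point error.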
Contributors
This release contains contributions from (in alphabetical order):
Ali Asadi, David Ittah, Erick Ochoa Lopez, Jacob Mai Peng, Sergei Mironov, Romain Moyard.
Release 0.3.0¶
New features
Catalyst now officially supports macOS ARM devices, such as Apple M1/M2 machines, with macOS binary wheels available on PyPI. For more details on the changes involved to support macOS, please see the improvements section. (#229) (#232) (#233) (#234)
Write Catalyst-compatible programs with native Python conditional statements. (#235)
AutoGraph is a new, experimental feature that automatically converts Python conditional statements like if, else, and elif, into their equivalent functional forms provided by Catalyst (such as catalyst.cond).
This feature is currently opt-in, and requires setting the autograph=True flag in the qjit decorator:

dev = qml.device("lightning.qubit", wires=1)

@qjit(autograph=True)
@qml.qnode(dev)
def f(x):
    if x < 0.5:
        qml.RY(jnp.sin(x), wires=0)
    else:
        qml.RX(jnp.cos(x), wires=0)

    return qml.expval(qml.PauliZ(0))

The implementation is based on the AutoGraph module from TensorFlow, and requires a working TensorFlow installation be available. In addition, Python loops (for and while) are not yet supported, and do not work in AutoGraph mode.
Note that there are some caveats when using this feature, especially around the use of global variables or object mutation inside of methods. A functional style is always recommended when using qjit or AutoGraph.
The quantum adjoint operation can now be used in conjunction with Catalyst control flow, such as loops and conditionals. For this purpose a new instruction, catalyst.adjoint, has been added. (#220)
catalyst.adjoint can wrap around quantum functions which contain the Catalyst cond, for_loop, and while_loop primitives. Previously, the usage of qml.adjoint on functions with these primitives would result in decomposition errors. Note that a future release of Catalyst will merge the behaviour of catalyst.adjoint into qml.adjoint for convenience.

dev = qml.device("lightning.qubit", wires=3)

@qjit
@qml.qnode(dev)
def circuit(x):

    @for_loop(0, 3, 1)
    def repeat_rx(i):
        qml.RX(x / 2, wires=i)

    adjoint(repeat_rx)()

    return qml.expval(qml.PauliZ(0))

>>> circuit(0.2)
array(0.99500417)
Additionally, the ability to natively represent the adjoint construct in Catalyst’s program representation (IR) was added.
QJIT-compiled programs now support (nested) container types as inputs and outputs of compiled functions. This includes lists and dictionaries, as well as any data structure implementing the PyTree protocol. (#215) (#221)
For example, a program that accepts and returns a mix of dictionaries, lists, and tuples:
@qjit
def workflow(params1, params2):
    res1 = params1["a"][0][0] + params2[1]
    return {"y1": jnp.sin(res1), "y2": jnp.cos(res1)}

>>> params1 = {"a": [[0.1], 0.2]}
>>> params2 = (0.6, 0.8)
>>> workflow(params1, params2)
{'y1': array(0.78332691), 'y2': array(0.62160997)}
Compile-time backpropagation of arbitrary hybrid programs is now supported, via integration with Enzyme AD. (#158) (#193) (#224) (#225) (#239) (#244)
This allows catalyst.grad to differentiate hybrid functions that contain both classical pre-processing (inside & outside of QNodes), QNodes, as well as classical post-processing (outside of QNodes) via a combination of backpropagation and quantum gradient methods.
The default for the differentiation method attribute in catalyst.grad has been changed to "auto", which performs Enzyme-based reverse mode AD on classical code, in conjunction with the quantum diff_method specified on each QNode:

dev = qml.device("lightning.qubit", wires=1)

@qml.qnode(dev, diff_method="parameter-shift")
def circuit(theta):
    qml.RX(jnp.exp(theta ** 2) / jnp.cos(theta / 4), wires=0)
    return qml.expval(qml.PauliZ(wires=0))

>>> grad = qjit(catalyst.grad(circuit, method="auto"))
>>> grad(jnp.pi)
array(0.05938718)
The reworked differentiation pipeline means you can now compute exact derivatives of programs with both classical pre- and post-processing, as shown below:
@qml.qnode(qml.device("lightning.qubit", wires=1), diff_method="adjoint")
def circuit(theta):
    qml.RX(jnp.exp(theta ** 2) / jnp.cos(theta / 4), wires=0)
    return qml.expval(qml.PauliZ(wires=0))

def loss(theta):
    return jnp.pi / jnp.tanh(circuit(theta))

@qjit
def grad_loss(theta):
    return catalyst.grad(loss)(theta)

>>> grad_loss(1.0)
array(-1.90958669)
You can also use multiple QNodes with different differentiation methods:
@qml.qnode(qml.device("lightning.qubit", wires=1), diff_method="parameter-shift")
def circuit_A(params):
    qml.RX(jnp.exp(params[0] ** 2) / jnp.cos(params[1] / 4), wires=0)
    return qml.probs()

@qml.qnode(qml.device("lightning.qubit", wires=1), diff_method="adjoint")
def circuit_B(params):
    qml.RX(jnp.exp(params[1] ** 2) / jnp.cos(params[0] / 4), wires=0)
    return qml.expval(qml.PauliZ(wires=0))

def loss(params):
    return jnp.prod(circuit_A(params)) + circuit_B(params)

@qjit
def grad_loss(theta):
    return catalyst.grad(loss)(theta)

>>> grad_loss(jnp.array([1.0, 2.0]))
array([ 0.57367285, 44.4911605 ])
And you can differentiate purely classical functions as well:
def square(x: float):
    return x ** 2

@qjit
def dsquare(x: float):
    return catalyst.grad(square)(x)

>>> dsquare(2.3)
array(4.6)
Note that the current implementation of reverse mode AD is restricted to 1st order derivatives, but catalyst.grad(method="fd") is still available to perform a finite differences approximation of any differentiable function.
Add support for the new PennyLane arithmetic operators. (#250)
PennyLane is in the process of replacing Hamiltonian and Tensor observables with a set of general arithmetic operators. These consist of Prod, Sum and SProd.
By default, using dunder methods (e.g. +, -, @, *) to combine operators with scalars or other operators will create Hamiltonian and Tensor objects. However, these two classes will be deprecated in coming releases of PennyLane.
To enable the new arithmetic operators, one can use Prod, Sum, and SProd directly or activate them by calling enable_new_opmath at the beginning of your PennyLane program.

dev = qml.device("lightning.qubit", wires=2)

@qjit
@qml.qnode(dev)
def circuit(x: float, y: float):
    qml.RX(x, wires=0)
    qml.RX(y, wires=1)
    qml.CNOT(wires=[0, 1])
    return qml.expval(0.2 * qml.PauliX(wires=0) - 0.4 * qml.PauliY(wires=1))

>>> qml.operation.enable_new_opmath()
>>> qml.operation.active_new_opmath()
True
>>> circuit(np.pi / 4, np.pi / 2)
array(0.28284271)
Improvements
Better support for Hamiltonian observables:
Allow Hamiltonian observables with integer coefficients. (#248)
For example, compiling the following circuit wasn’t previously allowed, but is now supported in Catalyst:
dev = qml.device("lightning.qubit", wires=2)

@qjit
@qml.qnode(dev)
def circuit(x: float, y: float):
    qml.RX(x, wires=0)
    qml.RY(y, wires=1)

    coeffs = [1, 2]
    obs = [qml.PauliZ(0), qml.PauliZ(1)]
    return qml.expval(qml.Hamiltonian(coeffs, obs))

Allow nested Hamiltonian observables. (#255)
@qjit
@qml.qnode(qml.device("lightning.qubit", wires=3))
def circuit(x, y, coeffs1, coeffs2):
    qml.RX(x, wires=0)
    qml.RX(y, wires=1)
    qml.RY(x + y, wires=2)

    obs = [
        qml.PauliX(0) @ qml.PauliZ(1),
        qml.Hamiltonian(coeffs1, [qml.PauliZ(0) @ qml.Hadamard(2)]),
    ]

    return qml.var(qml.Hamiltonian(coeffs2, obs))

Various performance improvements:
The execution and compile time of programs has been reduced, by generating more efficient code and avoiding unnecessary optimizations. Specifically, a scalarization procedure was added to the MLIR pass pipeline, and LLVM IR compilation is now invoked with optimization level 0. (#217)
The execution time of compiled functions has been improved in the frontend. (#213)
Specifically, the following changes have been made, which leads to a small but measurable improvement when using larger matrices as inputs, or functions with many inputs:
only loading the user program library once per compilation,
generating return value types only once per compilation,
avoiding unnecessary type promotion, and
avoiding unnecessary array copies.
Peak memory utilization of a JIT compiled program has been reduced, by allowing tensors to be scheduled for deallocation. Previously, the tensors were not deallocated until the end of the call to the JIT compiled function. (#201)
Various improvements have been made to enable Catalyst to compile on macOS:
Remove unnecessary reinterpret_cast from ObsManager. Removing these reinterpret_cast calls allows compilation of the runtime to succeed on macOS. macOS uses an ILP32 mode for Aarch64 where they use the full 64 bit mode but with 32 bit Integer, Long, and Pointers. This patch also changes a test file to prevent a mismatch in machines which compile using ILP32 mode. (#229)
Allow runtime to be compiled on macOS. Substitute nproc with a call to os.cpu_count() and use correct flags for ld.64. (#232)
Improve portability on the frontend to be available on macOS. Use .dylib, remove unnecessary flags, and address behaviour difference in flags. (#233)
Small compatibility changes in order for all integration tests to succeed on macOS. (#234)
Dialects can compile with older versions of clang by avoiding type mismatches. (#228)
The runtime is now built against qir-stdlib pre-built artifacts. (#236)
Small improvements have been made to the CI/CD, including fixing the Enzyme cache, generalizing caches to other operating systems, fixing the build wheel recipe, and removing references to QIR in the runtime’s Makefile. (#243) (#247)
Breaking changes
Support for Python 3.8 has been removed. (#231)
The default differentiation method on grad and jacobian is reverse-mode automatic differentiation instead of finite differences. When a QNode does not have a diff_method specified, it will default to using the parameter shift method instead of finite-differences. (#244) (#271)
The JAX version used by Catalyst has been updated to v0.4.14; the minimum PennyLane version required is now v0.32. (#264)
Due to the change allowing Python container objects as inputs to QJIT-compiled functions, Python lists are no longer automatically converted to JAX arrays. (#231)
This means that indexing on lists when the index is not static will cause a TracerIntegerConversionError, consistent with JAX’s behaviour.
That is, the following example is no longer supported:

@qjit
def f(x: list, index: int):
    return x[index]

However, if the parameter x above is a JAX or NumPy array, the compilation will continue to succeed.
The catalyst.grad function has been renamed to catalyst.jacobian and supports differentiation of functions that return multiple or non-scalar outputs. A new catalyst.grad function has been added that enforces that it is differentiating a function with a single scalar return value. (#254)
Bug fixes
Fixed an issue preventing the differentiation of qml.probs with the parameter-shift method. (#211)
Fixed the incorrect return value data-type with functions returning qml.counts. (#221)
Fix segmentation fault when differentiating a function where a quantum measurement is used multiple times by the same operation. (#242)
Contributors
This release contains contributions from (in alphabetical order):
Ali Asadi, David Ittah, Erick Ochoa Lopez, Jacob Mai Peng, Romain Moyard, Sergei Mironov.
Release 0.2.1¶
Bug fixes
Add missing OpenQASM backend in binary distribution, which relies on the latest version of the AWS Braket plugin for PennyLane to resolve dependency issues between the plugin, Catalyst, and PennyLane. The Lightning-Kokkos backend with Serial and OpenMP modes is also added to the binary distribution. #198
Return a list of decompositions when calling the decomposition method for control operations. This allows Catalyst to be compatible with upstream PennyLane. #241
Improvements
When using OpenQASM-based devices the string representation of the circuit is printed on exception. #199
Use pybind11::module interface library instead of pybind11::embed in the runtime for OpenQasm backend to avoid linking to the python library at compile time. #200
Contributors
This release contains contributions from (in alphabetical order):
Ali Asadi, David Ittah.
Release 0.2.0¶
New features
Catalyst programs can now be used inside of a larger JAX workflow which uses JIT compilation, automatic differentiation, and other JAX transforms. #96 #123 #167 #192
For example, call a Catalyst qjit-compiled function from within a JAX jit-compiled function:
dev = qml.device("lightning.qubit", wires=1)

@qjit
@qml.qnode(dev)
def circuit(x):
    qml.RX(jnp.pi * x[0], wires=0)
    qml.RY(x[1] ** 2, wires=0)
    qml.RX(x[1] * x[2], wires=0)
    return qml.probs(wires=0)

@jax.jit
def cost_fn(weights):
    x = jnp.sin(weights)
    return jnp.sum(jnp.cos(circuit(x)) ** 2)

>>> cost_fn(jnp.array([0.1, 0.2, 0.3]))
Array(1.32269195, dtype=float64)
Catalyst-compiled functions can now also be automatically differentiated via JAX, both in forward and reverse mode to first-order,

>>> jax.grad(cost_fn)(jnp.array([0.1, 0.2, 0.3]))
Array([0.49249037, 0.05197949, 0.02991883], dtype=float64)

as well as vectorized using jax.vmap:

>>> jax.vmap(cost_fn)(jnp.array([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]))
Array([1.32269195, 1.53905377], dtype=float64)

In particular, this allows for a reduction in boilerplate when using JAX-compatible optimizers such as jaxopt:

>>> opt = jaxopt.GradientDescent(cost_fn)
>>> params = jnp.array([0.1, 0.2, 0.3])
>>> (final_params, _) = jax.jit(opt.run)(params)
>>> final_params
Array([-0.00320799, 0.03475223, 0.29362844], dtype=float64)

Note that, in general, best performance will be seen when the Catalyst @qjit decorator is used to JIT the entire hybrid workflow. However, there may be cases where you may want to delegate only the quantum part of your workflow to Catalyst, and let JAX handle classical components (for example, due to missing a feature or compatibility issue in Catalyst).
Support for Amazon Braket devices provided via the PennyLane-Braket plugin. #118 #139 #179 #180
This enables quantum subprograms within a JIT-compiled Catalyst workflow to execute on Braket simulator and hardware devices, including remote cloud-based simulators such as SV1.
def circuit(x, y):
    qml.RX(y * x, wires=0)
    qml.RX(x * 2, wires=1)
    return qml.expval(qml.PauliY(0) @ qml.PauliZ(1))

@qjit
def workflow(x: float, y: float):
    device = qml.device("braket.local.qubit", backend="braket_sv", wires=2)
    g = qml.qnode(device)(circuit)
    h = catalyst.grad(g)
    return h(x, y)

workflow(1.0, 2.0)

For a list of available devices, please see the PennyLane-Braket documentation.
Internally, the quantum instructions generate OpenQASM3 kernels at runtime; these are then executed on both local (braket.local.qubit) and remote (braket.aws.qubit) devices backed by the Amazon Braket Python SDK, with measurement results then propagated back to the frontend.
Note that at initial release, not all Catalyst features are supported with Braket. In particular, dynamic circuit features, such as mid-circuit measurements, will not work with Braket devices.
Catalyst conditional functions defined via @catalyst.cond now support an arbitrary number of ‘else if’ chains. #104

dev = qml.device("lightning.qubit", wires=1)

@qjit
@qml.qnode(dev)
def circuit(x):

    @catalyst.cond(x > 2.7)
    def cond_fn():
        qml.RX(x, wires=0)

    @cond_fn.else_if(x > 1.4)
    def cond_elif():
        qml.RY(x, wires=0)

    @cond_fn.otherwise
    def cond_else():
        qml.RX(x ** 2, wires=0)

    cond_fn()

    return qml.probs(wires=0)

Iterating in reverse is now supported with constant negative step sizes via catalyst.for_loop. #129

dev = qml.device("lightning.qubit", wires=1)

@qjit
@qml.qnode(dev)
def circuit(n):

    @catalyst.for_loop(n, 0, -1)
    def loop_fn(_):
        qml.PauliX(0)

    loop_fn()
    return measure(0)

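Assuming catalyst.for_loop(lower, upper, step) follows the same bound convention as Python's range(lower, upper, step), the iteration order of the circuit above can be sketched in plain Python. The helper `reverse_loop_parity` is hypothetical; it only records the visited indices and tracks a classical bit that is flipped once per iteration, standing in for qml.PauliX(0) acting on a qubit that starts in the zero state.

```python
def reverse_loop_parity(n):
    # Walk from n down to 1 (the upper bound 0 is excluded, as with range).
    bit = 0
    visited = []
    for i in range(n, 0, -1):
        visited.append(i)
        bit ^= 1  # one flip per iteration
    return visited, bit

visited, bit = reverse_loop_parity(3)
```

Under this assumption the loop body runs n times, so the final bit is simply the parity of n.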
Additional gradient transforms for computing the vector-Jacobian product (VJP) and Jacobian-vector product (JVP) are now available in Catalyst. #98
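As a reminder of what the two products mean, here is a NumPy-only sketch that does not call Catalyst. It uses the same function and inputs as the Catalyst examples that follow, with the Jacobian written out by hand; the helper name `jacobian` is an assumption made for illustration.

```python
import numpy as np

def f(x):
    return np.array([np.sin(x[0]), x[1] ** 2, x[0] * x[1]])

def jacobian(x):
    # Analytic Jacobian of f with shape (3, 2): rows are outputs, columns are inputs.
    return np.array([
        [np.cos(x[0]), 0.0],
        [0.0, 2 * x[1]],
        [x[1], x[0]],
    ])

x = np.array([0.1, 0.2])
cotangent = np.array([-0.5, 0.1, 0.3])
tangent = np.array([0.3, 0.6])

vjp_value = cotangent @ jacobian(x)   # vector-Jacobian product, shape (2,)
jvp_value = jacobian(x) @ tangent     # Jacobian-vector product, shape (3,)
```

The VJP contracts the cotangent with the output axis of the Jacobian, while the JVP contracts the tangent with the input axis.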
Use catalyst.vjp to compute the forward-pass value and VJP:

@qjit
def vjp(params, cotangent):
    def f(x):
        y = [jnp.sin(x[0]), x[1] ** 2, x[0] * x[1]]
        return jnp.stack(y)

    return catalyst.vjp(f, [params], [cotangent])

>>> x = jnp.array([0.1, 0.2])
>>> dy = jnp.array([-0.5, 0.1, 0.3])
>>> vjp(x, dy)
[array([0.09983342, 0.04 , 0.02 ]), array([-0.43750208, 0.07000001])]
Use catalyst.jvp to compute the forward-pass value and JVP:

@qjit
def jvp(params, tangent):
    def f(x):
        y = [jnp.sin(x[0]), x[1] ** 2, x[0] * x[1]]
        return jnp.stack(y)

    return catalyst.jvp(f, [params], [tangent])

>>> x = jnp.array([0.1, 0.2])
>>> tangent = jnp.array([0.3, 0.6])
>>> jvp(x, tangent)
[array([0.09983342, 0.04 , 0.02 ]), array([0.29850125, 0.24000006, 0.12 ])]
Support for multiple backend devices within a single qjit-compiled function is now available. #86 #89
For example, if you compile the Catalyst runtime with lightning.kokkos support (via the compilation flag ENABLE_LIGHTNING_KOKKOS=ON), you can use lightning.qubit and lightning.kokkos within a single workflow:

dev1 = qml.device("lightning.qubit", wires=1)
dev2 = qml.device("lightning.kokkos", wires=1)

@qml.qnode(dev1)
def circuit1(x):
    qml.RX(jnp.pi * x[0], wires=0)
    qml.RY(x[1] ** 2, wires=0)
    qml.RX(x[1] * x[2], wires=0)
    return qml.var(qml.PauliZ(0))

@qml.qnode(dev2)
def circuit2(x):

    @catalyst.cond(x > 2.7)
    def cond_fn():
        qml.RX(x, wires=0)

    @cond_fn.otherwise
    def cond_else():
        qml.RX(x ** 2, wires=0)

    cond_fn()

    return qml.probs(wires=0)

@qjit
def cost(x):
    return circuit2(circuit1(x))

>>> x = jnp.array([0.54, 0.31])
>>> cost(x)
array([0.80842369, 0.19157631])
Support for returning the variance of Hamiltonians, Hermitian matrices, and Tensors via qml.var has been added. #124

dev = qml.device("lightning.qubit", wires=2)

@qjit
@qml.qnode(dev)
def circuit(x):
    qml.RX(jnp.pi * x[0], wires=0)
    qml.RY(x[1] ** 2, wires=1)
    qml.CNOT(wires=[0, 1])
    qml.RX(x[1] * x[2], wires=0)
    return qml.var(qml.PauliZ(0) @ qml.PauliX(1))

>>> x = jnp.array([0.54, 0.31])
>>> circuit(x)
array(0.98851544)
Breaking changes
The catalyst.grad function now supports using the differentiation method defined on the QNode (via the diff_method argument) rather than applying a global differentiation method. #163
As part of this change, the method argument now accepts the following options:
method="auto": Quantum components of the hybrid function are differentiated according to the corresponding QNode diff_method, while the classical computation is differentiated using traditional auto-diff.
With this strategy, Catalyst currently supports only QNodes with diff_method="parameter-shift" and diff_method="adjoint".
method="fd": First-order finite-differences for the entire hybrid function. The diff_method argument for each QNode is ignored.
This is an intermediate step towards differentiating functions that internally call multiple QNodes, and towards supporting differentiation of classical postprocessing.
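The difference between method="auto" and method="fd" can be illustrated without Catalyst. In the NumPy sketch below, which is only an illustration under simplifying assumptions, the expectation value of PauliZ after RX(phi) on the zero state is written analytically as cos(phi). The "auto"-style gradient differentiates the quantum part with the two-term parameter-shift rule and the classical pre-processing by hand (in place of auto-diff), then combines them with the chain rule. The "fd"-style gradient applies a first-order finite difference to the whole hybrid function. All function names and the step size `h` are hypothetical choices, not Catalyst defaults.

```python
import numpy as np

def expval_z(phi):
    # <Z> after RX(phi) on |0>, written analytically instead of simulating a device.
    return np.cos(phi)

def preprocess(theta):
    # Hypothetical classical pre-processing of the gate angle.
    return theta ** 2

def hybrid(theta):
    return expval_z(preprocess(theta))

def grad_auto_style(theta):
    # Quantum part: two-term parameter-shift rule with shifts of +/- pi/2.
    phi = preprocess(theta)
    quantum = (expval_z(phi + np.pi / 2) - expval_z(phi - np.pi / 2)) / 2
    # Classical part: derivative of theta ** 2, written by hand in place of auto-diff.
    classical = 2 * theta
    return quantum * classical

def grad_fd_style(theta, h=1e-7):
    # First-order finite difference over the entire hybrid function.
    return (hybrid(theta + h) - hybrid(theta)) / h

theta = 0.7
exact = -np.sin(theta ** 2) * 2 * theta
auto_value = grad_auto_style(theta)
fd_value = grad_fd_style(theta)
```

In this toy setting the parameter-shift result agrees with the analytic derivative to floating-point precision, whereas the finite-difference result carries a truncation error that depends on `h`.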
Improvements
Catalyst has been upgraded to work with JAX v0.4.13. #143 #185
Add a Backprop operation for using autodifferentiation (AD) at the LLVM level with Enzyme AD. The Backprop operation has a bufferization pattern and a lowering to LLVM. #107 #116
Error handling has been improved. The runtime now throws more descriptive and unified exceptions for runtime errors and assertions. #92
In preparation for easier debugging, the compiler has been refactored to allow easy prototyping of new compilation pipelines. #38
In the future, this will allow the ability to generate MLIR or LLVM-IR by loading input from a string or file, rather than generating it from Python.
As part of this refactor, the following changes were made:
Passes are now classes. This allows developers/users looking to change flags to inherit from these passes and change the flags.
Passes are now passed as arguments to the compiler. Custom passes can just be passed to the compiler as an argument, as long as they implement a run method which takes an input and the output of this method can be fed to the next pass.
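The two points above can be sketched as follows. This is a hypothetical illustration of the design, not Catalyst's actual classes: the names `Pass`, `OptimizePass`, `LowerPass`, `DebugLowerPass`, `run_pipeline`, and every flag string are invented for the example, and the "program" is just a list that records which pass ran with which flags.

```python
class Pass:
    # Hypothetical base class: a pass carries flags and exposes a run method.
    flags = []

    def run(self, program):
        # Record the pass name and flags; a real pass would transform the program.
        return program + [(type(self).__name__, tuple(self.flags))]

class OptimizePass(Pass):
    flags = ["--example-opt"]

class LowerPass(Pass):
    flags = ["--example-lower"]

class DebugLowerPass(LowerPass):
    # Changing flags by inheriting from an existing pass.
    flags = LowerPass.flags + ["--example-debug"]

def run_pipeline(program, passes):
    # The output of each pass is fed to the next one.
    for compiler_pass in passes:
        program = compiler_pass.run(program)
    return program

result = run_pipeline([], [OptimizePass(), DebugLowerPass()])
```

Any object with a compatible `run` method could be placed in the list, which is what makes prototyping a new pipeline a matter of passing a different argument.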
Improved Python compatibility by providing a stable signature for user generated functions. #106
Handle C++ exceptions without unwinding the whole stack. #99
Reduce the number of classical invocations by counting the number of gate parameters in the argmap function. #136
Prior to this, the computation of hybrid gradients executed all of the classical code being differentiated in a pcount function that solely counted the number of gate parameters in the quantum circuit. This was so argmap and other downstream functions could allocate memrefs large enough to store all gate parameters.
Now, instead of counting the number of parameters separately, a dynamically-resizable array is used in the argmap function directly to store the gate parameters. This removes one invocation of all of the classical code being differentiated.
Use Tablegen to define MLIR passes instead of C++ to reduce overhead of adding new passes. #157
Perform constant folding on wire indices for quantum.insert and quantum.extract ops, used when writing (resp. reading) qubits to (resp. from) quantum registers. #161
Represent known named observables as members of an MLIR Enum rather than a raw integer. This improves IR readability. #165
Bug fixes
Fix a bug in the mapping from logical to concrete qubits for mid-circuit measurements. #80
Fix a bug in the way gradient result type is inferred. #84
Fix a memory regression and reduce memory footprint by removing unnecessary temporary buffers. #100
Provide a new abstraction to the QuantumDevice interface in the runtime called DataView. C++ implementations of the interface can iterate through and directly store results into the DataView independent of the underlying memory layout. This can eliminate redundant buffer copies at the interface boundaries, which has been applied to existing devices. #109
Reduce memory utilization by transferring ownership of buffers from the runtime to Python instead of copying them. This includes adding a compiler pass that copies global buffers into the heap as global buffers cannot be transferred to Python. #112
Temporary fix of use-after-free and dependency of uninitialized memory. #121
Fix file renaming within pass pipelines. #126
Fix the issue with the do_queue deprecation warnings in PennyLane. #146
Fix the issue with gradients failing to work with hybrid functions that contain constant jnp.array objects. This will enable PennyLane operators that have data in the form of a jnp.array, such as a Hamiltonian, to be included in a qjit-compiled function. #152
An example of a newly supported workflow:

coeffs = jnp.array([0.1, 0.2])
terms = [qml.PauliX(0) @ qml.PauliZ(1), qml.PauliZ(0)]
H = qml.Hamiltonian(coeffs, terms)

@qjit
@qml.qnode(qml.device("lightning.qubit", wires=2))
def circuit(x):
    qml.RX(x[0], wires=0)
    qml.RY(x[1], wires=0)
    qml.CNOT(wires=[0, 1])
    return qml.expval(H)

params = jnp.array([0.3, 0.4])
jax.grad(circuit)(params)

Contributors
This release contains contributions from (in alphabetical order):
Ali Asadi, David Ittah, Erick Ochoa Lopez, Jacob Mai Peng, Romain Moyard, Sergei Mironov.
Release 0.1.2¶
New features
Add an option to print verbose messages explaining the compilation process. #68
Allow catalyst.grad to be used on any traceable function (within a qjit context). This means the operation is no longer restricted to acting on qml.qnodes only. #75
Improvements
Work in progress on a Lightning-Kokkos backend:
Bring feature parity to the Lightning-Kokkos backend simulator. #55
Add support for variance measurements for all observables. #70
Build the runtime against qir-stdlib v0.1.0. #58
Replace input-checking assertions with exceptions. #67
Perform function inlining to improve optimizations and memory management within the compiler. #72
Breaking changes
Bug fixes
Several fixes to address memory leaks in the compiled program:
Fix memory leaks from data that flows back into the Python environment. #54
Fix memory leaks resulting from partial bufferization at the MLIR level. This fix makes the necessary changes to reintroduce the -buffer-deallocation pass into the MLIR pass pipeline. The pass guarantees that all allocations contained within a function (that is, allocations that are not returned from a function) are also deallocated. #61
Lift heap allocations for quantum op results from the runtime into the MLIR compiler core. This allows all memref buffers to be memory managed in MLIR using the MLIR bufferization infrastructure. #63
Eliminate all memory leaks by tracking memory allocations at runtime. Memory allocations that are still alive when the compiled function terminates will be freed in the finalization / teardown function. #78
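The idea of tracking allocations and releasing the survivors at teardown can be sketched in a few lines of Python. This is a conceptual stand-in only: the class `AllocationTracker` and its methods are hypothetical and do not reflect the runtime's actual C++ interface.

```python
class AllocationTracker:
    # Hypothetical bookkeeping of buffers that are currently alive.
    def __init__(self):
        self.live = {}
        self.next_handle = 0

    def allocate(self, size):
        # Register a new buffer and hand back an opaque handle.
        handle = self.next_handle
        self.next_handle += 1
        self.live[handle] = bytearray(size)
        return handle

    def free(self, handle):
        # Explicit deallocation during normal execution.
        self.live.pop(handle, None)

    def teardown(self):
        # Release everything still alive; report how many buffers that was.
        released = len(self.live)
        self.live.clear()
        return released

tracker = AllocationTracker()
first = tracker.allocate(16)
second = tracker.allocate(32)
third = tracker.allocate(64)
tracker.free(second)
released = tracker.teardown()
```

Only the buffers that were never freed explicitly are reclaimed by `teardown`, which mirrors the role of the finalization function described above.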
Fix returning complex scalars from the compiled function. #77
Contributors
This release contains contributions from (in alphabetical order):
Ali Asadi, David Ittah, Erick Ochoa Lopez, Sergei Mironov.
Release 0.1.1¶
New features
Adds support for interpreting control flow operations. #31
Improvements
Adds fallback compiler drivers to increase reliability during linking phase. Also adds support for a CATALYST_CC environment variable for manual specification of the compiler driver used for linking. #30
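One way such a lookup could work is sketched below. This is an assumption-laden illustration, not the logic Catalyst ships: the fallback list `FALLBACK_COMPILERS` and the helpers `candidate_compilers` and `find_compiler` are hypothetical; only the CATALYST_CC variable name comes from the note above.

```python
import os
import shutil

# Hypothetical fallback order; the actual list used by Catalyst is not stated here.
FALLBACK_COMPILERS = ["clang", "gcc", "cc"]

def candidate_compilers(environ=None):
    # The user's CATALYST_CC choice, if set, is tried before the fallbacks.
    environ = os.environ if environ is None else environ
    preferred = environ.get("CATALYST_CC")
    candidates = [preferred] if preferred else []
    candidates += [name for name in FALLBACK_COMPILERS if name != preferred]
    return candidates

def find_compiler(environ=None):
    # Return the first candidate found on PATH, or None if none is available.
    for name in candidate_compilers(environ):
        if shutil.which(name):
            return name
    return None

order = candidate_compilers({"CATALYST_CC": "my-cc"})
default_order = candidate_compilers({})
```

Trying several drivers in turn means that a missing or broken first choice does not immediately abort the linking phase.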
Breaking changes
Bug fixes
Fixes the Catalyst image path in the readme to properly render on PyPI.
Contributors
This release contains contributions from (in alphabetical order):
Ali Asadi, Erick Ochoa Lopez.
Release 0.1.0¶
Initial public release.
Contributors
This release contains contributions from (in alphabetical order):
Ali Asadi, Sam Banning, David Ittah, Josh Izaac, Erick Ochoa Lopez, Sergei Mironov, Isidor Schoch.