Release notes
This page contains the release notes for Catalyst.
Release 0.9.0 (current release)
New features
Catalyst now supports the specification of shot-vectors when used with qml.sample measurements on the lightning.qubit device. (#1051)

Shot-vectors allow shots to be specified as a list of shots, [20, 1, 100], or as a tuple of the form ((num_shots, repetitions), ...) such that ((20, 3), (1, 100)) is equivalent to shots=[20, 20, 20, 1, 1, ..., 1].

This can result in more efficient quantum execution, as a single job representing the total number of shots is executed on the quantum device, with the measurement post-processing then coarse-grained with respect to the shot-vector.

For example,

    dev = qml.device("lightning.qubit", wires=1, shots=((5, 2), 7))

    @qjit
    @qml.qnode(dev)
    def circuit():
        qml.Hadamard(0)
        return qml.sample()

    >>> circuit()
    (Array([[0], [1], [0], [1], [1]], dtype=int64),
     Array([[0], [1], [1], [0], [1]], dtype=int64),
     Array([[1], [0], [1], [1], [0], [1], [0]], dtype=int64))
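The shot-vector notation can be illustrated without any quantum device. The following pure-Python sketch expands the ((num_shots, repetitions), ...) form into a flat list and splits a single batch of samples accordingly. The helper names expand_shot_vector and partition_samples are hypothetical and are not part of Catalyst or PennyLane; the sketch only mirrors the coarse-graining idea described above.

```python
from itertools import accumulate


def expand_shot_vector(shots):
    """Flatten entries such as 7 or (5, 2) into a list such as [5, 5, 7]."""
    flat = []
    for entry in shots:
        if isinstance(entry, int):
            flat.append(entry)
        else:
            num_shots, repetitions = entry
            flat.extend([num_shots] * repetitions)
    return flat


def partition_samples(samples, shots):
    """Split one batch of samples into chunks matching the shot-vector."""
    sizes = expand_shot_vector(shots)
    if len(samples) != sum(sizes):
        raise ValueError("number of samples must equal the total number of shots")
    ends = list(accumulate(sizes))
    starts = [0] + ends[:-1]
    return [samples[a:b] for a, b in zip(starts, ends)]


flat = expand_shot_vector(((5, 2), 7))
# 17 placeholder samples stand in for the results of a single 17-shot job
chunks = partition_samples(list(range(17)), ((5, 2), 7))
print(flat)                      # [5, 5, 7]
print([len(c) for c in chunks])  # [5, 5, 7]
```

The chunk sizes match the shapes of the three arrays returned by the circuit above.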
Note that other measurement types, such as expval and probs, currently do not support shot-vectors.

A new function catalyst.pipeline allows the quantum-circuit-transformation pass pipeline for QNodes within a qjit-compiled workflow to be configured. (#1131) (#1240)

    import jax.numpy as jnp
    import pennylane as qml
    from catalyst import pipeline, qjit

    my_passes = {
        "cancel_inverses": {},
        "my_circuit_transformation_pass": {"my-option" : "my-option-value"},
    }

    dev = qml.device("lightning.qubit", wires=2)

    @pipeline(my_passes)
    @qml.qnode(dev)
    def circuit(x):
        qml.RX(x, wires=0)
        return qml.expval(qml.PauliZ(0))

    @qjit
    def fn(x):
        return jnp.sin(circuit(x ** 2))

pipeline can also be used to specify different pass pipelines for different parts of the same qjit-compiled workflow:

    my_pipeline = {
        "cancel_inverses": {},
        "my_circuit_transformation_pass": {"my-option" : "my-option-value"},
    }

    my_other_pipeline = {"cancel_inverses": {}}

    @qjit
    def fn(x):
        circuit_pipeline = pipeline(my_pipeline)(circuit)
        circuit_other = pipeline(my_other_pipeline)(circuit)
        return jnp.abs(circuit_pipeline(x) - circuit_other(x))
The pass pipeline order and options can be configured globally for a qjit-compiled function, by using the circuit_transform_pipeline argument of the qjit() decorator.

    my_passes = {
        "cancel_inverses": {},
        "my_circuit_transformation_pass": {"my-option" : "my-option-value"},
    }

    @qjit(circuit_transform_pipeline=my_passes)
    def fn(x):
        return jnp.sin(circuit(x ** 2))

Global and local (via @pipeline) configurations can coexist; however, local pass pipelines will always take precedence over global pass pipelines.

The available MLIR passes are listed and documented in the passes module documentation.
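The precedence rule can be pictured with a small pure-Python sketch. The function resolve_pipeline is hypothetical and not a Catalyst API, and it assumes that a local pipeline replaces the global one entirely rather than being merged with it; the release notes only state that the local configuration takes precedence.

```python
def resolve_pipeline(global_passes=None, local_passes=None):
    """Return the pass pipeline that applies to a single QNode.

    Assumption for illustration: a local pipeline, when present,
    fully replaces the global pipeline.
    """
    if local_passes is not None:
        return dict(local_passes)
    return dict(global_passes or {})


global_passes = {"cancel_inverses": {}, "merge_rotations": {}}
local_passes = {"merge_rotations": {}}

# A QNode with its own @pipeline configuration uses the local passes.
print(list(resolve_pipeline(global_passes, local_passes)))  # ['merge_rotations']
# A QNode without a local configuration falls back to the global passes,
# in the order in which the dictionary was written.
print(list(resolve_pipeline(global_passes)))  # ['cancel_inverses', 'merge_rotations']
```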
A peephole merge rotations pass, which acts similarly to the Python-based PennyLane merge rotations transform, is now available in MLIR and can be applied to QNodes within a qjit-compiled function. (#1162) (#1205) (#1206)

The merge_rotations pass can be provided to the catalyst.pipeline decorator:

    import pennylane as qml
    from catalyst import pipeline, qjit

    my_passes = {
        "merge_rotations": {}
    }

    dev = qml.device("lightning.qubit", wires=1)

    @qjit
    @pipeline(my_passes)
    @qml.qnode(dev)
    def g(x: float):
        qml.RX(x, wires=0)
        qml.RX(x, wires=0)
        qml.Hadamard(wires=0)
        return qml.expval(qml.PauliX(0))

It can also be applied directly to qjit-compiled QNodes via the catalyst.passes.merge_rotations Python decorator:

    from catalyst.passes import merge_rotations

    @qjit
    @merge_rotations
    @qml.qnode(dev)
    def g(x: float):
        qml.RX(x, wires=0)
        qml.RX(x, wires=0)
        qml.Hadamard(wires=0)
        return qml.expval(qml.PauliX(0))
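As a rough picture of what a merge-rotations peephole optimization does, the following pure-Python sketch merges consecutive rotations of the same kind on the same wire by adding their angles. It operates on a simple list of (name, wire, angle) tuples, which is an assumption made for illustration; the actual pass works on MLIR and is not implemented this way.

```python
def merge_adjacent_rotations(ops):
    """Merge consecutive rotations of the same kind acting on the same wire.

    Each operation is a (name, wire, angle) tuple; angle is None for
    gates without a parameter.
    """
    rotation_gates = {"RX", "RY", "RZ"}
    merged = []
    for name, wire, angle in ops:
        if merged:
            prev_name, prev_wire, prev_angle = merged[-1]
            if name in rotation_gates and (name, wire) == (prev_name, prev_wire):
                # Two rotations about the same axis compose by adding angles.
                merged[-1] = (name, wire, prev_angle + angle)
                continue
        merged.append((name, wire, angle))
    return merged


# The same gate sequence as in the circuit g above, with x = 0.3
circuit_ops = [("RX", 0, 0.3), ("RX", 0, 0.3), ("Hadamard", 0, None)]
optimized_ops = merge_adjacent_rotations(circuit_ops)
print(optimized_ops)
```

The two RX gates collapse into a single RX with twice the angle, while the Hadamard gate is left untouched.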
Static arguments of a qjit-compiled function can now be indicated by name via a static_argnames argument to the qjit decorator. (#1158)

Specified static argument names will be treated as compile-time static values, allowing any hashable Python object to be passed to this function argument during compilation.

    >>> @qjit(static_argnames="y")
    ... def f(x, y):
    ...     print(f"Compiling with y={y}")
    ...     return x + y
    >>> f(0.5, 0.3)
    Compiling with y=0.3
    Array(0.8, dtype=float64)

The function will only be re-compiled if the hash values of the static arguments change. Otherwise, re-using previous static argument values will result in no re-compilation:

    >>> f(0.1, 0.3)  # no re-compilation occurs
    Array(0.4, dtype=float64)
    >>> f(0.1, 0.4)  # y changes, re-compilation
    Compiling with y=0.4
    Array(0.5, dtype=float64)
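The re-compilation behaviour can be modelled with a small cache keyed on the hash of the static arguments. The class StaticArgCache below is a hypothetical toy model, not how Catalyst implements qjit; a counter stands in for the expensive compilation step.

```python
class StaticArgCache:
    """Toy model of re-compilation keyed on the hash of static arguments."""

    def __init__(self, fn, static_argnames):
        self.fn = fn
        self.static_argnames = tuple(static_argnames)
        self.compiled = {}       # maps static-argument key -> specialized callable
        self.compile_count = 0

    def __call__(self, **kwargs):
        key = tuple(hash(kwargs[name]) for name in self.static_argnames)
        if key not in self.compiled:
            self.compile_count += 1  # stands in for an expensive compilation
            static = {name: kwargs[name] for name in self.static_argnames}
            self.compiled[key] = lambda **dynamic: self.fn(**dynamic, **static)
        dynamic = {k: v for k, v in kwargs.items() if k not in self.static_argnames}
        return self.compiled[key](**dynamic)


f_cached = StaticArgCache(lambda x, y: x + y, static_argnames=["y"])
f_cached(x=0.5, y=0.3)  # first call: specializes for y=0.3
f_cached(x=0.1, y=0.3)  # same static value: the cached version is reused
f_cached(x=0.1, y=0.4)  # y changed: specializes again
print(f_cached.compile_count)  # 2
```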
Catalyst Autograph now supports updating a single index or a slice of JAX arrays using Python’s array assignment operator syntax. (#769) (#1143)

Using operator assignment syntax in favor of at...op expressions is now possible for the following operations:

    x[i] += y in favor of x.at[i].add(y)
    x[i] -= y in favor of x.at[i].add(-y)
    x[i] *= y in favor of x.at[i].multiply(y)
    x[i] /= y in favor of x.at[i].divide(y)
    x[i] **= y in favor of x.at[i].power(y)

    @qjit(autograph=True)
    def f(x):
        first_dim = x.shape[0]
        result = jnp.copy(x)

        for i in range(first_dim):
            result[i] *= 2  # This is now supported

        return result

    >>> f(jnp.array([1, 2, 3]))
    Array([2, 4, 6], dtype=int64)
Catalyst now has a standalone compiler tool called catalyst-cli that quantum-compiles MLIR input files into an object file independent of the Python frontend. (#1208) (#1255)

This compiler tool combines three stages of compilation:

    quantum-opt: Performs the MLIR-level optimizations and lowers the input dialect to the LLVM dialect.
    mlir-translate: Translates the input in the LLVM dialect into LLVM IR.
    llc: Performs lower-level optimizations and creates the object file.

catalyst-cli runs all three stages under the hood by default, but it also has the ability to run each stage individually. For example:

    # Creates both the optimized IR and an object file
    catalyst-cli input.mlir -o output.o

    # Only performs MLIR optimizations
    catalyst-cli --tool=opt input.mlir -o llvm-dialect.mlir

    # Only lowers LLVM dialect MLIR input to LLVM IR
    catalyst-cli --tool=translate llvm-dialect.mlir -o llvm-ir.ll

    # Only performs lower-level optimizations and creates object file
    catalyst-cli --tool=llc llvm-ir.ll -o output.o
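When scripting the staged workflow, the three invocations can be chained so that each stage consumes the output of the previous one. The following sketch only builds the argument lists, using the flags shown above; the helper stage_commands and the intermediate file names are illustrative assumptions, and nothing is executed.

```python
import shlex


def stage_commands(mlir_file, object_file):
    """Build the three per-stage catalyst-cli invocations as argument lists."""
    llvm_dialect = "llvm-dialect.mlir"  # intermediate file name chosen for illustration
    llvm_ir = "llvm-ir.ll"              # intermediate file name chosen for illustration
    return [
        ["catalyst-cli", "--tool=opt", mlir_file, "-o", llvm_dialect],
        ["catalyst-cli", "--tool=translate", llvm_dialect, "-o", llvm_ir],
        ["catalyst-cli", "--tool=llc", llvm_ir, "-o", object_file],
    ]


for command in stage_commands("input.mlir", "output.o"):
    # Each list could be passed to subprocess.run(command, check=True)
    # on a machine where catalyst-cli has been built from source.
    print(shlex.join(command))
```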
Note that catalyst-cli is only available when Catalyst is built from source, and is not included when installing Catalyst via pip or from wheels.

Experimental integration of the PennyLane capture module is available. It currently only supports quantum gates, without control flow. (#1109)

To trigger the PennyLane pipeline for capturing the program as a Jaxpr, simply set experimental_capture=True in the qjit decorator.

    import pennylane as qml
    from catalyst import qjit

    # two wires are needed, since the CNOT below acts on wires 0 and 1
    dev = qml.device("lightning.qubit", wires=2)

    @qjit(experimental_capture=True)
    @qml.qnode(dev)
    def circuit():
        qml.Hadamard(0)
        qml.CNOT([0, 1])
        return qml.expval(qml.Z(0))
Improvements
Multiple qml.sample calls can now be returned from the same program, and can be structured using Python containers. For example, a program can return a dictionary of the form return {"first": qml.sample(), "second": qml.sample()}. (#1051)

Catalyst now ships with null.qubit, a Catalyst runtime plugin that mocks out all functions in the QuantumDevice interface. This device is provided as a convenience for testing and benchmarking purposes. (#1179)

    dev = qml.device("null.qubit", wires=1)

    @qml.qjit
    @qml.qnode(dev)
    def g(x):
        qml.RX(x, wires=0)
        return qml.probs(wires=[0])
Setting the seed argument in the qjit decorator will now seed sampled results, in addition to mid-circuit measurement results. (#1164)

    dev = qml.device("lightning.qubit", wires=1, shots=10)

    @qml.qnode(dev)
    def circuit(x):
        qml.RX(x, wires=0)
        m = catalyst.measure(0)

        if m:
            qml.Hadamard(0)

        return qml.sample()

    @qml.qjit(seed=37, autograph=True)
    def workflow(x):
        return jnp.squeeze(jnp.stack([circuit(x) for i in range(4)]))

    >>> workflow(1.8)
    Array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
           [1, 1, 0, 0, 1, 1, 0, 0, 1, 0],
           [0, 0, 1, 0, 1, 1, 0, 0, 1, 1],
           [1, 1, 1, 0, 0, 1, 1, 0, 1, 1]], dtype=int64)
    >>> workflow(1.8)
    Array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
           [1, 1, 0, 0, 1, 1, 0, 0, 1, 0],
           [0, 0, 1, 0, 1, 1, 0, 0, 1, 1],
           [1, 1, 1, 0, 0, 1, 1, 0, 1, 1]], dtype=int64)
Note that statistical measurement processes such as expval, var, and probs are currently not affected by seeding when shot noise is present.

The cancel_inverses MLIR compilation pass (-remove-chained-self-inverse) now supports cancelling all Hermitian gates, as well as adjoints of arbitrary unitary operations. (#1136) (#1186) (#1211)

For the full list of supported Hermitian gates please see the cancel_inverses documentation in catalyst.passes.

Support is expanded for backend devices that exclusively return samples in the measurement basis. Pre- and post-processing now allows qjit to be used on these devices with qml.expval, qml.var and qml.probs measurements in addition to qml.sample, using the measurements_from_samples transform. (#1106)

Scalar tensors are eliminated from control flow operations in the program, and are replaced with bare scalars instead. This improves compilation time and memory usage at runtime by avoiding heap allocations and reducing the number of instructions. (#1075)
Compiling QNodes to asynchronous functions will no longer print to stderr in case of an error. (#645)

Gradient computations have been made more efficient, as calling gradients twice (with the same gradient parameters) will now only lower to a single MLIR function. (#1172)
qml.sample() and qml.counts() on lightning.qubit/kokkos can now be seeded with qjit(seed=...). (#1164) (#1248)

The compiler pass -remove-chained-self-inverse can now also cancel adjoints of arbitrary unitary operations (in addition to the named Hermitian gates). (#1186) (#1211)

Add Lightning-GPU support to Catalyst docs and update tests. (#1254)
Breaking changes
The static_size field in the AbstractQreg class has been removed. (#1113)

This reverts a previous breaking change.
Nesting QNodes within one another now raises an error. (#1176)
The debug.compile_from_mlir function has been removed; please use debug.replace_ir instead. (#1181)

The compiler.last_compiler_output function has been removed; please use compiler.get_output_of("last", workspace) instead. (#1208)
Bug fixes
Fixes a bug where the second execution of a function with abstracted axes would fail. (#1247)
Fixes a bug in catalyst.mitigate_with_zne that would lead to incorrectly extrapolated results. (#1213)

Fixes a bug preventing the target of qml.adjoint and qml.ctrl calls from being transformed by AutoGraph. (#1212)

Resolves a bug where mitigate_with_zne does not work properly with shots and devices supporting only counts and samples (e.g., Qrack). (#1165)

Resolves a bug in the vmap function when passing shapeless values to the target. (#1150)

Fixes a bug that resulted in an error message when using qml.cond on callables with arguments. (#1151)

Fixes a bug that prevented taking the gradient of nested accelerate callbacks. (#1156)
Fixes some small issues with scatter lowering: (#1216) (#1217)
Registers the func dialect as a requirement for running the scatter lowering pass.
Emits an error if %input, %update and %result are not of length 1 instead of segfaulting.
Fixes a performance issue with catalyst.vmap, where the root cause was in the lowering of the scatter operation. (#1214)

Fixes a bug where single gates applied conditionally could not be used in qjit, e.g. qml.cond(x > 1, qml.Hadamard)(wires=0). (#1232)
Internal changes
Removes deprecated PennyLane code across the frontend. (#1168)
Updates Enzyme to version v0.0.149. (#1142)

Adjoint canonicalization is now available in MLIR for CustomOp and MultiRZOp. It can be used with the --canonicalize pass in quantum-opt. (#1205)

Removes the MemMemCpyOptPass in LLVM O2 (applied for Enzyme), which reduces bugs when running gradient-like functions. (#1063)

Bufferization of gradient.ForwardOp and gradient.ReverseOp now requires three steps: gradient-preprocessing, gradient-bufferize, and gradient-postprocessing. gradient-bufferize has a new rewrite for gradient.ReturnOp. (#1139)

A new MLIR pass detensorize-scf is added that works in conjunction with the existing linalg-detensorize pass to detensorize input programs. The IR generated by JAX wraps all values in the program in tensors, including scalars, leading to unnecessary memory allocations for programs compiled to CPU via the MLIR-to-LLVM pipeline. (#1075)

Importing Catalyst will now pollute less of JAX’s global variables by using LoweringParameters. (#1152)

Cached primitive lowerings are used instead of a custom cache structure. (#1159)
Functions with multiple tapes are now split with a new MLIR pass --split-multiple-tapes, with one tape per function. The reset routine that makes a measurement between tapes and inserts an X gate if the measurement result is one is no longer used. (#1017) (#1130)

Prefer creating new qml.devices.ExecutionConfig objects over using the global qml.devices.DefaultExecutionConfig. Doing so helps avoid unexpected bugs and test failures in case the DefaultExecutionConfig object becomes modified from its original state. (#1137)

Remove the old QJITDevice API. (#1138)

The device-capability loading mechanism has been moved into the QJITDevice constructor. (#1141)

Several functions related to device capabilities have been refactored. (#1149)
In particular, the signatures of get_device_capability, catalyst_decompose, catalyst_acceptance, and QJITDevice.__init__ have changed, and the pennylane_operation_set function has been removed entirely.

Catalyst now generates nested modules denoting quantum programs. (#1144)
Similar to MLIR’s gpu.launch_kernel function, Catalyst now supports a call_function_in_module. This allows Catalyst to call functions in modules and have modules denote a quantum kernel. This will allow for device-specific optimizations and compilation pipelines.

At the moment, nothing uses this feature; it is only the scaffolding needed to support device-specific transformations. As such, the module will be inlined to preserve current semantics. However, in the future, we will explore lowering this nested module into other IRs/binary formats and lowering call_function_in_module to something that can dispatch calls to another runtime/VM.
Contributors
This release contains contributions from (in alphabetical order):
Joey Carter, Spencer Comin, Amintor Dusko, Lillian M.A. Frederiksen, Sengthai Heng, David Ittah, Mehrdad Malekmohammadi, Vincent Michaud-Rioux, Romain Moyard, Erick Ochoa Lopez, Daniel Strano, Raul Torres, Paul Haochen Wang.
Release 0.8.0
New features
JAX-compatible functions that run on classical accelerators, such as GPUs, via catalyst.accelerate now support autodifferentiation. (#920)

For example,

    import jax
    import jax.numpy as jnp
    import catalyst
    from catalyst import qjit, grad

    @qjit
    @grad
    def f(x):
        expm = catalyst.accelerate(jax.scipy.linalg.expm)
        return jnp.sum(expm(jnp.sin(x)) ** 2)

    >>> x = jnp.array([[0.1, 0.2], [0.3, 0.4]])
    >>> f(x)
    Array([[2.80120452, 1.67518663],
           [1.61605839, 4.42856163]], dtype=float64)
Assertions can now be raised at runtime via the catalyst.debug_assert function. (#925)

Python-based exceptions (via raise) and assertions (via assert) will always be evaluated at program capture time, before certain runtime information may be available.

Use debug_assert to instead raise assertions at runtime, including assertions that depend on values of dynamic variables.

For example,

    from catalyst import debug_assert

    @qjit
    def f(x):
        debug_assert(x < 5, "x was greater than 5")
        return x * 8

    >>> f(4)
    Array(32, dtype=int64)
    >>> f(6)
    RuntimeError: x was greater than 5

Assertions can be disabled globally for a qjit-compiled function via the disable_assertions keyword argument:

    @qjit(disable_assertions=True)
    def g(x):
        debug_assert(x < 5, "x was greater than 5")
        return x * 8

    >>> g(6)
    Array(48, dtype=int64)
Mid-circuit measurement results when using lightning.qubit and lightning.kokkos can now be seeded via the new seed argument of the qjit decorator. (#936)

The seed argument accepts an unsigned 32-bit integer, which is used to initialize the pseudo-random state at the beginning of each execution of the compiled function. Therefore, different qjit objects with the same seed (including repeated calls to the same qjit) will always return the same sequence of mid-circuit measurement results.

    dev = qml.device("lightning.qubit", wires=1)

    @qml.qnode(dev)
    def circuit(x):
        qml.RX(x, wires=0)
        m = measure(0)

        if m:
            qml.Hadamard(0)

        return qml.probs()

    @qjit(seed=37, autograph=True)
    def workflow(x):
        return jnp.stack([circuit(x) for i in range(4)])

Repeatedly calling the workflow function above will always result in the same values:

    >>> workflow(1.8)
    Array([[1. , 0. ],
           [1. , 0. ],
           [1. , 0. ],
           [0.5, 0.5]], dtype=float64)
    >>> workflow(1.8)
    Array([[1. , 0. ],
           [1. , 0. ],
           [1. , 0. ],
           [0.5, 0.5]], dtype=float64)
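The effect of re-initializing the pseudo-random state at the start of every execution can be mimicked with Python's standard random module. This is only an analogy: make_seeded_workflow is a hypothetical helper, and random.Random stands in for the device's random number generator, whose actual implementation is not described here.

```python
import random


def make_seeded_workflow(seed):
    """Return a function that re-seeds its generator at the start of each call."""

    def workflow(num_measurements=4):
        rng = random.Random(seed)  # fresh pseudo-random state for every execution
        return [rng.randint(0, 1) for _ in range(num_measurements)]

    return workflow


first = make_seeded_workflow(37)
second = make_seeded_workflow(37)

# Repeated calls, and separate objects built with the same seed,
# produce the same sequence of simulated measurement outcomes.
print(first() == first() == second())  # True
```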
Note that setting the seed will not avoid shot-noise stochasticity in terminal measurement statistics such as sample or expval:

    dev = qml.device("lightning.qubit", wires=1, shots=10)

    @qml.qnode(dev)
    def circuit(x):
        qml.RX(x, wires=0)
        m = measure(0)

        if m:
            qml.Hadamard(0)

        return qml.expval(qml.PauliZ(0))

    @qjit(seed=37, autograph=True)
    def workflow(x):
        return jnp.stack([circuit(x) for i in range(4)])

    >>> workflow(1.8)
    Array([1. , 1. , 1. , 0.4], dtype=float64)
    >>> workflow(1.8)
    Array([ 1. ,  1. ,  1. , -0.2], dtype=float64)
Exponential fitting is now a supported method of zero-noise extrapolation when performing error mitigation in Catalyst using mitigate_with_zne. (#953)

This new functionality fits the data from noise-scaled circuits with an exponential function, and returns the zero-noise value:

    from pennylane.transforms import exponential_extrapolate
    from catalyst import mitigate_with_zne

    dev = qml.device("lightning.qubit", wires=2, shots=100000)

    @qml.qnode(dev)
    def circuit(weights):
        qml.StronglyEntanglingLayers(weights, wires=[0, 1])
        return qml.expval(qml.PauliZ(0) @ qml.PauliZ(1))

    @qjit
    def workflow(weights, s):
        zne_circuit = mitigate_with_zne(circuit, scale_factors=s, extrapolate=exponential_extrapolate)
        return zne_circuit(weights)

    >>> weights = jnp.ones([3, 2, 3])
    >>> scale_factors = jnp.array([1, 2, 3])
    >>> workflow(weights, scale_factors)
    Array(-0.19946598, dtype=float64)
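To illustrate the idea behind exponential extrapolation, the sketch below fits values of the form a * exp(b * s) by linear least squares on the logarithm of the data and evaluates the fit at scale factor zero. It is a simplified stand-in written for this note, not the implementation of pennylane.transforms.exponential_extrapolate, and it assumes strictly positive expectation values; the function name exponential_zero_noise is hypothetical.

```python
import math


def exponential_zero_noise(scale_factors, values):
    """Fit values ~ a * exp(b * s) via least squares on log(values); return a.

    Assumes all values are strictly positive so that the logarithm exists.
    """
    logs = [math.log(v) for v in values]
    n = len(scale_factors)
    mean_s = sum(scale_factors) / n
    mean_l = sum(logs) / n
    covariance = sum((s - mean_s) * (l - mean_l) for s, l in zip(scale_factors, logs))
    variance = sum((s - mean_s) ** 2 for s in scale_factors)
    slope = covariance / variance
    intercept = mean_l - slope * mean_s
    return math.exp(intercept)  # value of the fitted curve at scale factor 0


scale_factors = [1, 2, 3]
# Synthetic, noise-free data generated from 0.8 * exp(-0.3 * s)
noisy_values = [0.8 * math.exp(-0.3 * s) for s in scale_factors]
zero_noise_estimate = exponential_zero_noise(scale_factors, noisy_values)
print(zero_noise_estimate)
```

For this synthetic data the estimate recovers the prefactor 0.8 up to floating-point error.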
A new module is available, catalyst.passes, which provides Python decorators for enabling and configuring Catalyst MLIR compiler passes. (#911) (#1037)

The first pass available is catalyst.passes.cancel_inverses, which enables the -remove-chained-self-inverse MLIR pass that cancels two neighbouring Hadamard gates.

    from catalyst.debug import get_compilation_stage
    from catalyst.passes import cancel_inverses

    dev = qml.device("lightning.qubit", wires=1)

    @qml.qnode(dev)
    def circuit(x: float):
        qml.RX(x, wires=0)
        qml.Hadamard(wires=0)
        qml.Hadamard(wires=0)
        return qml.expval(qml.PauliZ(0))

    @qjit(keep_intermediate=True)
    def workflow(x):
        optimized_circuit = cancel_inverses(circuit)
        return circuit(x), optimized_circuit(x)
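The effect of cancelling neighbouring self-inverse gates can be sketched on a plain list of (name, wire) tuples. This list-based model and the function cancel_adjacent_self_inverses are assumptions made for illustration; the real pass operates on MLIR, and the set of gates listed here is not a statement about which gates it supports.

```python
def cancel_adjacent_self_inverses(ops, self_inverse=("Hadamard", "PauliX", "PauliY", "PauliZ")):
    """Remove neighbouring pairs of identical self-inverse gates on the same wire.

    Each operation is a (name, wire) tuple. A stack is used so that pairs
    exposed by an earlier cancellation are removed as well.
    """
    stack = []
    for op in ops:
        if stack and stack[-1] == op and op[0] in self_inverse:
            stack.pop()  # the gate and its neighbour multiply to the identity
        else:
            stack.append(op)
    return stack


# Gate sequence of the circuit above (the RX parameter is omitted for brevity)
circuit_ops = [("RX", 0), ("Hadamard", 0), ("Hadamard", 0)]
print(cancel_adjacent_self_inverses(circuit_ops))  # [('RX', 0)]
```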
Catalyst now has debug functions get_compilation_stage and replace_ir to acquire and recompile the IR from a given pipeline pass for functions compiled with keep_intermediate=True. (#981)

For example, consider the following function:

    @qjit(keep_intermediate=True)
    def f(x):
        return x**2

    >>> f(2.0)
    4.0
Here we use get_compilation_stage to acquire the IR, and then modify %2 = arith.mulf %in, %in_0 : f64 to turn the square function into a cubic one via replace_ir:

    from catalyst.debug import get_compilation_stage, replace_ir

    old_ir = get_compilation_stage(f, "HLOLoweringPass")
    new_ir = old_ir.replace(
        "%2 = arith.mulf %in, %in_0 : f64\n",
        "%t = arith.mulf %in, %in_0 : f64\n    %2 = arith.mulf %t, %in_0 : f64\n"
    )
    replace_ir(f, "HLOLoweringPass", new_ir)

The recompilation starts after the given checkpoint stage:

    >>> f(2.0)
    8.0
Each function can also be used independently of the other. Note that get_compilation_stage replaces the print_compilation_stage function; please see the Breaking Changes section for more details.

Catalyst now supports generating executables from compiled functions for the native host architecture using catalyst.debug.compile_executable. (#1003)

    >>> @qjit
    ... def f(x):
    ...     y = x * x
    ...     catalyst.debug.print_memref(y)
    ...     return y
    >>> f(5)
    MemRef: base@ = 0x31ac22580 rank = 0 offset = 0 sizes = [] strides = [] data =
    25
    Array(25, dtype=int64)
We can use compile_executable to compile this function to a binary:

    >>> from catalyst.debug import compile_executable
    >>> binary = compile_executable(f, 5)
    >>> print(binary)
    /path/to/executable

Executing this function from a shell environment:

    $ /path/to/executable
    MemRef: base@ = 0x64fc9dd5ffc0 rank = 0 offset = 0 sizes = [] strides = [] data =
    25
Improvements
Catalyst has been updated to work with JAX v0.4.28 (exact version match required). (#931) (#995)
Catalyst now supports keyword arguments for qjit-compiled functions. (#1004)

    >>> @qjit
    ... @grad
    ... def f(x, y):
    ...     return x * y
    >>> f(3., y=2.)
    Array(2., dtype=float64)

Note that the static_argnums argument to the qjit decorator is not supported when passing argument values as keyword arguments.

Support has been added for the jax.numpy.argsort function within qjit-compiled functions. (#901)

Autograph now supports in-place array assignments with static slices. (#843)
For example,

    @qjit(autograph=True)
    def f(x, y):
        y[1:10:2] = x
        return y

    >>> f(jnp.ones(5), jnp.zeros(10))
    Array([0., 1., 0., 1., 0., 1., 0., 1., 0., 1.], dtype=float64)
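The static slice 1:10:2 selects indices 1, 3, 5, 7 and 9. Plain Python lists follow the same extended-slice indexing rules, so the expected result can be reproduced without Catalyst. This is only an analogy for the indexing; unlike a list, a JAX array is immutable, which is why Autograph has to translate the assignment.

```python
y = [0.0] * 10
x = [1.0] * 5

# Extended-slice assignment: positions 1, 3, 5, 7, 9 receive the values of x.
y[1:10:2] = x
print(y)  # [0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
```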
Autograph now works when qjit is applied to a function decorated with vmap, cond, for_loop or while_loop. Previously, stacking the autograph-enabled qjit decorator directly on top of other Catalyst decorators would lead to errors. (#835) (#938) (#942)

    from catalyst import vmap, qjit

    dev = qml.device("lightning.qubit", wires=2)

    @qml.qnode(dev)
    def circuit(x):
        qml.RX(x, wires=0)
        return qml.expval(qml.PauliZ(0))

    >>> x = jnp.array([0.1, 0.2, 0.3])
    >>> qjit(vmap(circuit), autograph=True)(x)
    Array([0.99500417, 0.98006658, 0.95533649], dtype=float64)
Runtime memory usage and compilation complexity have been reduced by eliminating some scalar tensors from the IR. This has been done by adding a linalg-detensorize pass at the end of the HLO lowering pipeline. (#1010)

Program verification is extended to confirm that the measurements included in QNodes are compatible with the specified device and settings. (#945) (#962)

    >>> dev = qml.device("lightning.qubit", wires=2, shots=None)
    >>> @qjit
    ... @qml.qnode(dev)
    ... def circuit(params):
    ...     qml.RX(params[0], wires=0)
    ...     qml.RX(params[1], wires=1)
    ...     return {
    ...         "sample": qml.sample(wires=[0, 1]),
    ...         "expval": qml.expval(qml.PauliZ(0))
    ...     }
    >>> circuit([0.1, 0.2])
    CompileError: Sample-based measurements like sample(wires=[0, 1]) cannot work with shots=None. Please specify a finite number of shots.
On devices that support it, initial state preparation routines qml.StatePrep and qml.BasisState are no longer decomposed when using Catalyst, improving compilation and runtime performance. (#955) (#1047) (#1062) (#1073)

Improved type validation and error messaging has been added to both the catalyst.jvp and catalyst.vjp functions to ensure that the (co)tangent and parameter types are compatible. (#1020) (#1030) (#1031)

For example, providing an integer tangent for a function with float64 parameters will result in an error:

    >>> f = lambda x: (2 * x, x * x)
    >>> f_jvp = lambda x: catalyst.jvp(f, params=(x,), tangents=(1,))
    >>> qjit(f_jvp)(0.5)
    TypeError: function params and tangents arguments to catalyst.jvp do not match; dtypes must be equal. Got function params dtype float64 and so expected tangent dtype float64, but got tangent dtype int64 instead.

Ensuring that the types match will resolve the error:

    >>> f_jvp = lambda x: catalyst.jvp(f, params=(x,), tangents=(1.0,))
    >>> qjit(f_jvp)(0.5)
    ((Array(1., dtype=float64), Array(0.25, dtype=float64)),
     (Array(2., dtype=float64), Array(1., dtype=float64)))
Add a script for setting up a Frontend-Only Development Environment that does not require compilation, as it uses the TestPyPI wheel shared libraries. (#1022)
Breaking changes
The argnum keyword argument in the grad, jacobian, value_and_grad, vjp, and jvp functions has been renamed to argnums to better match JAX. (#1036)

Return values of qjit-compiled functions that were previously numpy.ndarray are now of type jax.Array instead. This should have minimal impact, but code that depends on the output of qjit-compiled functions being NumPy arrays will need to be updated. (#895)

The print_compilation_stage function has been renamed get_compilation_stage. It no longer prints the IR to the standard output; instead, it simply returns the IR as a string. (#981)

    >>> @qjit(keep_intermediate=True)
    ... def func(x: float):
    ...     return x
    >>> print(get_compilation_stage(func, "HLOLoweringPass"))
    module @func {
      func.func public @jit_func(%arg0: tensor<f64>) -> tensor<f64> attributes {llvm.emit_c_interface} {
        return %arg0 : tensor<f64>
      }
      func.func @setup() {
        quantum.init
        return
      }
      func.func @teardown() {
        quantum.finalize
        return
      }
    }
Support for TOML files in Schema 1 has been disabled. (#960)
The mitigate_with_zne function no longer accepts a degree parameter for polynomial fitting and instead accepts a callable to perform extrapolation. Any qjit-compatible extrapolation function is valid. Keyword arguments can be passed to this function using the extrapolate_kwargs keyword argument in mitigate_with_zne. (#806)

The QuantumDevice API now includes the functions SetState and SetBasisState for simulators that may benefit from instructions that directly set the state. Implementing these methods is optional, and device support can be indicated via the initial_state_prep flag in the TOML configuration file. (#955)
Bug fixes
Catalyst no longer silently converts complex parameters to floats where floats are expected; instead, an error is raised. (#1008)
Fixes a bug where dynamic one-shot did not work when no mid-circuit measurements are present and when the return type is an iterable. (#1060)
Fixes a bug in finding the quantum function jaxpr when using quantum primitives with dynamic one-shot. (#1041)
Fixes a bug where the LegacyDevice number of shots is not correctly extracted when using the legacyDeviceFacade. (#1035)
Catalyst no longer generates a QubitUnitary operation during decomposition if a device doesn’t support it. Instead, the operation that would lead to a QubitUnitary is either decomposed or raises an error. (#1002)

Catalyst now correctly raises an error when qml.density_matrix is used. (#1118)

Catalyst now preserves output PyTrees in QNodes executed with mcm_method="one-shot". (#957)

For example:

    dev = qml.device("lightning.qubit", wires=1, shots=20)

    @qml.qjit
    @qml.qnode(dev, mcm_method="one-shot")
    def func(x):
        qml.RX(x, wires=0)
        m_0 = catalyst.measure(0, postselect=1)
        return {"hi": qml.expval(qml.Z(0))}

    >>> func(0.9)
    {'hi': Array(-1., dtype=float64)}
Fixes a bug where scatter did not work correctly with list indices. (#982)

    A = jnp.ones([3, 3]) * 2

    def update(A):
        A = A.at[[0, 1], :].set(jnp.ones([2, 3]), indices_are_sorted=True, unique_indices=True)
        return A

    >>> print(update(A))
    [[1. 1. 1.]
     [1. 1. 1.]
     [2. 2. 2.]]
Static arguments can now be passed through a QNode when specified with the static_argnums keyword argument. (#932)

    dev = qml.device("lightning.qubit", wires=1)

    @qjit(static_argnums=(1,))
    @qml.qnode(dev)
    def circuit(x, c):
        print("Inside QNode:", c)
        qml.RY(c, 0)
        qml.RX(x, 0)
        return qml.expval(qml.PauliZ(0))

When executing the qjit-compiled function above, c will be a static variable with value known at compile time:

    >>> circuit(0.5, 0.5)
    Inside QNode: 0.5
    Array(0.77015115, dtype=float64)

Changing the value of c will result in re-compilation:

    >>> circuit(0.5, 0.8)
    Inside QNode: 0.8
    Array(0.61141766, dtype=float64)
Fixes a bug where Catalyst would fail to apply quantum transforms and preserve QNode configuration settings when Autograph was enabled. (#900)
pure_callback will no longer cause a crash in the compiler if the return type signature is declared incorrectly and the callback function is differentiated. (#916)

Instead, this is caught early and a useful error message returned:

    @catalyst.pure_callback
    def callback_fn(x) -> jax.ShapeDtypeStruct((2,), jnp.float32):
        return np.array([np.sin(x), np.cos(x)])

    callback_fn.fwd(lambda x: (callback_fn(x), x))
    callback_fn.bwd(lambda x, dy: (jnp.array([jnp.cos(x), -jnp.sin(x)]) @ dy,))

    @qjit
    @catalyst.grad
    def f(x):
        return jnp.sum(callback_fn(jnp.sin(x)))

    >>> f(0.54)
    TypeError: Callback callback_fn expected type ShapedArray(float32[2]) but observed ShapedArray(float64[2]) in its return value
AutoGraph will now correctly convert conditional statements where the condition is a non-boolean static value. (#944)

Internally, statically known non-boolean predicates (such as 1) will be converted to bool:

    @qml.qjit(autograph=True)
    def workflow(x):
        n = 1

        if n:
            y = x ** 2
        else:
            y = x

        return y
value_and_grad will now correctly differentiate functions with multiple arguments. Previously, attempting to differentiate functions with multiple arguments, or pass the argnums argument, would result in an error. (#1034)

    @qjit
    def g(x, y, z):
        def f(x, y, z):
            return x * y ** 2 * jnp.sin(z)
        return catalyst.value_and_grad(f, argnums=[1, 2])(x, y, z)

    >>> g(0.4, 0.2, 0.6)
    (Array(0.00903428, dtype=float64),
     (Array(0.0903428, dtype=float64), Array(0.01320537, dtype=float64)))
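The values above can be checked by hand. With argnums=[1, 2], the gradient is taken with respect to y and z, for which the partial derivatives of x * y ** 2 * sin(z) are 2 * x * y * sin(z) and x * y ** 2 * cos(z). The following standard-library sketch evaluates these expressions; the helper value_and_grad_yz is written for this check only and is not a Catalyst function.

```python
import math


def f(x, y, z):
    return x * y ** 2 * math.sin(z)


def value_and_grad_yz(x, y, z):
    """Value of f and its analytic partial derivatives with respect to y and z."""
    value = f(x, y, z)
    df_dy = 2 * x * y * math.sin(z)
    df_dz = x * y ** 2 * math.cos(z)
    return value, (df_dy, df_dz)


value, (df_dy, df_dz) = value_and_grad_yz(0.4, 0.2, 0.6)
print(round(value, 8), round(df_dy, 7), round(df_dz, 8))
# 0.00903428 0.0903428 0.01320537
```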
Fixes a bug in catalyst.debug.get_cmain to support multi-dimensional arrays as function inputs. (#1003)

Fixes a bug that occurred when parameter annotations return strings. (#1078)
In certain cases, jax.scipy.linalg.expm may return incorrect numerical results when used within a qjit-compiled function. A warning will now be raised when jax.scipy.linalg.expm is used to inform of this issue.

In the meantime, we strongly recommend using the catalyst.accelerate function within qjit-compiled functions to call jax.scipy.linalg.expm directly.

    @qjit
    def f(A):
        B = catalyst.accelerate(jax.scipy.linalg.expm)(A)
        return B

Note that this PR doesn’t actually fix the aforementioned numerical errors, and just raises a warning. (#1082)
Documentation
A page has been added to the documentation, listing devices that are Catalyst compatible. (#966)
Internal changes
Adds catalyst.from_plxpr.from_plxpr for converting a PennyLane variant jaxpr into a Catalyst variant jaxpr. (#837)

Catalyst now uses Enzyme v0.0.130. (#898)

When memrefs have no identity layout, memref copy operations are replaced by the linalg copy operation. It does not use a runtime function but instead lowers to scf and standard dialects. It also ensures better compatibility with Enzyme. (#917)
LLVM’s O2 optimization pipeline and Enzyme’s AD transformations are now only run in the presence of gradients, significantly improving compilation times for programs without derivatives. Similarly, LLVM’s coroutine lowering passes only run when async_qnodes is enabled in the QJIT decorator. (#968)

The function inactive_callback was renamed __catalyst_inactive_callback. (#899)

The function __catalyst_inactive_callback has the nofree attribute. (#898)

catalyst.dynamic_one_shot uses postselect_mode="pad-invalid-samples" in favour of interface="jax" when processing results. (#956)

Callbacks now have nicer identifiers in their MLIR representation. The identifiers include the name of the Python function being called back into. (#919)
Fix tracing of SProd operations to bring Catalyst in line with PennyLane v0.38. (#935)

After some changes in PennyLane, Sprod.terms() returns the terms as leaves instead of a tree. This means that we need to manually trace each term and finally multiply it with the coefficients to create a Hamiltonian.

The function mitigate_with_zne accommodates a folding input argument for specifying the type of circuit folding technique to be used by the error-mitigation routine (only the global value is supported to date). (#946)

Catalyst’s implementation of Lightning Kokkos plugin has been removed in favor of Lightning’s one. (#974)
The validate_device_capabilities function is considered obsolete. Hence, it has been removed. (#1045)
Contributors
This release contains contributions from (in alphabetical order):
Joey Carter, Alessandro Cosentino, Lillian M. A. Frederiksen, David Ittah, Josh Izaac, Christina Lee, Kunwar Maheep Singh, Mehrdad Malekmohammadi, Romain Moyard, Erick Ochoa Lopez, Mudit Pandey, Nate Stemen, Raul Torres, Tzung-Han Juang, Paul Haochen Wang.
Release 0.7.0¶
New features
Add support for accelerating classical processing via JAX with catalyst.accelerate. (#805)
Classical code that can be just-in-time compiled with JAX can now be seamlessly executed on GPUs or other accelerators with catalyst.accelerate
, right inside of QJIT-compiled functions.@accelerate(dev=jax.devices("gpu")[0]) def classical_fn(x): return jnp.sin(x) ** 2 @qjit def hybrid_fn(x): y = classical_fn(jnp.sqrt(x)) # will be executed on a GPU return jnp.cos(y)
Available devices can be retrieved via jax.devices(). If not provided, the default value of jax.devices()[0] as determined by JAX will be used.
Catalyst callback functions, such as pure_callback, debug.callback, and debug.print, now all support auto-differentiation. (#706) (#782) (#822) (#834) (#882) (#907)
When using callbacks that do not return any values, such as catalyst.debug.callback and catalyst.debug.print
, these functions are marked as ‘inactive’ and do not contribute to or affect the derivative of the function:import logging log = logging.getLogger(__name__) log.setLevel(logging.INFO) @qml.qjit @catalyst.grad def f(x): y = jnp.cos(x) catalyst.debug.print("Debug print: y = {0:.4f}", y) catalyst.debug.callback(lambda _: log.info("Value of y = %s", _))(y) return y ** 2
>>> f(0.54) INFO:__main__:Value of y = 0.8577086813638242 Debug print: y = 0.8577 array(-0.88195781)
Callbacks that do return values and may affect the qjit-compiled function’s computation, such as pure_callback, may have custom derivatives manually registered with the Catalyst compiler in order to support differentiation.
This can be done via the pure_callback.fwd and pure_callback.bwd
methods, to specify how the forwards and backwards pass (the vector-Jacobian product) of the callback should be computed:@catalyst.pure_callback def callback_fn(x) -> float: return np.sin(x[0]) * x[1] @callback_fn.fwd def callback_fn_fwd(x): # returns the evaluated function as well as residual # values that may be useful for the backwards pass return callback_fn(x), x @callback_fn.bwd def callback_fn_vjp(res, dy): # Accepts residuals from the forward pass, as well # as (one or more) cotangent vectors dy, and returns # a tuple of VJPs corresponding to each input parameter. def vjp(x, dy) -> (jax.ShapeDtypeStruct((2,), jnp.float64),): return (np.array([np.cos(x[0]) * dy * x[1], np.sin(x[0]) * dy]),) # The VJP function can also be a pure callback return catalyst.pure_callback(vjp)(res, dy) @qml.qjit @catalyst.grad def f(x): y = jnp.array([jnp.cos(x[0]), x[1]]) return jnp.sin(callback_fn(y))
>>> x = jnp.array([0.1, 0.2]) >>> f(x) array([-0.01071923, 0.82698717])
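A hand-written VJP such as `callback_fn_vjp` above can be sanity-checked against a finite-difference estimate. The sketch below uses only the Python standard library and re-derives the example's math by hand. The helper names (`callback`, `callback_vjp`, `grad_f`, `finite_diff`) are illustrative and are not part of Catalyst.

```python
import math

def callback(y):
    # Same math as callback_fn above: sin(y[0]) * y[1]
    return math.sin(y[0]) * y[1]

def callback_vjp(y, dy):
    # Hand-written VJP, mirroring callback_fn_vjp above
    return [math.cos(y[0]) * dy * y[1], math.sin(y[0]) * dy]

def f(x):
    # Same composition as the qjit-compiled f above
    y = [math.cos(x[0]), x[1]]
    return math.sin(callback(y))

def grad_f(x):
    # Chain rule: derivative of the outer sin, then the callback VJP,
    # then the derivative of y with respect to x
    y = [math.cos(x[0]), x[1]]
    outer = math.cos(callback(y))
    dy0, dy1 = callback_vjp(y, outer)
    return [dy0 * -math.sin(x[0]), dy1]

def finite_diff(fn, x, h=1e-6):
    # Central differences, one coordinate at a time
    grads = []
    for i in range(len(x)):
        up = list(x)
        down = list(x)
        up[i] += h
        down[i] -= h
        grads.append((fn(up) - fn(down)) / (2 * h))
    return grads

x = [0.1, 0.2]
analytic = grad_f(x)
numeric = finite_diff(f, x)
```

Both `analytic` and `numeric` should agree with the gradient printed above to roughly six decimal places; a mismatch usually points to a sign or ordering error in the VJP.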
Catalyst now supports the ‘dynamic one shot’ method for simulating circuits with mid-circuit measurements, which, compared to other methods, may be advantageous for circuits with many mid-circuit measurements executed for few shots. (#5617) (#798)
The dynamic one shot method evaluates dynamic circuits by executing them one shot at a time via catalyst.vmap, sampling a dynamic execution path for each shot. This method only works for a QNode executing with finite shots, and it requires the device to support mid-circuit measurements natively.
This new mode can be specified by using the mcm_method
argument of the QNode:dev = qml.device("lightning.qubit", wires=5, shots=20) @qml.qjit(autograph=True) @qml.qnode(dev, mcm_method="one-shot") def circuit(x): for i in range(10): qml.RX(x, 0) m = catalyst.measure(0) if m: qml.RY(x ** 2, 1) x = jnp.sin(x) return qml.expval(qml.Z(1))
Catalyst’s existing method for simulating mid-circuit measurements remains available via mcm_method="single-branch-statistics".
When using mcm_method="one-shot", the postselect_mode keyword argument can also be used to specify whether the returned result should include shots-number of postselected measurements ("fill-shots"), or whether results should include all results, including invalid postselections ("hw-like"
):@qml.qjit @qml.qnode(dev, mcm_method="one-shot", postselect_mode="hw-like") def func(x): qml.RX(x, wires=0) m_0 = catalyst.measure(0, postselect=1) return qml.sample(wires=0)
>>> res = func(0.9) >>> res array([-2147483648, -2147483648, 1, -2147483648, -2147483648, -2147483648, -2147483648, 1, -2147483648, -2147483648, -2147483648, -2147483648, 1, -2147483648, -2147483648, -2147483648, -2147483648, -2147483648, -2147483648, -2147483648]) >>> jnp.delete(res, jnp.where(res == np.iinfo(np.int32).min)[0]) Array([1, 1, 1], dtype=int64)
Note that invalid shots will not be discarded, but will be replaced by np.iinfo(np.int32).min. They will not be used for processing final results (like expectation values), but they will appear in the output of QNodes that return samples directly.
For more details, see the dynamic quantum circuit documentation.
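The sentinel handling described above can be illustrated without any quantum libraries. The following is a minimal sketch in plain Python; `INVALID_SHOT`, `valid_samples` and `mean_of_valid` are hypothetical names chosen for illustration, not Catalyst functions.

```python
# Sentinel used for invalid shots; equal to np.iinfo(np.int32).min
INVALID_SHOT = -(2 ** 31)

def valid_samples(samples):
    """Drop shots that were marked invalid by postselection."""
    return [s for s in samples if s != INVALID_SHOT]

def mean_of_valid(samples):
    """Average only the valid shots; return None if none survived."""
    kept = valid_samples(samples)
    return sum(kept) / len(kept) if kept else None

raw = [INVALID_SHOT, INVALID_SHOT, 1, INVALID_SHOT, 1, INVALID_SHOT, 1]
kept = valid_samples(raw)
```

Filtering before any statistics are computed mirrors the behaviour described above, where invalid shots appear in raw samples but do not contribute to final results.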
Catalyst now has support for returning qml.sample(m) where m is the result of a mid-circuit measurement. (#731)
When used with mcm_method="one-shot"
, this will return an array with one measurement result for each shot:dev = qml.device("lightning.qubit", wires=2, shots=10) @qml.qjit @qml.qnode(dev, mcm_method="one-shot") def func(x): qml.RX(x, wires=0) m = catalyst.measure(0) qml.RX(x ** 2, wires=0) return qml.sample(m), qml.expval(qml.PauliZ(0))
>>> func(0.9) (array([0, 1, 0, 0, 0, 0, 1, 0, 0, 0]), array(0.4))
In mcm_method="single-branch-statistics" mode, it will be equivalent to returning m
directly from the quantum function — that is, it will return a single boolean corresponding to the measurement in the branch selected:@qml.qjit @qml.qnode(dev, mcm_method="single-branch-statistics") def func(x): qml.RX(x, wires=0) m = catalyst.measure(0) qml.RX(x ** 2, wires=0) return qml.sample(m), qml.expval(qml.PauliZ(0))
>>> func(0.9) (array(False), array(0.8))
A new function, catalyst.value_and_grad, returns both the result of a function and its gradient with a single forward and backwards pass. (#804) (#859)
This can be more efficient, and reduce overall quantum executions, compared to separately executing the function and then computing its gradient.
For example:
dev = qml.device("lightning.qubit", wires=3) @qml.qnode(dev) def circuit(x): qml.RX(x, wires=0) qml.CNOT(wires=[0, 1]) qml.RX(x, wires=2) return qml.probs() @qml.qjit @catalyst.value_and_grad def cost(x): return jnp.sum(jnp.cos(circuit(x)))
>>> cost(0.543) (array(7.64695856), array(0.33413963))
Autograph now supports single-index JAX array assignments. (#717)
When using Autograph, syntax of the form x[i] = y where i is a single integer will now be automatically converted to the JAX equivalent of x = x.at[i].set(y)
:@qml.qjit(autograph=True) def f(array): result = jnp.ones(array.shape, dtype=array.dtype) for i, x in enumerate(array): result[i] = result[i] + x * 3 return result
>>> f(jnp.array([-0.1, 0.12, 0.43, 0.54])) array([0.7 , 1.36, 2.29, 2.62])
Catalyst now supports dynamically-shaped arrays in control-flow primitives. Arrays with dynamic shapes can now be used with for_loop, while_loop, and cond
primitives. (#775) (#777) (#830)@qjit def f(shape): a = jnp.ones([shape], dtype=float) @for_loop(0, 10, 2) def loop(i, a): return a + i return loop(a)
>>> f(3) array([21., 21., 21.])
Support has been added for disabling Autograph for specific functions. (#705) (#710)
The decorator catalyst.disable_autograph allows one to disable Autograph from auto-converting specific external functions when called within a qjit-compiled function with autograph=True
:def approximate_e(n): num = 1. fac = 1. for i in range(1, n + 1): fac *= i num += 1. / fac return num @qml.qjit(autograph=True) def g(x: float, N: int): for i in range(N): x = x + catalyst.disable_autograph(approximate_e)(10) / x ** i return x
>>> g(0.1, 10) array(4.02997319)
Note that for Autograph to be disabled, the decorated function must be defined outside the qjit-compiled function. If it is defined within the qjit-compiled function, it will continue to be converted with Autograph.
In addition, Autograph can also be disabled for all externally defined functions within a qjit-compiled function via the context manager syntax:
@qml.qjit(autograph=True) def g(x: float, N: int): for i in range(N): with catalyst.disable_autograph: x = x + approximate_e(10) / x ** i return x
A list of (sub)modules can now be allowlisted for Autograph conversion. (#725)
Although library code is not meant to be targeted by Autograph conversion, it sometimes makes sense to enable it for specific submodules that might benefit from such conversion:
@qjit(autograph=True, autograph_include=["excluded_module.submodule"]) def f(x): return excluded_module.submodule.func(x)
For example, this might be useful when importing functionality from PennyLane (such as a transform or decomposition) whose control flow you would like Autograph to capture and convert.
Controlled operations that do not have a matrix representation defined are now supported via applying PennyLane’s decomposition. (#831)
@qjit @qml.qnode(qml.device("lightning.qubit", wires=2)) def circuit(): qml.Hadamard(0) qml.ctrl(qml.TrotterProduct(H, time=2.4, order=2), control=[1]) return qml.state()
Catalyst is now officially supported on Linux aarch64, with pre-built binaries available on PyPI; simply pip install pennylane-catalyst on Linux aarch64 systems. (#767)
Improvements
Validation is now performed for observables and operations to ensure that provided circuits are compatible with the devices for execution. (#626) (#783)
dev = qml.device("lightning.qubit", wires=2, shots=10000) @qjit @qml.qnode(dev) def circuit(x): qml.Hadamard(wires=0) qml.CRX(x, wires=[0, 1]) return qml.var(qml.PauliZ(1))
>>> circuit(0.43) DifferentiableCompileError: Variance returns are forbidden in gradients
Catalyst’s adjoint and ctrl methods are now fully compatible with the PennyLane equivalent when applied to a single Operator. This should lead to improved compatibility with PennyLane library code, as well as when reusing quantum functions with both Catalyst and PennyLane. (#768) (#771) (#802)
Controlled operations defined via specialized classes (like Toffoli or ControlledQubitUnitary) are now implemented as controlled versions of their base operation if the device supports it. In particular, MultiControlledX is no longer executed as a QubitUnitary with Lightning. (#792)
The Catalyst frontend now supports Python logging through PennyLane’s qml.logging module. For more details, please see the logging documentation. (#660)
Catalyst now performs a stricter validation of the wire requirements for devices. In particular, only integer, continuous wire labels starting at 0 are allowed. (#784)
Catalyst no longer disallows quantum circuits with 0 qubits. (#784)
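The wire requirement above can be pictured as a small predicate. This is a sketch only: `check_wires` is a hypothetical helper, and the assumption that labels must appear in ascending order (0, 1, ..., n-1) is ours rather than something stated in these notes.

```python
def check_wires(wires):
    """Return True if wires are exactly the integers 0, 1, ..., n-1."""
    wires = list(wires)
    # bool is a subclass of int in Python, so exclude it explicitly
    all_ints = all(
        isinstance(w, int) and not isinstance(w, bool) for w in wires
    )
    # An empty register (0 qubits) passes, since range(0) is empty
    return all_ints and wires == list(range(len(wires)))
```

For example, `check_wires([0, 1, 2])` and `check_wires([])` pass, while string labels or a register starting at 1 do not.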
Added support for IsingZZ as a native gate in Catalyst. Previously, the IsingZZ gate would be decomposed into CNOT and RZ gates, even if a device supported it. (#730)
All decorators in Catalyst, including vmap, qjit, mitigate_with_zne, as well as gradient decorators grad, jacobian, jvp, and vjp, can now be used both with and without keyword arguments as a decorator without the need for functools.partial
: (#758) (#761) (#762) (#763)@qjit @grad(method="fd") def fn1(x): return x ** 2 @qjit(autograph=True) @grad def fn2(x): return jnp.sin(x)
>>> fn1(0.43) array(0.8600001) >>> fn2(0.12) array(0.99280864)
The built-in instrumentation with detailed output will no longer report the cumulative time for MLIR pipelines, since the cumulative time was being reported as just another step alongside individual timings for each pipeline. (#772)
A better error message is now raised when no shots are specified and qml.sample or qml.counts is used. (#786)
The finite difference method for differentiation is now always allowed, even on functions with mid-circuit measurements, callbacks without custom derivatives, or other operations that cannot be differentiated via traditional autodiff. (#789)
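Finite differences treat the function as a black box, which is why they remain usable where autodiff is not. A minimal forward-difference sketch in plain Python follows; `finite_diff_grad` and the step size `h=1e-7` are assumptions made for illustration and may differ from what Catalyst uses internally.

```python
def finite_diff_grad(fn, x, h=1e-7):
    """Forward-difference estimate of d fn / dx at a scalar x."""
    return (fn(x + h) - fn(x)) / h

def step(x):
    # Contains a Python branch on the input value; finite differences
    # only need to evaluate the function, not trace through it
    return 1.0 if x > 0 else 0.0

estimate = finite_diff_grad(lambda x: x ** 2, 0.43)
flat = finite_diff_grad(step, 1.0)
```

Here `estimate` is close to the exact derivative 0.86, with an error on the order of `h`.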
A non_commuting_observables flag has been added to the device TOML schema, indicating whether or not the device supports measuring non-commuting observables. If false, non-commuting measurements will be split into multiple executions. (#821)
The underlying PennyLane Operation objects for cond, for_loop, and while_loop can now be accessed directly via body_function.operation. (#711)
This can be beneficial when, among other things, writing transforms without using the queuing mechanism:
@qml.transform def my_quantum_transform(tape): ops = tape.operations.copy() @for_loop(0, 4, 1) def f(i, sum): qml.Hadamard(0) return sum+1 res = f(0) ops.append(f.operation) # This is now supported! def post_processing_fn(results): return results modified_tape = qml.tape.QuantumTape(ops, tape.measurements) print(res) print(modified_tape.operations) return [modified_tape], post_processing_fn @qml.qjit @my_quantum_transform @qml.qnode(qml.device("lightning.qubit", wires=2)) def main(): qml.Hadamard(0) return qml.probs()
>>> main() Traced<ShapedArray(int64[], weak_type=True)>with<DynamicJaxprTrace(level=2/1)> [Hadamard(wires=[0]), ForLoop(tapes=[[Hadamard(wires=[0])]])] (array([0.5, 0. , 0.5, 0. ]),)
Breaking changes
Binary distributions for Linux are now based on manylinux_2_28 instead of manylinux_2014. As a result, Catalyst will only be compatible with systems with glibc versions 2.28 and above (e.g., Ubuntu 20.04 and above). (#663)
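To check whether a given system meets the glibc requirement, something like the following sketch can be used. It relies only on the standard library's `platform.libc_ver()`; the helper names `parse_version` and `glibc_is_supported` are hypothetical.

```python
import platform

MIN_GLIBC = (2, 28)

def parse_version(text):
    """Turn a version string such as '2.31' into a tuple of ints."""
    return tuple(int(part) for part in text.split(".")[:2])

def glibc_is_supported(version_text, minimum=MIN_GLIBC):
    """Return True if the given glibc version meets the minimum."""
    return parse_version(version_text) >= minimum

# platform.libc_ver() returns a (library, version) pair; both entries
# may be empty strings on non-glibc systems such as macOS or Windows.
lib, version = platform.libc_ver()
if lib == "glibc" and version:
    print("glibc", version, "supported:", glibc_is_supported(version))
else:
    print("glibc not detected on this system")
```

Comparing integer tuples rather than strings avoids mistakes such as treating 2.9 as newer than 2.28.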
Bug fixes
Functions that have been annotated with return type annotations will now correctly compile with @qjit. (#751)
An issue in the Lightning backend for the Catalyst runtime has been fixed that would only compute approximate probabilities when implementing mid-circuit measurements. As a result, low shot numbers would lead to unexpected behaviours or projections on zero probability states. Probabilities for mid-circuit measurements are now always computed analytically. (#801)
The Catalyst runtime now raises an error if a qubit is accessed out of bounds from the allocated register. (#784)
jax.scipy.linalg.expm is now supported within qjit-compiled functions. (#733) (#752)
This required correctly linking the openblas routines necessary for jax.scipy.linalg.expm. In this bug fix, four openblas routines were newly linked and are now discoverable by stablehlo.custom_call@<blas_routine>. They are blas_dtrsm, blas_ztrsm, lapack_dgetrf, and lapack_zgetrf.
Fixes a bug where QNodes that contained QubitUnitary with a complex matrix would error during gradient computation. (#778)
Callbacks can now return types which can be flattened and unflattened. (#812)
catalyst.qjit and catalyst.grad now work correctly on functions that have been wrapped with functools.partial. (#820)
Internal changes
Catalyst uses the collapse method of Lightning simulators in Measure to select a state vector branch and normalize. (#801)
Measurement process primitives for Catalyst’s JAXPR representation now have a standardized call signature so that shots and shape can both be provided as keyword arguments. (#790)
The QCtrl class in Catalyst has been renamed to HybridCtrl, indicating its capability to contain a nested scope of both quantum and classical operations. Using ctrl on a single operation will now directly dispatch to the equivalent PennyLane class. (#771)
The Adjoint class in Catalyst has been renamed to HybridAdjoint, indicating its capability to contain a nested scope of both quantum and classical operations. Using adjoint on a single operation will now directly dispatch to the equivalent PennyLane class. (#768) (#802)
Add support to use a locally cloned PennyLane Lightning repository with the runtime. (#732)
The qjit_device.py and preprocessing.py modules have been refactored into the sub-package catalyst.device. (#721)
The ag_autograph.py and autograph.py modules have been refactored into the sub-package catalyst.autograph. (#722)
Callback refactoring. This refactoring creates the classes FlatCallable and MemrefCallable. (#742)
The FlatCallable class is a Callable that is initialized by providing some parameters and kwparameters that match the expected shapes that will be received at the call site. Instead of taking shaped *args and **kwargs, it receives flattened arguments. The flattened arguments are unflattened with the shapes with which the function was initialized. The FlatCallable return values will always be flattened before returning to the caller.
The MemrefCallable is a subclass of FlatCallable. It takes a result type parameter during initialization that corresponds to the expected return type. This class is expected to be called only from the Catalyst runtime. It expects all arguments to be void* pointers to memrefs. These void* pointers are cast to MemrefStructDescriptors using ctypes, then converted to numpy arrays, and finally to jax arrays. These flat jax arrays are then sent to the FlatCallable. Again, MemrefCallable is expected to be called only from within the Catalyst runtime, and its return values match those expected by the Catalyst runtime.
This separation allows for a better separation of concerns, provides a nicer interface and allows for multiple MemrefCallable to be defined for a single callback, which is necessary for custom gradients of pure_callbacks.
A new
catalyst::gradient::GradientOpInterface is available when querying the gradient method in the MLIR C++ API. (#800)
catalyst::gradient::GradOp, ValueAndGradOp, JVPOp, and VJPOp now inherit traits from this new GradientOpInterface. The supported methods are now getMethod(), getCallee(), getDiffArgIndices(), getDiffArgIndicesAttr(), getFiniteDiffParam(), and getFiniteDiffParamAttr().
There are operations that could potentially be used as GradOp, ValueAndGradOp, JVPOp or VJPOp
. When trying to get the gradient method, instead of doingauto gradOp = dyn_cast<GradOp>(op); auto jvpOp = dyn_cast<JVPOp>(op); auto vjpOp = dyn_cast<VJPOp>(op); llvm::StringRef MethodName; if (gradOp) MethodName = gradOp.getMethod(); else if (jvpOp) MethodName = jvpOp.getMethod(); else if (vjpOp) MethodName = vjpOp.getMethod();
to identify which op it actually is and protect against segfaults (calling
nullptr.getMethod()
), in the new interface we just doauto gradOpInterface = cast<GradientOpInterface>(op); llvm::StringRef MethodName = gradOpInterface.getMethod();
Another advantage is that any concrete gradient operation object can behave like a
GradientOpInterface
:GradOp op; // or ValueAndGradOp op, ... auto foo = [](GradientOpInterface op){ llvm::errs() << op.getCallee(); }; foo(op); // this works!
Finally, concrete op specific methods can still be called by “reinterpret”-casting the interface back to a concrete op (provided the concrete op type is correct):
auto foo = [](GradientOpInterface op){ size_t numGradients = cast<ValueAndGradOp>(&op)->getGradients().size(); }; ValueAndGradOp op; foo(op); // this works!
Contributors
This release contains contributions from (in alphabetical order):
Ali Asadi, Lillian M.A. Frederiksen, David Ittah, Christina Lee, Erick Ochoa, Haochen Paul Wang, Lee James O’Riordan, Mehrdad Malekmohammadi, Vincent Michaud-Rioux, Mudit Pandey, Raul Torres, Sergei Mironov, Tzung-Han Juang.
Release 0.6.0¶
New features
Catalyst now supports externally hosted callbacks with parameters and return values within qjit-compiled code. This provides the ability to insert native Python code into any qjit-compiled function, allowing for the capability to include subroutines that do not yet support qjit-compilation and enhancing the debugging experience. (#540) (#596) (#610) (#650) (#649) (#661) (#686) (#689)
The following two callback functions are available:
catalyst.pure_callback
supports callbacks of pure functions. That is, functions with no side-effects that accept parameters and return values. However, the return type and shape of the function must be known in advance, and is provided as a type signature.@pure_callback def callback_fn(x) -> float: # here we call non-JAX compatible code, such # as standard NumPy return np.sin(x) @qjit def fn(x): return jnp.cos(callback_fn(x ** 2))
>>> fn(0.654) array(0.9151995)
catalyst.debug.callback
supports callbacks of functions with no return values. This makes it an easy entry point for debugging, for example via printing or logging at runtime.@catalyst.debug.callback def callback_fn(y): print("Value of y =", y) @qjit def fn(x): y = jnp.sin(x) callback_fn(y) return y ** 2
>>> fn(0.54) Value of y = 0.5141359916531132 array(0.26433582) >>> fn(1.52) Value of y = 0.998710143975583 array(0.99742195)
Note that callbacks do not currently support differentiation, and cannot be used inside functions that catalyst.grad is applied to.
More flexible runtime printing through support for format strings. (#621)
The catalyst.debug.print
function has been updated to support Python-like format strings:@qjit def cir(a, b, c): debug.print("{c} {b} {a}", a=a, b=b, c=c)
>>> cir(1, 2, 3) 3 2 1
Note that the previous functionality of the print function to print out memory reference information of variables has been moved to catalyst.debug.print_memref.
Catalyst now supports QNodes that execute on Oxford Quantum Circuits (OQC) superconducting hardware, via OQC Cloud. (#578) (#579) (#691)
To use OQC Cloud with Catalyst, simply ensure your credentials are set as environment variables, and load the oqc.cloud device to be used within your qjit-compiled workflows.
import os os.environ["OQC_EMAIL"] = "your_email" os.environ["OQC_PASSWORD"] = "your_password" os.environ["OQC_URL"] = "oqc_url" dev = qml.device("oqc.cloud", backend="lucy", shots=2012, wires=2) @qjit @qml.qnode(dev) def circuit(a: float): qml.Hadamard(0) qml.CNOT(wires=[0, 1]) qml.RX(a, wires=0) return qml.counts(wires=[0, 1]) print(circuit(0.2))
Catalyst now ships with an instrumentation feature that shows which steps are run during compilation and execution, and how long each takes. (#528) (#597)
Instrumentation can be enabled from the frontend with the catalyst.debug.instrumentation
context manager:>>> @qjit ... def expensive_function(a, b): ... return a + b >>> with debug.instrumentation("session_name", detailed=False): ... expensive_function(1, 2) [DIAGNOSTICS] Running capture walltime: 3.299 ms cputime: 3.294 ms programsize: 0 lines [DIAGNOSTICS] Running generate_ir walltime: 4.228 ms cputime: 4.225 ms programsize: 14 lines [DIAGNOSTICS] Running compile walltime: 57.182 ms cputime: 12.109 ms programsize: 121 lines [DIAGNOSTICS] Running run walltime: 1.075 ms cputime: 1.072 ms
The results will be appended to the provided file if the filename attribute is set, and printed to the console otherwise. The flag detailed determines whether individual steps in the compiler and runtime are instrumented, or whether only high-level steps like “program capture” and “compilation” are reported.
Measurements currently include wall time, CPU time, and (intermediate) program size.
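The difference between wall time and CPU time in the report above (for example in the compile step) can be reproduced with a small standard-library sketch. The `timed` context manager below is a hypothetical stand-in written for illustration; it is not the Catalyst instrumentation API.

```python
import time
from contextlib import contextmanager

@contextmanager
def timed(label, results):
    """Record wall-clock and CPU time of the enclosed block in results[label]."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    try:
        yield
    finally:
        results[label] = {
            "walltime_ms": (time.perf_counter() - wall_start) * 1e3,
            "cputime_ms": (time.process_time() - cpu_start) * 1e3,
        }

timings = {}
with timed("sleep", timings):
    time.sleep(0.01)  # waiting costs wall time but almost no CPU time
with timed("compute", timings):
    total = sum(i * i for i in range(100_000))  # costs both
```

A step whose wall time is much larger than its CPU time is typically waiting on something outside the current process, such as I/O or a subprocess.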
Improvements
AutoGraph now supports return statements inside conditionals in qjit-compiled functions. (#583)
For example, the following pattern is now supported, as long as all return values have the same type:
@qjit(autograph=True) def fn(x): if x > 0: return jnp.sin(x) return jnp.cos(x)
>>> fn(0.1) array(0.09983342) >>> fn(-0.1) array(0.99500417)
This support extends to quantum circuits:
dev = qml.device("lightning.qubit", wires=1) @qjit(autograph=True) @qml.qnode(dev) def f(x: float): qml.RX(x, wires=0) m = catalyst.measure(0) if not m: return m, qml.expval(qml.PauliZ(0)) qml.RX(x ** 2, wires=0) return m, qml.expval(qml.PauliZ(0))
>>> f(1.4) (array(False), array(1.)) >>> f(1.4) (array(True), array(0.37945176))
Note that returning results with different types or shapes within the same function, such as different observables or differently shaped arrays, is not possible.
Errors are now raised at compile time if the gradient of an unsupported function is requested. (#204)
At the moment, CompileError exceptions will be raised if at compile time it is found that code reachable from the gradient operation contains either a mid-circuit measurement, a callback, or a JAX-style custom call (which happens through the mitigation operation as well as certain JAX operations).
Catalyst now supports devices built from the new PennyLane device API. (#565) (#598) (#599) (#636) (#638) (#664) (#687)
When using the new device API, Catalyst will discard the preprocessing from the original device, replacing it with Catalyst-specific preprocessing based on the TOML file provided by the device. Catalyst also requires that provided devices specify their wires upfront.
A new compiler optimization that removes redundant chains of self-inverse operations has been added. This is done within a new MLIR pass called remove-chained-self-inverse. Currently we only match redundant Hadamard operations, but the list of supported operations can be expanded. (#630)
The catalyst.measure operation is now more lenient in the accepted type for the wires parameter. In addition to a scalar, a 1D array is also accepted as long as it only contains one element. (#623)
For example, the following is now supported:
catalyst.measure(wires=jnp.array([0]))
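The accepted inputs can be pictured with a small normalisation helper. This is a sketch in plain Python using lists and tuples in place of JAX arrays; `normalize_wire` is a hypothetical name and does not reflect how Catalyst implements the check.

```python
def normalize_wire(wires):
    """Accept a scalar or a one-element sequence and return a single wire."""
    if isinstance(wires, (list, tuple)):
        if len(wires) != 1:
            raise ValueError(f"expected exactly one wire, got {len(wires)}")
        return wires[0]
    return wires
```

Both `normalize_wire(0)` and `normalize_wire([0])` resolve to wire 0, while a sequence with more than one element is rejected.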
The compilation & execution of @qjit compiled functions can now be aborted using an interrupt signal (SIGINT). This includes using CTRL-C from a command line and the Interrupt button in a Jupyter Notebook. (#642)
The Catalyst Amazon Braket support has been updated to work with the latest version of the Amazon Braket PennyLane plugin (v1.25.0) and Amazon Braket Python SDK (v1.73.3). (#620) (#672) (#673)
Note that with this update, all declared qubits in a submitted program will always be measured, even if specific qubits were never used.
An updated quantum device specification format, TOML schema v2, is now supported by Catalyst. This allows device authors to specify properties such as native quantum control support, gate invertibility, and differentiability on a per-operation level. (#554)
For more details on the new TOML schema, please refer to the custom devices documentation.
An exception is now raised when OpenBLAS cannot be found by Catalyst during compilation. (#643)
Breaking changes
qml.sample and qml.counts now produce integer arrays for the sample array and basis state array when used without observables. (#648)
The endianness of counts in Catalyst now matches the convention of PennyLane. (#601)
catalyst.debug.print no longer supports the memref keyword argument. Please use catalyst.debug.print_memref instead. (#621)
Bug fixes
The QNode argument diff_method=None is now supported for QNodes within a qjit-compiled function. (#658)
A bug has been fixed where the C++ compiler driver was incorrectly being triggered twice. (#594)
Programs with jnp.reshape no longer fail. (#592)
A bug in the quantum adjoint routine in the compiler has been fixed, which didn’t take into account control wires on operations in all instances. (#591)
A bug in the test suite causing stochastic autograph test failures has been fixed. (#652)
Running Catalyst tests should no longer raise ResourceWarning from the use of tempfile.TemporaryDirectory. (#676)
An exception is now raised if the user has an incompatible CUDA Quantum version installed. (#707)
Internal changes
The deprecated @qfunc decorator, in use mainly by the LIT test suite, has been removed. (#679)
Catalyst now publishes a revision string under catalyst.__revision__, in addition to the existing catalyst.__version__ string. The revision contains the Git commit hash of the repository at the time of packaging, or for editable installations the active commit hash at the time of package import. (#560)
The Python interpreter is now a shared resource across the runtime. (#615)
This change allows any part of the runtime to start executing Python code through pybind.
Contributors
This release contains contributions from (in alphabetical order):
Ali Asadi, David Ittah, Romain Moyard, Sergei Mironov, Erick Ochoa Lopez, Lee James O’Riordan, Muzammiluddin Syed.
Release 0.5.0¶
New features
Catalyst now provides a QJIT compatible catalyst.vmap function, which makes it even easier to modify functions to map over inputs with additional batch dimensions. (#497) (#569)
When working with tensor/array frameworks in Python, it can be important to ensure that code is written to minimize usage of Python for loops (which can be slow and inefficient), and instead push as much of the computation through to the array manipulation library, by taking advantage of extra batch dimensions.
For example, consider the following QNode:
dev = qml.device("lightning.qubit", wires=1) @qml.qnode(dev) def circuit(x, y): qml.RX(jnp.pi * x[0] + y, wires=0) qml.RY(x[1] ** 2, wires=0) qml.RX(x[1] * x[2], wires=0) return qml.expval(qml.PauliZ(0))
>>> circuit(jnp.array([0.1, 0.2, 0.3]), jnp.pi) Array(-0.93005586, dtype=float64)
We can use catalyst.vmap to introduce additional batch dimensions to our input arguments, without needing to use a Python for loop:
>>> x = jnp.array([[0.1, 0.2, 0.3], ... [0.4, 0.5, 0.6], ... [0.7, 0.8, 0.9]]) >>> y = jnp.array([jnp.pi, jnp.pi / 2, jnp.pi / 4]) >>> qjit(vmap(circuit))(x, y) array([-0.93005586, -0.97165424, -0.6987465 ])
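Conceptually, mapping over a batch dimension is equivalent to calling the function once per entry along the leading axis of each argument and collecting the results. The toy model below shows this with plain Python lists; `naive_vmap` and `weighted_sum` are hypothetical helpers and do not reflect how catalyst.vmap is implemented.

```python
def naive_vmap(fn):
    """Map fn over the leading axis of every argument."""
    def batched(*args):
        batch_sizes = {len(arg) for arg in args}
        if len(batch_sizes) != 1:
            raise ValueError("all arguments need the same batch size")
        # One call per batch entry, results collected in order
        return [fn(*entry) for entry in zip(*args)]
    return batched

def weighted_sum(x, y):
    # Stand-in for an unbatched function of a vector x and a scalar y
    return sum(x) * y

xs = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
ys = [2.0, 3.0]
out = naive_vmap(weighted_sum)(xs, ys)
```

The output has one entry per row of `xs`, just as the batched QNode call above returns one expectation value per row of `x`.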
catalyst.vmap() has been implemented to match the behaviour of jax.vmap, so it should be a drop-in replacement in most cases. Under the hood, it automatically inserts Catalyst-compatible for loops, which will be compiled and executed outside of Python for increased performance.
Catalyst now supports compiling and executing QJIT-compiled QNodes using the CUDA Quantum compiler toolchain. (#477) (#536) (#547)
Simply import the CUDA Quantum @cudaqjit decorator to use this functionality:
from catalyst.cuda import cudaqjit
Or, if using Catalyst from PennyLane, simply specify @qml.qjit(compiler="cuda_quantum").
The following devices are available when compiling with CUDA Quantum:
softwareq.qpp: a modern C++ state-vector simulator
nvidia.custatevec: the NVIDIA CuStateVec GPU simulator (with support for multi-GPU)
nvidia.cutensornet: the NVIDIA CuTensorNet GPU simulator (with support for matrix product state)
For example:
dev = qml.device("softwareq.qpp", wires=2) @cudaqjit @qml.qnode(dev) def circuit(x): qml.RX(x[0], wires=0) qml.RY(x[1], wires=1) qml.CNOT(wires=[0, 1]) return qml.expval(qml.PauliY(0))
>>> circuit(jnp.array([0.5, 1.4])) -0.47244976756708373
Note that CUDA Quantum compilation currently does not have feature parity with Catalyst compilation; in particular, AutoGraph, control flow, differentiation, and various measurement statistics (such as probabilities and variance) are not yet supported. Classical code support is also limited.
Catalyst now supports just-in-time compilation of static (compile-time constant) arguments. (#476) (#550)
The @qjit decorator takes a new argument static_argnums, which specifies which positional arguments of the decorated function should be treated as compile-time static arguments.
This allows any hashable Python object to be passed to the function during compilation; the function will only be re-compiled if the hash value of the static arguments changes. Otherwise, re-using previous static argument values will result in no re-compilation.
@qjit(static_argnums=(1,)) def f(x, y): print(f"Compiling with y={y}") return x + y
>>> f(0.5, 0.3) Compiling with y=0.3 array(0.8) >>> f(0.1, 0.3) # no re-compilation occurs array(0.4) >>> f(0.1, 0.4) # y changes, re-compilation Compiling with y=0.4 array(0.5)
This functionality can be used to support passing arbitrary Python objects to QJIT-compiled functions, as long as they are hashable:
from dataclasses import dataclass @dataclass class MyClass: val: int def __hash__(self): return hash(str(self)) @qjit(static_argnums=(1,)) def f(x: int, y: MyClass): return x + y.val
>>> f(1, MyClass(5)) array(6) >>> f(1, MyClass(6)) # re-compilation array(7) >>> f(2, MyClass(5)) # no re-compilation array(7)
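The re-compilation behaviour in the examples above can be modelled as a cache keyed on the static arguments. The following toy decorator is a sketch for illustration only; `toy_jit` is a hypothetical name, and it stores the original Python function where a real compiler would store a compiled artifact.

```python
def toy_jit(static_argnums=()):
    """Toy model: 're-compile' only when the static arguments change."""
    def decorator(fn):
        cache = {}

        def wrapper(*args):
            # Dictionary keys must be hashable, hence the requirement
            # that static arguments are hashable
            key = tuple(args[i] for i in static_argnums)
            if key not in cache:
                wrapper.compilations += 1
                cache[key] = fn  # stand-in for a compiled artifact
            return cache[key](*args)

        wrapper.compilations = 0
        return wrapper
    return decorator

@toy_jit(static_argnums=(1,))
def f(x, y):
    return x + y

f(0.5, 0.3)
f(0.1, 0.3)  # same static value: cache hit
f(0.1, 0.4)  # new static value: cache miss
```

After the three calls, `f.compilations` is 2, matching the two "Compiling with y=..." messages in the first example above.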
Mid-circuit measurements now support post-selection and qubit reset when used with the Lightning simulators. (#491) (#507)
To specify post-selection, simply pass the postselect argument to the catalyst.measure
function:dev = qml.device("lightning.qubit", wires=1) @qjit @qml.qnode(dev) def f(): qml.Hadamard(0) m = measure(0, postselect=1) return qml.expval(qml.PauliZ(0))
Likewise, to reset a wire after mid-circuit measurement, simply specify
reset=True
:dev = qml.device("lightning.qubit", wires=1) @qjit @qml.qnode(dev) def f(): qml.Hadamard(0) m = measure(0, reset=True) return qml.expval(qml.PauliZ(0))
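The effect of the two options on the final expectation value can be sketched with a toy single-qubit model in plain Python. The helper names below (`hadamard`, `toy_measure`, `expval_z`) are hypothetical and do not use the Catalyst API; the sketch assumes that post-selection projects onto the requested outcome and that a reset returns the wire to the |0> state.

```python
import math


def hadamard(state):
    # State is a pair of amplitudes (a0, a1) for |0> and |1>.
    a0, a1 = state
    s = 1 / math.sqrt(2)
    return (s * (a0 + a1), s * (a0 - a1))


def expval_z(state):
    a0, a1 = state
    return abs(a0) ** 2 - abs(a1) ** 2


def toy_measure(state, outcome, postselect=None, reset=False):
    """Collapse onto `postselect` if given, else onto `outcome`; optionally reset."""
    result = outcome if postselect is None else postselect
    if abs(state[result]) ** 2 == 0:
        raise ValueError("requested outcome has zero probability")
    # After a reset the wire is back in |0>, whatever the outcome was.
    collapsed = (0.0, 1.0) if result == 1 and not reset else (1.0, 0.0)
    return result, collapsed


plus = hadamard((1.0, 0.0))
_, post_selected = toy_measure(plus, outcome=0, postselect=1)
_, after_reset = toy_measure(plus, outcome=1, reset=True)
print(expval_z(post_selected), expval_z(after_reset))
```

Under these assumptions, post-selecting on 1 leaves the wire in |1> (expectation value -1), while a reset leaves it in |0> (expectation value +1).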
Improvements
Catalyst now supports Python 3.12 (#532)
The JAX version used by Catalyst has been updated to
v0.4.23
. (#428)Catalyst now supports the
qml.GlobalPhase
operation. (#563)Native support for
qml.PSWAP
andqml.ISWAP
gates on Amazon Braket devices has been added. (#458)Specifically, a circuit like
dev = qml.device("braket.local.qubit", wires=2, shots=100) @qjit @qml.qnode(dev) def f(x: float): qml.Hadamard(0) qml.PSWAP(x, wires=[0, 1]) qml.ISWAP(wires=[1, 0]) return qml.probs()
would no longer decompose the
PSWAP
and ISWAP
gates.
Support for the
GlobalPhase
gate has been added to the runtime. (#563)
The
qml.BlockEncode
operator is now supported with Catalyst. (#483)Catalyst no longer relies on a TensorFlow installation for its AutoGraph functionality. Instead, the standalone
diastatic-malt
package is used and automatically installed as a dependency. (#401)The
@qjit
decorator will remember previously compiled functions when the PyTree metadata of arguments changes, in addition to remembering compiled functions when static arguments change. (#522) The following example will no longer trigger a third compilation:
@qjit def func(x): print("compiling") return x
>>> func([1,]); # list compiling >>> func((2,)); # tuple compiling >>> func([3,]); # list
Note, however, that in order to keep overheads low, changing the argument type or shape (in a promotion-incompatible way) may override a previously stored function (with identical PyTree metadata and static argument values):
@qjit def func(x): print("compiling") return x
>>> func(jnp.array(1)); # scalar compiling >>> func(jnp.array([2.])); # 1-D array compiling >>> func(jnp.array(3)); # scalar compiling
Catalyst gradient functions (
grad
,jacobian
,vjp
, andjvp
) now support being applied to functions that use (nested) container types as inputs and outputs. This includes lists and dictionaries, as well as any data structure implementing the PyTree protocol. (#500) (#501) (#508) (#549)dev = qml.device("lightning.qubit", wires=1) @qml.qnode(dev) def circuit(phi, psi): qml.RY(phi, wires=0) qml.RX(psi, wires=0) return [{"expval0": qml.expval(qml.PauliZ(0))}, qml.expval(qml.PauliZ(0))] psi = 0.1 phi = 0.2
>>> qjit(jacobian(circuit, argnum=[0, 1]))(psi, phi) [{'expval0': (array(-0.0978434), array(-0.19767681))}, (array(-0.0978434), array(-0.19767681))]
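The numbers above can be cross-checked by hand. For this circuit, applying RY by the first argument and then RX by the second to |0> gives an expectation value of cos(first) * cos(second). The sketch below evaluates the two partial derivatives in plain Python at the values passed positionally in the call above (0.1 and 0.2); `analytic_jacobian` is a hypothetical helper, not a Catalyst function.

```python
import math


def expval_z(first, second):
    # <Z> after RY(first) followed by RX(second) applied to |0>.
    return math.cos(first) * math.cos(second)


def analytic_jacobian(first, second):
    # Partial derivatives of cos(first) * cos(second).
    d_first = -math.sin(first) * math.cos(second)
    d_second = -math.cos(first) * math.sin(second)
    return d_first, d_second


d_first, d_second = analytic_jacobian(0.1, 0.2)
print(d_first, d_second)
```

Both entries of the returned container hold the same pair of derivatives because the QNode returns the same observable twice.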
Support has been added for linear algebra functions which depend on computing the eigenvalues of symmetric matrices, such as
qml.math.sqrt_matrix()
. (#488)For example, you can compile
qml.math.sqrt_matrix
:@qml.qjit def workflow(A): B = qml.math.sqrt_matrix(A) return B @ A
Internally, this involves support for lowering the LAPACK eigenvalue/eigenvector routine
lapack_dsyevd
viastablehlo.custom_call
.Additional debugging functions are now available in the
catalyst.debug
module. (#529) (#522) This includes:
filter_static_args(args, static_argnums)
to remove static values from arguments using the provided index list.get_cmain(fn, *args)
to return a C program that calls a jitted function with the provided arguments.print_compilation_stage(fn, stage)
to print one of the recorded compilation stages for a JIT-compiled function.
For more details, please see the
catalyst.debug
documentation.Remove redundant copies of TOML files for
lightning.kokkos
andlightning.qubit
. (#472)lightning.kokkos
andlightning.qubit
now ship with their own TOML file. As such, we use the TOML file provided by them.Capturing quantum circuits with many gates prior to compilation is now quadratically faster (up to a factor), by removing
qextract_p
andqinst_p
from forced-order primitives. (#469)Update
AllocateQubit
andAllocateQubits
inLightningKokkosSimulator
to preserve the current state-vector before qubit re-allocations in the runtime dynamic qubits management. (#479)The PennyLane custom compiler entry point name convention has changed, necessitating a change to the Catalyst entry points. (#493)
Breaking changes
Catalyst gradient functions now match the JAX convention for the returned axes of gradients, Jacobians, VJPs, and JVPs. As a result, the returned tensor shape from various Catalyst gradient functions may differ compared to previous versions of Catalyst. (#500) (#501) (#508)
The Catalyst Python frontend has been partially refactored. The impact on user-facing functionality is minimal, but the location of certain classes and methods used by the package may have changed. (#529) (#522)
The following changes have been made:
Some debug methods and features on the QJIT class have been turned into free functions and moved to the
catalyst.debug
module, which will now appear in the public documentation. This includes compiling a program from IR, obtaining a C program that invokes a compiled function, and printing fine-grained MLIR compilation stages. The
compilation_pipelines.py
module has been renamed tojit.py
, and certain functionality has been moved out (see following items).A new module
compiled_functions.py
now manages low-level access to compiled functions.A new module
tracing/type_signatures.py
handles functionality related to managing arguments and type signatures during the tracing process. The
contexts.py
module has been moved fromutils
to the newtracing
sub-module.
Internal changes
Changes to the runtime QIR API and dependencies, to avoid symbol conflicts with other libraries that utilize QIR. (#464) (#470)
The existing Catalyst runtime implements QIR as a library that can be linked against a QIR module. This works well when Catalyst is the only implementor of QIR; however, it may generate symbol conflicts when used alongside other QIR implementations.
To avoid this, two changes were necessary:
The Catalyst runtime now has a different API from QIR instructions.
The runtime has been modified such that QIR instructions are lowered to functions where the
__quantum__
part of the function name is replaced with__catalyst__
. This prevents the possibility of symbol conflicts with other libraries that implement QIR as a library.The Catalyst runtime no longer depends on QIR runner’s stdlib.
Catalyst no longer depends on or links against QIR runner’s stdlib. Linking against it left some definitions in place that may differ from those used by third-party implementors. To prevent symbol conflicts, QIR runner’s stdlib was removed. As a result, the following functions are now defined and implemented in Catalyst’s runtime:
int64_t __catalyst__rt__array_get_size_1d(QirArray *)
int8_t *__catalyst__rt__array_get_element_ptr_1d(QirArray *, int64_t)
The following functions were removed, since the frontend does not generate them:
QirString *__catalyst__rt__qubit_to_string(QUBIT *)
QirString *__catalyst__rt__result_to_string(RESULT *)
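The renaming scheme described above amounts to a prefix substitution on symbol names. A minimal sketch in Python follows; `to_catalyst_symbol` is a hypothetical helper written for illustration and is not part of the Catalyst code base, where the renaming happens during lowering.

```python
QIR_PREFIX = "__quantum__"
CATALYST_PREFIX = "__catalyst__"


def to_catalyst_symbol(name: str) -> str:
    """Replace a leading __quantum__ with __catalyst__; leave other names alone."""
    if name.startswith(QIR_PREFIX):
        return CATALYST_PREFIX + name[len(QIR_PREFIX):]
    return name


print(to_catalyst_symbol("__quantum__rt__array_get_size_1d"))
```

Because the resulting names no longer start with `__quantum__`, they cannot collide with symbols exported by another library that implements the QIR API.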
Fix an issue when no qubit number was specified for the
qinst
primitive. The primitive now correctly deduces the number of qubits when no gate parameters are present. This change is not user facing. (#496)
Bug fixes
Fixed a bug where differentiation of sliced arrays would result in an error. (#552)
def f(x): return jax.numpy.sum(x[::2]) x = jax.numpy.array([0.1, 0.2, 0.3, 0.4])
>>> catalyst.qjit(catalyst.grad(f))(x) [1. 0. 1. 0.]
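The expected gradient can be confirmed independently of Catalyst with a forward finite difference in plain Python: only the elements picked up by the stride-2 slice contribute to the sum. `finite_difference_grad` is a hypothetical helper used only for this check.

```python
def f(x):
    # Sum of every second element, mirroring jax.numpy.sum(x[::2]).
    return sum(x[::2])


def finite_difference_grad(fn, x, eps=1e-6):
    grad = []
    for i in range(len(x)):
        shifted = list(x)
        shifted[i] += eps  # perturb one input at a time
        grad.append((fn(shifted) - fn(x)) / eps)
    return grad


x = [0.1, 0.2, 0.3, 0.4]
grad = [round(g, 6) for g in finite_difference_grad(f, x)]
print(grad)
```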
Fixed a bug where quantum control applied to a subcircuit was not correctly mapping wires, and the wires in the nested region remained unchanged. (#555)
Catalyst will no longer print a warning that recompilation is triggered when a
@qjit
decorated function with no arguments is invoked without having been compiled first, for example via the use of target="mlir"
. (#522) Fixes a bug in the configuration of dynamically shaped arrays that would cause certain programs to error with
TypeError: cannot unpack non-iterable ShapedArray object
. (#526)This is fixed by replacing the code which updates the
JAX_DYNAMIC_SHAPES
option with atransient_jax_config()
context manager which temporarily sets the value ofJAX_DYNAMIC_SHAPES
to True and then restores the original configuration value following the yield. The context manager is used bytrace_to_jaxpr()
andlower_jaxpr_to_mlir()
.Exceptions encountered in the runtime when using the
@qjit
option async_qnodes=True
will now be properly propagated to the frontend. (#447) (#510) This is done by:
changing
llvm.call
tollvm.invoke
setting async runtime tokens and values to be errors
deallocating live tokens and values
Fixes a bug when computing gradients with indexing/slicing, by fixing the scatter operation lowering when
updatedWindowsDim
is empty. (#475)Fix the issue in
LightningKokkos::AllocateQubits
with allocating too many qubit IDs on qubit re-allocation. (#473)Fixed an issue where wires was incorrectly set as
<Wires = [<WiresEnum.AnyWires: -1>]>
when usingcatalyst.adjoint
andcatalyst.ctrl
, by adding awires
property to these operations. (#480)Fix the issue with multiple lapack symbol definitions in the compiled program by updating the
stablehlo.custom_call
conversion pass. (#488)
Contributors
This release contains contributions from (in alphabetical order):
Mikhail Andrenkov, Ali Asadi, David Ittah, Tzung-Han Juang, Erick Ochoa Lopez, Romain Moyard, Raul Torres, Haochen Paul Wang.
Release 0.4.1¶
Improvements
Catalyst wheels are now packaged with OpenMP and ZStd, which avoids installing additional requirements separately in order to use pre-packaged Catalyst binaries. (#457) (#478)
Note that OpenMP support for the
lightning.kokkos
backend has been disabled on macOS x86_64, due to memory issues in the computation of Lightning’s adjoint-jacobian in the presence of multiple OMP threads.
Bug fixes
Resolve an infinite recursion in the decomposition of the
Controlled
operator whenever computing a Unitary matrix for the operator fails. (#468)Resolve a failure to generate gradient code for specific input circuits. (#439)
In this case,
jnp.mod
was used to compute wire values in a for loop, which prevented the gradient architecture from fully separating quantum and classical code. The following program is now supported:@qjit @grad @qml.qnode(dev) def f(x): def cnot_loop(j): qml.CNOT(wires=[j, jnp.mod((j + 1), 4)]) for_loop(0, 4, 1)(cnot_loop)() return qml.expval(qml.PauliZ(0))
Resolve unpredictable behaviour when importing libraries that share Catalyst’s LLVM dependency (e.g. TensorFlow). In some cases, both packages exporting the same symbols from their shared libraries can lead to process crashes and other unpredictable behaviour, since the wrong functions can be called if both libraries are loaded in the current process. The fix involves building shared libraries with hidden (macOS) or protected (Linux) symbol visibility by default, exporting only what is necessary. (#465)
Resolve a failure to find the SciPy OpenBLAS library when running Catalyst, due to a different SciPy version being used to build Catalyst than to run it. (#471)
Resolve a memory leak in the runtime stemming from missing calls to device destructors at the end of programs. (#446)
Contributors
This release contains contributions from (in alphabetical order):
Ali Asadi, David Ittah.
Release 0.4.0¶
New features
Catalyst is now accessible directly within the PennyLane user interface, once Catalyst is installed, allowing easy access to Catalyst just-in-time functionality.
Through the use of the
qml.qjit
decorator, entire workflows can be JIT compiled down to a machine binary on first-function execution, including both quantum and classical processing. Subsequent calls to the compiled function will execute the previously-compiled binary, resulting in significant performance improvements.import pennylane as qml dev = qml.device("lightning.qubit", wires=2) @qml.qjit @qml.qnode(dev) def circuit(theta): qml.Hadamard(wires=0) qml.RX(theta, wires=1) qml.CNOT(wires=[0, 1]) return qml.expval(qml.PauliZ(wires=1))
>>> circuit(0.5) # the first call, compilation occurs here array(0.) >>> circuit(0.5) # the precompiled quantum function is called array(0.)
Currently, PennyLane supports the Catalyst hybrid compiler with the
qml.qjit
decorator, which directly aliases Catalyst’scatalyst.qjit
.In addition to the above
qml.qjit
integration, the following native PennyLane functions can now be used with theqjit
decorator:qml.adjoint
,qml.ctrl
,qml.grad
,qml.jacobian
,qml.vjp
,qml.jvp
, qml.while_loop
, qml.for_loop
, and qml.cond
. These will alias to the corresponding Catalyst functions when used within aqjit
context.For more details on these functions, please refer to the PennyLane compiler documentation and compiler module documentation.
Just-in-time compiled functions now support asynchronous execution of QNodes. (#374) (#381) (#420) (#424) (#433)
Simply specify
async_qnodes=True
when using the@qjit
decorator to enable the async execution of QNodes. Currently, asynchronous execution is only supported bylightning.qubit
andlightning.kokkos
.Asynchronous execution will be most beneficial for just-in-time compiled functions that contain — or generate — multiple QNodes.
For example,
dev = qml.device("lightning.qubit", wires=2) @qml.qnode(device=dev) def circuit(params): qml.RX(params[0], wires=0) qml.RY(params[1], wires=1) qml.CNOT(wires=[0, 1]) return qml.expval(qml.PauliZ(wires=0)) @qjit(async_qnodes=True) def multiple_qnodes(params): x = jnp.sin(params) y = jnp.cos(params) z = jnp.array([circuit(x), circuit(y)]) # will be executed in parallel return circuit(z)
>>> multiple_qnodes(jnp.array([1.0, 2.0])) 1.0
Here, the first two circuit executions will occur in parallel across multiple threads, as their execution can occur independently.
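The dependency structure of `multiple_qnodes` can be illustrated without Catalyst using Python's standard thread pool. In this sketch `toy_circuit` is an arbitrary classical stand-in for the QNode, so its numbers are not meant to reproduce the output above; the point is only that the first two calls are independent while the third must wait for both.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def toy_circuit(params):
    # Arbitrary classical stand-in for a QNode evaluation.
    return math.cos(params[0])


def multiple_qnodes(params):
    x = [math.sin(p) for p in params]
    y = [math.cos(p) for p in params]
    # The first two evaluations are independent, so they may run concurrently.
    with ThreadPoolExecutor(max_workers=2) as pool:
        future_x = pool.submit(toy_circuit, x)
        future_y = pool.submit(toy_circuit, y)
        z = [future_x.result(), future_y.result()]
    # The final evaluation depends on both results and runs afterwards.
    return toy_circuit(z)


result = multiple_qnodes([1.0, 2.0])
```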
Preliminary support for PennyLane transforms has been added. (#280)
@qjit @qml.transforms.split_non_commuting @qml.qnode(dev) def circuit(x): qml.RX(x,wires=0) return [qml.expval(qml.PauliY(0)), qml.expval(qml.PauliZ(0))]
>>> circuit(0.4) [array(-0.51413599), array(0.85770868)]
Currently, most PennyLane transforms will work with Catalyst as long as:
The circuit does not include any Catalyst-specific features, such as Catalyst control flow or measurement,
The QNode returns only lists of measurement processes,
AutoGraph is disabled, and
The transformation does not require or depend on the numeric value of dynamic variables.
Catalyst now supports just-in-time compilation of dynamically-shaped arrays. (#366) (#386) (#390) (#411)
The
@qjit
decorator can now be used to compile functions that accept or contain tensors whose dimensions are not known at compile time; runtime execution with different shapes is supported without recompilation. In addition, standard tensor initialization functions
jax.numpy.ones
,jnp.zeros
, andjnp.empty
now accept dynamic variables (where the value is only known at runtime).@qjit def func(size: int): return jax.numpy.ones([size, size], dtype=float)
>>> func(3) [[1. 1. 1.] [1. 1. 1.] [1. 1. 1.]]
When passing tensors as arguments to compiled functions, the
abstracted_axes
keyword argument to the@qjit
decorator can be used to specify which axes of the input arguments should be treated as abstract (and thus avoid recompilation).For example, without specifying
abstracted_axes
, the following sum_fn
function would recompile each time an array of different size is passed as an argument:>>> @qjit >>> def sum_fn(x): >>> return jnp.sum(x) >>> sum_fn(jnp.array([1])) # Compilation happens here. >>> sum_fn(jnp.array([1, 1])) # And here!
By passing
abstracted_axes
, we can specify that the first axis of the first argument is to be treated as dynamic during initial compilation: >>> @qjit(abstracted_axes={0: "n"}) >>> def sum_fn(x): >>> return jnp.sum(x) >>> sum_fn(jnp.array([1])) # Compilation happens here. >>> sum_fn(jnp.array([1, 1])) # No need to recompile.
Note that dynamic arrays in control-flow primitives (such as loops) are not yet supported.
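One way to picture why abstracted axes avoid recompilation is to think of the compiled-function cache as keyed on the argument shape, with abstracted axes replaced by a symbolic name. The `signature` helper below is a hypothetical illustration of that idea, not Catalyst's actual cache key.

```python
def signature(shape, abstracted_axes=None):
    """Build a cache key from a shape, replacing abstracted axes by their name."""
    abstracted_axes = abstracted_axes or {}
    return tuple(abstracted_axes.get(axis, size) for axis, size in enumerate(shape))


# Without abstraction, arrays of different length produce different keys.
concrete_match = signature((1,)) == signature((2,))
# With axis 0 abstracted as "n", both arrays share a single key.
abstract_match = signature((1,), {0: "n"}) == signature((2,), {0: "n"})
print(concrete_match, abstract_match)
```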
Error mitigation using the zero-noise extrapolation method is now available through the
catalyst.mitigate_with_zne
transform. (#324) (#414)For example, given a noisy device (such as noisy hardware available through Amazon Braket):
dev = qml.device("noisy.device", wires=2) @qml.qnode(device=dev) def circuit(x, n): @for_loop(0, n, 1) def loop_rx(i): qml.RX(x, wires=0) loop_rx() qml.Hadamard(wires=0) qml.RZ(x, wires=0) loop_rx() qml.RZ(x, wires=0) qml.CNOT(wires=[1, 0]) qml.Hadamard(wires=1) return qml.expval(qml.PauliY(wires=0)) @qjit def mitigated_circuit(args, n): s = jax.numpy.array([1, 2, 3]) return mitigate_with_zne(circuit, scale_factors=s)(args, n)
>>> mitigated_circuit(0.2, 5) 0.5655341100116512
In addition, a mitigation dialect has been added to the MLIR layer of Catalyst. It contains a Zero Noise Extrapolation (ZNE) operation, with a lowering to a global folded circuit.
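The idea behind zero-noise extrapolation can be sketched in a few lines of plain Python: evaluate the circuit at several noise scale factors, fit a curve through the results, and read off its value at scale zero. The sketch below uses polynomial (Richardson-style) extrapolation and a made-up noise model; both are assumptions for illustration, and the extrapolation used by `mitigate_with_zne` may differ.

```python
def extrapolate_to_zero(scale_factors, values):
    """Evaluate at zero the polynomial interpolating (scale_factors, values)."""
    estimate = 0.0
    for i, (s_i, v_i) in enumerate(zip(scale_factors, values)):
        weight = 1.0
        for j, s_j in enumerate(scale_factors):
            if j != i:
                weight *= s_j / (s_j - s_i)  # Lagrange basis polynomial at 0
        estimate += weight * v_i
    return estimate


def noisy_expval(scale):
    # Hypothetical noise model: ideal value 1.0, degraded as the scale grows.
    return 1.0 - 0.1 * scale + 0.01 * scale ** 2


scales = [1, 2, 3]
mitigated = extrapolate_to_zero(scales, [noisy_expval(s) for s in scales])
print(mitigated)
```

With three scale factors the interpolating polynomial is quadratic, so this toy noise model is recovered exactly; real device noise will generally leave a residual error.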
Improvements
The three backend devices provided with Catalyst,
lightning.qubit
,lightning.kokkos
, andbraket.aws
, are now dynamically loaded at runtime. (#343) (#400)This takes advantage of the new backend plugin system provided in Catalyst v0.3.2, and allows the devices to be packaged separately from the runtime CAPI. Provided backend devices are now loaded at runtime, instead of being linked at compile time.
For more details on the backend plugin system, see the custom devices documentation.
Finite-shot measurement statistics (
expval
,var
, andprobs
) are now supported for thelightning.qubit
andlightning.kokkos
devices. Previously, exact statistics were returned even when finite shots were specified. (#392) (#410)>>> dev = qml.device("lightning.qubit", wires=2, shots=100) >>> @qjit >>> @qml.qnode(dev) >>> def circuit(x): >>> qml.RX(x, wires=0) >>> return qml.probs(wires=0) >>> circuit(0.54) array([0.94, 0.06]) >>> circuit(0.54) array([0.93, 0.07])
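The difference between exact and finite-shot statistics can be reproduced with a short standard-library sketch. For RX(x) applied to |0>, the exact probability of measuring 1 is sin(x/2) squared; `sample_probs` is a hypothetical helper that estimates the same quantity from a fixed number of random shots, so its output fluctuates with the seed just as the two calls above differ.

```python
import math
import random


def sample_probs(x, shots, seed=0):
    """Estimate [P(0), P(1)] for RX(x)|0> from a finite number of shots."""
    p1 = math.sin(x / 2) ** 2  # exact probability of measuring |1>
    rng = random.Random(seed)
    ones = sum(rng.random() < p1 for _ in range(shots))
    return [(shots - ones) / shots, ones / shots]


exact = [math.cos(0.54 / 2) ** 2, math.sin(0.54 / 2) ** 2]
estimate = sample_probs(0.54, shots=100)
print(exact, estimate)
```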
Catalyst gradient functions
grad
,jacobian
,jvp
, andvjp
can now be invoked from outside a@qjit
context. (#375)This simplifies the process of writing functions where compilation can be turned on and off easily by adding or removing the decorator. The functions dispatch to their JAX equivalents when the compilation is turned off.
dev = qml.device("lightning.qubit", wires=2) @qml.qnode(dev) def circuit(x): qml.RX(x, wires=0) return qml.expval(qml.PauliZ(0))
>>> grad(circuit)(0.54) # dispatches to jax.grad Array(-0.51413599, dtype=float64, weak_type=True) >>> qjit(grad(circuit))(0.54) # differentiates using Catalyst array(-0.51413599)
New
lightning.qubit
configuration options are now supported via theqml.device
loader, including Markov Chain Monte Carlo sampling support. (#369)dev = qml.device("lightning.qubit", wires=2, shots=1000, mcmc=True) @qml.qnode(dev) def circuit(x): qml.RX(x, wires=0) return qml.expval(qml.PauliZ(0))
>>> circuit(0.54) array(0.856)
Improvements have been made to the runtime and quantum MLIR dialect in order to support asynchronous execution.
The runtime now supports multiple active devices managed via a device pool. The new
RTDevice
data-class andRTDeviceStatus
along with thethread_local
device instance pointer enable the runtime to better scope the lifetime of device instances concurrently. With these changes, one can create multiple active devices and execute multiple programs in a multithreaded environment. (#381)The ability to dynamically release devices has been added via
DeviceReleaseOp
in the Quantum MLIR dialect. This is lowered to the__quantum__rt__device_release()
runtime instruction, which updates the status of the device instance fromActive
toInactive
. The runtime will reuse this deactivated instance instead of creating a new one automatically at runtime in a multi-QNode workflow when another device with identical specifications is requested. (#381)The
DeviceOp
definition in the Quantum MLIR dialect has been updated to lower a tuple of device information('lib', 'name', 'kwargs')
to a single device initialization call__quantum__rt__device_init(int8_t *, int8_t *, int8_t *)
. This allows the runtime to initialize device instances without keeping partial information of the device (#396)
The quantum adjoint compiler routine has been extended to support function calls that affect the quantum state within an adjoint region. Note that the function may only provide a single result consisting of the quantum register. By itself this provides no user-facing changes, but compiler pass developers may now generate quantum adjoint operations around a block of code containing function calls as well as quantum operations and control flow operations. (#353)
The allocation and deallocation operations in MLIR (
AllocOp
,DeallocOp
) now follow simple value semantics for qubit register values, instead of modelling memory in the MLIR trait system. Similarly, the frontend generates proper value semantics by deallocating the final register value.The change enables functions at the MLIR level to accept and return quantum register values, which would otherwise not be correctly identified as aliases of existing register values by the bufferization system. (#360)
Breaking changes
Third party devices must now provide a configuration TOML file, in order to specify their supported operations, measurements, and features for Catalyst compatibility. For more information please visit the Custom Devices section in our documentation. (#369)
Bug fixes
Resolves a bug in the compiler’s differentiation engine that results in a segmentation fault when attempting to differentiate non-differentiable quantum operations. The fix ensures that all existing quantum operation types are removed during gradient passes that extract classical code from a QNode function. It also adds a verification step that will raise an error if a gradient pass cannot successfully eliminate all quantum operations for such functions. (#397)
Resolves a bug that caused unpredictable behaviour when printing string values with the
debug.print
function. The issue was caused by non-null-terminated strings. (#418)
Contributors
This release contains contributions from (in alphabetical order):
Ali Asadi, David Ittah, Romain Moyard, Sergei Mironov, Erick Ochoa Lopez, Shuli Shu.
Release 0.3.2¶
New features
The experimental AutoGraph feature now supports Python
while
loops, allowing native Python loops to be captured and compiled with Catalyst. (#318)dev = qml.device("lightning.qubit", wires=4) @qjit(autograph=True) @qml.qnode(dev) def circuit(n: int, x: float): i = 0 while i < n: qml.RX(x, wires=i) i += 1 return qml.expval(qml.PauliZ(0))
>>> circuit(4, 0.32) array(0.94923542)
This feature extends the existing AutoGraph support for Python
for
loops andif
statements introduced in v0.3. Note that TensorFlow must be installed for AutoGraph support.For more details, please see the AutoGraph guide.
In addition to loops and conditional branches, AutoGraph now supports native Python
and
,or
andnot
operators in Boolean expressions. (#325)dev = qml.device("lightning.qubit", wires=1) @qjit(autograph=True) @qml.qnode(dev) def circuit(x: float): if x >= 0 and x < jnp.pi: qml.RX(x, wires=0) return qml.probs()
>>> circuit(0.43) array([0.95448287, 0.04551713]) >>> circuit(4.54) array([1., 0.])
Note that logical Boolean operators will only be captured by AutoGraph if all operands are dynamic variables (that is, a value known only at runtime, such as a measurement result or function argument). For other use cases, it is recommended to use the
jax.numpy.logical_*
set of functions where appropriate.Debug compiled programs and print dynamic values at runtime with
debug.print
(#279) (#356) You can now print arbitrary values from your running program, whether they are arrays, constants, strings, or arbitrary Python objects. Note that while non-array Python objects will be printed at runtime, their string representation is captured at compile time, and thus will always be the same regardless of program inputs. The output for arrays optionally includes a descriptor for how the data is stored in memory (“memref”).
@qjit def func(x: float): debug.print(x, memref=True) debug.print("exit")
>>> func(jnp.array(0.43)) MemRef: base@ = 0x5629ff2b6680 rank = 0 offset = 0 sizes = [] strides = [] data = 0.43 exit
Catalyst now officially supports macOS X86_64 devices, with macOS binary wheels available for both AARCH64 and X86_64. (#347) (#313)
It is now possible to dynamically load third-party Catalyst compatible devices directly into a pre-installed Catalyst runtime on Linux. (#327)
To take advantage of this, third-party devices must implement the
Catalyst::Runtime::QuantumDevice
interface, in addition to defining the following method:extern "C" Catalyst::Runtime::QuantumDevice* getCustomDevice() { return new CustomDevice(); }
This support can also be integrated into existing PennyLane Python devices that inherit from the
QuantumDevice
class, by defining theget_c_interface
static method.For more details, see the custom devices documentation.
Improvements
Return values of conditional functions no longer need to be of exactly the same type. Type promotion is automatically applied to branch return values if their types don’t match. (#333)
@qjit def func(i: int, f: float): @cond(i < 3) def cond_fn(): return i @cond_fn.otherwise def otherwise(): return f return cond_fn()
>>> func(1, 4.0) array(1.0)
Automatic type promotion across conditional branches also works with AutoGraph:
@qjit(autograph=True) def func(i: int, f: float): if i < 3: i = i else: i = f return i
>>> func(1, 4.0) array(1.0)
AutoGraph now supports converting functions even when they are invoked through functional wrappers such as
adjoint
,ctrl
,grad
,jacobian
, etc. (#336)For example, the following should now succeed:
def inner(n): for i in range(n): qml.T(i) @qjit(autograph=True) @qml.qnode(dev) def f(n: int): adjoint(inner)(n) return qml.state()
To prepare for Catalyst’s frontend being integrated with PennyLane, the appropriate plugin entry point interface has been added to Catalyst. (#331)
For any compiler packages seeking to be registered in PennyLane, the
entry_points
metadata under the group name pennylane.compilers
must be added, with the following entry points:context
: Path to the compilation evaluation context manager. This context manager should have the methodcontext.is_tracing()
, which returns True if called within a program that is being traced or captured.ops
: Path to the compiler operations module. This operations module may contain compiler specific versions of PennyLane operations. Within a JIT context, PennyLane operations may dispatch to these.qjit
: Path to the JIT compiler decorator provided by the compiler. This decorator should have the signatureqjit(fn, *args, **kwargs)
, wherefn
is the function to be compiled.
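As a rough sketch, the three entry points listed above could be declared in packaging metadata along the following lines. The package and module paths (`my_compiler...`) are hypothetical placeholders; only the group name and the entry point names `context`, `ops`, and `qjit` are taken from the description above.

```python
# Hypothetical package and module names; a mapping of this form can be
# passed as the `entry_points` argument of setuptools' `setup()`.
entry_points = {
    "pennylane.compilers": [
        "context = my_compiler.contexts:EvaluationContext",
        "ops = my_compiler:ops",
        "qjit = my_compiler:qjit",
    ]
}

# Extract the registered names to check that all three are present.
names = [spec.split("=")[0].strip() for spec in entry_points["pennylane.compilers"]]
print(names)
```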
The compiler driver diagnostic output has been improved, and now includes failing IR as well as the names of failing passes. (#349)
The scatter operation in the Catalyst dialect now uses an SCF for loop to avoid ballooning the compiled code. (#307)
The
CopyGlobalMemRefPass
pass of our MLIR processing pipeline now supports dynamically shaped arrays. (#348)The Catalyst utility dialect is now included in the Catalyst MLIR C-API. (#345)
Fix an issue with the AutoGraph conversion system that would prevent the fallback to Python from working correctly in certain instances. (#352)
The following type of code is now supported:
@qjit(autograph=True) def f(): l = jnp.array([1, 2]) for _ in range(2): l = jnp.kron(l, l) return l
Catalyst now supports
jax.numpy.polyfit
inside a qjitted function. (#367)Catalyst now supports custom calls (including the one from HLO). We added support in MLIR (operation, bufferization and lowering). In the
lib_custom_calls
, developers then implement their custom calls and use external functions directly (e.g. LAPACK). The OpenBLAS library is taken from SciPy and linked in Catalyst, so any function from it can be used. (#367)
Breaking changes
The axis ordering for
catalyst.jacobian
is updated to matchjax.jacobian
. Assuming we have parameters of shape[a,b]
and results of shape[c,d]
, the returned Jacobian will now have shape[c, d, a, b]
instead of[a, b, c, d]
. (#283)
Bug fixes
An upstream change in the PennyLane-Lightning project was addressed to prevent compilation issues in the
StateVectorLQubitDynamic
class in the runtime. The issue was introduced in #499. (#322)The
requirements.txt
file to build Catalyst from source has been updated with a minimum pip version,>=22.3
. Previous versions of pip are unable to perform editable installs when the system-wide site-packages are read-only, even when the--user
flag is provided. (#311)The frontend has been updated to make it compatible with PennyLane
MeasurementProcess
objects now being PyTrees in PennyLane version 0.33. (#315)
Contributors
This release contains contributions from (in alphabetical order):
Ali Asadi, David Ittah, Sergei Mironov, Romain Moyard, Erick Ochoa Lopez.
Release 0.3.1¶
New features
The experimental AutoGraph feature now supports Python
for
loops, allowing native Python loops to be captured and compiled with Catalyst. (#258)dev = qml.device("lightning.qubit", wires=n) @qjit(autograph=True) @qml.qnode(dev) def f(n): for i in range(n): qml.Hadamard(wires=i) return qml.expval(qml.PauliZ(0))
This feature extends the existing AutoGraph support for Python
if
statements introduced in v0.3. Note that TensorFlow must be installed for AutoGraph support.The quantum control operation can now be used in conjunction with Catalyst control flow, such as loops and conditionals, via the new
catalyst.ctrl
function. (#282)Similar in behaviour to the
qml.ctrl
control modifier from PennyLane,catalyst.ctrl
can additionally wrap around quantum functions which contain control flow, such as the Catalystcond
,for_loop
, andwhile_loop
primitives.@qjit @qml.qnode(qml.device("lightning.qubit", wires=4)) def circuit(x): @for_loop(0, 3, 1) def repeat_rx(i): qml.RX(x / 2, wires=i) catalyst.ctrl(repeat_rx, control=3)() return qml.expval(qml.PauliZ(0))
>>> circuit(0.2) array(1.)
Catalyst now supports JAX’s
array.at[index]
notation for array element assignment and updating. (#273)@qjit def add_multiply(l: jax.core.ShapedArray((3,), dtype=float), idx: int): res = l.at[idx].multiply(3) res2 = l.at[idx].add(2) return res + res2 res = add_multiply(jnp.array([0, 1, 2]), 2)
>>> res [0, 2, 10]
For more details on available methods, see the JAX documentation.
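The arithmetic in the `add_multiply` example can be traced with plain Python lists. The key point is that each update is functional: it returns a new array and leaves the input unchanged, so both updates start from the original values. `at_multiply` and `at_add` are hypothetical stand-ins for `l.at[idx].multiply` and `l.at[idx].add`.

```python
def at_multiply(values, idx, factor):
    # Functional update: copy, modify the copy, return it.
    out = list(values)
    out[idx] *= factor
    return out


def at_add(values, idx, amount):
    out = list(values)
    out[idx] += amount
    return out


l = [0, 1, 2]
res = at_multiply(l, 2, 3)  # [0, 1, 6]
res2 = at_add(l, 2, 2)      # [0, 1, 4]
total = [a + b for a, b in zip(res, res2)]
print(total)  # [0, 2, 10]
```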
Improvements
The Lightning backend device has been updated to work with the new PL-Lightning monorepo. (#259) (#277)
A new compiler driver has been implemented in C++. This improves compile-time performance by avoiding round-tripping, which is when the entire program being compiled is dumped to a textual form and re-parsed by another tool.
This is also a requirement for providing custom metadata at the LLVM level, which is necessary for better integration with tools like Enzyme. Finally, this makes it more natural to improve error messages originating from C++ when compared to the prior subprocess-based approach. (#216)
Support the
braket.devices.Devices
enum class ands3_destination_folder
device options for AWS Braket remote devices. (#278)Improvements have been made to the build process, including avoiding unnecessary processes such as removing
opt
and downloading the wheel. (#298)Remove a linker warning about duplicate
rpath
s when Catalyst wheels are installed on macOS. (#314)
Bug fixes
Fix incompatibilities with GCC on Linux introduced in v0.3.0 when compiling user programs. Due to these, Catalyst v0.3.0 only works when clang is installed in the user environment.
Remove undocumented package dependency on the zlib/zstd compression library. (#308)
Fix filesystem issue when compiling multiple functions with the same name and
keep_intermediate=True
. (#306)Add support for applying the
adjoint
operation toQubitUnitary
gates.QubitUnitary
was not able to beadjoint
ed when the variable holding the unitary matrix might change. This can happen, for instance, inside a for loop. To solve this issue, the unitary matrix is stored in the array list via pushes and pops. The unitary matrix is later reconstructed from the array list and QubitUnitary
can be executed in theadjoint
ed context. (#304) (#310)
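The push/pop idea described above can be illustrated outside of Catalyst. The following is a minimal pure-Python sketch, not Catalyst's implementation: the helper names (loop_unitary, forward, adjoint_pass) and the 2x2 matrices are hypothetical. The forward loop pushes each iteration's matrix onto a stack, and the adjoint pass pops the matrices in reverse order and applies their conjugate transposes.

```python
import cmath
import math


def matmul(a, b):
    """Multiply two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(u):
    """Conjugate transpose of a 2x2 matrix."""
    return [[u[j][i].conjugate() for j in range(2)] for i in range(2)]


def loop_unitary(i):
    """Hypothetical loop body: the unitary changes on every iteration."""
    c, s = math.cos(0.3 * (i + 1)), math.sin(0.3 * (i + 1))
    phase = cmath.exp(0.2j * i)
    return [[c, -s * phase], [s * phase.conjugate(), c]]


def forward(op, n, stack):
    """Apply U_0, ..., U_{n-1} in order, pushing each matrix onto the stack."""
    for i in range(n):
        u = loop_unitary(i)
        stack.append(u)  # push
        op = matmul(u, op)
    return op


def adjoint_pass(op, stack):
    """Pop the stored matrices (last one first) and apply their adjoints."""
    while stack:
        u = stack.pop()  # pop
        op = matmul(dagger(u), op)
    return op


identity = [[1 + 0j, 0j], [0j, 1 + 0j]]
stack = []
total = forward(identity, 4, stack)
pushed = len(stack)
restored = adjoint_pass(total, stack)  # equals the identity up to rounding
```

Because the stack returns matrices in last-in, first-out order, the adjoint pass undoes the loop even though each iteration used a different matrix.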
Contributors
This release contains contributions from (in alphabetical order):
Ali Asadi, David Ittah, Erick Ochoa Lopez, Jacob Mai Peng, Sergei Mironov, Romain Moyard.
Release 0.3.0¶
New features
Catalyst now officially supports macOS ARM devices, such as Apple M1/M2 machines, with macOS binary wheels available on PyPI. For more details on the changes involved to support macOS, please see the improvements section. (#229) (#232) (#233) (#234)
Write Catalyst-compatible programs with native Python conditional statements. (#235)
AutoGraph is a new, experimental feature that automatically converts Python conditional statements like if, else, and elif into their equivalent functional forms provided by Catalyst (such as catalyst.cond).
This feature is currently opt-in, and requires setting the autograph=True flag in the qjit decorator:

    dev = qml.device("lightning.qubit", wires=1)

    @qjit(autograph=True)
    @qml.qnode(dev)
    def f(x):
        if x < 0.5:
            qml.RY(jnp.sin(x), wires=0)
        else:
            qml.RX(jnp.cos(x), wires=0)

        return qml.expval(qml.PauliZ(0))
The implementation is based on the AutoGraph module from TensorFlow, and requires a working TensorFlow installation be available. In addition, Python loops (for and while) are not yet supported, and do not work in AutoGraph mode.
Note that there are some caveats when using this feature, especially around the use of global variables or object mutation inside of methods. A functional style is always recommended when using qjit or AutoGraph.
The quantum adjoint operation can now be used in conjunction with Catalyst control flow, such as loops and conditionals. For this purpose a new instruction, catalyst.adjoint, has been added. (#220)
catalyst.adjoint can wrap around quantum functions which contain the Catalyst cond, for_loop, and while_loop primitives. Previously, the usage of qml.adjoint on functions with these primitives would result in decomposition errors. Note that a future release of Catalyst will merge the behaviour of catalyst.adjoint into qml.adjoint for convenience.

    dev = qml.device("lightning.qubit", wires=3)

    @qjit
    @qml.qnode(dev)
    def circuit(x):

        @for_loop(0, 3, 1)
        def repeat_rx(i):
            qml.RX(x / 2, wires=i)

        adjoint(repeat_rx)()

        return qml.expval(qml.PauliZ(0))

    >>> circuit(0.2)
    array(0.99500417)
Additionally, the ability to natively represent the adjoint construct in Catalyst’s program representation (IR) was added.
QJIT-compiled programs now support (nested) container types as inputs and outputs of compiled functions. This includes lists and dictionaries, as well as any data structure implementing the PyTree protocol. (#215) (#221)
For example, a program that accepts and returns a mix of dictionaries, lists, and tuples:
    @qjit
    def workflow(params1, params2):
        res1 = params1["a"][0][0] + params2[1]
        return {"y1": jnp.sin(res1), "y2": jnp.cos(res1)}

    >>> params1 = {"a": [[0.1], 0.2]}
    >>> params2 = (0.6, 0.8)
    >>> workflow(params1, params2)
    {'y1': array(0.78332691), 'y2': array(0.62160997)}
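To make the PyTree idea concrete, here is a minimal pure-Python sketch of flattening a nested container into a flat list of leaves plus a structure description, and rebuilding it afterwards. It only illustrates the concept: the flatten and unflatten helpers are hypothetical, handle only dicts, lists, and tuples, and are not the implementation used by JAX or Catalyst (JAX exposes the real functionality through jax.tree_util).

```python
def flatten(tree):
    """Return (leaves, treedef) for nested dicts, lists, and tuples."""
    if isinstance(tree, dict):
        keys = sorted(tree)  # fixed key order so flattening is deterministic
        parts = [flatten(tree[k]) for k in keys]
        leaves = [leaf for ls, _ in parts for leaf in ls]
        return leaves, ("dict", keys, [d for _, d in parts])
    if isinstance(tree, (list, tuple)):
        parts = [flatten(v) for v in tree]
        leaves = [leaf for ls, _ in parts for leaf in ls]
        return leaves, (type(tree).__name__, None, [d for _, d in parts])
    return [tree], None  # anything else is treated as a leaf


def unflatten(treedef, leaves):
    """Rebuild the nested container from its structure and a flat leaf list."""
    it = iter(leaves)

    def build(d):
        if d is None:
            return next(it)
        kind, keys, children = d
        values = [build(c) for c in children]
        if kind == "dict":
            return dict(zip(keys, values))
        return tuple(values) if kind == "tuple" else values

    return build(treedef)


tree = ({"a": [[0.1], 0.2]}, (0.6, 0.8))
leaves, treedef = flatten(tree)
rebuilt = unflatten(treedef, leaves)
```

A compiled function only needs to see the flat list of leaves; the structure description lets the nested inputs and outputs be reassembled on the Python side.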
Compile-time backpropagation of arbitrary hybrid programs is now supported, via integration with Enzyme AD. (#158) (#193) (#224) (#225) (#239) (#244)
This allows catalyst.grad to differentiate hybrid functions that contain both classical pre-processing (inside & outside of QNodes), QNodes, as well as classical post-processing (outside of QNodes) via a combination of backpropagation and quantum gradient methods.
The default for the differentiation method attribute in catalyst.grad has been changed to "auto", which performs Enzyme-based reverse mode AD on classical code, in conjunction with the quantum diff_method specified on each QNode:

    dev = qml.device("lightning.qubit", wires=1)

    @qml.qnode(dev, diff_method="parameter-shift")
    def circuit(theta):
        qml.RX(jnp.exp(theta ** 2) / jnp.cos(theta / 4), wires=0)
        return qml.expval(qml.PauliZ(wires=0))

    >>> grad = qjit(catalyst.grad(circuit, method="auto"))
    >>> grad(jnp.pi)
    array(0.05938718)
The reworked differentiation pipeline means you can now compute exact derivatives of programs with both classical pre- and post-processing, as shown below:
    @qml.qnode(qml.device("lightning.qubit", wires=1), diff_method="adjoint")
    def circuit(theta):
        qml.RX(jnp.exp(theta ** 2) / jnp.cos(theta / 4), wires=0)
        return qml.expval(qml.PauliZ(wires=0))

    def loss(theta):
        return jnp.pi / jnp.tanh(circuit(theta))

    @qjit
    def grad_loss(theta):
        return catalyst.grad(loss)(theta)

    >>> grad_loss(1.0)
    array(-1.90958669)
You can also use multiple QNodes with different differentiation methods:
    @qml.qnode(qml.device("lightning.qubit", wires=1), diff_method="parameter-shift")
    def circuit_A(params):
        qml.RX(jnp.exp(params[0] ** 2) / jnp.cos(params[1] / 4), wires=0)
        return qml.probs()

    @qml.qnode(qml.device("lightning.qubit", wires=1), diff_method="adjoint")
    def circuit_B(params):
        qml.RX(jnp.exp(params[1] ** 2) / jnp.cos(params[0] / 4), wires=0)
        return qml.expval(qml.PauliZ(wires=0))

    def loss(params):
        return jnp.prod(circuit_A(params)) + circuit_B(params)

    @qjit
    def grad_loss(theta):
        return catalyst.grad(loss)(theta)

    >>> grad_loss(jnp.array([1.0, 2.0]))
    array([ 0.57367285, 44.4911605 ])
And you can differentiate purely classical functions as well:
    def square(x: float):
        return x ** 2

    @qjit
    def dsquare(x: float):
        return catalyst.grad(square)(x)

    >>> dsquare(2.3)
    array(4.6)
Note that the current implementation of reverse mode AD is restricted to 1st order derivatives, but catalyst.grad(method="fd") is still available to perform a finite differences approximation of any differentiable function.
Add support for the new PennyLane arithmetic operators. (#250)
PennyLane is in the process of replacing Hamiltonian and Tensor observables with a set of general arithmetic operators. These consist of Prod, Sum and SProd.
By default, using dunder methods (e.g. +, -, @, *) to combine operators with scalars or other operators will create Hamiltonian and Tensor objects. However, these two classes will be deprecated in coming releases of PennyLane.
To enable the new arithmetic operators, one can use Prod, Sum, and SProd directly or activate them by calling enable_new_opmath at the beginning of your PennyLane program.

    dev = qml.device("lightning.qubit", wires=2)

    @qjit
    @qml.qnode(dev)
    def circuit(x: float, y: float):
        qml.RX(x, wires=0)
        qml.RX(y, wires=1)
        qml.CNOT(wires=[0, 1])
        return qml.expval(0.2 * qml.PauliX(wires=0) - 0.4 * qml.PauliY(wires=1))

    >>> qml.operation.enable_new_opmath()
    >>> qml.operation.active_new_opmath()
    True
    >>> circuit(np.pi / 4, np.pi / 2)
    array(0.28284271)
Improvements
Better support for Hamiltonian observables:
Allow Hamiltonian observables with integer coefficients. (#248)
For example, compiling the following circuit wasn’t previously allowed, but is now supported in Catalyst:
    dev = qml.device("lightning.qubit", wires=2)

    @qjit
    @qml.qnode(dev)
    def circuit(x: float, y: float):
        qml.RX(x, wires=0)
        qml.RY(y, wires=1)

        coeffs = [1, 2]
        obs = [qml.PauliZ(0), qml.PauliZ(1)]
        return qml.expval(qml.Hamiltonian(coeffs, obs))
Allow nested Hamiltonian observables. (#255)
    @qjit
    @qml.qnode(qml.device("lightning.qubit", wires=3))
    def circuit(x, y, coeffs1, coeffs2):
        qml.RX(x, wires=0)
        qml.RX(y, wires=1)
        qml.RY(x + y, wires=2)

        obs = [
            qml.PauliX(0) @ qml.PauliZ(1),
            qml.Hamiltonian(coeffs1, [qml.PauliZ(0) @ qml.Hadamard(2)]),
        ]

        return qml.var(qml.Hamiltonian(coeffs2, obs))
Various performance improvements:
The execution and compile time of programs has been reduced, by generating more efficient code and avoiding unnecessary optimizations. Specifically, a scalarization procedure was added to the MLIR pass pipeline, and LLVM IR compilation is now invoked with optimization level 0. (#217)
The execution time of compiled functions has been improved in the frontend. (#213)
Specifically, the following changes have been made, which leads to a small but measurable improvement when using larger matrices as inputs, or functions with many inputs:
only loading the user program library once per compilation,
generating return value types only once per compilation,
avoiding unnecessary type promotion, and
avoiding unnecessary array copies.
Peak memory utilization of a JIT compiled program has been reduced, by allowing tensors to be scheduled for deallocation. Previously, the tensors were not deallocated until the end of the call to the JIT compiled function. (#201)
Various improvements have been made to enable Catalyst to compile on macOS:
Remove unnecessary reinterpret_cast from ObsManager. Removal of these reinterpret_cast allows compilation of the runtime to succeed in macOS. macOS uses an ILP32 mode for Aarch64 where they use the full 64 bit mode but with 32 bit Integer, Long, and Pointers. This patch also changes a test file to prevent a mismatch in machines which compile using ILP32 mode. (#229)
Allow runtime to be compiled on macOS. Substitute nproc with a call to os.cpu_count() and use correct flags for ld.64. (#232)
Improve portability on the frontend to be available on macOS. Use .dylib, remove unnecessary flags, and address behaviour difference in flags. (#233)
Small compatibility changes in order for all integration tests to succeed on macOS. (#234)
Dialects can compile with older versions of clang by avoiding type mismatches. (#228)
The runtime is now built against qir-stdlib pre-built artifacts. (#236)
Small improvements have been made to the CI/CD, including fixing the Enzyme cache, generalizing caches to other operating systems, fixing the build wheel recipe, and removing references to QIR in the runtime's Makefile. (#243) (#247)
Breaking changes
Support for Python 3.8 has been removed. (#231)
The default differentiation method on grad and jacobian is reverse-mode automatic differentiation instead of finite differences. When a QNode does not have a diff_method specified, it will default to using the parameter shift method instead of finite-differences. (#244) (#271)
The JAX version used by Catalyst has been updated to v0.4.14; the minimum PennyLane version required is now v0.32. (#264)
Due to the change allowing Python container objects as inputs to QJIT-compiled functions, Python lists are no longer automatically converted to JAX arrays. (#231)
This means that indexing on lists when the index is not static will cause a TracerIntegerConversionError, consistent with JAX's behaviour.
That is, the following example is no longer supported:

    @qjit
    def f(x: list, index: int):
        return x[index]

However, if the parameter x above is a JAX or NumPy array, the compilation will continue to succeed.
The catalyst.grad function has been renamed to catalyst.jacobian and supports differentiation of functions that return multiple or non-scalar outputs. A new catalyst.grad function has been added that enforces that it is differentiating a function with a single scalar return value. (#254)
Bug fixes
Fixed an issue preventing the differentiation of qml.probs with the parameter-shift method. (#211)
Fixed the incorrect return value data-type with functions returning qml.counts. (#221)
Fix segmentation fault when differentiating a function where a quantum measurement is used multiple times by the same operation. (#242)
Contributors
This release contains contributions from (in alphabetical order):
Ali Asadi, David Ittah, Erick Ochoa Lopez, Jacob Mai Peng, Romain Moyard, Sergei Mironov.
Release 0.2.1¶
Bug fixes
Add missing OpenQASM backend in binary distribution, which relies on the latest version of the AWS Braket plugin for PennyLane to resolve dependency issues between the plugin, Catalyst, and PennyLane. The Lightning-Kokkos backend with Serial and OpenMP modes is also added to the binary distribution. #198
Return a list of decompositions when calling the decomposition method for control operations. This allows Catalyst to be compatible with upstream PennyLane. #241
Improvements
When using OpenQASM-based devices the string representation of the circuit is printed on exception. #199
Use pybind11::module interface library instead of pybind11::embed in the runtime for OpenQasm backend to avoid linking to the python library at compile time. #200
Contributors
This release contains contributions from (in alphabetical order):
Ali Asadi, David Ittah.
Release 0.2.0¶
New features
Catalyst programs can now be used inside of a larger JAX workflow which uses JIT compilation, automatic differentiation, and other JAX transforms. #96 #123 #167 #192
For example, call a Catalyst qjit-compiled function from within a JAX jit-compiled function:
    dev = qml.device("lightning.qubit", wires=1)

    @qjit
    @qml.qnode(dev)
    def circuit(x):
        qml.RX(jnp.pi * x[0], wires=0)
        qml.RY(x[1] ** 2, wires=0)
        qml.RX(x[1] * x[2], wires=0)
        return qml.probs(wires=0)

    @jax.jit
    def cost_fn(weights):
        x = jnp.sin(weights)
        return jnp.sum(jnp.cos(circuit(x)) ** 2)

    >>> cost_fn(jnp.array([0.1, 0.2, 0.3]))
    Array(1.32269195, dtype=float64)
Catalyst-compiled functions can now also be automatically differentiated via JAX, both in forward and reverse mode to first-order,
    >>> jax.grad(cost_fn)(jnp.array([0.1, 0.2, 0.3]))
    Array([0.49249037, 0.05197949, 0.02991883], dtype=float64)

as well as vectorized using jax.vmap:

    >>> jax.vmap(cost_fn)(jnp.array([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]))
    Array([1.32269195, 1.53905377], dtype=float64)

In particular, this allows for a reduction in boilerplate when using JAX-compatible optimizers such as jaxopt:

    >>> opt = jaxopt.GradientDescent(cost_fn)
    >>> params = jnp.array([0.1, 0.2, 0.3])
    >>> (final_params, _) = jax.jit(opt.run)(params)
    >>> final_params
    Array([-0.00320799, 0.03475223, 0.29362844], dtype=float64)
Note that, in general, best performance will be seen when the Catalyst @qjit decorator is used to JIT the entire hybrid workflow. However, there may be cases where you want to delegate only the quantum part of your workflow to Catalyst, and let JAX handle classical components (for example, due to a missing feature or compatibility issue in Catalyst).
Support for Amazon Braket devices provided via the PennyLane-Braket plugin. #118 #139 #179 #180
This enables quantum subprograms within a JIT-compiled Catalyst workflow to execute on Braket simulator and hardware devices, including remote cloud-based simulators such as SV1.
    def circuit(x, y):
        qml.RX(y * x, wires=0)
        qml.RX(x * 2, wires=1)
        return qml.expval(qml.PauliY(0) @ qml.PauliZ(1))

    @qjit
    def workflow(x: float, y: float):
        device = qml.device("braket.local.qubit", backend="braket_sv", wires=2)
        g = qml.qnode(device)(circuit)
        h = catalyst.grad(g)
        return h(x, y)

    workflow(1.0, 2.0)
For a list of available devices, please see the PennyLane-Braket documentation.
Internally, the quantum instructions generate OpenQASM3 kernels at runtime; these are then executed on both local (braket.local.qubit) and remote (braket.aws.qubit) devices backed by the Amazon Braket Python SDK, with measurement results then propagated back to the frontend.
Note that at initial release, not all Catalyst features are supported with Braket. In particular, dynamic circuit features, such as mid-circuit measurements, will not work with Braket devices.
Catalyst conditional functions defined via @catalyst.cond now support an arbitrary number of 'else if' chains. #104

    dev = qml.device("lightning.qubit", wires=1)

    @qjit
    @qml.qnode(dev)
    def circuit(x):

        @catalyst.cond(x > 2.7)
        def cond_fn():
            qml.RX(x, wires=0)

        @cond_fn.else_if(x > 1.4)
        def cond_elif():
            qml.RY(x, wires=0)

        @cond_fn.otherwise
        def cond_else():
            qml.RX(x ** 2, wires=0)

        cond_fn()

        return qml.probs(wires=0)
Iterating in reverse is now supported with constant negative step sizes via catalyst.for_loop. #129

    dev = qml.device("lightning.qubit", wires=1)

    @qjit
    @qml.qnode(dev)
    def circuit(n):

        @catalyst.for_loop(n, 0, -1)
        def loop_fn(_):
            qml.PauliX(0)

        loop_fn()
        return catalyst.measure(0)
Additional gradient transforms for computing the vector-Jacobian product (VJP) and Jacobian-vector product (JVP) are now available in Catalyst. #98
Use catalyst.vjp to compute the forward-pass value and VJP:

    @qjit
    def vjp(params, cotangent):
        def f(x):
            y = [jnp.sin(x[0]), x[1] ** 2, x[0] * x[1]]
            return jnp.stack(y)

        return catalyst.vjp(f, [params], [cotangent])

    >>> x = jnp.array([0.1, 0.2])
    >>> dy = jnp.array([-0.5, 0.1, 0.3])
    >>> vjp(x, dy)
    [array([0.09983342, 0.04      , 0.02      ]), array([-0.43750208, 0.07000001])]
Use catalyst.jvp to compute the forward-pass value and JVP:

    @qjit
    def jvp(params, tangent):
        def f(x):
            y = [jnp.sin(x[0]), x[1] ** 2, x[0] * x[1]]
            return jnp.stack(y)

        return catalyst.jvp(f, [params], [tangent])

    >>> x = jnp.array([0.1, 0.2])
    >>> tangent = jnp.array([0.3, 0.6])
    >>> jvp(x, tangent)
    [array([0.09983342, 0.04      , 0.02      ]), array([0.29850125, 0.24000006, 0.12      ])]
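As a cross-check of what the two transforms compute, the VJP and JVP of the same function f can be worked out by hand from its analytic Jacobian. The sketch below is plain Python and does not call Catalyst; the helper names are illustrative only. For a cotangent vector v the VJP is v^T J, and for a tangent vector t the JVP is J t.

```python
import math


def f(x):
    """Same function as in the examples above, on plain Python floats."""
    return [math.sin(x[0]), x[1] ** 2, x[0] * x[1]]


def jacobian(x):
    """Analytic 3x2 Jacobian of f: rows are outputs, columns are inputs."""
    return [
        [math.cos(x[0]), 0.0],
        [0.0, 2.0 * x[1]],
        [x[1], x[0]],
    ]


def vjp(x, cotangent):
    """Vector-Jacobian product: contract the cotangent with the rows of J."""
    J = jacobian(x)
    return [sum(cotangent[i] * J[i][j] for i in range(3)) for j in range(2)]


def jvp(x, tangent):
    """Jacobian-vector product: contract the columns of J with the tangent."""
    J = jacobian(x)
    return [sum(J[i][j] * tangent[j] for j in range(2)) for i in range(3)]


x = [0.1, 0.2]
vjp_result = vjp(x, [-0.5, 0.1, 0.3])  # approx [-0.43750208, 0.07]
jvp_result = jvp(x, [0.3, 0.6])        # approx [0.29850125, 0.24, 0.12]
```

These values agree with the second array in each of the outputs shown above, up to floating-point precision.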
Support for multiple backend devices within a single qjit-compiled function is now available. #86 #89
For example, if you compile the Catalyst runtime with lightning.kokkos support (via the compilation flag ENABLE_LIGHTNING_KOKKOS=ON), you can use lightning.qubit and lightning.kokkos within a single workflow:

    dev1 = qml.device("lightning.qubit", wires=1)
    dev2 = qml.device("lightning.kokkos", wires=1)

    @qml.qnode(dev1)
    def circuit1(x):
        qml.RX(jnp.pi * x[0], wires=0)
        qml.RY(x[1] ** 2, wires=0)
        qml.RX(x[1] * x[2], wires=0)
        return qml.var(qml.PauliZ(0))

    @qml.qnode(dev2)
    def circuit2(x):

        @catalyst.cond(x > 2.7)
        def cond_fn():
            qml.RX(x, wires=0)

        @cond_fn.otherwise
        def cond_else():
            qml.RX(x ** 2, wires=0)

        cond_fn()

        return qml.probs(wires=0)

    @qjit
    def cost(x):
        return circuit2(circuit1(x))

    >>> x = jnp.array([0.54, 0.31])
    >>> cost(x)
    array([0.80842369, 0.19157631])
Support for returning the variance of Hamiltonians, Hermitian matrices, and Tensors via qml.var has been added. #124

    dev = qml.device("lightning.qubit", wires=2)

    @qjit
    @qml.qnode(dev)
    def circuit(x):
        qml.RX(jnp.pi * x[0], wires=0)
        qml.RY(x[1] ** 2, wires=1)
        qml.CNOT(wires=[0, 1])
        qml.RX(x[1] * x[2], wires=0)
        return qml.var(qml.PauliZ(0) @ qml.PauliX(1))

    >>> x = jnp.array([0.54, 0.31])
    >>> circuit(x)
    array(0.98851544)
Breaking changes
The catalyst.grad function now supports using the differentiation method defined on the QNode (via the diff_method argument) rather than applying a global differentiation method. #163
As part of this change, the method argument now accepts the following options:
method="auto": Quantum components of the hybrid function are differentiated according to the corresponding QNode diff_method, while the classical computation is differentiated using traditional auto-diff.
With this strategy, Catalyst only currently supports QNodes with diff_method="param-shift" and diff_method="adjoint".
method="fd": First-order finite-differences for the entire hybrid function. The diff_method argument for each QNode is ignored.
This is an intermediate step towards differentiating functions that internally call multiple QNodes, and towards supporting differentiation of classical postprocessing.
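The difference between the two strategies can be seen on a toy model. The sketch below is plain Python and does not use Catalyst: it assumes a single RX(theta) gate acting on |0>, for which the expectation value of PauliZ is cos(theta), and compares a first-order finite-difference estimate with the two-term parameter-shift rule. The step size h and the function names are illustrative choices, not Catalyst defaults.

```python
import math


def expval(theta):
    """<Z> after RX(theta) on |0>, which equals cos(theta)."""
    return math.cos(theta)


def finite_difference(f, theta, h=1e-7):
    """First-order forward difference: an approximation with O(h) error."""
    return (f(theta + h) - f(theta)) / h


def parameter_shift(f, theta):
    """Two-term parameter-shift rule: exact for this kind of rotation gate."""
    return (f(theta + math.pi / 2) - f(theta - math.pi / 2)) / 2


theta = 0.4
exact = -math.sin(theta)  # analytic derivative of cos(theta)
fd = finite_difference(expval, theta)
ps = parameter_shift(expval, theta)
```

Here ps matches the analytic derivative up to rounding error, while fd carries a small truncation error that depends on the chosen step size.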
Improvements
Catalyst has been upgraded to work with JAX v0.4.13. #143 #185
Add a Backprop operation for using autodifferentiation (AD) at the LLVM level with Enzyme AD. The Backprop operation has a bufferization pattern and a lowering to LLVM. #107 #116
Error handling has been improved. The runtime now throws more descriptive and unified expressions for runtime errors and assertions. #92
In preparation for easier debugging, the compiler has been refactored to allow easy prototyping of new compilation pipelines. #38
In the future, this will allow the ability to generate MLIR or LLVM-IR by loading input from a string or file, rather than generating it from Python.
As part of this refactor, the following changes were made:
Passes are now classes. This allows developers/users looking to change flags to inherit from these passes and change the flags.
Passes are now passed as arguments to the compiler. Custom passes can just be passed to the compiler as an argument, as long as they implement a run method which takes an input and the output of this method can be fed to the next pass.
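The two points above can be pictured with a small, self-contained sketch. Everything in it is hypothetical: the class names, the flags attribute, and the compile_with helper are invented for illustration and are not Catalyst's actual compiler classes. It only shows the shape of the design, in which each pass exposes a run method, flags are changed by subclassing, and the output of one pass is fed to the next.

```python
class Pass:
    """Hypothetical base class: a pass maps one program text to another."""

    flags = []

    def run(self, program):
        raise NotImplementedError


class StripComments(Pass):
    """Toy pass that removes '#' comments and empty lines."""

    flags = ["--strip-comments"]

    def run(self, program):
        lines = [ln.split("#", 1)[0].rstrip() for ln in program.splitlines()]
        return "\n".join(ln for ln in lines if ln)


class Annotate(Pass):
    """Toy pass that records the flags it was run with."""

    flags = ["--tag=default"]

    def run(self, program):
        return f"// flags: {' '.join(self.flags)}\n{program}"


class AnnotateDebug(Annotate):
    """Changing flags only requires inheriting and overriding them."""

    flags = ["--tag=debug"]


def compile_with(program, passes):
    """Feed the output of each pass into the next one."""
    for p in passes:
        program = p.run(program)
    return program


source = "x = 1  # set x\n# comment only\ny = x"
result = compile_with(source, [StripComments(), AnnotateDebug()])
```

Any object with a compatible run method could be added to the list, which is what makes prototyping a new pipeline cheap in this style.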
Improved Python compatibility by providing a stable signature for user generated functions. #106
Handle C++ exceptions without unwinding the whole stack. #99
Reduce the number of classical invocations by counting the number of gate parameters in the argmap function. #136
Prior to this, the computation of hybrid gradients executed all of the classical code being differentiated in a pcount function that solely counted the number of gate parameters in the quantum circuit. This was so argmap and other downstream functions could allocate memrefs large enough to store all gate parameters.
Now, instead of counting the number of parameters separately, a dynamically-resizable array is used in the argmap function directly to store the gate parameters. This removes one invocation of all of the classical code being differentiated.
Use Tablegen to define MLIR passes instead of C++ to reduce overhead of adding new passes. #157
Perform constant folding on wire indices for quantum.insert and quantum.extract ops, used when writing (resp. reading) qubits to (resp. from) quantum registers. #161
Represent known named observables as members of an MLIR Enum rather than a raw integer. This improves IR readability. #165
Bug fixes
Fix a bug in the mapping from logical to concrete qubits for mid-circuit measurements. #80
Fix a bug in the way gradient result type is inferred. #84
Fix a memory regression and reduce memory footprint by removing unnecessary temporary buffers. #100
Provide a new abstraction to the QuantumDevice interface in the runtime called DataView. C++ implementations of the interface can iterate through and directly store results into the DataView independent of the underlying memory layout. This can eliminate redundant buffer copies at the interface boundaries, which has been applied to existing devices. #109
Reduce memory utilization by transferring ownership of buffers from the runtime to Python instead of copying them. This includes adding a compiler pass that copies global buffers into the heap as global buffers cannot be transferred to Python. #112
Temporary fix of use-after-free and dependency of uninitialized memory. #121
Fix file renaming within pass pipelines. #126
Fix the issue with the do_queue deprecation warnings in PennyLane. #146
Fix the issue with gradients failing to work with hybrid functions that contain constant jnp.array objects. This will enable PennyLane operators that have data in the form of a jnp.array, such as a Hamiltonian, to be included in a qjit-compiled function. #152
An example of a newly supported workflow:

    coeffs = jnp.array([0.1, 0.2])
    terms = [qml.PauliX(0) @ qml.PauliZ(1), qml.PauliZ(0)]
    H = qml.Hamiltonian(coeffs, terms)

    @qjit
    @qml.qnode(qml.device("lightning.qubit", wires=2))
    def circuit(x):
        qml.RX(x[0], wires=0)
        qml.RY(x[1], wires=0)
        qml.CNOT(wires=[0, 1])
        return qml.expval(H)

    params = jnp.array([0.3, 0.4])
    jax.grad(circuit)(params)
Contributors
This release contains contributions from (in alphabetical order):
Ali Asadi, David Ittah, Erick Ochoa Lopez, Jacob Mai Peng, Romain Moyard, Sergei Mironov.
Release 0.1.2¶
New features
Add an option to print verbose messages explaining the compilation process. #68
Allow catalyst.grad to be used on any traceable function (within a qjit context). This means the operation is no longer restricted to acting on qml.qnodes only. #75
Improvements
Work in progress on a Lightning-Kokkos backend:
Bring feature parity to the Lightning-Kokkos backend simulator. #55
Add support for variance measurements for all observables. #70
Build the runtime against qir-stdlib v0.1.0. #58
Replace input-checking assertions with exceptions. #67
Perform function inlining to improve optimizations and memory management within the compiler. #72
Breaking changes
Bug fixes
Several fixes to address memory leaks in the compiled program:
Fix memory leaks from data that flows back into the Python environment. #54
Fix memory leaks resulting from partial bufferization at the MLIR level. This fix makes the necessary changes to reintroduce the -buffer-deallocation pass into the MLIR pass pipeline. The pass guarantees that all allocations contained within a function (that is, allocations that are not returned from a function) are also deallocated. #61
Lift heap allocations for quantum op results from the runtime into the MLIR compiler core. This allows all memref buffers to be memory managed in MLIR using the MLIR bufferization infrastructure. #63
Eliminate all memory leaks by tracking memory allocations at runtime. The memory allocations which are still alive when the compiled function terminates, will be freed in the finalization / teardown function. #78
Fix returning complex scalars from the compiled function. #77
Contributors
This release contains contributions from (in alphabetical order):
Ali Asadi, David Ittah, Erick Ochoa Lopez, Sergei Mironov.
Release 0.1.1¶
New features
Adds support for interpreting control flow operations. #31
Improvements
Adds fallback compiler drivers to increase reliability during linking phase. Also adds support for a CATALYST_CC environment variable for manual specification of the compiler driver used for linking. #30
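The selection logic can be pictured roughly as follows. This is a hedged sketch rather than Catalyst's code: the fallback list and its order, the pick_compiler_driver name, and the choice to try the CATALYST_CC value first and then fall back are all assumptions made for illustration. Only the CATALYST_CC variable name comes from the note above.

```python
import os
import shutil

# Hypothetical fallback order; the drivers Catalyst actually tries may differ.
FALLBACK_DRIVERS = ["clang", "gcc", "c99", "c89", "cc"]


def pick_compiler_driver(environ=None, which=shutil.which):
    """Return the path of the first usable compiler driver.

    A driver named in CATALYST_CC is tried first (assumed behaviour),
    followed by the fallback list.
    """
    environ = os.environ if environ is None else environ
    override = environ.get("CATALYST_CC")
    candidates = ([override] if override else []) + FALLBACK_DRIVERS
    for name in candidates:
        path = which(name)
        if path is not None:
            return path
    raise RuntimeError(f"no usable compiler driver among {candidates}")
```

With this shape, a user could set CATALYST_CC in the shell before running their program to force a specific driver, while the default path keeps working on machines where only one of the fallback drivers is installed.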
Breaking changes
Bug fixes
Fixes the Catalyst image path in the readme to properly render on PyPI.
Contributors
This release contains contributions from (in alphabetical order):
Ali Asadi, Erick Ochoa Lopez.
Release 0.1.0¶
Initial public release.
Contributors
This release contains contributions from (in alphabetical order):
Ali Asadi, Sam Banning, David Ittah, Josh Izaac, Erick Ochoa Lopez, Sergei Mironov, Isidor Schoch.