Quantum Datasets

PennyLane provides the data subpackage to download, create, store and manipulate quantum datasets. A quantum dataset is a collection of quantum data, obtained from a quantum system, that describes the system and its evolution.


The packages zstd and dill are required to use the data module. These can be installed with pip install zstd dill.


PennyLane datasets use the dill module to compress, store, and read data. Since dill is built on the pickle module, we reproduce an important warning from the pickle module: it is possible to construct malicious pickle data which will execute arbitrary code during unpickling. Never unpickle data that could have come from an untrusted source, or that could have been tampered with.
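To make the warning concrete, here is a minimal sketch of how unpickling can run code chosen by whoever produced the data. It uses only the standard-library pickle module, and the Payload class is a hypothetical, deliberately harmless stand-in: it evaluates an arithmetic expression where a real attack could call anything.

```python
import pickle


class Payload:
    """Hypothetical, benign stand-in for a malicious object."""

    def __reduce__(self):
        # pickle records this callable and its arguments in the byte stream;
        # pickle.loads() later calls eval("1 + 1") to "rebuild" the object.
        return (eval, ("1 + 1",))


blob = pickle.dumps(Payload())

# Loading the bytes does not give back a Payload instance. It runs the
# recorded call and returns whatever that call produced.
result = pickle.loads(blob)
print(type(result).__name__, result)
```

The same mechanism applies to any pickle-based format, which is why dataset files should only be read when their origin is trusted.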

Loading Datasets in PennyLane

We can access data of a desired type with the load() or load_interactive() functions. These download the desired datasets or load them from local storage if previously downloaded.

To select the dataset to be loaded, pass the data category (data_name) together with category-specific keyword arguments. For the full list of available datasets, please see the datasets website. The load() function returns a list with the desired data.

>>> import pennylane as qml
>>> H2datasets = qml.data.load("qchem", molname="H2", basis="STO-3G", bondlength=1.1)
>>> print(H2datasets)
[<Dataset = description: qchem/H2/STO-3G/1.1, attributes: ['molecule', 'hamiltonian', ...]>]
>>> H2data = H2datasets[0]

We can load datasets for multiple parameter values by providing a list of values instead of a single value. To load all possible values, use the special keyword “full”.

>>> H2datasets = qml.data.load("qchem", molname="H2", basis="full", bondlength=[0.5, 1.1])
>>> print(H2datasets)
[<Dataset = description: qchem/H2/6-31G/0.5, attributes: ['molecule', 'hamiltonian', ...]>,
 <Dataset = description: qchem/H2/6-31G/1.1, attributes: ['molecule', 'hamiltonian', ...]>,
 <Dataset = description: qchem/H2/STO-3G/0.5, attributes: ['molecule', 'hamiltonian', ...]>,
 <Dataset = description: qchem/H2/STO-3G/1.1, attributes: ['molecule', 'hamiltonian', ...]>]
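The way list-valued and “full” parameters combine can be pictured as a Cartesian product over the values of each parameter. The following is an illustrative sketch only, not PennyLane's implementation: the AVAILABLE table and the expand and describe helpers are hypothetical, and the basis values are taken from the output above.

```python
from itertools import product

# Hypothetical catalogue of the values that "full" expands to.
AVAILABLE = {"basis": ["6-31G", "STO-3G"]}


def expand(name, value):
    """Normalise a parameter to a list of concrete values."""
    if value == "full":
        return list(AVAILABLE[name])
    if isinstance(value, (list, tuple)):
        return list(value)
    return [value]


def describe(data_name, molname, basis, bondlength):
    """Build one description string per combination of parameter values."""
    combos = product(
        expand("molname", molname),
        expand("basis", basis),
        expand("bondlength", bondlength),
    )
    return [f"{data_name}/{m}/{b}/{length}" for m, b, length in combos]


descriptions = describe("qchem", "H2", "full", [0.5, 1.1])
for line in descriptions:
    print(line)
```

Two basis sets combined with two bond lengths give four descriptions, matching the four datasets returned above.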

When we only want to download portions of a large dataset, we can specify the desired properties (referred to as attributes). For example, we can download or load only the molecule and energy of a dataset as follows:

>>> part = qml.data.load("qchem", molname="H2", basis="STO-3G", bondlength=1.1,
...                      attributes=["molecule", "fci_energy"])[0]
>>> part.molecule
<Molecule = H2, Charge: 0, Basis: STO-3G, Orbitals: 2, Electrons: 2>
>>> part.fci_energy

To determine what attributes are available for a type of dataset, we can use the function list_attributes():

>>> qml.data.list_attributes(data_name="qchem")


“full” is the default value for attributes, and it means that all available attributes for the Dataset will be downloaded.

Using Datasets in PennyLane

Once loaded, one can access properties of the datasets:

>>> H2data.molecule
<Molecule = H2, Charge: 0, Basis: STO-3G, Orbitals: 2, Electrons: 2>
>>> print(H2data.hf_state)
[1 1 0 0]

The loaded data items are fully compatible with PennyLane. We can therefore use them directly in a PennyLane circuit as follows:

>>> dev = qml.device("default.qubit", wires=4)
>>> @qml.qnode(dev)
... def circuit():
...     qml.BasisState(H2data.hf_state, wires=[0, 1, 2, 3])
...     for op in H2data.vqe_gates:
...         qml.apply(op)
...     return qml.expval(H2data.hamiltonian)
>>> print(circuit())

Dataset Structure

You can call the list_datasets() function to get a snapshot of the currently available data. This function returns a nested dictionary as we show below.

>>> available_data = qml.data.list_datasets()
>>> available_data.keys()
dict_keys(["qspin", "qchem"])
>>> available_data["qchem"].keys()
dict_keys(["H2", "LiH", ...])
>>> available_data['qchem']['H2'].keys()
dict_keys(["6-31G", "STO-3G"])
>>> print(available_data['qchem']['H2']['STO-3G'])
["0.5", "0.54", "0.62", "0.66", ...]

Note that this example limits the results of the function calls for clarity and that as more data becomes available, the results of these function calls will change.
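Because the result is a nested dictionary whose innermost values are lists, a short recursive walk can flatten it into one path per dataset. The sketch below assumes the layout shown above and runs on a small hand-written mock rather than the real output of list_datasets(); iter_dataset_paths is a hypothetical helper name, and the mock contains only two of the bond lengths listed.

```python
def iter_dataset_paths(tree, prefix=()):
    """Yield 'category/.../value' strings from a nested dict of lists."""
    for key, value in tree.items():
        if isinstance(value, dict):
            # Descend one level, remembering the keys seen so far.
            yield from iter_dataset_paths(value, prefix + (key,))
        else:
            # Innermost level: a list of parameter values.
            for leaf in value:
                yield "/".join(prefix + (key, leaf))


# Hand-written mock mirroring the structure above (deliberately truncated).
mock_available = {"qchem": {"H2": {"STO-3G": ["0.5", "0.54"]}}}

paths = list(iter_dataset_paths(mock_available))
print(paths)
```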

Creating Custom Datasets

The data subpackage also supports creating and reading custom datasets. A custom dataset can store any data generated in PennyLane, together with its supporting data. To create a dataset, we can do the following:

>>> import numpy as np
>>> coeffs = [1, 0.5]
>>> observables = [qml.PauliZ(wires=0), qml.PauliX(wires=1)]
>>> H = qml.Hamiltonian(coeffs, observables)
>>> energies, _ = np.linalg.eigh(qml.matrix(H))  # Calculate the energies
>>> dataset = qml.data.Dataset(data_name="Example", hamiltonian=H, energies=energies)
>>> dataset.data_name
'Example'
>>> dataset.hamiltonian
(0.5) [X1]
+ (1) [Z0]
>>> dataset.energies
array([-1.5, -0.5,  0.5,  1.5])
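As a sanity check on the energies above, they can be reproduced without building the matrix. The two terms of the Hamiltonian act on different wires, so they commute, and each eigenvalue of the sum is a sum of one eigenvalue from each term. PauliZ and PauliX both have eigenvalues -1 and +1. The sketch below is a plain-Python check of that reasoning, not part of the data module.

```python
from itertools import product

coeffs = [1, 0.5]            # weights of Z on wire 0 and X on wire 1
pauli_eigenvalues = (-1, 1)  # spectrum shared by PauliZ and PauliX

# One eigenvalue of H for every combination of single-qubit eigenvalues.
energies = sorted(
    coeffs[0] * z + coeffs[1] * x
    for z, x in product(pauli_eigenvalues, repeat=2)
)
print(energies)
```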

We can then write this Dataset to storage and read it as follows:

>>> dataset.write("./path/to/dataset.dat")
>>> read_dataset = qml.data.Dataset()
>>> read_dataset.read("./path/to/dataset.dat")
>>> read_dataset.data_name
'Example'
>>> read_dataset.hamiltonian
(0.5) [X1]
+ (1) [Z0]
>>> read_dataset.energies
array([-1.5, -0.5,  0.5,  1.5])
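The write and read steps amount to a serialize-compress-store round trip. The sketch below illustrates that idea with the standard-library pickle and zlib modules standing in for dill and zstd. It is not PennyLane's file format, and write_record and read_record are hypothetical helpers.

```python
import pickle
import tempfile
import zlib
from pathlib import Path


def write_record(record, path):
    """Serialize with pickle, compress with zlib, and write to disk."""
    Path(path).write_bytes(zlib.compress(pickle.dumps(record)))


def read_record(path):
    """Read, decompress, and deserialize. Only use on trusted files."""
    return pickle.loads(zlib.decompress(Path(path).read_bytes()))


record = {"data_name": "Example", "energies": [-1.5, -0.5, 0.5, 1.5]}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "dataset.dat"
    write_record(record, path)
    restored = read_record(path)

print(restored == record)
```

Since the reader unpickles whatever the file contains, the pickle warning given earlier applies to every file read this way.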

Quantum Datasets Functions and Classes



Dataset: Create a dataset object to store a collection of information describing a physical system and its evolution.

list_datasets(): Return a dictionary of the available datasets.

list_attributes(): List the attributes that exist for a specific data_name.

load(): Download the data if it is not already present in the directory and return it to the user as a list of Dataset objects.

load_interactive(): Download a dataset using an interactive load prompt.