qml.RotosolveOptimizer¶
- class RotosolveOptimizer(substep_optimizer='brute', substep_kwargs=None)[source]¶
Bases:
object
Rotosolve gradient-free optimizer.
The Rotosolve optimizer minimizes an objective function with respect to the parameters of a quantum circuit without the need for calculating the gradient of the function. The algorithm updates the parameters \(\boldsymbol{\theta} = \theta_1, \dots, \theta_D\) by separately reconstructing the cost function with respect to each circuit parameter, while keeping all other parameters fixed.
- Parameters
substep_optimizer (str or callable) –
Optimizer to use for the substeps of Rotosolve that carries out a univariate (i.e., single-parameter) global optimization. Only used if there are more than one frequency for a given parameter. It must take as inputs:
A function
fn
that maps scalars to scalars,the (keyword) argument
bounds
, andoptional keyword arguments.
It must return two scalars:
The input value
x_min
for whichfn
is minimal, andthe minimal value
y_min=fn(x_min)
orNone
.
Alternatively, the following optimizers are built-in and can be chosen by passing their name:
"brute"
: An iterative version of SciPy’s brute force optimizer. It evaluates the function atNs
equidistant points across the range \([-\pi, \pi]\) and iteratively refines the range around the point with the smallest cost value fornum_steps
times."shgo"
: SciPy’s SHGO optimizer.
substep_kwargs (dict) – Keyword arguments to be passed to the
substep_optimizer
callable. Forsubstep_optimizer="shgo"
, the original keyword arguments of the SciPy implementation are available, forsubstep_optimizer="brute"
the keyword argumentsranges
,Ns
andnum_steps
are useful. Only used if there are more than one frequency for a given parameter.
For each parameter, a purely classical one-dimensional global optimization over the interval \((-\pi,\pi]\) is performed, which is replaced automatically by a closed-form expression for the optimal value if the \(d\text{th}\) parametrized gate has only two eigenvalues. This means that
substep_optimizer
andsubstep_kwargs
will not be used for these parameters. In this case, the optimal value \(\theta^*_d\) is given analytically by\[\begin{split}\theta^*_d &= \underset{\theta_d}{\text{argmin}}\left<H\right>_{\theta_d}\\ &= -\frac{\pi}{2} - \text{arctan2}\left(2\left<H\right>_{\theta_d=0} - \left<H\right>_{\theta_d=\pi/2} - \left<H\right>_{\theta_d=-\pi/2}, \left<H\right>_{\theta_d=\pi/2} - \left<H\right>_{\theta_d=-\pi/2}\right),\end{split}\]where \(\left<H\right>_{\theta_d}\) is the expectation value of the objective function restricted to only depend on the parameter \(\theta_d\).
Warning
The built-in one-dimensional optimizers
"brute"
and"shgo"
for the substeps of a Rotosolve optimization step use the interval \((-\pi,\pi]\), rescaled with the inverse smallest frequency as default domain to optimize over. For complicated cost functions, this domain might not be suitable for the substep optimization and an appropriate range should be passed viabounds
insubstep_kwargs
.The algorithm is described in further detail in Vidal and Theis (2018), Nakanishi, Fujii and Todo (2019), Parrish et al. (2019), and Ostaszewski et al. (2019), and the reconstruction method used for more general operations is described in Wierichs et al. (2022).
Warning
RotosolveOptimizer
will only update parameters that are explicitly marked as trainable. This can be done viarequires_grad
if using Autograd or PyTorch.RotosolveOptimizer
is not yet implemented to work in a stable manner with TensorFlow or JAX.Example:
Initialize the optimizer and set the number of steps to optimize over. Recall that the optimization with
RotosolveOptimizer
uses global optimization substeps of univariate functions. The optimization technique for these substeps can be chosen via thesubstep_optimizer
andsubstep_kwargs
keyword arguments. Here we use the built-in iterative version of SciPy’s brute force optimizer with four iterations. We will run Rotosolve itself for three iterations.>>> opt_kwargs = {"num_steps": 4} >>> opt = qml.optimize.RotosolveOptimizer(substep_optimizer="brute", substep_kwargs=opt_kwargs) >>> num_steps = 3
Next, we create a QNode we wish to optimize:
dev = qml.device('default.qubit', wires=3, shots=None) @qml.qnode(dev) def cost_function(rot_param, layer_par, crot_param, rot_weights=None, crot_weights=None): for i, par in enumerate(rot_param * rot_weights): qml.RX(par, wires=i) for w in dev.wires: qml.RX(layer_par, wires=w) for i, par in enumerate(crot_param*crot_weights): qml.CRY(par, wires=[i, (i+1)%3]) return qml.expval(qml.Z(0) @ qml.Z(1) @ qml.Z(2))
This QNode is defined simply by measuring the expectation value of the tensor product of
PauliZ
operators on all qubits. It takes three parameters:rot_param
controls three Pauli rotations with three parameters, multiplied withrot_weights
,layer_par
feeds into a layer of rotations with a single parameter, andcrot_param
feeds three parameters, multiplied withcrot_weights
, into three controlled Pauli rotations.
We also initialize a set of parameters for all these operations, and start with uniform weights, i.e., all
rot_weights
andcrot_weights
are set to one. This means that all frequencies with which the parameters inrot_param
andcrot_param
enter the QNode are integer-valued. The number of frequencies per parameter are summarized innums_frequency
.from pennylane import numpy as np init_param = ( np.array([0.3, 0.2, 0.67], requires_grad=True), np.array(1.1, requires_grad=True), np.array([-0.2, 0.1, -2.5], requires_grad=True), ) rot_weights = np.ones(3) crot_weights = np.ones(3) nums_frequency = { "rot_param": {(0,): 1, (1,): 1, (2,): 1}, "layer_par": {(): 3}, "crot_param": {(0,): 2, (1,): 2, (2,): 2}, }
The keyword argument
requires_grad
can be used to determine whether the respective parameter should be optimized or not, following the behaviour of gradient computations and gradient-based optimizers when using Autograd or Torch. With TensorFlow, atf.Variable
inside atf.GradientTape
may be used to mark variables as trainable.Now we carry out the optimization. The minimized cost of the intermediate univariate reconstructions can be read out via
full_output
, including the cost after the full Rotosolve step:>>> param = init_param >>> cost_rotosolve = [] >>> for step in range(num_steps): ... param, cost, sub_cost = opt.step_and_cost( ... cost_function, ... *param, ... nums_frequency=nums_frequency, ... full_output=True, ... rot_weights=rot_weights, ... crot_weights=crot_weights, ... ) ... print(f"Cost before step: {cost}") ... print(f"Minimization substeps: {np.round(sub_cost, 6)}") ... cost_rotosolve.extend(sub_cost) Cost before step: 0.04200821039253547 Minimization substeps: [-0.230905 -0.863336 -0.980072 -0.980072 -1. -1. -1. ] Cost before step: -0.9999999990681161 Minimization substeps: [-1. -1. -1. -1. -1. -1. -1.] Cost before step: -0.9999999999999996 Minimization substeps: [-1. -1. -1. -1. -1. -1. -1.]
The optimized values for the parameters are now stored in
param
and the optimization behaviour can be assessed by plottingcost_rotosolve
, which include the substeps of the Rotosolve optimization. Thefull_output
feature is available for both,step
andstep_and_cost
.In general, the frequencies in a QNode will not be integer-valued, requiring us to provide the
RotosolveOptimizer
not only with the number of frequencies but their concrete values. For the example QNode above, this happens if the weights are no longer one:>>> rot_weights = np.array([0.4, 0.8, 1.2], requires_grad=False) >>> crot_weights = np.array([0.5, 1.0, 1.5], requires_grad=False) >>> spectrum_fn = qml.fourier.qnode_spectrum(cost_function) >>> spectra = spectrum_fn(*param, rot_weights=rot_weights, crot_weights=crot_weights) >>> spectra["rot_param"] {(0,): [-0.4, 0.0, 0.4], (1,): [-0.8, 0.0, 0.8], (2,): [-1.2, 0.0, 1.2]} >>> spectra["crot_param"] {(0,): [-0.5, -0.25, 0.0, 0.25, 0.5], (1,): [-1.0, -0.5, 0.0, 0.5, 1.0], (2,): [-1.5, -0.75, 0.0, 0.75, 1.5]}
We may provide these spectra instead of
nums_frequency
to Rotosolve to enable the optimization of the QNode at these weights:>>> param = init_param >>> for step in range(num_steps): ... param, cost, sub_cost = opt.step_and_cost( ... cost_function, ... *param, ... spectra=spectra, ... full_output=True, ... rot_weights = rot_weights, ... crot_weights = crot_weights, ... ) ... print(f"Cost before step: {cost}") ... print(f"Minimization substeps: {np.round(sub_cost, 6)}") Cost before step: 0.09299359486191039 Minimization substeps: [-0.268008 -0.713209 -0.24993 -0.871989 -0.907672 -0.907892 -0.940474] Cost before step: -0.9404742138557066 Minimization substeps: [-0.940474 -1. -1. -1. -1. -1. -1. ] Cost before step: -1.0 Minimization substeps: [-1. -1. -1. -1. -1. -1. -1.]
As we can see, while the optimization got a bit harder and the optimizer takes a bit longer to converge than previously, Rotosolve was able to adapt to the more complicated dependence on the input arguments and still found the global minimum successfully.
Methods
min_analytic
(objective_fn, freq, f0)Analytically minimize a trigonometric function that depends on a single parameter and has a single frequency.
step
(objective_fn, *args[, nums_frequency, ...])Update args with one step of the optimizer.
step_and_cost
(objective_fn, *args[, ...])Update args with one step of the optimizer and return the corresponding objective function value prior to the step.
- static min_analytic(objective_fn, freq, f0)[source]¶
Analytically minimize a trigonometric function that depends on a single parameter and has a single frequency. Uses two or three function evaluations.
- Parameters
objective_fn (callable) – Trigonometric function to minimize
freq (float) – Frequency \(f\) in the
objective_fn
f0 (float) – Value of the
objective_fn
at zero. Reduces the number of calls to the function from three to two if given.
- Returns
Position of the minimum of
objective_fn
float: Value of the minimum ofobjective_fn
- Return type
float
The closed form expression used here was derived in Vidal & Theis (2018) , Parrish et al (2019) and Ostaszewski et al (2021). We use the notation of Appendix A of the last of these references, although we allow for an arbitrary frequency instead of restricting to \(f=1\). The returned position is guaranteed to lie within \((-\pi/f, \pi/f]\).
The used formula for the minimization of the \(d-\text{th}\) parameter then reads
\[\begin{split}\theta^*_d &= \underset{\theta_d}{\text{argmin}}\left<H\right>_{\theta_d}\\ &= -\frac{\pi}{2f} - \frac{1}{f}\text{arctan2}\left(2\left<H\right>_{\theta_d=0} - \left<H\right>_{\theta_d=\pi/(2f)} - \left<H\right>_{\theta_d=-\pi/(2f)}, \left<H\right>_{\theta_d=\pi/(2f)} - \left<H\right>_{\theta_d=-\pi/(2f)}\right),\end{split}\]
- step(objective_fn, *args, nums_frequency=None, spectra=None, shifts=None, full_output=False, **kwargs)[source]¶
Update args with one step of the optimizer. Each step includes multiple substeps, one per parameter.
- Parameters
objective_fn (function) – the objective function for optimization. It should take a sequence of the values
*args
and a list of the gatesgenerators
as inputs, and return a single value.*args (Sequence) – variable length sequence containing the initial values of the variables to be optimized over or a single float with the initial value.
nums_frequency (dict[dict]) – The number of frequencies in the
objective_fn
per parameter. The keys must correspond to argument names of the objective function, the values must be dictionaries that map parameter indices (tuple
) in the argument to the number of frequencies with which it enters the objective function (int
). The parameter index for a scalar QNode argument is()
, for one-dimensional array QNode arguments, it takes the form(i,)
for the i-th parameter in the argument.spectra (dict[dict]) – Frequency spectra in the
objective_fn
per parameter. The formatting is the same as fornums_frequency
, but the values of the inner dictionaries must be sequences of frequencies (Sequence[float]
). For each parameter,num_frequency
take precedence overspectra
.shifts (dict[dict]) – Shift angles for the reconstruction per QNode parameter. The keys have to be argument names of
qnode
and the inner dictionaries have to be mappings from parameter indices to the respective shift angles to be used for that parameter. For \(R\) non-zero frequencies, there must be \(2R+1\) shifts given. Ignored ifnums_frequency
gives a number of frequencies for the respective parameter in the QNode argument.full_output (bool) – whether to return the intermediate minimized energy values from the univariate optimization substeps.
**kwargs – variable length keyword arguments for the objective function.
- Returns
- the new variable values \(x^{(t+1)}\).
If a single arg is provided, list [array] is replaced by array.
- list [float]: the intermediate objective values, only returned if
full_output=True
.
- Return type
list [array] or array
The optimization step consists of multiple substeps.
For each substep, one of the parameters in one of the QNode arguments is singled out, and the objective function is considered as univariate function (i.e., function that depends on a single scalar) of that parameter.
If
nums_frequency
states that there is only a single frequency, orspectra
only contains one positive frequency, for a parameter, an analytic formula is used to return the minimum of the univariate restriction.For multiple frequencies,
fourier.reconstruct()
is used to reconstruct the univariate restriction and a numeric minimization is performed instead. The latter minimization is performed using thesubstep_optimizer
passed toRotosolveOptimizer
at initialization.Note
One of
nums_frequency
andspectra
must contain information about each parameter that is to be trained withRotosolveOptimizer
. For each univariate reconstruction, the data innums_frequency
takes precedence over the information inspectra
.
- step_and_cost(objective_fn, *args, nums_frequency=None, spectra=None, shifts=None, full_output=False, **kwargs)[source]¶
Update args with one step of the optimizer and return the corresponding objective function value prior to the step. Each step includes multiple substeps, one per parameter.
- Parameters
objective_fn (function) – the objective function for optimization. It should take a sequence of the values
*args
and a list of the gatesgenerators
as inputs, and return a single value.*args (Sequence) – variable length sequence containing the initial values of the variables to be optimized over or a single float with the initial value.
nums_frequency (dict[dict]) – The number of frequencies in the
objective_fn
per parameter. The keys must correspond to argument names of the objective function, the values must be dictionaries that map parameter indices (tuple
) in the argument to the number of frequencies with which it enters the objective function (int
). The parameter index for a scalar QNode argument is()
, for one-dimensional array QNode arguments, it takes the form(i,)
for the i-th parameter in the argument.spectra (dict[dict]) – Frequency spectra in the
objective_fn
per parameter. The formatting is the same as fornums_frequency
, but the values of the inner dictionaries must be sequences of frequencies (Sequence[float]
). For each parameter,num_frequency
take precedence overspectra
.shifts (dict[dict]) – Shift angles for the reconstruction per QNode parameter. The keys have to be argument names of
qnode
and the inner dictionaries have to be mappings from parameter indices to the respective shift angles to be used for that parameter. For \(R\) non-zero frequencies, there must be \(2R+1\) shifts given. Ignored ifnums_frequency
gives a number of frequencies for the respective parameter in the QNode argument.full_output (bool) – whether to return the intermediate minimized energy values from the univariate optimization substeps.
**kwargs – variable length keyword arguments for the objective function.
- Returns
the new variable values \(x^{(t+1)}\). If a single arg is provided, list [array] is replaced by array. float: the objective function output prior to the step. list [float]: the intermediate objective values, only returned if
full_output=True
.- Return type
list [array] or array
The optimization step consists of multiple substeps.
For each substep, one of the parameters in one of the QNode arguments is singled out, and the objective function is considered as univariate function (i.e., function that depends on a single scalar) of that parameter.
If
nums_frequency
states that there is only a single frequency, orspectra
only contains one positive frequency, for a parameter, an analytic formula is used to return the minimum of the univariate restriction.For multiple frequencies,
fourier.reconstruct()
is used to reconstruct the univariate restriction and a numeric minimization is performed instead. The latter minimization is performed using thesubstep_optimizer
passed toRotosolveOptimizer
at initialization.Note
One of
nums_frequency
andspectra
must contain information about each parameter that is to be trained withRotosolveOptimizer
. For each univariate reconstruction, the data innums_frequency
takes precedence over the information inspectra
.