Template Class TensorCuda
Defined in File TensorCuda.hpp
Inheritance Relationships
Base Type
public Pennylane::LightningTensor::TensorBase< PrecisionT, TensorCuda< PrecisionT > >
(Template Class TensorBase)
Class Documentation
-
template<class PrecisionT>
class TensorCuda : public Pennylane::LightningTensor::TensorBase<PrecisionT, TensorCuda<PrecisionT>>
CRTP-enabled class for a CUDA-capable tensor.
- Template Parameters
PrecisionT – Floating point precision.
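The CRTP relationship named above can be illustrated with a minimal, host-only sketch. The class and member names here (`TensorBaseSketch`, `TensorHostSketch`, `rank`, `length`, `lengthImpl`) are hypothetical stand-ins chosen for illustration; they are not the library's `TensorBase` or `TensorCuda` and make no claim about their actual members. The sketch only shows how a base class templated on its derived type can forward calls without virtual dispatch.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical CRTP base: it is parameterised on the derived type so it can
// call into the derived class at compile time (no virtual functions).
template <class PrecisionT, class Derived>
class TensorBaseSketch {
  public:
    explicit TensorBaseSketch(std::vector<std::size_t> extents)
        : extents_(std::move(extents)) {}

    // Number of tensor dimensions.
    std::size_t rank() const { return extents_.size(); }

    // Static dispatch: forwards to the derived class's implementation.
    std::size_t length() const {
        return static_cast<const Derived *>(this)->lengthImpl();
    }

  protected:
    std::vector<std::size_t> extents_;
};

// Hypothetical derived class: passes itself as the second template argument,
// mirroring the shape of TensorBase<PrecisionT, TensorCuda<PrecisionT>>.
template <class PrecisionT>
class TensorHostSketch
    : public TensorBaseSketch<PrecisionT, TensorHostSketch<PrecisionT>> {
  public:
    using BaseType = TensorBaseSketch<PrecisionT, TensorHostSketch>;
    using BaseType::BaseType; // inherit the base constructor

    // Element count taken as the product of all extents (sketch assumption).
    std::size_t lengthImpl() const {
        std::size_t n = 1;
        for (std::size_t e : this->extents_) {
            n *= e;
        }
        return n;
    }
};
```

With this layout, `TensorHostSketch<double>` constructed from extents `{2, 3, 4}` reports a rank of 3 and a length of 24, and the call to `length()` is resolved at compile time.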
Public Types
-
using BaseType = TensorBase<PrecisionT, TensorCuda>
-
using CFP_t = decltype(cuUtil::getCudaType(PrecisionT{}))
Public Functions
-
inline explicit TensorCuda(const std::size_t rank, const std::vector<std::size_t> &modes, const std::vector<std::size_t> &extents, const DevTag<int> &dev_tag, bool device_alloc = true)
Construct a new TensorCuda object.
- Parameters
rank – Tensor rank.
modes – Tensor modes.
extents – Tensor extents.
dev_tag – Device tag.
device_alloc – If true, allocate memory on device.
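As a rough illustration of how the `rank`, `modes`, and `extents` arguments relate, the sketch below checks that both vectors have `rank` entries and computes the element count as the product of the extents. The helper name `checkedLength` is hypothetical, and the consistency check and product rule are assumptions about typical tensor conventions, not documented behaviour of this constructor.

```cpp
#include <cstddef>
#include <functional>
#include <numeric>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not part of the library): validates that modes and
// extents each have `rank` entries, then returns the total element count.
inline std::size_t checkedLength(std::size_t rank,
                                 const std::vector<std::size_t> &modes,
                                 const std::vector<std::size_t> &extents) {
    if (modes.size() != rank || extents.size() != rank) {
        throw std::invalid_argument(
            "modes and extents must each contain `rank` entries");
    }
    // Assumption: element count is the product of all extents
    // (an empty product, i.e. rank 0, yields 1).
    return std::accumulate(extents.begin(), extents.end(), std::size_t{1},
                           std::multiplies<std::size_t>{});
}
```

Under these assumptions, a rank-3 tensor with extents `{2, 2, 2}` holds 8 elements, which is the number a caller would use to size a matching host buffer.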
-
inline explicit TensorCuda(const std::vector<std::size_t> &extents, const std::vector<CFP_t> &host_tensor, const DevTag<int> &dev_tag, bool device_alloc = true)
Construct a new TensorCuda object from host data.
- Parameters
extents – Tensor extents.
host_tensor – Host tensor data.
dev_tag – Device tag.
device_alloc – If true, allocate memory on device.
-
TensorCuda() = delete
-
~TensorCuda() = default
-
inline void CopyGpuDataToHost(std::complex<PrecisionT> *host_tensor, std::size_t length, bool async = false) const
Explicitly copy data from GPU device to host memory.
- Parameters
host_tensor – Complex data pointer to receive data from device.
length – Number of elements to copy.
async – If true, request an asynchronous copy. Only synchronous copies are currently supported.
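The calling pattern for a copy with this signature (a raw `std::complex<PrecisionT>` pointer plus an element count) can be sketched without a GPU. `FakeDeviceTensor`, `copyToHost`, and `download` below are hypothetical host-only stand-ins: `std::copy_n` replaces the real device-to-host transfer, and both the length check and the rejection of `async = true` are choices made for this sketch, not documented behaviour of `CopyGpuDataToHost`.

```cpp
#include <algorithm>
#include <complex>
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical host-only stand-in for a device tensor.
template <class PrecisionT>
class FakeDeviceTensor {
  public:
    explicit FakeDeviceTensor(std::vector<std::complex<PrecisionT>> data)
        : data_(std::move(data)) {}

    std::size_t length() const { return data_.size(); }

    // Same parameter shape as the documented copy: destination pointer,
    // element count, and an async flag.
    void copyToHost(std::complex<PrecisionT> *host_tensor, std::size_t length,
                    bool async = false) const {
        if (async) {
            // Sketch choice: refuse rather than silently copy synchronously.
            throw std::invalid_argument("asynchronous copy not supported");
        }
        if (length != data_.size()) {
            // Sketch choice: require an exact size match.
            throw std::length_error("host buffer size does not match tensor");
        }
        std::copy_n(data_.begin(), length, host_tensor);
    }

  private:
    std::vector<std::complex<PrecisionT>> data_;
};

// Size the host buffer from the tensor first, then pass data() and size().
template <class PrecisionT>
std::vector<std::complex<PrecisionT>>
download(const FakeDeviceTensor<PrecisionT> &tensor) {
    std::vector<std::complex<PrecisionT>> host(tensor.length());
    tensor.copyToHost(host.data(), host.size());
    return host;
}
```

The point of the pattern is that the caller owns the host buffer and must allocate it with at least `length` elements before the copy; deriving both the pointer and the count from the same `std::vector` avoids a mismatch between them.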