qml.qchem.mol_data

mol_data(identifier, identifier_type='name')

Obtain symbols and geometry of a compound from the PubChem Database.

PubChem is one of the largest public repositories of information on chemical substances. The symbols and geometry of a compound can be retrieved from it by name, CAS number, PubChem Compound ID (CID), SMILES, InChI, or InChIKey, and then used to build a molecule object for Hartree-Fock calculations. The retrieved atomic coordinates are converted to atomic units (Bohr) for consistency.

Parameters
  • identifier (str or int) – compound’s identifier as required by the PubChem database

  • identifier_type (str) – type of the provided identifier: one of name, CAS, CID, SMILES, InChI, or InChIKey (default 'name')

Returns

symbols and geometry (in Bohr) of the compound

Return type

Tuple(list[str], array[float])

Example

>>> from pennylane.qchem import mol_data
>>> mol_data("BeH2")
(['Be', 'H', 'H'],
tensor([[ 4.79404621,  0.29290755,  0.        ],
        [ 3.77945225, -0.29290755,  0.        ],
        [ 5.80882913, -0.29290755,  0.        ]], requires_grad=True))
>>> mol_data(223, "CID")
(['N', 'H', 'H', 'H', 'H'],
tensor([[ 0.        ,  0.        ,  0.        ],
        [ 1.82264085,  0.52836742,  0.40402345],
        [ 0.01417295, -1.67429735, -0.98038991],
        [-0.98927163, -0.22714508,  1.65369933],
        [-0.84773114,  1.373075  , -1.07733286]], requires_grad=True))
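The coordinates returned above are in Bohr. As a minimal sketch for comparing them with Angstrom-based sources, the snippet below converts the BeH2 geometry back to Angstrom. The values are copied by hand from the output above so that the snippet runs without network access. BOHR_TO_ANGSTROM and to_angstrom are illustrative names, not part of PennyLane, and the constant is the CODATA 2018 Bohr radius.

```python
# Bohr radius in Angstrom (CODATA 2018); the name is chosen for this sketch only.
BOHR_TO_ANGSTROM = 0.529177210903

# BeH2 geometry in Bohr, copied from the mol_data("BeH2") output above.
geometry_bohr = [
    [4.79404621, 0.29290755, 0.0],
    [3.77945225, -0.29290755, 0.0],
    [5.80882913, -0.29290755, 0.0],
]


def to_angstrom(geometry):
    """Convert a list of [x, y, z] rows from Bohr to Angstrom."""
    return [[coord * BOHR_TO_ANGSTROM for coord in row] for row in geometry]


geometry_angstrom = to_angstrom(geometry_bohr)
for row in geometry_angstrom:
    print([round(coord, 4) for coord in row])
```

Multiplying by the Bohr radius undoes the conversion described in the function summary, so the first hydrogen's x coordinate comes back as approximately 2.0 Angstrom.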

mol_data also accepts other chemical identifiers (CAS, SMILES, InChI, InChIKey). Each of the following calls retrieves methane:

>>> mol_data("74-82-8", "CAS")
(['C', 'H', 'H', 'H', 'H'],
tensor([[ 0.        ,  0.        ,  0.        ],
        [ 1.04709725,  1.51102501,  0.93824902],
        [ 1.29124986, -1.53710323, -0.47923455],
        [-1.47058487, -0.70581271,  1.26460472],
        [-0.86795121,  0.7320799 , -1.7236192 ]], requires_grad=True))
>>> mol_data("[C]", "SMILES")
(['C', 'H', 'H', 'H', 'H'],
tensor([[ 0.        ,  0.        ,  0.        ],
        [ 1.04709725,  1.51102501,  0.93824902],
        [ 1.29124986, -1.53710323, -0.47923455],
        [-1.47058487, -0.70581271,  1.26460472],
        [-0.86795121,  0.7320799 , -1.7236192 ]], requires_grad=True))
>>> mol_data("InChI=1S/CH4/h1H4", "InChI")
(['C', 'H', 'H', 'H', 'H'],
tensor([[ 0.        ,  0.        ,  0.        ],
        [ 1.04709725,  1.51102501,  0.93824902],
        [ 1.29124986, -1.53710323, -0.47923455],
        [-1.47058487, -0.70581271,  1.26460472],
        [-0.86795121,  0.7320799 , -1.7236192 ]], requires_grad=True))
>>> mol_data("VNWKTOKETHGBQD-UHFFFAOYSA-N", "InChIKey")
(['C', 'H', 'H', 'H', 'H'],
tensor([[ 0.        ,  0.        ,  0.        ],
        [ 1.04709725,  1.51102501,  0.93824902],
        [ 1.29124986, -1.53710323, -0.47923455],
        [-1.47058487, -0.70581271,  1.26460472],
        [-0.86795121,  0.7320799 , -1.7236192 ]], requires_grad=True))
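A quick way to sanity-check a retrieved geometry is to compute a few bond lengths. The sketch below does this for the methane geometry shown above, using only the Python standard library (math.dist requires Python 3.8 or later). The coordinates are copied by hand from the output above, and BOHR_TO_ANGSTROM is an illustrative name rather than a PennyLane constant.

```python
import math

# Bohr radius in Angstrom (CODATA 2018); the name is chosen for this sketch only.
BOHR_TO_ANGSTROM = 0.529177210903

# Methane symbols and geometry in Bohr, copied from the output above.
symbols = ["C", "H", "H", "H", "H"]
geometry_bohr = [
    [0.0, 0.0, 0.0],
    [1.04709725, 1.51102501, 0.93824902],
    [1.29124986, -1.53710323, -0.47923455],
    [-1.47058487, -0.70581271, 1.26460472],
    [-0.86795121, 0.7320799, -1.7236192],
]

# Distance from the carbon atom (first row) to each hydrogen, in Angstrom.
carbon = geometry_bohr[0]
bond_lengths = [
    math.dist(carbon, hydrogen) * BOHR_TO_ANGSTROM
    for hydrogen in geometry_bohr[1:]
]

for length in bond_lengths:
    print(f"C-H: {length:.3f} Angstrom")
```

All four C-H distances come out at roughly 1.09 Angstrom, which confirms that the returned coordinates are in Bohr rather than Angstrom.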