NMR Shielding Parameters for 130831 QM9 Molecules with up to 9 C, N, O and F atoms
The QM9-NMR dataset [Ref-1] contains gas and (implicit) solvent phase mPW1PW91/6-311+G(2d,p)-level chemical shielding for all atoms in the QM9 dataset [Ref-2] comprising 130,831 stable, synthetically feasible small organic molecules with up to 9 C, N, O and F atoms.
SI_DFT_geo.xyz Contains Cartesian coordinates of 130,831 molecules relaxed at the B3LYP/6-31G(2df,p) level. These geometries are taken from the QM9 dataset reported in Ref-2.
SI_DFT_NMR.txt For each molecule in SI_DFT_geo.xyz, contains the number of atoms, followed by the molecule name and the isotropic shielding tensors per atom in the molecule in Gas, CCl4, THF, Acetone, Methanol and DMSO, in that order, obtained at the mPW1PW91/6-311+G(2d,p) level.
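A minimal sketch of how a file laid out as described above could be read in Python. The exact column layout of the per-atom lines is not specified here, so the code assumes that each atom line carries a leading label (such as the element symbol) followed by six whitespace-separated numbers in the stated phase order. The function name read_nmr_blocks and the sample record (with placeholder numbers, not real data) are hypothetical.

```python
from io import StringIO

# Phase order as stated in the file description.
PHASES = ("Gas", "CCl4", "THF", "Acetone", "Methanol", "DMSO")


def read_nmr_blocks(handle):
    """Yield (name, atoms); atoms is a list of (label, {phase: shielding})."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of file
        if not header.strip():
            continue  # tolerate blank lines between molecules
        n_atoms = int(header.split()[0])
        name = handle.readline().strip()
        atoms = []
        for _ in range(n_atoms):
            tokens = handle.readline().split()
            # Assumed layout: leading label, then one value per phase.
            values = [float(t) for t in tokens[-len(PHASES):]]
            atoms.append((tokens[0], dict(zip(PHASES, values))))
        yield name, atoms


# Placeholder record for illustration only; the numbers are not real data.
sample = StringIO(
    "2\n"
    "mol_A\n"
    "C 1.0 2.0 3.0 4.0 5.0 6.0\n"
    "H 7.0 8.0 9.0 10.0 11.0 12.0\n"
)
for name, atoms in read_nmr_blocks(sample):
    for label, shieldings in atoms:
        print(name, label, shieldings["Gas"], shieldings["DMSO"])
```

For the real file, the same generator could be fed an open file handle, e.g. `with open("SI_DFT_NMR.txt") as fh: ...`; if the actual per-atom columns differ from the assumption above, the token slicing needs to be adjusted.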
Gaussian16 output files for the 130,831 molecules, with NMR data at the mPW1PW91/6-311+G(2d,p) @ B3LYP/6-31G(2df,p) level, are available for download at:
Gas phase
CCl4
THF
Acetone
Methanol
DMSO
NOTE: Some output files failed to upload to NoMaD. We are working on uploading the missing files.
Please use the dataset DOI: https://doi.org/10.17172/NOMAD/2021.10.16-1
Gas - 186.9704
CCl4 - 187.2352
THF - 187.4958
Acetone - 187.5949
Methanol - 187.6181
DMSO - 187.6304
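The six numbers above are listed without a label. If they are read as reference 13C shieldings (in ppm) for the corresponding phases, which is an assumption rather than something stated here, a computed shielding could be converted to a chemical shift with the usual relation delta = sigma_ref - sigma. The sketch below illustrates this; the helper name chemical_shift is hypothetical.

```python
# Values copied from the list above. Treating them as reference shieldings
# (in ppm) is an assumption, not something this description states.
REFERENCE_SHIELDING = {
    "Gas": 186.9704,
    "CCl4": 187.2352,
    "THF": 187.4958,
    "Acetone": 187.5949,
    "Methanol": 187.6181,
    "DMSO": 187.6304,
}


def chemical_shift(shielding, phase="Gas", references=REFERENCE_SHIELDING):
    """Return delta = sigma_ref - sigma for the given phase."""
    return references[phase] - shielding


# An atom whose shielding equals the reference has zero shift by construction.
print(chemical_shift(186.9704, "Gas"))
```

The references argument can be replaced by another mapping, for example one holding a reference value appropriate to a different level of theory.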
SI_baseline_geo.xyz Contains Cartesian coordinates of 130,831 molecules relaxed at the PM7 level.
SI_baseline_NMR.txt For each molecule in SI_baseline_geo.xyz, contains the number of atoms, followed by the molecule name and the isotropic shielding tensors per atom in the molecule in gas phase, obtained at the B3LYP/STO-3G level.
Gas - 232.4620
SI_12Drugs_DFT_geo.xyz Contains 12 drug molecules relaxed at the B3LYP/6-31G(2df,p) level.
SI_12Drugs_baseline_geo.xyz Contains 12 drug molecules relaxed at the PM7 level.
SI_40Drugs_DFT_geo.xyz Contains 40 drug molecules relaxed at the B3LYP/6-31G(2df,p) level.
SI_40Drugs_baseline_geo.xyz Contains 40 drug molecules relaxed at the PM7 level.
SI_PAH_DFT_geo.xyz Contains 5 polycyclic aromatic hydrocarbon molecules relaxed at the B3LYP/6-31G(2df,p) level.
SI_PAH_baseline_geo.xyz Contains 5 polycyclic aromatic hydrocarbon molecules relaxed at the PM7 level.
SI_GDB10to17_DFT_geo.xyz Contains 200 molecules from GDB10 to GDB17 relaxed at the B3LYP/6-31G(2df,p) level.
SI_GDB10to17_baseline_geo.xyz Contains 200 molecules from GDB10 to GDB17 relaxed at the PM7 level.
SI_12Drugs_DFT_NMR.txt For each molecule in SI_12Drugs_DFT_geo.xyz, contains the number of atoms, molecule name and isotropic shielding tensors per atom, in gas phase at the mPW1PW91/6-311+G(2d,p) level.
SI_12Drugs_baseline_NMR.txt For each molecule in SI_12Drugs_baseline_geo.xyz, contains the number of atoms, molecule name and isotropic shielding tensors per atom, in gas phase at the B3LYP/STO-3G level.
SI_40Drugs_DFT_NMR.txt For each molecule in SI_40Drugs_DFT_geo.xyz, contains the number of atoms, molecule name and isotropic shielding tensors per atom, in gas phase at the mPW1PW91/6-311+G(2d,p) level.
SI_40Drugs_baseline_NMR.txt For each molecule in SI_40Drugs_baseline_geo.xyz, contains the number of atoms, molecule name and isotropic shielding tensors per atom, in gas phase at the B3LYP/STO-3G level.
SI_PAH_DFT_NMR.txt For each molecule in SI_PAH_DFT_geo.xyz, contains the number of atoms, molecule name and isotropic shielding tensors per atom, in gas phase at the mPW1PW91/6-311+G(2d,p) level.
SI_PAH_baseline_NMR.txt For each molecule in SI_PAH_baseline_geo.xyz, contains the number of atoms, molecule name and isotropic shielding tensors per atom, in gas phase at the B3LYP/STO-3G level.
SI_GDB10to17_DFT_NMR.txt For each molecule in SI_GDB10to17_DFT_geo.xyz, contains the number of atoms, molecule name and isotropic shielding tensors per atom, in gas phase at the mPW1PW91/6-311+G(2d,p) level.
SI_GDB10to17_baseline_NMR.txt For each molecule in SI_GDB10to17_baseline_geo.xyz, contains the number of atoms, molecule name and isotropic shielding tensors per atom, in gas phase at the B3LYP/STO-3G level.
19 June 2021: In our original upload, the atomic indices of the baseline data such as the PM7 geometries were shuffled. We thank Eric Collins for pointing this out. We have now uploaded the file ‘pm7.tar.gz’ with correct atomic indices.
[Ref-1] Revving up 13C NMR shielding predictions across chemical space: benchmarks for atoms-in-molecules kernel machine learning with new data for 134 kilo molecules
Amit Gupta, Sabyasachi Chakraborty and Raghunathan Ramakrishnan
Mach. Learn.: Sci. Technol. 2 (2021) 035010
Supplementary Information to the article (PDF)
[Ref-2] Quantum chemistry structures and properties of 134 kilo molecules
Raghunathan Ramakrishnan, Pavlo Dral, Matthias Rupp, O. Anatole von Lilienfeld
Scientific Data 1, Article number: 140022 (2014).