
NMR Shielding Parameters for 130831 QM9 Molecules with up to 9 C, N, O and F atoms


QM9NMR dataset

The QM9-NMR dataset [Ref-1] contains gas-phase and implicit-solvent chemical shieldings, computed at the mPW1PW91/6-311+G(2d,p) level, for all atoms in the QM9 dataset [Ref-2], which comprises 130,831 stable, synthetically feasible small organic molecules with up to 9 C, N, O and F atoms.


B3LYP/6-31G(2df,p) geometries

SI_DFT_geo.xyz Contains Cartesian coordinates of 130,831 molecules relaxed at the B3LYP/6-31G(2df,p) level. These geometries are taken from the QM9 dataset reported in [Ref-2].

mPW1PW91/6-311+G(2d,p) @ B3LYP/6-31G(2df,p) NMR data

SI_DFT_NMR.txt For each molecule in SI_DFT_geo.xyz, contains the number of atoms, followed by the molecule name and the isotropic shielding tensors per atom in Gas, CCl4, THF, Acetone, Methanol and DMSO, in that order, obtained at the mPW1PW91/6-311+G(2d,p) level.
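
The description above suggests an XYZ-like record per molecule, so a reader for SI_DFT_NMR.txt could look roughly like the sketch below. The exact line layout is an assumption, not something stated here: the sketch assumes one line with the atom count, one line with the molecule name, and then one line per atom whose last six whitespace-separated columns are the shieldings in the listed phase order. The names `PHASES` and `iter_nmr_records` are hypothetical. Check a few records by hand before relying on it.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

# Column order as listed in the file description above.
PHASES = ["Gas", "CCl4", "THF", "Acetone", "Methanol", "DMSO"]


def iter_nmr_records(
    lines: Iterable[str],
) -> Iterator[Tuple[str, List[Dict[str, float]]]]:
    """Yield (molecule_name, per_atom_shieldings) from an iterable of lines.

    ASSUMED layout (not confirmed by the dataset description):
      line 1       : number of atoms N
      line 2       : molecule name
      next N lines : one atom per line, ending in six numeric columns
    """
    it = iter(lines)
    for line in it:
        line = line.strip()
        if not line:
            continue  # tolerate blank separator lines between records
        n_atoms = int(line)
        name = next(it).strip()
        atoms = []
        for _ in range(n_atoms):
            tokens = next(it).split()
            # Take the trailing six columns so that a leading element
            # symbol or atom index, if present, is ignored.
            values = [float(t) for t in tokens[-len(PHASES):]]
            atoms.append(dict(zip(PHASES, values)))
        yield name, atoms
```

With a local copy of the file, usage would be along the lines of `with open("SI_DFT_NMR.txt") as fh: records = list(iter_nmr_records(fh))`. Yielding one molecule at a time keeps memory use low if only a subset of the 130,831 records is needed.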

Raw Input/Output files

Gaussian16 output files for the 130,831 molecules, with NMR data at the mPW1PW91/6-311+G(2d,p) @ B3LYP/6-31G(2df,p) level, are available for download at

Gas phase

NOTE: Some output files failed to upload to NOMAD. We are working on uploading the missing files.

Please use the dataset DOI: https://doi.org/10.17172/NOMAD/2021.10.16-1

mPW1PW91/6-311+G(2d,p) @ B3LYP/6-31G(2df,p) 13C shielding for tetramethylsilane (TMS) [in ppm]

Gas      - 186.9704
CCl4     - 187.2352
THF      - 187.4958
Acetone  - 187.5949
Methanol - 187.6181
DMSO     - 187.6304
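
The TMS values above are typically used to turn a computed 13C isotropic shielding into a chemical shift via the common convention delta = sigma(TMS) - sigma, with the reference taken in the same phase as the sample. A minimal sketch of that conversion follows; the names `TMS_13C_DFT` and `shielding_to_shift` are hypothetical, and the convention is assumed to be the intended use of these reference values.

```python
# TMS 13C reference shieldings (ppm) at mPW1PW91/6-311+G(2d,p) @ B3LYP/6-31G(2df,p),
# copied from the list above.
TMS_13C_DFT = {
    "Gas": 186.9704,
    "CCl4": 187.2352,
    "THF": 187.4958,
    "Acetone": 187.5949,
    "Methanol": 187.6181,
    "DMSO": 187.6304,
}


def shielding_to_shift(sigma: float, phase: str = "Gas") -> float:
    """Convert a 13C isotropic shielding (ppm) to a chemical shift (ppm).

    Uses delta = sigma_TMS - sigma with the TMS reference from the same phase.
    """
    return TMS_13C_DFT[phase] - sigma
```

For example, a carbon with a gas-phase shielding of 100.0 ppm maps to a shift of about 86.97 ppm under this convention. These reference values apply only to shieldings computed at the same level of theory.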

PM7 geometries

SI_baseline_geo.xyz Contains Cartesian coordinates of 130,831 molecules relaxed at the PM7 level.

B3LYP/STO-3G @ PM7 NMR data

SI_baseline_NMR.txt For each molecule in SI_baseline_geo.xyz, contains number of atoms, followed by molecule name and isotropic shielding tensors per atom in the molecule in gas phase obtained at B3LYP/STO-3G level.

B3LYP/STO-3G @ PM7 13C shielding for tetramethylsilane (TMS) [in ppm]

Gas      - 232.4620

Case studies


SI_12Drugs_DFT_geo.xyz Contains 12 drug molecules relaxed at the B3LYP/6-31G(2df,p) level.
SI_12Drugs_baseline_geo.xyz Contains 12 drug molecules relaxed at the PM7 level.
SI_40Drugs_DFT_geo.xyz Contains 40 drug molecules relaxed at the B3LYP/6-31G(2df,p) level.
SI_40Drugs_baseline_geo.xyz Contains 40 drug molecules relaxed at the PM7 level.
SI_PAH_DFT_geo.xyz Contains 5 polycyclic aromatic hydrocarbon molecules relaxed at the B3LYP/6-31G(2df,p) level.
SI_PAH_baseline_geo.xyz Contains 5 polycyclic aromatic hydrocarbon molecules relaxed at the PM7 level.
SI_GDB10to17_DFT_geo.xyz Contains 200 molecules from GDB10 to GDB17 relaxed at the B3LYP/6-31G(2df,p) level.
SI_GDB10to17_baseline_geo.xyz Contains 200 molecules from GDB10 to GDB17 relaxed at the PM7 level.

NMR data

SI_12Drugs_DFT_NMR.txt For each molecule in SI_12Drugs_DFT_geo.xyz, contains the number of atoms, molecule name and isotropic shielding tensors per atom, in the gas phase at the mPW1PW91/6-311+G(2d,p) level.
SI_12Drugs_baseline_NMR.txt For each molecule in SI_12Drugs_baseline_geo.xyz, contains the number of atoms, molecule name and isotropic shielding tensors per atom, in the gas phase at the B3LYP/STO-3G level.
SI_40Drugs_DFT_NMR.txt For each molecule in SI_40Drugs_DFT_geo.xyz, contains the number of atoms, molecule name and isotropic shielding tensors per atom, in the gas phase at the mPW1PW91/6-311+G(2d,p) level.
SI_40Drugs_baseline_NMR.txt For each molecule in SI_40Drugs_baseline_geo.xyz, contains the number of atoms, molecule name and isotropic shielding tensors per atom, in the gas phase at the B3LYP/STO-3G level.
SI_PAH_DFT_NMR.txt For each molecule in SI_PAH_DFT_geo.xyz, contains the number of atoms, molecule name and isotropic shielding tensors per atom, in the gas phase at the mPW1PW91/6-311+G(2d,p) level.
SI_PAH_baseline_NMR.txt For each molecule in SI_PAH_baseline_geo.xyz, contains the number of atoms, molecule name and isotropic shielding tensors per atom, in the gas phase at the B3LYP/STO-3G level.
SI_GDB10to17_DFT_NMR.txt For each molecule in SI_GDB10to17_DFT_geo.xyz, contains the number of atoms, molecule name and isotropic shielding tensors per atom, in the gas phase at the mPW1PW91/6-311+G(2d,p) level.
SI_GDB10to17_baseline_NMR.txt For each molecule in SI_GDB10to17_baseline_geo.xyz, contains the number of atoms, molecule name and isotropic shielding tensors per atom, in the gas phase at the B3LYP/STO-3G level.

Revision notes

19 June 2021: In our original upload, the atomic indices of the baseline data such as the PM7 geometries were shuffled. We thank Eric Collins for pointing this out. We have now uploaded the file ‘pm7.tar.gz’ with correct atomic indices.
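
Users who downloaded files before this correction may want to check that atom ordering agrees between a baseline geometry file and its DFT counterpart. The sketch below compares the element sequence molecule by molecule. It rests on assumptions not stated in this description: that both files are concatenated standard XYZ blocks (atom count, comment line, then one line per atom starting with the element symbol), that molecules appear in the same order in both files, and that the two files are meant to share atom ordering. The function names are hypothetical.

```python
from typing import Iterable, Iterator, List, Tuple


def iter_xyz_elements(lines: Iterable[str]) -> Iterator[Tuple[str, List[str]]]:
    """Yield (comment_line, element_symbols) for each molecule.

    ASSUMES concatenated standard XYZ blocks: atom count, comment line,
    then one line per atom whose first token is the element symbol.
    """
    it = iter(lines)
    for line in it:
        if not line.strip():
            continue  # skip blank lines between blocks
        n_atoms = int(line.strip())
        comment = next(it).strip()
        elements = [next(it).split()[0] for _ in range(n_atoms)]
        yield comment, elements


def mismatched_molecules(lines_a: Iterable[str], lines_b: Iterable[str]) -> List[str]:
    """Return comment lines (from the first file) of molecules whose
    element sequence differs between the two files."""
    bad = []
    pairs = zip(iter_xyz_elements(lines_a), iter_xyz_elements(lines_b))
    for (name_a, elems_a), (_, elems_b) in pairs:
        if elems_a != elems_b:
            bad.append(name_a)
    return bad
```

This check only detects shuffles that change the element sequence. A swap between two atoms of the same element would pass unnoticed and would need a geometry-based comparison instead.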


[Ref-1] Revving up 13C NMR shielding predictions across chemical space: benchmarks for atoms-in-molecules kernel machine learning with new data for 134 kilo molecules
Amit Gupta, Sabyasachi Chakraborty and Raghunathan Ramakrishnan
Mach. Learn.: Sci. Technol. 2 (2021) 035010
Supplementary Information to the article (PDF)

[Ref-2] Quantum chemistry structures and properties of 134 kilo molecules
Raghunathan Ramakrishnan, Pavlo Dral, Matthias Rupp, O. Anatole von Lilienfeld
Scientific Data 1, Article number: 140022 (2014).