Enhancing NMR Spectroscopy for Large Biomolecules Using AI

By Soham Nandi. Reviewed by Susha Cheriyedath, M.Sc. Jun 21 2024

In an article published in the journal Nature Communications, researchers introduced a new approach to methyl-transverse relaxation optimized spectroscopy (methyl-TROSY) nuclear magnetic resonance (NMR) for characterizing large biomolecules in solution. Sample preparation for this technique has typically required deuteration, which has limited its use.

a, b Exemplar synthetic data without processing by the FID-Net DNNs (left column) and with FID-Net based processing (right column). a A synthetic spectrum with a similar number of signals to HDAC8 (42 kDa). b A spectrum with a similar number of signals to MSG (81 kDa). One hundred distinct spectra with a similar number of signals to those shown in a or b are generated. These are then analysed using the FID-Net approach, and peaks picked from the resulting spectra are compared to ground truth values. From the picked peaks, true positive, false positive and false negative rates are calculated (only considering isolated peaks) and plotted (c, d). Full details are given in the Methods section. Image Credit: https://www.nature.com/articles/s41467-024-49378-8

The results showed that deep neural networks (DNNs) could process NMR spectra from protonated, uniformly ¹³C-labeled samples to produce spectra comparable to those obtained from deuterated samples. The method was verified experimentally on proteins up to 360 kilodaltons (kDa). It was also applicable to three-dimensional (3D) nuclear Overhauser effect spectroscopy (NOESY) spectra, which aids the study of large biomolecules.

Background

NMR spectroscopy is a crucial technique widely used in material science, chemistry, structural biology, and clinical diagnostics. It offers unique insights into functional motions and non-covalent interactions at atomic-level resolution. However, NMR spectroscopy faces significant challenges due to its intrinsic insensitivity, necessitating ongoing efforts to enhance both resolution and sensitivity.

One of the major hurdles in NMR spectroscopy is nuclear spin-relaxation, which escalates with molecular size, thereby limiting the study of large biomolecules in solution-state NMR. Traditionally, this limitation has imposed size constraints, beyond which most signals become broadened and undetectable.
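The link between relaxation and broadened signals can be illustrated numerically. The sketch below is not taken from the study: it simulates a single resonance whose signal decays at a transverse relaxation rate R2 and measures the resulting linewidth, which for a Lorentzian line is R2/π in Hz. The function names, the 100 Hz offset, and the R2 values are illustrative assumptions.

```python
import numpy as np

def simulate_fid(freq_hz, r2, n_points=4096, dwell=1e-3):
    """Complex time-domain signal of one resonance decaying at rate r2 (s^-1)."""
    t = np.arange(n_points) * dwell
    return np.exp(2j * np.pi * freq_hz * t) * np.exp(-r2 * t)

def fwhm_hz(fid, dwell=1e-3, zero_fill=8):
    """Full width at half maximum (Hz) of the absorptive peak after Fourier transform."""
    n = len(fid) * zero_fill
    fid = fid.copy()
    fid[0] *= 0.5  # halve the first point to avoid a baseline offset
    spectrum = np.fft.fftshift(np.fft.fft(fid, n=n)).real
    freqs = np.fft.fftshift(np.fft.fftfreq(n, d=dwell))
    above_half = freqs[spectrum >= spectrum.max() / 2]
    return above_half.max() - above_half.min()

# Assumed, illustrative rates: a slowly relaxing and a fast relaxing signal.
for r2 in (10.0, 100.0):
    width = fwhm_hz(simulate_fid(100.0, r2))
    print(f"R2 = {r2:5.1f} s^-1 -> linewidth {width:.2f} Hz (R2/pi = {r2 / np.pi:.2f} Hz)")
```

A tenfold faster decay gives a tenfold broader, and correspondingly lower, peak, which is why signals from very large molecules can sink below the detection limit.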

Previous advancements have extended these size limits for biomolecular NMR applications. Notable developments include improved hardware, refined sample preparation techniques, and advanced pulse sequence designs. A crucial breakthrough in this domain is the introduction of methyl-TROSY methods, which utilize methyl-bearing side chains to probe biomolecular structures and dynamics.

However, high-quality methyl-TROSY spectra require extensive deuteration of the protein, except for specifically labeled methyl groups, which poses challenges such as increased costs, lower yields, and infeasibility for proteins that must be expressed in mammalian systems.

Recent studies have demonstrated that DNNs can be trained to accurately transform and analyze complex NMR data. This study aimed to use DNNs to process NMR spectra of uniformly ¹³C-labeled, protonated samples, thereby avoiding the need for deuteration. The proposed methodology employed DNNs to generate high-quality ¹³C-¹H correlation spectra akin to classical methyl-TROSY spectra.

By removing one-bond ¹³C-¹³C scalar couplings and improving the resolution of the ¹H and ¹³C dimensions, this approach sharpened observed cross-peaks. The study verified this method on synthetic data and experimental data from proteins ranging from 42 kDa to 360 kDa, demonstrating its applicability to large proteins usually inaccessible to NMR.

Methodology for Protein Expression, Purification, and NMR Data Collection

Expressing and preparing proteins for NMR spectroscopy involved multi-step protocols tailored to each protein's labeling requirements. Specific methyl labeling was achieved by adding alpha-ketobutyric acid or alpha-ketoisovaleric acid shortly before induction, so that labeled methyl groups were incorporated into the proteins.

The expression and purification protocols varied based on the protein being studied. For example, histone deacetylase 8 (HDAC8) underwent expression in BL21(λDE3) Escherichia coli (E. coli) cells followed by induction with isopropyl β-D-1-thiogalactopyranoside (IPTG) and zinc chloride (ZnCl₂) at a lower temperature to improve protein solubility. Purification included nickel-nitrilotriacetic acid (Ni-NTA) affinity chromatography followed by size-exclusion chromatography, resulting in a concentrated sample suitable for NMR analysis in specific buffer conditions.

NMR data collection for these proteins included two-dimensional (2D) heteronuclear single-quantum coherence (HSQC) spectra recorded on high-field spectrometers equipped with cryoprobes for high sensitivity and resolution. These spectra served as input data for the FID-Net DNNs, which were trained to improve spectral resolution and remove scalar couplings, thereby transforming them into high-quality ¹³C-¹H correlation spectra resembling classical methyl-TROSY spectra.

Verification using both synthetic and protein-derived data confirmed the usefulness of this approach across a range of protein sizes from 42 kDa to 360 kDa. The study also extended the application of DNNs to 3D NOESY spectra of malate synthase G (MSG), helping with chemical shift assignments and structural elucidation.

Enhanced Resolution and Spectral Clarity

The researchers investigated the challenges and solutions in obtaining high-quality ¹³C-¹H correlation maps for large proteins using traditional NMR approaches. Traditional methods such as ¹³C-¹H HSQC struggle with uniformly ¹³C-labelled proteins because one-bond ¹³C-¹³C scalar couplings split signals into multiplets and complicate spectral interpretation. Additionally, the absence of deuteration results in line broadening due to increased dipolar relaxation, further complicating spectral analysis. Even constant-time ¹³C-¹H HSQC spectra did not yield high-quality data because of skewed intensities and invisible signals.
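The effect of a one-bond ¹³C-¹³C scalar coupling on a single carbon signal can be sketched with a simulated time-domain signal. This is a minimal illustration, not the study's code: it assumes a carbon with one ¹³C neighbour, a coupling of about 35 Hz (a typical aliphatic value, chosen here as an assumption), and arbitrary offset and relaxation values. The coupling multiplies the signal by cos(πJt), which splits the peak into a doublet at ±J/2 around the true frequency.

```python
import numpy as np

J_CC_HZ = 35.0  # assumed typical one-bond aliphatic 13C-13C coupling

def carbon_fid(freq_hz, r2, j_hz=0.0, n_points=2048, dwell=1e-3):
    """13C time-domain signal with optional cosine modulation from one coupled neighbour."""
    t = np.arange(n_points) * dwell
    coupling = np.cos(np.pi * j_hz * t)  # equals 1 everywhere when j_hz is 0
    return np.exp(2j * np.pi * freq_hz * t) * coupling * np.exp(-r2 * t)

def peak_positions(fid, dwell=1e-3, zero_fill=8, rel_threshold=0.5):
    """Frequencies (Hz) of local maxima above rel_threshold times the tallest point."""
    n = len(fid) * zero_fill
    fid = fid.copy()
    fid[0] *= 0.5  # halve the first point to avoid a baseline offset
    spec = np.fft.fftshift(np.fft.fft(fid, n=n)).real
    freqs = np.fft.fftshift(np.fft.fftfreq(n, d=dwell))
    inner = spec[1:-1]
    is_max = (inner > spec[:-2]) & (inner >= spec[2:]) & (inner > rel_threshold * spec.max())
    return freqs[1:-1][is_max]

coupled = peak_positions(carbon_fid(100.0, 20.0, j_hz=J_CC_HZ))
decoupled = peak_positions(carbon_fid(100.0, 20.0))
print("coupled peaks (Hz):  ", np.round(coupled, 1))
print("decoupled peaks (Hz):", np.round(decoupled, 1))
```

The decoupled trace is the kind of target a coupling-removal step aims for: one peak per carbon, at the true chemical shift, with the full intensity in a single line.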

To address these challenges, two DNNs based on the FID-Net architecture were created. The first DNN removed ¹³C-¹³C couplings and sharpened signals in the ¹³C dimension, while the second DNN reduced decay rates to sharpen peaks in the ¹H dimension. These networks were trained on synthetic data resembling real ¹³C-¹H correlation maps of proteins. When tested on synthetic spectra, the FID-Net approach showed notable improvements in resolution and in the accuracy of peak identification.
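Accuracy of peak identification on synthetic spectra is typically expressed as counts of true positives, false positives, and false negatives against the known peak list. The sketch below shows one simple way to compute these counts. The matching rule (greedy, one-to-one, within a tolerance box) and the tolerances of 0.02 ppm in ¹H and 0.2 ppm in ¹³C are assumptions for illustration; the study's own procedure is described in its Methods section and considers only isolated peaks.

```python
import numpy as np

def score_peaks(picked, truth, tol=(0.02, 0.2)):
    """Count (TP, FP, FN) by matching picked peaks to true peaks.

    Peaks are (1H ppm, 13C ppm) pairs. A picked peak matches a true peak when
    both coordinates lie within `tol`; each picked peak is used at most once.
    """
    picked = np.asarray(picked, dtype=float).reshape(-1, 2)
    truth = np.asarray(truth, dtype=float).reshape(-1, 2)
    tol = np.asarray(tol, dtype=float)
    unmatched = list(range(len(picked)))
    true_pos = 0
    for true_peak in truth:
        best, best_dist = None, np.inf
        for idx in unmatched:
            delta = np.abs(picked[idx] - true_peak) / tol  # distance in tolerance units
            if np.all(delta <= 1.0) and delta.sum() < best_dist:
                best, best_dist = idx, delta.sum()
        if best is not None:
            unmatched.remove(best)
            true_pos += 1
    false_pos = len(unmatched)          # picked peaks with no true counterpart
    false_neg = len(truth) - true_pos   # true peaks that were never picked
    return true_pos, false_pos, false_neg

# Hypothetical peak lists, for illustration only.
truth = [(0.80, 21.0), (0.95, 23.5), (1.10, 18.2)]
picked = [(0.805, 21.05), (1.10, 18.25), (0.30, 15.0)]
tp, fp, fn = score_peaks(picked, truth)
print(f"TP={tp} FP={fp} FN={fn}")
```

Dividing these counts by the number of true or picked peaks gives rates that can be compared between spectra with and without DNN processing.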

The approach was applied to experimental data from uniformly ¹³C-labelled proteins. The FID-Net processing produced high-quality correlation maps comparable to methyl-TROSY spectra. For HDAC8, the FID-Net transformed spectrum showed excellent correspondence with a classical methyl-TROSY spectrum, identifying all expected methyl peaks and additional peaks from other methyl groups. The approach was also tested on the large 360 kDa α7α7 complex, demonstrating the method's capability at the limits of current technology.

The researchers concluded that FID-Net DNNs provided a robust approach for producing high-quality NMR spectra of large proteins without deuteration, offering significant advantages in cost and comprehensive data from all methyl-bearing side chains.

Conclusion

In conclusion, the researchers introduced an approach that uses DNNs to process NMR spectra of protonated, uniformly ¹³C-labeled samples, eliminating the need for deuteration. By improving resolution and removing scalar couplings, this method achieved high-quality ¹³C-¹H correlation spectra comparable to methyl-TROSY and was applicable to proteins ranging from 42 to 360 kDa.

Journal reference:

Karunanithy, G., Shukla, V. K., & Hansen, D. F. (2024). Solution-state methyl NMR spectroscopy of large non-deuterated proteins enabled by deep neural networks. Nature Communications, 15(1), 5073. https://doi.org/10.1038/s41467-024-49378-8, https://www.nature.com/articles/s41467-024-49378-8

Written by

Soham Nandi

Soham Nandi is a technical writer based in Memari, India. His academic background is in Computer Science Engineering, specializing in Artificial Intelligence and Machine learning. He has extensive experience in Data Analytics, Machine Learning, and Python. He has worked on group projects that required the implementation of Computer Vision, Image Classification, and App Development.
