In an article published in the journal Nature, researchers introduced a new approach to methyl transverse relaxation-optimized spectroscopy (methyl-TROSY), a nuclear magnetic resonance (NMR) technique used to characterize large biomolecules in solution. Sample preparation for this technique has typically required deuteration, which has limited its use.
The results showed that deep neural networks (DNNs) could process NMR spectra from protonated, uniformly 13C-labeled samples to yield spectra comparable to those obtained from deuterated samples. The method was verified experimentally on proteins of up to 360 kilodaltons (kDa). It was also applicable to three-dimensional (3D) nuclear Overhauser effect spectroscopy (NOESY) spectra, broadening the study of large biomolecules.
Background
NMR spectroscopy is a crucial technique widely used in material science, chemistry, structural biology, and clinical diagnostics. It offers unique insights into functional motions and non-covalent interactions at atomic-level resolution. However, NMR spectroscopy faces significant challenges due to its intrinsic insensitivity, necessitating ongoing efforts to enhance both resolution and sensitivity.
One of the major hurdles in NMR spectroscopy is nuclear spin relaxation, which becomes faster as molecular size increases and therefore limits the study of large biomolecules by solution-state NMR. Traditionally, this has imposed a size limit beyond which most signals are broadened to the point of being undetectable.
Previous advancements have extended these size limits for biomolecular NMR applications. Notable developments include improved hardware, refined sample preparation techniques, and advanced pulse sequence designs. A crucial breakthrough in this domain is the introduction of methyl-TROSY methods, which utilize methyl-bearing side chains to probe biomolecular structures and dynamics.
However, high-quality methyl-TROSY spectra require extensive deuteration of the protein, except for specifically labeled methyl groups, which poses challenges such as increased costs, lower yields, and infeasibility for proteins that must be expressed in mammalian systems.
Recent studies have demonstrated that DNNs can be trained to accurately transform and analyze complex NMR data. This study aimed to use DNNs to process NMR spectra of uniformly 13C-labeled, protonated samples, thereby avoiding the need for deuteration. The proposed methodology employed DNNs to generate high-quality 13C-1H correlation spectra akin to classical methyl-TROSY spectra.
By removing one-bond 13C-13C scalar couplings and improving the resolution of the 1H and 13C dimensions, this approach sharpened observed cross-peaks. The study verified this method on synthetic data and experimental data from proteins ranging from 42 kDa to 360 kDa, demonstrating its applicability to large proteins usually inaccessible to NMR.
Methodology for Protein Expression, Purification, and NMR Data Collection
Expressing and preparing proteins for NMR spectroscopy involved several steps tailored to each protein's labeling requirements. Specific methyl labeling was achieved by adding alpha-ketobutyric acid or alpha-ketoisovaleric acid shortly before induction, so that labeled methyl groups were incorporated into the proteins.
The expression and purification protocols varied with the protein being studied. For example, histone deacetylase 8 (HDAC8) was expressed in BL21(λDE3) Escherichia coli (E. coli) cells, with induction by isopropyl β-D-1-thiogalactopyranoside (IPTG) and zinc chloride (ZnCl2) at a lower temperature to improve protein solubility. Purification included nickel-nitrilotriacetic acid (Ni-NTA) affinity chromatography followed by size-exclusion chromatography, yielding a concentrated sample suitable for NMR analysis in specific buffer conditions.
NMR data collection for such proteins included two-dimensional (2D) heteronuclear single-quantum coherence (HSQC) spectra recorded on high-field spectrometers equipped with cryoprobes for high sensitivity and resolution. These spectra served as input for the FID-Net DNNs, which were trained to improve spectral resolution and remove scalar couplings, thereby transforming the input into high-quality 13C-1H correlation spectra resembling classical methyl-TROSY spectra.
Verification using both synthetic and experimental protein data confirmed the usefulness of this approach across protein sizes from 42 kDa to 360 kDa. The study also extended the DNN approach to the processing of 3D NOESY spectra of malate synthase G (MSG), aiding chemical shift assignment and structural elucidation.
Enhanced Resolution and Spectral Clarity
The researchers examined why traditional NMR approaches struggle to produce high-quality 13C-1H correlation maps for large proteins. In uniformly 13C-labelled proteins, one-bond 13C-13C scalar couplings split each signal in a conventional 13C-1H HSQC spectrum into a multiplet, which complicates interpretation. In addition, without deuteration, increased dipolar relaxation broadens the lines and makes the spectra harder still to analyze. Even constant-time 13C-1H HSQC spectra did not yield high-quality data, because of skewed intensities and invisible signals.
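The two effects described above can be illustrated with a minimal one-dimensional simulation. The sketch below is not taken from the study: the function names are hypothetical, and the coupling constant (35 Hz) and relaxation rates (20 and 60 per second) are assumed, illustrative values. It models a single resonance as a decaying complex exponential, multiplies it by cos(πJt) to represent a scalar coupling, and Fourier transforms the result.

```python
import numpy as np

def simulate_fid(freq_hz, r2, j_hz=0.0, n_points=4096, dwell=1e-3):
    """Complex time-domain signal for one resonance.

    A non-zero j_hz multiplies the signal by cos(pi*J*t), which splits
    the peak into a doublet separated by J after Fourier transformation.
    r2 is the transverse relaxation rate in 1/s (assumed value).
    """
    t = np.arange(n_points) * dwell
    return np.exp(2j * np.pi * freq_hz * t) * np.cos(np.pi * j_hz * t) * np.exp(-r2 * t)

def spectrum(fid, dwell=1e-3):
    """Return (frequency axis in Hz, real part of the Fourier transform)."""
    scaled = fid.copy()
    scaled[0] *= 0.5  # halve the first point to avoid a constant baseline offset
    freqs = np.fft.fftshift(np.fft.fftfreq(scaled.size, d=dwell))
    spec = np.fft.fftshift(np.fft.fft(scaled)).real
    return freqs, spec

def peak_positions(freqs, spec, threshold=0.5):
    """Frequencies of local maxima higher than threshold * global maximum."""
    cut = threshold * spec.max()
    mid = spec[1:-1]
    is_peak = (mid > spec[:-2]) & (mid >= spec[2:]) & (mid > cut)
    return freqs[1:-1][is_peak]

def fwhm(freqs, spec):
    """Full width at half maximum of a single peak, in Hz."""
    above = freqs[spec >= spec.max() / 2]
    return above.max() - above.min()

# Scalar coupling: one resonance at 100 Hz appears as two lines about 35 Hz apart.
f, s = spectrum(simulate_fid(100.0, r2=20.0, j_hz=35.0))
print("doublet lines (Hz):", peak_positions(f, s))

# Relaxation: a Lorentzian line has a width of about r2/pi, so faster decay means a broader peak.
for r2 in (20.0, 60.0):
    f, s = spectrum(simulate_fid(100.0, r2=r2))
    print(f"r2 = {r2:.0f} 1/s -> measured width {fwhm(f, s):.1f} Hz, r2/pi = {r2 / np.pi:.1f} Hz")
```

In a real 13C-1H correlation map, both effects act on many overlapping signals at once, which is why removing the splitting and narrowing the lines makes crowded spectra easier to read.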
To address these challenges, the researchers created two DNNs based on the FID-Net architecture. The first DNN removed 13C-13C couplings and sharpened signals in the 13C dimension, while the second DNN reduced decay rates to sharpen peaks in the 1H dimension. These networks were trained on synthetic data resembling real 13C-1H correlation maps of proteins. When tested on synthetic spectra, the FID-Net approach showed a notable improvement in resolution and in the accuracy of peak identification.
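The idea of training on synthetic data can be sketched as the generation of paired signals: an input that carries a scalar coupling and decays quickly, and a target with the same peaks but no coupling and a slower decay. The code below is a simplified one-dimensional illustration under stated assumptions, not the study's actual training pipeline. The function names, parameter ranges, coupling value, and the `sharpen` factor are all hypothetical, and no network is defined here.

```python
import numpy as np

def make_training_pair(rng, n_peaks=5, n_points=256, dwell=1e-3,
                       j_hz=35.0, r2_range=(30.0, 80.0), sharpen=0.5):
    """Return one (input, target) pair of complex time-domain signals.

    input : sum of peaks, each split by a coupling j_hz and decaying at r2
    target: the same peaks, uncoupled and decaying at sharpen * r2
    All parameter ranges are illustrative assumptions.
    """
    t = np.arange(n_points) * dwell
    freqs = rng.uniform(-400.0, 400.0, n_peaks)   # peak positions in Hz
    amps = rng.uniform(0.2, 1.0, n_peaks)         # peak intensities
    r2 = rng.uniform(r2_range[0], r2_range[1], n_peaks)  # decay rates in 1/s

    carrier = amps[:, None] * np.exp(2j * np.pi * freqs[:, None] * t)
    coupled = carrier * np.cos(np.pi * j_hz * t) * np.exp(-r2[:, None] * t)
    clean = carrier * np.exp(-sharpen * r2[:, None] * t)
    return coupled.sum(axis=0), clean.sum(axis=0)

def make_dataset(n_examples, seed=0, **kwargs):
    """Stack pairs into real-valued arrays of shape (n_examples, n_points, 2).

    The last axis holds the real and imaginary parts, a common way to feed
    complex signals to a network that expects real-valued inputs.
    """
    rng = np.random.default_rng(seed)
    pairs = [make_training_pair(rng, **kwargs) for _ in range(n_examples)]
    x = np.stack([p[0] for p in pairs])
    y = np.stack([p[1] for p in pairs])
    to_channels = lambda z: np.stack([z.real, z.imag], axis=-1)
    return to_channels(x), to_channels(y)

x_train, y_train = make_dataset(8, seed=0)
print(x_train.shape, y_train.shape)
```

A model trained on pairs like these would learn a mapping from coupled, broadened signals to decoupled, sharper ones. Realistic training data would also need noise, two-dimensional signals, and parameter ranges matched to actual protein spectra.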
The approach was then applied to experimental data from uniformly 13C-labelled proteins. FID-Net processing produced high-quality correlation maps comparable to methyl-TROSY spectra. For HDAC8, the FID-Net-transformed spectrum showed excellent correspondence with a classical methyl-TROSY spectrum, identifying all expected methyl peaks as well as additional peaks from other methyl groups. The approach was also tested on the large 360 kDa α7α7 complex, demonstrating the method's capability at the limits of current technology.
The researchers concluded that FID-Net DNNs provided a robust approach for producing high-quality NMR spectra of large proteins without deuteration, offering lower costs and more comprehensive data covering all methyl-bearing side chains.
Conclusion
The researchers introduced an approach that uses DNNs to process NMR spectra of protonated, uniformly 13C-labeled samples, eliminating the need for deuteration. By improving resolution and removing scalar couplings, the method produced high-quality 13C-1H correlation spectra comparable to methyl-TROSY spectra for proteins ranging from 42 to 360 kDa.