In an article published in the journal Nature Communications, researchers presented a novel approach to quantum refinement (QR) of biomacromolecule structures using machine learning potentials (MLPs) within multiscale ONIOM (‘our own N-layered integrated molecular orbital and molecular mechanics’) schemes that combine quantum mechanics (QM) and molecular mechanics (MM), written ONIOM(QM:MM).
By replacing expensive QM methods with MLPs, the authors achieved quantum-level accuracy with higher efficiency. The refinements provided evidence for the bonded and nonbonded forms of the Food and Drug Administration (FDA)-approved drug nirmatrelvir in a severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) protease structure, highlighting the potential for broader QR applications in drug development.
Background
Accurate atomic structures of biomacromolecules are crucial for predicting molecular properties, estimating binding poses, and understanding ligand binding site recognition and biocatalysis. This structural information is essential for the rational design of potent and selective drugs targeting specific binding sites.
X-ray diffraction (XRD) has long been a powerful method for determining the atomic structures of biomacromolecules, relying on standard crystallographic refinement methods that combine MM force fields with experimental XRD data. However, developing reliable force fields for diverse drug molecules is challenging due to the vast chemical space and complex electronic effects.
Recent advancements in artificial intelligence (AI) methods, such as AlphaFold, have shown impressive results in predicting protein structures but struggle with systems involving cofactors or drugs due to limited experimental data. QR, which employs QM methods, has successfully improved the structural quality of some protein-drug complexes but is limited by high computational costs and complex QM/MM setups.
This paper addressed these limitations by incorporating MLPs into multiscale ONIOM(QM:MM) schemes, replacing expensive QM methods. By combining two levels of MLPs for the first time, the study aimed to achieve QM-level accuracy with significantly higher efficiency. This approach was tested on 50 protein-drug systems, demonstrating that MLPs plus ONIOM-based QR methods can broaden QR applications and provide valuable insights for drug development.
Advanced Multiscale QR Techniques
The authors introduced multiscale QR methods using MLPs to enhance the accuracy and efficiency of biomacromolecule structure refinement. The total energy function of the system was divided into contributions from ONIOM-based calculations and crystallographic penalties, balanced by a weighting factor.
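As a rough sketch, the refinement target described above can be written as a weighted sum. The symbol names are assumptions chosen for illustration, not the paper's notation, and attaching the weight w to the crystallographic term is likewise an assumption; the text states only that a weighting factor balances the two contributions.

```latex
E_{\text{total}} = E_{\text{ONIOM}} + w \, E_{\text{X-ray}}
```

Here E_ONIOM is the multiscale energy of the model and E_X-ray is the penalty measuring disagreement with the experimental diffraction data.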
Traditional ONIOM methods, such as ONIOM2(QM:MM) and ONIOM3(QM:semi-empirical (SE):MM), significantly improved the structure of active sites but were computationally expensive due to the use of QM methods. To address this, MLPs such as ANI-2x, ANI-1ccx, and AIQM1 replaced the QM methods, giving ONIOM2(MLP:MM) and ONIOM3(MLP:SE:MM) schemes.
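The layered schemes above rely on the standard subtractive ONIOM extrapolation: the lowest level is applied to the whole system, and each smaller region adds a correction equal to the higher-level energy minus the next-lower-level energy for that region. The following is a minimal sketch of that bookkeeping only. The function names, region labels, and toy energies are hypothetical, and no real MLP, SE, or MM engine is called.

```python
from typing import Callable, Sequence

# An energy function maps a region label to an energy (arbitrary units).
EnergyFn = Callable[[str], float]


def oniom_energy(levels: Sequence[EnergyFn], regions: Sequence[str]) -> float:
    """Subtractive ONIOM extrapolation for N layers.

    levels  : energy functions ordered from highest to lowest level
    regions : region labels ordered from smallest (core) to full system
    """
    if not levels or len(levels) != len(regions):
        raise ValueError("need exactly one region per level")
    # Lowest level applied to the full system.
    total = levels[-1](regions[-1])
    # Each smaller region contributes (higher level - next-lower level).
    for i in range(len(levels) - 1):
        total += levels[i](regions[i]) - levels[i + 1](regions[i])
    return total


# Hypothetical toy energies, used only to exercise the formula.
TOY = {
    ("mlp", "core"): -10.0,
    ("se", "core"): -9.0,
    ("se", "mid"): -20.0,
    ("mm", "core"): -8.5,
    ("mm", "mid"): -18.0,
    ("mm", "real"): -50.0,
}


def level(name: str) -> EnergyFn:
    """Return a stand-in energy function that looks values up in TOY."""
    return lambda region: TOY[(name, region)]


# ONIOM2(MLP:MM): E = MM(real) + MLP(core) - MM(core)
two_layer = oniom_energy([level("mlp"), level("mm")], ["core", "real"])

# ONIOM3(MLP:SE:MM): E = MM(real) + [SE(mid) - MM(mid)] + [MLP(core) - SE(core)]
three_layer = oniom_energy(
    [level("mlp"), level("se"), level("mm")], ["core", "mid", "real"]
)
```

Because the function accepts any number of layers, the same bookkeeping applies to a four-layer scheme by passing four levels and four nested regions.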
However, MLP-coupled cluster (MLP-CC) models were limited to systems containing hydrogen (H), carbon (C), nitrogen (N), and oxygen (O), while MLP-density functional theory (MLP-DFT) models covered H, C, N, O, fluorine (F), sulfur (S), and chlorine (Cl).
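The element limits above amount to a simple coverage check. The sketch below uses the two element sets stated in the text; the function name and return format are hypothetical and are not taken from the authors' software.

```python
# Element coverage as stated in the text above.
MLP_CC_ELEMENTS = frozenset({"H", "C", "N", "O"})
MLP_DFT_ELEMENTS = frozenset({"H", "C", "N", "O", "F", "S", "Cl"})


def supported_mlp_levels(elements):
    """Return the MLP levels whose element coverage includes every element given.

    `elements` is any iterable of element symbols, e.g. ["C", "H", "F"].
    An empty result means neither MLP level covers the molecule on its own.
    """
    present = set(elements)
    levels = []
    if present <= MLP_CC_ELEMENTS:
        levels.append("MLP-CC")
    if present <= MLP_DFT_ELEMENTS:
        levels.append("MLP-DFT")
    return levels


cho_n_only = supported_mlp_levels(["C", "H", "N", "O"])   # both levels apply
fluorinated = supported_mlp_levels(["C", "H", "F"])       # MLP-DFT only
brominated = supported_mlp_levels(["C", "H", "Br"])       # neither level applies
```

A molecule for which only the MLP-DFT level applies is a natural case for treating a CHNO-only core at the MLP-CC level and the rest at the MLP-DFT level.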
To extend applications to more complex drug/inhibitor molecules, two MLP levels were combined through an extrapolative ONIOM approach. This resulted in unique ONIOM3(MLP-CC:MLP-DFT:MM) and ONIOM4(MLP-CC:MLP-DFT:SE:MM) schemes. Additionally, combinations of AIQM1 and SE methods described molecules containing elements like phosphorus (P) and bromine (Br).
Protein preparation involved downloading experimental data and crystal structures from the Protein Data Bank (PDB), generating topology and parameter files, determining protonation states, and optimizing H atoms.
Various ONIOM-based schemes were tested using Gaussian 16, employing methods like ωB97X-D, ANI, AIQM1, and second-generation geometry, frequency, noncovalent, extended tight binding (GFN2-xTB). The performance of these methods was assessed by refining protein-drug systems and comparing results with high-level ONIOM3(DFT:SE:MM) refinements.
This approach demonstrated that combining MLPs with ONIOM methods provided QM-level accuracy with greater efficiency, facilitating broader QR applications and aiding drug development.
Evaluating MLPs
To evaluate the performance of MLPs for drug structure optimization, gas-phase geometry optimizations were conducted on 50 drugs/inhibitors using various MLPs and SE methods. MLPs generally yielded drug structures comparable to DFT, while the SE method tended to underestimate some bond distances.
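One simple way to quantify such a comparison is the mean signed deviation of bond lengths between a test geometry and a reference geometry, where a negative value indicates underestimated bond distances. The sketch below is an illustrative metric under that assumption, not the paper's evaluation protocol; the function names and toy coordinates are hypothetical.

```python
import math


def bond_length(coords, i, j):
    """Euclidean distance between atoms i and j (same units as coords)."""
    return math.dist(coords[i], coords[j])


def mean_signed_bond_deviation(test_coords, ref_coords, bonds):
    """Average of (test bond length - reference bond length) over bonds.

    coords are sequences of (x, y, z) tuples with matching atom order;
    bonds is a list of (i, j) atom-index pairs. Negative values mean the
    test geometry has shorter bonds than the reference on average.
    """
    if not bonds:
        raise ValueError("at least one bond is required")
    diffs = [
        bond_length(test_coords, i, j) - bond_length(ref_coords, i, j)
        for i, j in bonds
    ]
    return sum(diffs) / len(diffs)


# Hypothetical three-atom toy geometries (angstroms), not real drug data.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.2, 0.0)]
candidate = [(0.0, 0.0, 0.0), (1.4, 0.0, 0.0), (1.4, 1.1, 0.0)]
deviation = mean_signed_bond_deviation(candidate, reference, [(0, 1), (1, 2)])
```

In this toy case both bonds in the candidate geometry are 0.1 Å shorter than in the reference, so the deviation is negative.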
A benchmark dataset from PDBbind v2020 further confirmed the accuracy of MLPs, particularly for neutral molecules, whereas molecules with charged groups exhibited larger deviations. QR schemes combining MLPs and SE methods significantly improved drug structures in protein binding sites, reducing real-space difference density Z (RSZD) scores and strain energy. MLP-based QRs were also more computationally efficient than DFT-based methods.
Combining different levels of MLPs in ONIOM schemes enhanced accuracy and computational efficiency, especially for systems with diverse chemical elements. Notably, QRs identified coexisting conformers in drug-protein complexes, providing valuable insights for drug design. Overall, MLPs offer promising alternatives for efficient and accurate drug structure optimization in proteins.
Conclusion
In conclusion, the researchers introduced a novel QR approach using MLPs within multiscale ONIOM schemes, achieving quantum-level accuracy with higher efficiency. Applied to 50 protein-drug systems, this method effectively optimized structures and identified conformers, including bonded and nonbonded forms of the drug nirmatrelvir in SARS-CoV-2 protease.
By combining different MLP levels, the approach overcame traditional computational limitations, offering significant potential for broader QR applications in drug development and structural biology. This advancement suggests that routine, high-accuracy QR can be achieved on standard desktop computers, benefiting future research on molecular recognition, catalysis, and drug design.
Journal reference:
- Yan, Z., Wei, D., Li, X., & Chung, L. W. (2024). Accelerating reliable multiscale quantum refinement of protein–drug systems enabled by machine learning. Nature Communications, 15(1), 4181. https://doi.org/10.1038/s41467-024-48453-4