TY - JOUR
T1 - Factors Influencing the Binding of HIV-1 Protease Inhibitors
T2 - Insights from Machine Learning Models
AU - Shalit, Yaffa
AU - Tuvi-Arad, Inbal
N1 - © 2025 The Authors. ChemMedChem published by Wiley‐VCH GmbH.
PY - 2025/5/28
Y1 - 2025/5/28
AB - HIV-1 protease (PR) inhibitors are crucial for antiviral therapies targeting acquired immunodeficiency syndrome. Hundreds of PR complexes with various ligands have been resolved and deposited in the Protein Data Bank. However, binding affinity measurements for these ligands are not always available. This gap hinders a comprehensive understanding of inhibitor efficacy. To address this challenge, machine learning models are constructed and validated based on the crystallographic coordinates of 291 PR–inhibitor complexes, leveraging over 2500 molecular descriptors. The models achieved accuracy scores exceeding 0.85 and were applied to predict the binding affinity of 274 additional complexes for which inhibition constants are not experimentally measured. The analysis is focused on three models, each with 8–9 features, and based on KBest with random forest, recursive feature elimination with random forest, and sequential feature selection with support vector machine. The findings revealed key predictive features, including properties of PR inhibitors like charge distribution, hydrogen-bonding capability, and 3D topology, as well as intrinsic properties of PR, such as active site symmetry and flap mutations. The study highlights the contribution of a comprehensive analysis of accumulated experimental data to enhance the structural understanding of this important molecular system.
KW - crystal structure
KW - human immunodeficiency virus
KW - ligand binding
KW - machine learning
KW - symmetry
UR - http://www.scopus.com/inward/record.url?scp=105008482496&partnerID=8YFLogxK
U2 - 10.1002/cmdc.202500277
DO - 10.1002/cmdc.202500277
M3 - Article
C2 - 40432489
AN - SCOPUS:105008482496
SN - 1860-7179
SP - e2500277
JO - ChemMedChem
JF - ChemMedChem
ER -