Human immunodeficiency virus type 1 (HIV-1) is the etiological agent of human acquired immunodeficiency syndrome (AIDS). Viral replication depends on three essential enzymes: reverse transcriptase (RT), protease (PR) and integrase (IN). HIV-1 integrase, which has no counterpart in the mammalian body, has been an attractive target in recent years. This work built several computational models for classification and quantitative prediction of the bioactivity of HIV-1 integrase inhibitors. Using a support vector machine (SVM), two computational models were built to predict whether a compound is an active or a weakly active strand transfer (ST) inhibitor, based on a dataset of 1257 ST inhibitors of HIV-1 integrase. The models built with MACCS fingerprints and with 40 MOE descriptors gave prediction accuracies of 91.82% and 93.64% and Matthews correlation coefficients (MCC) of 0.73 and 0.79 on the test set, respectively. Molecular properties such as electrostatic properties, van der Waals surface area, hydrogen bond properties and the number of fluorine atoms are important factors influencing the interactions between the inhibitor and the integrase. Scaffolds such as β-diketo acid and its derivatives, naphthyridine carboxamide or its isosteres, and pyrimidinones may play a crucial role in the activity of HIV-1 integrase inhibitors. Four computational quantitative structure-activity relationship (QSAR) models were also built to predict the biological activities of HIV-1 integrase inhibitors. A dataset of 551 HIV-1 integrase inhibitors that inhibit the ST process, with activities determined by radiolabeling, was compiled, and 20 MOE descriptors were selected for modeling. The whole dataset was divided into a training set and a test set by two methods: (1) on the basis of a Kohonen's self-organizing map (SOM); and (2) by random selection. For each pair of training set and test set, a multilinear regression (MLR) analysis and an SVM were each used to establish a computational model. For the two models based on the training set and test set split by SOM, the correlation coefficients (r) of the test sets were over 0.91; for the two models based on the training set and test set divided randomly, the r values of the test sets were over 0.86.
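
The evaluation metrics quoted above (prediction accuracy and MCC for the classification models, correlation coefficient r for the QSAR models) can be illustrated with a minimal sketch. It assumes binary labels (1 = active, 0 = weakly active) and uses the standard textbook definitions of the metrics. The function names and the toy data are hypothetical, chosen only for illustration, and are not taken from the original work.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (TP, TN, FP, FN) for binary labels: 1 = active, 0 = weakly active."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(y_true, y_pred):
    """Fraction of compounds whose class is predicted correctly."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(y_true, y_pred):
    """Matthews correlation coefficient; returns 0.0 when it is undefined."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def pearson_r(observed, predicted):
    """Pearson correlation coefficient between observed and predicted activities."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    s_op = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    s_oo = sum((o - mean_o) ** 2 for o in observed)
    s_pp = sum((p - mean_p) ** 2 for p in predicted)
    return s_op / math.sqrt(s_oo * s_pp)


# Toy data (made up for illustration; not from the datasets described above).
toy_true = [1, 1, 1, 0, 0, 0, 1, 0]
toy_pred = [1, 1, 0, 0, 0, 1, 1, 0]
toy_accuracy = accuracy(toy_true, toy_pred)
toy_mcc = mcc(toy_true, toy_pred)
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, which makes it a more informative measure when the active and weakly active classes are unevenly represented.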