Abstract
Acquired immunodeficiency syndrome (AIDS), caused by the human immunodeficiency virus (HIV), is one of the most devastating diseases of the current century. Although great efforts have been made to fight the virus, new therapeutic candidates of any kind are still needed. Discovering them demands considerable time and experimental effort, but computer-aided techniques can speed up the process. Cheminformatics tools have proven extremely valuable in pharmaceutical research, and over the past few decades a large number of molecular descriptors have been designed to describe chemical molecules quantitatively so that they can be used readily in computational studies. Herein, we present a computational study of anti-HIV small molecules tested by the National Cancer Institute (NCI) to identify the molecular descriptors most relevant to anti-HIV activity. A dataset of 199 highly active anti-HIV compounds and 174 inactive compounds was described by 905 molecular descriptors. The data were classified using the Random Forest algorithm, and the most important molecular descriptors were identified as the parameters that best represent anti-HIV activity. Using these computational and cheminformatics methods, it is possible to predict the anti-HIV activity of any given small molecule with high accuracy.
Keywords: Anti-HIV small molecules, cheminformatics, machine learning methods, molecular descriptors.
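The workflow summarized in the abstract (classify active versus inactive compounds with a Random Forest, then rank descriptors by importance) can be illustrated with a minimal, standard-library-only sketch. Everything here is an assumption made for illustration: the data are synthetic (80 random "compounds" with 6 descriptors, where only descriptor 2 determines activity) rather than the NCI dataset, importance is measured as mean decrease in Gini impurity, and the function names (`fit_forest`, `build_tree`, and so on) and hyperparameters are hypothetical rather than taken from the study.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values()) if n else 0.0

def best_split(X, y, feature_ids):
    """Return (gain, feature, threshold) of the best Gini split, or None."""
    n, parent, best = len(y), gini(y), None
    for f in feature_ids:
        values = sorted(set(row[f] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # midpoint between adjacent observed values
            left = [y[i] for i in range(n) if X[i][f] <= t]
            right = [y[i] for i in range(n) if X[i][f] > t]
            gain = parent - (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or gain > best[0]:
                best = (gain, f, t)
    return best

def build_tree(X, y, depth, max_depth, n_try, rng, importance):
    """Grow one tree; leaves are class labels, inner nodes are tuples."""
    majority = Counter(y).most_common(1)[0][0]
    if depth == max_depth or len(set(y)) == 1:
        return majority
    # Random Forest step: consider only a random subset of descriptors.
    split = best_split(X, y, rng.sample(range(len(X[0])), n_try))
    if split is None or split[0] <= 0:
        return majority
    gain, f, t = split
    importance[f] += gain * len(y)  # impurity decrease weighted by node size
    left = [i for i in range(len(y)) if X[i][f] <= t]
    right = [i for i in range(len(y)) if X[i][f] > t]
    grow = lambda idx: build_tree([X[i] for i in idx], [y[i] for i in idx],
                                  depth + 1, max_depth, n_try, rng, importance)
    return (f, t, grow(left), grow(right))

def fit_forest(X, y, n_trees=25, max_depth=5, seed=0):
    """Train bootstrap-sampled trees; return them with normalized importances."""
    rng = random.Random(seed)
    n_features = len(X[0])
    n_try = max(1, int(n_features ** 0.5))
    importance = [0.0] * n_features
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(y)) for _ in y]  # bootstrap sample
        trees.append(build_tree([X[i] for i in idx], [y[i] for i in idx],
                                0, max_depth, n_try, rng, importance))
    total = sum(importance) or 1.0
    return trees, [v / total for v in importance]

def predict(trees, row):
    """Majority vote over all trees for one descriptor vector."""
    votes = []
    for node in trees:
        while isinstance(node, tuple):
            f, t, left, right = node
            node = left if row[f] <= t else right
        votes.append(node)
    return Counter(votes).most_common(1)[0][0]

# Synthetic stand-in data (NOT the NCI compounds): activity depends on descriptor 2.
data_rng = random.Random(42)
n_descriptors = 6
X = [[data_rng.random() for _ in range(n_descriptors)] for _ in range(80)]
y = [1 if row[2] > 0.5 else 0 for row in X]

trees, importance = fit_forest(X, y)
ranking = sorted(range(n_descriptors), key=lambda j: importance[j], reverse=True)
accuracy = sum(predict(trees, r) == label for r, label in zip(X, y)) / len(y)
print("descriptor ranking (most important first):", ranking)
print("training accuracy:", round(accuracy, 3))
```

On real data, `X` would hold the 905 descriptor values for each of the 373 compounds and `y` the active/inactive labels, and accuracy would be estimated on held-out compounds rather than on the training set. In practice an established implementation, such as scikit-learn's `RandomForestClassifier` with its `feature_importances_` attribute, would likely be preferable to this hand-written sketch.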