Abstract
Bioassay records from high-throughput screening (HTS) contain structurally diverse compounds. This diversity introduces strong non-linearity into the set of molecular descriptors, so building a QSAR model that covers the full range of molecular structures is a challenging task. In the present work, a method is proposed to extract information about pharmacophores that cover a large portion of the HTS record and to develop a Support Vector Regression (SVR) QSAR model based on the extracted pharmacophores specific to a cell line or target. A probabilistic approach is also proposed to evaluate the reliability of the predictions made by the QSAR model. The developed method has been used for virtual screening of library molecules. An advantage of this protocol is that it is well suited to very large datasets. The proposed method can extract pharmacophore information from any HTS data. Additionally, it should aid the development of precise virtual screening models based on high-throughput screening data.
Keywords: Cluster, HTS, pharmacophore, QSAR, SVR.