Abstract
Background: The blood-brain barrier (BBB) protects the central nervous system from the systemic circulation and maintains brain homeostasis. BBB permeability is an essential characteristic of drugs acting on the central nervous system, because it indicates whether a drug can reach the brain. The available laboratory methods for assessing BBB permeability are accurate but expensive and time-consuming. Therefore, many attempts have been made over the years to predict the BBB permeability of compounds using computational approaches. However, the accuracy of these prediction models on external datasets has remained a persistent issue.
Objective: To develop a machine learning-based BBB permeability prediction model using physicochemical properties and molecular fingerprints.
Methods: Support vector machine (SVM), k-nearest neighbor (kNN), random forest (RF), and naïve Bayes (NB) algorithms were applied to a large dataset of 1978 compounds, each represented by a vector of 1917 features comprising physicochemical properties, MACCS fingerprints, and substructure fingerprints, to predict BBB permeability.
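As a rough illustration of how a fingerprint-based classifier of the kind listed in the Methods operates, the following minimal sketch implements a k-nearest neighbor vote over binary fingerprints. It is not the study's implementation: the Tanimoto similarity measure, the 8-bit toy fingerprints, the "BBB+"/"BBB-" labels, and the function names `tanimoto` and `knn_predict` are all assumptions introduced here for illustration.

```python
from collections import Counter


def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two equal-length bit vectors."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 0.0


def knn_predict(train_X, train_y, query, k=3):
    """Majority vote among the k training fingerprints most similar to query."""
    ranked = sorted(
        zip(train_X, train_y),
        key=lambda pair: tanimoto(pair[0], query),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy 8-bit fingerprints, made up for illustration (not from the study's dataset).
train_X = [
    [1, 1, 0, 0, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0, 0, 0],
    [1, 0, 1, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 1, 1, 0],
    [0, 0, 0, 1, 0, 1, 0, 1],
    [0, 0, 0, 0, 0, 1, 1, 1],
]
# Hypothetical class labels: permeable ("BBB+") vs. non-permeable ("BBB-").
train_y = ["BBB+", "BBB+", "BBB+", "BBB-", "BBB-", "BBB-"]

query = [1, 1, 0, 0, 0, 0, 0, 0]
prediction = knn_predict(train_X, train_y, query, k=3)
print(prediction)
```

In practice, the feature vectors would also include physicochemical descriptors alongside the fingerprint bits, and an established machine learning library would normally replace this hand-written vote.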
Results and Discussion: A comparative analysis of the performance metrics of the developed models suggested that SVM with the radial basis function kernel performed better than the kNN, RF, and NB algorithms. The SVM-based BBB permeability prediction model achieved an accuracy of 96.77%. The prediction performance of the model developed in this study was found to be better than that of existing machine learning-based BBB permeability prediction models.
Conclusion: The prediction model developed in this study could be useful for screening compounds based on their BBB permeability at the preliminary stages of drug design and development.
Keywords: Blood-brain barrier, kNN, naïve Bayes, permeability, random forest, support vector machine (SVM).