Abstract
Computer simulations play a crucial role in contemporary chemical investigations and generate enormous amounts of data. Restrictions on sharing data and results are regarded as a major impediment to drug discovery. One of the steepest barriers in high-throughput screening studies is the limited number of suitable, freely accessible repositories for storing drug and drug target data. A uniform data storage and retrieval mechanism would allow diverse data to be compared and exchanged easily. This paper presents the development stages of the MoStBioDat software platform, originally designed for the efficient storage, management and retrieval of SDF and PDB data. The architecture and software implementation of the project are described in detail, including the disadvantages of the solutions chosen. The first prototype is implemented in Python, an open-source, high-level, object-oriented scripting language. The modular architecture of the package allows it to be extended with additional functionality. The main objective of MoStBioDat is to serve as an alternative, extensible open-source database derived partly from SDF and PDB files.
Keywords: SDF, PDB, SMILES, HTS, relational database, ligand, macromolecule, Python, chemoinformatics