Abstract
Water plays an essential role in governing the structure, stability, dynamics, and function of biomolecules, and has also been shown to be critical in mediating biomolecular recognition and association. Accurate determination of the dynamic behavior of water molecules at the interfaces of biological complexes is important for understanding the molecular mechanism by which water contributes to binding between biomolecules, and could be exploited as an alternative tool for refining water positions in X-ray electron density maps. In the present study, a method called local hydrophobic descriptors (LHDs) is used to characterize the hydrophobic landscapes of the hydration sites at protein-DNA interfaces. The resulting descriptor variables are then correlated with the experimentally measured B-factor values of 4445 carefully selected water samples, derived from a panel of thematically nonredundant, high-quality protein-DNA interfaces, using a variety of machine learning methods, including PLS, BPNN, SVM, LSSVM, RF, and GP. The results show that the dynamic behavior of interfacial water molecules is primarily governed by the local hydrophobic features of the hydration sites at which the water molecules are located, and that nonlinear dependence dominates over the linear component in the water B-factor system. We expect that this structure-based approach can be used for predicting the B-factor profiles of other biomolecules as well.
Keywords: Water molecule, protein-DNA interaction, local hydrophobic descriptor, machine learning, B-factor, stability, X-ray electron density map, nucleic acids, eukaryotes, transcription, X-ray crystallography, protein polypeptide chains, PLS, BPNN, LSSVM, RF, GP, PROBE, solvent, hydrogen, electrostatic interaction, hydrophobicity, back-propagation neural network, least squares supporting vector machine, Matlab toolbox ZP-explore, random forest, Gaussian, PDB, NDB, SD, velocity, RBF, SVM