Abstract
It is well known that transcription factors can induce deformations in their DNA-binding sites upon complex formation. However, few attempts have been made to investigate the extent to which these induced structural deformations in the DNA molecule are conserved among different members of the same transcription factor family. In this article, we used the CRoSSeD methodology for describing DNA structural properties to extract common features in the binding sites of different LacI-GalR family members. The most significant feature identified in this way was located at the center of the binding sites, which is also the most likely location for an induced DNA deformation following amino acid interdigitation. This feature was further related to specific elements of the protein structure and was used to identify and characterize deviant family members. A general family-wide binding site model was constructed and applied to screen for previously unknown binding sites of family members.
Keywords: Conditional random fields, CRoSSeD, DNA structure, LacI-GalR family, protein-DNA interaction, transcription factor.