Abstract
We have recently developed a theoretical means of studying the mechanical and interaction properties of nucleic acids as a function of their base sequence. This approach, termed ADAPT, can be used to obtain the physical properties of millions of base sequences with only modest computational expense. ADAPT is based on a multi-copy algorithm using special nucleotides (“lexides”) containing all four standard bases, whose contributions to the energy of the molecule can be varied. We present here a deeper study of the energy minima that occur in the multi-dimensional space defined by these variable sequences. We also present an extension of the approach, termed “gene threading”, that enables us to scan genomic sequence data in an attempt to locate preferential binding sites. This technique is illustrated for the case of TATA-box protein binding. ADAPT enables us to demonstrate that, for this protein, DNA deformation alone explains a large part of the experimentally observed consensus binding sequence.
Keywords: Molecular mechanics, ADAPT, Protein-DNA recognition, Deformation energy, Multi-copy algorithm, Genome analysis, JUMNA, Dihydrofolate reductase