Abstract
Aim: To develop a more robust and accurate method for identifying transcription factor binding sites (TFBS), which are involved in the regulation of gene expression.
Background: Deep neural networks (DNNs) have shown considerable promise in solving complex machine learning problems and are increasingly replacing conventional techniques in computer vision, signal processing, healthcare, and genomics. Understanding DNA sequences is a crucial task in healthcare and regulatory genomics. For DNA motif prediction, choosing a suitable dataset with a sufficient number of input sequences is essential to designing an effective model.
Objective: To design a new algorithm that works across different datasets while improving performance for TFBS prediction.
Methods: The proposed algorithm uses Layer-wise Relevance Propagation (LRP) to identify invariant features with adaptive noise patterns.
Results: Performance is compared against both standard and recent methods using several metrics, and a significant improvement is observed.
Conclusion: Identifying invariant and robust features in DNA sequences can improve classification performance.
Keywords: Deep neural networks, transcription factor binding sites, regularization, DNA, RNA, convolutional neural networks.
Graphical Abstract