Abstract
Artificial Neural Network (ANN) techniques are becoming increasingly popular in many areas of the biological sciences for the analysis of complex data. Careful selection of key parameters when developing ANN models and algorithms is extremely important in order to create generalised models with real-world applicability. Approaches that support this include constrained architecture, Correlated Activity Pruning (CAPing), appropriate training termination methods and other, more advanced methodologies such as parameterisation by weightings analysis and stepwise additive approaches. This study applies these approaches to the analysis of proteomic data generated using Surface Enhanced Laser Desorption / Ionisation mass spectrometry profiling of cell lines from patients with breast cancer. Applied to these patient-derived cell lines, the approaches identified 8 protein / peptide molecular ions capable of classifying samples into their respective groups with an accuracy of 94.8% and an area under the receiver operating characteristic curve of 0.993. Several ions that appear to show significant up- or down-regulation with regard to treatment regimen were also identified. These results indicate that, when coupled with other powerful techniques, these novel ANN methodologies and algorithms allow the development of effective data mining tools for analysing complex, non-linear, noisy data.
Keywords: artificial neural networks, breast cancer, methodologies, models and algorithms, data-mining, proteomics