Abstract
The analytical procedures required to generate a quantified metabolomics data matrix involve many widely differing potential sources of error, which complicates the generation of reliable data. The methods generally used to assess the precision of such data each have distinct merits but also clear limitations. In this paper we describe KEMREP (kernel method for the assessment of repeatability and reproducibility), a new method designed specifically to assess the reliability of metabolomics data. Repeatability and reproducibility were assessed on metabolomics data matrices generated by gas chromatography-mass spectrometry (GC-MS), within and between analysts and across laboratories, using cerebrospinal fluid (CSF) and urine as biological samples. KEMREP provides a visual overlay of the smoothed and scaled versions of the data from repeated samples, allowing a direct and easy qualitative assessment of repeatability or reproducibility for a distinct chromatographic region (univariate) or for the experiment as a whole (multivariate). The KEMREP method can also be extended by imposing confidence bounds, which provide lower and upper limits that indicate quantitatively whether the experiment was repeatable or reproducible at a predefined input coefficient of variation (CV). KEMREP is thus a novel approach that supplements existing methods for assessing the reliability of metabolomics data, provides a benchmark for assessing the quality of practical work performed by analysts, monitors the sequence of data pre-treatment steps, and tests the robustness of an experimentally designed protocol for metabolomics.
Keywords: Gas chromatography-mass spectrometry, kernel density, metabolomics, qualitative, repeatability, reproducibility.