Abstract
The Prostate, Lung, Colorectal and Ovarian (PLCO) Cancer Screening Trial was a large randomized controlled trial of cancer screening that also evolved over time into a unique epidemiologic cohort. Vast quantities of data have been collected since the trial began in 1993. Screening data were obtained through 2006. Questionnaire-based risk factor data (collected at baseline and at other points in the trial), vital status, cancer diagnoses and treatment, biospecimen data, and data from additional ancillary efforts continue to be collected.
Accurate data collection and efficient management methods are required to ensure high-quality data and valid, consistent analyses of trial outcomes. Information Management Services (IMS) has been, and continues to be, responsible for processing the raw PLCO data and converting them into comprehensive and accessible datasets. IMS also continues to provide a wide spectrum of analytic support, including support for trial monitoring, data sharing, and epidemiologic research. In this paper, we describe the data processing and management requirements from the analytic team's perspective, highlighting the various data sources and their complexity. We also illustrate the construction of usable analytic data files and discuss the wide range of analytic support provided. Instructions for accessing PLCO data are also provided.
Keywords: Analytic support, cancer screening trial, data processing, PLCO.