Abstract
Background: Batch effects are commonly introduced into gene expression data and can dramatically reduce the accuracy of statistical inference in genomic analysis, since samples from different batches are not directly comparable.
Objective: To accurately measure biological variability and obtain correct statistical inference, we aimed to correct/remove batch effects so that samples from different batches can be merged into a comparable dataset for high-throughput genomic analysis.
Methods: The existing location/scale (L/S) model uses empirical Bayes methods to estimate constant multiplicative/additive adjustment values for each gene. In contrast, we took a dimensionality reduction approach. We proposed an effective scaling method that scales each gene by multiplying it by a constant value, formulated as an optimization problem based on spectral clustering, so that data samples from different batches can be merged into a comparable, batch-effect-corrected dataset. Furthermore, we proposed an approximate solution to the optimization problem for the scaling adjustment values.
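To make the gene-wise scaling idea concrete, the sketch below is a minimal Python illustration, not the paper's algorithm: the per-gene factors are computed by a simple mean-ratio heuristic standing in for the spectral-clustering-based optimization described above, and scikit-learn's SpectralEmbedding is used only to inspect how well the two batches mix after scaling. The batch sizes, bias model, and variable names are illustrative assumptions.

```python
# Illustrative sketch only: gene-wise multiplicative scaling to merge two
# batches, with a spectral embedding used to inspect batch mixing.
# The mean-ratio factors below are a heuristic stand-in for the
# spectral-clustering-based optimization proposed in the paper.
import numpy as np
from sklearn.manifold import SpectralEmbedding

rng = np.random.default_rng(0)

# Two synthetic batches sharing the same biology; batch 2 carries a
# per-gene multiplicative batch bias.
n_genes, n1, n2 = 50, 30, 30
base = rng.lognormal(mean=2.0, sigma=0.5, size=(n_genes, n1 + n2))
bias = rng.uniform(0.5, 2.0, size=(n_genes, 1))  # batch-specific gene scaling
X1 = base[:, :n1]
X2 = base[:, n1:] * bias

# Scaling adjustment: one multiplicative constant per gene that matches
# batch 2's per-gene mean to batch 1's.
alpha = X1.mean(axis=1, keepdims=True) / X2.mean(axis=1, keepdims=True)
X2_corrected = X2 * alpha

# Spectral embedding of the merged data (samples x genes) before and after
# correction; after correction the two batches should overlap in the
# embedding instead of separating into two batch-driven clusters.
merged_before = np.hstack([X1, X2]).T
merged_after = np.hstack([X1, X2_corrected]).T
emb_before = SpectralEmbedding(n_components=2, random_state=0).fit_transform(merged_before)
emb_after = SpectralEmbedding(n_components=2, random_state=0).fit_transform(merged_after)
```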
Results: We evaluated the proposed method on both artificial and real gene expression datasets, comparing it with well-established batch effect correction methods. Numerical experiments show that the proposed method projects data samples from different batches so that they resemble each other, and that it outperforms the other methods on both microarray and single-cell RNA-seq datasets.
Conclusion: The gene-wise scaling adjustment combined with dimensionality reduction improved accuracy and removed batch effects, making the proposed method more robust to interfering genes.
Keywords: Batch effects, spectral clustering, scaling adjustment, dimensionality reduction, microarray, single-cell RNA-seq.