Bayesian Functional Mixed-effects Models with Grouped Smoothness for Analyzing Time-course Gene Expression Data

Shangyuan Ye; Ye Liang; Bo Zhang

doi:10.2174/1574893615999200520082636

Abstract

Objective: Microarray technologies make it possible to measure simultaneously the expression levels of thousands of genes involved in a given biological process, and studying the temporal behavior of these genes is important for understanding their mechanisms. Because the dependence between expression levels over time for a given gene is often too complicated to model parametrically, sparse functional data analysis has received increasing attention for analyzing such data.

Methods: We propose a new functional mixed-effects model for analyzing time-course gene expression data. Specifically, the model places the individual functions into groups whose smoothness is allowed to differ. The proposed method uses the mixed-effects model representation of penalized splines for both the mean function and the individual functions. Bayesian inference for the proposed model was developed under noninformative or weakly informative priors, and Bayesian computation was implemented using Markov chain Monte Carlo methods.

Results: The performance of the new model was evaluated in two simulation studies and illustrated using a yeast cell cycle gene expression dataset. The simulation results suggest that the proposed method can outperform previously used methods in terms of the mean integrated squared error. The yeast gene expression application suggests that the proposed model with two latent groups should be used for this dataset.

Conclusion: The new Bayesian functional mixed-effects model, which assumes multiple groups of functions with different smoothing parameters, provides an enhanced approach to analyzing time-course gene expression data.

Keywords: Time-course gene expression data, Bayesian, functional data analysis, mixed-effects models, grouped smoothness, microarray.
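
The mixed-effects representation of penalized splines mentioned under Methods can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a truncated-line basis, equally spaced knots, simulated data, and fixed smoothing parameters (`lam`, which in the mixed-model view plays the role of the ratio of error variance to random-effect variance) rather than the Bayesian MCMC estimation described in the abstract. All function and variable names are hypothetical. Fitting the same curve with two values of `lam` gives a rough picture of what "different smoothing parameters" for different groups means.

```python
import numpy as np


def truncated_line_basis(t, knots):
    """Build X = [1, t] (unpenalized part) and Z = (t - knot)_+ (penalized part)."""
    t = np.asarray(t, dtype=float)
    X = np.column_stack([np.ones_like(t), t])
    Z = np.maximum(t[:, None] - np.asarray(knots, dtype=float)[None, :], 0.0)
    return X, Z


def fit_penalized_spline(t, y, knots, lam):
    """Penalized least squares fit; only the spline coefficients are shrunk.

    In the mixed-model view the Z coefficients are random effects and
    lam corresponds to sigma_error^2 / sigma_u^2 (assumed known here).
    """
    X, Z = truncated_line_basis(t, knots)
    C = np.hstack([X, Z])
    # Penalty matrix: zeros for the two fixed effects, ones for the spline terms.
    D = np.diag([0.0, 0.0] + [1.0] * Z.shape[1])
    coef = np.linalg.solve(C.T @ C + lam * D, C.T @ y)
    return coef, C @ coef


# Simulated time course (illustrative data only).
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 40)
y = np.sin(2 * np.pi * t) + rng.normal(scale=0.3, size=t.size)
knots = np.linspace(0.1, 0.9, 9)

# Two smoothing levels, standing in for two groups with different smoothness.
coef_rough, fit_rough = fit_penalized_spline(t, y, knots, lam=0.01)
coef_smooth, fit_smooth = fit_penalized_spline(t, y, knots, lam=100.0)
```

A larger `lam` shrinks the spline coefficients toward zero and pulls the fit toward a straight line, while a smaller `lam` follows the data more closely. In a fully Bayesian treatment these variance components would be given priors and sampled rather than fixed.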




Journal: Current Bioinformatics
Publisher: Bentham Science Publisher
Print ISSN: 1574-8936
Online ISSN: 2212-392X
DOI: https://dx.doi.org/10.2174/1574893615999200520082636
