Abstract
Objective: Microarray technologies allow the expression levels of thousands of genes involved in a given biological process to be measured simultaneously, and studying their temporal behavior is important for understanding the underlying mechanisms. Because the dependence among a gene's expression levels over time is often too complicated to model parametrically, sparse functional data analysis has received increasing attention for analyzing such data.
Methods: We propose a new functional mixed-effects model for analyzing time-course gene expression data. Specifically, the model assigns individual functions to groups that differ in smoothness. The proposed method uses the mixed-effects model representation of penalized splines for both the mean function and the individual functions. Under noninformative or weakly informative priors, we develop Bayesian inference for the proposed model and implement the computation using Markov chain Monte Carlo methods.
Results: The performance of the new model was evaluated in two simulation studies and illustrated using a yeast cell cycle gene expression dataset. The simulation results suggest that the proposed method can outperform previously used methods in terms of the mean integrated squared error. The yeast gene expression application suggests that the proposed model with two latent groups should be used for this dataset.
Conclusion: The new Bayesian functional mixed-effects model, which assumes multiple groups of functions with different smoothing parameters, provides an enhanced approach to analyzing time-course gene expression data.
Keywords: Time-course gene expression data, Bayesian, functional data analysis, mixed-effects models, grouped smoothness, microarray.
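The mixed-effects representation of penalized splines mentioned under Methods, and the integrated squared error used under Results, can be illustrated with a minimal sketch. It is not the proposed Bayesian model: it assumes a truncated-line basis, a single curve, and a fixed smoothing parameter `lam` (interpretable as the ratio of error variance to random-effect variance), whereas the abstract describes MCMC-based inference with group-specific smoothness. The function names, knot placement, and simulated curve are all hypothetical choices made for illustration.

```python
import numpy as np

def truncated_line_basis(t, knots):
    """Fixed-effects design X = [1, t]; random-effects design Z = [(t - knot)_+]."""
    t = np.asarray(t, dtype=float)
    knots = np.asarray(knots, dtype=float)
    X = np.column_stack([np.ones_like(t), t])
    Z = np.maximum(t[:, None] - knots[None, :], 0.0)
    return X, Z

def fit_penalized_spline(t, y, knots, lam):
    """Minimize ||y - X b - Z u||^2 + lam * ||u||^2 for a fixed lam (assumption)."""
    X, Z = truncated_line_basis(t, knots)
    C = np.hstack([X, Z])
    # Only the spline coefficients u (the "random effects") are penalized.
    D = np.diag([0.0] * X.shape[1] + [1.0] * Z.shape[1])
    return np.linalg.solve(C.T @ C + lam * D, C.T @ np.asarray(y, dtype=float))

def predict(t, knots, coef):
    """Evaluate the fitted curve at the points t."""
    X, Z = truncated_line_basis(t, knots)
    return np.hstack([X, Z]) @ coef

def integrated_squared_error(f_hat, f_true, grid):
    """Trapezoidal approximation of the integral of (f_hat - f_true)^2 over grid."""
    diff2 = (np.asarray(f_hat, dtype=float) - np.asarray(f_true, dtype=float)) ** 2
    return float(np.sum(0.5 * (diff2[1:] + diff2[:-1]) * np.diff(grid)))

# Hypothetical simulated curve, for illustration only.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 50)
f_true = np.sin(2.0 * np.pi * t)
y = f_true + rng.normal(scale=0.2, size=t.size)
knots = np.linspace(0.0, 1.0, 12)[1:-1]  # 10 interior knots (assumption)

coef = fit_penalized_spline(t, y, knots, lam=0.01)
ise = integrated_squared_error(predict(t, knots, coef), f_true, t)
```

Averaging `ise` over repeated simulated datasets would give a Monte Carlo estimate of the mean integrated squared error. A larger `lam` shrinks the spline coefficients toward zero and yields a smoother, more nearly linear fit, which is the sense in which groups with different smoothing parameters correspond to different degrees of smoothness.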