In recent years, many statistical methods and tools have been developed for the analysis of microarrays. Although it is well known that microarrays often produce widely dispersed data, little consideration has been given to the robustification of the current methodology. This work tests one possible approach to robustifying a hierarchical Bayesian ANOVA model, specifically designed for the analysis of microarrays, with respect to its underlying error model. In addition, it aims to clarify how the results differ from those of the standard model and what the differing biological implications are.
The core of the method is the selection of a suitable likelihood function from a set of candidates comprising noncentral Student's t distributions with different degrees of freedom and normal distributions. A hybrid MCMC sampler was designed and implemented in Matlab to perform the model inference. It was tested on several artificial and biological data sets.
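The idea of choosing an error model from a set of candidates can be illustrated with a minimal Python sketch. It is not the hybrid MCMC sampler or the Matlab implementation described here. It assumes a fixed location and scale, a uniform prior over the candidates, and a hypothetical grid of degrees of freedom, so that the posterior model probabilities reduce to normalized likelihoods. All function names are illustrative.

```python
import math


def normal_logpdf(x, mu=0.0, sigma=1.0):
    """Log density of a normal distribution with mean mu and std sigma."""
    z = (x - mu) / sigma
    return -0.5 * math.log(2.0 * math.pi) - math.log(sigma) - 0.5 * z * z


def student_t_logpdf(x, nu, mu=0.0, sigma=1.0):
    """Log density of a location-scale Student's t with nu degrees of freedom."""
    z = (x - mu) / sigma
    return (math.lgamma((nu + 1.0) / 2.0) - math.lgamma(nu / 2.0)
            - 0.5 * math.log(nu * math.pi) - math.log(sigma)
            - (nu + 1.0) / 2.0 * math.log1p(z * z / nu))


def score_error_models(residuals, dfs=(1, 2, 4, 8, 16)):
    """Total log-likelihood of the residuals under each candidate error model.

    The grid of degrees of freedom is an assumption made for illustration.
    """
    scores = {"normal": sum(normal_logpdf(r) for r in residuals)}
    for nu in dfs:
        scores[f"t(nu={nu})"] = sum(student_t_logpdf(r, nu) for r in residuals)
    return scores


def posterior_model_probs(scores):
    """Posterior model probabilities under a uniform prior over candidates.

    With location and scale held fixed (a simplification), the posterior is
    proportional to the likelihood; subtracting the maximum avoids underflow.
    """
    top = max(scores.values())
    weights = {name: math.exp(s - top) for name, s in scores.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


if __name__ == "__main__":
    # Hypothetical residuals: three well-behaved values and two outliers.
    residuals = [0.1, -0.2, 0.3, 15.0, -12.0]
    scores = score_error_models(residuals)
    for name, prob in posterior_model_probs(scores).items():
        print(f"{name:>10s}  loglik={scores[name]:9.2f}  prob={prob:.3f}")
```

In the full hierarchical model the location and scale parameters are unknown and would be sampled jointly with the model indicator, which is why a sampler is needed there; the sketch only shows why heavy-tailed candidates tend to be favoured once outliers are present.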
Applying the method to different biological settings provided a clear answer to the question of whether Student's t distribution is a more reasonable model distribution for such data sets: Student's t distributions with low degrees of freedom are generally preferred as the error model. More importantly, the results showed that differences between the robust (Student's t) and the standard (Gaussian) model not only occurred in the statistical inference but also led to different biological conclusions, drawn on the basis of Gene Ontology analysis.
Thus, this work shows the importance of choosing the model likelihood with great care in the field of microarray analysis.