5 CONCLUSIONS

Despite the increasing popularity of microarray meta-analysis, many issues remain unresolved and can limit the effectiveness of its application.

Although many methods have been proposed and used in published applications, no detailed workflow for performing microarray data meta-analysis exists yet.

Although many grant agencies and peer-reviewed journals now require data to be made available, many older studies, and newer studies funded by private foundations, are still not publicly accessible. Studies with censored or incomplete information can also be an obstacle to meta-analysis.

It is still unclear how measurements from different platforms compare with each other, and inconsistencies in gene coverage and annotation make the comparison much more difficult.

Several microarray meta-analysis methods have been developed. The selection of a suitable method depends on the type of analysis desired and on the hypothesis setting behind each method. Ramasamy and colleagues [48] recommend effect size combination methods as the most comprehensive approach for meta-analysis of two-class gene expression microarrays, owing to its known advantages. However, there is no consensus on which meta-analysis method or methods are ‘best’. A large-scale comparative study and a simulation study with adequate evaluation measures are needed to provide insight and practical guidelines for choosing among the methods in practice.
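To make the idea of effect size combination concrete, the following is a minimal sketch for a single gene measured in several two-class studies. It assumes Hedges' g as the per-study effect size and a DerSimonian-Laird random-effects model for pooling; these are common textbook choices and are not necessarily the exact procedure recommended in [48]. The function names `hedges_g` and `combine_effects` and the example numbers are hypothetical.

```python
import math


def hedges_g(mean1, mean2, sd1, sd2, n1, n2):
    """Return (g, var_g): bias-corrected standardized mean difference
    between two classes and its approximate sampling variance."""
    df = n1 + n2 - 2
    pooled_sd = math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df)
    d = (mean1 - mean2) / pooled_sd
    j = 1.0 - 3.0 / (4.0 * df - 1.0)  # small-sample correction factor
    var_d = (n1 + n2) / (n1 * n2) + d ** 2 / (2.0 * (n1 + n2))
    return j * d, j ** 2 * var_d


def combine_effects(effects, variances):
    """Pool per-study effect sizes with a DerSimonian-Laird
    random-effects model (assumed here for illustration)."""
    k = len(effects)
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Cochran's Q measures between-study heterogeneity.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    w_star = [1.0 / (v + tau2) for v in variances]
    sw_star = sum(w_star)
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sw_star
    se = math.sqrt(1.0 / sw_star)
    return {"pooled": pooled, "se": se, "z": pooled / se, "tau2": tau2, "q": q}


if __name__ == "__main__":
    # Hypothetical summary statistics for one gene in three studies:
    # (mean case, mean control, sd case, sd control, n case, n control)
    studies = [
        (8.1, 7.4, 0.9, 1.0, 12, 12),
        (7.9, 7.5, 1.1, 1.0, 20, 18),
        (8.4, 7.3, 0.8, 0.9, 9, 10),
    ]
    per_study = [hedges_g(*s) for s in studies]
    result = combine_effects([g for g, _ in per_study],
                             [v for _, v in per_study])
    print(result)
```

In a real analysis this calculation would be repeated for every gene common to all studies, followed by a multiple-testing correction on the resulting z-scores.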

Only a few of the proposed microarray meta-analysis methods have been implemented in easy-to-use software packages. Beyond the scarcity of software packages in the field and the lack of regular updates to the existing ones, most of the available packages either lacked clear manuals or had functions that were not easy to apply. Efforts to provide high-quality documentation of programs, in order to make them more reliable and easier to comprehend, are well summarized by the concept of ‘literate programming’ and its implementation developed by Knuth [Knuth DE, 1984]. Sweave [165] is an example of the use of a noweb-like literate programming tool [166] inside the R language for the creation of dynamic statistical reports. It provides a flexible framework for mixing text and R code for automatic document generation. A single source file contains both documentation text and R code; running the code through R weaves them into a final document containing the documentation text together with the R code and/or the output of the code (e.g. text, graphs, tables). The report can be automatically updated if the data or the analysis change, which allows for truly reproducible research.
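The weaving mechanism described above can be illustrated with a toy sketch. This is not Sweave and does not use R: it is a hypothetical Python analogue that assumes noweb-style chunk delimiters (`<<name>>=` to open a chunk, `@` on its own line to close it) and replaces each chunk with its code followed by the captured output. The function `weave` and the `[code]`/`[output]` markers are illustrative inventions.

```python
import contextlib
import io
import re

# A chunk starts with a "<<name>>=" line and ends with a line holding only "@".
CHUNK = re.compile(r"^<<[^\n]*>>=\n(.*?)^@[ \t]*$\n?", re.MULTILINE | re.DOTALL)


def weave(source):
    """Return the document with each code chunk replaced by the chunk's
    code and the text it printed when executed."""
    env = {}  # shared namespace, so later chunks can use earlier variables

    def run_chunk(match):
        code = match.group(1)
        buffer = io.StringIO()
        # Toy only: exec runs arbitrary code, so use trusted sources.
        with contextlib.redirect_stdout(buffer):
            exec(code, env)
        return "[code]\n" + code + "[output]\n" + buffer.getvalue()

    return CHUNK.sub(run_chunk, source)


SOURCE = """Mean expression report.
<<compute>>=
values = [2.0, 4.0, 6.0]
mean = sum(values) / len(values)
print("mean =", mean)
@
The value above is recomputed whenever the data change.
"""

if __name__ == "__main__":
    print(weave(SOURCE))
```

Because the numbers in the final document come from executing the chunk rather than from copy-and-paste, editing `values` and weaving again regenerates the report, which is the property that makes this style of documentation reproducible.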

All packages available through Bioconductor should now meet this requirement and, in fact, the most recent packages contain one or more ‘vignettes’, that is, documents providing a textual, task-oriented description of the package's functionality. Given its widely recognized benefits, literate programming should be promoted in future software development.

Heterogeneity caused by demographic, clinical and technical variables often exists within and across studies. Failure to account for these potential confounders in the statistical models and in the meta-analysis can result in reduced statistical power or false positives. Meta-analyses of clinical and epidemiological studies use regression modeling to adjust for confounding effects. Only recently have similar techniques been extended to microarray data meta-analysis [167]; further efforts are needed in this area.
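As an illustration of regression-based adjustment at the study level, the sketch below fits a fixed-effect meta-regression of one gene's per-study effect sizes on a single study-level covariate, using inverse-variance weighted least squares. It is a generic textbook formulation and is not claimed to be the method of [167]; the function name `weighted_meta_regression` and the example covariate (mean patient age per study) are hypothetical.

```python
import math


def weighted_meta_regression(effects, variances, covariate):
    """Fit effect_i = intercept + slope * covariate_i by weighted least
    squares with weights 1 / variance_i (fixed-effect meta-regression).
    Returns (intercept, slope, se_slope)."""
    w = [1.0 / v for v in variances]
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, covariate)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, covariate))
    if sxx == 0:
        raise ValueError("covariate has no variation across studies")
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar)
              for wi, xi, yi in zip(w, covariate, effects))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # With known within-study variances, Var(slope) = 1 / sxx.
    se_slope = math.sqrt(1.0 / sxx)
    return intercept, slope, se_slope


if __name__ == "__main__":
    # Hypothetical per-study effect sizes, variances, and mean patient age.
    effects = [0.42, 0.55, 0.80, 0.61]
    variances = [0.05, 0.08, 0.06, 0.10]
    mean_age = [48.0, 55.0, 67.0, 59.0]
    intercept, slope, se_slope = weighted_meta_regression(
        effects, variances, mean_age)
    print(intercept, slope, slope / se_slope)
```

A slope clearly different from zero would suggest that part of the between-study heterogeneity in this gene's effect is associated with the covariate rather than with the class comparison of interest.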