This section provides the fully annotated expression data table after low-level transforms and normalization, and low-level analysis diagnostic plots.
After reannotation (v10 for TAIR v7; Dai et al., 2005) and filtering of probe sets probe sequence specific effects were removed (Wu et al., 2004) as described in our manuscript's Method section. Then probe level signals were normalized for different backgrounds and overall hybridization intensities of individual chips using an iterative 20%-trimmed least squares fit of a generative model with additive-multiplicative noise (Huber et al., 2002) using a log2-asymptotic calibration, and excluding chromosome 5 genes from the parameter fitting step. Finally, transcript expression estimates were obtained by robust fits of linear multi-chip probe level intensity models (Bolstad, 2004).
We here provide the full results and diagnostic plots for the fit:
In the above files, for historical reasons, samples from the F2 and F3 generations were labelled F0 and F1, respectively.
We here first demonstrate the effect of applying to our data two commonly used normalization transforms and contrast this to our conservative approach. We then briefly consider the question of housekeeping genes in this context. Finally, we examine the interesting idea of using CGH measurements for signal normalization.
Choosing of an appropriate normalization transform (Kreil & Russell, 2005) needs to consider
Quantile normalization is a very generic and powerful normalization transform. As its underlying assumption, that the signal distribution for all samples should be the same, is valid in many experiments, it is popular and the default normalization method for many algorithms, such as the rma and gcrma approaches to probe summarization (Irizarry et al., 2003; Wu et al., 2004; Wu & Irizarry, 2005), which are provided in both academic and commercial tools (e.g., Bioconductor or GeneSpring). In our situation, however, where we expect many genes on the trisomic chromosome 5 to be expressed at higher levels in trisomic plants than in disomic plants, the signal distribution in trisomic plants will reflect the large number of more highly expressed genes. Note that this is not easily seen from the diagnostic Q–Q plots above but follows directly from the large imbalance of differential signals shown in the M(A) plots. These indicate that a large number of genes are more highly expressed in the trisomic samples whereas no correspondingly strong down-regulation effect can be observed. As a result, the conditions for quantile normalization are not satisfied.
If quantile normalization is nevertheless applied, we observe a substantial squashing of the systematically increased expression of genes on chromosome 5. Consider the M(A) plots for the group-wise averages. Compare, e.g., the third row of panels examining expression levels in wildtype (F1.WT) and trisomic plants (F1.tri5) of the F3 generation [ conservative vs quantile normalization ].
We think that this resolves the puzzle that some similar studies (e.g., Makarevitch et al., 2008) could detect only a fraction of the effects that we report in our paper. To test this assumption, we have tried to recreate the analysis reported there. To remove any unfair advantages of a better signal-to-noise ratio in larger data sets, we selected a subset of our data for comparable sample sizes. We then applied quantile normalization and used the statistical methods and thresholds of Makarevitch et al. (2008) in testing for differential expression (Benjamini-Hochberg correction for multiple testing, 15% FDR threshold). Trans effects almost vanished (7 genes affected), and for only 23% of genes on chromosome 5 could higher expression levels be detected with significance. Restricting the analysis to more highly expressed genes, where the platform sensitivity is better, the proportion of affected cis genes rises to 44%, which matches the numbers reported by Makarevitch et al. (2008). It is noteworthy that this is still a factor of two below the ratio of genes clearly affected in our study despite us using much more conservative statistical tests and thresholds. This difference is not due to greater sample size in our study, as we can recover most effects on chromosome 5 already with the smaller test set examined above. We conclude that inappropriate normalization can drastically affect the power of an analysis to detect even the large-scale expression differences caused by aneuploidy. This raises the hope that a re-analysis of earlier experiments may yield a considerably more sensitive detection of effects.
M(A) Loess normalization is a another generic and powerful normalization transform. As its underlying assumption, that the differential expression signal should be independent of average gene expression levels and average to zero, is valid in many experiments, it is also popular and the default normalization method in many academic and commercial tools (e.g., Bioconductor or Agilent Feature Extraction Software).
In our situation, however, where we expect many genes on the trisomic chromosome 5 to be expressed at higher levels in trisomic plants than in disomic plants, we do not expect differential expression signals to average to zero. This follows directly from the large imbalance of differential signals seen in the M(A) plots. These indicate that a large number of genes are more highly expressed in the trisomic samples whereas no correspondingly strong down-regulation effect can be observed. As a result, the conditions for M(A) Loess normalization normalization are not satisfied.
While this problem has been recognized (Torres et al., 2007), there is no easy post hoc fix once an aggressive normalization transform has been applied. We illustrate this following the approach suggested by Torres et al. (2007). To this end, we have marked the M(A) Loess average by a white line in an M(A) plot similar to Figure 5 of our paper [ pre.loess ]. In Loess normalization, this average is subtracted from the signal M for all genes. The orange lines indicate the average expression and standard deviations for genes not on the trisomic chromosome 5, purple lines show the trend and variance for chromosome 5. It is clear that the orange centre trend line before Loess normalization shows the desired behaviour, hugging the zero axis. Subtracting the white general trend, in contrast, yielded data that then claimed that, on average, genes on other chromosomes showed reduced expression in trisomics. To fix this, Torres et al. (2007) suggested correcting data by the average M value of these genes after Loess normalization, the value of which is indicated by a yellow dashed line. Results of M(A) Loess normalization and this average bias subtraction fix, however, are worse than before normalization [ post.loess ]: The trend of genes on other chromosomes clearly deviates from the zero axis, showing a bias for upregulation for lowly and highly expression genes and also a bias for downregulation for genes with moderate expression levels.
A traditional approach to microarray normalization tries to exploit so-called housekeeping genes, assuming that such genes can be identified that indeed do not change under experimental conditions. Clearly, a gene that is stable in many conditions might still be affected by a particular experiment. It is hence interesting to briefly consider whether traditional housekeeping genes are stably expressed in a setting like ours where substantial non-random gene expression changes are expected. We therefore compared, for all samples, the measurements probing ribosomal RNA and actin expression levels as well as measurements of two other traditional house keeping genes, GAPDH (GAPC-2) and UPL7 [ house ]. One of the two RNA species, several actin genes, as well as GAPC-2 and UPL7 showed stable expression levels and thus also corroborate our normalization approach. Array measurements furthermore agreed well with qRT-PCR measurements [ Fig. S1, bottom panel shows GAPC-2 ]. Interestingly, however, other supposedly stable genes like 25S rRNA and actin 11 showed marked fluctuations. This confirms that the selection of a good set of housekeeping genes is in general non-trivial. Normalization with a small number of housekeeping genes, moreover, suffers considerable noise compared to normalization methods that estimate their parameters from thousands of genes.
In comparison, in our conservative approach we identify, by iterative trimmed least squares fit, the subset of 60% least varying genes across all samples, and use these to identify a constant background offset and a constant multiplicative scaling factor. Only two parameters per sample need to be estimated, which makes the approach very conservative. As the parameter estimates are based on expression levels measured for thousands of genes they are, moreover, very precise. An established implementation of this method is available in the R package vsn (Huber et al., 2002). We have further excluded all genes on chromosome 5 to ensure that the systemic overexpression of these genes did not affect the normalization procedure. Results with and without genes on chromosome 5 were very similar (data not shown), confirming the expected robustness of the approach to outliers. It is remarkable that this conservative transform already yielded highly consistent data, as is particularly reflected by the centre trend for genes not on chromosome 5 hugging the zero axis (orange lines of Figure 5 of the paper and the Supplement Results section).
As CGH data was available for a number of samples, we could also investigate the performance of using this data for normalization. This approach assumes that the fold-change in expression observed for each gene should reflect the fold-change in dosage measured by CGH. One might hope that gene specific differences in binding behaviour would similarly affect gene transcripts and genomic DNA targets. Interestingly, while the approach worked reasonably well for genes of moderate expression levels (slightly undercorrecting for gene dosage), considerable overcorrection was observed for low and high expression levels [ CGH.norm ]. These difficulties can be explained by the strong non-linear signal response of the platform that is also reflected in Figure 5 of the manuscript.