Article title
Authors
Identifiers
Title variants
Publication languages
Abstracts
Microarray images commonly used in gene expression studies are heavily contaminated by noise and/or outlying values (outliers). Unfortunately, the standard methodology for analyzing Illumina BeadChip microarray images turns out to be too vulnerable to contamination by outliers. This paper proposes an alternative approach to low-level pre-processing of images obtained by the BeadChip microarray technology. The novel approach robustifies the standard methodology in a comprehensive way and thus ensures sufficient robustness (resistance) to outliers. A gene expression data set from a cardiovascular genetic study is analyzed, and the performance of the novel robust approach is compared with that of the standard methodology. The robust approach detects and deletes a larger percentage of outliers. More importantly, gene expressions are estimated more precisely. As a consequence, the performance of a subsequent classification into two groups (patients vs. control persons) also improves on the cardiovascular gene expression data set. A further improvement is obtained with weighted gene expression values, where the weights correspond to a robust estimate of the variability of the measurements for each individual gene transcript.
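The abstract does not spell out the estimators used, so the following is only a minimal sketch of the general idea (bead-level outlier deletion followed by variability-based weighting), not the authors' method. It assumes a median/MAD rule with a cutoff of 3 scaled MADs and weights inversely proportional to the robust scale; the function names, the cutoff, and the weighting formula are all hypothetical choices made for illustration.

```python
from statistics import median


def scaled_mad(values):
    """Median absolute deviation, scaled by 1.4826 so that it estimates
    the standard deviation for normally distributed data."""
    m = median(values)
    return 1.4826 * median([abs(v - m) for v in values])


def robust_summary(intensities, cutoff=3.0):
    """Summarize the bead intensities of one transcript.

    Beads farther than `cutoff` scaled MADs from the median are treated
    as outliers and deleted (an assumed rule, not taken from the paper).
    Returns (mean of kept beads, robust scale, number of deleted beads).
    """
    m = median(intensities)
    s = scaled_mad(intensities)
    if s == 0:
        kept = list(intensities)  # no spread: nothing can be flagged
    else:
        kept = [v for v in intensities if abs(v - m) <= cutoff * s]
    estimate = sum(kept) / len(kept)
    return estimate, s, len(intensities) - len(kept)


def transcript_weights(beads_by_transcript, eps=1e-9):
    """Assign each transcript a weight inversely proportional to its
    robust scale, normalized to sum to 1 (hypothetical weighting)."""
    raw = {
        name: 1.0 / (scaled_mad(beads) + eps)
        for name, beads in beads_by_transcript.items()
    }
    total = sum(raw.values())
    return {name: w / total for name, w in raw.items()}


if __name__ == "__main__":
    beads = {
        "transcript_a": [100, 102, 98, 101, 99, 500],  # one gross outlier
        "transcript_b": [200, 260, 150, 240, 170, 210],  # noisier signal
    }
    for name, values in beads.items():
        print(name, robust_summary(values))
    print(transcript_weights(beads))
```

In this sketch the gross outlier in `transcript_a` is deleted before averaging, and the noisier `transcript_b` receives the smaller weight, so it would contribute less to a downstream weighted classifier.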
Publisher
Journal
Year
Volume
Pages
556–563
Physical description
Bibliography: 30 items, figures, tables.
Contributors
author
- Institute of Computer Science of the Czech Academy of Sciences, Pod Vodárenskou věží 2, 182 07 Praha 8, Czech Republic
Notes
PL
Record prepared under agreement 509/P-DUN/2018, financed by funds of the Ministry of Science and Higher Education (MNiSW) allocated to science dissemination activities (2018).
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-fd673485-6c71-4855-8588-dd5970a457df