Publication languages
EN
Abstracts
EN
DNA microarray data are expected to be of great help in developing efficient methods for diagnosis and tumor classification. However, because the number of instances is small compared with the large number of genes, many computational learning methods have difficulty selecting small gene subsets. To select significant genes from high-dimensional data for tumor classification, several researchers are now exploring microarray data with various gene selection methods. However, existing gene selection techniques do not agree on the relevant gene subsets that improve classification accuracy. This motivates us to propose a new hybrid gene selection method that helps eliminate misleading genes and classify a disease correctly in less computational time. The proposed method consists of two stages. In the first stage, the EGS method, which uses a multi-layer approach and an f-score approach, is applied to filter noisy and redundant genes from the dataset. In the second stage, an adaptive genetic algorithm (AGA) works as a wrapper to identify, from the reduced datasets produced by EGS, the significant gene subsets that can contribute to detecting cancer or tumors. The AGA uses support vector machine (SVM) and Naïve Bayes (NB) classifiers as a fitness function to select highly discriminating genes and to maximize classification accuracy. The experimental results show that the proposed framework achieves a significant reduction in cardinality and outperforms state-of-the-art gene selection methods in terms of accuracy and the number of selected genes.
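The f-score filtering step of the first stage can be illustrated with a minimal sketch. The abstract does not define the f-score, so the code assumes the common two-class form (squared distances of the class means from the overall mean, divided by the sum of the within-class sample variances). The function names `f_score` and `top_k_genes` and the toy expression values are hypothetical, and the multi-layer part of EGS and the AGA wrapper are not shown.

```python
from statistics import mean, variance


def f_score(values, labels):
    """Two-class F-score of one gene (assumed definition; labels are 0 or 1)."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    overall = mean(values)
    # Between-class separation: class means measured against the overall mean.
    between = (mean(pos) - overall) ** 2 + (mean(neg) - overall) ** 2
    # Within-class spread: sum of the sample variances (n - 1 denominator).
    within = variance(pos) + variance(neg)
    return between / within if within > 0 else 0.0


def top_k_genes(samples, labels, k):
    """Return the indices of the k genes with the highest F-score."""
    n_genes = len(samples[0])
    scores = [f_score([row[g] for row in samples], labels) for g in range(n_genes)]
    ranked = sorted(range(n_genes), key=lambda g: scores[g], reverse=True)
    return ranked[:k]


# Toy expression matrix: 6 samples x 3 genes (made-up numbers, not from the paper).
samples = [
    [1.0, 5.0, 0.2],
    [1.2, 4.0, 0.9],
    [0.9, 6.0, 0.4],
    [3.0, 5.5, 0.5],
    [3.1, 4.5, 0.8],
    [2.9, 5.0, 0.3],
]
labels = [0, 0, 0, 1, 1, 1]
selected = top_k_genes(samples, labels, k=1)
```

In this toy data only the first gene separates the two classes, so it receives the highest score. In a two-stage design like the one described, a filter of this kind would only shrink the gene pool before the wrapper search; the cutoff `k` is a free parameter here, not a value taken from the paper.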
Authors
author
  • Department of Computer Science & Engineering, NIT, Raipur, Chhattisgarh 492010, India
author
  • Department of Computer Science & Engineering, NIT, Raipur, India
author
  • Department of Computer Science & Engineering, NIT, Raipur, India
Bibliography
  • [1] Ang JC, Mirzal A, Haron H, Nuzly H, Hamed A. Supervised, unsupervised, and semi-supervised feature selection: a review on gene selection. IEEE/ACM Trans Comput Biol Bioinform 2016;13(5):971–89.
  • [2] Mafarja M, Mirjalili S. Whale optimization approaches for wrapper feature selection. Appl Soft Comput J 2018;62:441–53.
  • [3] Nakariyakul S. High-dimensional hybrid feature selection using interaction information-guided search. Knowl Based Syst 2018;145:59–66.
  • [4] Hancer E, Xue B, Zhang M. Differential evolution for filter feature selection based on information theory and feature ranking. Knowl Based Syst 2018;140:103–19.
  • [5] Bonilla-Huerta E, Hernández-Montiel M, Morales-Caporal R, Arjona-López M. Hybrid framework using multiple-filters and an embedded approach for an efficient selection and classification of microarray data. IEEE/ACM Trans Comput Biol Bioinform 2016;13(1):12–26.
  • [6] Mohamad MS, Omatu S, Deris S, Yoshioka M. A modified binary particle swarm optimization for selecting the small subset of informative genes from gene expression data. IEEE Trans Inf Technol Biomed 2011;15(6):813–22.
  • [7] Liu H, Li D. Predicting novel salivary biomarkers for the detection of pancreatic cancer using biological feature-based classification. Pathol Res Pract 2016.
  • [8] Hancer E, Xue B, Karaboga D, Zhang M. A binary ABC algorithm based on advanced similarity scheme for feature selection. Appl Soft Comput J 2015;36:334–48.
  • [9] Kumar S, Kumar P, Kumar A, Swarnkar T. Elitism based multi-objective differential evolution for feature selection: a filter approach with an efficient redundancy measure. J King Saud Univ – Comput Inf Sci 2017.
  • [10] Wang A, An N, Yang J, Chen G, Li L, Alterovitz G. Wrapper-based gene selection with Markov blanket. Comput Biol Med 2017;81:11–23.
  • [11] Aziz R, Verma CK, Srivastava N. A novel approach for dimension reduction of microarray. Comput Biol Chem 2017;71:161–9.
  • [12] Alshamlan HM, Badr GH, Alohali YA. Genetic Bee Colony (GBC) algorithm: a new gene selection method for microarray cancer classification. Comput Biol Chem 2015;56:49–60.
  • [13] Alshamlan H, Badr G, Alohali Y. mRMR-ABC: a hybrid gene selection algorithm for cancer classification using microarray gene expression profiling, vol. 2015. 2015.
  • [14] Aziz N, Verma R, Jha CK, Srivastava M. Artificial neural network classification of microarray data using new hybrid gene selection method. Int J Data Min Bioinform 2017;17(1):42–65.
  • [15] Lee S, Xu Z, Li T, Yang Y. A novel bagging C4.5 algorithm based on wrapper feature selection for supporting wise clinical decision making. J Biomed Inform 2017.
  • [16] Wan Y, Wang M, Ye Z, Lai X. A feature selection method based on modified binary coded ant colony optimization algorithm. Appl Soft Comput J 2016;49:248–58.
  • [17] Das AK, Goswami S, Chakrabarti A, Chakraborty B. A new hybrid feature selection approach using feature association map for supervised and unsupervised classification. Expert Syst Appl 2017;88:81–94.
  • [18] Paul D, Su R, Romain M, Sébastien V, Pierre V, Isabelle G. Feature selection for outcome prediction in oesophageal cancer using genetic algorithm and random forest classifier. Comput Med Imaging Graph 2016.
  • [19] Goldberg JHH, David E. Genetic algorithms and machine learning. Mach Learn 1988;3(2):95–9.
  • [20] Zheng H, Zhang Y, Liu J, Wei H, Zhao J, Liao R. A novel model based on wavelet LS-SVM integrated improved PSO algorithm for forecasting of dissolved gas contents in power transformers. Electr Power Syst Res 2018;155:196–205.
  • [21] Seijo-Pardo B, Porto-Díaz I, Bolón-Canedo V, Alonso-Betanzos A. Ensemble feature selection: homogeneous and heterogeneous approaches. Knowl Based Syst 2017;118:124–39.
  • [22] Ding C, Peng H. Minimum redundancy feature selection from microarray gene expression data. J Bioinform Comput 2005;3(2):185–205.
  • [23] Huang Y, McCullagh PJ, Black ND. An optimization of ReliefF for classification in large datasets. Data Knowl Eng 2009;68(11):1348–56.
  • [24] Jin X, Xu A, Bie R, Guo P. Machine learning techniques and chi-square feature selection for cancer classification using SAGE gene. International Workshop on Data Mining for Biomedical Applications. Berlin: Springer; 2006. p. 106–15.
  • [25] Mode H. Joint & conditional entropy, mutual information. Part I. Joint and conditional entropy; 2014.
  • [26] Sadri A, Ren Y, Salim FD. Information gain-based metric for recognizing transitions in human activities. Pervasive Mob Comput 2017;38:92–109.
  • [27] Leung Y, Hung Y. A multiple-filter-multiple-wrapper approach to gene selection and microarray data classification. IEEE/ACM Trans Comput Biol Bioinform 2010;7(1):108–17.
  • [28] Apolloni J, Leguizamón G, Alba E. Two hybrid wrapper-filter feature selection algorithms applied to high-dimensional microarray experiments. Appl Soft Comput 2016;38:922–32.
  • [29] Bonilla-Huerta E. Hybrid filter-wrapper with a specialized random multi-parent crossover operator for gene selection and classification problems. International Conference on Intelligent Computing. Berlin, Heidelberg: Springer; 2011. p. 453–61.
  • [30] Yin H, Member S, Jha NK. A health decision support system for disease diagnosis based on wearable medical sensors and machine learning ensembles. IEEE Trans Multi-Scale Comput Syst 2017;3(4):228–41.
  • [31] Silwattananusarn T, Kanarkard W, Tuamsuk K. Enhanced classification accuracy for cardiotocogram data with ensemble feature selection and classifier ensemble. J Comput Commun 2016;20–35.
  • [32] Al-Rajab M, Lu J, Xu Q. Examining applying high performance genetic data feature selection and classification algorithms for colon cancer diagnosis. Comput Methods Programs Biomed 2017;146:11–24.
  • [33] Ghorai S. Multicategory cancer classification from gene expression data by multiclass NPPC ensemble; 2010;41–6.
  • [34] Rachman AA. Cancer classification using fuzzy C-means with feature selection. 12th Int. Conf. Math. Stat. Their Appl. 2016. pp. 31–4.
  • [35] Zhang J, Chung HS, Member S, Lo W. Clustering-based adaptive crossover and mutation probabilities for genetic algorithms. IEEE Trans Evol Comput 2007;11(3):326–35.
  • [36] Wei X, Shao W, Zhang C, Li J, Wang B. Improved self-adaptive genetic algorithm with quantum scheme for electromagnetic optimisation. Microwaves Antennas Propag 2014;8(12):965–72.
  • [37] Alirezazadeh P, Fathi A, Abdali-Mohammadi F. A genetic algorithm-based feature selection for kinship verification. IEEE Signal Process Lett 2015;22(12):2459–63.
  • [38] Pes B, Dessì N, Angioni M. Exploiting the ensemble paradigm for stable feature selection: a case study on high-dimensional genomic data. Inf Fusion 2017;35:132–47.
  • [39] Laura Emmanuella LEA, De Paula Canuto AM. Filter-based optimization techniques for selection of feature subsets in ensemble systems. Expert Syst Appl 2014;41(4):1622–31.
  • [40] Ebrahimpour MK, Eftekhari M. Ensemble of feature selection methods: a hesitant fuzzy sets approach. Appl Soft Comput J 2017;50:300–12.
  • [41] Moradi P, Gholampour M. A hybrid particle swarm optimization for feature subset selection by integrating a novel local search strategy. Appl Soft Comput J 2016;43:117–30.
  • [42] Srinivas M, Patnaik LM. Adaptive probabilities of crossover and mutation in genetic algorithms. IEEE Trans Syst Man Cybern 1994;24(4):656–67.
  • [43] Arauzo-Azofra A, Benitez J, Castro J. A feature set measure based on relief. Proc. Fifth Int. Conf. Recent Adv. Soft Comput. 2004. pp. 104–9.
  • [44] Mundra PA, Rajapakse JC. SVM-RFE with MRMR filter for gene selection. IEEE Trans Nanobiosci 2010;9(1):31–7.
  • [45] Peng H, Long F, Ding C. Feature selection based on mutual information: criteria of max-dependency, max-relevance, and min-redundancy. IEEE Trans Pattern Anal Mach Intell 2005;27(8):1226–38.
  • [46] Thakur M. A new genetic algorithm for global optimization of multimodal continuous functions. J Comput Sci 2013;5(2):298–311.
  • [47] Kundakcı N, Kulak O. Hybrid genetic algorithms for minimizing makespan in dynamic job shop scheduling problem. Comput Ind Eng 2016;96:31–51.
  • [48] Cai Z, Zhu W. Feature selection for multi-label classification using neighborhood preservation. IEEE/CAA J Autom Sin 2018;5(1):320–30.
  • [49] Wu X, Kumar V, Ross QJ, Ghosh J, Yang Q, Motoda H, et al. Top 10 algorithms in data mining. Knowl Inf Syst 2008;14(1):1–37.
  • [50] Bron E, Smits M, Van Swieten J, Niessen W, Klein S. Feature selection based on SVM significance maps for classification of dementia. Int Work Mach Learn Med Imaging 2014;19(5):272–9.
  • [51] Díaz-Uriarte R, Alvarez de Andrés S. Gene selection and classification of microarray data using random forest. BMC Bioinform 2006;7(1).
  • [52] Lu H, Chen J, Yan K, Jin Q, Xue Y, Gao Z. A hybrid feature selection algorithm for gene expression data classification. Neurocomputing 2017.
  • [53] Li Y, Yang Y, Li G, Xu M, Huang W. A fault diagnosis scheme for planetary gearboxes using modified multi-scale symbolic dynamic entropy and mRMR feature selection. Mech Syst Signal Process 2017;91:295–312.
  • [54] Bashir S, Qamar U, Khan FH. IntelliHealth: a medical decision support application using a novel weighted multilayer classifier ensemble framework. J Biomed Inform 2016;59:185–200.
  • [55] Lai C, Yeh W, Chang C. Gene selection using information gain and improved simplified swarm optimization. Neurocomputing 2016;218:331–8.
  • [56] Rankawat SA, Dubey R. Robust heart rate estimation from multimodal physiological signals using beat signal quality index based majority voting fusion method. Biomed Signal Process Control 2017;33:201–12.
  • [57] Chen Y, Lin C. Combining SVMs with various feature selection strategies. Featur Extr 2006;(1):315–24.
  • [58] Soufan O, Kleftogiannis D, Kalnis P, Bajic VB. DWFS: a wrapper feature selection tool based on a parallel genetic algorithm. PLoS ONE 2015;10(2):e0117988.
  • [59] Van't Veer GJ, Dai LJ, Van De Vijver H, He MJ, Hart YD, Mao AA, et al. Gene expression profiling predicts clinical outcome of breast cancer. Lett Nat 2002;415(345):530–6.
  • [60] Alba JGE. Parallel multi-swarm optimizer for gene selection in DNA microarrays. Appl Intell 2012;37(2):255–66.
  • [61] Tong DL, Mintram R. Genetic Algorithm-Neural Network (GANN): a study of neural network activation functions and depth of genetic algorithm search applied to feature selection. Int J Mach Learn Cybern 2010;1(1–4):75–87.
  • [62] Bolón-Canedo A, Sánchez-Maroño V, Alonso-Betanzos N. Distributed feature selection: an application to microarray data classification. Appl Soft Comput 2015;30:136–50.
  • [63] Chen Y, Li Y, Wang G, Zheng Y, Xu Q, Fan J. A novel bacterial foraging optimization algorithm for feature selection. Expert Syst Appl 2017;83:1–17.
  • [64] Mollaee M, Moattar MH. A novel feature extraction approach based on ensemble feature selection and modified discriminant independent component analysis for microarray data classification. Biocybern Biomed Eng 2016;36(3):1–9.
  • [65] Medjahed SA, Saadi TA, Benyettou A, Ouali M. Kernel-based learning and feature selection analysis for cancer diagnosis. Appl Soft Comput J 2017;51:39–48.
  • [66] Pashaei E, Aydin N. Binary black hole algorithm for feature selection and classification on biological data. Appl Soft Comput J 2017;56:94–106.
  • [67] Zainudin M, Sulaiman M, Mustapha N, Perumal T, Nazri A, Mohamed R, et al. Feature selection optimization using hybrid Relief-f with self-adaptive differential evolution. Int J Intell Eng Syst 2017;10(3):21–9.
  • [68] Sasikala S, Balamurugan SA, Geetha S. A novel memetic algorithm for discovering knowledge in binary and multi class predictions based on support vector machine. Appl Soft Comput J 2016;49:407–22.
  • [69] Hall M. Correlation-based feature selection for machine learning; 1999.
  • [70] Moradi P, Rostami M. Integration of graph clustering with ant colony optimization for feature selection. Knowl Based Syst 2015;84:144–61.
  • [71] Ferreira AJ, Figueiredo MAT. Efficient feature selection filters for high-dimensional data. Pattern Recognit Lett 2012;33(13):1794–804.
  • [72] Sahin H, Subasi A. Classification of the cardiotocogram data for anticipation of fetal risks using machine learning techniques. Appl Soft Comput J 2015;33:231–8.
  • [73] El Houby EMF. A framework for prediction of response to HCV therapy using different data mining techniques. Adv Bioinform 2014;11.
  • [74] Jain I, Kumar V, Jain R. Correlation feature selection based improved-binary particle swarm optimization for gene selection and cancer classification. Appl Soft Comput J 2018;62:203–15.
  • [75] Sree Ranjini KS, Murugan S. Memory based hybrid dragonfly algorithm for numerical optimization problems. Expert Syst Appl 2017;83:63–78.
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-12aeccb0-6b80-4707-ba12-beb47696f6c7