Microarray analysis is widely used for cancer diagnosis and classification. However, among the large number of genes in microarray data, only a small fraction is effective for building a highly reliable model. There are two major challenges in this regard. The first is how to identify, from the thousands of genes in a dataset, the significant genes that can improve the generated model. The second is how to select a subset of genes with minimal dependence on the particular samples in the dataset, a property termed the stability of the selected set. Different approaches have been presented in previous works. In this study, we propose a new gene selection algorithm based on the previously proposed phase diagram method. Ridge logistic regression is used to estimate the probability that a gene belongs to a set of stable genes with high classification capability. To address the stability issue, a method is proposed for the final selection of gene sets. The B632+ error estimation method is applied to evaluate the performance of the model. The proposed method was applied to four cancer datasets, and the results were compared with those of other validation methods. The results show that the selected genes are superior in terms of the number of genes, degree of stability, and classification accuracy.
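For readers unfamiliar with the B632+ estimator mentioned above, the following is a minimal sketch of one common formulation of the .632+ bootstrap error. It is not taken from this study: the function names `no_information_rate` and `b632_plus` are hypothetical, and the inputs (resubstitution error, out-of-bootstrap error, and predicted labels) are assumed to have been computed elsewhere by whatever classifier is being evaluated.

```python
from collections import Counter


def no_information_rate(y_true, y_pred):
    """Error expected if predictions were independent of the true labels.

    Computed as sum_k p_k * (1 - q_k), where p_k is the observed
    proportion of class k and q_k is the predicted proportion of class k.
    """
    n = len(y_true)
    p = Counter(y_true)
    q = Counter(y_pred)
    return sum((p[k] / n) * (1.0 - q[k] / n) for k in p)


def b632_plus(resub_err, oob_err, gamma):
    """Combine training and out-of-bootstrap error (illustrative variant).

    resub_err: error on the training samples (resubstitution error)
    oob_err:   average error on samples left out of each bootstrap draw
    gamma:     no-information error rate
    """
    # Cap the out-of-bootstrap error at the no-information rate.
    oob = min(oob_err, gamma)
    # Relative overfitting rate, kept within [0, 1].
    if oob > resub_err and gamma > resub_err:
        r = (oob - resub_err) / (gamma - resub_err)
    else:
        r = 0.0
    # The weight grows from 0.632 (no overfitting) to 1 (severe overfitting).
    w = 0.632 / (1.0 - 0.368 * r)
    return (1.0 - w) * resub_err + w * oob
```

When the relative overfitting rate is zero, the weight stays at 0.632 and the estimate reduces to the ordinary .632 bootstrap; as overfitting grows, more weight shifts to the out-of-bootstrap error.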