A major problem in applying DNA microarrays to classification is the high dimensionality of the data set. We recently proposed a gene selection method based on Partial Least Squares (PLS) for finding the best genes for classification. The new idea is to use PLS not only as a multiclass approach but also to construct additional binary selections using one-versus-rest and one-versus-one approaches. Ranked gene lists are highly unstable in the sense that a small change in the data set often leads to a large change in the resulting ordered list. In this article we assess the stability of our approaches. We compare the variability of the ordered lists obtained with the proposed methods against those obtained with the well-known Recursive Feature Elimination (RFE) method and the classical t-test method. This paper focuses on effective identification of informative genes. As a result, a new strategy for finding a small subset of significant genes is designed. Our results on real cancer data show that our approach achieves a very high accuracy rate for different combinations of classification methods while also giving very stable feature rankings.
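The notion of ranking stability described above can be illustrated with a minimal sketch. It is not the authors' procedure: it ranks genes by the absolute Welch t-statistic, perturbs the data by leaving one sample out at a time, and scores stability as the mean pairwise Jaccard overlap of the top-k lists. The function names, the Jaccard measure, and the toy data are all assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, variance

def t_scores(X, y):
    """Absolute Welch t-statistic per gene; X is samples x genes, y holds 0/1 labels."""
    scores = []
    for g in range(len(X[0])):
        a = [row[g] for row, lab in zip(X, y) if lab == 0]
        b = [row[g] for row, lab in zip(X, y) if lab == 1]
        se = sqrt(variance(a) / len(a) + variance(b) / len(b))
        scores.append(abs(mean(a) - mean(b)) / se if se > 0 else 0.0)
    return scores

def top_k(X, y, k):
    """Indices of the k genes with the largest scores."""
    s = t_scores(X, y)
    return set(sorted(range(len(s)), key=lambda g: -s[g])[:k])

def loo_stability(X, y, k):
    """Mean pairwise Jaccard overlap of top-k lists over leave-one-out subsets."""
    lists = [top_k(X[:i] + X[i + 1:], y[:i] + y[i + 1:], k) for i in range(len(X))]
    pairs = list(combinations(lists, 2))
    return sum(len(a & b) / len(a | b) for a, b in pairs) / len(pairs)

# Hypothetical toy data: 8 samples, 4 genes; only gene 0 separates the classes.
X = [
    [1.0, 5.0, 3.0, 2.0], [1.2, 4.0, 3.5, 2.1],
    [0.9, 6.0, 2.5, 1.9], [1.1, 5.5, 3.2, 2.2],
    [5.0, 5.2, 3.1, 2.0], [5.3, 4.1, 2.9, 2.3],
    [4.8, 5.9, 3.4, 1.8], [5.1, 5.4, 3.0, 2.1],
]
y = [0, 0, 0, 0, 1, 1, 1, 1]

print(top_k(X, y, 1))          # best-ranked gene on the full data
print(loo_stability(X, y, 1))  # 1.0 would mean identical top-1 lists in every subset
print(loo_stability(X, y, 2))  # typically lower, since the runner-up genes are noise
```

A value close to 1 indicates that the selected genes barely change under small perturbations of the data; values near 0 indicate an unstable ranking.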
Mathematical modeling of cell signaling pathways has become a very important and challenging problem in recent years. The importance comes from possible applications of obtained models. It may help us to understand phenomena appearing in single cells and cell populations on a molecular level. Furthermore, it may help us with the discovery of new drug therapies. Mathematical models of cell signaling pathways take different forms. The most popular way of mathematical modeling is to use a set of nonlinear ordinary differential equations (ODEs). It is very difficult to obtain a proper model. There are many hypotheses about the structure of the model (sets of variables and phenomena) that should be verified. The next step, fitting the parameters of the model, is also very complicated because of the nature of measurements. The blotting technique usually gives only semi-quantitative observations, which are very noisy and collected only at a limited number of time moments. The accuracy of parameter estimation may be significantly improved by a proper experiment design. Recently, we have proposed a gradient-based algorithm for the optimization of a sampling schedule. In this paper we use the algorithm in order to optimize a sampling schedule for the identification of the mathematical model of the NF-κB regulatory module, known from the literature. We propose a two-stage optimization approach: a gradient-based procedure to find all stationary points and then pair-wise replacement for finding optimal numbers of replicates of measurements. Convergence properties of the presented algorithm are examined.
The dynamical behaviour of a cell signalling pathway may be described by a set of nonlinear ordinary differential equations. The data for parameter estimation are collected only at a small number of discrete time points. We present a gradient-based algorithm for parameter estimation, together with some considerations on the identifiability of cell signalling pathways. The approach is illustrated on a model of the NF-κB transcription factor pathway.
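As a rough illustration of gradient-based estimation from sparse samples, the sketch below fits the single rate constant of a toy model dx/dt = -k x, not the NF-κB model. It assumes the gradient of the squared error is obtained by integrating the sensitivity equation for s = dx/dk alongside the state; the model, function names, step size, and learning rate are all hypothetical choices.

```python
from math import exp

def simulate(k, x0, times, dt=0.01):
    """RK4 integration of dx/dt = -k*x and its sensitivity ds/dt = -x - k*s.

    Returns (x, s) at each sample time; times must be ascending multiples of dt.
    """
    def f(x, s):
        return -k * x, -x - k * s

    x, s, t = x0, 0.0, 0.0
    out = []
    for t_target in times:
        for _ in range(round((t_target - t) / dt)):
            k1 = f(x, s)
            k2 = f(x + 0.5 * dt * k1[0], s + 0.5 * dt * k1[1])
            k3 = f(x + 0.5 * dt * k2[0], s + 0.5 * dt * k2[1])
            k4 = f(x + dt * k3[0], s + dt * k3[1])
            x += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
            s += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
        t = t_target
        out.append((x, s))
    return out

def fit_k(times, data, x0, k_init=0.2, lr=0.05, iters=500):
    """Plain gradient descent on J(k) = sum_i (x(t_i; k) - y_i)^2."""
    k = k_init
    for _ in range(iters):
        traj = simulate(k, x0, times)
        grad = sum(2 * (x - y_i) * s for (x, s), y_i in zip(traj, data))
        k -= lr * grad
    return k

# Four sparse, noise-free samples generated with a "true" rate constant k = 1.
times = [0.5, 1.0, 2.0, 4.0]
data = [exp(-t) for t in times]
k_hat = fit_k(times, data, x0=1.0)
print(k_hat)
```

With noisy or semi-quantitative data the same loop applies, but the estimate then depends strongly on where the samples are placed, which is where identifiability considerations come in.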
This work presents the results of a gradient-based optimization of the parameter β of a fractional-order hold in a control system composed of a plant modeled as an inertia with delay and a discrete PID controller. A map of the optimal value of β on the time constant/delay plane has been obtained. We observed that only a negative value of β substantially improves control quality, in the sense of the integral-quadratic performance index, compared with a control system using a zero-order hold.
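For orientation, a fractional-order hold is commonly defined as follows; the abstract does not state the formula, so the notation here (sampling period T, samples u(kT)) is an assumption based on the standard textbook definition:

```latex
u(t) = u(kT) + \beta \, \frac{u(kT) - u\big((k-1)T\big)}{T} \, (t - kT),
\qquad kT \le t < (k+1)T .
```

Under this definition β = 0 gives the zero-order hold and β = 1 the first-order hold, so a negative β extrapolates against the most recent change in the control signal.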
A DNA microarray-based technique has been developed to measure semi-quantitatively the in vivo global chromatin condensation state at the resolution of a single gene. Chromatin was fractionated on the basis of the differential solubility of histone H1-containing and histone H1-free nucleosomes. A set of genes non-randomly distributed between the histone H1-free (uncondensed, or open) and histone H1-containing (condensed, or closed) chromatin fractions has been identified. Transcript levels have been measured for the same group of genes, and the correlation between the transcriptional activity of particular genes and their distribution between the chromatin fractions has been established.
DNA microarrays provide a new technique for measuring gene expression, which has attracted a lot of research interest in recent years. It has been suggested that gene expression data from microarrays (biochips) can be employed in many biomedical areas, e.g., in cancer classification. Although several classification methods, both new and existing, have been tested, the selection of a proper (optimal) set of genes whose expression levels can be used for classification is still an open problem. Recently we proposed a new recursive feature replacement (RFR) algorithm for choosing a suboptimal set of genes. The algorithm uses the support vector machines (SVM) technique. In this paper we use the RFR method to find suboptimal gene subsets for tumor/normal colon tissue classification. The results are compared with those of other methods recently proposed in the literature. The comparison shows that the RFR method finds the smallest gene subset (only six genes) that gives no misclassifications in leave-one-out cross-validation on a tumor/normal colon data set. In this sense the RFR algorithm outperforms all other investigated methods.
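The leave-one-out criterion used to score a candidate gene subset can be sketched as follows. This is only an illustration of the evaluation loop: a nearest-centroid classifier stands in for the SVM, the RFR search itself is not reproduced, and the function names and toy data are hypothetical.

```python
def nearest_centroid_predict(X_train, y_train, x):
    """Assign x to the class whose mean vector is closest (a stand-in for an SVM)."""
    centroids = {}
    for label in set(y_train):
        rows = [r for r, lab in zip(X_train, y_train) if lab == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]

    def dist2(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))

    return min(centroids, key=lambda lab: dist2(centroids[lab], x))

def loo_errors(X, y, genes):
    """Number of leave-one-out misclassifications using only the listed gene columns."""
    errors = 0
    for i in range(len(X)):
        X_train = [[row[g] for g in genes] for j, row in enumerate(X) if j != i]
        y_train = [lab for j, lab in enumerate(y) if j != i]
        x_test = [X[i][g] for g in genes]
        if nearest_centroid_predict(X_train, y_train, x_test) != y[i]:
            errors += 1
    return errors

# Hypothetical toy data: 6 samples, 3 genes; gene 0 separates "tumor" from "normal".
X = [
    [1.0, 5.0, 3.0], [1.2, 4.0, 3.5], [0.9, 6.0, 2.5],
    [5.0, 5.2, 3.1], [5.3, 4.1, 2.9], [4.8, 5.9, 3.4],
]
y = ["normal", "normal", "normal", "tumor", "tumor", "tumor"]

print(loo_errors(X, y, [0]))     # informative gene only
print(loo_errors(X, y, [1, 2]))  # uninformative genes only
```

A subset search such as RFR would call a function like `loo_errors` repeatedly, keeping the subsets that reduce the error count.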
Microarrays are a new technique for measuring gene expression that has attracted a great deal of research interest in recent years. It has been suggested that gene expression data from microarrays (biochips) can be utilized in many biomedical areas, for example in cancer classification. Although several classification methods, both new and existing, have been tested, the selection of a proper (optimal) set of genes whose expression is used for classification is still an open problem. In this paper we propose a heuristic method for choosing a suboptimal set of genes using support vector machines (SVMs). The resulting set of genes optimizes the leave-one-out cross-validation error. The method is tested on microarray gene expression data from samples of two cancer types: acute myeloid leukemia (AML) and acute lymphoblastic leukemia (ALL). The results show that the classification quality achieved with the selected set of genes is much better than that achieved with sets obtained using other feature selection methods.
Proper classification of cancer is crucial for diagnosis and for choosing the optimal medical therapy. It has been suggested in recent years that cancer can be classified by monitoring gene expression. The usefulness of this approach has increased with the new technique of gene expression monitoring that uses so-called "expression chips". Recently, a heuristic method of cancer classification based on gene expression levels, called the weighted voting (WV) method, was proposed in [1, 3] and tested on a set of samples of acute myeloid leukemia (AML) and acute lymphoblastic leukemia (ALL). Here a more traditional approach to feature selection and classification is presented and tested on the same data set. Feature selection is performed using a modified Sebestyen criterion, and classification is done using a linear classifying function trained by a modified perceptron algorithm. The results are better than those of the WV method. In cross-validation on the initial set, all 38 samples were classified correctly (WV: 1 incorrect), and only one sample from the independent set was classified incorrectly (WV: 2 incorrect).
This paper deals with the problem of identification and suboptimal control of a counterflow heat exchanger. From the point of view of control theory, the heat exchanger is a nonlinear, multidimensional, distributed-parameter dynamical system, and because of its complexity it is difficult to identify as a black box. In this paper a hybrid model containing neural networks is identified. Its complicated structure makes the analytical calculation of the gradient of the performance index with respect to the neural network weights very difficult. This problem is solved using a special, structural formulation of sensitivity analysis called generalized back propagation through time (GBPTT). The method is universal and can be used to search for suboptimal parameters (weights) or suboptimal control signals in continuous-time or discrete-time nonlinear dynamical systems. Moreover, the presented method is fully mnemonic. The obtained model of the heat exchanger and the same methodology are then used to calculate the gradient for the suboptimal control signal of the heat exchanger. Numerical examples are presented.