This paper addresses protein structural classification using a support vector machine (SVM) classifier. Amino acid composition (AAC) and pseudo amino acid composition (PseAA) features were applied in several variants. In addition, a feature reflecting the length of the protein chain was taken into consideration. The SVM classifier was compared with minimal length classifiers on the AAC features. The best SVM model was selected by grid search, using cross-validation (CV) as the criterion, and was then assessed with appropriate evaluation measures. The SCOP database and the ASTRAL tool served as the source of non-homologous data, which avoids redundancy while retaining as much of the available data as possible.
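As an illustration of the AAC features mentioned above, the following minimal sketch computes the relative frequency of each of the 20 standard amino acids in a sequence and optionally appends a chain-length feature. The function name `aac_features`, the alphabet ordering, and the choice to encode length as the raw count of standard residues are assumptions made for this example; the abstract does not say how the length feature was encoded or which PseAA variants were used.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code (ordering is an assumption).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aac_features(sequence, with_length=False):
    """Return the amino acid composition of a protein sequence.

    Each of the first 20 values is the fraction of standard residues equal
    to the corresponding amino acid. Non-standard symbols are ignored.
    If with_length is True, the number of standard residues is appended
    as an extra feature (hypothetical encoding of the chain length).
    """
    counts = Counter(sequence.upper())
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    features = [counts[a] / total for a in AMINO_ACIDS]
    if with_length:
        features.append(float(total))
    return features
```

A call such as `aac_features("MKVLA", with_length=True)` yields a 21-element vector that could be passed to any classifier; in practice the length feature would usually be rescaled to a range comparable with the composition values.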
Recursive feature elimination (RFE), the cross-validation (CV) coefficient, and classification accuracy on test data are applied as feature selection criteria in order to find relevant features and to analyze their influence on classifier accuracy. The feature selection method was compared with principal component analysis (PCA) to assess the effectiveness of feature reduction. A support vector machine classifier with a radial basis function (RBF) kernel is applied to find the best set of features using grid model selection and to select and assess relevant features. The best selected feature set is then analyzed and interpreted as a source of knowledge about protein structure and the biochemical properties of the amino acids in the protein domain sequence.
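The idea of using cross-validation accuracy as a feature selection criterion can be sketched as a greedy backward elimination loop. This is not the authors' procedure: a nearest-centroid classifier stands in for the RBF-kernel SVM so that the example stays dependency-free, the folds are assigned by index modulo `k`, and all function names are hypothetical. It is a wrapper-style elimination rather than weight-based SVM-RFE.

```python
def nearest_centroid_predict(train_X, train_y, test_X, features):
    """Predict labels with a nearest-centroid rule (stand-in for the SVM)."""
    sums, counts = {}, {}
    for x, label in zip(train_X, train_y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for j, f in enumerate(features):
            acc[j] += x[f]
        counts[label] = counts.get(label, 0) + 1
    centroids = {c: [s / counts[c] for s in acc] for c, acc in sums.items()}

    def sq_dist(x, centroid):
        return sum((x[f] - m) ** 2 for f, m in zip(features, centroid))

    return [min(centroids, key=lambda c: sq_dist(x, centroids[c]))
            for x in test_X]


def cv_accuracy(X, y, features, k=3):
    """k-fold CV accuracy using only the given feature indices."""
    n = len(X)
    correct = 0
    for fold in range(k):
        test_idx = [i for i in range(n) if i % k == fold]
        train_idx = [i for i in range(n) if i % k != fold]
        preds = nearest_centroid_predict(
            [X[i] for i in train_idx], [y[i] for i in train_idx],
            [X[i] for i in test_idx], features)
        correct += sum(p == y[i] for p, i in zip(preds, test_idx))
    return correct / n


def backward_elimination(X, y, n_keep, k=3):
    """Repeatedly drop the feature whose removal keeps CV accuracy highest."""
    features = list(range(len(X[0])))
    history = [(tuple(features), cv_accuracy(X, y, features, k))]
    while len(features) > n_keep:
        candidates = [[f for f in features if f != drop] for drop in features]
        scores = [cv_accuracy(X, y, c, k) for c in candidates]
        best = max(range(len(candidates)), key=scores.__getitem__)
        features = candidates[best]
        history.append((tuple(features), scores[best]))
    return features, history
```

The returned `history` records the CV accuracy at each subset size, which is the kind of trace one would inspect to see how removing features affects classifier accuracy. Replacing the stand-in classifier with an RBF-kernel SVM tuned by grid search would bring the sketch closer to the setup described in the abstract.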