Abstract
In bioinformatics, the challenges of enzyme classification are well known. This study proposes a novel strategy based on Anomalous Autoencoders to characterize chitinases belonging to the glycoside hydrolases. The analysis was implemented in Python with TensorFlow. The classifier consists of two levels that determine both the enzymatic nature of an amino acid sequence and its corresponding chitinase enzyme family. Both levels account for class imbalance and for the underrepresentation of these enzyme families in the CAZy.org database. Amino acid sequences are represented by embeddings generated with the ProtFlash model. The approach was also compared comprehensively with other available software, and the results confirm the effectiveness of the proposed implementation relative to EzyPred, ECPred, and ProteInfer.
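The two-level idea in the abstract can be illustrated with a minimal sketch of reconstruction-error thresholding. This is not the authors' implementation: the trained autoencoders are swapped for a trivial centroid "reconstructor" so the example runs with the standard library alone, the 2-D vectors are toy stand-ins for ProtFlash embeddings, and the family labels, the `slack` factor, and all function and class names are hypothetical. It also assumes that the first level decides whether a sequence belongs to the target class at all and that the second level picks the family.

```python
from math import sqrt
from statistics import mean


def fit_centroid(vectors):
    """Stand-in for training an autoencoder: memorise the per-dimension mean."""
    return [mean(column) for column in zip(*vectors)]


def reconstruction_error(vector, centroid):
    """Distance between a vector and its 'reconstruction' (here, the centroid)."""
    return sqrt(sum((v - c) ** 2 for v, c in zip(vector, centroid)))


def fit_threshold(vectors, centroid, slack=1.5):
    """Hypothetical rule: tolerate errors up to `slack` times the worst training error."""
    return slack * max(reconstruction_error(v, centroid) for v in vectors)


class TwoLevelClassifier:
    """Level 1 rejects anomalies; level 2 picks the best-fitting family."""

    def __init__(self, families):
        # families: dict mapping a family label to a list of embedding vectors
        all_vectors = [v for vectors in families.values() for v in vectors]
        self.l1_centroid = fit_centroid(all_vectors)
        self.l1_threshold = fit_threshold(all_vectors, self.l1_centroid)
        self.l2_centroids = {
            label: fit_centroid(vectors) for label, vectors in families.items()
        }

    def predict(self, vector):
        # Level 1: a high reconstruction error means "not a member of the target class".
        if reconstruction_error(vector, self.l1_centroid) > self.l1_threshold:
            return None
        # Level 2: choose the family whose model reconstructs the vector best.
        return min(
            self.l2_centroids,
            key=lambda label: reconstruction_error(vector, self.l2_centroids[label]),
        )


# Toy data only; the labels and values are illustrative, not taken from the study.
toy_families = {
    "family_A": [[1.0, 0.0], [1.2, 0.1], [0.9, -0.1]],
    "family_B": [[0.0, 1.0], [0.1, 1.2], [-0.1, 0.9]],
}
model = TwoLevelClassifier(toy_families)
print(model.predict([1.1, 0.0]))
print(model.predict([10.0, 10.0]))
```

In a realistic setting each centroid would be replaced by a trained autoencoder, and the threshold would be tuned on held-out data rather than set from the training maximum.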
Pages
42–48
Physical description
27 references, figures.
Authors
- Center for Informatics Research, Faculty of Mathematics, Physics, and Computer Science, Central University “Marta Abreu” de Las Villas, Cuba
- Center for Informatics Research, Faculty of Mathematics, Physics, and Computer Science, Central University “Marta Abreu” de Las Villas, Cuba
- Center for Informatics Research, Faculty of Mathematics, Physics, and Computer Science, Central University “Marta Abreu” de Las Villas, Cuba
- Center for Informatics Research, Faculty of Mathematics, Physics, and Computer Science, Central University “Marta Abreu” de Las Villas, Cuba
References
- [1] N. Buton, F. Coste, and Y. Le Cunff, “Predicting Enzymatic Function of Protein Sequences With Attention,” Bioinformatics, vol. 39, no. 10, Oct. 2023, doi: 10.1093/bioinformatics/btad620.
- [2] Y. González Valle, D. Galpert, and R. Molina-Ruiz, “Integración De Rasgos Y Aprendizaje Semi-Supervisado Para La Clasificación Funcional De Enzimas Utilizando K-Means De Spark,” Revista Cubana de Ciencias Informáticas, vol. 14, no. 4, 2020.
- [3] Y. González Valle, D. Galpert, and R. Molina-Ruiz, “Agrupamiento Funcional De Enzimas GH-70 Utilizando Aprendizaje Semi-Supervisado Y Apache Spark,” Revista Cubana de Transformación Digital, pp. 14–32, 2021.
- [4] H. Chehili, S. E. Aliouane, A. Bendahmane, and M. A. Hamidechi, “DeepEnz: Prediction Of Enzyme Classification By Deep Learning,” Indonesian Journal of Electrical Engineering and Computer Science, vol. 22, no. 2, 2021, doi: 10.11591/ijeecs.v22.i2.pp1108-1115.
- [5] Z. Tao, B. Dong, Z. Teng, and Y. Zhao, “The Classification of Enzymes by Deep Learning,” IEEE Access, vol. 8, 2020, doi: 10.1109/ACCESS.2020.2992468.
- [6] N. Ibtehaz and D. Kihara, “Application of Sequence Embedding in Protein Sequence-Based Predictions,” in Machine Learning in Bioinformatics of Protein Sequences: Algorithms, Databases and Resources for Modern Protein Bioinformatics, 2022. doi: 10.1142/9789811258589_0002.
- [7] K. K. Yang, Z. Wu, C. N. Bedbrook, and F. H. Arnold, “Learned Protein Embeddings For Machine Learning,” Bioinformatics, vol. 34, no. 15, pp. 2642–2648, Aug. 2018, doi: 10.1093/bioinformatics/bty178.
- [8] C. Marquet et al., “Embeddings From Protein Language Models Predict Conservation And Variant Effects,” Hum Genet, vol. 141, no. 10, 2022, doi: 10.1007/s00439-021-02411-y.
- [9] M. M. Moya and D. R. Hush, “Network Constraints And Multi-Objective Optimization For One-Class Classification,” Neural Networks, vol. 9, no. 3, 1996, doi: 10.1016/0893-6080(95)00120-4.
- [10] M. Sakurada and T. Yairi, “Anomaly Detection Using Autoencoders with Nonlinear Dimensionality Reduction,” in ACM International Conference Proceeding Series, 2014. doi: 10.1145/2689746.2689747.
- [11] K. Pawar and V. Attar, “Deep Learning Model Based on Cascaded Autoencoders and One-Class Learning For Detection And Localization Of Anomalies From Surveillance Videos,” IET Biom, vol. 11, no. 4, 2022, doi: 10.1049/bme2.12064.
- [12] L. López, N. Acosta-Mendoza, and A. Gago-Alonso, “Detección De Anomalías Basada En Aprendizaje Profundo,” Revista de Ciencias Informáticas, vol. 13, no. 3, 2020.
- [13] M. V. Nallapareddy and R. Dwivedula, “ABLE: Attention Based Learning For Enzyme Classification,” Comput Biol Chem, vol. 94, p. 107558, 2021, doi: 10.1016/j.compbiolchem.2021.107558.
- [14] R. Atienza, Advanced Deep Learning with Keras. 2018.
- [15] L. Wang, H. Zhang, W. Xu, Z. Xue, and Y. Wang, “Deciphering The Protein Landscape With ProtFlash, A Lightweight Language Model,” Cell Rep Phys Sci, vol. 4, no. 10, p. 101600, 2023, doi: 10.1016/j.xcrp.2023.101600.
- [16] K. Cabello-Solorzano, I. Ortigosa de Araujo, M. Peña, L. Correia, and A. J. Tallón-Ballesteros, “The Impact of Data Normalization on the Accuracy of Machine Learning Algorithms: A Comparative Analysis,” 2023. doi: 10.1007/978-3-031-42536-3_33.
- [17] N. V. Chawla, K. W. Bowyer, L. O. Hall, and W. P. Kegelmeyer, “SMOTE: Synthetic Minority Over-Sampling Technique,” Journal of Artificial Intelligence Research, vol. 16, 2002, doi: 10.1613/jair.953.
- [18] G. Douzas, F. Bacao, and F. Last, “Improving Imbalanced Learning Through A Heuristic Oversampling Method Based On K-Means And SMOTE,” Inf Sci (N Y), vol. 465, 2018, doi: 10.1016/j.ins.2018.06.056.
- [19] H. Han, W. Y. Wang, and B. H. Mao, “Borderline-SMOTE: A New Over-Sampling Method In Imbalanced Data Sets Learning,” in Lecture Notes in Computer Science, 2005. doi: 10.1007/11538059_91.
- [20] A. Géron, Hands-on Machine Learning with Scikit-Learn, Keras and TensorFlow: Concepts, Tools, And Techniques to Build Intelligent Systems. 2022.
- [21] D. P. Kingma and J. L. Ba, “Adam: A Method For Stochastic Optimization,” in 3rd International Conference on Learning Representations, ICLR 2015 - Conference Track Proceedings, 2015.
- [22] R. Dhanuka, A. Tripathi, and J. P. Singh, “A Semi-Supervised Autoencoder-Based Approach for Protein Function Prediction,” IEEE J Biomed Health Inform, vol. 26, no. 10, pp. 4957–4965, Oct. 2022, doi: 10.1109/JBHI.2022.3163150.
- [23] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, “Dropout: A Simple Way To Prevent Neural Networks From Overfitting,” Journal of Machine Learning Research, vol. 15, 2014.
- [24] T. Dozat, “Incorporating Nesterov Momentum into Adam,” ICLR Workshop, no. 1, 2016.
- [25] H.-B. Shen and K.-C. Chou, “EzyPred: A Top–Down Approach For Predicting Enzyme Functional Classes And Subclasses,” Biochem Biophys Res Commun, vol. 364, no. 1, pp. 53–59, Dec. 2007, doi: 10.1016/j.bbrc.2007.09.098.
- [26] A. Dalkiran, A. S. Rifaioglu, M. J. Martin, R. Cetin-Atalay, V. Atalay, and T. Doğan, “ECPred: A Tool For The Prediction Of The Enzymatic Functions Of Protein Sequences Based On The EC Nomenclature,” BMC Bioinformatics, vol. 19, no. 1, Sep. 2018, doi: 10.1186/s12859-018-2368-y.
- [27] T. Sanderson, M. L. Bileschi, D. Belanger, and L. J. Colwell, “ProteInfer, Deep Neural Networks for Protein Functional Inference,” Elife, vol. 12, 2023, doi: 10.7554/eLife.80942.
Notes
Record prepared with funding from MNiSW, agreement no. POPUL/SP/0154/2024/02, under the programme "Społeczna odpowiedzialność nauki II" (Social Responsibility of Science II), module: Popularyzacja nauki (Popularisation of Science) (2025).
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-71f4194e-d0d5-4771-87dd-19956452621d