Article title

Comparative Study of Supervised Learning Methods for Malware Analysis

Publication languages
EN
Abstracts
EN
Malware is software designed to disrupt or even damage a computer system, or to perform other unwanted actions. Nowadays, malware is a common threat on the World Wide Web. Anti-malware protection and intrusion detection can be significantly supported by a comprehensive analysis of data on the Web. The aim of such analysis is to classify the collected data into two sets: normal and malicious data. In this paper the authors investigate the use of three supervised learning methods for data mining to support malware detection. The results of applying the Support Vector Machine, Naive Bayes and k-Nearest Neighbors techniques to the classification of data taken from devices located in many units, organizations and monitoring systems serviced by CERT Poland are described. The performance of all methods is compared and discussed. The results of the experiments show that supervised learning algorithms can be successfully applied to computer data analysis and can support computer emergency response teams in threat detection.
Year
Volume
Pages
24–33
Physical description
Bibliography: 34 items, figures, tables.
Authors
  • Research and Academic Computer Network NASK, Warsaw, Poland
  • Institute of Computer Science, Polish Academy of Sciences, Warsaw, Poland
  • Research and Academic Computer Network NASK, Warsaw, Poland
  • Institute of Control and Computation Engineering, Warsaw University of Technology, Warsaw, Poland
Bibliography
  • [1] T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning, 2nd ed. Series in Statistics, Springer, 2009.
  • [2] W. S. Noble, “Support vector machine applications in computational biology”, in Kernel Methods in Computational Biology, B. Schölkopf, K. Tsuda, and J.-P. Vert, Eds. Cambridge, USA: MIT Press, 2004, pp. 71–92.
  • [3] C. Wagner, G. Wagener, R. State, and T. Engel, “Malware analysis with graph kernels and support vector machines”, in Proc. 4th Int. Conf. Malicious and Unwanted Software MALWARE 2009, Montreal, Canada, 2009, pp. 63–68.
  • [4] M. Krzyśko, W. Wołyński, T. Gorecki, and M. Skorzybut, Systemy uczące się (Learning systems). Warszawa: Wydawnictwo Naukowo-Techniczne, 2009, pp. 107–187 (in Polish).
  • [5] K. Rieck, T. Holz, C. Willems, P. Düssel, and P. Laskov, “Learning and classification of malware behavior”, in Proc. 5th Int. Conf. DIMVA 2008, Paris, France, 2008, vol. 5137, pp. 108–125.
  • [6] M. Amanowicz and P. Gajewski, “Military communications and information systems interoperability”, in Proc. Milit. Commun. Conf. MILCOM 96, McLean, VA, USA, 1996, vol. 1–3, pp. 280–283.
  • [7] R. Kasprzyk and Z. Tarapata, “Graph-based optimization method for information diffusion and attack durability in networks”, in Rough Sets and Current Trends in Computing, LNCS, vol. 6086, pp. 698–709, Springer, 2010.
  • [8] M. Mincer and E. Niewiadomska-Szynkiewicz, “Application of social network analysis to the investigation of interpersonal connections”, J. Telecommun. Inform. Technol., no. 2, pp. 81–89, 2012.
  • [9] M. Shankarapani, K. Kancherla, S. Ramammoorthy, R. Movva, and S. Mukkamala, “Kernel machines for malware classification and similarity analysis”, in Proc. Int. Joint Conf. Neural Netw. IJCNN 2010, Barcelona, Spain, 2010, pp. 1–6.
  • [10] S. Forrest et al., “Self-nonself discrimination in a computer”, in Proc. Comp. Soc. Symp. Res. Secur. and Priv., Oakland, CA, USA, 1994, vol. 10, pp. 311–324.
  • [11] I. Liane de Oliveira, A. Ricardo, A. Gregio, and A. M. Cansian, “A malware detection system inspired on the human immune system”, in Computational Science and its Applications – ICCSA 2012, LNCS, vol. 7336, pp. 286–301, Springer, 2012.
  • [12] E. Stalmans and B. Irwin, “A framework for DNS based detection and mitigation of malware infections on a network”, in Proc. 10th Ann. Inform. Secur. South Africa Conf. ISSA 2011, Johannesburg, South Africa, 2011.
  • [13] M. Zubair Shafiq, S. Ali Khayam, and M. Farooq, “Embedded malware detection using Markov n-grams”, in Detection of Intrusions and Malware, and Vulnerability Assessment, T. Holz and H. Bos, Eds., LNCS, vol. 6739, pp. 88–107. Springer, 2008.
  • [14] M. Franklin, A. Halevy, and D. Maier, “From databases to dataspaces: A new abstraction for information management”, SIGMOD Record, vol. 34, no. 4, pp. 27–33, 2005.
  • [15] K. Lasota and A. Kozakiewicz, “Analysis of the Similarities in Malicious DNS Domain Names”, in Secure and Trust Computing, Data Management and Applications, C. Lee, J.-M. Seigneur, J. J. Park, and R.R. Wagner, Eds. Communications in Computer and Information Science, vol. 187, pp. 1–6. Springer, 2011.
  • [16] Y. Yanfang, D. Wang, T. Li, and D. Ye, “IMDS: Intelligent malware detection system”, in Proc. 13th ACM SIGKDD Int. Conf. Knowl. Discov. Data Mining KDD’07, San Jose, CA, USA, 2007, pp. 1043–1047.
  • [17] M. R. Faghani and H. Saidi, “Malware propagation in online social networks”, in Proc. Int. Conf. Malicious and Unwanted Software MALWARE 2009, Montreal, Canada, 2009, pp. 8–14.
  • [18] P. Cunningham and S. J. Delany, “k-Nearest Neighbour Classifiers”, Tech. Rep. UCD-CSI-2007-4, UCD School of Computer Science and Informatics, Dublin, 2007, pp. 1–17.
  • [19] J. M. Keller, “A fuzzy k-Nearest Neighbor algorithm”, IEEE Trans. Syst., Man, and Cybernet., vol. 15, no. 4, pp. 580–585, 1985.
  • [20] L. Jiang, H. Zhang, and Z. Cai, “A Novel Bayes Model: Hidden Naive Bayes”, IEEE Trans. Knowl. Data Engin., vol. 21, no. 10, pp. 1361–1371, 2009.
  • [21] V. N. Vapnik, The Nature of Statistical Learning Theory. New York: Springer, 1995.
  • [22] A. Bordes and L. Bottou, “The Huller: a simple and efficient online SVM”, in Machine Learning: ECML-2005, J. Gama, R. Camacho, P. Brazdil, A. Jorge, and L. Torgo, Eds., LNCS, vol. 3720, pp. 505–512. Springer, 2005.
  • [23] J. Koronacki and J. Ćwik, Statystyczne systemy uczące się (Statistical learning systems). Warsaw: Exit, 2008 (in Polish).
  • [24] T. Joachims, “Support Vector and Kernel Methods”, SIGIR-Tutorial, Cornell University Computer Science Department, 2003.
  • [25] N. Cristianini and J. Shawe-Taylor, An Introduction to Support Vector Machines, 1st ed. Cambridge University Press, 2000, pp. 25–29.
  • [26] E. Ikonomowska, D. Gorgevik, and S. Loskovska, “A survey of stream data mining”, in Proc. 8th Nat. Conf. Int. Particip. ETAI 2007, Ohrid, Republic of Macedonia, 2007, pp. 16–20.
  • [27] T. Joachims, “Training linear SVMs in linear time”, in Proc. 12th ACM SIGKDD Int. Conf. Knowl. Discov. Data Mining KDD 2006, Philadelphia, PA, USA, 2006, pp. 217–226.
  • [28] Kernel Machines homepage [Online]. Available: http://www.kernel-machines.org/
  • [29] C. Burges, “A Tutorial on Support Vector Machines for Pattern Recognition”, Knowledge Discovery and Data Mining, vol. 2, no. 2, pp. 121–167, 1998.
  • [30] T. Jebara, “Multi-task feature and kernel selection for SVMs”, in Proc. Int. Conf. on Machine Learning, Banff, Canada, 2004, pp. 55–63.
  • [31] F. R. Bach, G. Lanckriet, and M. Jordan, “Multiple kernel learning, conic duality and the SMO algorithm”, in Proc. 21st Int. Conf. Machine Learning ICML’04, Banff, Canada, 2004, pp. 6–13.
  • [32] F. R. Bach, R. Thibaux, and M. I. Jordan, “Computing regularization paths for learning multiple kernels”, in Advances in Neural Information Processing Systems, L. K. Saul, Y. Weiss, and L. Bottou, Eds. MIT Press, 2005, pp. 73–80.
  • [33] N6 Platform homepage [Online]. Available: http://www.cert.pl/news/tag/n6
  • [34] G. M. Draper, “Interactive radial visualizations for information retrieval and management”, Ph.D. Thesis, University of Utah, 2009, Chapter 3.
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-13105d82-68da-4a6e-bc1d-3024abc6ffdc