Article title

Customer’s Purchase Prediction Using Customer Segmentation Approach for Clustering of Categorical Data

Publication languages
EN
Abstracts
EN
Traditional clustering algorithms, which compute similarity from the distance between pairs of data points, are not suitable for clustering boolean and categorical attributes. In this paper, a modified clustering algorithm for categorical attributes is used to segment customers. Each segment is then mined with a frequent pattern mining algorithm to infer rules that help predict a customer’s next purchase. Purchases of items are generally related to each other: for example, grocery items are frequently purchased together, as are electronic items. Therefore, if the purchase dependencies are known, those items can be grouped together and attractive offers can be made to customers, which in turn increases the organization’s overall profit. This work focuses on grouping such items. Several experiments on a real-time database are carried out to evaluate the performance of the proposed approach.
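The two-stage idea in the abstract (segment customers on categorical attributes, then mine each segment for purchase rules) can be illustrated with a minimal sketch. The abstract does not specify the modified clustering algorithm or the frequent pattern miner, so this example substitutes a tiny k-modes-style procedure with simple-matching dissimilarity and a brute-force search over item pairs. The customer attributes, baskets, thresholds, and all function names are invented for illustration and are not taken from the paper.

```python
from collections import Counter
from itertools import combinations


def mismatch(a, b):
    """Simple-matching dissimilarity: number of attributes that differ."""
    return sum(x != y for x, y in zip(a, b))


def mode_of(rows):
    """Most frequent value of each attribute over a non-empty list of rows."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*rows))


def k_modes(rows, k, max_iter=20):
    """Tiny k-modes stand-in; the first k distinct rows seed the modes."""
    modes = list(dict.fromkeys(rows))[:k]
    labels = []
    for _ in range(max_iter):
        new_labels = [
            min(range(len(modes)), key=lambda j: mismatch(r, modes[j]))
            for r in rows
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(len(modes)):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:
                modes[j] = mode_of(members)
    return labels, modes


def pair_rules(baskets, min_support=0.5, min_confidence=0.7):
    """Rules lhs -> rhs over item pairs, filtered by support and confidence."""
    n = len(baskets)
    item_count = Counter(i for b in baskets for i in set(b))
    pair_count = Counter(
        p for b in baskets for p in combinations(sorted(set(b)), 2)
    )
    rules = []
    for (a, b), c in pair_count.items():
        if c / n < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = c / item_count[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, c / n, confidence))
    return sorted(rules)


# Hypothetical toy data: (categorical profile, purchased items).
customers = [
    (("young", "urban", "card"), {"phone", "charger"}),
    (("young", "urban", "card"), {"phone", "charger", "case"}),
    (("young", "rural", "card"), {"phone", "case"}),
    (("senior", "rural", "cash"), {"bread", "milk"}),
    (("senior", "rural", "cash"), {"bread", "milk", "eggs"}),
    (("senior", "urban", "cash"), {"bread", "eggs"}),
]

profiles = [profile for profile, _ in customers]
labels, modes = k_modes(profiles, k=2)

# Mine each segment separately, as the abstract describes.
rules_by_segment = {}
for segment in sorted(set(labels)):
    baskets = [b for (_, b), lab in zip(customers, labels) if lab == segment]
    rules_by_segment[segment] = pair_rules(baskets)

for segment, rules in rules_by_segment.items():
    for lhs, rhs, support, confidence in rules:
        print(f"segment {segment}: {lhs} -> {rhs} "
              f"(support={support:.2f}, confidence={confidence:.2f})")
```

Counting mismatched attributes avoids imposing a numeric distance on values such as "urban" and "rural", which is the limitation of distance-based clustering that the abstract points out. Mining each segment on its own lets rules surface that would be diluted in the pooled data.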
Authors
author
  • Department of Computer Science, Amity School of Engineering and Technology, Delhi, India
  • Department of Mathematics, Amity Institute of Applied Sciences, Amity University Uttar Pradesh, Noida, India-201303, phone: +98 91 402 516
Notes
Record prepared with funds from MEiN, agreement no. SONP/SP/546092/2022, under the programme "Social Responsibility of Science" - module: Popularisation of science and promotion of sport (2022-2023)
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-2ace626a-d1de-4bdd-9e60-87255a95f416