Article title

Preprocessing large datasets using Gaussian mixture modelling to improve prediction accuracy of truck productivity at mine sites

Content
Identifiers
Title variants
Publication languages
EN
Abstracts
EN
Historical datasets at operating mine sites are usually large, and building prediction models directly on them may lead to inaccurate results. To overcome this challenge, this study used Gaussian mixture modelling (GMM) to preprocess such datasets and develop a novel, accurate prediction model of truck productivity. A large truck haulage dataset collected at operating mine sites was clustered by GMM into three latent classes before the prediction model was built, and the labels of these latent classes formed a latent variable. Two multiple linear regression (MLR) models were then constructed: the ordinary MLR (O-MLR) model and the hybrid GMM-MLR model. The O-MLR model was the baseline and did not involve the latent variable, whereas the GMM-MLR model incorporated the observed input variables and the latent variable in the form of interaction terms. The GMM-MLR model performed considerably better than the O-MLR model in predicting truck productivity. The interaction terms quantified how the effects of the observed input variables on truck productivity differed among the three classes (high, medium, and low truck productivity). Haul distance was the most important input variable in the GMM-MLR model. This study provides new insights into handling massive amounts of truck haulage data and a more accurate prediction model for truck productivity.
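The workflow described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic, the variable names (`distance`, `productivity`), the small diagonal-covariance EM routine `fit_gmm_diag`, and the choice to cluster on two standardized columns are all hypothetical. It clusters the data into three classes, then compares a baseline linear model with one that adds class dummies and class-by-input interaction terms.

```python
import numpy as np


def fit_gmm_diag(X, k, n_iter=100, seed=0):
    """EM for a diagonal-covariance Gaussian mixture; returns hard class labels."""
    rng = np.random.default_rng(seed)
    n, _ = X.shape
    means = X[rng.choice(n, size=k, replace=False)]
    variances = np.tile(X.var(axis=0), (k, 1))
    weights = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: responsibilities computed in log space for numerical stability.
        diff = X[:, None, :] - means[None, :, :]
        log_pdf = -0.5 * (
            np.log(2.0 * np.pi * variances)[None] + diff**2 / variances[None]
        ).sum(axis=2)
        log_r = np.log(weights)[None] + log_pdf
        log_r -= log_r.max(axis=1, keepdims=True)
        resp = np.exp(log_r)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: update mixing weights, means, and variances (with a floor).
        nk = resp.sum(axis=0) + 1e-12
        weights = nk / nk.sum()
        means = (resp.T @ X) / nk[:, None]
        variances = np.maximum((resp.T @ X**2) / nk[:, None] - means**2, 1e-6)
    return resp.argmax(axis=1)


def r_squared(design, y):
    """In-sample R^2 of an ordinary least-squares fit."""
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    centred = y - y.mean()
    return 1.0 - (resid @ resid) / (centred @ centred)


# Synthetic stand-in for a haulage dataset: three regimes with different slopes.
rng = np.random.default_rng(42)
n_per = 300
regime = np.repeat([0, 1, 2], n_per)
distance = rng.uniform(1.0, 10.0, size=3 * n_per)  # hypothetical haul distance
intercepts = np.array([400.0, 250.0, 120.0])
slopes = np.array([-20.0, -12.0, -5.0])
productivity = (
    intercepts[regime] + slopes[regime] * distance + rng.normal(0.0, 5.0, 3 * n_per)
)

# Step 1: cluster the standardized data into three latent classes.
data = np.column_stack([distance, productivity])
data_std = (data - data.mean(axis=0)) / data.std(axis=0)
labels = fit_gmm_diag(data_std, k=3)

# Step 2: baseline model (intercept + distance only).
ones = np.ones_like(distance)
baseline_design = np.column_stack([ones, distance])

# Step 3: hybrid model = baseline columns + class dummies + interactions.
z1 = (labels == 1).astype(float)
z2 = (labels == 2).astype(float)
hybrid_design = np.column_stack(
    [ones, distance, z1, z2, distance * z1, distance * z2]
)

r2_baseline = r_squared(baseline_design, productivity)
r2_hybrid = r_squared(hybrid_design, productivity)
print(f"baseline R^2: {r2_baseline:.3f}, hybrid R^2: {r2_hybrid:.3f}")
```

Because the hybrid design matrix contains the baseline columns, its in-sample R² cannot be lower than the baseline's, so a fair comparison of the two models would use held-out data.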
Volume
Pages
661-680
Physical description
Bibliography: 65 items, figures, tables, charts.
Authors
author
  • University of Alberta, Department of Civil and Environmental Engineering, Edmonton, Alberta T6G 2E3, CANADA
author
  • University of Alberta, Department of Mathematical and Statistical Sciences, Edmonton, Alberta T6G 2G1, CANADA
author
  • University of Alberta, Department of Mathematical and Statistical Sciences, Edmonton, Alberta T6G 2G1, CANADA
  • University of Alberta, Department of Mathematical and Statistical Sciences, Edmonton, Alberta T6G 2G1, CANADA
Bibliography
  • [1] S. Sleep, I.J. Laurenzi, J.A. Bergerson, H.L. MacLean, Evaluation of variability in greenhouse gas intensity of Canadian oil sands surface mining and upgrading operations. Environ. Sci. & Technol. 52 (20), 11941-11951 (2018). DOI: https://doi.org/10.1021/acs.est.8b03974.
  • [2] CAPP, A strong energy sector is key to ensure Canada’s prosperity for the future. Canadian Association of Petroleum Producers (CAPP) (2018). https://www.capp.ca/economy/canadian-economic-contribution.
  • [3] A.K. Katta, M. Davis, V. Subramanyam, A.F. Dar, M.A.H. Mondal, M. Ahiduzzaman, A. Kumar, Assessment of energy demand-based greenhouse gas mitigation options for Canada’s oil sands. J. Clean. Prod. 241, 118306 (2019). DOI: https://doi.org/10.1016/j.jclepro.2019.118306.
  • [4] P. Bodziony, Z. Kasztelewicz, P. Sawicki, The problem of multiple criteria selection of the surface mining haul trucks. Arch. Min. Sci. 61 (2), 223-243 (2016). DOI: http://doi.org/10.1515/amsc-2016-0017.
  • [5] S. Alarie, M. Gamache, Overview of solution strategies used in truck dispatching systems for open pit mines. Int. J. Surf. Min. Reclam. Environ. 16 (1), 59-76 (2002). DOI: https://doi.org/10.1076/ijsm.16.1.59.3408.
  • [6] P.J. Bartos, Is mining a high-tech industry?: Investigations into innovation and productivity advance. Resour. Policy 32 (4), 149-158 (2007). DOI: https://doi.org/10.1016/j.resourpol.2007.07.001.
  • [7] E.K. Chanda, S. Gardiner, A comparative study of truck cycle time prediction methods in open‐pit mining. Eng. Constr. Archit. Manag. 17 (5), 446-460 (2010). DOI: https://doi.org/10.1108/09699981011074556.
  • [8] Y. Gui, Z. Tao, C. Wang, X. Xie, Study on remote monitoring system for landslide hazard based on wireless sensor network and its application. J. Coal Sci. Eng. 17, 464-468 (2011). DOI: https://doi.org/10.1007/s12404-011-0422-8.
  • [9] Q. Gu, C. Lu, J. Guo, S. Jing, Dynamic management system of ore blending in an open pit mine based on GIS/GPS/GPRS. Min. Sci. Technol. 20 (1), 132-137 (2010). DOI: https://doi.org/10.1016/S1674-5264(09)60174-5.
  • [10] V. Sabniveesu, A. Kavuri, R. Kavi, V. Kulathumani, V. Kecojevic, A. Nimbarte, Use of wireless, ad-hoc networks for proximity warning and collision avoidance in surface mines. Int. J. Min. Reclam. Environ. 29 (5), 331-346 (2015). DOI: https://doi.org/10.1080/17480930.2015.1086550.
  • [11] E. Siami-Irdemoosa, S.R. Dindarloo, Prediction of fuel consumption of mining dump trucks: A neural networks approach. Appl. Energy 151, 77-84 (2015). DOI: https://doi.org/10.1016/j.apenergy.2015.04.064.
  • [12] K. Zhang, S. Ji, Y. Zhang, J. Zhang, R. Pan, MEMS inertial sensor for strata stability monitoring in underground mining: An experimental study. Shock Vib. 2018, 4895862 (2018). DOI: https://doi.org/10.1155/2018/4895862.
  • [13] J. Baek, Y. Choi, Deep neural network for predicting ore production by truck-haulage systems in open-pit mines. Appl. Sci. 10 (5), 1657 (2020). DOI: https://doi.org/10.3390/app10051657.
  • [14] M.A. Shahin, H.R. Maier, M.B. Jaksa, Data division for developing neural networks applied to geotechnical engineering. J. Comput. Civ. Eng. 18 (2), 105-114 (2004). DOI: https://doi.org/10.1061/(ASCE)0887-3801(2004)18:2(105).
  • [15] S.R. Dindarloo, E. Siami-Irdemoosa, Data mining in mining engineering: results of classification and clustering of shovels failures data. Int. J. Min. Reclam. Environ. 31 (2), 105-118 (2017). DOI: https://doi.org/10.1080/17480930.2015.1123599.
  • [16] M.S. Alam, S. Paul, A comparative analysis of clustering algorithms to identify the homogeneous rainfall gauge stations of Bangladesh. J. Appl. Stat. 47 (8), 1460-1481 (2020). DOI: https://doi.org/10.1080/02664763.2019.1675606.
  • [17] J. Yang, C. Ning, C. Deb, F. Zhang, D. Cheong, S.E. Lee, C. Sekhar, K.W. Tham, K-shape clustering algorithm for building energy usage patterns analysis and forecasting model accuracy improvement. Energy Build. 146, 27-37 (2017). DOI: https://doi.org/10.1016/j.enbuild.2017.03.071.
  • [18] L. Tu, Y. Lv, Y. Zhang, X. Cao, Logistics service provider selection decision making for healthcare industry based on a novel weighted density-based hierarchical clustering. Adv. Eng. Inform. 48, 101301 (2021). DOI: https://doi.org/10.1016/j.aei.2021.101301.
  • [19] X. Wang, H.J. Hamilton, A comparative study of two density-based spatial clustering algorithms for very large datasets, in: B. Kégl, G. Lapalme (Eds.) Advances in Artificial Intelligence. Springer Berlin Heidelberg, Berlin, Heidelberg (2005). https://doi.org/10.1007/11424918_14.
  • [20] L. Zhang, S.-K. Oh, W. Pedrycz, B. Yang, Y. Han, Building fuzzy relationships between compressive strength and 3D microstructural image features for cement hydration using Gaussian mixture model-based polynomial radial basis function neural networks. Appl. Soft Comput. 112, 107766 (2021). DOI: https://doi.org/10.1016/j.asoc.2021.107766.
  • [21] J. Diaz-Rozo, C. Bielza, P. Larrañaga, Machine-tool condition monitoring with Gaussian mixture models-based dynamic probabilistic clustering. Eng. Appl. Artif. Intell. 89, 103434 (2020). DOI: https://doi.org/10.1016/j.engappai.2019.103434.
  • [22] C.M. Bishop, Pattern Recognition and Machine Learning. Springer-Verlag, New York (2006).
  • [23] J.A. Rice, Mathematical Statistics and Data Analysis. Duxbury Press, Belmont, CA (1995).
  • [24] E.G. Cervantes, S.P. Upadhyay, H. Askari-Nasab, Improvements to production planning in oil sands mining through analysis and simulation of truck cycle times. Mining Optimization Laboratory (MOL), University of Alberta, 142-156 (2019).
  • [25] F. Ge, Y. Ju, Z. Qi, Y. Lin, Parameter estimation of a Gaussian mixture model for wind power forecast error by Riemann L-BFGS optimization. IEEE Access 6, 38892-38899 (2018). DOI: https://doi.org/10.1109/ACCESS.2018.2852501.
  • [26] Y. Lu, Z. Tian, P. Peng, J. Niu, W. Li, H. Zhang, GMM clustering for heating load patterns in-depth identification and prediction model accuracy improvement of district heating system. Energy Build. 190, 49-60 (2019). DOI: https://doi.org/10.1016/j.enbuild.2019.02.014.
  • [27] L. Ni, D. Wang, J. Wu, Y. Wang, Y. Tao, J. Zhang, J. Liu, Streamflow forecasting using extreme gradient boosting model coupled with Gaussian mixture model. J. Hydrol. 586, 124901 (2020). DOI: https://doi.org/10.1016/j.jhydrol.2020.124901.
  • [28] G.H. Lubke, J. Luningham, Fitting latent variable mixture models. Behav. Res. Ther. 98, 91-102 (2017). DOI: https://doi.org/10.1016/j.brat.2017.04.003.
  • [29] O.E. Parsons, A Gaussian mixture model approach to classifying response types, in: N. Bouguila, W. Fan (Eds.) Mixture Models and Applications. Springer International Publishing, Cham (2020). DOI: https://doi.org/10.1007/978-3-030-23876-6_1.
  • [30] L. Ye, Y. Zhang, C. Zhang, P. Lu, Y. Zhao, B. He, Combined Gaussian mixture model and cumulants for probabilistic power flow calculation of integrated wind power network. Comput. Electr. Eng. 74, 117-129 (2019). DOI: https://doi.org/10.1016/j.compeleceng.2019.01.010.
  • [31] K.S. Berlin, N.A. Williams, G.R. Parra, An introduction to latent variable mixture modeling (part 1): Overview and cross-sectional latent class and latent profile analyses. J. Pediatr. Psychol. 39 (2), 174-187 (2013). DOI: https://doi.org/10.1093/jpepsy/jst084.
  • [32] G. Ciulla, A. D’Amico, Building energy performance forecasting: A multiple linear regression approach. Appl. Energy 253, 113500 (2019). DOI: https://doi.org/10.1016/j.apenergy.2019.113500.
  • [33] L. Wu, C. Hu, W.V. Liu, Forecasting the deterioration of cement-based mixtures under sulfuric acid attack using support vector regression based on Bayesian optimization. SN Appl. Sci. 2, 1970 (2020). DOI: https://doi.org/10.1007/s42452-020-03778-9.
  • [34] W. Tian, Y. Liu, Y. Heo, D. Yan, Z. Li, J. An, S. Yang, Relative importance of factors influencing building energy in urban environment. Energy 111, 237-250 (2016). DOI: https://doi.org/10.1016/j.energy.2016.05.106.
  • [35] S. Dhulipala, G.R. Patil, Freight production of agricultural commodities in India using multiple linear regression and generalized additive modelling. Transp. Policy 97, 245-258 (2020). DOI: https://doi.org/10.1016/j.tranpol.2020.06.012.
  • [36] Q. Tan, Y. Wei, M. Wang, Y. Liu, A cluster multivariate statistical method for environmental quality management. Eng. Appl. Artif. Intell. 32, 1-9 (2014). DOI: https://doi.org/10.1016/j.engappai.2014.02.007.
  • [37] M. Maaouane, S. Zouggar, G. Krajačić, H. Zahboune, Modelling industry energy demand using multiple linear regression analysis based on consumed quantity of goods. Energy 225, 120270 (2021). DOI: https://doi.org/10.1016/j.energy.2021.120270.
  • [38] F. Leisch, FlexMix: A general framework for finite mixture models and latent class regression in R. J. Stat. Softw. 11 (8), 1-18 (2004). DOI: https://doi.org/10.18637/jss.v011.i08.
  • [39] Y. Fu, X. Liu, S. Sarkar, T. Wu, Gaussian mixture model with feature selection: An embedded approach. Comput. Ind. Eng. 152, 107000 (2021). DOI: https://doi.org/10.1016/j.cie.2020.107000.
  • [40] Y. Li, E. Schofield, M. Gönen, A tutorial on Dirichlet process mixture modeling. J. Math. Psychol. 91, 128-144 (2019). DOI: https://doi.org/10.1016/j.jmp.2019.04.004.
  • [41] J.S. Russell, A.E. Raftery, Performance of Bayesian model selection criteria for Gaussian mixture models. Department of Statistics, University of Washington (2009).
  • [42] G.J. McLachlan, S.X. Lee, S.I. Rathnayake, Finite mixture models. Annu. Rev. Stat. Appl. 6, 355-378 (2019). DOI: https://doi.org/10.1146/annurev-statistics-031017-100325.
  • [43] MEP, Current and historical Alberta weather station data viewer. Ministry of Environment and Parks (MEP), Government of Alberta (2019). https://acis.alberta.ca/weather-data-viewer.jsp.
  • [44] Z. Ma, H. Li, Q. Sun, C. Wang, A. Yan, F. Starfelt, Statistical analysis of energy consumption patterns on the heat demand of buildings in district heating systems. Energy Build. 85, 464-472 (2014). DOI: https://doi.org/10.1016/j.enbuild.2014.09.048.
  • [45] M. Mittlböck, Calculating adjusted R² measures for Poisson regression models. Comput. Methods Programs Biomed. 68 (3), 205-214 (2002). DOI: https://doi.org/10.1016/S0169-2607(01)00173-0.
  • [46] U. Groemping, Relative importance for linear regression in R: The package relaimpo. J. Stat. Softw. 17 (1), 1-27 (2006). DOI: https://doi.org/10.18637/jss.v017.i01.
  • [47] K. Patil, N.K. Nagwani, S. Tripathi, A parametric study of partitioning and density based clustering techniques for boxplot generation. 2018 3rd International Conference for Convergence in Technology (I2CT), 1-5 (2018). DOI: https://doi.org/10.1109/I2CT.2018.8529468.
  • [48] L. Wei, Empirical Bayes test of regression coefficient in a multiple linear regression model. Acta Math. Appl. Sin. 6, 251-262 (1990). DOI: https://doi.org/10.1007/BF02019151.
  • [49] C. Schexnayder, S.L. Weber, B.T. Brooks, Effect of truck payload weight on production. J. Constr. Eng. Manag. 125 (1), 1-7 (1999). DOI: https://doi.org/10.1061/(ASCE)0733-9364(1999)125:1(1).
  • [50] Z. Ge, Effectiveness of the T-test in multiple linear regression modeling of environmental systems. Environ. Eng. Sci. 26 (2), 377-384 (2008). DOI: https://doi.org/10.1089/ees.2008.0014.
  • [51] K. Iqbal, D. Sun, Development of thermo-regulating polypropylene fibre containing microencapsulated phase change materials. Renew. Energy 71, 473-479 (2014). DOI: https://doi.org/10.1016/j.renene.2014.05.063.
  • [52] J. Jaccard, R. Turrisi, Interaction Effects in Multiple Regression. Sage, Thousand Oaks, CA (2003).
  • [53] R.L. Moy, L.S. Chen, L.J. Kao, Multiple Linear Regression, in: R.L. Moy, L.S. Chen, L.J. Kao (Eds.) Study Guide for Statistics for Business and Financial Economics: A Supplement to the Textbook by Cheng-Few Lee. John C. Lee and Alice C. Lee, Springer International Publishing, Cham, 223-240 (2015). DOI: https://doi.org/10.1007/978-3-319-11997-7_15.
  • [54] D. Kyburz, C. Gabay, B.A. Michel, A. Finckh, The long-term impact of early treatment of rheumatoid arthritis on radiographic progression: a population-based cohort study. Rheumatology 50 (6), 1106-1110 (2011). DOI: https://doi.org/10.1093/rheumatology/keq424.
  • [55] M. Lunt, Introduction to statistical modelling 2: categorical variables and interactions in linear regression. Rheumatology 54 (7), 1141-1144 (2015). DOI: https://doi.org/10.1093/rheumatology/ket172.
  • [56] Y. Liu, J. Wang, Z. Wang, X. Lu, M. Avdeev, S. Shi, C. Wang, T. Yu, Predicting creep rupture life of Ni-based single crystal superalloys using divide-and-conquer approach based machine learning. Acta Mater. 195, 454-467 (2020). DOI: https://doi.org/10.1016/j.actamat.2020.05.001.
  • [57] M. Ho Park, M. Ju, S. Jeong, J. Young Kim, Incorporating interaction terms in multivariate linear regression for post-event flood waste estimation. Waste Manag. 124, 377-384 (2021). DOI: https://doi.org/10.1016/j.wasman.2021.02.004.
  • [58] V.F. Navarro Torres, J. Ayres, P.L.A. Carmo, C.G.L. Silveira, Haul productivity optimization: An assessment of the optimal road grade. In: E. Widzyk-Capehart, A. Hekmat, R. Singhal (Eds.) Proceedings of the 27th International Symposium on Mine Planning and Equipment Selection - MPES 2018, Springer International Publishing, Cham, 345-353 (2019). DOI: https://doi.org/10.1007/978-3-319-99220-4_28.
  • [59] X. Sun, H. Zhang, F. Tian, L. Yang, The use of a machine learning method to predict the real-time link travel time of open-pit trucks. Math. Probl. Eng. 2018, 4368045 (2018). DOI: https://doi.org/10.1155/2018/4368045.
  • [60] A.S. Shirkhorshidi, S. Aghabozorgi, T.Y. Wah, T. Herawan, Big data clustering: A review, in: B. Murgante, S. Misra, A.M.A.C. Rocha, C. Torre, J.G. Rocha, M.I. Falcão, D. Taniar, B.O. Apduhan, O. Gervasi (Eds.) Computational Science and Its Applications – ICCSA 2014. Springer International Publishing, Cham, 707-720 (2014). DOI: https://doi.org/10.1007/978-3-319-09156-3_49.
  • [61] C. Wu, K.W. Chau, Y. Li, Predicting monthly streamflow using data-driven models coupled with data-preprocessing techniques. Water Resour. Res. 45, W08432 (2009). DOI: https://doi.org/10.1029/2007WR006737.
  • [62] S. Ma, G. Huang, K. Obais, S.W. Moon, W.V. Liu, Hysteresis loss of ultra-large off-the-road tire rubber compounds based on operating conditions at mine sites. Proc. Inst. Mech. Eng. D: J. Automob. Eng. 236 (2-3), 439-450 (2022). DOI: https://doi.org/10.1177/09544070211015525.
  • [63] K. Drosou, C. Koukouvinos, Proximal support vector machine techniques on medical prediction outcome. J. Appl. Stat. 44 (3), 533-553 (2017). DOI: https://doi.org/10.1080/02664763.2016.1177499.
  • [64] M. Cakir, M.A. Guvenc, S. Mistikoglu, The experimental application of popular machine learning algorithms on predictive maintenance and the design of IIoT based condition monitoring system. Comput. Ind. Eng. 151, 106948 (2021). DOI: https://doi.org/10.1016/j.cie.2020.106948.
  • [65] R. Tadeusiewicz, Neural networks in mining sciences – general overview and some representative examples. Arch. Min. Sci. 60 (4), 971-984 (2015). DOI: https://doi.org/10.1515/amsc-2015-0064.
Notes
Record prepared with funds from MEiN, agreement no. SONP/SP/546092/2022, under the programme "Społeczna odpowiedzialność nauki" (Social Responsibility of Science) - module: Popularization of science and promotion of sport (2022-2023)
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-f88f25aa-0ed6-43e5-bc02-9da1b730574c