Article title
Title variants
Discriminant analysis of mine water-inrush sources based on three-dimensional interpolation of rare-event data
Publication languages
Abstracts
When water quality samples are roughly balanced across classes, the Bayesian criterion model of water-inrush sources generally identifies the source with relatively high accuracy. However, it is often difficult to achieve the desired classification results when the training samples are imbalanced, and sample imbalance is common in mine water-inrush source identification. We therefore propose a three-dimensional (3D) spatial resampling method based on rare water quality samples, which balances the water quality samples. Starting from virtual water sample points distributed on a 3D grid, the method uses 3D Inverse Distance Weighting (IDW) to interpolate the groundwater ion concentrations of the virtual water samples and thereby oversample the rare class. A case study at the Gubei Coal Mine shows that the method improves the overall discriminant accuracy of the Bayesian criterion model by 5.26%, from 85.26% to 90.69%. In particular, the discriminant precision for the rare class improves from 0% to 83.33%, which indicates that the method can improve rare-class discrimination to a large extent. In addition, the method increases the Kappa coefficient of the model by 19.92%, from 52.26% to 72.19%, raising the degree of consistency from “general” to “significant”. This research helps to enrich and improve the theory of preventing and treating mine water damage.
When water quality sample data are balanced, applying the Bayesian criterion to model water-inrush sources gives relatively accurate results in the discriminant analysis of mine water-inrush sources. With imbalanced data, however, the desired classification results are extremely difficult to obtain. Sample composition data are largely imbalanced, and this is a common problem in identifying inrush sources. This work therefore proposes a three-dimensional (3D) resampling method that uses water samples from the rare-event category to obtain a balanced data set. Based on virtual points on a three-dimensional grid, the three-dimensional Inverse Distance Weighting (IDW) method is used to interpolate groundwater ion concentrations in the virtual water samples, in order to oversample the rare-event category. A case study of the Gubei coal mine shows that the method improves the accuracy of the model based on the Bayesian criterion by 5.25% (from 85.26% to 90.96%). In particular, the accuracy of discriminating samples that belong to the rare-event category rises from 0% to 83.33%, a very considerable improvement. In addition, the Kappa coefficient increases by 19.92%, from 52.26% to 72.19%, raising the level of agreement from “general” to “significant”. Our research is of serious significance for improving the theory that underlies methods and techniques for preventing and controlling mine water inrush.
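The oversampling step described in the abstracts can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes the virtual points lie on a regular grid spanning the bounding box of the rare samples, a distance exponent of 2, and Euclidean distance. The function names `virtual_grid` and `idw_3d`, the coordinates, and the ion concentrations are all hypothetical.

```python
import numpy as np


def virtual_grid(known_xyz, n_per_axis=3):
    """Regular 3D grid of virtual sample points over the bounding box
    of the known rare samples (grid layout is an assumption)."""
    known_xyz = np.asarray(known_xyz, dtype=float)
    lo, hi = known_xyz.min(axis=0), known_xyz.max(axis=0)
    axes = [np.linspace(lo[i], hi[i], n_per_axis) for i in range(3)]
    gx, gy, gz = np.meshgrid(*axes, indexing="ij")
    return np.column_stack([gx.ravel(), gy.ravel(), gz.ravel()])


def idw_3d(known_xyz, known_values, query_xyz, power=2.0, eps=1e-12):
    """Inverse Distance Weighting in 3D.

    known_xyz:    (n, 3) coordinates of real rare-class samples
    known_values: (n, k) ion concentrations of those samples
    query_xyz:    (m, 3) coordinates of virtual samples
    Returns an (m, k) array of interpolated concentrations.
    """
    known_xyz = np.asarray(known_xyz, dtype=float)
    known_values = np.asarray(known_values, dtype=float)
    query_xyz = np.asarray(query_xyz, dtype=float)
    # Pairwise Euclidean distances, shape (m, n).
    dist = np.linalg.norm(query_xyz[:, None, :] - known_xyz[None, :, :], axis=2)
    # Clamp tiny distances so a query on top of a sample does not divide by zero;
    # that sample then dominates the weights and its value is reproduced.
    weights = 1.0 / np.maximum(dist, eps) ** power
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ known_values


# Hypothetical rare-class samples: x, y, elevation (m) and two ion concentrations (mg/L).
rare_xyz = np.array([[0.0, 0.0, 0.0],
                     [10.0, 0.0, 0.0],
                     [0.0, 10.0, -5.0],
                     [10.0, 10.0, -5.0]])
rare_ions = np.array([[120.0, 35.0],
                      [150.0, 40.0],
                      [110.0, 55.0],
                      [160.0, 60.0]])

grid_xyz = virtual_grid(rare_xyz, n_per_axis=3)    # 27 virtual sample points
synthetic = idw_3d(rare_xyz, rare_ions, grid_xyz)  # interpolated ion concentrations
```

Because each interpolated value is a convex combination of the real samples, the synthetic concentrations stay within the observed range of the rare class. The synthetic rows could then be appended to the training set before fitting the classifier.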
Publisher
Journal
Year
Volume
Pages
321--333
Physical description
Bibliography: 28 items, figures, tables, charts.
Authors
author
- School of Civil & Hydraulic Engineering, Hefei University of Technology
author
- School of Civil & Hydraulic Engineering, Hefei University of Technology
- School of Resource & Environmental Engineering, Hefei University of Technology
author
- School of Resource & Environmental Engineering, Hefei University of Technology
author
- School of Resource & Environmental Engineering, Hefei University of Technology
author
- School of Resource & Environmental Engineering, Hefei University of Technology
Bibliography
- [1] Alegria F.C., Serra A.C., 2000. Computer vision applied to the automatic calibration of measuring instruments [J]. Measurement 28 (3), 185-195.
- [2] Bagyaraj M., Ramkumar T., Venkatramanan S., 2013. Application of Remote Sensing and GIS Analysis for Identifying Groundwater Potential Zone in Parts of Kodaikanal Taluk [J]. Frontiers of Earth Science 7 (1), 65-75.
- [3] Ben Xudong, Guo Haiying, Xie Yiwei, 2006. The Application of Fuzzy Comprehensive Evaluation to Discrimination of Mine Water-Inrush Source [J]. Mining Safety & Environmental Protection (03), 57-59 (in Chinese).
- [4] Chai X., Deng L., Yang Q., 2004. Test-cost sensitive naïve Bayes classification [C]. IEEE International Conference on Data Mining, 2004 (ICDM’04). IEEE, 51-58.
- [5] Chawla N.V., Bowyer K.W., Hall L.O., 2002. SMOTE: synthetic minority over-sampling technique [J]. Journal of Artificial Intelligence Research 16 (1), 321-357.
- [6] Chen X., Wasikowski M., 2008. FAST: a ROC-based feature selection metric for small samples and imbalanced data classification problems [C]. 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 124-132.
- [7] Chen You, Chen Xueqi, Li Yang, 2007. Lightweight Intrusion Detection System Based on Feature Selection [J]. Journal of Software (07), 1639-1651 (in Chinese).
- [8] Li B., Jia Z., 2009. Some results on condition numbers of the scaled total least squares problem [J]. Linear Algebra & Its Applications 435 (3), 674-686.
- [9] Liang Tianming, Xu Xinzheng, Xiao Pengcheng, 2017. A new image classification method based on modified condensed nearest neighbor and convolutional neural networks [J]. Pattern Recognition Letters 94, 105-111.
- [10] Ma Lei, 2010. A GIS-Based System for Mine Water-Inrush Source Quick Discrimination with Comprehensive Information [J]. Hefei University of Technology (in Chinese).
- [11] Manzi M., Durrheim R.J., Hein K., 2012. 3D Edge Detection Seismic Attributes Used to Map Potential Conduits for Water and Methane in Deep Gold Mines in the Witwatersrand Basin [J]. Geophysics 77 (5), 133-147 (South Africa).
- [12] Mishra B.K., Shukla P., Madhu S.V., 2018. Prevalence of double diabetes in youth onset diabetes patients from east Delhi and neighboring NCR region [J]. Diabetes & Metabolic Syndrome: Clinical Research & Reviews 12, 839-842.
- [13] Moayedikia A., Ong K.L., Boo Y., 2017. Feature selection for high dimensional imbalanced class data using harmony search [J]. Engineering Applications of Artificial Intelligence 55, 38-49.
- [14] Pantaleoni E., 2013. Combining a Road Pollution Dispersion Model with GIS to Determine Carbon Monoxide Concentration in Tennessee [J]. Environmental Monitoring & Assessment 185 (3), 2705-2722.
- [15] Rina K., Datta P., Singh C., 2012. Characterization and Evaluation of Processes Governing the Groundwater Quality in Parts of the Sabarmati Basin, Gujarat Using Hydrochemistry Integrated with GIS [J]. Hydrological Processes 26 (10), 1538-1551.
- [16] Umar M., Waseem A., Sabir M. A., 2013. The Impact of Geology of Recharge Areas On Groundwater Quality: A Case Study of Zhob River Basin, Pakistan [J]. Clean-soil Air Water 41 (2), 119-127.
- [17] Vincenzi V., Gargini A., Goldscheider N., 2009. Using Tracer Tests and Hydrological Observations to Evaluate Effects of Tunnel Drainage On Groundwater and Surface Waters in the Northern Apennines [J]. Hydrogeology Journal 17 (1), 135-150 (Italy).
- [18] Wen Yimin, Li Jian, Du Feiming, 2009. Using Ensemble Learning Strategy to Handle Class Imbalance Problems [J]. Computing Technology and Automation 110 (02), 103-106 (in Chinese).
- [19] Weiss G.M., 1995. Learning with rare cases and small disjuncts [C]. Proceedings of the 12th International Conference on Machine Learning. San Francisco: Morgan Kaufmann, 558-565.
- [20] Wang Dachun, Zhang Renquan, Shi Yihong, 1995. Basic Hydrogeology. Beijing: Geological Publishing House, 113-115 (in Chinese).
- [21] Xu Fei, Zheng Changjiang, Yang Cheng, 2012. Identification Method of Traffic Congestion Based on Resampling [J]. Journal of Highway and Transportation Research and Development 203 (11), 140-144 (in Chinese).
- [22] Xu Wenning, Wang Pengxin, Han Ping, 2011. Application of Kappa coefficient to accuracy assessments of drought forecasting model: a case study of Guanzhong Plain [J]. Journal of Natural Disasters (06), 81-86 (in Chinese).
- [23] Ye Zhifei, Wen Yimin, Lv Baoliang, 2009. A survey of imbalanced pattern classification problems [J]. CAAI Transactions on Intelligent Systems 16 (02), 148-156 (in Chinese).
- [24] Yuan Xingmei, Yang Ming, Yang Yang, 2013. A structured SVM integrated classifier for unbalanced data [J]. PR&AI 26 (3), 215-320 (in Chinese).
- [25] Zhang Chunlei, Qian Jiazhong, Zhao Weidong, 2010. The Application of Bayesian Approach to Discrimination of Mine Water-Inrush Source [J]. Coal Geology & Exploration 220 (04), 34-37 (in Chinese).
- [26] Zhao Nan, Zhang Xiaofang, Zhang Lijun, 2018. A Survey of Unbalanced Data Classification Research [J]. Computer Science 45 (6a), 22-27 (in Chinese).
- [27] Zou Peng, Yu Bo, Wang Xianquan, 2011. Cost-Sensitive Learning Method with Data Drift in Customer Segmentation [J]. Journal of Harbin Institute of Technology 43 (01), 119-124 (in Chinese).
- [28] Zhu Qiuan, Zhang Wangchang, Yu Yunhui, 2004. The Spatial Interpolations in GIS [J]. Journal of Jiangxi Normal University (Natural Sciences Edition) (02), 183-188 (in Chinese).
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-fef440d8-7017-4197-9f56-f630347b7aa3