Abstract:
Quickly identifying the type of water inrush source is a key part of mine water hazard prevention and control. To accurately identify mine water sources in the Pingdingshan Coalfield, water samples were collected from different sources (surface water, Quaternary pore water, Carboniferous tuff karst water, Permian sandstone water, and Cambrian tuff karst water), and the key discriminant indexes Na⁺+K⁺, Ca²⁺, Mg²⁺, Cl⁻, SO₄²⁻, and HCO₃⁻ were selected for analysis. To avoid model overfitting caused by outliers, box plots were used to show the dispersion of the data, and twenty sets of outliers were identified and removed. The cleaned data were divided into learning and test samples at a ratio of 8:2, and the learning samples were fed into the Light Gradient Boosting Machine (LightGBM) for model training. The tree-structured Parzen estimator (TPE) was used to optimize the main parameters of LightGBM, yielding the TPE-LightGBM model. Compared with LightGBM, TPE-LightGBM improves accuracy by 13.9%, which indicates that the TPE algorithm is effective. To further validate the model, its results were compared with those of the Random Search-Multi-Layer Perceptron (RS-MLP) and Genetic Algorithm-Extreme Gradient Boosting (GA-XGBoost) models. The results show that the TPE-LightGBM model has higher accuracy and lower generalization error, which indicates that it is better suited to water source identification. The contribution of each variable was quantified using the Gini coefficient; Ca²⁺ has the highest contribution, so changes in Ca²⁺ concentration warrant particular attention.
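
The data-cleaning and splitting steps described in the abstract can be illustrated with a minimal sketch. It assumes the conventional box-plot rule, under which a value is an outlier if it lies more than 1.5 times the interquartile range beyond the quartiles; the abstract does not state the exact rule used. The function names, the `k` parameter, the random seed, and the ion concentrations are all hypothetical and serve only as illustration. They are not data from the study.

```python
import random
import statistics


def iqr_bounds(values, k=1.5):
    """Return the (lower, upper) box-plot limits; k=1.5 is an assumed convention."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def drop_outliers(samples, features, k=1.5):
    """Keep only samples whose every feature lies within its box-plot limits."""
    bounds = {f: iqr_bounds([s[f] for s in samples], k) for f in features}
    return [
        s for s in samples
        if all(bounds[f][0] <= s[f] <= bounds[f][1] for f in features)
    ]


def split_8_2(samples, seed=0):
    """Shuffle the samples and split them 8:2 into learning and test subsets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = round(0.8 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


# Hypothetical concentrations (mg/L) for two of the indexes; not study data.
ca_values = [50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 500]  # last value is an outlier
cl_values = [20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 25]
samples = [{"Ca": ca, "Cl": cl} for ca, cl in zip(ca_values, cl_values)]

cleaned = drop_outliers(samples, features=["Ca", "Cl"])
learn_set, test_set = split_8_2(cleaned)
print(len(samples), len(cleaned), len(learn_set), len(test_set))
```

In this sketch a sample is discarded if any single index falls outside its limits. That is one possible reading of "sets of outliers"; a study could equally flag outliers separately for each aquifer.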