An improved method of groundwater model structural uncertainty analysis
Abstract: Gaussian process regression (GPR) is a supervised learning algorithm based on Bayesian theory and is widely used in data-driven method (DDM) analyses of model structural uncertainty. Existing studies usually assume that the physical parameters and the hyperparameters are independent and identify them jointly, which leads to parameter compensation. This paper proposes a two-stage identification DDM to quantify model structural error. Using two groundwater model case studies, the parameter identification and model prediction results are compared for three settings: ignoring model structural error, and accounting for it with either the joint-identification DDM or the two-stage DDM. The results show that when parameters are identified without accounting for structural error, the physical parameters are overfitted to compensate for that error, which degrades model predictions. When a DDM is used to characterize structural bias, the assumption that physical parameters and hyperparameters are independent affects the parameter identification results. The proposed two-stage DDM does not make this independence assumption. It reduces parameter overfitting, characterizes the structural error more accurately, and effectively improves the predictive performance of the model.

Keywords:
- model structural uncertainty
- data-driven method
- Gaussian process regression
- Markov chain Monte Carlo simulation
- solute transport
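As a rough illustration of how a GPR-based error model of the kind described in the abstract can represent structural error, the sketch below fits a Gaussian process to the residuals between observations and a physical model and returns the posterior mean and variance at new points. It is a generic textbook GPR with a squared-exponential kernel, not the paper's two-stage procedure. The function names, the synthetic residuals, and the mapping of the hyperparameters to a length scale, a signal standard deviation and a noise standard deviation are assumptions made for illustration.

```python
import numpy as np


def sq_exp_kernel(xa, xb, length_scale, signal_std):
    """Squared-exponential covariance between two 1-D point sets (assumed kernel)."""
    d = xa[:, None] - xb[None, :]
    return signal_std ** 2 * np.exp(-0.5 * (d / length_scale) ** 2)


def gpr_predict(x_train, r_train, x_test, length_scale, signal_std, noise_std):
    """Posterior mean and variance of a zero-mean GP fitted to residuals r_train."""
    K = sq_exp_kernel(x_train, x_train, length_scale, signal_std)
    K = K + noise_std ** 2 * np.eye(len(x_train))  # add observation noise
    K_s = sq_exp_kernel(x_test, x_train, length_scale, signal_std)
    K_ss_diag = np.full(len(x_test), signal_std ** 2)  # prior variance at test points

    # Cholesky factorisation gives a numerically stable solve of K^{-1} r
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, r_train))
    mean = K_s @ alpha

    v = np.linalg.solve(L, K_s.T)
    var = K_ss_diag - np.sum(v ** 2, axis=0)
    return mean, var


# Synthetic residuals (observation minus physical model output); hypothetical data
x = np.linspace(0.0, 10.0, 20)
residual = 0.5 * np.sin(x)

# Hypothetical hyperparameter values: length scale, signal std, noise std
mean, var = gpr_predict(x, residual, x, 1.0, 1.0, 0.1)

# A bias-corrected prediction would add `mean` to the physical model output
```

In a Bayesian calibration the three hyperparameters would be sampled alongside, or after, the physical parameters rather than fixed as they are here.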
Table 1 Prior distributions of model parameters

| Parameter | Prior distribution |
| --- | --- |
| Co/(mol·L−1) | Uniform on [45.0, 52.0] |
| V/(cm·d−1) | Uniform on [45.0, 52.0] |
| λ | Gamma, k=5, θ=0.2, on [0.1, 0.8] |
| σ | Exponential, μ=0.25, on [3.0, 10.0] |
| σδ | Uniform on [0.1, 0.5] |
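Table 2 Statistics of model prediction performance

| Method | Identification period RMSE | Identification period MAE | Identification period MRE | Validation period RMSE | Validation period MAE | Validation period MRE |
| --- | --- | --- | --- | --- | --- | --- |
| Structural error not considered | 4.6056 | 4.1056 | 0.1974 | 8.9583 | 6.7699 | 0.2279 |
| Joint-identification DDM | 5.3165 | 4.9521 | 0.2225 | 8.2356 | 6.3889 | 0.2240 |
| Two-stage DDM | 4.6500 | 4.2916 | 0.2094 | 7.7770 | 5.6255 | 0.2068 |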
Table 3 Prior distributions of model parameters

| Parameter | Prior distribution |
| --- | --- |
| Krr/(m·d−1) | Uniform on [8.0, 14.0] |
| M/d | Uniform on [75.0, 90.0] |
| λ | Gamma, k=5, θ=0.2, on [0.2, 0.6] |
| σ | Exponential, μ=0.25, on [5.0, 10.0] |
| σδ | Uniform on [0.05, 0.5] |
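Table 4 Statistics of model prediction performance

| Method | Identification period RMSE | Identification period MAE | Identification period MRE | Validation period RMSE | Validation period MAE | Validation period MRE |
| --- | --- | --- | --- | --- | --- | --- |
| Structural error not considered | 5.2084 | 3.5591 | 0.2226 | 3.5764 | 3.1728 | 0.4488 |
| Joint-identification DDM | 8.1380 | 7.9843 | 0.6576 | 1.8912 | 1.5249 | 0.2253 |
| Two-stage DDM | 8.0390 | 7.7821 | 0.6315 | 1.7184 | 1.3589 | 0.1954 |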
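The performance tables report RMSE, MAE and MRE for the identification and validation periods. The sketch below shows one common way to compute these three metrics. The exact definitions used in the paper are not given here, so the formulas, in particular MRE as the mean of absolute error divided by the absolute observed value, are assumptions, and the function name and sample values are hypothetical.

```python
import numpy as np


def prediction_metrics(observed, simulated):
    """Return (RMSE, MAE, MRE) under the assumed definitions described above."""
    observed = np.asarray(observed, dtype=float)
    simulated = np.asarray(simulated, dtype=float)
    err = simulated - observed

    rmse = np.sqrt(np.mean(err ** 2))               # root mean square error
    mae = np.mean(np.abs(err))                      # mean absolute error
    mre = np.mean(np.abs(err) / np.abs(observed))   # mean relative error (assumed form)
    return rmse, mae, mre


# Hypothetical observations and model predictions, for illustration only
obs = [10.0, 20.0, 40.0]
sim = [12.0, 18.0, 44.0]
rmse, mae, mre = prediction_metrics(obs, sim)
```

Under this definition MRE is undefined when an observed value is zero, so such points would need to be excluded or handled separately.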