Comparison of Performance of Machine Learning Models on PM2.5 Concentration Estimation Under Different Variable Combination Scenarios—Taking Guanzhong Region of Shaanxi, China as an Example

Journal of Earth Sciences and Environment [ISSN: 1672-6561/CN: 61-1423/P]

Issue:
2025, No. 04
Page:
829-843
Research Field:
Special Issue on Ecological Protection and High-quality Development of the Yellow River Basin (Part II)
Publishing date:

Info

Title:
Comparison of Performance of Machine Learning Models on PM2.5 Concentration Estimation Under Different Variable Combination Scenarios—Taking Guanzhong Region of Shaanxi, China as an Example
Author(s):
XU Cui-ling1*, HU Xue2, YUAN Bing3, GUO Can1, ZHAO Li-hua1
(1. School of Geological Engineering and Geomatics, Chang'an University, Xi'an 710054, Shaanxi, China; 2. Survey Office of the National Bureau of Statistics in Shanxi, Taiyuan 030001, Shanxi, China; 3. PowerChina Fujian Electric Power Engineering Co., Ltd., Fuzhou 350003, Fujian, China)
Keywords:
atmospheric environment; PM2.5 concentration; RF model; GBT model; LightGBM model; multivariate combination; machine learning; Shaanxi
CLC number:
X513; TP181
DOI:
10.19814/j.jese.2024.12020
Abstract:
Obtaining continuous PM2.5 concentration data with high resolution and high precision helps reveal the spatial pattern of air quality and is of great significance to environmental governance, air pollution prevention and sustainable economic development. Based on PM2.5 ground monitoring data, aerosol optical depth (AOD) data, meteorological data, geographic data and co-monitored pollutant data in the Guanzhong region of Shaanxi from 2020 to 2022, 11 variable combination scenarios were designed according to the variable attributes. Random forest (RF), gradient boosting tree (GBT) and light gradient boosting machine (LightGBM) models were constructed under each scenario to estimate PM2.5 concentration in the Guanzhong region, and the accuracy of the three models' estimates was compared across scenarios. The results show that ① the estimation performance of the three models decreases in the order of multivariate, bivariate and univariate combination scenarios; ② under the same scenario, the LightGBM model performs best among the three models, and this model under the multivariate combination scenario gives the best fit among the 11 variable combination scenarios, with a determination coefficient (R²) of 0.94, a root mean square error (RMSE) of 9.31 μg·m⁻³ and a mean absolute error (MAE) of 6.27 μg·m⁻³; ③ compared with the ChinaHighPM2.5 and VANPM2.5 datasets, the result estimated by the LightGBM model under the multivariate combination scenario is not only consistent in spatial distribution with the PM2.5 concentration in the Guanzhong region during 2020-2022 from those two datasets, but also offers advantages in depicting detail and in estimation accuracy, which can improve the accuracy and reliability of the estimation results.
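
The three accuracy measures named in the abstract can be illustrated with a minimal Python sketch. Several things here are assumptions rather than details confirmed by the abstract: R² is computed as 1 − SS_res/SS_tot (the paper may use a squared correlation instead); the four predictor group names are hypothetical labels for the data sources listed above; and the scenario design (every single group, every pair, and all groups together) is only one plausible way to arrive at 11 scenarios. The sample values are invented for illustration and are not data from the study.

```python
import math
from itertools import combinations

# Hypothetical labels for the predictor data sources named in the abstract.
PREDICTOR_GROUPS = ["AOD", "meteorological", "geographic", "co_pollutants"]


def build_scenarios(groups):
    """Assumed design: each single group, each pair, and all groups together."""
    univariate = [(g,) for g in groups]
    bivariate = list(combinations(groups, 2))
    multivariate = [tuple(groups)]
    return univariate + bivariate + multivariate


def evaluate(observed, estimated):
    """Return (R2, RMSE, MAE) for paired observed and estimated values."""
    if len(observed) != len(estimated) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    mean_obs = sum(observed) / n
    # Residual and total sums of squares.
    ss_res = sum((o - e) ** 2 for o, e in zip(observed, estimated))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all observed values are equal")
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(o - e) for o, e in zip(observed, estimated)) / n
    return r2, rmse, mae


scenarios = build_scenarios(PREDICTOR_GROUPS)

# Invented concentrations in ug/m^3, for illustration only.
observed = [35.0, 50.0, 80.0, 20.0]
estimated = [30.0, 55.0, 75.0, 25.0]
r2, rmse, mae = evaluate(observed, estimated)
print(len(scenarios), round(r2, 3), round(rmse, 2), round(mae, 2))
```

In a comparison of this kind, `evaluate` would be applied to the held-out estimates of each model under each scenario, and the resulting triples ranked by R² (higher is better) and by RMSE and MAE (lower is better).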

Last Update: 2025-07-25