CN-AEBench Dataset

CN-AEBench is a multi-source atmospheric and environmental dataset that integrates ground meteorological observations, environmental monitoring data, and ECMWF IFS data. This visualization platform illustrates the dataset's reliability and consistency through three representative extreme weather events in different climate zones of China, supporting its data quality and applicability for atmospheric and environmental research and machine learning applications.

📊
Data Coverage
2023-2025, ongoing
Fresh data, updated continuously
🌍
Stations
Meteorological stations: 2,250+; environmental monitoring stations: ~2,600
Nationwide coverage across China
🔬
Parameters
41+
Meteorological, air quality, and EC-NWP variables
⏱️
Resolution
1 Hour
High temporal resolution data

Case Studies: Extreme Weather Events

🌫️

Zhengzhou

Severe Fog-Haze Event

Event Date: Dec 25-31, 2023
Event Type: Fog-Haze Transition
Max PM2.5: > 200 μg/m³
Min Visibility: < 1 km
Stations: 8 stations

Analysis of a severe winter fog-haze pollution event on the North China Plain, demonstrating the dataset's capability to capture complex atmospheric processes and aerosol-meteorology interactions.

🌧️

Huizhou

Extreme Rainstorm & Typhoon

Event Date: Jul 17-22, 2024
Event Type: Convective Rainstorm
Max Wind Speed: > 12 m/s
Warning Level: Red Alert
Stations: 4 stations

Investigation of an extreme precipitation and typhoon event in the South China coastal region, showcasing the dataset's ability to capture severe convective weather systems and their environmental impacts.

🌪️

Hetian

Dust Storm Event

Event Date: May 1-7, 2025
Event Type: Sand-Dust Storm
Max PM10: > 1000 μg/m³
Warning Level: Orange Alert
Stations: 6 stations

Examination of a severe dust storm at the edge of the Taklamakan Desert, validating the dataset's performance in extreme arid conditions and its dust aerosol monitoring capabilities.


Benchmark Test Results

Evaluation Framework

🌏 Representative Regions

We selected four representative regions across China for comprehensive evaluation:

🏔️ Lanzhou & Surrounding Cities Northwest China

Characterized by arid climate, complex terrain, and frequent dust events influenced by the Gobi Desert.

🌺 Kunming & Surrounding Cities Southwest China

Located on the Yunnan-Guizhou Plateau, with a mild climate year-round and distinctive monsoon patterns.

❄️ Harbin & Surrounding Cities Northeast China

Features an extreme continental climate with harsh winters and significant seasonal variations.

🌊 Shanghai East Coast

China's economic hub, with a subtropical monsoon climate, urban heat island effects, and typhoon influences.

Models are trained and tested independently for each region to ensure robust regional performance.

📅 Data Split Strategy

Validation Set: Days 1-10 of Oct 2023, Jan/Apr/Jul/Sep/Dec 2024, Mar/Jun 2025
Test Set: Day 14 through the end of the same months
Training Set: All remaining data
Overall ratio (Training:Validation:Test): 7:1:2
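
The split rule above can be sketched as a small lookup on each hourly timestamp. This is only an illustration, not the authors' code: the function name assign_split and the constant HOLDOUT_MONTHS are hypothetical, and the sketch assumes that days 11-13 of the held-out months fall under "all remaining data" (the page does not say whether they are instead dropped as a buffer).

```python
from datetime import datetime

# (year, month) pairs named in the split description.
HOLDOUT_MONTHS = {
    (2023, 10),
    (2024, 1), (2024, 4), (2024, 7), (2024, 9), (2024, 12),
    (2025, 3), (2025, 6),
}


def assign_split(ts: datetime) -> str:
    """Return 'val', 'test', or 'train' for one timestamp."""
    if (ts.year, ts.month) in HOLDOUT_MONTHS:
        if ts.day <= 10:
            return "val"   # days 1-10 of a held-out month
        if ts.day >= 14:
            return "test"  # day 14 through the end of the month
    # Assumption: days 11-13 of held-out months, and every other
    # month, count as "remaining data" and go to training.
    return "train"
```

If the gap days are meant to be excluded to limit leakage between validation and test windows, the final branch would need a fourth label instead of "train".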

📊 Evaluation Metrics

Standard Metrics

MAE (Mean Absolute Error) and Corr (Correlation Coefficient) for basic accuracy assessment
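
For reference, both standard metrics can be computed in a few lines. The sketch below assumes that Corr means the Pearson correlation coefficient, which the page does not state explicitly; the function names are illustrative.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def pearson_corr(y_true, y_pred):
    """Pearson correlation; returns NaN if either sequence is constant."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    denom = math.sqrt(var_t * var_p)
    return cov / denom if denom > 0 else float("nan")
```

MAE is reported in the units of the variable (for example μg/m³ for PM2.5), while the correlation is unitless, so the two are complementary rather than interchangeable.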

Temporal Pattern Metric

Soft-DTW (Soft Dynamic Time Warping) measures shape similarity while allowing temporal shifts, which is crucial for capturing how an event evolves over time
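
To make the idea concrete, here is a minimal pure-Python sketch of the standard Soft-DTW recursion for two univariate series, using a squared-difference cost and a smoothing parameter gamma. The cost function, the default gamma, and any normalization used in the benchmark are not given on this page, so those choices are assumptions.

```python
import math


def _softmin(values, gamma):
    """Soft minimum: -gamma * log(sum(exp(-v / gamma))), computed stably."""
    m = min(values)
    if math.isinf(m):
        return m
    total = sum(math.exp(-(v - m) / gamma) for v in values)
    return m - gamma * math.log(total)


def soft_dtw(x, y, gamma=1.0):
    """Soft-DTW value between sequences x and y (squared-difference cost)."""
    n, m = len(x), len(y)
    inf = float("inf")
    # r[i][j] holds the soft alignment cost of x[:i] against y[:j].
    r = [[inf] * (m + 1) for _ in range(n + 1)]
    r[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = (x[i - 1] - y[j - 1]) ** 2
            r[i][j] = cost + _softmin(
                (r[i - 1][j], r[i][j - 1], r[i - 1][j - 1]), gamma
            )
    return r[n][m]
```

A peak that arrives one hour early, such as [0, 0, 1, 0, 0] against [0, 1, 0, 0, 0], has a pointwise squared error of 2 but a Soft-DTW value near zero for small gamma, which is the tolerance to temporal shifts described above. Note that the value can be slightly negative, because the soft minimum is never larger than the hard minimum.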

Extreme Event Metrics

Enhanced RQE (Relative Quantile Error) variants specifically designed for extreme weather prediction. See paper for details.

🔬 Experimental Setup

Short-term Prediction

Configuration: 24 steps → 1/4/12/24 steps

Evaluation: Lead times at 1h (nowcasting), 4h, 12h, 24h (short-term), and overall performance

Long-term Prediction

Configuration: 48 steps → 1/24/48/72/96/120 steps

Evaluation: Lead times at 1h (nowcasting), 24h, 48h, 72h, 96h, 120h (extended range), and overall performance
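
One way to read the two configurations is as sliding windows over an hourly series, scored at selected lead times. The sketch below assumes that a single forecast covers the full horizon (24 or 120 steps) and is then evaluated at the listed lead times; the page could also mean separately trained models per horizon. All names here (make_windows, mae_at_lead_times, SHORT_TERM, LONG_TERM) are hypothetical.

```python
# Assumed reading of the two setups: input length, full horizon, scored leads.
SHORT_TERM = {"input_len": 24, "horizon": 24, "lead_hours": (1, 4, 12, 24)}
LONG_TERM = {"input_len": 48, "horizon": 120,
             "lead_hours": (1, 24, 48, 72, 96, 120)}


def make_windows(series, input_len, horizon):
    """Yield (history, target) pairs with stride 1 from one hourly series."""
    last_start = len(series) - input_len - horizon
    for start in range(last_start + 1):
        split = start + input_len
        yield series[start:split], series[split:split + horizon]


def mae_at_lead_times(targets, forecasts, lead_hours):
    """MAE at each chosen lead time, plus the mean over all forecast steps."""
    scores = {}
    for h in lead_hours:
        # Lead time h corresponds to index h - 1 in each forecast.
        errs = [abs(t[h - 1] - f[h - 1]) for t, f in zip(targets, forecasts)]
        scores[f"{h}h"] = sum(errs) / len(errs)
    all_errs = [abs(a - b)
                for t, f in zip(targets, forecasts)
                for a, b in zip(t, f)]
    scores["overall"] = sum(all_errs) / len(all_errs)
    return scores
```

For example, a persistence baseline (repeating the last observed value across the horizon) can be scored by building forecasts from each history window and passing them with the matching targets and SHORT_TERM["lead_hours"] to mae_at_lead_times.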

Experiments are conducted with both TSF (Time Series Forecasting) and ST-GNN (Spatio-Temporal Graph Neural Network) models, with additional covariate ablation studies to assess multivariate prediction capabilities.