Code for paper "Thunderstorm nowcasting with deep learning: a multi-hazard data fusion model"
https://github.com/MeteoSwiss/c4dl-multi
4 Citation
Leinonen, J., Hamann, U., Sideris, I. V., & Germann, U. (2023). Thunderstorm Nowcasting With Deep Learning: A Multi‐Hazard Data Fusion Model. Geophysical Research Letters, 50(8), e2022GL101626. https://doi.org/10.1029/2022GL101626
5 Data Sources
Weather radar observations were collected from the Swiss operational network (Germann et al., 2016, 2022). These data include radar-measured information about the precipitation rate and the vertical structure of the radar reflectivity, such as echo top heights and the vertically integrated liquid water content, at 1 km horizontal resolution.
Geostationary satellite imagery was obtained from the Spinning Enhanced Visible and InfraRed Imager (SEVIRI; Schmid, 2000) on the MeteoSat Second Generation 3 (MSG-3) satellite. We used the radiances and brightness temperatures from the visible and infrared (IR) bands; the native resolution of these in the study area is approximately 1 km × 2 km for the high-resolution visible (HRV) band and 3 km × 5 km for the others. The bands that consist mostly of reflected solar radiation were normalized with the function f(x) = x/cos θ, where θ is the solar zenith angle; these bands are unavailable at night. Furthermore, we used the Nowcasting Satellite Application Facility (NWCSAF) cloud top height, cloud top temperature, cloud optical thickness and cloud top phase products (Derrien & Le Gléau, 2005; Hamann et al., 2014; Le Gléau, 2016).
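The solar-zenith normalization f(x) = x/cos θ described above can be sketched as follows. This is a minimal illustration, not the paper's preprocessing code: the function name and the cutoff angle `max_zenith_deg` are assumptions introduced here to represent "unavailable at night", since the text does not state the threshold that was actually used.

```python
import math


def normalize_solar_band(value, zenith_deg, max_zenith_deg=85.0):
    """Divide a reflected-solar band value by cos(theta).

    value          -- raw radiance or reflectance of one pixel
    zenith_deg     -- solar zenith angle theta in degrees
    max_zenith_deg -- hypothetical cutoff above which the band is treated
                      as unavailable (assumption, not from the paper)

    Returns None when the sun is too low for the band to be usable.
    """
    if zenith_deg >= max_zenith_deg:
        return None
    return value / math.cos(math.radians(zenith_deg))


# A pixel observed with the sun 60 degrees from the zenith.
normalized = normalize_solar_band(0.5, 60.0)
```

Dividing by cos θ compensates for the lower illumination at low sun angles, so that the same cloud looks similar to the network at different times of day.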
Lightning detection measurements were collected by the European Cooperation for Lightning Detection (EUCLID) network of lightning antennas (Poelman et al., 2016; Schulz et al., 2016) and delivered by Météorage. The original data consist of locations and various properties of lightning strikes. We aggregated these into maps of lightning density and current density, as well as binary occurrence maps used in lightning prediction.
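As a rough sketch of how point strikes could be aggregated into the three kinds of maps mentioned above, the following pure-Python function bins strikes onto a regular grid. The function name, the `(x_km, y_km, current_ka)` tuple layout and the "at least one strike in the cell" occurrence rule are all assumptions made for illustration; the paper's actual gridding and occurrence definition are not given in this section.

```python
def grid_lightning(strikes, x_min, y_min, cell_km, nx, ny):
    """Aggregate lightning strikes into gridded maps.

    strikes -- iterable of (x_km, y_km, current_ka) tuples (hypothetical layout)
    Returns (density, current, occurrence) as ny-by-nx nested lists:
      density    -- number of strikes per cell
      current    -- summed absolute peak current per cell
      occurrence -- 1 if the cell saw at least one strike, else 0
    """
    density = [[0] * nx for _ in range(ny)]
    current = [[0.0] * nx for _ in range(ny)]
    for x, y, current_ka in strikes:
        col = int((x - x_min) // cell_km)
        row = int((y - y_min) // cell_km)
        # Ignore strikes that fall outside the grid.
        if 0 <= col < nx and 0 <= row < ny:
            density[row][col] += 1
            current[row][col] += abs(current_ka)
    occurrence = [[1 if n > 0 else 0 for n in r] for r in density]
    return density, current, occurrence
```

The density and current maps would serve as model inputs, while the binary occurrence map plays the role of the prediction target.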
NWP forecasts originated from the Consortium for Small Scale Modeling (COSMO) model (Baldauf et al., 2011) used operationally at MeteoSwiss. We selected various COSMO outputs relevant to thunderstorms, such as the convective available potential energy (CAPE).
Digital elevation model (DEM) data were from the Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) global DEM (Abrams et al., 2020) used to model topography in COSMO.
6 Neural Network
LHG2022 described a recurrent-convolutional DL model for predicting lightning occurrence (see Figure S2 in Supporting Information S1), based on the model (Leinonen, 2021a, 2021b) used in the Weather4cast 2021 competition (Herruzo et al., 2021) where it outperformed competing architectures such as U-Nets and transformer architectures. We adopt this architecture for each hazard, inheriting the best-performing hyperparameters from LHG2022. The model is built slightly differently for each combination of input data sources, such that only the parts of the model necessary for those inputs are included.
The main architectural change to the DL model in this study is that the prediction of heavy precipitation only uses the last time step of the final layer, which is trained to predict the entire 1-hr accumulation. Furthermore, in contrast to LHG2022 we did not perform model ensembling (Ganaie et al., 2022) due to the required computational cost.
For lightning, we utilize focal loss (Lin et al., 2017) with focusing parameter γ = 2 as the training loss function, so that our results are comparable with LHG2022 where this loss was also adopted. The hail and precipitation targets are defined probabilistically and it is not clear how the focal loss generalizes to such cases. Thus, we use cross entropy (CE) loss, which also performed well in LHG2022 and can be straightforwardly defined for probabilistic targets as:
CE = −Σ_c y_c log(p_c)
where p_c is the predicted probability, y_c is the target probability and the sum is over the possible classes c. In the case of hail, there are two classes (hail or no hail), while with precipitation, there are four classes as defined by Equation 1.
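The two losses discussed above can be sketched per pixel in plain Python. This is an illustrative sketch rather than the repository's implementation: the function names and the clipping constant `eps` are assumptions, and the focal loss is shown in its basic binary form without the optional class-balancing weight from Lin et al. (2017).

```python
import math


def cross_entropy(target, predicted, eps=1e-7):
    """Cross entropy between a target and a predicted class distribution.

    Both arguments are sequences of per-class probabilities. The target may
    be "soft" (e.g. [0.3, 0.7]), which is what makes this loss usable for
    probabilistic hail and precipitation targets.
    """
    return -sum(y * math.log(max(p, eps)) for y, p in zip(target, predicted))


def binary_focal_loss(y, p, gamma=2.0, eps=1e-7):
    """Focal loss for a hard binary label y (0 or 1) and predicted P(y=1) = p.

    The factor (1 - p_t)**gamma down-weights examples that are already
    classified well; gamma = 0 recovers ordinary cross entropy.
    """
    p_t = p if y == 1 else 1.0 - p
    p_t = min(max(p_t, eps), 1.0)
    return -((1.0 - p_t) ** gamma) * math.log(p_t)
```

Note that `binary_focal_loss` needs a hard label to decide which probability counts as p_t, which is one way to see why extending it to probabilistic targets is not obvious.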
Training all 96 combinations of targets and data sources takes approximately one month on eight Nvidia V100 GPUs. (That is a demanding hardware requirement!) For each target and data source combination, we used the same model architecture and hyperparameters. Ideally, these should be tuned separately for each case to optimize performance, but this would require training each model many times, which would be infeasible with the available resources.
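The figure of 96 appears consistent with three targets multiplied by every subset of the five data sources, including the empty subset (the "ignorant" model): 3 × 2^5 = 96. The enumeration below is a sketch of that reading; the string labels are placeholders chosen here, not identifiers from the repository.

```python
from itertools import combinations

# Placeholder labels for the three hazards and five data sources.
TARGETS = ["lightning", "hail", "heavy_precipitation"]
SOURCES = ["radar", "satellite", "lightning", "nwp", "dem"]


def all_source_subsets(sources):
    """Return every subset of the sources, from the empty set to all of them."""
    return [
        subset
        for size in range(len(sources) + 1)
        for subset in combinations(sources, size)
    ]


# One experiment per (target, subset of data sources) pair.
experiments = [
    (target, subset)
    for target in TARGETS
    for subset in all_source_subsets(SOURCES)
]
print(len(experiments))
```

Training every subset is what later makes it possible to attribute skill to individual data sources, at the cost of the long training time noted above.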
7 Results and Discussion
Figure 1 (a) The results of lightning prediction. On the left, three input variables (rain rate, lightning, and the satellite HRV channel) are shown at two time steps. On the right, future time steps are shown, with the first row giving the CAPE, the second showing the observed lightning, and the third showing the model prediction of lightning occurrence probability. (b) As (a), but showing the observed and predicted POH instead of lightning. The HRV is not shown as the case occurs at night. (c) As (a) and (b), but showing the probability of accumulated precipitation in the next 60 min exceeding 10 mm. Consequently, only one future time step is shown.
Figure 2 The average loss in the test data set for the prediction of (a) lightning, (b) hail and (c) heavy precipitation using different data sources. In every panel, each square shows the loss for the data sources indicated by the combination of the labels for the corresponding row and column (abbreviated to “Rad” for radar, “Sat” for satellite and “Lig” for lightning). For example, the top left corner of each panel shows the model trained with all five data sources, the bottom left shows the model using only radar and satellite data, and the bottom right shows the “ignorant” model that did not receive any inputs. All loss scores are scaled such that the loss of the ignorant model is set to 1.
Figure 3 The Shapley values (normalized to a sum of 1) of the different data sources, evaluated with the test data set, for the prediction of (a) lightning, (b) hail and (c) heavy precipitation. For lightning and hail, the Shapley values as a function of lead time are also shown; for heavy precipitation, we only predict the 60-min accumulated precipitation so this cannot be displayed.
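To illustrate how Shapley values like those in Figure 3 can be derived once a model has been trained for every subset of data sources, here is a small exact computation in plain Python. It is a sketch under stated assumptions: the value of a subset is taken to be its reduction in scaled loss relative to the ignorant model, which the captions suggest but do not spell out, and the toy numbers in `toy_gain` are invented for the example and are not results from the paper.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values.

    players -- list of data-source names
    value   -- function mapping a frozenset of players to a number, e.g.
               1 - (scaled loss of the model trained on that subset)
    """
    n = len(players)
    result = {}
    for player in players:
        others = [q for q in players if q != player]
        total = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = frozenset(subset)
                total += weight * (value(coalition | {player}) - value(coalition))
        result[player] = total
    return result


def normalize_to_unit_sum(values):
    """Rescale the values so that they sum to 1, as in Figure 3."""
    total = sum(values.values())
    return {name: v / total for name, v in values.items()}


# Invented, purely additive toy gains (NOT results from the paper).
toy_gain = {"Rad": 0.3, "Sat": 0.1, "Lig": 0.2}


def toy_value(coalition):
    """Value of a coalition in the toy example: the sum of its members' gains."""
    return sum(toy_gain[name] for name in coalition)


toy_shapley = normalize_to_unit_sum(shapley_values(list(toy_gain), toy_value))
```

With purely additive gains each source's Shapley value equals its own gain; real data sources overlap in information content, which is why all subsets must be trained to apportion the credit.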