Clusters of atmospheric and oceanic variables and teleconnections that are candidate drivers for Tropical Cyclogenesis

This project provides the dataset employed for the development of a machine learning framework designed to detect and interpret Tropical Cyclone Genesis (TCG) activity across six major tropical ocean basins: North Atlantic, Northeast Pacific, Northwest Pacific, North Indian, South Indian, and South Pacific. The dataset includes pre-processed environmental and climatic variables relevant to TCG dynamics, aggregated at the basin level with monthly resolution from January 1980 to December 2022. All data are derived from the ERA5 reanalysis dataset, with a spatial resolution of 2.5° × 2.5°. ERA5 reanalysis data were accessed through the DKRZ data pool, made available by DKRZ Data Management.
The atmospheric and oceanic variables provided are absolute vorticity at 850 hPa, maximum potential intensity (MPI), mean sea level pressure (MSLP), relative humidity at 700 hPa, sea surface temperature (SST), relative vorticity at 850 hPa, vertical wind shear between 850 and 200 hPa, and vertical velocity at 500 hPa. Several of these variables are derived from ERA5 primary variables and represent physically meaningful diagnostics used widely in tropical cyclone research. To reduce spatial dimensionality, each variable has been clustered within each basin using the K-means algorithm, and the area-weighted mean value of each cluster is reported as a time series.
Additionally, the dataset includes monthly values of a suite of large-scale climate indices known to influence tropical cyclone activity: Atlantic Meridional Mode (AMM), Niño3.4, North Atlantic Oscillation (NAO), Pacific Decadal Oscillation (PDO), Pacific-North American Pattern (PNA), Southern Oscillation Index (SOI), Tropical Northern Atlantic Index (TNA), Tropical Southern Atlantic Index (TSA), and the Western Pacific Index (WP). Lastly, for each basin, the dataset contains monthly counts of tropical cyclogenesis events, enabling evaluation of predictive models and interpretability methods.
This dataset is intended to support research in seasonal TCG detection, and it enables reproducibility of the methods developed in the associated study.
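
The cluster aggregation described above (an area-weighted mean per K-means cluster, reported as a time series) can be illustrated with a minimal sketch. It is not the code used to build the dataset. It assumes a regular latitude-longitude grid on which cell area is proportional to cos(latitude), and it takes the cluster labels as given rather than running K-means. The function name `cluster_weighted_means` and the toy arrays are hypothetical.

```python
import numpy as np

def cluster_weighted_means(field, lat, labels):
    """Area-weighted mean time series for each cluster.

    field  : array (time, lat, lon) with the gridded variable
    lat    : array (lat,) with latitudes in degrees
    labels : int array (lat, lon) with cluster ids; -1 marks cells outside the basin
    Assumption: cell area is proportional to cos(latitude) on a regular grid.
    """
    # Broadcast the per-latitude weight to every grid cell.
    weights = np.cos(np.deg2rad(lat))[:, None] * np.ones(field.shape[1:])
    series = {}
    for k in np.unique(labels[labels >= 0]):
        mask = labels == k
        w = weights[mask]                      # weights of the cells in cluster k
        # field[:, mask] has shape (time, n_cells); the weighted sum runs over cells.
        series[int(k)] = (field[:, mask] * w).sum(axis=1) / w.sum()
    return series

# Toy example: one time step, two latitudes (0° and 60°), two longitudes.
lat = np.array([0.0, 60.0])
field = np.array([[[1.0, 3.0],
                   [5.0, 7.0]]])
labels = np.array([[0, 1],
                   [0, 1]])
series = cluster_weighted_means(field, lat, labels)
```

In this toy case the 60° row carries half the weight of the equatorial row, so each cluster mean is pulled toward its equatorial value. The sketch does not handle missing values; masked or NaN cells would need to be excluded from both the weighted sum and the weight total.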

Identifier
DOI https://doi.org/10.26050/WDCC/CLINT_TC
Metadata Access https://dmoai.cloud.dkrz.de/oai/provider?verb=GetRecord&metadataPrefix=iso19115&identifier=oai:wdcc.dkrz.de:iso_5311644
Provenance
Creator Filippo Dainelli
Publisher World Data Center for Climate (WDCC)
Publication Year 2025
Funding Reference info:eu-repo/grantAgreement/EC/H2020/101003876/BE//CLImate INTelligence: Extreme events detection, attribution and adaptation design using machine learning
Rights CC-BY-4.0: Creative Commons Attribution 4.0 International https://creativecommons.org/licenses/by/4.0/
OpenAccess true
Contact not filled
Representation
Language English
Resource Type collection
Format NetCDF
Size 28 MB
Version 1
Discipline Earth System Research
Spatial Coverage (-180.000W, -40.000S, 180.000E, 40.000N)
Temporal Coverage Begin 1980-01-01T00:00:00Z
Temporal Coverage End 2022-12-31T00:00:00Z