Raw datasets¶
This is a list of raw weather and climate datasets that are commonly used in ML research. The list is in alphabetical order.
CMIP6¶
Reference: Eyring et al. 2016
Description: Large multi-model archive of global climate model simulations, run under a wide range of experiments and scenarios (for example, historical runs and future emission pathways).
Examples of papers using this dataset: Ham et al. 2019
ERA5¶
Reference: Hersbach et al. 2020
Description: ECMWF's fifth-generation global reanalysis, covering 1979 onward (1950 to 1978 as a preliminary version) at 0.25 degree global resolution. Hourly data are available for a very wide range of atmospheric, land-surface, and ocean-wave variables.
Notes: Take care with several surface variables, such as precipitation and wind. These often don’t match direct observations very closely.
Examples of papers using this dataset: WeatherBench
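To give a feel for what "0.25 degree global resolution" with hourly data means in practice, here is a rough back-of-the-envelope sketch. It assumes a regular latitude-longitude grid that includes both poles, 32-bit floats, and no compression. The function names are hypothetical, and actual download sizes will differ with file format and packing.

```python
# Rough size estimate for one single-level variable on a regular global grid.
RESOLUTION_DEG = 0.25
BYTES_PER_VALUE = 4        # assumption: 32-bit floats, no compression
HOURS_PER_YEAR = 24 * 365  # non-leap year


def grid_shape(resolution_deg):
    """Return (n_lat, n_lon) for a regular lat-lon grid that includes both poles."""
    n_lat = round(180 / resolution_deg) + 1  # 90N to 90S inclusive
    n_lon = round(360 / resolution_deg)      # 0E up to, but excluding, 360E
    return n_lat, n_lon


def bytes_per_year(resolution_deg, bytes_per_value=BYTES_PER_VALUE, hours=HOURS_PER_YEAR):
    """Uncompressed bytes for one variable stored hourly for one year."""
    n_lat, n_lon = grid_shape(resolution_deg)
    return n_lat * n_lon * bytes_per_value * hours


if __name__ == "__main__":
    n_lat, n_lon = grid_shape(RESOLUTION_DEG)
    print(f"grid: {n_lat} x {n_lon} = {n_lat * n_lon:,} points")
    print(f"one variable, one year: {bytes_per_year(RESOLUTION_DEG) / 1e9:.1f} GB")
```

Under these assumptions a single hourly surface field takes tens of gigabytes per year, which is why many ML studies work with coarsened or subsampled versions of the data.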
TIGGE¶
Reference: Bougeault et al. 2010
Data: https://apps.ecmwf.int/datasets/data/tigge/levtype=sfc/type=cf/
Description: Archive covering about 15 years of operational global ensemble forecasts from multiple forecasting centers. The data are not available in real time.
Examples of papers using this dataset: WeatherBench
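An ensemble forecast consists of several members valid at the same time and place, and a common first step is to reduce them to an ensemble mean and spread. The following is a minimal sketch of that reduction. The function name and the temperature values are made up for illustration. Real TIGGE files would first need to be decoded with a GRIB reader, which is not shown here.

```python
from statistics import mean, stdev


def ensemble_mean_and_spread(members):
    """Compute the per-grid-point mean and sample standard deviation.

    members: list of forecasts, one per ensemble member, where each
    forecast is a list of values at the same grid points.
    """
    per_point = list(zip(*members))  # regroup values by grid point
    ens_mean = [mean(values) for values in per_point]
    ens_spread = [stdev(values) for values in per_point]
    return ens_mean, ens_spread


# Made-up 2 m temperature forecasts (K): 3 members at 2 grid points.
members = [
    [280.0, 285.0],
    [282.0, 285.0],
    [284.0, 285.0],
]
ens_mean, ens_spread = ensemble_mean_and_spread(members)
print("mean:", ens_mean)
print("spread:", ens_spread)
```

A spread of zero at a grid point means all members agree there, while a larger spread indicates more forecast uncertainty.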