Raw datasets

This is a list of raw weather and climate datasets that are commonly used in ML research. The list is in alphabetical order.

ERA5

  • Reference: Hersbach et al. 2020

  • Data: https://cds.climate.copernicus.eu/

  • Description: The ERA5 global reanalysis, covering roughly the last 40 years (1979 onward, with 1950 to 1978 available as a preliminary version) at 0.25 degree resolution. Hourly data are available for pretty much every commonly used atmospheric and surface variable.

  • Notes: Take care with some surface variables, such as precipitation and wind. The reanalysis values for these often do not match direct observations very closely.

  • Examples of papers using this dataset: WeatherBench
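
To give a feel for what "0.25 degree, hourly" means in practice, here is a rough back-of-the-envelope sketch of the grid size and uncompressed data volume. It assumes a regular latitude-longitude grid that includes both poles and values stored as 4-byte floats; the function names are made up for this illustration, and actual download sizes will differ depending on file format, packing, and compression.

```python
def grid_shape(resolution_deg):
    """Return (n_lat, n_lon) for a regular global lat-lon grid.

    Assumption: latitudes run from -90 to +90 inclusive (both poles),
    longitudes cover 0 to 360 without repeating the wrap-around point.
    """
    n_lat = round(180 / resolution_deg) + 1
    n_lon = round(360 / resolution_deg)
    return n_lat, n_lon


def bytes_per_year(resolution_deg, steps_per_day=24, days=365, bytes_per_value=4):
    """Uncompressed size of one single-level variable for one (non-leap) year.

    Assumption: one value per grid point per time step, stored as float32.
    """
    n_lat, n_lon = grid_shape(resolution_deg)
    return n_lat * n_lon * steps_per_day * days * bytes_per_value


n_lat, n_lon = grid_shape(0.25)
size_gb = bytes_per_year(0.25) / 1e9

print(f"grid: {n_lat} x {n_lon} = {n_lat * n_lon} points")
print(f"one hourly variable, one year: about {size_gb:.1f} GB uncompressed")
```

Under these assumptions, a single hourly surface variable comes to roughly 36 GB per year before compression, which is why many ML projects work with a coarser regridded subset rather than the full-resolution data.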