Sample data file with TOAR air quality data for a machine learning exercise


This file has been obtained from the Tropospheric Ozone Assessment Report (TOAR) database described by Schultz, M.G. et al., Elementa Sci. Anthrop., 2017, doi: It contains 6 years of annual NO2 concentration percentiles at German measurement sites and corresponding station metadata. The data are intended to demonstrate the set-up and training of a simple feed-forward neural network that attempts to predict the NO2 statistics from the station characterisation given in the metadata.

The data are stored as a comma-delimited CSV file with 7 header lines plus a line of column headings. The column headings are: year,id,station_id,station_type,station_type_of_area,station_nightlight_1km,station_wheat_production,station_nox_emissions,station_omi_no2_column,station_max_population_density_5km,perc75,perc98. station_id, station_type, and station_type_of_area are string variables; all other columns are numeric. year, id, and station_id should be ignored for the machine learning. perc75 and perc98 are the 75th and 98th percentiles, respectively, and are given in units of nmol per mol (equivalent to ppbv).
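
The file layout described above could be read roughly as follows. This is a minimal sketch using only the Python standard library, not part of the original data set. It assumes that the line of column headings directly follows the 7 header lines, that every numeric field parses as a float (how missing values are encoded is not stated here), and that perc75 and perc98 serve as the prediction targets. The function names read_rows and build_xy and the file name in the closing comment are hypothetical.

```python
import csv
from itertools import islice

N_HEADER_LINES = 7                       # header lines before the column headings
IGNORED = ("year", "id", "station_id")   # columns to leave out of the learning
CATEGORICAL = ("station_type", "station_type_of_area")  # string variables
TARGETS = ("perc75", "perc98")           # NO2 percentiles in nmol/mol (ppbv)


def read_rows(lines, n_header=N_HEADER_LINES):
    """Skip the header block, then parse the remaining lines as CSV.

    `lines` is any iterable of text lines, e.g. an open file object.
    Assumption: the column headings are the first line after the header block.
    """
    body = islice(lines, n_header, None)
    return list(csv.DictReader(body, skipinitialspace=True))


def build_xy(rows):
    """Return (feature_names, X, y) with one-hot encoded string columns."""
    # Categories that actually occur in the data, in a stable (sorted) order.
    levels = {c: sorted({r[c] for r in rows}) for c in CATEGORICAL}
    # Every remaining column is treated as a numeric input feature.
    numeric = [c for c in rows[0]
               if c not in IGNORED + CATEGORICAL + TARGETS]
    names = numeric + [f"{c}={v}" for c in CATEGORICAL for v in levels[c]]
    X, y = [], []
    for r in rows:
        feats = [float(r[c]) for c in numeric]
        for c in CATEGORICAL:
            feats += [1.0 if r[c] == v else 0.0 for v in levels[c]]
        X.append(feats)
        y.append([float(r[t]) for t in TARGETS])
    return names, X, y


# Possible usage (the file name is a placeholder, not the real one):
# with open("toar_sample.csv", newline="") as fh:
#     names, X, y = build_xy(read_rows(fh))
```

One-hot encoding is only one possible way to present the two string variables to a neural network. The numeric inputs span very different ranges, so they would probably need scaling before training.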

Metadata Access
Creator Schultz, Martin G.
Publisher EUDAT
Publication Year 2019
Rights info:eu-repo/semantics/openAccess; Creative Commons Attribution (CC-BY)
OpenAccess true
Discipline Other