Annual Survey of Hours and Earnings, 2020: Synthetic Data Pilot

Abstract copyright UK Data Service and data collection copyright owner.

The Annual Survey of Hours and Earnings, 2020: Synthetic Data Pilot is a synthetic version of the Annual Survey of Hours and Earnings (ASHE) study available via Trusted Research Environments (TREs). ASHE is one of the most extensive surveys of the earnings of individuals in the UK. Data on the wages, paid hours of work, and pension arrangements of nearly one per cent of the working population are collected. Other variables relating to age, occupation and industrial classification are also available. The ASHE sample is drawn from National Insurance records for working individuals, and the survey forms are sent to their respective employers to complete.

ASHE is available for research projects demonstrating public good to accredited or approved researchers via TREs such as the Office for National Statistics Secure Research Service (SRS) or the UK Data Service Secure Lab (at SN 6689). To access collections stored within TREs, researchers need to undergo an accreditation process. Gaining access to data in a secure environment can be time- and resource-intensive. This pilot has created a low-fidelity, low-disclosure-risk synthetic version of ASHE data, which can be made available to researchers more quickly while they wait for access to the real data.

The synthetic data were created using the synthpop package in R. The sample method was used; this takes a simple random sample with replacement from the real values. The project was carried out in the period between 19th December 2022 and 3rd January 2023. Further information is available within the documentation. User feedback received through this pilot will help the ONS to maximise the benefits of data access and further explore the feasibility of synthesising more data in future.
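
The sampling approach described in the abstract can be illustrated with a minimal Python sketch. This is not the ONS implementation (which used synthpop in R); it assumes that each variable is resampled independently of the others, which is one reading of "low fidelity". The function name, variable names and toy values are hypothetical and are not taken from ASHE.

```python
import random


def synthesise_by_sampling(records, seed=0):
    """Return synthetic records of the same size as `records`.

    Each variable is drawn by simple random sampling with replacement
    from its own observed values, independently of every other variable
    (an assumption of this sketch). Marginal distributions are roughly
    preserved, while relationships between variables are not.
    """
    if not records:
        return []
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(records)
    columns = list(records[0].keys())
    # Resample each column on its own, with replacement.
    resampled = {
        col: rng.choices([row[col] for row in records], k=n)
        for col in columns
    }
    # Reassemble the independently sampled columns into rows.
    return [{col: resampled[col][i] for col in columns} for i in range(n)]


# Toy input with hypothetical variable names (not real ASHE data).
real = [
    {"age": 34, "hours": 37.5, "full_time": True},
    {"age": 51, "hours": 20.0, "full_time": False},
    {"age": 27, "hours": 40.0, "full_time": True},
    {"age": 45, "hours": 16.0, "full_time": False},
]

synthetic = synthesise_by_sampling(real, seed=1)
for row in synthetic:
    print(row)
```

Because the columns are sampled separately, a synthetic row may combine values that never occurred together in the input, for example full-time status alongside part-time hours. Data of this kind are suited to testing code and becoming familiar with the variable structure rather than to producing estimates.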

Main Topics:

The ASHE synthetic data contain the same variables as ASHE for each individual, relating to wages, hours of work, pension arrangements, and occupation and industrial classifications.  There are also variables for age, gender and full/part-time status. Because ASHE data are collected by the employer, there are also variables relating to the organisation employing the individual. These include employment size and legal status (e.g. public company).  Various geography variables are included in the data files. The year variable in this synthetic dataset is 2020.

Simple random sample

Compilation/Synthesis

Identifier
DOI http://doi.org/10.5255/UKDA-SN-9045-1
Metadata Access https://datacatalogue.cessda.eu/oai-pmh/v0/oai?verb=GetRecord&metadataPrefix=oai_ddi25&identifier=8aff812363b14525242532daa763bef7e7095e7ac55bbc4b32bc1141da6e7097
Provenance
Creator Office for National Statistics
Publisher UK Data Service
Publication Year 2023
Funding Reference Office for National Statistics
Rights © Crown copyright (https://www.nationalarchives.gov.uk/information-management/re-using-public-sector-information/uk-government-licensing-framework/crown-copyright/). The use of these data is subject to the UK Data Service End User Licence Agreement (https://ukdataservice.ac.uk/app/uploads/cd137-enduserlicence.pdf). Additional restrictions may also apply. The Data Collection is available to UK Data Service registered users subject to the End User Licence Agreement. Registered users must have or gain DEA Accredited Researcher Status (https://www.ons.gov.uk/aboutus/whatwedo/statistics/requestingstatistics/secureresearchservice/becomeanaccreditedresearcher). Use of the data requires approval from the data owner or their nominee.
OpenAccess true
Representation
Language English
Resource Type Numeric
Discipline History; Humanities
Spatial Coverage United Kingdom