This repository is the second updated version of the attribute-linked residential property price dataset in UK Data Service ReShare 854240 (https://reshare.ukdataservice.ac.uk/854240/). As with the first updated version (ReShare 855033 https://reshare.ukdataservice.ac.uk/855033/) in 2021, this updated dataset contains individual property transactions and associated variables from both the Land Registry Price Paid Dataset (LR PPD) and the Ministry of Housing, Communities and Local Government (MHCLG) Domestic Energy Performance Certificate (EPC) data. The dataset was produced by address matching between LR PPD (1/1/1995-27/6/2022) and Domestic EPCs (the twelfth version, ending 30/6/2022). It is the complete 2022 update of the house price per square metre dataset published in the Greater London Authority (GLA) London Datastore (https://data.london.gov.uk/dataset/house-price-per-square-metre-in-england-and-wales). The linked dataset in this repository is the uncorrected version, recording almost 20 million transactions with 106 variables in England and Wales between 1/1/1995 and 27/6/2022. Technical validation and data cleaning code is available in UKDA ReShare 854240 to help users evaluate the representativeness of the data and clean it. There is no single correct way to clean this raw linked dataset, so we suggest that users develop their own cleaning process based on their research requirements. In addition, this repository includes the original LR PPD and Domestic EPCs used to create the linked data (house price per square metre dataset). As in the first updated version, a field header has been added to LR PPD, and six variables (individual lodgement identifier, address, address 1, address 2, address 3, postcode) have been removed from Domestic EPCs. A unique identifier (id) has been added to Domestic EPCs. This id is newly created for Version 12 Domestic EPCs and is not the same id as in the Domestic EPCs from UK Data Service ReShare 854240 and ReShare 855033.
Since November 2021, DLUHC has published Domestic EPCs with the Unique Property Reference Number (UPRN), so the dataset in this repository contains the UPRN information from the Domestic EPCs.
This house price per square metre (HPM) dataset was created in August 2022. It is an individual-level linked dataset based on the LR PPD, Domestic EPCs and NSPL downloaded on 13/8/2022. The dataset contains 19.96 million transactions with 106 variables in England and Wales between 1/1/1995 and 24/6/2022. Of the 106 variables, 16 come from the LR PPD, 86 come from Domestic EPCs, one (laua) comes from NSPL, and three (id, classt, priceper) were created by the first author. Before the data linkage, a unique identifier (id) is created for all the unique EPCs after removing the individual lodgement identifier (LMK_KEY variable). During the data linkage process, the “classt” variable is created to identify 1:1 and 1:n linkage relationships. Once the linkage is complete, a derived house price per square metre variable (priceper) is calculated by dividing the transaction price paid by the total floor area. The NSPL (August 2022 version) is used to assign the local authority unit (laua) to the house price per square metre dataset according to postcode information. Transactions with improbable price per square metre values (e.g. where the total floor area is null or 0) have not been removed from this version of the dataset. This uncorrected version aims to offer the most flexibility for users. Users are recommended to clean this uncorrected version according to their research needs.
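The priceper derivation and one possible cleaning step can be sketched in Python as follows. This is a minimal illustration only, not the authors' cleaning code. The field names "price" and "tfarea" (total floor area), the helper function names and the sample records are all assumptions made for the example. Dropping every record with a null or non-positive floor area is just one of many possible cleaning rules.

```python
from typing import Optional


def price_per_sqm(price: float, total_floor_area: Optional[float]) -> Optional[float]:
    """Divide price paid by total floor area.

    Returns None when the floor area is missing or not positive,
    since no meaningful price per square metre exists in that case.
    """
    if total_floor_area is None or total_floor_area <= 0:
        return None
    return price / total_floor_area


def clean_transactions(rows):
    """Keep only rows with a usable price per square metre (one possible rule)."""
    cleaned = []
    for row in rows:
        # "price" and "tfarea" are assumed field names, not confirmed by the text.
        value = price_per_sqm(row["price"], row.get("tfarea"))
        if value is not None:
            cleaned.append({**row, "priceper": value})
    return cleaned


# Hypothetical records for illustration only (not taken from the dataset).
sample = [
    {"id": 1, "price": 250000, "tfarea": 100.0},
    {"id": 2, "price": 180000, "tfarea": 0},
    {"id": 3, "price": 320000, "tfarea": None},
]
cleaned = clean_transactions(sample)
```

Users with different research needs may prefer to flag such records instead of dropping them, or to add further rules such as plausible ranges for floor area or price.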