Land Registry Price Paid Data (PPD) have been published as open data since 2013. These data have transformed research on house price variation in the UK because they provide a comprehensive, address-level record of residential transactions covering the whole of England and Wales back to 1995. Despite their utility, the PPD lack attribute information about the properties themselves, such as total floor area, which is recognised as one of their major shortcomings. As a result, the impact of stock mix on broader price patterns cannot be fully accounted for. This research outlines one approach that addresses this deficiency by combining transaction information from the official open Land Registry PPD with property size information from the official open Domestic Energy Performance Certificates (EPCs). A four-stage data linkage generates a new linked dataset representing 79% of the full market sales in the Land Registry PPD. The linked dataset details 5,732,838 transactions in England and Wales between 2011 and 2019, along with each property's total floor area and number of habitable rooms. Codes for other commonly used spatial units, from Output Area to Local Authority, are also included, offering greater flexibility for exploring house price variation in England and Wales at different spatial scales. The data collection includes the scripts used for linkage as well as the resulting dataset.
Current research on residential house price variation in the UK is limited by the lack of an open and comprehensive house price database that records both transaction price and dwelling attributes such as size. This research outlines one approach that addresses this deficiency in England and Wales by combining transaction information from the official open Land Registry Price Paid Data (PPD) with property size information from the official open Domestic Energy Performance Certificates (EPCs). A four-stage data linkage generates a new linked dataset representing 79% of the full market sales in the Land Registry PPD. This linked dataset offers greater flexibility for exploring variation in house prices, including house price per square metre, in England and Wales between 2011 and 2019 at spatial scales from postcode unit upwards.
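Because the linked records carry both a price and a total floor area, house price per square metre can be derived directly and summarised for any spatial unit. The following is a minimal sketch of that calculation, not the project's own scripts: the field names `price`, `total_floor_area` and `area_code`, and the sample values, are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from statistics import median


def price_per_sqm(price, total_floor_area):
    """Return the transaction price divided by floor area in square metres."""
    if total_floor_area <= 0:
        raise ValueError("total floor area must be positive")
    return price / total_floor_area


def median_ppsm_by_area(records, area_field):
    """Group records by a spatial unit code and take the median price per m2."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[area_field]].append(
            price_per_sqm(rec["price"], rec["total_floor_area"])
        )
    return {code: median(values) for code, values in groups.items()}


# Invented sample records; the area codes are placeholders, not real codes.
sample = [
    {"price": 200000, "total_floor_area": 80.0, "area_code": "AREA_A"},
    {"price": 300000, "total_floor_area": 100.0, "area_code": "AREA_A"},
    {"price": 150000, "total_floor_area": 75.0, "area_code": "AREA_A"},
    {"price": 240000, "total_floor_area": 80.0, "area_code": "AREA_B"},
]

summary = median_ppsm_by_area(sample, "area_code")
```

The same function could be called with a different `area_field` to move between spatial scales, assuming each record carries the corresponding code.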
The Land Registry Price Paid Data (PPD) dataset is open and available online (https://www.gov.uk/government/statistical-data-sets/price-paid-data-downloads). It records 24,852,949 transactions in England and Wales between 1/1/1995 and 31/10/2019. The Domestic Energy Performance Certificates (EPCs) dataset is also open and available online from the Ministry of Housing, Communities and Local Government (MHCLG). Domestic EPCs record a property’s energy performance and building stock information, such as its total floor area and number of habitable rooms. The current Domestic EPCs dataset is the third released version; it contains certificates issued between 1/10/2008 and 31/8/2019, amounting to 18,575,357 energy performance records with 84 fields. Both datasets contain property information at address level, but their address structures differ, so a four-stage matching method comprising 251 matching rules was designed to link them.
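To illustrate why differing address structures call for many matching rules, the sketch below applies a single, deliberately simple rule: records are linked when the postcode and a normalised address string agree exactly. It is an assumption-laden illustration and not one of the 251 rules used in the actual linkage. The field names (`saon`, `paon`, `street`, `address`, `total_floor_area`) are illustrative, and the sample addresses are invented.

```python
import re


def normalise(text):
    """Upper-case, replace punctuation with spaces, collapse whitespace."""
    text = re.sub(r"[^A-Z0-9 ]", " ", text.upper())
    return " ".join(text.split())


def ppd_key(rec):
    # Assumed PPD layout: the address is split across several fields.
    parts = [rec.get("saon", ""), rec.get("paon", ""), rec.get("street", "")]
    return (normalise(rec["postcode"]), normalise(" ".join(parts)))


def epc_key(rec):
    # Assumed EPC layout: the address is held as a single free-text string.
    return (normalise(rec["postcode"]), normalise(rec["address"]))


def link(ppd_records, epc_records):
    """Link on one exact-match rule; everything else is left unmatched."""
    epc_index = {}
    for rec in epc_records:
        epc_index.setdefault(epc_key(rec), []).append(rec)

    linked, unmatched = [], []
    for rec in ppd_records:
        candidates = epc_index.get(ppd_key(rec), [])
        if len(candidates) == 1:
            # Unambiguous match: copy the floor area onto the transaction.
            floor_area = candidates[0]["total_floor_area"]
            linked.append({**rec, "total_floor_area": floor_area})
        else:
            # No candidate or several candidates: defer to further rules.
            unmatched.append(rec)
    return linked, unmatched


# Invented sample records for illustration only.
ppd = [
    {"id": "T1", "price": 250000, "postcode": "AB1 2CD",
     "saon": "", "paon": "12", "street": "High Street"},
    {"id": "T2", "price": 180000, "postcode": "AB1 2CD",
     "saon": "Flat 3", "paon": "14", "street": "High Street"},
]
epc = [
    {"postcode": "ab1 2cd", "address": "12, High Street",
     "total_floor_area": 85.0},
    {"postcode": "AB1 2CD", "address": "14 High Street, Flat 3",
     "total_floor_area": 50.0},
]

linked, unmatched = link(ppd, epc)
```

In this example the first transaction links, while the flat does not because the two sources order the address elements differently. Cases like this are what additional rules in later matching stages would need to handle.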