The “Company ID Linktables” dataset comprises ID linkage tables for the datasets BVD, MiDi, URS, USTAN, JANIS, BAKIS-M, LEI, RIAD (AnaCredit), SIFCT, SITS, and Schufa. Each ID linkage table is a two-column table with a different company identifier in each column. Each row is therefore a pair of ID values, one per identifier, that refer to the same real-world company entity. These ID linkage tables are needed because company data is often held in separate databases that use different company identifiers. The tables are produced by the RDSC using current record linkage techniques, which include, among other methods, comprehensive data cleaning, matching based on common external IDs, and name- and place-based matching, including probabilistic matching using supervised machine learning.
The technical properties of this record linkage process are described by Gábor-Tóth, E., Schild, C., and Walter, S. (2023). Linking Deutsche Bundesbank Company Data. Technical Report 2023-05. Deutsche Bundesbank, Research Data and Service Centre.
The size of the resulting overlaps (intersections) between the datasets is analyzed by Gábor-Tóth, E., Schild, C., and Walter, S. (2023). Understanding Overlaps between Different Company Data. Technical Report 2023-06. Deutsche Bundesbank, Research Data and Service Centre.
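As an illustration of how a two-column table of ID value pairs can be used, the following minimal Python sketch joins records from two datasets through such a table. All names, ID values, and fields (`link_table`, `dataset_a`, `dataset_b`, `turnover`, `employees`, `link_records`) are invented for this example and are not taken from the actual linkage tables or the underlying datasets.

```python
# Hypothetical linkage table: each row pairs an ID from dataset A with an
# ID from dataset B that refers to the same company. All values are invented.
link_table = [
    ("A001", "B17"),
    ("A002", "B42"),
]

# Hypothetical datasets, each keyed by its own company identifier.
dataset_a = {
    "A001": {"turnover": 120},
    "A002": {"turnover": 75},
    "A003": {"turnover": 30},  # no pair in the linkage table
}
dataset_b = {
    "B17": {"employees": 12},
    "B42": {"employees": 300},
    "B99": {"employees": 5},  # no pair in the linkage table
}


def link_records(pairs, left, right):
    """Combine records from two datasets via a table of ID value pairs."""
    linked = []
    for left_id, right_id in pairs:
        # Keep a pair only if both IDs are present in their datasets.
        if left_id in left and right_id in right:
            linked.append(
                {"id_a": left_id, "id_b": right_id, **left[left_id], **right[right_id]}
            )
    return linked


linked = link_records(link_table, dataset_a, dataset_b)
for row in linked:
    print(row)
```

In this sketch, records whose IDs do not appear in the linkage table are dropped, which corresponds to an inner join. Whether unmatched records should instead be kept depends on the analysis at hand.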
German companies