About Building Data Infrastructures

The great relevance of data for gaining new scientific insights, making progress on the grand challenges, and generating commercial profit has been widely commented on. Two of the big questions currently being discussed are (a) who will own the relevant data and (b) what kind of facilities need to be developed to make data available. With respect to the first question, there is no doubt that data from publicly funded research should in principle be open for broad usage. This implies an answer to the second question insofar as data infrastructures (DI) should avoid dependencies on commercial services and interests. This is the reason that huge investments are currently being made, in particular in Europe, towards the development of an ecosystem of data infrastructures that builds on previous public investments.

Building such data infrastructures should respect a balance between three dimensions: scientific interest (S), technology advancement (T), and organizational form (O). The O dimension should follow the high dynamics in the T and S dimensions. In contrast to the US, where we see a reluctance to invest in DI building, the EU and its member states are investing substantial funds in it. However, we can observe that the three dimensions (S, T, O) are not well balanced, a situation that carries high risks.

In this paper, some approaches taken in large data infrastructure initiatives such as EOSC (EC) and NFDI (Germany) are analysed, and the risks they take by separating the three dimensions (S, T, O), at least temporarily, are discussed.