Research data management (RDM) has become an important discipline that enables researchers to effectively organise, preserve and share their research results.
RDM is a recent development that builds on the principles of open science to prepare researchers for the future. It draws on innovative approaches such as generative artificial intelligence (genAI), powered by large language models (LLMs), to complement traditional research methods.
As data-driven research becomes increasingly complex, researchers often have to spend considerable time learning how to manage, analyse and interpret large amounts of information. Traditional data literacy training can be time-consuming and does not always keep pace with evolving technologies and methods of analysis.
Foundation models based on generative AI offer the potential to streamline this learning process. By automating data pre-processing, pattern recognition and even hypothesis generation, these models can lower the technical barriers to entry, allowing researchers to focus on insights and discovery rather than spending excessive amounts of time mastering data skills.
The objective of this workshop is to exchange perspectives on the implementation of novel RDM approaches, with or without LLMs, both past and prospective, in research and practice.
The presentations at the workshop 'Large Language Models for Research Data Management?!' were as follows:
Magnus Bender
Aarhus University, Denmark
Welcome
Ralf Möller
University of Hamburg, Germany
Keynote: Advancing RDM: From Immersion to Argumentation in Science
Jens Dörpinghaus (1, 2, 3), Michael Tiemann (1, 2)
(1) University of Koblenz, Germany; (2) Federal Institute for Vocational Education and Training (BIBB), Germany; (3) Linnaeus University, Sweden
Large Language Models in Labor Market Research Data Management: Potentials and Limitations
Edyta Jurkiewicz-Rohrbacher (1, 2), Thomas Asselborn (2)
(1) University of Regensburg, Germany; (2) University of Hamburg, Germany
Challenges in Automatic Speech Recognition in the Research on Multilingualism
Florian Marwitz, Marcel Gehrke
University of Hamburg, Germany
Improving Accessibility and Reproducibility by Guiding Large Language Models
Maximilian Plazotta, Meike Klettke
University of Regensburg, Germany
Talk to your database: An open-source in-context learning approach to interact with relational databases through LLMs
Thomas Asselborn, Magnus Bender, Florian Marwitz, Ralf Möller, Sylvia Melzer
University of Hamburg, Germany
Verbalisation Process of a RAG-Based Chatbot to Support Tabular Data Evaluation for Humanities Researchers
Magnus Bender
Aarhus University, Denmark
Farewell
This contribution was partially funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany's Excellence Strategy – EXC 2176 'Understanding Written Artefacts: Material, Interaction and Transmission in Manuscript Cultures', project no. 390893796. The research was mainly conducted within the scope of the Centre for the Study of Manuscript Cultures (CSMC) at the University of Hamburg.
This contribution was partially funded by the Danish National Research Foundation (DNRF193) through TEXT: Centre for Contemporary Cultures of Text at Aarhus University.