The humanities meet computer science to create new synergies using computer vision and natural language processing.
Aim & Scope
Historians increasingly use technologies that make digitised texts machine-readable, together with techniques from natural language processing (NLP), to analyse the content and context of language in written artefacts. These techniques can be applied to large corpora to identify patterns. However, such methods are often trained on contemporary rather than historical data. Applying them to historical sources can therefore introduce biases into the reading of the historical record and risks false inferences about history. The methods used should thus be thoroughly investigated so that any such biases can be accounted for. This Data Linking (DL) workshop presents the challenges of applying computer vision and NLP techniques in the humanities, along with first solutions to them.
This entry includes the following presentations from the first Data Linking Workshop 2023: Computer Vision and Natural Language Processing – Challenges in the Humanities
Pepper, Welcome (download text, download presentation)
Eva Wilden, Charles Li: Tamilex -- Digital Lexicography (download presentation)
Stefan Baums, Stephen White: Computer Vision and Kharoṣṭhī Paleography (download presentation)
Oskar von Hinüber, Haiyan Hu-von Hinüber, Sylvia Melzer: What the Buddhological Epigraphy can expect from the AI: The Information System "Buddhist Bronzes Inscriptions" (download presentation)
Kathrin Holz: The Proto-Śāradā Project: Towards the edition of a new collection of administrative letters and documents from pre-modern South Asia (download presentation)
Ines Konczak-Nagel, Erik Radisch: The Kucha Mural Information System: Taxonomy and Semi-Automated Image Recognition (download presentation 1, download presentation 2)
Ralf Möller: Aligned AI and the role of the humanities: Training AI systems using human feedback (download presentation)
Isabelle Marthot-Santaniello: The application of NLP in combination with Computer Vision for analysing ancient Greek handwritings on papyri (download presentation)
Olga Serbaeva: Some features of the 17th century Newārī script: READ-based statistical approach to palaeography (download presentation)
Lena Hinrichsen: OCR technologies in research practice (download presentation)
Oliver Hellwig: Web-based information systems for Indian scripts and texts (download presentation)
Simon Schiff, Ralf Möller: Persistent Data, Sustainable Information (download presentation)
Hamid Reza Hakimi, Lisa Mischer, Tariq Yousef, Maxim Romanov: Finding and Linking Information in Arabic Historical Texts (download presentation)
Sylvia Melzer: Building Information Systems on Demand with ChatGPT? (download presentation)
Martin Braun, Hannes Fellner, Bernhard Koller: A Digital Paleography of Tarim Brahmi (download presentation)
Hussein Mohammed: Computer vision beyond OCR: potentials and challenges for the study of written artefacts (download presentation)
Pepper, Bye (download text, download presentation)
The workshop was supported by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany's Excellence Strategy - EXC 2176 'Understanding Written Artefacts: Material, Interaction and Transmission in Manuscript Cultures', project no. 390893796.