Knowledge-Enhanced Neural Networks for Machine Reading Comprehension [Source Code and Additional Material]

Machine Reading Comprehension is a language understanding task where a system is expected to read a given passage of text and typically answer questions about it. When humans perform reading comprehension, they draw not only on the presented text but also on knowledge they already have, such as commonsense and world knowledge, and on previously acquired language skills: understanding the events and arguments in a text (who did what to whom), their participants, and their relations in discourse. In contrast, neural network approaches to machine reading comprehension have focused on training end-to-end systems that rely only on annotated task-specific data.

In this thesis, we explore approaches for tackling the reading comprehension problem, motivated by how a human would solve the task, using existing background and commonsense knowledge or knowledge from various linguistic tasks.

First, we develop a neural reading comprehension model that integrates external commonsense knowledge encoded as a key-value memory. Instead of relying only on document-to-question interaction or discrete features, our model attends to relevant external knowledge and combines this knowledge with the context representation before inferring the answer. This allows the model to draw on knowledge from an external source that is not explicitly stated in the text but is relevant for inferring the answer. We demonstrate that the proposed approach improves the performance of very strong base models for cloze-style reading comprehension and open-book question answering. By including knowledge explicitly, our model can also provide evidence about the background knowledge used in the reasoning process.
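
The following is a minimal PyTorch sketch of the key-value memory attention idea described above, not the released implementation: the fused context representation queries encoded knowledge keys, and the attended knowledge values are mixed back into the context representation before answer scoring. Function and variable names and the simple additive combination are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def knowledge_attention(context, keys, values):
    """
    context: (batch, hidden)          -- fused document/question encoding
    keys:    (batch, n_facts, hidden) -- encoded knowledge-triple keys
    values:  (batch, n_facts, hidden) -- encoded knowledge-triple values
    """
    # Attention weights over the retrieved knowledge facts.
    scores = torch.bmm(keys, context.unsqueeze(-1)).squeeze(-1)      # (batch, n_facts)
    weights = F.softmax(scores, dim=-1)
    # Weighted sum of the knowledge values.
    knowledge = torch.bmm(weights.unsqueeze(1), values).squeeze(1)   # (batch, hidden)
    # Combine the attended knowledge with the context representation.
    return context + knowledge

# Toy usage with random tensors.
batch, n_facts, hidden = 2, 5, 16
ctx = torch.randn(batch, hidden)
k = torch.randn(batch, n_facts, hidden)
v = torch.randn(batch, n_facts, hidden)
enriched = knowledge_attention(ctx, k, v)   # (2, 16)
```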

Further, we examine the impact of transferring linguistic knowledge from low-level linguistic tasks into a reading comprehension system using neural representations. Our experiments show that knowledge transferred from neural representations trained on these linguistic tasks can be adapted and combined to improve reading comprehension performance early in training and when training with small portions of the data. A simple sketch of one way to combine such transferred representations follows below.
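
The sketch below illustrates, under assumptions, one simple way such transferred representations can be combined: token representations produced by encoders pretrained on auxiliary linguistic tasks are detached (so they are not updated by the reading-comprehension loss) and concatenated with the base representations before a learned projection. The class name, dimensions, and the concatenate-and-project scheme are illustrative, not taken from the released code.

```python
import torch
import torch.nn as nn

class CombinedEncoder(nn.Module):
    """Concatenate base word representations with frozen representations
    from encoders pretrained on auxiliary linguistic tasks."""
    def __init__(self, base_dim, aux_dims, out_dim):
        super().__init__()
        self.proj = nn.Linear(base_dim + sum(aux_dims), out_dim)

    def forward(self, base_repr, aux_reprs):
        # base_repr: (batch, seq, base_dim)
        # aux_reprs: list of (batch, seq, aux_dim_i); detached so the
        # auxiliary encoders are not updated by the reading task's loss.
        combined = torch.cat([base_repr] + [r.detach() for r in aux_reprs], dim=-1)
        return torch.relu(self.proj(combined))

# Toy usage: 300-dim word embeddings plus two 128-dim auxiliary encodings.
enc = CombinedEncoder(base_dim=300, aux_dims=[128, 128], out_dim=256)
base = torch.randn(2, 20, 300)
aux = [torch.randn(2, 20, 128), torch.randn(2, 20, 128)]
out = enc(base, aux)   # (2, 20, 256)
```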

Finally, we propose to use structured linguistic annotations as a basis for a Discourse-Aware Semantic Self-Attention encoder that we employ for reading comprehension of narrative texts. We extract relations between discourse units, events and their arguments, as well as co-referring mentions, using available annotation tools. The empirical evaluation shows that the investigated structures improve the overall performance (by up to +3.4 ROUGE-L), especially intra-sentential and cross-sentential discourse relations, sentence-internal semantic role relations, and long-distance coreference relations. We also show that dedicating self-attention heads to intra-sentential relations and to relations connecting neighboring sentences is beneficial for finding answers to questions in longer contexts. These findings encourage the use of discourse-semantic annotations to enhance the generalization capacity of self-attention models for machine reading comprehension.
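
As a rough illustration of the mechanism, the sketch below assumes that "dedicating a head to a relation" means restricting that head's attention to token pairs connected by the relation (a semantic role link, a coreference chain, or a discourse relation), with the relation masks produced by external annotation tools. It is a minimal sketch, not the encoder released with this dataset; the function name and per-head slicing are illustrative.

```python
import torch
import torch.nn.functional as F

def relation_masked_attention(x, relation_masks, d_head=16):
    """
    x:              (batch, seq, d_model) token representations
    relation_masks: (n_heads, batch, seq, seq) boolean; True where a head may
                    attend, one mask per relation type (SRL, coref, discourse).
    """
    n_heads, _, seq, _ = relation_masks.shape
    # Always allow a token to attend to itself so no attention row is empty.
    eye = torch.eye(seq, dtype=torch.bool).expand_as(relation_masks)
    relation_masks = relation_masks | eye
    outputs = []
    for h in range(n_heads):
        # In a real model q, k, v would come from learned per-head projections.
        q = k = v = x[..., h * d_head:(h + 1) * d_head]
        scores = q @ k.transpose(-1, -2) / d_head ** 0.5            # (batch, seq, seq)
        scores = scores.masked_fill(~relation_masks[h], float("-inf"))
        outputs.append(F.softmax(scores, dim=-1) @ v)
    return torch.cat(outputs, dim=-1)                               # (batch, seq, n_heads*d_head)

# Toy usage: 2 heads over a 6-token sequence with random relation masks.
x = torch.randn(1, 6, 32)
masks = torch.rand(2, 1, 6, 6) > 0.5
out = relation_masked_attention(x, masks)   # (1, 6, 32)
```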

Identifier
DOI https://doi.org/10.11588/data/HU3ARF
Related Identifier IsCitedBy https://doi.org/10.18653/v1/D18-1260
Related Identifier IsCitedBy https://doi.org/10.18653/v1/P18-1076
Related Identifier IsCitedBy https://doi.org/10.18653/v1/D19-1257
Metadata Access https://heidata.uni-heidelberg.de/oai?verb=GetRecord&metadataPrefix=oai_datacite&identifier=doi:10.11588/data/HU3ARF
Provenance
Creator Mihaylov, Todor
Publisher heiDATA
Contributor Mihaylov, Todor
Publication Year 2024
Funding Reference German Research Foundation, Research Training Group "Adaptive Preparation of Information from Heterogeneous Sources" (AIPHES), GRK 1994/1
Rights CC BY 4.0; info:eu-repo/semantics/openAccess; http://creativecommons.org/licenses/by/4.0
OpenAccess true
Contact Mihaylov, Todor (Heidelberg University, Department of Computational Linguistics)
Representation
Resource Type Dataset
Format application/zip
Size 22075394; 327922; 53992427; 448657076 (bytes)
Version 1.1
Discipline Other