Machine Reading Comprehension is a language understanding task in which a system
is expected to read a given passage of text and, typically, answer questions
about it.
When humans perform reading comprehension, they rely not only on the presented
text but also on knowledge they already possess, such as commonsense and world
knowledge, and on previously acquired language skills: understanding the events
and arguments in a text (who did what to whom), their participants, and their
relations in discourse.
In contrast, neural network approaches to machine reading comprehension have
focused on training end-to-end systems that rely only on annotated
task-specific data.
In this thesis, we explore approaches to the reading comprehension problem that
are motivated by how a human would solve the task: using existing background
and commonsense knowledge, or knowledge acquired from various linguistic tasks.
First, we develop a neural reading comprehension model that integrates
external commonsense knowledge encoded as a key-value memory. Instead of
relying only on document-to-question interaction or discrete features, our
model attends to relevant external knowledge and combines this knowledge with
the context representation before inferring the answer. This allows the model
to retrieve and incorporate knowledge from an external source that is not
explicitly stated in the text but is relevant for inferring the answer. We
demonstrate that the proposed approach improves the performance of very
strong base models for cloze-style reading comprehension and open-book
question answering.
By incorporating knowledge explicitly, our model can also provide evidence of
the background knowledge used in its reasoning process.
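The core operation can be pictured as attention over encoded knowledge keys, retrieval of the corresponding values, and a gated combination with the context encoding. The following PyTorch sketch is illustrative only; the module name, the gating mechanism, and the tensor shapes are assumptions rather than the exact architecture developed in the thesis.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class KeyValueKnowledgeAttention(nn.Module):
    # Illustrative sketch: attend over external knowledge stored as a
    # key-value memory and gate the retrieved knowledge into the context
    # representation before answer inference.
    def __init__(self, hidden_dim: int):
        super().__init__()
        # Gate deciding, per token, how much retrieved knowledge to blend in.
        self.gate = nn.Linear(2 * hidden_dim, hidden_dim)

    def forward(self, context, memory_keys, memory_values):
        # context:       (batch, seq_len, hidden)  contextual token encodings
        # memory_keys:   (batch, n_facts, hidden)  encoded knowledge keys
        # memory_values: (batch, n_facts, hidden)  encoded knowledge values
        scores = torch.bmm(context, memory_keys.transpose(1, 2))  # (B, L, N)
        attn = F.softmax(scores, dim=-1)                          # attention over facts
        retrieved = torch.bmm(attn, memory_values)                # (B, L, H)
        # Gated combination of the context and the retrieved knowledge.
        g = torch.sigmoid(self.gate(torch.cat([context, retrieved], dim=-1)))
        return g * context + (1.0 - g) * retrieved
```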
Further, we examine the impact of transferring linguistic knowledge from
low-level linguistic tasks into a reading comprehension system through neural
representations. Our experiments show that the knowledge encoded in neural
representations trained on these linguistic tasks can be adapted and combined
to improve reading comprehension both early in training and when training with
small portions of the data.
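One simple way to picture this transfer is as a learned mixture of representations produced by encoders pre-trained on the auxiliary linguistic tasks. The PyTorch sketch below is an assumption-laden illustration; the choice of tasks, the encoder interface, and the softmax-weighted combination are illustrative and not the exact transfer setup studied in the thesis.

```python
import torch
import torch.nn as nn

class TransferredRepresentationMixer(nn.Module):
    # Illustrative sketch: combine token representations produced by encoders
    # pre-trained on low-level linguistic tasks (e.g. POS tagging, chunking)
    # with a learned softmax-weighted sum for use in a reading comprehension model.
    def __init__(self, task_encoders: nn.ModuleList):
        super().__init__()
        self.task_encoders = task_encoders  # encoders transferred from auxiliary tasks
        # One learnable mixing weight per source task.
        self.mix_weights = nn.Parameter(torch.zeros(len(task_encoders)))

    def forward(self, token_embeddings):
        # token_embeddings: (batch, seq_len, dim) shared input word representations.
        outputs = [enc(token_embeddings) for enc in self.task_encoders]
        weights = torch.softmax(self.mix_weights, dim=0)
        # Weighted sum of task-specific representations; the downstream reader
        # consumes this in place of (or alongside) its own contextual encoder.
        return sum(w * out for w, out in zip(weights, outputs))
```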
Finally, we propose using structured linguistic annotations as the basis for a
Discourse-Aware Semantic Self-Attention encoder that we employ for reading
comprehension of narrative texts. We extract relations between discourse
units, events, and their arguments, as well as co-referring mentions, using
available annotation tools. The empirical evaluation shows that the
investigated structures improve the overall performance (by up to +3.4
ROUGE-L), with the largest contributions coming from intra-sentential and
cross-sentential discourse relations, sentence-internal semantic role
relations, and long-distance coreference
relations. We also show that dedicating self-attention heads to
intra-sentential relations and relations connecting neighboring sentences is
beneficial for finding answers to questions in longer contexts. These findings
encourage the use of discourse-semantic annotations to enhance the
generalization capacity of self-attention models for machine reading
comprehension.
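To make the idea of dedicated heads concrete, the following PyTorch sketch restricts a single self-attention head to token pairs connected by a given discourse or semantic relation via a boolean mask. The function name, the masking scheme, and the handling of tokens with no related partners are illustrative assumptions, not the encoder implementation from the thesis.

```python
import torch
import torch.nn.functional as F

def relation_masked_attention(queries, keys, values, relation_mask):
    # Illustrative sketch of one self-attention head restricted to token pairs
    # connected by a particular relation (e.g. the same semantic role frame,
    # a discourse relation, or a coreference chain).
    # queries, keys, values: (batch, seq_len, head_dim)
    # relation_mask:         (batch, seq_len, seq_len) boolean, True where a
    #                        relation links the two tokens.
    d_k = queries.size(-1)
    scores = torch.matmul(queries, keys.transpose(-2, -1)) / d_k ** 0.5
    # Token pairs not linked by the relation are hidden from this head.
    scores = scores.masked_fill(~relation_mask, float("-inf"))
    attn = F.softmax(scores, dim=-1)
    # Rows with no related tokens become NaN after the softmax; zero them out.
    attn = torch.nan_to_num(attn, nan=0.0)
    return torch.matmul(attn, values)
```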