SRL4ORL: Improving Opinion Role Labeling Using Multi-Task Learning With Semantic Role Labeling [Source Code]

DOI

This repository contains code for reproducing experiments done in Marasovic and Frank (2018).

Paper abstract:

For over a decade, machine learning has been used to extract opinion-holder-target structures from text to answer the question "Who expressed what kind of sentiment towards what?". Recent neural approaches do not outperform the state-of-the-art feature-based models for Opinion Role Labeling (ORL). We suspect this is due to the scarcity of labeled training data and address this issue using different multi-task learning (MTL) techniques with a related task which has substantially more data, i.e. Semantic Role Labeling (SRL). We show that two MTL models improve significantly over the single-task model for labeling of both holders and targets, on the development and the test sets. We found that the vanilla MTL model, which makes predictions using only shared ORL and SRL features, performs the best. With deeper analysis, we determine what works and what might be done to make further improvements for ORL.

Data for ORL

Download MPQA 2.0 corpus. Check mpqa2-pytools for example usage. Splits can be found in the datasplit folder.

Data for SRL The data is provided by: CoNLL-2005 Shared Task, but the original words are from the Penn Treebank dataset, which is not publicly available.

How to train models?

python main.py --adv_coef 0.0 --model fs --exp_setup_id new --n_layers_orl 0 --begin_fold 0 --end_fold 4

python main.py --adv_coef 0.0 --model html --exp_setup_id new --n_layers_orl 1 --n_layers_shared 2 --begin_fold 0 --end_fold 4

python main.py --adv_coef 0.0 --model sp --exp_setup_id new --n_layers_orl 3 --begin_fold 0 --end_fold 4

python main.py --adv_coef 0.1 --model asp --exp_setup_id prior --n_layers_orl 3 --begin_fold 0 --end_fold 10

Identifier
DOI https://doi.org/10.11588/data/LWN9XE
Related Identifier https://doi.org/10.18653/v1/N18-1054
Metadata Access https://heidata.uni-heidelberg.de/oai?verb=GetRecord&metadataPrefix=oai_datacite&identifier=doi:10.11588/data/LWN9XE
Provenance
Creator Marasovic, Ana
Publisher heiDATA
Contributor Marasovic, Ana; Anette Frank
Publication Year 2019
Rights info:eu-repo/semantics/openAccess
OpenAccess true
Contact Marasovic, Ana (Department of Computational Linguistics, Heidelberg University, Germany)
Representation
Resource Type Dataset
Format application/zip
Size 14676065
Version 1.0
Discipline Other