This repository contains code for reproducing experiments done in Marasovic and Frank (2018).
Paper abstract:
For over a decade, machine learning has been used to extract opinion-holder-target structures from text to answer the question "Who expressed what kind of sentiment towards what?". Recent neural approaches do not outperform the state-of-the-art feature-based models for Opinion Role Labeling (ORL). We suspect this is due to the scarcity of labeled training data and address this issue using different multi-task learning (MTL) techniques with a related task which has substantially more data, i.e. Semantic Role Labeling (SRL). We show that two MTL models improve significantly over the single-task model for labeling of both holders and targets, on the development and the test sets. We found that the vanilla MTL model, which makes predictions using only shared ORL and SRL features, performs the best. With deeper analysis, we determine what works and what might be done to make further improvements for ORL.
Data for ORL
Download MPQA 2.0 corpus.
Check mpqa2-pytools for example usage.
Splits can be found in the datasplit folder.
Data for SRL
The data is provided by: CoNLL-2005 Shared Task, but the original words are from the Penn Treebank dataset, which is not publicly available.
How to train models?
python main.py --adv_coef 0.0 --model fs --exp_setup_id new --n_layers_orl 0 --begin_fold 0 --end_fold 4
python main.py --adv_coef 0.0 --model html --exp_setup_id new --n_layers_orl 1 --n_layers_shared 2 --begin_fold 0 --end_fold 4
python main.py --adv_coef 0.0 --model sp --exp_setup_id new --n_layers_orl 3 --begin_fold 0 --end_fold 4
python main.py --adv_coef 0.1 --model asp --exp_setup_id prior --n_layers_orl 3 --begin_fold 0 --end_fold 10