CorPipe 24 Multilingual CorefUD 1.2 Model (corpipe24-corefud1.2-240906)

The corpipe24-corefud1.2-240906 model is an mT5-large-based multilingual model for coreference resolution, usable in CorPipe 24 (https://github.com/ufal/crac2024-corpipe). It is released under the CC BY-NC-SA 4.0 license.

The model is language-agnostic (it takes no corpus id as input), so in theory it can be used to predict coreference in any language covered by mT5.

This model also jointly predicts the empty nodes needed for zero coreference. The paper introducing this model also presents an alternative two-stage approach, which first predicts empty nodes (via https://www.kaggle.com/models/ufal-mff/crac2024_zero_nodes_baseline/) and then performs coreference resolution (via http://hdl.handle.net/11234/1-5673); it is roughly twice as slow but slightly better.

Identifier
PID http://hdl.handle.net/11234/1-5672
Related Identifier https://arxiv.org/abs/2410.02756
Related Identifier http://hdl.handle.net/11234/1-5369
Related Identifier https://github.com/ufal/crac2024-corpipe
Metadata Access http://lindat.mff.cuni.cz/repository/oai/request?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:lindat.mff.cuni.cz:11234/1-5672
Provenance
Creator Straka, Milan
Publisher Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Publication Year 2024
Rights Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0); http://creativecommons.org/licenses/by-nc-sa/4.0/; PUB
OpenAccess true
Contact lindat-help(at)ufal.mff.cuni.cz
Representation
Language Catalan; Valencian; Czech; German; English; Spanish; Castilian; French; Hungarian; Lithuanian; Bokmål, Norwegian; Norwegian Bokmål; Norwegian Nynorsk; Nynorsk, Norwegian; Polish; Russian; Turkish; Church Slavic; Old Slavonic; Church Slavonic; Old Bulgarian; Old Church Slavonic; Greek, Ancient (to 1453)
Resource Type toolService
Format text/plain; charset=utf-8; application/zip; downloadable_files_count: 1
Discipline Linguistics
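
The Metadata Access URL above appears to be an OAI-PMH GetRecord request derived from the PID handle. The following minimal Python sketch rebuilds that URL from a handle. The function name handle_to_oai_url is hypothetical, and the identifier pattern (oai:lindat.mff.cuni.cz: followed by the handle) is an assumption inferred from this single record, not a documented guarantee. The sketch only constructs the URL and performs no network request.

```python
from urllib.parse import urlencode, urlparse

# Endpoint copied from the Metadata Access field of this record.
OAI_ENDPOINT = "http://lindat.mff.cuni.cz/repository/oai/request"


def handle_to_oai_url(pid: str, metadata_prefix: str = "oai_dc") -> str:
    """Build an OAI-PMH GetRecord URL for a hdl.handle.net PID.

    Assumption: the OAI identifier is "oai:lindat.mff.cuni.cz:" plus the
    handle, as seen in this record's Metadata Access URL.
    """
    # "http://hdl.handle.net/11234/1-5672" -> "11234/1-5672"
    handle = urlparse(pid).path.lstrip("/")
    params = {
        "verb": "GetRecord",
        "metadataPrefix": metadata_prefix,
        "identifier": f"oai:lindat.mff.cuni.cz:{handle}",
    }
    # Keep ":" and "/" unescaped so the result matches the record's URL form.
    return f"{OAI_ENDPOINT}?{urlencode(params, safe=':/')}"


if __name__ == "__main__":
    print(handle_to_oai_url("http://hdl.handle.net/11234/1-5672"))
```

For the PID of this record, the sketch reproduces the Metadata Access URL listed above; whether the same pattern holds for other handles would need to be checked against the repository.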