2 datasets found

Keywords: sentences

  • COSTRA 1.0: A Dataset of Complex Sentence Transformations

    COSTRA 1.0 is a dataset of Czech complex sentence transformations. The dataset is intended for the study of sentence-level embeddings beyond simple word alternations or standard...
  • Corpus of contemporary blogs

    At the NLP Centre, text is currently divided into sentences by a tool that uses a rule-based system. In order to create enough training data for machine learning, annotators...