21 datasets found

Keywords: information structure

  • Information structure and historical English OV/VO variation

    This dataset contains the data that is used in: Struik, Tara and Ans van Kemenade. Information structure and OV word order in Old and Middle English: a phase-based approach. To...
  • B2 Hausa

    Hausa: complete set, status: final, manually transcribed, glossed and translated to English, annotated with respect to morphology, parts of speech, syntax, grammatical function, semantic roles,...
  • B1 Yom

    The data sets for each language consist of a small number of mini-dialogues, chosen out of the 189 entries within the Focus Translation Task (cf. Skopeteas et al. 2006: 209ff.)...
  • B1 Foodo

    The data sets for each language consist of a small number of mini-dialogues, chosen out of the 189 entries within the Focus Translation Task (cf. Skopeteas et al. 2006: 209ff.)...
  • B4 Ludolf

    The text of this corpus, Ludolf von Sudheims Reise ins Heilige Land (Ludolf of Sudheim's Journey to the Holy Land), is a travel diary describing the adventures of a group of...
  • B4 Otfrid

    The Referenzkorpus Altdeutsch (Reference Corpus of Old German) records and annotates the oldest written records of German, from the beginning of continuous written transmission around 750 to about 1050...
  • A5 Hausa News

    This corpus of news articles from the online news service of Deutsche Welle contains 4 texts with a total of 2017 tokens. CLARIN Metadata summary for A5 Hausa News...
  • B4 Heliand

    Heliand 1, 4 and 5: complete text, status: final, digitization, translation into Modern German, manually annotated with parts of speech, syntactic categories, grammatical...
  • B1 Aja

    The data sets for each language consist of a small number of mini-dialogues, chosen out of the 189 entries within the Focus Translation Task (cf. Skopeteas et al. 2006: 209ff.)...
  • B4 Tatian Corpus of Deviating Examples 2.1

    The present corpus, the Tatian Corpus of Deviating Examples T-CODEX 2.1, provides morpho-syntactic and information-structural annotation of parts of the Old High German...
  • A5 Hausa Umarnin Uwa

    This corpus of Umarnin Uwa film transcripts contains 47 transcripts with a total of 10194 tokens. It provides information including automatic POS tagging, speaker and...
  • B4 Historisches Predigtenkorpus zum Nachfeld

    HIPKON is the first corpus based on only one text type (sermons) and on one dialect area, Upper German (Bavarian-Alemannic). The sermons cover the time from Middle High German...
  • B2 Marghi

    Full set: all focus-related experiments, status: work in progress, large parts elicited, most of the data transcribed, partly annotated. CLARIN Metadata summary for B2 Marghi...
  • B1 Fon

    The data sets for each language consist of a small number of mini-dialogues, chosen out of the 189 entries within the Focus Translation Task (cf. Skopeteas et al. 2006: 209ff.)...
  • B2 Guruntum

    Guruntum sample: sample, status: final, manually transcribed, glossed and translated to English, annotated with respect to morphology, parts of speech, syntax, grammatical function, semantic roles,...
  • B7 Wolof (Wikipedia)

    The corpus consists of a collection of texts from the Wolof Wikipedia, randomly chosen for their near-standard orthography and language, and treating different topics....
  • B2 Bura

    Full set: all focus-related experiments, status: work in progress, large parts elicited, most of the data transcribed, partly annotated. CLARIN Metadata summary for B2 Bura...
  • B4 Sächsische Weltchronik

    The corpus contains a 13th-century chronicle in Middle Low German. Description of the textual witnesses etc. in:...
  • B7 Wolof (web)

    The corpus consists of a collection of texts from web discussion forums, randomly chosen for their near-standard orthography and language, and treating...
  • B4 Muspilli

    Complete text, status: work in progress, digitization, translation into English, manually annotated with parts of speech, syntactic category, grammatical function, clause...
You can also access this registry using the API (see API Docs).