9 datasets found

Keywords: Gauhati University

  • Assamese Corpus

    The Assamese Corpus was developed in the NLP Lab of Gauhati University. It contains about 1.6 million words (1,613,551 words). The Corpus is prepared...
  • Assamese-English Bilingual Dictionary

    The bilingual dictionary covers Assamese-English. It gives the English meanings of Assamese words along with the POS of each word. These Assamese NLP...
  • Assamese POS Tagger

    Assamese POS tagger is a CRF++ based POS tagger. CRF++ is a customizable open source implementation of Conditional Random Fields for tagging/labeling continuous text. CRF++ is implemented for...
  • Assamese spell variation list

    A spelling variant occurs when a word has more than one accepted spelling, so it can be spelled in several different ways. A spell...
  • Assamese Stopwords

    Stopwords are the most frequently occurring words in a given context. They play little role in information retrieval. As stopwords do not contribute any important...
  • Assamese Named Entities

    A list comprising 104,138 Assamese named entities was developed. The list includes NEs categorized as Organization (সদৌ অসম ছাত্ৰ সন্থা), Person...
  • Assamese Root Words

    This list comprises Assamese root words. The Assamese Root Word List contains 15,750 words. These Assamese NLP resources including the Tools and Applications are...
  • Assamese POS-Tagged Text

    Assamese POS tagger is a CRF++ based POS tagger. Raw text is given to this CRF++ based POS tagger to produce POS-tagged data. A standard POS tagset is used. These Assamese NLP...
  • Assamese Multi Word Expressions

    Multiword expressions are sequences of words, separated by a space (or any other delimiter), that together carry a unique meaning rather than the individual meanings of the words. A list comprising of...
You can also access this registry using the API (see API Docs).