6 datasets found

Keywords: embeddings

  • PaECTER embeddings for Patstat 2023 Fall (USPTO and EPO)

    PaECTER embeddings for all patent families as of Patstat 2023 Fall release. File "PaECTER_EPO_patstat-2023-fall.parquet" contains embeddings for all patent families with an EPO...
  • Pat-SPECTER embeddings for Patstat 2023 Fall (USPTO and EPO)

    Pat-SPECTER embeddings for all patent families as of Patstat 2023 Fall release. File "Pat-SPECTER_EPO_patstat-2023-fall.parquet" contains embeddings for all patent families with...
  • KGR10 FastText Polish word embeddings

    Distributional language model (both textual and binary) for Polish (word embeddings) trained on the KGR10 corpus (over 4 billion words) using FastText with the following variants...
  • KGR10-RoBERTa

    Polish RoBERTa model pre-trained on KGR10 corpora.
  • LitLat BERT

    Trilingual BERT-like (Bidirectional Encoder Representations from Transformers) model, trained on Lithuanian, Latvian, and English data. State-of-the-art tool representing...
  • Lithuanian Word embeddings

    GloVe-type word vectors (embeddings) for Lithuanian. The Delfi.lt corpus (~70 million words) and StanfordNLP were used for training. The training consisted of several stages: 1)...
You can also access this registry using the API (see API Docs).