Dataset - B2FIND

Adolescents Mental Health and Cognitive Ability

The data set includes data on the mental health status and cognitive ability of Palestinian children living in a politically violent environment. Schoolchildren Mental Health and...

Digital soil mapping predicted on mid-infrared (MIR) spectroscopy measurement...

Soil information is valuable for many disciplines (e.g. agriculture, geomorphology, geology, archaeology) and can be used to produce maps or statistics on soil productivity. As...

Soil properties predicted on mid-infrared (MIR) spectroscopy measurements in ...

Soil information is valuable for many disciplines (e.g. agriculture, geomorphology, geology, archaeology) and can be used to produce maps or statistics on soil productivity. As...

MAAT, MAP and pH in soils and peats

Branched glycerol dialkyl glycerol tetraethers (brGDGTs) are a family of bacterial lipids which have emerged over time as robust temperature and pH paleoproxies in continental...

Compilation of Branched GDGT data from globally distributed altitudinal trans...

Branched glycerol dialkyl glycerol tetraethers (brGDGTs) are a family of bacterial lipids which have emerged over time as robust temperature and pH paleoproxies in continental...

Electrolyzers-HSI: Close-Range Multi-Scene Hyperspectral Imaging Benchmark Da...

Electrolyzers-HSI Dataset Description: The Electrolyzers-HSI dataset is a multiscene RGB-Hyperspectral benchmark dataset comprising 55 scenes of shredded electrolyzer samples....

The manifest and store data of 870,515 Android mobile applications

We built a crawler to collect data from the Google Play store including the application's metadata and APK files. The manifest files were extracted from the APK files and then...

Dataset for systematic review on risk of bias of studies on prediction models...

This record includes the dataset collected for the systematic review titled “Risk of bias in studies on prediction models developed using supervised machine learning techniques:...

Python functions -- cross-validation methods from a data-driven perspective

These are the organized Python functions for the methods proposed in Yanwen Wang's PhD research. Researchers can directly use these functions to conduct spatial+ cross-validation,...

Spatial+ Cross-Validation (SP-CV) experiments datasets and codes

This deposit includes all datasets and code for implementing the spatial+ cross-validation experiments. In addition to the datasets and code, Reademe.txt explains each file's meaning,...

Mapping tick dynamics and tick bite risk using data-driven approaches and vol...

This deposit contains the materials used during the development of this PhD thesis. During this research, we applied machine learning methods to obtain new insights about tick...

WMT18 Quality Estimation Shared Task Training and Development Data

Training and development data for the WMT18 QE task. Test data will be published as a separate item. This shared task will build on its previous six editions to further examine...

WMT16 Quality Estimation Shared Task Training and Development Data

Training and development data for the WMT16 QE task. Test data will be published as a separate item. This shared task will build on its previous four editions to further examine...

WMT17 Quality Estimation Shared Task Training and Development Data

Training and development data for the WMT17 QE task. Test data will be published as a separate item. This shared task will build on its previous five editions to further examine...

WMT16 APE Shared Task Data

Training, development and test data (the same used for the Sentence-level Quality Estimation task) consist of English-German triplets (source, target and post-edit) belonging to...

WMT16 APE Shared Task Data - Reference sentences

Training, development and test data consist of German sentences belonging to the IT domain and already tokenized. These sentences are the references of the data released for the...

Corpus of contemporary blogs

At the NLP Centre, text is currently divided into sentences by a tool that uses a rule-based system. In order to produce enough training data for machine learning, annotators...

SnakeCLEF 2021

The dataset contains 409,679 images belonging to 772 snake species from 188 countries and all continents (386,006 images with labels targeted for development and 23,673 images...

WMT17 Quality Estimation Shared Test Data

Test data for the WMT17 QE task. Train data can be downloaded from http://hdl.handle.net/11372/LRT-1974. This shared task will build on its previous five editions to further...

WMT18 Quality Estimation Shared Task Test Data

Test data for the WMT18 QE task. Train data can be downloaded from http://hdl.handle.net/11372/LRT-2619. This shared task will build on its previous six editions to further...

308 datasets found