Expression and gene network datasets for benchmarking phenotype prediction

Companion simulated dataset for the preprint "Should we really use graph neural networks for transcriptomic prediction?", together with data derived from the following datasets:

BreastCancer (from [1]; available at http://mypathsem.bioinf.med.uni-goettingen.de/resources/glrp [accessed 2022-09-27])
CancerType (from [2]; available at https://drive.google.com/drive/folders/1_Cnvab7mIwCrNJyY-J4aR2ck9i72KH8t?usp=sharing [accessed 2022-10-13])
F1000 (from [3]; available from the authors)

Software: R 4.2.0; Julia 1.7.3

Scripts used to generate and preprocess the dataset are also included in the repository. This dataset is associated with the script repository https://forgemia.inra.fr/nathalie.villa-vialaneix/gnn.git.

Identifier
DOI https://doi.org/10.57745/BZ0TTC
Metadata Access https://entrepot.recherche.data.gouv.fr/oai?verb=GetRecord&metadataPrefix=oai_datacite&identifier=doi:10.57745/BZ0TTC
Provenance
Creator Brouard, Céline; Mourad, Raphaël; Vialaneix, Nathalie
Publisher Recherche Data Gouv
Contributor Vialaneix, Nathalie; Institut national de recherche pour l'agriculture, l'alimentation et l'environnement (INRAE); Entrepôt-Catalogue Recherche Data Gouv
Publication Year 2023
Funding Reference INRAE: MathNum division Délégation Raphaël Mourad
Rights etalab 2.0; info:eu-repo/semantics/openAccess; https://spdx.org/licenses/etalab-2.0.html
OpenAccess true
Contact Vialaneix, Nathalie (Université de Toulouse, INRAE, UR MIAT, 31326 Castanet-Tolosan, France)
Representation
Resource Type Dataset
Format text/html; text/tsv; application/zip; text/comma-separated-values; text/tab-separated-values; application/x-xz
Size 1602405; 2324682; 1104821; 1416917; 4908856; 60903376; 884289; 1218; 1122; 5890; 22996392; 12422363; 494815672; 3216; 290; 40000; 36121; 32661; 14616; 7963; 7606
Version 3.0
Discipline Mathematics; Life Sciences; Medicine