Replication data for: Modeling skin sensitization based on bone marrow-derived dendritic cells (BMDC) assay

This dataset was used to train a quantitative structure-activity relationship (QSAR) model that predicts skin sensitization as measured by the bone marrow-derived dendritic cells (BMDC) assay.

File: skin_sens_bmdc.sdf

The dataset is provided as one file, skin_sens_bmdc.sdf, in MDL SDF format (see for instance https://discover.3ds.com/sites/default/files/2020-08/biovia_ctfileformats_2020.pdf). Descriptions of the fields are given below:

SMILES (string) SMILES (Simplified Molecular Input Line Entry System) is a representation of a molecule in string format.

CAS number (string) CAS (Chemical Abstracts Service) Registry Number, a unique and unambiguous identifier of a molecule or a substance.

Compound_name (string) A common chemical name of a compound or a substance.

LLNA_potency (nominal) Skin sensitization potency according to LLNA assay categories. Levels are: NS (non-sensitizer), Weak, Moderate, Strong, Extreme.

LLNA_class (integer) Binary skin sensitization classification based on LLNA assay: 0 - non-sensitizer; 1 - sensitizer.

BMDC_class (integer) Binary skin sensitization classification based on BMDC assay: 0 - non-sensitizer; 1 - sensitizer.
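The field layout above can be read without a cheminformatics toolkit. The following is a minimal sketch, assuming the file follows the usual SDF conventions (data items introduced by a `> <FIELD>` header after the `M  END` line, records terminated by `$$$$`). The function name `read_sdf_fields` and the two placeholder records are hypothetical and are not taken from the dataset; the molecular structure block itself is ignored.

```python
from typing import Dict, List


def read_sdf_fields(text: str) -> List[Dict[str, str]]:
    """Return one dict of data fields per SDF record (structure block ignored)."""
    records = []
    for block in text.split("$$$$"):
        if not block.strip():
            continue  # skip the empty remainder after the last terminator
        fields: Dict[str, str] = {}
        name = None
        values: List[str] = []
        in_data = False
        for line in block.splitlines():
            if not in_data:
                # Data items start only after the end of the molblock.
                in_data = line.startswith("M  END")
                continue
            if line.startswith(">"):
                # Header such as ">  <BMDC_class>": keep the text between < and >.
                start, end = line.find("<"), line.rfind(">")
                name = line[start + 1:end] if 0 <= start < end else None
                values = []
            elif name is not None:
                if line.strip():
                    values.append(line)
                else:
                    # A blank line closes the current data item.
                    fields[name] = "\n".join(values)
                    name = None
        if name is not None:
            fields[name] = "\n".join(values)
        records.append(fields)
    return records


# Synthetic placeholder records (NOT real entries of skin_sens_bmdc.sdf).
SAMPLE_SDF = """placeholder_A


  0  0  0  0  0  0  0  0  0  0999 V2000
M  END
>  <SMILES>
CCO

>  <Compound_name>
placeholder_A

>  <LLNA_class>
0

>  <BMDC_class>
0

$$$$
placeholder_B


  0  0  0  0  0  0  0  0  0  0999 V2000
M  END
>  <SMILES>
CC=O

>  <Compound_name>
placeholder_B

>  <LLNA_class>
1

>  <BMDC_class>
1

$$$$
"""

records = read_sdf_fields(SAMPLE_SDF)
# The class fields are stored as text in SDF; convert them to integers.
n_bmdc_sensitizers = sum(int(r["BMDC_class"]) for r in records)
```

To apply the same function to the real file, its contents would be passed in as a string, for example `read_sdf_fields(open("skin_sens_bmdc.sdf").read())`. A toolkit such as RDKit is the better choice when the structures themselves are needed.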

File: All_Sensitization_Labels.sdf

The file All_Sensitization_Labels.sdf contains all compounds with the sensitization labels interpreted from different end points. For logical variables, 0 means "non-sensitizing" and 1 means "sensitizing".

Compound_name (string) A common chemical name of a compound or a substance.

LLNA_Call_ICE (string) LLNA call from the ICE database.

LLNA_pEC3_ICE (float) LLNA pEC3 from the ICE database.

Class_LLNA (logical, 0 or 1) Sensitization class for LLNA.

Class_LuSens_ICE (logical, 0 or 1) Sensitization class for LuSens.

Class_U-SENS_ICE (logical, 0 or 1) Sensitization class for U-SENS.

Class_hCLAT_ICE (logical, 0 or 1) Sensitization class for hCLAT.

Class_mMUSST_ICE (logical, 0 or 1) Sensitization class for mMUSST.

Class_DPRA_ICE (logical, 0 or 1) Sensitization class for DPRA.

Class_KeratinoSens_ICE (logical, 0 or 1) Sensitization class for KeratinoSens.

Class_BMDC (logical, 0 or 1) Sensitization class for BMDC.

Prediction_PredSkin (logical, 0 or 1) Sensitization class predicted by the PredSkin model.

Class_Human_PredSkin (logical, 0 or 1) Sensitization class for human from the PredSkin dataset.

Class_LLNA_PredSkin (logical, 0 or 1) Sensitization class for LLNA from the PredSkin dataset.

Confidence_PredSkin (nominal, "Low", "Medium", "High") Confidence of the sensitization class predicted by the PredSkin model. All entries have a confidence of "High".

CAS (string) Chemical Abstracts Service Registry Number of the molecule or substance.

CAS_Tropsha (string) Chemical Abstracts Service Registry Number of the molecule or substance according to the PredSkin training set.

File: data_preparation_and_analysis.knwf

The file data_preparation_and_analysis.knwf is a workflow for the KNIME software version 4.6.5 and later (https://www.knime.com/). The workflow was developed and used for pairwise comparison of different skin sensitization assay labels. It is shipped with the raw data.
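As an illustration of what a pairwise comparison of assay labels involves, here is a minimal sketch. It is not the KNIME workflow itself. It assumes the labels have already been loaded as text values "0" or "1", with a missing label represented by an empty string or an absent key. The function name `compare_labels` and the sample rows are hypothetical and are not taken from the dataset.

```python
from collections import Counter
from itertools import combinations


def compare_labels(rows, col_a, col_b):
    """Cross-tabulate two binary label columns over rows where both are present."""
    counts = Counter()
    for row in rows:
        a, b = row.get(col_a, ""), row.get(col_b, "")
        if a in ("0", "1") and b in ("0", "1"):
            counts[(int(a), int(b))] += 1
    n = sum(counts.values())
    agree = counts[(0, 0)] + counts[(1, 1)]
    return {
        "n": n,                                  # compounds labeled by both assays
        "counts": dict(counts),                  # (label_a, label_b) -> count
        "agreement": agree / n if n else None,   # fraction of concordant labels
    }


# Synthetic placeholder rows (NOT real entries of All_Sensitization_Labels.sdf).
rows = [
    {"Class_LLNA": "1", "Class_BMDC": "1", "Class_DPRA_ICE": "1"},
    {"Class_LLNA": "0", "Class_BMDC": "0", "Class_DPRA_ICE": ""},
    {"Class_LLNA": "1", "Class_BMDC": "0", "Class_DPRA_ICE": "1"},
    {"Class_LLNA": "0", "Class_BMDC": "0"},
]

columns = ["Class_LLNA", "Class_BMDC", "Class_DPRA_ICE"]
results = {
    (a, b): compare_labels(rows, a, b) for a, b in combinations(columns, 2)
}
for (a, b), res in results.items():
    print(a, "vs", b, res)
```

Restricting each comparison to compounds labeled by both assays matters here because the endpoints cover different subsets of compounds, so the number of shared compounds `n` should be reported alongside any agreement figure.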

Updates: 22/02/2024 - 4-acetoxybenzoic acid LLNA label 1 (old) changed to 0 (new).

Identifier
DOI https://doi.org/10.57745/PPAMKY
Metadata Access https://entrepot.recherche.data.gouv.fr/oai?verb=GetRecord&metadataPrefix=oai_datacite&identifier=doi:10.57745/PPAMKY
Provenance
Creator CHEDIK, Lisa; BAYBEKOV, Shamkhal; MARCOU, Gilles; COSNIER, Frédéric (ORCID: 0000-0002-8596-927X); MOUROT-BOUSQUENAUD, Mélanie; JACQUENET, Sandrine; VARNEK, Alexandre (ORCID: 0000-0003-1886-925X); BATTAIS, Fabrice
Publisher Recherche Data Gouv
Contributor Marcou, Gilles; VARNEK, Alexandre; Université de Strasbourg; Centre national de la recherche scientifique; Institut national de recherche et de sécurité pour la prévention des accidents du travail et des maladies professionnelles; Entrepôt-Catalogue Recherche Data Gouv
Publication Year 2023
Rights etalab 2.0; info:eu-repo/semantics/openAccess; https://spdx.org/licenses/etalab-2.0.html
OpenAccess true
Contact Marcou, Gilles (UMR 7140 CNRS, University of Strasbourg); VARNEK, Alexandre (Laboratory of Chemoinformatics, UMR 7140; University of Strasbourg, CNRS; France)
Representation
Resource Type Dataset
Format application/octet-stream; text/plain
Size 179360; 36488433; 4355; 170050
Version 2.0
Discipline Chemistry; Natural Sciences