CosMoPoly tangible chemical space

This deposit contains the tangible chemical space of the COSMOPOLY project (ANR APG 2022). The chemical library enumerates biobased polyfunctional molecules intended for cosmetics, built from a library of building blocks available from agro-industries, biorefineries and chemical companies. The building-block library contains carbohydrates, fatty acids, organic acids, alcohols, polyols, phenolic acids, …, all presumably accessible at industrial scale.

The "cosmopoly_library.csv" file contains systematically enumerated molecular structures forming the CosMoPoly library. Each row in the file represents a single molecule, with data organised into four comma-separated columns. The structure of the file is outlined below:

File Structure:

First Column ("Product")

This column contains the molecular structure represented in SMILES notation.

Second ("bb1") and Third ("bb2") Columns

These columns contain the names of the building blocks used to generate the compound listed in the first column ("Product").

Fourth Column ("bb3")

If the molecule was generated via acetalization followed by esterification, this column contains the name of the third building block (bb3).
If the molecule was generated using only two building blocks, this column states "None".
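
As an illustration of the column layout described above, the following minimal Python sketch reads rows in this format and reports how many building blocks each product used. It assumes that the file starts with a header line naming the four columns ("Product", "bb1", "bb2", "bb3"); the sample rows, their SMILES strings and the helper names read_rows and n_building_blocks are hypothetical and are not taken from cosmopoly_library.csv.

```python
import csv
import io

# Hypothetical sample in the layout described above; the SMILES and names
# are illustrative only and are not copied from cosmopoly_library.csv.
SAMPLE = """Product,bb1,bb2,bb3
CCCCCCCC(=O)OCC(O)CO,octanoic acid,glycerol,None
CC(=O)OCC1COC(C)O1,glycerol,acetaldehyde,acetic acid
"""


def read_rows(handle):
    """Return one dict per data row, keyed by the header names."""
    # Assumption: the first line is a header naming the four columns.
    return list(csv.DictReader(handle))


def n_building_blocks(row):
    """Return 2 when bb3 is the literal string "None", otherwise 3."""
    return 2 if row["bb3"].strip() == "None" else 3


rows = read_rows(io.StringIO(SAMPLE))
counts = [n_building_blocks(row) for row in rows]
print(counts)
```

To try this on the deposited file, the in-memory sample could be replaced by open("cosmopoly_library.csv", newline=""). The rows at the end of the file that describe the building blocks themselves are not products and would need to be set aside before counting.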

Additional Information:

The last 49 rows of the file contain the SMILES representations of the building blocks themselves. The second column ("bb1") in these rows lists their corresponding names. When a building block is ambiguous because its stereochemistry is undefined, it is described as a list of the matching chemical names separated by semicolons.
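
The separation between product rows and building-block rows could be sketched in Python as follows. This is only an illustration under stated assumptions: the file is assumed to have a header line with the four column names, the content of "bb2" and "bb3" in building-block rows is not specified in this description (the sample leaves them empty), and the sample rows and the helper names split_library and block_names are hypothetical.

```python
import csv
import io

N_BUILDING_BLOCKS = 49  # number of trailing building-block rows stated above

# Hypothetical sample: one product row followed by two building-block rows.
# The empty bb2/bb3 fields in the last two rows are an assumption.
SAMPLE = """Product,bb1,bb2,bb3
CCCCCCCC(=O)OCC(O)CO,octanoic acid,glycerol,None
CCCCCCCC(=O)O,octanoic acid,,
CC(O)C(=O)O,L-lactic acid;D-lactic acid,,
"""


def split_library(rows, n_blocks=N_BUILDING_BLOCKS):
    """Split rows into (products, building_blocks) using the trailing n_blocks rows."""
    if n_blocks <= 0:
        return list(rows), []
    return list(rows[:-n_blocks]), list(rows[-n_blocks:])


def block_names(row):
    """Return every name listed in bb1, splitting ambiguous entries on semicolons."""
    return [name.strip() for name in row["bb1"].split(";") if name.strip()]


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
# The sample holds only two building-block rows, so n_blocks is overridden here.
products, blocks = split_library(rows, n_blocks=2)
names_by_smiles = {row["Product"]: block_names(row) for row in blocks}
print(len(products), names_by_smiles)
```

With the deposited file, the default n_blocks of 49 would apply. Keeping the names as a list preserves the ambiguity of building blocks with undefined stereochemistry instead of picking one name arbitrarily.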

This dataset illustrates the manuscript "Spherical GTM: a new proposition for visualisation of chemical data", to be considered for publication in Molecular Informatics.

Identifier
DOI https://doi.org/10.57745/RAHSKA
Metadata Access https://entrepot.recherche.data.gouv.fr/oai?verb=GetRecord&metadataPrefix=oai_datacite&identifier=doi:10.57745/RAHSKA
Provenance
Creator ASGARKHANOVA, Farah; MARCOU, Gilles; VOLKOV, Mikhail; MUZARD, Murielle; PLANTIER-ROYON, Richard; RÉMOND, Caroline; HORVATH, Dragos; VARNEK, Alexandre (ORCID: 0000-0003-1886-925X)
Publisher Recherche Data Gouv
Contributor MARCOU, Gilles; KLIMCHUK, Olga; Université de Strasbourg; Centre national de la recherche scientifique; Université de Reims Champagne-Ardenne; Institut National De Recherche Pour L’agriculture, L’alimentation Et L’environnement; Entrepôt-Catalogue Recherche Data Gouv
Publication Year 2025
Funding Reference Agence nationale de la recherche ANR-22-CE43-0005
Rights etalab 2.0; info:eu-repo/semantics/openAccess; https://spdx.org/licenses/etalab-2.0.html
OpenAccess true
Contact MARCOU, Gilles (UMR7140 CNRS, University of Strasbourg); KLIMCHUK, Olga (Laboratory of Chemoinformatics, UMR 7140 ; University of Strasbourg, CNRS ; France)
Representation
Resource Type Dataset
Format text/comma-separated-values; text/plain
Size 2578483 bytes; 1099 bytes
Version 1.1
Discipline Chemistry; Natural Sciences