This deposit contains the tangible chemical space of the COSMOPOLY project (ANR APG 2022). This chemical library enumerates biobased polyfunctional molecules (intended for cosmetics) obtained from a library of building blocks available from agro-industries, biorefineries and chemical companies. The building-block library contains carbohydrates, fatty acids, organic acids, alcohols, polyols, phenolic acids, … all presumably accessible at industrial scale.
The "cosmopoly_library.csv" file contains systematically enumerated molecular structures forming the CosMoPoly library. Each row in the file represents a single molecule, with data organised into four comma-separated columns. The structure of the file is outlined below:
File Structure:
First Column ("Product")
This column contains the molecular structure represented in SMILES notation.
Second ("bb1") and Third ("bb2") Columns
These columns contain the names of the building blocks used to generate the compound listed in the first column ("Product").
Fourth Column ("bb3")
If the molecule was generated via acetalization followed by esterification, this column contains the name of the third building block (bb3).
If the molecule was generated using only two building blocks, this column states "None".
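The column layout above can be read with Python's standard csv module. The following is only a minimal sketch: it assumes the file starts with a header row carrying the column names "Product", "bb1", "bb2" and "bb3", and it runs on two invented rows (the SMILES strings and building-block names are illustrative, not taken from the library). The helper name read_library and the added "n_blocks" field are hypothetical.

```python
import csv
import io

# Invented sample rows, for illustration only; the real data live in
# "cosmopoly_library.csv". A header row is assumed here.
SAMPLE = """Product,bb1,bb2,bb3
CCCCCCCC(=O)OCC(O)CO,glycerol,octanoic acid,None
CC(=O)OCC1COC(C)(C)O1,glycerol,acetone,acetic acid
"""


def read_library(handle):
    """Yield one dict per row, adding the number of building blocks used."""
    for row in csv.DictReader(handle):
        # bb3 holds the literal string "None" for two-component products.
        row["n_blocks"] = 2 if row["bb3"] == "None" else 3
        yield row


rows = list(read_library(io.StringIO(SAMPLE)))
for row in rows:
    print(row["Product"], row["bb1"], row["bb2"], row["bb3"], row["n_blocks"])
```

To read the deposited file instead of the sample, the same function could be called on open("cosmopoly_library.csv", newline=""). The csv module is used here because it keeps "None" as plain text; some table readers may convert that string into a missing value by default.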
Additional Information:
The last 49 rows of the file contain the SMILES representations of the building blocks themselves.
The second column ("bb1") in these rows lists their corresponding names.
When a building block is ambiguous because of its undefined stereochemistry, it is described as a list of the matching chemical names separated by semicolons.
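As an illustration of the layout described above, the sketch below separates the trailing building-block rows from the enumerated products and builds a name-to-SMILES lookup in which each semicolon-separated name points to the same structure. It is a hedged example, not the authors' tooling: the sample rows are invented, a header row is assumed, the content of "bb2" and "bb3" in building-block rows is an assumption, and the function names split_library and names_to_smiles are hypothetical. For the deposited file the default of 49 building-block rows would apply.

```python
import csv
import io

# Invented miniature file: two products followed by three building-block rows.
# What "bb2" and "bb3" contain in building-block rows is an assumption.
SAMPLE = """Product,bb1,bb2,bb3
CCCCCCCC(=O)OCC(O)CO,glycerol,octanoic acid,None
CC(O)C(=O)OCC(O)CO,glycerol,D-lactic acid;L-lactic acid,None
OCC(O)CO,glycerol,None,None
CCCCCCCC(=O)O,octanoic acid,None,None
CC(O)C(=O)O,D-lactic acid;L-lactic acid,None,None
"""


def split_library(rows, n_building_blocks=49):
    """Return (product rows, building-block rows); the latter end the file."""
    cut = len(rows) - n_building_blocks
    return rows[:cut], rows[cut:]


def names_to_smiles(block_rows):
    """Map every building-block name to its SMILES string.

    A stereochemically ambiguous block lists several names separated by
    semicolons; each of those names is mapped to the same SMILES.
    """
    lookup = {}
    for row in block_rows:
        for name in row["bb1"].split(";"):
            lookup[name.strip()] = row["Product"]
    return lookup


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
products, blocks = split_library(rows, n_building_blocks=3)
lookup = names_to_smiles(blocks)
print(len(products), "products,", len(blocks), "building blocks")
print(lookup["L-lactic acid"])
```

Note that the "bb1" and "bb2" fields of a product row keep the full semicolon-separated list, so they would need the same split before being looked up by individual name.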
This deposit is used to illustrate the manuscript "Spherical GTM: a new proposition for visualisation of chemical data", to be considered for publication in Molecular Informatics.