Supplementary data of the manuscript --- Abstract ------
We present a new version of Diat.barcode (v12), a barcoding reference library dedicated to diatoms, which includes newly published sequences, is annotated with ecological and biological traits, and is curated by a panel of experts. We applied this library to about 320 diatom samples collected for river monitoring in two areas: one where the taxonomic coverage of the library was good (mainland France) and one where it was poor (French Guiana). We show that direct bioinformatic assignment of environmental sequences to traits is particularly valuable in French Guiana, where species knowledge is poor and the proportion of environmental sequences assigned to species (12.8%) is therefore much lower than the proportion assigned to traits (30%). Using co-correspondence analyses, we show that the species assignment and trait assignment datasets were significantly correlated in only 7 of 13 cases in French Guiana, whereas they were always significantly correlated in mainland France. We interpret this as a substantial loss of ecological information when relying on species assignment in French Guiana, a loss that is not observed in mainland France. For ecological studies in regions where taxonomic knowledge is poor, these results show the value of assigning environmental sequences directly to traits.