Reads and pairwise distances from 10 samples of diatoms in Geneva lake

This dataset contains 55 HDF5 files derived from 10 samples of benthic diatoms collected at monthly intervals at a single location in Lake Geneva (close to UMR Carrtel, on the shore of the lake). For each sample, DNA was extracted, a 312 bp marker within the rbcL gene was amplified, and the amplicons were sequenced. All pairwise distances between reads were then computed from Smith-Waterman local alignment scores, both within and between samples. Each of the resulting 55 HDF5 files contains the following h5 datasets:

sequence identifiers (seqid): one h5 dataset if within a sample, two if between samples

sequences (word): one h5 dataset if within a sample, two if between samples

pairwise distances between sequences (h5 dataset distances).
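The layout listed above can be checked on a downloaded file before loading any large array. The following is a minimal sketch, assuming the `h5py` package is installed; the file name `L1.h5` in the usage line is hypothetical, and the function makes no assumption about dataset names: it simply reports whatever datasets the file contains.

```python
import sys

import h5py


def list_datasets(path):
    """Return {dataset name: (shape, dtype)} for every dataset in an HDF5 file."""
    found = {}

    def visit(name, obj):
        # visititems walks groups and datasets; keep datasets only.
        if isinstance(obj, h5py.Dataset):
            found[name] = (obj.shape, str(obj.dtype))

    with h5py.File(path, "r") as f:
        f.visititems(visit)
    return found


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python list_datasets.py L1.h5
    for name, (shape, dtype) in list_datasets(sys.argv[1]).items():
        print(name, shape, dtype)
```

Only metadata (shape and dtype) is read here, which matters because individual files in this dataset are several gigabytes in size.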

Pairwise distances were computed under DARI project i2015037360 (8 million hours, 2016, granted to AF) at IDRIS, on the Turing and Ada machines.
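For readers unfamiliar with the score from which the distances are derived, here is a minimal sketch of a Smith-Waterman local alignment score. The scoring parameters (`match`, `mismatch`, `gap`) and the linear gap penalty are illustrative assumptions, not the values used to build this dataset, and the conversion from score to distance is not described in this record, so it is not shown.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score of strings a and b (linear gap penalty).

    The default parameters are illustrative assumptions only.
    """
    prev = [0] * (len(b) + 1)  # previous row of the dynamic programming table
    best = 0
    for ca in a:
        curr = [0]  # first column is always 0 for local alignment
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            up = prev[j] + gap
            left = curr[j - 1] + gap
            cell = max(0, diag, up, left)  # a score never drops below 0
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


print(smith_waterman_score("ACGTACGT", "GTAC"))  # 8: "GTAC" matches exactly
```

Only two rows of the table are kept in memory, which is enough when the score alone is needed and the alignment itself is not reconstructed.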

As there are 10 samples, there are 10 files of within-sample distances and 45 files (n(n-1)/2 with n=10) of between-sample distances, 55 HDF5 files in total. Within-sample files are labeled L1 to L10, and between-sample files Lx_Ly with x < y. (Note that the files are listed in lexicographic order of their names.)
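The count and the labeling scheme can be reproduced with a short sketch. The function name `expected_labels` is hypothetical, and the labels are generated without any file extension, since the extension is not stated here; the order in the repository listing may therefore differ slightly from a plain string sort.

```python
from itertools import combinations


def expected_labels(n=10):
    """Return (within, between) file labels for n samples."""
    within = [f"L{i}" for i in range(1, n + 1)]
    # combinations yields each pair (x, y) once, with x < y
    between = [f"L{x}_L{y}" for x, y in combinations(range(1, n + 1), 2)]
    return within, between


within, between = expected_labels()
labels = sorted(within + between)  # plain lexicographic (code point) order

print(len(within), len(between), len(labels))  # 10 45 55
print(labels[:4])  # ['L1', 'L10', 'L1_L10', 'L1_L2']
```

Note that lexicographic order places L10 before L2, so the listing does not follow the numeric order of the samples.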

Identifier
DOI https://doi.org/10.57745/NKTRHO
Metadata Access https://entrepot.recherche.data.gouv.fr/oai?verb=GetRecord&metadataPrefix=oai_datacite&identifier=doi:10.57745/NKTRHO
Provenance
Creator Franc, Alain; Frigerio, Jean-Marc; Chancerel, Emilie; Salin, Franck; Thérond, Sylvie; Rimet, Frédéric; Bouchez, Agnès
Publisher Recherche Data Gouv
Contributor Franc, Alain; Entrepôt-Catalogue Recherche Data Gouv
Publication Year 2023
Rights etalab 2.0; info:eu-repo/semantics/openAccess; https://spdx.org/licenses/etalab-2.0.html
OpenAccess true
Contact Franc, Alain (INRAE & INRIA)
Representation
Resource Type Dataset
Format application/x-h5
Size 34064771769; 6451276062; 13740080192; 8328514151; 6244196587; 11816045616; 6518511010; 8465774138; 10538890165; 9343714795; 6567849372; 12257260159; 18883962560; 9075440582; 16345073033; 9470451915; 12069747016; 15093474001; 12808636293; 9442418377; 6787363699; 14165720766; 12148294868; 6712828112; 8573001119; 10722367940; 9592607225; 6776740556; 24091615034; 26684611512; 13321554348; 17010031107; 21304021038; 19098836515; 13395477425; 7345805066; 14827907665; 8911064337; 11166441865; 10054193740; 7064876653; 12627394166; 19484821099; 14916125445; 13501194535; 9953902108; 19550582619; 24419279083; 17723625894; 12471159851; 16139491679; 22600364618; 11382806565; 7970868984; 15792256939
Version 1.1
Discipline Geosciences; Earth and Environmental Science; Environmental Research; Natural Sciences