Amplicon, cell count and biovolume, and metabolomic data, and the predicted protein database used for metaproteomic analyses of algal-dominated Greenland Ice Sheet samples

The data published here comprise several datasets used in the publication "Algal (meta)proteomes uncover cellular adaptations to life on the Greenland Ice Sheet" by Feord et al. (submitted). This data publication presents four datasets: i) amplicon sequencing data (16S and 18S), ii) cell counts and biovolumes of algal morphotypes quantified with a FlowCam, iii) raw and normalized metabolomic data (quantified with LC-MS and GC-MS), and iv) a file containing a predicted protein database. The protein data used in Feord et al. (submitted) are available from the ProteomeXchange Consortium (http://proteomecentral.proteomexchange.org) via the PRIDE partner repository with the dataset identifier PXD057047 (username: reviewer_pxd057047@ebi.ac.uk and password: kwg7a3NHfhwg).

All data except dataset iv originate from samples collected on the Greenland Ice Sheet in the summer of 2021 during the DEEP PURPLE ERC ice camp (GR21). The field location (61°05’ N, 46°50’ W) is described in Feord et al. (submitted). Datasets i-iii are three different analyses of the same two samples: one snow sample collected on 24 July 2021 and one ice sample collected on 7 August 2021. Both samples were high in algal biomass: the snow sample was visibly red due to pigment-rich snow algae, and the ice sample was visibly purple/brown due to pigment-rich glacier ice algae. All collection, extraction, and analysis methods are described and referenced in Feord et al. (submitted).

The analyses and the replication for each sample are as follows:

i. Amplicon sequencing (for both 18S and 16S sequencing): SNOW: one biological replicate = one sequencing reaction, and ICE: three biological replicates (labelled a, b, c) = three sequencing reactions. Raw sequencing data are provided as fastq.gz files and abundance tables as .txt files.

ii. Cell counts and biovolume with FlowCam: SNOW: one biological replicate measured in technical triplicate (labelled 1, 2, 3) = three measurements, and ICE: three biological replicates (labelled a, b, c) measured in technical triplicate (labelled 1, 2, 3) = nine measurements. Data are provided as .txt files and .png files.

iii. Metabolomic analyses: SNOW: five biological replicates (labelled red_RS1-5) measured in three or four technical replicates each (labelled F1-F4) = 19 measurements, and ICE: three biological replicates (labelled GIA_RS1-3) measured in technical triplicate (labelled F1-F3) = nine measurements. Raw data are provided as .mzML files; processed data and tables with sample explanation files are provided as .txt files.

iv. Predicted protein database: a FASTA file (.fa) containing the predicted protein database used to identify proteins from peptide data in Feord et al. (submitted). The database was built by translating open reading frames (ORFs) assembled from previously sequenced polyA-isolated metatranscriptomes from Greenland Ice Sheet samples published by Perini et al. (2024), using the samples MG3, MG5, MG6, MG7, MG8, MG11, MG12, MG14, MG19, MG22, MG23, MG24, MG25, MG26, MG27, MG28, MG30, and MG31 from that paper. Assembly, identification of ORFs, and dereplication are described by Feord et al. (submitted).
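
The replication counts stated for datasets i-iii can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not part of the published data or methods. The names DESIGN, n_measurements, and split_three_or_four are hypothetical, and the amplicon entries assume one sequencing reaction per biological replicate, as the description implies.

```python
# Replication design as stated in the dataset description:
# (dataset, sample) -> (biological replicates, technical replicates per biological replicate)
DESIGN = {
    ("amplicon", "SNOW"): (1, 1),      # one sequencing reaction
    ("amplicon", "ICE"): (3, 1),       # replicates a, b, c
    ("flowcam", "SNOW"): (1, 3),       # measurements 1, 2, 3
    ("flowcam", "ICE"): (3, 3),        # a, b, c x 1, 2, 3
    ("metabolomics", "ICE"): (3, 3),   # GIA_RS1-3 x F1-F3
}


def n_measurements(biological, technical):
    """Total measurements for a fully crossed design."""
    return biological * technical


counts = {key: n_measurements(*value) for key, value in DESIGN.items()}


def split_three_or_four(n_biological, total):
    """Return every (n_with_four, n_with_three) split that sums to `total`."""
    return [
        (k, n_biological - k)
        for k in range(n_biological + 1)
        if 4 * k + 3 * (n_biological - k) == total
    ]


# Metabolomics SNOW: five biological replicates with three or four
# technical replicates each, 19 measurements in total.
snow_split = split_three_or_four(5, 19)

print(counts)
print(snow_split)
```

The only split consistent with 19 snow measurements is four biological replicates with four technical replicates and one with three. The description does not say which replicate has three, so the sample explanation files should be consulted for that detail.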

Identifier
DOI https://doi.org/10.5880/GFZ.3.5.2024.003
Related Identifier IsSupplementTo https://doi.org/10.1038/s41522-025-00770-2
Related Identifier Cites https://doi.org/10.1007/s11306-024-02147-6
Related Identifier Cites https://doi.org/10.1186/s40168-024-01796-y
Metadata Access http://doidb.wdc-terra.org/oaip/oai?verb=GetRecord&metadataPrefix=oai_datacite&identifier=oai:doidb.wdc-terra.org:8555
Provenance
Creator Feord, Helen; Keuschnig, Christoph; Peter, Elisa K.; Jaeger, Carsten; Lisec, Jan; Mourot, Rey; Peters, Ravi Sven; Benning, Liane G.
Publisher GFZ Data Services
Contributor Feord, Helen
Publication Year 2025
Funding Reference European Research Council (Crossref Funder ID: http://dx.doi.org/10.13039/501100000781); award 856416, DEEP PURPLE
Rights CC BY 4.0; http://creativecommons.org/licenses/by/4.0/
OpenAccess true
Contact Feord, Helen (GFZ Helmholtz Centre for Geosciences, Potsdam, Germany)
Representation
Resource Type Dataset
Discipline Biology; Life Sciences
Spatial Coverage Study area on the Greenland Ice Sheet
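
The Metadata Access link in this record is an OAI-PMH GetRecord request. As a minimal sketch, the URL can be assembled from its parameters with Python's standard library. The function name build_getrecord_url is hypothetical, the base URL and identifier are copied from the record, and no network request is made here.

```python
from urllib.parse import urlencode

# Base URL of the OAI-PMH endpoint, copied from the Metadata Access entry.
BASE_URL = "http://doidb.wdc-terra.org/oaip/oai"


def build_getrecord_url(identifier, metadata_prefix="oai_datacite"):
    """Compose an OAI-PMH GetRecord URL for one record identifier."""
    params = {
        "verb": "GetRecord",
        "metadataPrefix": metadata_prefix,
        "identifier": identifier,
    }
    # Keep ':' unescaped so the result matches the link as printed in the record.
    return BASE_URL + "?" + urlencode(params, safe=":")


url = build_getrecord_url("oai:doidb.wdc-terra.org:8555")
print(url)
```

The response to such a request would be an XML document in the requested metadata format, which any XML parser could then read.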