An accurate assignment test for extremely low-coverage whole-genome sequence data

Genomic assignment tests can reveal diagnostic biological characteristics, such as population of origin or ecotype. In ancient DNA research, such characteristics can provide further information on population continuity, evolution, climate change, species migration, or trade, depending on archaeological context. Yet assignment tests often rely on moderate- to high-coverage sequence data. Such data can be difficult to obtain for many ancient specimens, and also in ecological studies, which often use sequencing techniques such as ddRAD to bypass the need for costly whole-genome sequencing. We have developed a novel approach that efficiently assigns biologically relevant information (such as population identity or structural variants) in extremely low-coverage sequence data. First, we generate databases from existing reference data using a subset of diagnostic SNPs associated with a biological characteristic. Low-coverage alignment files from ancient specimens are then compared to these databases to ascertain allelic state, yielding a joint probability for each association. To assess the efficacy of this approach, we assigned inversion haplotypes and population identity in several species, including Heliconius butterflies, Atlantic herring, and Atlantic cod. We used both modern and ancient specimens, including the first whole-genome sequence data recovered from ancient herring bones. The method accurately assigns biological characteristics, including population membership, from extremely low-coverage data (e.g. 0.0001x) using genome-wide SNPs. This approach will therefore increase the number of ancient samples in ecological and bioarchaeological research for which relevant biological information can be obtained.
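
The two steps described in the abstract (a database of diagnostic SNPs, then a joint probability from the alleles observed in sparse reads) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function name `assign`, the database layout, the uniform prior over groups, and the fixed per-base `error_rate` are all assumptions made for illustration, and the sketch assumes every group has allele frequencies (summing to 1) at every SNP in the database.

```python
import math


def assign(observations, database, error_rate=0.01):
    """Return a normalised probability per group (hypothetical helper).

    observations: list of (snp_id, observed_base) pairs taken from reads.
    database: dict mapping snp_id -> {group: {base: allele_frequency}}.
    error_rate: assumed chance that a read base is wrong (illustrative).
    """
    log_lik = {}
    for snp_id, base in observations:
        if snp_id not in database:
            continue  # the read does not cover a diagnostic SNP
        for group, freqs in database[snp_id].items():
            freq = freqs.get(base, 0.0)
            # Probability of seeing this base: the true allele read
            # correctly, or another allele misread as this base.
            p = (1.0 - error_rate) * freq + (1.0 - freq) * error_rate / 3.0
            log_lik[group] = log_lik.get(group, 0.0) + math.log(p)
    if not log_lik:
        return {}  # no informative sites were covered
    # Normalise in log space to avoid underflow with many SNPs.
    top = max(log_lik.values())
    weights = {g: math.exp(v - top) for g, v in log_lik.items()}
    total = sum(weights.values())
    return {g: w / total for g, w in weights.items()}
```

Calling `assign` with a handful of `(snp_id, base)` pairs returns one probability per group. Because each covered SNP contributes one independent term to the sum of log-likelihoods, even a few diagnostic sites can separate groups whose allele frequencies differ strongly, which is the intuition behind assignment at very low coverage.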

Identifier
Source https://data.blue-cloud.org/search-details?step=~0121223AB0E30D7BD363AE82AE955D86BCA42AF0361
Metadata Access https://data.blue-cloud.org/api/collections/1223AB0E30D7BD363AE82AE955D86BCA42AF0361
Provenance
Instrument Illumina HiSeq 2500; ILLUMINA
Publisher Blue-Cloud Data Discovery & Access service; ELIXIR-ENA
Contributor Archaeogenomics group, Department of Biosciences, University of Oslo
Publication Year 2024
OpenAccess true
Contact blue-cloud-support(at)maris.nl
Representation
Discipline Marine Science
Temporal Point 2021-06-04T00:00:00Z