base-selective adaptors impact in 2b-RAD studies

The cost of population genomics studies on non-model species has fallen over the last decade, yet the species' genome size and the sequencing depth required for correct genotyping impose a trade-off that remains challenging cost-wise for most species. Among RAD sequencing methods for genome reduction analyses, 2b-RAD methodologies can provide a further, secondary reduction by using base-selective adaptors, which allow study costs to be adjusted to the research question and budget, although their impact on genotyping is unknown. Here we provide an empirical evaluation of the performance of fully degenerate and base-selective 2b-RAD adaptors in library construction and subsequent genotyping, using the invasive ascidian Styela plicata as a model. We built libraries with the two types of adaptors for the same individuals and compared the number of loci and genotypes obtained for the same sequencing effort in the two datasets. We applied different filters for missing data frequently used in population genomics and analyzed the two datasets independently or combined. The number of loci present in all individuals was larger when using fully degenerate adaptors and smaller, but for the combined dataset with higher locus mean depth in all cases, when using base-selection. Comparing the libraries built with the two adaptor types, and including only loci present in all individuals, we found identical genotypes at 92% of the loci from the same individual. Most genotyping mismatches could be attributed to low sequencing power, and only 0.72% of the genotyping mismatches were attributable to the use of base-selective adaptors. When allowing loci present in 75% of samples (as is common practice in population genomic studies), only 70% of the loci had coincident genotypes for the same individual, further reduced to 35% when loci present in 50% of samples were allowed.
We show that genotyping discrepancies are mostly caused by missing data and that 4 million reads are necessary for reliable genotyping of loci in S. plicata (400 Mb genome size) when fully degenerate adaptors are used, while only 1.2 million reads are necessary when using base-selective adaptors. Our work shows that genotyping with fully degenerate and base-selective adaptors is potentially coincident at 99% of the shared loci, with no significant bias in heterozygosity, provided that locus coverage is optimal. We conclude that 2b-RAD libraries using base-selective adaptors can be safely used in population genomics of species with large genome sizes, to reduce costs and ensure enough read depth for correct genotyping without a bias in genetic diversity and differentiation.
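The comparison described above (filter loci by the fraction of samples in which they are present, then count coincident genotypes for the same individual across the two libraries) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the data layout, the function names (loci_passing, normalise, concordance) and the toy genotypes are all hypothetical, and the choice to count a genotype missing in one library as a mismatch is an assumption made here for illustration.

```python
from typing import Dict, Optional, Set, Tuple

# Hypothetical layout: genotypes[sample][locus] is a string such as "A/G",
# or None when the locus was not genotyped in that sample.
Genotypes = Dict[str, Dict[str, Optional[str]]]


def loci_passing(genos: Genotypes, min_fraction: float) -> Set[str]:
    """Return loci genotyped in at least min_fraction of the samples."""
    samples = list(genos)
    all_loci = set()
    for per_sample in genos.values():
        all_loci.update(per_sample)
    keep = set()
    for locus in all_loci:
        present = sum(1 for s in samples if genos[s].get(locus) is not None)
        if present / len(samples) >= min_fraction:
            keep.add(locus)
    return keep


def normalise(genotype: str) -> Tuple[str, ...]:
    """Treat genotypes as unordered allele pairs, so "A/G" equals "G/A"."""
    return tuple(sorted(genotype.split("/")))


def concordance(lib_a: Genotypes, lib_b: Genotypes,
                min_fraction: float) -> Tuple[int, int]:
    """Return (matching, compared) sample-locus pairs for loci that pass
    the presence filter in both libraries.

    Assumption: a genotype missing in exactly one library counts as a
    mismatch; pairs missing in both libraries are skipped.
    """
    shared_loci = loci_passing(lib_a, min_fraction) & loci_passing(lib_b, min_fraction)
    shared_samples = set(lib_a) & set(lib_b)
    matching = compared = 0
    for sample in shared_samples:
        for locus in shared_loci:
            gt_a = lib_a[sample].get(locus)
            gt_b = lib_b[sample].get(locus)
            if gt_a is None and gt_b is None:
                continue
            compared += 1
            if gt_a is not None and gt_b is not None and normalise(gt_a) == normalise(gt_b):
                matching += 1
    return matching, compared


# Toy data (invented): the same two individuals in both library types.
fully_degenerate = {
    "ind1": {"loc1": "A/G", "loc2": "C/C", "loc3": "T/T"},
    "ind2": {"loc1": "A/A", "loc2": "C/T", "loc3": None},
}
base_selective = {
    "ind1": {"loc1": "G/A", "loc2": "C/C", "loc3": "T/T"},
    "ind2": {"loc1": "A/A", "loc2": "C/C", "loc3": "T/T"},
}

for threshold in (1.0, 0.5):
    match, total = concordance(fully_degenerate, base_selective, threshold)
    print(f"presence >= {threshold:.0%}: {match}/{total} coincident genotypes")
```

In this toy example, relaxing the presence filter admits a locus that is missing from one individual in one library, which lowers the proportion of coincident genotypes. This mirrors, in miniature, the effect of missing data reported above.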

Identifier
Source https://data.blue-cloud.org/search-details?step=~0129C27688F38C33555F1EEC12A24851A5B5AE97809
Metadata Access https://data.blue-cloud.org/api/collections/9C27688F38C33555F1EEC12A24851A5B5AE97809
Provenance
Instrument Illumina HiSeq 2500; ILLUMINA
Publisher Blue-Cloud Data Discovery & Access service; ELIXIR-ENA
Publication Year 2024
OpenAccess true
Contact blue-cloud-support(at)maris.nl
Representation
Discipline Marine Science
Temporal Point 2022-06-21T00:00:00Z