Background
Advances in high-fidelity long-read (HiFi-LR) sequencing offer unprecedented opportunities to uncover the genomic diversity of complex microbial environments such as soil. Short-read (SR) sequencing has historically provided broad insights into gene-level diversity, but its limited read length constrains the reconstruction of complete genomes. In contrast, HiFi-LR sequencing improves the quality and completeness of metagenome-assembled genomes (MAGs), enabling higher-resolution taxonomic and functional annotation. However, its high cost and relatively low throughput can limit genome recovery, particularly at the binning stage, where coverage depth is critical.
Results
Here, we present a novel hybrid strategy that differs from classical hybrid assembly, in which short and long reads are combined at the assembly step. Instead, we use high-depth SR data to inform the binning of HiFi-LR contigs. Using SR and HiFi-LR metagenomic datasets generated from a tunnel-cultivated soil sample, we demonstrate that SR-derived coverage profiles significantly improve the binning of HiFi-LR assemblies. This yields a substantial increase in the number and quality of recovered MAGs compared with binning based on HiFi-LR data alone.
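As an illustration of what an SR-derived coverage profile can look like, the sketch below computes the mean short-read depth of each HiFi-LR contig from per-base depth records. It is a minimal example under stated assumptions, not the pipeline used in this study: the input is assumed to be three tab-separated columns (contig, position, depth), as written by tools such as `samtools depth`, and the function name `mean_contig_coverage` and the contig names are hypothetical.

```python
from collections import defaultdict


def mean_contig_coverage(depth_lines, contig_lengths):
    """Return the mean short-read depth per long-read contig.

    depth_lines: iterable of "contig<TAB>position<TAB>depth" strings
        (assumed format; positions with zero depth may be absent).
    contig_lengths: dict mapping contig name to its length in bases.
    """
    totals = defaultdict(int)
    for line in depth_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        contig, _position, depth = line.split("\t")
        totals[contig] += int(depth)

    # Divide by the full contig length so that positions missing from the
    # depth records are counted as zero coverage.
    return {
        name: totals[name] / length
        for name, length in contig_lengths.items()
    }


# Toy records for three hypothetical contigs; contig_3 has no mapped reads.
records = [
    "contig_1\t1\t10",
    "contig_1\t2\t20",
    "contig_2\t1\t5",
]
lengths = {"contig_1": 4, "contig_2": 1, "contig_3": 10}
coverage = mean_contig_coverage(records, lengths)
print(coverage)
```

Per-contig depth values of this kind, computed for one or more SR samples, are the sort of coverage signal that a binning tool can use alongside sequence composition; which binner and mapping tool to use is left open here.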
Conclusion
Our findings show that, even with HiFi reads, combining SR and LR data remains beneficial in highly diverse environments such as soil, not for hybrid assembly per se but to enhance downstream binning. This cost-effective hybrid binning approach provides a practical framework for maximising genome recovery from complex microbiomes.