Data Availability Statement

DNA sequencing data and results generated for this study, such as Mutect2 MAF files, will be available from dbGaP under accession code phs003255.v1.p1. NA12878 PacBio data was downloaded from GIAB.

Code required to reproduce the analyses in this paper is available online. CODECsuite is available at 10.5281/zenodo.7705860 and, which also contains the end-to-end Snakemake workflow and code for CODEC-MSI. HRDetect’s breast cancer signature set is available at.

Detecting mutations from single DNA molecules is crucial in many fields but challenging. Next-generation sequencing (NGS) affords tremendous throughput but cannot directly sequence double-stranded DNA molecules (‘single duplexes’) to discern the true mutations on both strands. Here we present Concatenating Original Duplex for Error Correction (CODEC), which confers single duplex resolution to NGS. CODEC affords 1,000-fold higher accuracy than NGS, using up to 100-fold fewer reads than duplex sequencing. CODEC revealed mutation frequencies of 2.72 × 10⁻⁸ in sperm of a 39-year-old individual, and somatic mutations acquired with age in blood cells. CODEC detected genome-wide, clonal hematopoiesis mutations from single DNA molecules, single mutated duplexes from tumor genomes and liquid biopsies, microsatellite instability with 10-fold greater sensitivity and mutational signatures, and specific tumor mutations with up to 100-fold fewer reads. CODEC enables more precise genetic testing and reveals biologically significant mutations, which are commonly obscured by NGS errors.

Discovering extremely low-abundance mutations as rare as within a single double-stranded DNA molecule (a ‘single duplex’) is crucial to finding diagnostic [1, 2], predictive [3, 4] and prognostic [5, 6] biomarkers; understanding cancer evolution [7, 8] and somatic mosaicism [9, 10]; and studying infectious diseases [11, 12] and aging [13, 14]. In principle, single-molecule sequencing technologies (for example, PacBio and Oxford Nanopore Technologies) can keep single DNA duplexes intact throughout their workflows to sequence them in whole to resolve true mutations on both strands apart from false mutations on either strand. However, in practice, they lack the required accuracy and throughput [15, 16]. Next-generation sequencing (NGS), on the other hand, continues to offer superior read accuracy and throughput [17] but is not configured to sequence single duplexes, at least not without severely compromising its throughput or utility. NGS affords high throughput by reading short, clonally amplified DNA fragments in massively parallel fluorescence analysis. Yet, its accuracy is limited by the need to dissociate Watson and Crick strands of each DNA duplex. Without a complementary strand for comparison, errors introduced on either strand due to base damage, PCR and sequencing [18] can be disguised as real mutations. While it is possible to use unique molecular identifiers (UMIs) to separately track both strands of each DNA molecule to detect true mutations [19], this does not solve the underlying limitation of NGS: duplex dissociation. For example, duplex sequencing [20], which has been the gold standard of high-accuracy sequencing and used by other recent methods [21, 22], tags double-stranded UMIs on each original duplex to achieve >1,000-fold higher accuracy, but recovering both strands among many other strands could require 100-fold excess reads [23], which severely limits its utility. To date, several methods have sought to overcome the low efficiency of generating a duplex consensus, but they are limited to either targeted panels or shallow sequencing. Duplex proximity sequencing (Pro-Seq) [24] and SaferSeqS [25] use multiplexed PCR, limiting their applications to small targeted panels with deep sequencing.
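The duplex principle described in the introduction (a mutation counts as real only when the Watson and Crick strands agree, while a change seen on one strand is treated as a likely artifact of base damage, PCR or sequencing) can be illustrated with a small sketch. This is not the CODECsuite implementation. It assumes both strand reads are already aligned to the same reference window, reported in reference orientation, and free of indels. The names `Call`, `duplex_calls` and `strand_discordant_positions` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Call:
    """A base substitution supported by both strands of one duplex."""
    position: int  # 0-based offset within the aligned window
    ref: str
    alt: str


def _check_lengths(*seqs: str) -> None:
    # Assumption: all sequences cover the same window, with no indels.
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("all sequences must have the same length")


def duplex_calls(reference: str, watson: str, crick: str) -> list[Call]:
    """Return substitutions seen on BOTH strands of a single duplex.

    `watson` and `crick` are the two strand reads, both expressed in
    reference orientation (an assumption of this sketch).
    """
    _check_lengths(reference, watson, crick)
    calls = []
    for i, (r, w, c) in enumerate(zip(reference, watson, crick)):
        # Both strands agree with each other, differ from the reference,
        # and the base is not an ambiguous 'N'.
        if w == c and w != r and w != "N":
            calls.append(Call(i, r, w))
    return calls


def strand_discordant_positions(watson: str, crick: str) -> list[int]:
    """Return positions where the two strands disagree.

    These are candidate single-strand errors, not mutation calls.
    """
    _check_lengths(watson, crick)
    return [i for i, (w, c) in enumerate(zip(watson, crick)) if w != c]


# Toy example: position 4 changes A>T on both strands,
# position 7 changes T>A on the Crick read only.
reference = "ACGTACGT"
watson = "ACGTTCGT"
crick = "ACGTTCGA"

print(duplex_calls(reference, watson, crick))
print(strand_discordant_positions(watson, crick))
```

In the toy example only the change at position 4 is reported as a duplex-supported mutation. The change at position 7 appears on one strand alone and is flagged as discordant, which is the kind of error that goes undetected when no complementary strand is available for comparison.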