Abstract Body

Monitoring new mutations in SARS-CoV-2 is crucial for identifying diagnostic and therapeutic targets and for gaining insights that support a more effective COVID-19 control strategy. Next-generation sequencing (NGS) has been widely used for whole-genome sequencing of SARS-CoV-2. However, the complexity of NGS workflows can limit their scalability. Here, we address this limitation by designing a workflow optimized for high-throughput studies.

We used modified ARTIC Network v3 primers for SARS-CoV-2 whole-genome amplification. As in a previously reported tailed PCR approach, libraries were prepared by a 2-step PCR method, which we optimized to improve amplicon balance, integrate robotic liquid handlers, and minimize amplicon dropout for viral genomes harboring primer-binding site mutation(s). Sequencing was performed on the Illumina NovaSeq 6000 and the Illumina MiSeq. An in-house analysis pipeline used the BWA aligner and iVar software. Assay precision was assessed with unique clinical samples. Assay sensitivity was assessed with serial dilutions of clinical samples. Robustness was assessed by sequencing, on the NovaSeq, samples and controls from multiple prior ARTIC v3 runs.

Intra-assay (n=188) and inter-assay (n=168) precision at the amino acid substitution level was 99.8% and 99.5%, respectively. Of samples with a cycle threshold (Ct) <28, 98.2% (111/113) yielded a near-complete (≥97%) consensus sequence, and 98.7% (147/149) of samples with a Ct <30 yielded ≥90% consensus coverage. A total of 2,688 samples and controls were sequenced in a single NovaSeq run, yielding a 94.3% (2,416/2,562) sample pass rate. The optimized workflow gave more complete SARS-CoV-2 genome consensus sequences for most viral clades than the original ARTIC v3 workflow (Table). From over 65,000 clinical samples sequenced in 2021, we observed clade and lineage prevalence in line with those documented by the CDC in 2021, including the Alpha clade, which peaked at 65.3% in May, and the Delta clade, which attained near-100% prevalence in September.
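
As an illustration of the consensus coverage thresholds reported above, the following is a minimal sketch of how such a metric could be computed. It assumes that coverage is defined as the fraction of reference positions with an unambiguous base call (A, C, G, or T) in the consensus sequence; the abstract does not state the exact definition used. The function names and the tier labels are hypothetical and are not taken from the in-house pipeline.

```python
# Length of the SARS-CoV-2 reference genome Wuhan-Hu-1 (NC_045512.2), in bases.
REFERENCE_LENGTH = 29903


def consensus_coverage(consensus: str, reference_length: int = REFERENCE_LENGTH) -> float:
    """Return the fraction of reference positions with an unambiguous call.

    Assumption: any character other than A, C, G, or T (for example N)
    counts as an uncalled position.
    """
    called = sum(1 for base in consensus.upper() if base in "ACGT")
    return called / reference_length


def coverage_tier(
    consensus: str,
    reference_length: int = REFERENCE_LENGTH,
    near_complete: float = 0.97,
    partial: float = 0.90,
) -> str:
    """Label a consensus sequence using the 97% and 90% thresholds.

    The labels are illustrative names only.
    """
    coverage = consensus_coverage(consensus, reference_length)
    if coverage >= near_complete:
        return "near-complete"
    if coverage >= partial:
        return "partial"
    return "below-threshold"
```

For example, a consensus sequence read from a FASTA file could be passed to `coverage_tier` to decide whether it meets the near-complete (≥97%) threshold. Insertions relative to the reference are not handled in this sketch.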

We present an optimized workflow that processes up to 2,688 samples in a single NovaSeq 6000 run without compromising sensitivity or robustness and with fewer amplicon dropout events than the standard ARTIC protocol. We additionally report results for over 65,000 SARS-CoV-2 clinical specimens collected in the United States between January and September of 2021 as part of an ongoing national genomic surveillance effort.