Abstract Body

HIV-1 evolves rapidly, increasing its genetic diversity and complexity; group M contains >80 subtypes and circulating recombinant forms (CRFs). Subtype determination is important epidemiologically and can affect treatment and vaccine development. Multiple automated subtyping tools are available; however, differences in their subtype/CRF assignments in a predominantly subtype B setting, or when applied to large surveillance datasets, have not been fully evaluated.

We included polymerase sequences ≥500 bp in length reported to the U.S. National HIV Surveillance System for HIV-1-infected persons (one sequence per person). We assigned HIV-1 subtype or CRF using COMET (COntext-based Modeling for Expeditious Typing), REGA V3, and SCUEAL (Subtype Classification Using Evolutionary ALgorithms). For sequences not classified as subtype B by all three tools (including those classified as non-B), we performed phylogenetic analysis using a fast, novel method (phylopartitioning) that combined FastTree approximate maximum likelihood inference, run with 2,864 curated reference sequences, with cluster analysis in Phylopart to identify subtype. We compared the results of these subtyping approaches.
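The concordance step described above (tallying which tools call a sequence subtype B, and flagging every sequence that is not subtype B by all three tools for phylogenetic analysis) can be illustrated with a minimal Python sketch. It does not reproduce the actual analysis pipeline: the record layout, the function names (`b_pattern`, `summarize`), and the toy subtype labels are all hypothetical, and the tool outputs are assumed to have been parsed already into one label per sequence per tool.

```python
from collections import Counter

# Tool names as used in the abstract; the dict layout below is an assumption.
TOOLS = ("comet", "rega", "scueal")


def b_pattern(calls):
    """Return the tuple of tools that assigned this sequence to subtype B."""
    return tuple(tool for tool in TOOLS if calls[tool] == "B")


def summarize(records):
    """Tally subtype B concordance and list sequences needing phylogenetics.

    `records` maps a sequence ID to a dict of {tool: assigned label}.
    """
    patterns = Counter(b_pattern(calls) for calls in records.values())
    n = len(records)
    # Any sequence not called B by all three tools goes to phylogenetic analysis.
    needs_phylo = [sid for sid, calls in records.items()
                   if b_pattern(calls) != TOOLS]
    return {
        "n": n,
        "b_by_all": patterns[TOOLS],      # B by COMET, REGA, and SCUEAL
        "b_by_any": n - patterns[()],     # B by at least one tool
        "needs_phylo": needs_phylo,
    }


# Toy input with made-up labels, for illustration only.
records = {
    "seq1": {"comet": "B", "rega": "B", "scueal": "B"},
    "seq2": {"comet": "B", "rega": "B", "scueal": "URF"},
    "seq3": {"comet": "C", "rega": "C", "scueal": "C"},
    "seq4": {"comet": "B", "rega": "URF", "scueal": "URF"},
}
summary = summarize(records)
```

In this toy input, one sequence is subtype B by all three tools, three are subtype B by at least one tool, and three (including the concordant non-B sequence) would be passed on to phylogenetic analysis.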

Of 71,659 sequences, the proportion classified as subtype B varied by method (COMET: 94.8%; REGA: 91.6%; SCUEAL: 89.6%; p<0.0001). In all, 95.7% were classified as subtype B by at least one method, and 85.6% were classified as subtype B by all three methods. Of 67,973 sequences assigned to subtype B by COMET, 99.3% were also assigned to subtype B by REGA, SCUEAL, or both. Of 6,624 sequences assigned to subtype B by COMET that were not subtype B by all three tools, 3,798 (57.3%) were B by REGA but not SCUEAL, 2,319 (35.0%) were B by SCUEAL but not REGA, and 475 (7.2%) were not B by either REGA or SCUEAL. Of these 6,624, 6,580 (99.3%) were subtype B by phylopartitioning. For non-B subtypes/CRFs, agreement among the three methods also varied: almost 90% of COMET and REGA assignments, but only 65.4% of SCUEAL assignments, matched the phylopartitioning results. REGA and SCUEAL classified a higher percentage of all sequences as unique recombinants than COMET did (REGA: 4.4%; SCUEAL: 6.7%; COMET: 1.2%).

In a setting dominated by subtype B, overall results varied by subtyping method. REGA and SCUEAL classified more sequences as unique recombinants than COMET did, whereas COMET and phylopartitioning classified more sequences as subtype B.