Abstract Body

The human genome encodes seven APOBEC3 (A3) enzymes, at least four of which (A3D, A3F, A3G, and A3H) can induce G-to-A mutations in HIV-1 genomes. These enzymes leave two distinct hypermutation signatures: GG-to-AG and GA-to-AA. The former signature is dominant in viral sequences hypermutated in vitro by A3G, whereas the latter is prevalent in sequences hypermutated in vitro by A3D, A3F, or A3H. Current models, based on in vitro data, posit that all HIV-restrictive A3 enzymes are ubiquitously active and cooperate to hypermutate HIV. However, we have discovered that an in vivo hypermutated virus typically bears either a dominant GG-to-AG signature or a dominant GA-to-AA signature, as would be expected from independent encounters with A3G or with A3D/F/H, but not from encounters with both enzyme classes simultaneously. This hypermutation bias towards GG-to-AG or GA-to-AA suggests the existence of a mechanism that prevents A3 proteins from simultaneously targeting HIV-1.
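To make the two signatures concrete, the following minimal sketch tallies G-to-A changes by dinucleotide context in a pairwise-aligned reference and query sequence. It is an illustration only and not the non-alignment-based or alignment-based method used in this study. The function name `count_g_to_a_signatures`, the choice to read the downstream base from the query sequence, and the example sequences are all assumptions made for the sketch.

```python
def count_g_to_a_signatures(reference, query):
    """Count G-to-A changes by the base that follows the mutated site.

    Assumptions (hypothetical, for illustration): the two sequences are
    already aligned to equal length, and the downstream context base is
    taken from the query sequence.
    """
    if len(reference) != len(query):
        raise ValueError("sequences must be aligned to the same length")

    ref = reference.upper()
    qry = query.upper()
    counts = {"GG_to_AG": 0, "GA_to_AA": 0, "other": 0}

    # The last position has no downstream base, so it is not scored.
    for i in range(len(ref) - 1):
        if ref[i] == "G" and qry[i] == "A":
            downstream = qry[i + 1]
            if downstream == "G":
                counts["GG_to_AG"] += 1   # signature associated with A3G
            elif downstream == "A":
                counts["GA_to_AA"] += 1   # signature associated with A3D/F/H
            else:
                counts["other"] += 1      # G-to-A in another context or next to a gap
    return counts


# Made-up sequences: one GG-to-AG change and one GA-to-AA change.
example = count_g_to_a_signatures("AGGTGACGTT", "AAGTAACGTT")
```

Comparing the two counts for a single sequence is one simple way to ask which signature dominates; a real analysis would also need to account for the background frequency of each dinucleotide context.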

We performed four independent analyses: 1) we analyzed all reported in vivo hypermutated HIV-1 sequences (1164 sequences from 988 patients) using two independent methods (non-alignment-based and alignment-based); 2) we analyzed all 564 SNPs of the A3 locus in 2504 individuals from 26 populations (1000 Genomes Project); 3) we quantified, using RNAseq data, all reported A3 transcripts in 461 donors from the 1000 Genomes Project; and 4) we quantified linkage disequilibrium across the 120 kb A3 locus.
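As an illustration of the fourth analysis, linkage disequilibrium between two biallelic SNPs is commonly summarized by the r² statistic computed from phased haplotypes. The sketch below shows that standard calculation. It is not the pipeline used in this study, and the function name `ld_r_squared`, the 0/1 allele encoding, and the toy haplotypes are assumptions.

```python
def ld_r_squared(haplotypes):
    """Return r^2 between two biallelic SNPs from phased haplotypes.

    `haplotypes` is a list of (allele_at_snp_a, allele_at_snp_b) pairs,
    each allele encoded as 0 or 1 (hypothetical encoding for this sketch).
    Returns None when either SNP is monomorphic, where r^2 is undefined.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("at least one haplotype is required")

    p_a = sum(a for a, _ in haplotypes) / n    # frequency of allele 1 at SNP A
    p_b = sum(b for _, b in haplotypes) / n    # frequency of allele 1 at SNP B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n

    d = p_ab - p_a * p_b                       # disequilibrium coefficient D
    denominator = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denominator == 0:
        return None
    return d * d / denominator


# Toy data: alleles always co-occur (complete linkage) vs. all four combinations.
linked = ld_r_squared([(1, 1), (1, 1), (0, 0), (0, 0)])
unlinked = ld_r_squared([(1, 1), (1, 0), (0, 1), (0, 0)])
```

Values of r² near 1 across consecutive SNP pairs are what would be expected inside a continuous haplotype block.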

By analyzing A3 SNPs, RNAseq data, and hypermutated viral sequences from thousands of HIV-1 patients and healthy donors, we have generated three independent datasets indicating that the source of the skewed hypermutation patterns is natural genetic variation in A3G and A3H. First, only one hypermutation signature predominates in most clinical HIV-1 isolates. Second, A3G and A3H form two continuous haplotype blocks as a result of strong genetic linkage. Block 1 is prevalent outside Africa (particularly in Asia) and contains the hypo-functional A3H HapI (GKE). Block 2 is prevalent in Africa and contains the hyper-functional A3H HapII (RDD). Third, A3H HapI and HapII and their respective A3G haplotypes, a-g-t-t-t and g-c-c-c-c, are expressed differentially.

Overall, these results indicate that A3G and A3H are expressed differentially in different human populations and that these enzymes are the main sources of HIV-1 hypermutation. The mutually exclusive function of A3G and A3H may be a source of weakness in our immunity to HIV-1.