HIV-phyloTSI: Subtype-independent estimation of time since HIV-1 infection for cross-sectional measures of population incidence using deep sequence data
Estimating the time since HIV infection (TSI) at the population level is essential for tracking changes in the global HIV epidemic. Most methods for determining TSI give a binary classification of infections as recent or non-recent within a window of several months, and cannot assess the cumulative impact of an intervention. We developed a Random Forest Regression model, HIV-phyloTSI, which combines measures of within-host diversity and divergence to generate continuous TSI estimates directly from viral deep-sequencing data, with no need for additional variables. HIV-phyloTSI provides a continuous measure of TSI up to 9 years, with a mean absolute error of less than 12 months overall and less than 5 months for infections with a TSI of up to a year. Based on data from African and European cohorts, it performs equally well for all major HIV subtypes. We demonstrate how HIV-phyloTSI can be used for incidence estimates at the population level.
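The accuracy figures above are mean absolute errors (MAE), reported overall and for infections with a TSI of up to a year. As a minimal sketch of how such a stratified MAE could be computed, the Python below uses only the standard library. The function names and the toy values are hypothetical illustrations, not part of HIV-phyloTSI and not results from the study.

```python
from statistics import mean


def mean_absolute_error(true_tsi, predicted_tsi):
    """Mean absolute difference between known and estimated TSI (same units)."""
    if len(true_tsi) != len(predicted_tsi):
        raise ValueError("inputs must have the same length")
    return mean(abs(t - p) for t, p in zip(true_tsi, predicted_tsi))


def mae_up_to(true_tsi, predicted_tsi, max_true_tsi):
    """MAE restricted to samples whose known TSI is at most max_true_tsi."""
    pairs = [(t, p) for t, p in zip(true_tsi, predicted_tsi) if t <= max_true_tsi]
    if not pairs:
        raise ValueError("no samples at or below the requested TSI threshold")
    return mean(abs(t - p) for t, p in pairs)


# Hypothetical toy values in months, NOT data or results from the study.
true_months = [3, 9, 12, 30, 60, 96]
pred_months = [5, 6, 14, 24, 70, 84]

overall_mae = mean_absolute_error(true_months, pred_months)
recent_mae = mae_up_to(true_months, pred_months, 12)  # known TSI of up to a year

print(f"Overall MAE: {overall_mae:.1f} months")
print(f"MAE for TSI up to 12 months: {recent_mae:.1f} months")
```

Stratifying by the known TSI rather than the predicted one keeps the subgroup definition independent of the model being evaluated.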
- Type: Whole Genome Sequencing
- Archiver: European Genome-Phenome Archive (EGA)
Click on a Dataset ID in the table below to learn more and to find out whom to contact about access to these data.
| Dataset ID | Description | Technology | Samples |
|---|---|---|---|
| EGAD50000001308 | | Illumina NovaSeq 6000 | 102 |
| EGAD50000001309 | | Illumina NovaSeq 6000 | 312 |
| EGAD50000001310 | | Illumina NovaSeq 6000 | 113 |
