Commonality despite exceptional diversity in the baseline human antibody repertoire


In principle, humans can produce an antibody response to any non-self-antigen molecule in the appropriate context. This flexibility is achieved by the presence of a large repertoire of naive antibodies, the diversity of which is expanded by somatic hypermutation following antigen exposure¹. The diversity of the naive antibody repertoire in humans is estimated to be at least 10¹² unique antibodies². Because the number of peripheral blood B cells in a healthy adult human is on the order of 5 × 10⁹, the circulating B cell population samples only a small fraction of this diversity. Full-scale analyses of human antibody repertoires have been prohibitively difficult, primarily owing to their massive size. The amount of information encoded by all of the rearranged antibody and T cell receptor genes in one person (the ‘genome’ of the adaptive immune system) exceeds the size of the human genome by more than four orders of magnitude. Furthermore, because much of the B lymphocyte population is localized in organs or tissues that cannot be comprehensively sampled from living subjects, human repertoire studies have focused on circulating B cells³. Here we examine the circulating B cell populations of ten human subjects and present what is, to our knowledge, the largest single collection of adaptive immune receptor sequences described to date, comprising almost 3 billion antibody heavy-chain sequences. This dataset enables genetic study of the baseline human antibody repertoire at an unprecedented depth and granularity, which reveals largely unique repertoires for each individual studied, a subpopulation of universally shared antibody clonotypes, and an exceptional overall diversity of the antibody repertoire.

Fig. 1: Uniqueness of the repertoires of individual subjects.
Fig. 2: Clonotype and sequence diversity amongst the 10 subjects.
Fig. 3: Shared clonotypes and sequences amongst the 10 subjects.

Data availability

Sequence data that support the findings in this study are available at the NCBI Sequence Read Archive under BioProject number PRJNA406949. Raw and processed datasets are available at


  1. Rajewsky, K. Clonal selection and learning in the antibody system. Nature 381, 751–758 (1996).

  2. Alberts, B. et al. The Generation of Antibody Diversity (Garland Science, New York, 2002).

  3. Boyd, S. D. & Crowe, J. E. Jr. Deep sequencing and human antibody repertoire analysis. Curr. Opin. Immunol. 40, 103–109 (2016).

  4. Briney, B. & Burton, D. Massively scalable genetic analysis of antibody repertoires. Preprint at (2018).

  5. Briney, B., Le, K., Zhu, J. & Burton, D. R. Clonify: unseeded antibody lineage assignment from next-generation sequencing data. Sci. Rep. 6, 23901 (2016).

  6. Morbach, H., Eichhorn, E. M., Liese, J. G. & Girschick, H. J. Reference values for B cell subpopulations from infancy to adulthood. Clin. Exp. Immunol. 162, 271–279 (2010).

  7. Morisita, M. Measuring of the dispersion of individuals and analysis of the distributional patterns. Mem. Fac. Sci. Kyushu Univ. Ser. E 2, 5–235 (1959).

  8. Horn, H. S. Measurement of ‘overlap’ in comparative ecological studies. Am. Nat. 100, 419–424 (1966).

  9. Setliff, I. et al. Multi-donor longitudinal antibody repertoire sequencing reveals the existence of public antibody clonotypes in HIV-1 infection. Cell Host Microbe 23, 845–854 (2018).

  10. Chao, A. Estimating the population size for capture–recapture data with unequal catchability. Biometrics 43, 783–791 (1987).

  11. Kaplinsky, J. & Arnaout, R. Robust estimates of overall immune-repertoire diversity from high-throughput measurements on samples. Nat. Commun. 7, 11881 (2016).

  12. Chao, A. & Chiu, C.-H. Nonparametric Estimation and Comparison of Species Richness (John Wiley & Sons, 2016).

  13. Eren, M. I., Chao, A., Hwang, W.-H. & Colwell, R. K. Estimating the richness of a population when the maximum number of classes is fixed: a nonparametric solution to an archaeological problem. PLoS ONE 7, e34179 (2012).

  14. DeKosky, B. J. et al. In-depth determination and analysis of the human paired heavy- and light-chain antibody repertoire. Nat. Med. 21, 86–91 (2015).

  15. Arnaout, R. et al. High-resolution description of antibody heavy-chain repertoires in humans. PLoS ONE 6, e22365 (2011).

  16. Marcou, Q., Mora, T. & Walczak, A. M. High-throughput immune repertoire analysis with IGoR. Nat. Commun. 9, 561 (2018).

  17. Morea, V., Tramontano, A., Rustici, M., Chothia, C. & Lesk, A. M. Conformations of the third hypervariable region in the VH domain of immunoglobulins. J. Mol. Biol. 275, 269–294 (1998).

  18. Finn, J. A. et al. Improving loop modeling of the antibody complementarity-determining region 3 using knowledge-based restraints. PLoS ONE 11, e0154811 (2016).

  19. Briney, B. S., Willis, J. R., Finn, J. A., McKinney, B. A. & Crowe, J. E. Jr. Tissue-specific expressed antibody variable gene repertoires. PLoS ONE 9, e100839 (2014).

  20. van Dongen, J. J. M. et al. Design and standardization of PCR primers and protocols for detection of clonal immunoglobulin and T-cell receptor gene recombinations in suspect lymphoproliferations: report of the BIOMED-2 Concerted Action BMH4-CT98-3936. Leukemia 17, 2257–2317 (2003).

  21. Masella, A. P., Bartram, A. K., Truszkowski, J. M., Brown, D. G. & Neufeld, J. D. PANDAseq: paired-end assembler for Illumina sequences. BMC Bioinformatics 13, 31 (2012).

  22. Meyerhans, A., Vartanian, J. P. & Wain-Hobson, S. DNA recombination during PCR. Nucleic Acids Res. 18, 1687–1691 (1990).

  23. Martin, M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J. 17, 10–12 (2011).

  24. Rogers, T. F. et al. Zika virus activates de novo and cross-reactive memory B cell responses in dengue-experienced donors. Sci. Immunol. 2, eaan6809 (2017).


The authors thank all of the study subjects for their participation and the Genomic Services Laboratory at the HudsonAlpha Institute for Biotechnology for their sequencing expertise. This work was supported by the National Institute of Allergy and Infectious Diseases (Center for HIV/AIDS Vaccine Immunology and Immunogen Discovery, UM1AI100663 (D.R.B.); Center for Viral Systems Biology, U19AI135995 (B.B.)), the International AIDS Vaccine Initiative (IAVI) through the Neutralizing Antibody Consortium SFP1849 (D.R.B.), and the Ragon Institute of MGH, MIT and Harvard (D.R.B.).

Author information

B.B. and D.R.B. planned and designed the experiments. B.B., A.I. and C.J. performed experiments. B.B. analysed data. B.B. and D.R.B. wrote the manuscript. All authors contributed to manuscript revisions.

Corresponding authors

Correspondence to Bryan Briney or Dennis R. Burton.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Nearly full-length antibody gene amplification from biological and technical replicate samples.

a, Schematic of biological and technical replicate samples. Biological replicates (columns) are derived from distinct cell aliquots, so identical clonotypes or sequences found in multiple biological replicates must arise from different cells. Technical replicates (rows) were amplified using discrete RNA aliquots from a single cell aliquot. b, Strategy for amplifying nearly full-length antibody heavy chains. Black arrows indicate primers. Primers in the cDNA synthesis step anneal to the heavy-chain constant region (CH) and add the first unique molecular identifier (UMI) and the Illumina read 1 primer annealing site. Primers in the second-strand synthesis step anneal to the framework 1 region of the variable gene and add a second UMI and the Illumina read 2 primer annealing site.

Extended Data Fig. 2 V and J frequency correlations of technical and biological replicates.

For each subject, the frequency of V and J combinations was compared for technical replicates (left panels) or biological replicates (right panels). The coefficient of determination (r²) is shown for each plot.

Extended Data Fig. 3 Nucleotide mutation frequencies.

a, The distribution of nucleotide mutations in sequences that encode IgM is shown. On the right, the number of sequences containing no mutations in the variable-gene segment is also plotted. b, The distribution of nucleotide mutations in sequences that encode IgG is shown. On the right, the mean mutation frequency for the IgG population of each subject is shown. Each line represents a single subject. For legibility, the legend is split between the two plots. Although only five subjects are shown in the legend of each plot, data from all ten subjects are present in each plot.

Extended Data Fig. 4 Cross-subject repertoire similarity.

Pairwise Morisita–Horn similarity comparisons between each subject and all other subjects. Similarity was computed using the frequency of V-gene, J-gene and CDRH3 length combinations. Each line represents the mean of 20 independent repertoire samplings (with replacement). The shading surrounding the mean line indicates the 95% confidence interval.
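As a rough illustration of the similarity measure named in this caption, the sketch below computes the Morisita–Horn index between two samples whose features are (V gene, J gene, CDRH3 length) combinations. The record layout, the function names and the toy gene assignments are assumptions made for this example; the repeated sampling with replacement and the confidence intervals described in the caption are not reproduced.

```python
from collections import Counter


def feature_counts(records):
    """Count (V gene, J gene, CDRH3 length) combinations.

    `records` is a hypothetical iterable of (v_gene, j_gene, cdrh3_aa) tuples.
    """
    return Counter((v, j, len(cdr3)) for v, j, cdr3 in records)


def morisita_horn(counts_a, counts_b):
    """Morisita-Horn similarity between two count tables (0 = disjoint, 1 = identical)."""
    total_a = sum(counts_a.values())
    total_b = sum(counts_b.values())
    if total_a == 0 or total_b == 0:
        raise ValueError("both samples must contain at least one sequence")
    # Sum of products of counts for features present in both samples.
    shared = sum(n * counts_b.get(key, 0) for key, n in counts_a.items())
    # Simpson-style dominance terms for each sample.
    d_a = sum(n * n for n in counts_a.values()) / total_a ** 2
    d_b = sum(n * n for n in counts_b.values()) / total_b ** 2
    return 2 * shared / ((d_a + d_b) * total_a * total_b)


# Toy data, for illustration only.
sample_a = feature_counts([
    ("IGHV1-69", "IGHJ4", "ARDYW"),
    ("IGHV1-69", "IGHJ4", "ARDFW"),
    ("IGHV3-23", "IGHJ6", "ARGGYYW"),
])
sample_b = feature_counts([
    ("IGHV4-34", "IGHJ5", "ARW"),
])
similarity_self = morisita_horn(sample_a, sample_a)
similarity_disjoint = morisita_horn(sample_a, sample_b)
```

Because the index depends only on relative frequencies, two samples of different depth with the same feature proportions still score 1.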

Extended Data Fig. 5 Collapsing sequences into clonotypes.

a, To demonstrate the effect of collapsing an expanded clonal lineage into clonotypes, we selected a previously reported lineage of Zika-specific monoclonal antibodies isolated from the plasmablast population of an acutely infected patient²⁴. Of 119 sequences, 89 were unique at the nucleotide level. b, Sequences encoding the same V gene, J gene and an identical CDRH3 amino acid sequence were collapsed into clonotypes, and the sequence phylogeny was coloured by clonotype. A total of 119 sequences were collapsed into 18 clonotypes. c, Sequences were collapsed into clonotypes, allowing a single mismatch in the CDRH3 amino acid sequence, and the sequence phylogeny was coloured by clonotype. A total of 119 sequences were collapsed into 10 clonotypes. d, The clonotype fraction (number of clonotypes divided by the total number of filtered sequences), when collapsing clonotypes while allowing zero or one mismatch in the CDRH3 amino acid sequence for each subject in this study. e, Number of total clonotypes recovered when allowing zero or one mismatch in the CDRH3 amino acid sequence for each subject in this study.
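The collapsing rule described in this caption can be sketched as follows. The caption does not say how sequences are grouped when one mismatch is allowed, so this example assumes a simple greedy assignment to the first compatible representative of the same CDRH3 length; the record layout, function names and toy sequences are likewise hypothetical.

```python
def hamming(a, b):
    """Number of mismatched positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def collapse_clonotypes(records, max_mismatches=0):
    """Collapse (v_gene, j_gene, cdrh3_aa) records into clonotypes.

    Sequences share a clonotype when they use the same V and J genes and
    their CDRH3 amino acid sequences have the same length and differ from
    the clonotype's representative at no more than `max_mismatches`
    positions. Greedy assignment is an assumption of this sketch.
    """
    representatives = {}  # (v, j, cdrh3 length) -> representative CDRH3s
    clonotypes = {}       # (v, j, representative CDRH3) -> sequence count
    for v, j, cdr3 in records:
        bucket = representatives.setdefault((v, j, len(cdr3)), [])
        for rep in bucket:
            if hamming(rep, cdr3) <= max_mismatches:
                break
        else:
            # No compatible representative: start a new clonotype.
            rep = cdr3
            bucket.append(rep)
        clonotypes[(v, j, rep)] = clonotypes.get((v, j, rep), 0) + 1
    return clonotypes


# Toy data, for illustration only.
records = [
    ("IGHV1-69", "IGHJ4", "ARDYW"),
    ("IGHV1-69", "IGHJ4", "ARDYW"),
    ("IGHV1-69", "IGHJ4", "ARDFW"),
    ("IGHV3-23", "IGHJ4", "ARDYW"),
    ("IGHV1-69", "IGHJ4", "ARGGGW"),
]
exact = collapse_clonotypes(records, max_mismatches=0)
relaxed = collapse_clonotypes(records, max_mismatches=1)
# Clonotype fraction as defined in panel d: clonotypes / total sequences.
fraction_exact = len(exact) / len(records)
fraction_relaxed = len(relaxed) / len(records)
```

With zero mismatches the result is the same as grouping on the exact (V gene, J gene, CDRH3) key; with one mismatch the outcome of a greedy pass can depend on input order.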

Extended Data Fig. 6 Capture–recapture frequency.

a, Recapture frequency for each subject. Lines represent the mean of 10 random samplings (without replacement) for all subsample fractions except complete sampling (1.0). b, Mean recapture frequency for each subsample fraction.

Extended Data Fig. 7 Relative light-chain diversity estimation.

Using previously reported datasets of paired heavy and light antibody chains, clonotype diversity was estimated for heavy and light chains using both Chao 2 and Recon estimators. Estimates are shown in filled or unfilled points. Lines indicate the least-squares polynomial best fit (degree = 2), extrapolated to include both the lowest (1.17 × 10⁸) and highest (9.06 × 10⁸) number of UMI-corrected sequences from the 10 sequenced subjects.
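For readers unfamiliar with the Chao 2 estimator mentioned in this caption, the sketch below implements its standard incidence-based form: observed richness plus a correction based on the number of classes seen in exactly one and exactly two sampling units. Treating each replicate as one sampling unit, and the function and variable names, are assumptions of this example; the Recon estimator and the polynomial fit are not shown.

```python
from collections import Counter


def chao2(samples):
    """Chao 2 richness estimate from presence/absence across sampling units.

    `samples` is a list of iterables, one per sampling unit, each holding
    the clonotypes observed in that unit (duplicates are ignored).
    """
    m = len(samples)
    if m < 2:
        raise ValueError("Chao 2 needs at least two sampling units")
    incidence = Counter()
    for sample in samples:
        incidence.update(set(sample))
    s_obs = len(incidence)
    q1 = sum(1 for n in incidence.values() if n == 1)  # seen in one unit
    q2 = sum(1 for n in incidence.values() if n == 2)  # seen in two units
    correction = (m - 1) / m
    if q2 > 0:
        return s_obs + correction * q1 * q1 / (2 * q2)
    # Bias-corrected form for the case with no doubletons.
    return s_obs + correction * q1 * (q1 - 1) / 2


# Toy data, for illustration only: three sampling units of clonotype labels.
units = [{"a", "b", "c"}, {"a", "b", "d"}, {"a", "e"}]
estimate = chao2(units)
```

In the toy data five clonotypes are observed, three of them in a single unit and one in exactly two units, so the estimate exceeds the observed count.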

Extended Data Fig. 8 Variance between inferred V(D)J recombination models.

a, Frequency of clonotype sharing between observed human subjects (black), synthetic datasets generated with IGoR’s default recombination model (red), synthetic datasets generated with subject-specific recombination models (blue) or synthetic datasets generated with a combined-subject recombination model (purple). b, Combined Kullback–Leibler divergence (KL divergence) between pairs of subject-specific models (blue), between subject-specific models and IGoR’s default model (red), or between subject-specific models and the combined-subject model (purple). c, Combined KL divergence between pairs of subject-specific models, separated by event type.
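As background for panels b and c, the following sketch computes the Kullback–Leibler divergence between two discrete probability distributions, such as the usage frequencies of a single recombination event in two models. How the per-event divergences are combined is not described in this caption, so only the basic quantity is shown; the use of base-2 logarithms, the dictionary layout and the toy probabilities are assumptions.

```python
import math


def kl_divergence(p, q):
    """D_KL(P || Q) in bits for distributions given as {outcome: probability}.

    Returns infinity when P assigns probability to an outcome that Q does not.
    """
    total = 0.0
    for outcome, p_k in p.items():
        if p_k == 0:
            continue  # terms with zero probability under P contribute nothing
        q_k = q.get(outcome, 0.0)
        if q_k == 0:
            return math.inf
        total += p_k * math.log2(p_k / q_k)
    return total


# Toy distributions, for illustration only.
model_a = {"gene1": 0.5, "gene2": 0.5}
model_b = {"gene1": 0.25, "gene2": 0.75}
divergence_ab = kl_divergence(model_a, model_b)
divergence_aa = kl_divergence(model_a, model_a)
```

The divergence is not symmetric, so comparing model A to model B generally gives a different value from comparing B to A.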

Extended Data Table 1 Demographic information and sequencing statistics per subject
Extended Data Table 2 Primers used for antibody gene amplification

Cite this article

Briney, B., Inderbitzin, A., Joyce, C. et al. Commonality despite exceptional diversity in the baseline human antibody repertoire. Nature 566, 393–397 (2019).
