This is my response to Dennis Venema’s Biologos blog that he posted after I published my email and blog at Nature Ecology and Evolution Community about his book Adam and the Genome. This text is also placed as a comment under Dr Venema’s blog post on the Biologos blog page.
Dear Dennis,
I am glad that we are now establishing a dialogue about the scientific credibility of a bottleneck of two at some point in the history of the human lineage. I am hoping that during the course of this discussion we will be able to examine in detail the claims that you make in chapter three of Adam and the Genome, and that you will respond to all the critiques and questions that I have raised in my email to you and my blog at Nature Ecology and Evolution Community.
This Part I of your response is helpful in that it clears up some areas of potential misunderstanding between us, and points me to two arguments that are not made explicitly in your book chapter. I trust I can look forward to the subsequent Parts for your responses to the majority of the issues I have raised.
I will work through your blog in this comment, seeking to be as constructive as possible in my reading of it.
Scientific Confidence vs. Scientific Certainty
I am happy to take your point that you do not believe that science has DISPROVEN that a bottleneck of two individuals could have happened in the human lineage. Your position is that you are as certain that it has not happened as you are certain that the earth revolves around the sun. I am sorry if I mischaracterised your position as being more certain than it actually is.
10,000 individuals?
In your blog you say: “I do not claim this [heliocentric level of] certainty for the oft-cited ~10,000 figure, as Buggs seems to imply”.
I am happy to take this point, but I should explain why I got the impression from your book chapter that you hold pretty strongly to the 10,000 figure. In your book chapter you argue that multiple independent methods converge on a figure of 10,000, and even predict that one method that gives a lower figure is likely to be revised upwards. Here are the relevant quotations from your chapter:
it is worth at least sketching out a few of the methods geneticists use that support the conclusion that we descend from a population that has never dipped below about 10,000 individuals.
Then you mention evidence from allelic diversity, and state:
these methods indicate an ancestral population size for humans right around that 10,000 figure.
Then you present an argument from linkage disequilibrium, and state:
The results indicate that we come from an ancestral population of about 10,000 individuals— the same result we obtained when using allele diversity alone.
Then you say more about linkage disequilibrium and state:
The researchers found that, during this period, humans living in sub-Saharan Africa maintained a minimum population of about 7,000 individuals, and that the ancestors of all other humans maintained a minimum population of about 3,000—once again, adding up to the same value other methods arrive at.
Then you describe the PSMC method and state:
Taken together, this is in good agreement with previous, less powerful methods, with a combined minimum size of around 6,900 individuals. These numbers may shift upward, however, as we sequence more and more individuals from both groups.
This is why I got the impression that you attached a high degree of certainty to the 10,000 figure. This impression came across especially strongly in the last two statements above, when you were (I think incorrectly, as I argue in my blog) adding up numbers from two populations to come to a 10,000 figure, and then suggesting that the PSMC method’s result might be revised upwards (towards 10,000, presumably). I think that if you re-read your chapter yourself you will agree that the 10,000 figure comes across quite strongly, and sounds to the reader like a very precise measurement of past human population size.
However, I am willing to take your point that you do not attach such a high level of certainty to the 10,000 figure as you attach to there never having been a bottleneck of two. That seems a reasonable position to hold.
Heterozygosity and population bottlenecks
The majority of your blog is taken up with the topic of genetic diversity. I think that we are largely in agreement here. I am glad that you agree with the points I made about the amount of heterozygosity that can be carried through a short, sharp bottleneck. I do not dispute that allelic diversity can provide stronger evidence for a past bottleneck than heterozygosity can. In my blog I stated this clearly: “A sharp bottleneck will affect allelic richness more than heterozygosity”. I am grateful that you have helped out non-scientists who are seeking to follow our debate by giving a simple “Genetics 101” explanation of why this is so in your blog.
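For readers who would like to see the arithmetic behind this point, here is a minimal sketch. It assumes the standard textbook expectation that a single generation at a size of N diploid individuals retains a fraction 1 - 1/(2N) of heterozygosity, and that N diploid individuals can carry at most 2N distinct alleles at one locus. The function names are my own, chosen for illustration, and the sketch ignores mutation and selection.

```python
def heterozygosity_retained(n_individuals: int, generations: int = 1) -> float:
    """Expected fraction of heterozygosity left after a bottleneck.

    Assumes an idealised population held at n_individuals diploid
    individuals for the given number of generations (no mutation,
    no selection): each generation retains 1 - 1/(2N).
    """
    return (1.0 - 1.0 / (2 * n_individuals)) ** generations


def max_alleles_per_locus(n_individuals: int) -> int:
    """Upper bound on distinct alleles at one locus: two gene copies per individual."""
    return 2 * n_individuals


if __name__ == "__main__":
    for n in (2, 10, 100):
        retained = heterozygosity_retained(n)
        print(f"N={n}: heterozygosity retained {retained:.3f}, "
              f"max alleles per locus {max_alleles_per_locus(n)}")
```

Under these assumptions a one-generation bottleneck of two retains three quarters of the heterozygosity, while capping the number of alleles at any one locus at four. This is why allelic richness is the more sensitive measure.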
Although we are in agreement about the relative merits of heterozygosity and allelic diversity in detecting bottlenecks, misunderstanding between us has arisen for two reasons: (1) ambiguous usage of the term “genetic variability” in your book chapter, and (2) the choice of Tasmanian Devils in your book chapter as an example of the consequences of a population bottleneck. I will explain both of these below.
(1) I commented on heterozygosity in my email and blog because in your book chapter you refer many times to “genetic variability”. As you know, in scientific population genetics literature the term “genetic variability” does not refer only to allelic diversity. Genetic variability of populations is measured in many ways: heterozygosity, allelic diversity, private allele frequency, gene diversity, fixation indices, inbreeding coefficients etc. I did not realise that when you use the term in your chapter you intend only to refer to allelic diversity. That is not the way the term is normally used in the field. I therefore assumed that you were also referring to heterozygosity. It is a pity that this ambiguity was present, but I understand that it is hard to write about science at a popular level without the occasional ambiguity slipping in that a specialist will stumble on.
(2) I also got the impression you were including heterozygosity within your definition of genetic variability because of your choice of Tasmanian Devils as an exemplar of a species that has undergone a bottleneck. This exemplar takes up quite a large proportion of the early part of your chapter. It is well known that Tasmanian Devils have low heterozygosity as well as low allelic diversity – they have much lower levels of heterozygosity than humans (see this paper). The low heterozygosity within Tasmanian Devils appears to be partly responsible for the low fitness of their populations, likely due to several prolonged bottlenecks. In fact, you say of the Tasmanian Devils: “most of them have exactly the same alleles with only rare differences.” As I read the chapter, that sounded to me like a popular-science-level statement that they have low heterozygosity.
For these reasons, I thought you were including heterozygosity in your chapter as one aspect of genetic variability. However I am willing to take your point that you were not, now that you have clearly stated this. I am happy to put this down to a communication issue. I misread your chapter, and did not realise you mean “allelic diversity” whenever you say “genetic variability”. I did not realise that when you bring up the example of Tasmanian Devils you are leaving to one side the issue of their low heterozygosity.
Now that we have cleared up this point, I think we can leave the issue of heterozygosity behind us, as we seem to be in full agreement about it.
Allelic diversity and bottlenecks
Now to look in more detail at the points you raise about allelic diversity. This is where I think your argument is strongest, so I would like to examine it in some detail. To do this full justice, I want to start with what you say about this in your book chapter. One of your most explicit statements about this in your book chapter is as follows:
…scientists have many other methods at their disposal to measure just how large our population has been over time. One simple way is to select a few genes and measure how many alleles of that gene are present in present-day humans. Now that the Human Genome Project has been completed and we have sequenced the DNA of thousands of humans, this sort of study can be done simply using a computer. Taking into account the human mutation rate, and the mathematical probability of new mutations spreading in a population or being lost, these methods indicate an ancestral population size for humans right around that 10,000 figure. In fact, to generate the number of alleles we see in the present day from a starting point of just two individuals, one would have to postulate mutation rates far in excess of what we observe for any animal.
As I note in my blog, you give no citation to the scientific literature to back up this point, so it is hard for me to interact with you on it. I would invite you again to make such a citation so that we can discuss this point further.
In your recent blog you have now made a similar claim, and given more detail:
So, a bottleneck to two individuals would leave an enduring mark on our genomes – and one part of that mark would be a severe reduction in the number of alleles we have – down to a maximum of four alleles at any given gene. Humans, however, have a large number of alleles for many genes – famously, there are hundreds of alleles for some genes involved in immune system function. These alleles take time to generate, because the mutation rate in humans is very low. This high allele diversity is thus the first indication that we did not pass through a severe population bottleneck, but rather a relatively mild one (estimated, as we have discussed, at about 10,000 individuals by current methods).
Would I be correct in assuming that this statement in your blog is intended to illuminate the passage I quoted above from your book chapter? If so, this is helpful, as you give a link in the blog to an online primer about human leukocyte antigen (HLA) genes, suggesting that your argument relates to these genes. But the online primer has nothing in it about models of past human bottlenecks. I would invite you to make a more explicit argument on this point, as I think this is the strongest argument that is available to you against a bottleneck of two. As you mention HLA genes in your blog, it sounds to me as if your argument may rest on Ayala et al (1994), but this paper was published before the Human Genome Project, so I assume you must have a more up-to-date source that you drew on for your book chapter. Please could you let me know what it is so that I can follow up your argument?
I realise that some of your non-biologist readers may think I am being rather pedantic in asking for a citation when you are making what appears to be a very straightforward case from allele numbers. But biologist readers will know that very few things in this area are straightforward, and without a citation I have to treat your claims as unsubstantiated. For example, if your argument is from HLA genes, I have already mentioned these briefly in my blog, and explained why their rapid rates of evolution may prevent them from making a strong argument against a bottleneck:
Hyper-variable loci like MHC genes [of which HLA genes are a type] or microsatellites have so many alleles that they seem to defy the idea of a single couple bottleneck until we consider that they have very rapid rates of evolution, and could have evolved very many alleles since a bottleneck.
Also as I wrote in a comment on the Skeptic Zone:
MHC loci are pretty exotic. Several studies show that they evolve fast and may be under sexual selection, pathogen-mediated selection, and frequency-dependent selection; they may also have heterozygote advantage (see e.g. http://rspb.royalsocietypublishing.org/content/277/1684/979). The maintenance of MHC polymorphism is still “an evolutionary puzzle” (https://www.nature.com/articles/ncomms1632). There is some evidence for convergent evolution of HLA genes (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1918223/, http://onlinelibrary.wiley.com/doi/10.1111/j.1600-065X.1999.tb01381.x/full, https://link.springer.com/article/10.1007/BF00189233, https://link.springer.com/article/10.1007%2Fs002510050028). If the whole case for large human ancestral population sizes rests on MHC loci, I think this is inadequate to prove the point, given our current state of knowledge on MHC evolution.
I look forward to hearing more from you on this topic in future blog posts.
Rare alleles
Finally in your blog you make an argument from frequencies of rare alleles. This is an argument that is not mentioned in your book chapter, as far as I am aware. You state in your blog:
Another effect that a bottleneck to two individuals would produce is that there would be no rare alleles after the bottleneck. All alleles would have a frequency of at least 25%. As the population expanded after such an event, those alleles would stay common, and only new mutations would produce less common alleles. What we observe in humans in the present day is that many alleles are rare – even exceedingly rare. The distribution of alleles in present-day humans looks like it comes from an old, large population – not one that passed through an extreme bottleneck within the last few hundred thousand years, which is when our species is found in the fossil record. Thus the observation that we have many alleles of certain genes and the distribution of allele frequencies both support the hypothesis that humans come from a population, rather than a pair.
I agree with everything you are saying, up until the full stop after “exceedingly rare”. That is my understanding of the patterns of human genetic diversity also. However, beyond this point I need you to give a citation to the scientific literature to support your claims that the distribution of alleles in humans is inconsistent with “an extreme bottleneck within the last few hundred thousand years”. This is an interesting claim and one I would like to follow up, but without a citation this is an unsubstantiated assertion. I think I may have partly anticipated this argument in my blog when I wrote: “We need to bear in mind that explosive population growth in humans has allowed many new mutations to rapidly accumulate in human populations (A. Keinan and A. G. Clark (2012) Science 336: 740-743).”
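To illustrate why population growth matters here, the following is a rough sketch and not a model of human history. It assumes that every new mutation enters a diploid population of size N at a frequency of 1/(2N), and that the expected number of new mutations arising per generation is 2N times the per-site mutation rate times the number of sites. The growth rate, number of generations, mutation rate and number of sites below are hypothetical placeholders chosen only for illustration. The sketch counts mutations that arise, not those that survive drift, so it cannot by itself say whether the observed allele frequency distribution is consistent with any particular bottleneck.

```python
def initial_frequency(pop_size: float) -> float:
    """Frequency at which a single new mutation enters a diploid population."""
    return 1.0 / (2 * pop_size)


def growth_trajectory(start: float, rate: float, generations: int) -> list:
    """Population sizes over time under simple geometric growth (hypothetical)."""
    sizes = []
    n = float(start)
    for _ in range(generations):
        sizes.append(n)
        n *= rate
    return sizes


def new_mutations_arising(pop_sizes, mutation_rate_per_site: float, sites: int) -> float:
    """Expected count of new mutations arising across the given generations.

    Each generation contributes 2 * N * mu * sites. Most of these would be
    lost again by drift, which this sketch deliberately ignores.
    """
    return sum(2 * n * mutation_rate_per_site * sites for n in pop_sizes)


if __name__ == "__main__":
    # All values below are illustrative assumptions, not estimates.
    sizes = growth_trajectory(start=2, rate=1.05, generations=300)
    total = new_mutations_arising(sizes, mutation_rate_per_site=1e-8, sites=1_000_000)
    print(f"final population size: {sizes[-1]:.0f}")
    print(f"entry frequency of a new mutation at that size: {initial_frequency(sizes[-1]):.2e}")
    print(f"expected new mutations arising over the period: {total:.1f}")
```

The point of the sketch is only that, as a population grows, each generation contributes more new mutations and each one starts at a lower frequency, so rare alleles are expected to accumulate after a bottleneck. Whether they would accumulate in the numbers actually observed is exactly the question for which I am asking for a citation.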
Conclusion
I am grateful to you for beginning to respond to the objections I have raised to your book chapter in my email and Nature Eco Evo blog. I am glad we have cleared up the issue of heterozygosity and appear to be in agreement about it. I invite you to make clear citations to the scientific literature to back up several key points that you make in your book chapter and in this current blog. I note that you have not yet addressed the majority of my criticisms of your book chapter. I look forward to your responses to the objections I have raised to your use of: (1) the example of the Tasmanian Devils, (2) PSMC analysis, (3) the linkage disequilibrium study by Tenesa and colleagues, and (4) incomplete lineage sorting.