Responding to Felsenstein, Schaffner and Harshman at The Skeptical Zone

Here is the text of a comment I posted at The Skeptical Zone in response to comments by Joe Felsenstein, Steve Schaffner and John Harshman on my Nature Ecology and Evolution blog on human bottlenecks:

Thank you all for interacting with my Nature Ecology and Evolution Community blog, and thanks to Vincent Torley for posting here. Vincent kindly sent me a personal email pointing me to this thread and asking me to interact specifically with comments made by Steve Schaffner and Joe Felsenstein. I will also respond to John Harshman’s comments, as he is making the strongest case against a bottleneck of two, a case that Dennis Venema did not make explicitly in his book chapter.

First, I note that both Schaffner and Felsenstein agree with my point that the bottleneck hypothesis has not been directly tested.

Schaffner: “Buggs is right that existing tests have not been tested rigorously against an ancient Adam and Eve scenario. On the other hand, no one has shown that such a scenario would be undetectable by those tests either; it’s just not a scenario most geneticists are interested in.”

Felsenstein: “Most of the effort in analyzing these data has been to infer the past history of population size, rather than to make statements about A&E.”

Felsenstein goes further than this and suggests that it could never be entirely disproven: “If one poses the problem as whether we can absolutely certainly rule out A&E, that is asking for more than science can deliver. But if we ask whether it is made very improbable, that is not as hard to establish.”

Second, I note that neither of them is defending the PSMC argument, and Felsenstein implies that it is not necessarily reliable and that new methods need to be developed.

Felsenstein: “There are coalescent methods that use more information than PSMC, which only uses 2 haploid genomes at a time. Those methods need more development, but when they get it there will be more focused analyses.”

Third, I note that no one so far has challenged my statements about the Tenesa et al. paper based on linkage disequilibrium (I would welcome more comments on this paper).

Fourth, I need to respond to Schaffner’s comment: “The argument about total heterozygosity is a red herring. A tight but short bottleneck has a relatively modest effect on heterozygosity, but a dramatic effect on the distribution of allele frequencies. A bottleneck of two individuals eliminates all frequencies at less than 25% frequency, which is where the great majority of variants are.”

I don’t think it is a red herring, as (1) it is worth pointing out to the non-specialist that the results of a short bottleneck are far less devastating than those of a long one, where inbreeding would eventually eliminate all allelic variation; (2) during a bottleneck of two, most variability has to be carried by heterozygosity within the two individuals, so this is important; (3) because there are four DNA bases, two individuals can potentially carry all possible alleles at any given SNP locus. In response to his final sentence above, I note that it is the high-frequency alleles in the human population that most require explanation in terms of ancestral variation, and the alleles at less than 25% frequency are the ones most easily explained by recent mutation.
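The arithmetic behind these points can be sketched in a few lines of Python. This is an illustration only, not a test of the bottleneck hypothesis. It assumes an idealised Wright-Fisher population, in which expected heterozygosity falls by a factor of 1 - 1/(2N) per generation, and a standard neutral constant-size model, in which the expected number of variants at derived-allele count i in a sample is proportional to 1/i. Real human demography fits neither assumption exactly, and the function names are hypothetical ones chosen for this example.

```python
from fractions import Fraction


def heterozygosity_retained(n_individuals, generations):
    """Expected fraction of heterozygosity remaining after `generations`
    at a constant size of `n_individuals` diploids (idealised
    Wright-Fisher assumption: a factor of 1 - 1/(2N) per generation)."""
    return (1 - 1 / (2 * n_individuals)) ** generations


def founder_allele_frequencies(n_individuals):
    """Frequencies a segregating allele can have among the 2N chromosome
    copies carried by `n_individuals` diploid founders."""
    copies = 2 * n_individuals
    return [Fraction(k, copies) for k in range(1, copies)]


def neutral_fraction_below(threshold, sample_size):
    """Expected share of segregating sites whose derived-allele frequency
    in a sample of `sample_size` chromosomes is below `threshold`
    (assumption: neutral, constant-size expectation proportional to 1/i)."""
    weights = {i: 1 / i for i in range(1, sample_size)}
    below = sum(w for i, w in weights.items() if i / sample_size < threshold)
    return below / sum(weights.values())


if __name__ == "__main__":
    # One generation at N = 2 versus a prolonged bottleneck of the same size.
    print(heterozygosity_retained(2, 1))
    print(heterozygosity_retained(2, 10))
    # Two diploid founders carry four copies of each locus, so the lowest
    # possible frequency of a segregating allele is 1/4.
    print(founder_allele_frequencies(2))
    # Share of variants expected below 25% frequency in a sample of 100.
    print(neutral_fraction_below(0.25, 100))
```

Under these assumptions a single generation at a size of two retains three quarters of the expected heterozygosity, while ten generations at that size retain under 6%. That is the contrast between a short and a long bottleneck drawn in point (1). The four chromosome copies in two founders also explain both the 25% frequency floor in Schaffner's comment and the possibility, in point (3), of up to four distinct alleles at one SNP locus.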

Fifth, I’m really glad to hear that Steve Schaffner has done some simulations. This is exactly what I think needs to be done to nail down this issue.

Schaffner: “I doubt anyone has ever made a formal test, but having played around with simulations I think an absolute minimum of several hundred thousand years would be required to generate something like the observed frequency spectrum for humans; half a million years is a more plausible lower bound… It’s not clear how many creationists are interested in a half million year old Adam anyway.”

I very much hope that Schaffner will write these up for publication. I agree with his final comment, but a creationist (in the conventional sense of the word) would not be concerned about this entire topic: it assumes common ancestry, and creationism can have genetic diversity front-loaded into Eve’s ova anyway, thereby avoiding the whole issue of genetic diversity. I suspect many Christians, Jews and Muslims would be interested in the idea of a half-million-year-old ancestral bottleneck of two.

Sixth, to come to John Harshman’s comments: John is making an argument from huge allelic diversity of HLA (human MHC) alleles and signatures of incomplete lineage sorting among these. This is the strongest argument being made in this thread against an ancestral bottleneck. Schaffner and Felsenstein have commented on this:

Schaffner (on HLA): “Buggs is suggesting that some of the alleles shared between species could represent homoplasies, presumably as a result of similar selective pressures. How likely that is depends on how complex the alleles are.”

Felsenstein: “Just saying that rates of mutation are high and so the pattern could occur is insufficient. One needs a more quantitative analysis with estimated mutation rates. HLA is a hard case for A&E — it would be even better to find more polymorphisms involving multiple haplotypes and put all that information together.”

I agree with both of these points. More work is needed to show in detail whether or not MHC diversity renders a bottleneck impossible. My major overarching point here would be that we need to look at genome-wide patterns of polymorphism to get a reliable picture of past effective population sizes. MHC loci are pretty exotic. Several studies show that they evolve fast and may be under sexual selection, pathogen-mediated selection, and frequency-dependent selection; they may also have heterozygote advantage. The maintenance of MHC polymorphism is still “an evolutionary puzzle”. There is some evidence for convergent evolution of HLA genes. If the whole case for large human ancestral population sizes rests on MHC loci, I think this is inadequate to prove the point, given our current state of knowledge on MHC evolution. As Felsenstein says, “it would be even better to find more polymorphisms involving multiple haplotypes and put all that information together.”