Category: Research

  • More on human-chimp genomic similarity

    The Live Science website has just published an article, “Do humans and chimps really share nearly 99% of their DNA?”, subtitled: “The frequently cited 99% similarity between human and chimp DNA overlooks key differences in the genomes.”

    This includes an email interview with a leading figure in the field, Tomas Marques-Bonet. Tomas was the final author of the 2013 Nature paper “Great ape genetic diversity and population history”. Since then he has published numerous papers on both ape and human genomes.

    The Live Science article reports:

    But the 99% figure is misleading because it focuses on stretches of DNA where the human and chimp genomes can be directly aligned and ignores sections of the genomes that are difficult to compare, Tomas Marques-Bonet, head of the Comparative Genomics group at the Institute of Evolutionary Biology (CSIC/UPF) in Barcelona, Spain, told Live Science in an email.


    Sections of human DNA without a clear counterpart in chimp DNA make up approximately 15% to 20% of the genome, Marques-Bonet said. For example, some bits of DNA are present in one species but missing in the other; these are known as “insertions and deletions.” In the course of evolution from a common ancestor, some pieces of DNA in one species broke off and reattached elsewhere along the chromosome.


    So, while earlier studies suggested a 98% to 99% similarity, comparisons that include harder-to-align regions push that difference closer to 5% to 10%, Marques-Bonet said. “And if we account for the regions still too complex to align properly with current technology, the true overall difference is likely to exceed 10%,” he said.


    In fact, a 2025 study found that human and chimpanzee genomes are approximately 15% different when compared directly and completely. But if this direct method is used, then there is even a lot of variability within species themselves — up to 9% among chimpanzees, the 2025 study found.

    Another article reporting Tomas Marques-Bonet’s comments can be found on primatology.net.

    For more on the 2025 study, see my earlier blog post here.

  • How much of a human genome is identical to a chimpanzee genome?

    Back in 2018 I wrote a blog post entitled “How similar are human and chimpanzee genomes?” This reported my analysis of the data available at the time, concluding:

    “The percentage of nucleotides in the human genome that had one-to-one exact matches in the chimpanzee genome was 82.34%.”

    This was based on human and chimpanzee genome assemblies hg38 and panTro6.

    Critics of my analysis said that this figure could not be trusted because both the human and chimpanzee genome assemblies were incomplete at the time.

    Since then, much more complete genome assemblies have been published for both species.

    Last month, the Nature paper “Complete sequencing of ape genomes” reported various comparisons of telomere-to-telomere genome assemblies.

    When the latest human genome assembly was used as a target, and the latest chimpanzee assembly was aligned to it, the authors report gap divergence of 13.3% and single nucleotide variant divergence of 1.6% (these results can be found in Supplementary Figure III.11 and 12 respectively, for hg0002 vs PanTro3).

    As I understand their methods, gap divergence was based on counting base (A, T, G or C) positions in the human genome that have no aligning base from the chimp genome in the whole genome alignment. Single nucleotide variant divergence was based on counting bases that align to a different base (e.g. an A aligning to a T). The authors calculated these divergences for each 1 million base segment of the human genome then averaged them all to get a genome-wide figure.
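The two counts described above can be sketched on a toy alignment. This is only an illustration of the description in the previous paragraph, not the authors' pipeline: the function name `window_divergences`, the use of `-` to mark a human base with no aligning chimp base, the 10-base window (standing in for 1 million bases) and the choice to divide both counts by the window length are all assumptions made for the example.

```python
from statistics import mean


def window_divergences(human, chimp, window=10):
    """Return a list of (gap_divergence, snv_divergence) per window.

    `human` and `chimp` are equal-length strings giving, for every human
    base, the chimp base aligned to it ('-' means no aligning chimp base).
    Both fractions use the window length as denominator (an assumption).
    """
    if len(human) != len(chimp):
        raise ValueError("sequences must be the same length")
    results = []
    for start in range(0, len(human), window):
        h = human[start:start + window]
        c = chimp[start:start + window]
        # Human positions with nothing aligned from the chimp genome.
        gaps = sum(1 for b in c if b == "-")
        # Aligned positions where the two bases differ.
        snvs = sum(1 for a, b in zip(h, c) if b != "-" and a != b)
        results.append((gaps / len(h), snvs / len(h)))
    return results


# Toy data: two unaligned bases in the first window, one mismatch in the second.
human = "ACGTACGTACGTACGTACGT"
chimp = "ACGTAC--ACGAACGTACGT"

per_window = window_divergences(human, chimp, window=10)
gap_avg = mean([g for g, _ in per_window])   # average of per-window gap divergence
snv_avg = mean([s for _, s in per_window])   # average of per-window SNV divergence
print(per_window, gap_avg, snv_avg)
```

Averaging per-window values, rather than pooling all positions, gives every window equal weight regardless of how much of it aligns.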

    Adding the average gap divergence and average single nucleotide variant divergence together gives a total difference of 14.9%.

    Thus, as I understand it, for the latest assemblies, 85.1% of the nucleotides in the human genome have one-to-one exact matches in the chimpanzee genome.

    This is clearly a slightly higher figure than the 82.3% that I calculated in 2018. But it is not far off.

    The new result of 85.1% is just for autosomes (non-sex chromosomes). The same Nature paper reports 4.18% and 75.6% gap divergence and 1.15% and 3.98% single nucleotide variant divergence for X and Y chromosomes respectively.
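The arithmetic behind these percentages is simple enough to write out. The sketch below uses only the divergence figures quoted in this post; the names `reported` and `exact_match_percent` are hypothetical, and extending the "100 minus gap minus SNV" sum to the X and Y chromosomes is an assumption made for illustration, since the addition is only carried out for autosomes in the text.

```python
# Divergence figures (percent) as quoted in the post.
reported = {
    "autosomes": {"gap": 13.3, "snv": 1.6},
    "X": {"gap": 4.18, "snv": 1.15},
    "Y": {"gap": 75.6, "snv": 3.98},
}


def exact_match_percent(gap, snv):
    """Percent of human bases left after removing gap and SNV divergence."""
    return 100.0 - (gap + snv)


# Round to two decimals to hide floating-point noise in the sums.
summary = {
    name: round(exact_match_percent(**figures), 2)
    for name, figures in reported.items()
}

for name, pct in summary.items():
    print(f"{name}: {pct}% one-to-one exact match")
```

For autosomes this reproduces the 85.1% figure; on the same assumption the sum comes to roughly 94.7% for X and 20.4% for Y.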

    At some point I would like to repeat exactly the same analysis as I did in 2018 on the 2025 data. But until then, the figures reported by the authors of the Nature paper provide a helpful comparison.

  • Public lecture “Trees of Life: Do they exist?”

    I gave my inaugural lecture as Professor of Evolutionary Genomics at Queen Mary University of London on 16th November 2022, the film of which can be viewed below.

    Inaugural lectures are a chance to give a personal view on one’s research field, at a level that will be understood by the whole university and the general public.

    My Vice-Principal asked me to be more personal than usual in this inaugural, speaking about my Christian faith as well as my research as a biologist.

    I decided to do this by placing side-by-side “tree-of-life” concepts from the Bible and from The Origin of Species. By comparing and contrasting the evidence for these very different trees of life, I tried to help the audience understand how I think through things as both a biologist and a Christian.

    Whether or not this worked, you can judge for yourself.

    The lecture drew on articles I have published in Nature Ecology and Evolution, Nature Plants, and American Journal of Botany. I describe work by others in Nature and Nature Communications. The Origin of Species provided my starting point on Darwin’s tree of life simile. The works of Richard Dawkins, especially The Greatest Show on Earth and The God Delusion, provided helpful material in both sections of the lecture. On the Biblical tree of life, I used an argument by Peter J. Williams (whose research recently featured in Nature), developed in his book Can We Trust the Gospels? I also refer to research by Elizabeth Barnes on inclusion in the biological sciences.

    I have been a full professor at Queen Mary for over four years now, but there is a back-log of inaugural lectures, and many never happen at all. So it was a great privilege to be invited to give this.

  • Natural v. Artificial Selection

    Last week I published a short article in Molecular Ecology on evidence for natural selection. It has proven difficult to show natural selection occurring in real time in wild populations. New approaches may help, and these are being pioneered in studies of Soay sheep. While commenting on these new approaches, I make several general points about the evidential case for natural selection.

    Perhaps the most broadly interesting of these is a critique of the argument for natural selection by analogy with artificial selection. I suggest that, although widely used, the analogy has severe limitations. You can read this critique from the second to the seventh paragraph of my article, which is available open access here.

  • Why phylogenetics is difficult

    Here is a short video I made for one of my MSc classes, explaining why building phylogenetic trees is not easy.

    For examples of this phenomenon in my own research see here and here.

  • Lost elms of Kent

    Mature elm trees in the English landscape are something I and many others have never seen. Dutch Elm Disease killed them all in the 1960s. Only the older generation can remember what we have lost. Browsing through some local photos from the 1930s this weekend, my eyes were opened to the size and grace of the elms that once existed. Here are some of those photos, from Capel, Kent. Beneath each one I show a picture of what the scene looks like today.

    (more…)
  • How to lead a journal club

    Getting together to discuss a published paper is a classic way of keeping on top of the literature and training students how to read it.

    During my postgraduate studies I went to a journal club every week organised by my PhD supervisor. It was here that I learned how to read a scientific paper, and gained confidence in critiquing published studies. Now I run a journal club for my MSc students, and sometimes (but not often enough!) for my PhD students. Here are some tips I give my students before they lead a journal club meeting.

    (more…)
  • “Abundant bioactivity” of random DNA sequences?

    This blog was written for the Nature Ecology and Evolution Community where it is posted here.

    Probing the claims of a recent study

    Readers of this blog will be aware of the recent Nature Ecology and Evolution paper entitled “Random sequences are an abundant source of bioactive RNAs or peptides”. Rafik Neme, the first author, posted an engaging Behind the Paper blog here.

    On a quick look, I thought the study might be the beginnings of the solution to the mystery of orphan genes. (I posted about orphan genes here a few months ago.) The paper appears to demonstrate that an unexpectedly high percentage of random 150 base-pair DNA sequences are functional when expressed in E. coli. If true, this would suggest that de novo gene evolution could occur easily from junk DNA. (more…)

  • Darwin’s abominable mystery

    One of the hidden gems of Royal Botanic Gardens Kew is its library. I spent several happy hours there researching a recent letter to Nature Ecology and Evolution, published in June under the title “The deepening of Darwin’s abominable mystery”.

    The brightest moment came when a helpful librarian found me an 1838 reprint of a lecture by palaeobotanist Adolphe Brongniart: a lecture that I had not even known existed. Not only did this turn out to be the lecture that had provided Darwin with key information about the plant fossil record as he wrote his notebooks on the transmutation of species, but it was a reprint sent by the author to J. S. Henslow, Darwin’s botany professor at Cambridge. It probably came to Kew via Henslow’s son-in-law Joseph Hooker, Director of Kew, to whom Darwin famously wrote in 1879 that “the rapid development as far as we can judge of all the higher plants within recent geological times is an abominable mystery”.

    At moments like this, history is tangible. And it is sometimes at such moments that we have to reassess our anachronistic understandings of past science. This is the message of my Nature Ecology and Evolution letter. In it I argue that Darwin’s “abominable mystery” has been misunderstood ever since it came into the public sphere in 1903. This is because it has been assumed that by “higher plants,” Darwin meant “angiosperms”; that is, all flowering plants.

    The 1903 edition of Darwin’s letter

    Between 1879 and 1903 Darwin’s letter to Hooker about the “abominable mystery” lay unpublished. When it was published in 1903, the editors, A. C. Seward and Francis Darwin, left the readers in little doubt that Darwin meant “angiosperms” when he said “higher plants”. This can be seen in their page-header shown below (N.B. the word “abominable” is at the end of the previous page).

    Later in the book, the editors described their own view of the plant fossil record in a footnote saying: “No satisfactory evidence has been brought forward of the occurrence of fossil Angiosperms in pre-Cretaceous rocks. The origin of the Monocotyledons and Dicotyledons [i.e. the two major groups of Angiosperms as they knew them] remains one of the most difficult and attractive problems of Palaeobotany.” (p.239).

    The understanding of the fossil record that A. C. Seward and Francis Darwin had in 1903 is still the understanding held by the majority of palaeobotanists today. In fact, Nature Plants recently published a thorough debunking of all claimed pre-Cretaceous angiosperms by some of the world’s most authoritative experts in the palaeo-flora of the lower Cretaceous.

    So from 1903 to today, everyone has thought that Darwin’s “abominable mystery” is about the origin of all angiosperms, suddenly and in great diversity in the Cretaceous, with no obvious ancestral lineage.

    It has therefore been assumed that any reliable pre-Cretaceous angiosperm fossil, or any clear progenitor lineage, or the identification of a sister lineage to the angiosperms, would be a major step towards the solution of the abominable mystery.

    The problem is – as I discovered from Henslow’s copy of Brongniart and several other nineteenth century sources – Darwin’s understanding of the plant fossil record was rather different. Darwin thought that there were pre-Cretaceous angiosperms, and that there was a clear progenitor lineage for the Cretaceous diversity of “higher plants”.

    Discovering Darwin’s view

    This surprising discovery first began to dawn on me when I read an 1879 lecture by John Ball, upon which Darwin was commenting in his “abominable mystery” letter to Hooker. Ball’s essay is a bit obscure to the modern reader as he refers to two groupings of angiosperms: “endogens” and “exogens”. These are obsolete terms, from de Candolle, for monocotyledons and dicotyledons respectively (to make it even harder on the modern reader, the term dicotyledon is also obsolete, although most botanists at least still understand the term).

    John Ball (1879) describes the plant fossil record as showing that endogens (monocotyledons) have a long fossil record, and the exogens (dicotyledons) suddenly evolved from them and diversified rapidly in the Cretaceous. In the context of this lecture, therefore, Darwin seemed to be referring to the sudden appearance of diverse dicotyledons – rather than all angiosperms – in the fossil record as the “abominable mystery”.

    I wanted to be sure that this was the case, by both making sure that this was actually Darwin’s own view, and tracing back the original sources of this view of the plant fossil record.

    This proved easier than expected, because the Darwin Correspondence Project have in the last two years published letters between Darwin and two palaeobotanists, Oswald Heer (in 1875) and Gaston de Saporta (in 1876). In these letters, it is clear that Darwin believed that it was dicotyledons that appeared suddenly in the Cretaceous, not angiosperms as a whole. This was partly on the basis of original findings described to him by Oswald Heer, who wrote to Darwin: “The interval from the Devonian to the Cretaceous is immensely long, and, as far as we know until now, the vegetable kingdom then consisted only of cryptogams, conifers & cycads and a few monocots. In the upper Cretaceous, however, the flora suddenly underwent a great transformation and…there appear for the first time the (angiosperm.) Dicotyledonae”.

    As I was browsing through volumes of Darwin’s Correspondence in the Kew library, I noticed near them a plump hardback Concordance to Darwin’s notebooks. On a whim, I looked up the word “angiosperm”, and found nothing at all. Then I looked up the term “dicotyledonous” and found a line written by Darwin in the late 1830s that seemed to be about the fossil record.

    Did Kew have a copy of Darwin’s notebooks? Yes they did, published in the Bulletin of the British Museum in 1960. I found the reference, and to my intense interest read:

    “L’Institut 1837 p. 319 Brongniart – no dicotyledonous plants and few monocot in coal formation? …p. 320 Says coniferous structure intermediate between vascular or Crypogram (original Flora) and Dicotyledones, which nearly first appear (p. 321) at Tertiary epochs.”

    Here is a scan of the page in Darwin’s first notebook on the transmutation of species (reproduced with permission from J. van Wyhe ed., Darwin Online)

    What this meant is that in 1837 or 1838, decades before his correspondence with Oswald Heer and Gaston de Saporta, and his reading of John Ball’s essay, Darwin thought that monocotyledons had a much longer fossil record than dicotyledons.

    Now I wanted to find what Darwin had been reading. This appeared easy, as Gavin de Beer, the 1960 editor of Darwin’s notebooks, had provided a footnote referring to “A. Brongniart. “Végétaux fossiles”, L’Institut, Paris 5, 1837, 220, p. 318.”

    I requested this book from the Kew librarians, and looked up page 318. There was nothing relevant there about the monocot fossil record. I scoured the volume, and could not find what Darwin was referring to.

    Then I noticed that the librarian had also brought me another book, a much thinner volume, belonging to J. S. Henslow. It contained a reprint of a lecture given by Brongniart at L’Institut de Paris in 1837, with the catchy title: Considerations sur la nature des vegetaux qui ont couvert la surface de la terre aux diverses epoques de sa formation.

    I picked this up. It had fewer than 300 pages, so it seemed futile to look for Darwin’s references. But then on pages 19, 20 and 21 respectively I found the information that Darwin had noted as being on pages 319, 320, and 321 of the edition that he had been reading. It was this lecture by Brongniart, rather than his more famous magnum opus Végétaux fossiles, that Darwin was referring to in his notebooks.

    So from the 1830s through to the 1870s, Darwin was hearing from leading palaeobotanists that monocotyledonous angiosperms preceded dicotyledonous angiosperms in the fossil record. There was no doubt left in my mind that this was Darwin’s view.

    Darwin considered the “abominable mystery” to be the sudden appearance and diversification of dicotyledons in the Cretaceous, with monocotyledonous angiosperms present in the fossil record back to the Carboniferous.

    Below is a diagram taken from an 1885 textbook “Sketch of Paleobotany” by Lester F. Ward, which depicts the plant fossil record as it was understood in Darwin’s time. I have highlighted in green the part that has not been widely believed since around 1900.

    What this means for today

    This means that if Darwin were here today, and we persuaded him that Brongniart, Heer and de Saporta had mistaken the identity of the “monocot” fossils that they had found in pre-Cretaceous rocks, Darwin would consider the mystery to be a lot more abominable in 2017 than it had seemed to him in 1879. Far from being solved, his abominable mystery has got a lot deeper.

    Thus, many of the solutions we are seeking for the mystery today – such as pre-Cretaceous angiosperms – even if they were found, would simply restore the mystery to the depth that it appeared to have for Darwin. They wouldn’t solve it.

    We often think that the history of science is a tale of inexorable progress, where mysteries get smaller and smaller. Darwin’s abominable mystery is a reminder that sometimes the mysteries get bigger.

    This blog was written for the Nature Ecology and Evolution Community where it is posted here.

  • The evolutionary mystery of orphan genes

    Every newly sequenced genome contains genes with no traceable evolutionary descent – the ash genome was no exception

    This week in Nature I and my co-authors published the ash tree genome. Within it we found 38,852 protein-coding genes. Of these one quarter (9,604) were unique to ash. On the basis of our research so far, I cannot suggest shared evolutionary ancestry for these genes with those in ten other plants we compared ash to: coffee, grape, loblolly pine, monkey flower, poplar, tomato, Amborella, Arabidopsis, barrel medic, and bladderwort. This is despite the fact that monkey flower and bladderwort are in the same taxonomic order (Lamiales) as ash. (more…)

  • Phenotypic plasticity drives cichlid radiations?

    At the Royal Society last month, I was listening to proponents of the “extended evolutionary synthesis” (EES). Patrick Goymer has blogged this meeting for Nature Ecology & Evolution, and tweets from it can be found on Storify. The debates have rumbled on in the back of my mind since, especially the contention that phenotypic plasticity is too neglected in evolutionary biology. I was therefore fascinated to stumble upon a paper in press at Molecular Ecology which suggests an impressive case of phenotypic plasticity accelerating evolution. Ralf Schneider and Axel Meyer argue that rapid, convergent radiations of cichlid fish in East African Lakes have been greatly facilitated by morphological plasticity, and its fixation as regulatory networks degenerate. “The cichlids of Africa’s lakes impress us mightily with what evolution can do in a short space of time”, wrote Richard Dawkins in The Greatest Show on Earth (Bantam Press, 2009). Will these radiations become textbook examples of the EES in action?

    This blog was first posted here at Nature Ecology & Evolution Community on 8 December 2016

  • Telegraph article: British woodlands need diversity from around the world

    This article was written for The Daily Telegraph and is published online here.

    Foreign tree species are needed to help preserve Britain’s woodlands from disease, argues Dr Richard Buggs.

    Trees in Britain do not have enough genetic diversity to cope with a global influx of pathogens.

    As global trade introduces new pests and diseases, we face ecological and economic disaster as one tree species after another succumbs to imported diseases. (more…)