Made up of 27 million people from 70,000 generations, it dates back up to two million years

An international team of scientists has created the largest family tree of humanity to date. This network, which tracks with unprecedented precision how individuals around the world are related to each other, is made up of 27 million people over 70,000 generations, stretching back to the dawn of humankind. To build it, the researchers used 4,000 modern genomes from 215 different populations and samples from several ancient individuals, including Neanderthals and a Denisovan. Published this Thursday in the journal ‘Science’, the study sheds light on key events in our history, such as the migration out of Africa.

“The work we have published is the first attempt to create a single ‘family tree’, actually a network of trees, that captures the relationships between all individuals from the most recent times to our earliest history,” Gil McVean, professor of statistical genetics at the University of Oxford, explains to this newspaper.

During the last two decades, advances in genetic research have allowed large projects such as the UK Biobank, the world’s largest biological bank, and commercial programs for discovering ‘DNA relatives’, such as 23andMe, to use the genetic ‘barcode’ of hundreds of thousands of people. This latest ‘family tree’, however, has been built from a few thousand individuals, some of them prehistoric. Among them are three Neanderthals (Chagyrskaya, 80,000 years old; Vindija, 50,000; Altai, 110,000) and a single Denisovan (64,000 years). The team also used data from early modern humans, including a family of four (parents and two children) from the Afanasievo culture in the Altai Mountains (5,000 years), Ust’Ishim man from Siberia (45,000 years), Loschbour man from Luxembourg (8,000) and an individual known as LBK from Stuttgart, Germany (7,000).

In this way, the scientists reached back as far as 2 million years, long before anatomically modern humans appeared between 200,000 and 300,000 years ago. Because combining genomic sequences from many different databases was challenging, the researchers turned to big-data methods.

“Our method uses DNA sequences to learn about ancestral relationships between individuals. What we’re trying to do is trace how the genetic mutations that arose in our ancestors, and the parts of the genome in which they occurred, have been passed down from generation to generation to the present day,” explains McVean. “What’s different about our approach is that we can do this at the scale of the entire genome and across thousands (potentially millions) of humans. In addition, we can estimate the date and approximate geographic location of the ancestors. We can see the full complexity of how genetic material has been shared throughout evolution. Without this method, we have very partial indications of the same events and processes, but it is not possible to put the whole story together,” he adds.

Jasmin Rees, from University College London, who has written an article in ‘Science’ accompanying the study, shares this view. “Without methods of this efficiency, tree reconstruction for so many samples would be nearly impossible and incredibly impractical. These developments have really opened up a lot of potential for this type of study, allowing the inference of genealogies to an extent that was not possible before,” she tells ABC.

The roots, in Sudan

The ‘map’ provides information on key events in human history. For example, the researchers found that the oldest roots of human variation can be traced back to northeast Africa, in a region centered on present-day Sudan, more than a million years ago. They have also been able to see how ancient human species, such as the Denisovans, have left genetic descendants worldwide (outside of Africa), but in very different patterns. For example, people in Papua New Guinea and Oceania have a large amount of Denisovan ancestry (over 10%), but it is also found among Europeans. “Both of these observations have been made before (to some extent),” McVean acknowledges, “although we believe our approach illuminates these events in a simple and direct way.”

Furthermore, as Rees points out, “geographic inference from the tree can show key migrations. For example, we can see movement out of Africa (and the focus of inferred ancestry within Northeast Africa before that migration), migrations through Papua New Guinea, and migrations through the Americas.”

This family tree can also be used to study how genetic variants that influence health emerged and spread around the world, such as those that increase the risk of a severe response to Covid-19 or those that raise the risk of autoimmune diseases.

The team plans to make the tree even more complete by continuing to incorporate genetic data as it becomes available. The authors state that because tree sequences store data very efficiently, they could easily accommodate millions of additional genomes. The goal is to generate a single, unified map that explains the descent of all the human genetic variation we see today.