Human genome: 1 million sequences will open the doors of genetics

April 17 2021

Medicine, Technology

20 years after the unveiling of the human genome, science can now sequence many more in less time: it will soon have the data it needs to unlock many secrets of genetics.

The first draft of the human genome was published exactly 20 years ago. It took nearly three years to complete, at a cost of nearly a billion dollars. The Human Genome Project allowed scientists to read, almost from end to end, the 3 billion base pairs of DNA (or “letters”) that biologically define a human being.

It was an epochal undertaking. The project allowed a new generation of researchers to identify new targets for cancer treatments, engineer mice with human immune systems and even build a web page where you can navigate the entire human genome as if it were Google Maps.

The first complete human genome was generated from a handful of anonymous donors. The aim was to produce a reference genome that represented more than a single individual. As expected, it was not enough to understand the vast diversity of human populations around the world. No two people are alike and no two genomes are alike. If researchers wanted to understand humanity in all its diversity with greater precision, a single human genome would not be enough.

Thousands or millions of them would have to be sequenced: and this is precisely the purpose of a project currently underway.

Understanding genetic diversity

The richness of genetic diversity among people is what makes each person unique. But genetic changes also cause many ailments and make some groups of people more susceptible to certain diseases than others.

At the time of the Human Genome Project, researchers were also sequencing the complete genomes of simpler organisms such as mice, fruit flies, yeasts and some plants. The enormous effort that went into generating these first genomes led to a revolution in the technology required to read genomes. That technology has advanced to the point that sequencing an entire human genome no longer takes years and millions of euros. It now takes a few days and costs less than a thousand euros.

Thousands of genomes

Advances in technology have allowed scientists to sequence the complete genomes of thousands of individuals from around the world. Initiatives such as the Genome Aggregation Consortium are making great efforts to collect and organize this scattered data. So far, that group has been able to collect nearly 150,000 genomes. Within this dataset, researchers found more than 241 million differences in people's genomes, with an average of one variant every eight base pairs.

Most of these variations are very rare and will have no effect on a person. However, hidden among them are variants with important physiological and medical consequences. For example, some variants of the BRCA1 gene predispose certain groups of women, such as Ashkenazi Jews, to ovarian and breast cancer. Other variants in that gene lead some Nigerian women to experience higher-than-normal mortality from breast cancer.

How to identify these variants of the human genome?

The best way researchers can identify these types of variants at the population level is through studies that compare the genomes of large groups of people with a control group. But diseases are complicated. An individual's lifestyle, symptoms and time of onset can vary greatly, and the effect of genetics on many diseases is difficult to distinguish. The predictive power of current genomic research is too low to tease out many of these effects because there is not enough genomic data.
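As a rough illustration of the comparison against a control group described above, the sketch below computes an odds ratio and a chi-square statistic for a single variant. The allele counts and function names are hypothetical, chosen only for the example. Real association studies also correct for ancestry, multiple testing and other confounders.

```python
def odds_ratio(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Odds of carrying the variant allele in cases relative to controls."""
    return (case_alt * ctrl_ref) / (case_ref * ctrl_alt)


def chi_square_2x2(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Pearson chi-square statistic (1 degree of freedom) for a 2x2 table."""
    a, b, c, d = case_alt, case_ref, ctrl_alt, ctrl_ref
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Hypothetical allele counts for one variant, not real data.
counts = dict(case_alt=30, case_ref=70, ctrl_alt=10, ctrl_ref=90)
print(odds_ratio(**counts))      # about 3.86
print(chi_square_2x2(**counts))  # 12.5
```

Because millions of variants are tested at once, the significance threshold has to be far stricter than the usual 0.05, which is one reason small samples cannot separate real effects from noise.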

Understanding the genetics of complex diseases, particularly those related to genetic differences between ethnic groups, is essentially a big data problem. And researchers need more data. Much more data.
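One way to see why sample size matters: under a simple binomial model (an assumption made here for illustration, not a method described in the article), the uncertainty in an estimated variant frequency shrinks with the square root of the number of genomes sampled. The function name and the example frequency are hypothetical.

```python
import math


def allele_freq_std_error(freq, n_people):
    """Standard error of an allele frequency estimated from n diploid genomes.

    Assumes independent sampling of 2 * n_people chromosomes.
    """
    return math.sqrt(freq * (1 - freq) / (2 * n_people))


# A rare variant carried on 0.1% of chromosomes (hypothetical value).
for n in (1_000, 150_000, 1_000_000):
    print(n, allele_freq_std_error(0.001, n))
```

Going from a thousand genomes to a million cuts the standard error by a factor of about 32, which is what makes rare variants measurable at all.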

1,000,000 genomes

To address the need for more data, the National Institutes of Health initiated a program called All of Us. The project aims to collect genetic information, medical records, and health habits from surveys and wearable devices for over one million people in the United States over the course of 10 years. It opened to the public in 2018, and more than 270,000 people have contributed samples since then.

The great potential of this project lies in the possibility of doing research by cross-referencing very different kinds of data. A neuroscientist could look for genetic variations associated with depression while taking exercise levels into account, for example. An oncologist might look for variants related to skin cancer risk based on ethnic differences.

With one million human genomes we will have an extraordinary wealth of data to discover the effects of genetic variation on disease, not only for individuals, but also within different groups of people.

The dark forest of the human genome

Another advantage of this project is that it will allow scientists to learn about parts of the human genome that are currently very difficult to study. Most genetic research has focused on the parts of the genome that code for proteins. However, these represent only 1.5% of the human genome.

One promising line of research focuses on RNA, a molecule that turns messages encoded in a person's DNA into proteins. However, RNAs that come from the 98.5% of the human genome that does not produce proteins have a host of other functions. Some of these RNAs are involved in processes such as the way cancer spreads, embryonic development or control of the X chromosome in females. Because the All of Us project includes all coding and non-coding parts of the genome, it will be by far the largest dataset available to shed light on these mysterious RNAs.

The first human genome ushered in 20 years of incredible scientific advancements. A much larger set of data will open the way to treatments for many complex diseases. Thanks to big data projects like All of Us, researchers will be able to model our health based on individual genetics.

Gianluca Riccio, creative director of Melancia adv, copywriter and journalist. He is part of the Italian Institute for the Future, World Future Society and H+. Since 2006 he has directed Futuroprossimo.it, the Italian Futurology resource.
