The release of a "draft" human genome sequence in 2001 was a momentous event. It triggered a profound shift in our understanding of human genetics and paved the way for remarkable advances in the study of biology and in the treatment of disease.
Some sections, however, had not been sequenced, and some of the information was incorrect. Exactly two decades later, we have a new version. It was published as a preprint by an international consortium of researchers. It has yet to be peer reviewed, but it finally seems "complete".
Sequencing: why did it take us so long?
Technological limitations meant that the original sequence covered only the “euchromatic” portion of the genome, the 92% of our genome where the majority of genes are found. This is the portion most active in making gene products such as RNA and proteins.
The new sequence fills in the remaining gaps, providing all 3.055 billion base pairs (“letters”) of our DNA code in its entirety. The data have been made publicly available, in the hope that other researchers will use them to further their research.
The heterochromatic part
Much of the newly sequenced material is the “heterochromatic” part of the genome, which is more “tightly packed” than the euchromatic genome and contains many highly repetitive sequences that are very difficult to read accurately.
It was once thought that these regions did not contain any important genetic information, but it is now known that they contain genes involved in fundamental processes such as the formation of organs during embryonic development. Among the 200 million newly sequenced base pairs are around 115 genes that are predicted to be involved in producing proteins.
Two key factors made this great achievement possible:
A very special cell
The newly published genome sequence was created using human cells derived from a very rare type of tissue called a complete hydatidiform mole, which forms when a fertilized egg loses all the genetic material contributed by the mother. Most cells contain two copies of each chromosome, one from each parent, with each parent's chromosome contributing a different DNA sequence. A cell from a complete hydatidiform mole has two copies of the father's chromosomes only, and the genetic sequence of each pair of chromosomes is identical.
This made the whole genome sequence much easier to piece together.
Many, many advances in sequencing technology
After decades of very slow progress, the Human Genome Project achieved its 2001 breakthrough with a method called “shotgun sequencing”. It involved breaking the genome into very small fragments of around 200 base pairs, cloning them inside bacteria to decipher their sequences, and then piecing them back together like a giant jigsaw puzzle.
This is the main reason why the original draft sequence covered only the euchromatic regions of the genome: only these regions could be reliably sequenced with the shotgun method.
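The jigsaw-style reassembly described above can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the 24-letter "genome", the function names `shred`, `overlap` and `greedy_assemble`, and the simple greedy merging strategy. It is not the pipeline the Human Genome Project actually used, and it ignores sequencing errors and the two DNA strands.

```python
def shred(genome: str, read_len: int, step: int) -> list[str]:
    """Cut the genome into overlapping fragments ("reads") of fixed length."""
    return [genome[i:i + read_len]
            for i in range(0, len(genome) - read_len + 1, step)]


def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that is also a prefix of b."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of reads with the largest overlap."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    # Drop reads that are fully contained in another read.
    reads = [r for r in reads if not any(r != o and r in o for o in reads)]
    while True:
        best = (0, None, None)  # (overlap length, index of left, index of right)
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best[0]:
                        best = (n, i, j)
        n, i, j = best
        if n == 0:
            return reads  # nothing left to merge
        merged = reads[i] + reads[j][n:]
        reads = [r for k, r in enumerate(reads) if k not in (i, j)] + [merged]


# Hypothetical repeat-free toy genome, shredded into 8-letter reads.
genome = "ATGCGTACCTGAAGTCCGATAGCA"
reads = shred(genome, read_len=8, step=4)
contigs = greedy_assemble(reads)
print(contigs)
```

For this small, repeat-free input the reads merge back into a single piece equal to the starting sequence. Real genomes are far less forgiving, because repeated stretches create many equally good overlaps.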
The latest sequence was obtained using two new, complementary DNA sequencing technologies. One was developed by PacBio and can sequence longer DNA fragments with very high accuracy.
The second, developed by Oxford Nanopore, produces very long stretches of continuous DNA sequence. These new technologies allow the pieces of the puzzle to be thousands or even millions of base pairs long, making them easier to assemble.
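A rough way to see why longer puzzle pieces help with repetitive DNA is to count how many possible reads of a given length could have come from more than one place in a sequence. The sketch below is a toy model under stated assumptions: the flanking sequences, the repeated six-letter motif, the read lengths and the function name `ambiguous_fraction` are all made up for illustration, and it does not represent how the PacBio or Oxford Nanopore platforms or the consortium's assembly software work.

```python
from collections import Counter


def ambiguous_fraction(genome: str, read_len: int) -> float:
    """Fraction of read positions whose sequence occurs more than once."""
    windows = [genome[i:i + read_len]
               for i in range(len(genome) - read_len + 1)]
    counts = Counter(windows)
    ambiguous = sum(1 for w in windows if counts[w] > 1)
    return ambiguous / len(windows)


# Hypothetical toy genome: unique flanks around a 60-letter repetitive block.
unique_left = "ATGCCGTAGCTA"
repeat_block = "TTAGGG" * 10
unique_right = "CGTACGATCCAT"
toy_genome = unique_left + repeat_block + unique_right

# Short reads fit entirely inside the repeat, so many look identical.
short_reads = ambiguous_fraction(toy_genome, read_len=8)
# Long reads span the whole repeat plus some unique flanking sequence.
long_reads = ambiguous_fraction(toy_genome, read_len=70)

print(f"short reads ambiguous: {short_reads:.0%}")
print(f"long reads ambiguous:  {long_reads:.0%}")
```

In this toy case most short reads cannot be placed uniquely, while every long read reaches into unique sequence on at least one side of the repeat and so has only one possible position.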
The new information has the potential to advance our understanding of human biology, including how chromosomes function and maintain their structure. It will also improve our understanding of genetic conditions such as Down syndrome that have an underlying chromosomal abnormality.
So is the genome now completely sequenced?
Well, no. Not even now.
One obvious omission is the Y chromosome, because the hydatidiform mole cells used to compile the sequence contained only two identical copies of the X chromosome. But this work is ongoing, and the researchers predict that their method can also accurately sequence the Y chromosome, despite its highly repetitive sequences.
However, there is still a way to go
While sequencing the (nearly) complete genome of a human cell is a truly impressive landmark, it is only one of several crucial steps towards fully understanding human genetic diversity. The next task will be to study the genomes of different populations (the complete hydatidiform mole cells were of European origin).