Produced by: Science Popularization China
Author: Li Lei
Producer: China Science Expo

In "The Gene Kingdom II", we followed the twists and turns on the road from genes to genomes. Through the joint efforts of six countries, the Human Genome Project was carried out, and a draft of the human genome was announced to the world in 2001. But do we already have a complete human genome? The answer is no.

The human genome is not complete

Even after the Human Genome Project was completed, many shortcomings remained. The most typical one is that the "human genome" itself was incomplete.

We tend to imagine the genome as a single line of A, T, G and C running from beginning to end, but this is not the case. The genome is distributed in the human body in the form of chromosomes, and the human body has a total of 23 pairs of chromosomes. If we picture the human genome as a residential compound, then each chromosome is a separate apartment building. In theory, the compound holds 23 pairs: 22 pairs of buildings called autosomes, plus a sex-chromosome building made up of the two sex chromosomes, X and Y. On top of these there is one more building for the mitochondrial genome. Together they make up the human genome.

In reality, however, the genome we obtained was not simply a set of complete buildings. Each building had some floors missing. These floors do exist; the technology of the time simply could not read them.

The most typical example is the repetitive sequence. Although our genome is composed of A, T, G and C, some stretches of DNA appear again and again. For example, a unit of 2-20 nucleotides is sometimes repeated hundreds or thousands of times. According to scientists, such repeats can be divided into at least two kinds: common repeats and segmental duplications.
Let's take two specific examples: a run of dozens of Ts ("TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT"), and D21S11, a marker commonly used in paternity testing that contains a variable number of TCTA and TCTG repeats ((TCTA)n(TCTG)n(TCTA)nTA(TCTA)nTCA(TCTA)nTCCATA(TCTA)n). Faced with sequences like these, the sequencing of the time could not determine where they sat or exactly what they contained. Worse, such sequences are often located in special positions, such as the centromere in the middle of the chromosome and the telomeres at its ends. As a result, limited by the technology of the time, the human genome we obtained was riddled with gaps. How big are these gaps? Together, they account for about 8% of the human genome.

Genome sequence (Image source: Science)

Scientists have been trying to fill these gaps ever since 2003, and an international research team formed the Telomere-to-Telomere (T2T) Consortium to decipher the sequences of these complex regions. At the beginning, the consortium's work progressed slowly: these regions consist largely of repeats, and it is difficult for a computer to work out the order and exact composition of repeated segments.

You may be able to guess what brought about the turning point. Yes, it was technological innovation again. Hope of solving the problem arrived only with a new sequencing technology: long-read DNA sequencing, also known as third-generation sequencing. With first-generation and second-generation sequencing, a single read was only about a few hundred bases long. If a repeated fragment appeared over and over in a region, assembly stalled, because the reads could not reveal the order and exact composition of the copies. The new sequencing technology can read tens of thousands or even hundreds of thousands of bases from beginning to end in a single pass.
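Why read length matters can be shown with a toy model. The sketch below is only an illustration, not how real assemblers work: the flanking sequences, the 12-copy TCTA repeat, and the function name `ambiguous_read_fraction` are all made up for this example, and reads are assumed to be error-free. It asks what fraction of reads of a given length match more than one position in the toy genome.

```python
from collections import Counter


def ambiguous_read_fraction(genome: str, read_len: int) -> float:
    """Fraction of error-free reads of length read_len whose sequence
    occurs at more than one position in genome."""
    if not 0 < read_len <= len(genome):
        raise ValueError("read_len must be between 1 and len(genome)")
    # Take one read starting at every possible position.
    reads = [genome[i:i + read_len] for i in range(len(genome) - read_len + 1)]
    counts = Counter(reads)
    # A read is ambiguous if the same sequence also starts somewhere else.
    ambiguous = sum(1 for read in reads if counts[read] > 1)
    return ambiguous / len(reads)


# Hypothetical toy genome: two unique flanks around 12 copies of TCTA.
left_flank = "GATTACAGGCTTAACGGTCA"
right_flank = "CCGTAGGATCCAAGTTGCAC"
genome = left_flank + "TCTA" * 12 + right_flank

# Short reads (8 bases) can fall entirely inside the 48-base repeat.
print("short reads:", ambiguous_read_fraction(genome, 8))
# Long reads (60 bases) always reach into at least one unique flank.
print("long reads: ", ambiguous_read_fraction(genome, 60))
```

In this toy genome, many 8-base reads from inside the repeat are indistinguishable from one another, while every 60-base read is longer than the repeat, overlaps a flank, and fits in only one place.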
Sequences that long are almost never repeated elsewhere in the genome, and the short repetitive regions fall entirely inside them. With the help of this new technology, scientists have successfully decoded the remaining 8% of the genome, producing the most complete human genome to date.

We can see that the completion of the human genome was achieved through the combined efforts of first-generation, second-generation, and third-generation sequencing. We should not judge a sequencing technology as better or worse merely by when it appeared. Had any one of these technologies been missing, the complete reading of the human genome could not have been achieved.

Of course, it must be pointed out that even today the human genome cannot be said to be 100% sequenced. A small part still needs to be filled in, but at the current rate of progress this problem may be completely solved in the next few years.

The future is here? Too early to tell

Now that we have built the most complete human genome to date, is the mission accomplished? In fact, this is only the first step in a long journey. When the human genome was announced in 2001, the field cheered as if the future had arrived. However, scientists soon discovered a series of problems.

First of all, the human genome sequenced back then is what we call the reference genome. You can think of it as the "standard version", but every individual differs from it. No person's genome is 100% identical to the reference genome; the differences are what we call "genetic variation" or "gene mutation". These variations are also a fundamental reason why the world is so diverse. No two people in the world are exactly the same. Strictly speaking, even identical twins do not have exactly the same genes.
To go one step further, not even all the cells in your own body necessarily share the same genome.

Gene mutation (Image source: wiki)

The main reason for this phenomenon is gene mutation. Many factors can cause it, including physical factors such as radiation, chemical factors such as various carcinogens, and biological factors such as viruses. Even without these factors, our own genetic machinery produces mutations. DNA replication, the process of going from one copy to many, is not 100% accurate and introduces random errors. Although the human body has repair mechanisms, repair is not 100% accurate either. In the end, genes mutate even when no mutagenic factors are present. As a result, the DNA of the same person may not be exactly the same in different parts of the body or at different times.

Of course, it should be pointed out that "gene mutation" is a neutral term, and we should not be afraid of it. Many advantageous traits also arose through mutation, and each person generally carries millions of single nucleotide polymorphisms (which can loosely be understood as mutations).

So, what can we do once we have deciphered these codes? Don't worry, we will reveal it next time.

Editor: Sun Chenyu