Next-Generation Sequencing Researchers Are Closer to Completing the Human Genome Sequence

(December 16, 2021) Two decades ago, the Human Genome Project released the first sequences of the human genome. It was closely followed by the biotech firm Celera Genomics, which played a key role in reducing the cost of sequencing and partly used data from the Human Genome Project to create its own, cheaper human genome sequence. But technological limitations left the researchers with several gaps in their final sequence. In particular, they struggled to piece together reads of DNA from regions that contained long sections of repeated base pairs. As a result, when they completed the first draft of the sequence, approximately 15% was missing.

Over the years, researchers have worked to solve the rest of the puzzle. But even the most recent human genome, finished in 2013 and patched in 2019, lacks 8% of the full sequence because heterochromatin and other complicated sections of DNA are difficult to sequence. However, geneticists from the Telomere-to-Telomere (T2T) Consortium have now used new technology to complete even more of the human genome sequence, leaving only the Y chromosome unsequenced.

BioTechniques, an international life sciences journal, closely follows developments in next-generation sequencing and has published an update on this human genome sequencing progress.

T2T Consortium Progress on the Human Genome Sequence

The T2T Consortium is an international association of approximately 30 institutions. Karen Miga (University of California Santa Cruz), Adam Phillippy (National Human Genome Research Institute), and Evan E. Eichler (University of Washington School of Medicine) launched the Consortium to develop research into ‘unmappable’ centromere regions.

In May 2021, the Consortium published a preprint entitled ‘The complete sequence of a human genome’, in which the researchers report that they have sequenced the remainder of the human genome. The preprint reports the addition of 200 million DNA base pairs and 115 protein-coding genes to the human genome sequence. This marks a 4.5% increase in the number of base pairs (up to 3.05 billion) and a 0.4% increase in the number of protein-coding genes (up to 19,969). The preprint has yet to be peer-reviewed.

New Next-Generation Sequencing Technologies

The Consortium used new next-generation sequencing technologies from Pacific Biosciences (CA, USA) and Oxford Nanopore (UK) in their latest draft of the human genome sequence, T2T-CHM13. These long-read sequencing technologies can sequence long stretches of DNA isolated from cells, up to 20,000 base pairs at a time. Traditional sequencing methods, by contrast, can only read a few hundred base pairs at a time. Researchers then reassemble the reads like puzzle pieces. The longer stretches of DNA produced by long-read technologies are easier to put together because they are more likely to contain sequences that overlap.
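
The puzzle-piece idea can be illustrated with a toy example. The Python sketch below is not the Consortium's assembly pipeline. It is a minimal greedy overlap merge, written under the assumption that reads are error-free strings taken from the same strand. The function names (`overlap`, `greedy_assemble`), the `min_len` parameter, and the example reads are all hypothetical and chosen only for illustration.

```python
def overlap(a, b, min_len=3):
    """Return the length of the longest suffix of `a` that matches a
    prefix of `b`, or 0 if no match of at least `min_len` bases exists."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the longest overlap.

    Toy model only: real assemblers must also handle sequencing errors,
    reverse complements, and repeats longer than the reads themselves.
    """
    reads = list(reads)
    while len(reads) > 1:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_len:
                        best_len, best_i, best_j = k, i, j
        if best_len == 0:
            break  # no remaining reads overlap, so the contigs stay separate
        merged = reads[best_i] + reads[best_j][best_len:]
        reads = [r for idx, r in enumerate(reads)
                 if idx not in (best_i, best_j)] + [merged]
    return reads


# Hypothetical reads sampled from the made-up sequence ATGCGTACGTTAGC
example_reads = ["ATGCGTAC", "GTACGTTA", "GTTAGC"]
print(greedy_assemble(example_reads))  # ['ATGCGTACGTTAGC']
```

In this sketch, reads that share no overlap of at least `min_len` bases remain as separate fragments. This mirrors the gaps that are left behind when reads are too short to span a region.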

Instead of taking DNA from a living human, the researchers used a cell line derived from a complete hydatidiform mole (a type of tissue that forms in humans when a sperm fertilizes an egg that doesn’t have a nucleus). Because all of the cell line’s DNA came from the sperm, the researchers didn’t need to distinguish between chromosomes from two individuals. However, the sperm cell carried only an X chromosome, so the new sequence didn’t cover the Y chromosome, which typically triggers male biological development.

Also, despite the progress made, the Consortium estimates that approximately 0.3% of the sequence could contain errors because of challenges associated with passaged cell lines and problematic areas of the genome where it was difficult to perform quality checks.

The Future of the Human Genome Sequence

The genomics researchers are now working on sequencing the Y chromosome using the same methods and are planning to sequence a genome that contains chromosomes from two parents.

Meanwhile, the Consortium has paired up with the Human Pangenome Reference Consortium to sequence more than 300 genomes from around the world over the next three years. Together, the Consortiums will use T2T-CHM13 as a reference to understand which parts of the genome tend to differ from individual to individual.

Now that there are so many tools and resources available for next-generation sequencing, this research should help identify links between newly sequenced areas and human diseases. As a result, human genome sequencing could become a routine practice further down the line.

About BioTechniques

When BioTechniques printed its first issue in 1983, it was the first publication to review lab methodologies instead of treatments. Today, the open-access, peer-reviewed journal is the go-to resource for scientists who work across a multitude of disciplines, like chemistry, physics, computer science, and plant and agricultural science. These professionals delve into the methods and techniques (and the reproducibility of these) that are key to the future of science and medicine. They not only read the journal but also use it to better their practices and to keep up to date with ever-evolving scientific and medical techniques like CRISPR gene editing, chromatography, western blotting, polymerase chain reaction, and next-generation sequencing.

Journal aside, BioTechniques also publishes a wealth of information on its multimedia website, which welcomes a growing community of online users. Here, scientists, industry experts, and lab workers access articles, eBooks, interviews, videos, webinars, and podcasts to fuel their knowledge and continual professional development. They can also take part in meaningful discussions that shape the future of scientific and medical progression.

BioTechniques is a Future Science Group journal. This progressive scientific and medical publisher is home to 34 journals, including Nanomedicine, Regenerative Medicine, and Future Oncology. Future Science Group is dedicated to the progression of research, development, and clinical practice through the publication of high-quality journals and tools that improve access to the latest research and improve communication between professionals. The group has published more than 50,000 articles to date and receives more than 5 million article downloads each year.