Scientists Solve One of Genomics’ Biggest Challenges by Using HiFi Sequencing to Distinguish Highly Similar Paralogous Genes


MENLO PARK, Calif., March 17, 2025 (GLOBE NEWSWIRE) -- PacBio (NASDAQ: PACB), a leading provider of high-quality, highly accurate sequencing platforms, today announced a newly published study in Nature Communications unveiling a powerful new method for analyzing some of the most complex regions of the human genome. Led by researchers from PacBio, GeneDx, and a global consortium of genomics experts, the study uses Paraphase, an informatics tool that, when paired with HiFi long-read sequencing, allows for high-precision variant detection and copy number analysis of 316 genes in previously inaccessible segmental duplication regions, including 9 challenging medically relevant genes.

Segmental duplications (SDs) are highly similar, duplicated regions of the genome that have posed persistent challenges for genetic analysis. These regions contain hundreds of genes critical to human health—including those implicated in spinal muscular atrophy (SMN1/SMN2), congenital adrenal hyperplasia (CYP21A2), and red-green color blindness (OPN1LW/OPN1MW)—but their high sequence similarity makes accurate mapping and variant detection nearly impossible with short-read sequencing. Paraphase, combined with HiFi sequencing, overcomes these challenges by phasing haplotypes across paralogous gene families, providing a more complete and accurate view of genetic variation. This is enabled by the length and accuracy of reads from HiFi sequencing.
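To make the read-length argument concrete, here is a toy sketch in Python. It is not Paraphase's algorithm, and the sequences, positions, and the function name `informative_fraction` are invented for illustration. It assumes two equal-length paralog copies that differ at only two sites, and asks what fraction of possible read placements cover at least one differing site, since only those reads can be assigned to one copy or the other.

```python
def informative_fraction(paralog_a, paralog_b, read_len):
    """Fraction of read start positions whose window covers at least one
    site where the two paralogs differ (hypothetical helper, toy model)."""
    if len(paralog_a) != len(paralog_b):
        raise ValueError("toy model assumes equal-length, pre-aligned copies")
    # Sites where the two copies differ; only these can tell them apart.
    diff_sites = [i for i, (a, b) in enumerate(zip(paralog_a, paralog_b)) if a != b]
    n_starts = len(paralog_a) - read_len + 1
    if n_starts <= 0:
        return 0.0
    informative = 0
    for start in range(n_starts):
        end = start + read_len
        if any(start <= site < end for site in diff_sites):
            informative += 1
    return informative / n_starts


# Made-up 200 bp "gene" and a near-identical copy differing at two sites.
paralog_a = "ACGT" * 50
copy = list(paralog_a)
for pos in (20, 180):          # assumed differing positions
    copy[pos] = "T"            # both positions hold "A" in paralog_a
paralog_b = "".join(copy)

for read_len in (10, 150, 200):
    frac = informative_fraction(paralog_a, paralog_b, read_len)
    print(f"read length {read_len:>3}: {frac:.2f} of placements are informative")
```

In this toy setup most short reads fall entirely within identical sequence and cannot be placed, while a read spanning the whole region always covers a distinguishing site. Real segmental duplications are far longer, and real tools must also handle sequencing errors and copy number changes, which this sketch ignores.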

Study Reveals Previously Inaccessible Regions of the Genome


By applying Paraphase to 160 long (>10 kb) segmental duplication regions spanning 316 genes, the researchers revealed new insights into genetic variation across five ancestral populations.

Among the key findings:

Newly Identified De Novo Variants in SDs in Parent-Offspring Trios: Analysis of 36 trios uncovered 7 previously undetected de novo single nucleotide variants (SNVs) and 4 de novo gene conversion events, two of which were non-allelic, a level of detail not possible with traditional sequencing approaches.
Copy Number Variability Across Populations: The study profiled the copy number distributions of paralog groups across populations, showing high copy number variability in many gene families in SDs. It also provided a new approach for identifying false duplications in the reference genome.
Gene Conversion Drives Sequence Similarity between Genes and Paralogs: The team identified 23 paralog groups with strikingly low genetic diversity between genes and paralogs, indicating that frequent gene conversion and/or unequal crossing-over may have played a role in preserving highly similar gene copies over time.

“For decades, sequencing technologies have struggled to provide reliable data on paralogous genes—some of the most medically relevant but hardest to analyze regions of the genome,” said Dr. Michael A. Eberle, Vice President of Bioinformatics at PacBio and senior author of the study. “With Paraphase and HiFi sequencing, we now have a scalable way to accurately genotype SD-encoded genes across diverse populations, filling in long-standing gaps in genomic research and improving our ability to identify disease-linked variants.”

The study also highlights how Paraphase can disentangle medically important gene families that have long required specialized, multi-step assays like MLPA and Sanger sequencing. For example, in the CYP21A2/CYP21A1P region—where mutations cause congenital adrenal hyperplasia—the researchers characterized a previously overlooked duplication allele carrying both a functional CYP21A2 copy and a CYP21A2(Q319X) pseudogene copy, which could have led to misclassification in standard tests.

“This study demonstrates that when we use HiFi sequencing we see a much richer and more complex picture of genetic variation,” said Dr. Xiao Chen, lead author of the study and principal scientist at PacBio. “Paraphase enables the precise resolution of genetic regions that have been largely inaccessible until now, providing new opportunities for disease research, population genetics, and potentially even clinical testing.”

“Long-read genome sequencing offers the ability to detect variants that are difficult to identify using other testing methods, particularly in regions with highly similar sequence,” said Dr. Paul Kruszka, MD, FACMG, Chief Medical Officer at GeneDx. “This work may enhance variant detection, resolve complex genomic regions, and provide more answers for patients and families, so we are encouraged by the prospect of the data.”

The full study, “Genome-wide profiling of highly similar paralogous genes using HiFi sequencing,” is now available in Nature Communications.

About PacBio


PacBio (NASDAQ: PACB) is a premier life science technology company that is designing, developing and manufacturing advanced sequencing solutions to help scientists and clinical researchers resolve genetically complex problems. Our products and technologies stem from two highly differentiated core technologies focused on accuracy, quality and completeness, which include our HiFi long-read sequencing and our SBB® short-read sequencing technologies. Our products address a broad set of research applications including human germline sequencing, plant and animal sciences, infectious disease and microbiology, and oncology. For more information, please visit www.pacb.com.

