Seeing the Unseen: Reporting on Cancer Fusions with Long-Read Sequencing and Genomenon CKB

February is recognized as Cancer Prevention Month, with World Cancer Day on February 4 serving as a global reminder that improving outcomes in cancer depends on earlier detection, better diagnostics, and faster translation of data into clinical decisions. In precision oncology, early detection no longer stops at identifying the presence of cancer but increasingly depends on detecting the right genomic variants and interpreting them with enough clinical context to influence care. Structural variants, particularly gene fusions, are a primary example of this challenge.

One such example is the NUP98::NSD1 fusion, a recurrent structural variant most commonly associated with acute myeloid leukemia (AML). This fusion is consistently linked to aggressive disease biology and poor prognosis. Yet despite its clinical importance, NUP98::NSD1 is also a fusion that highlights two persistent bottlenecks in cancer genomics: reliable detection and meaningful interpretation.

From a detection standpoint, NUP98::NSD1 exemplifies why structural variants remain difficult for many short-read sequencing workflows. Short-read technologies are highly effective for identifying small variants, but they often struggle when breakpoints fall within large intronic regions, repetitive sequence, or complex rearrangements. In these scenarios, evidence for a fusion may appear fragmented: split reads that fail quality thresholds, discordant pairs that lack specificity, or signals that are filtered out entirely by clinical pipelines optimized to minimize false positives. The result is not necessarily an incorrect call, but often no call at all.
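The "no call" outcome can be illustrated with a minimal sketch. The field names, the threshold values, and the `call_fusion` function below are hypothetical and chosen purely for illustration; they are not taken from any specific clinical pipeline or vendor tool.

```python
from dataclasses import dataclass


@dataclass
class FusionEvidence:
    """Read-level support for one candidate fusion (hypothetical fields)."""
    split_reads: int       # reads whose alignment is split across the breakpoint
    discordant_pairs: int  # read pairs whose mates map to the two partner genes
    mean_mapq: float       # mean mapping quality of the supporting reads


def call_fusion(ev: FusionEvidence,
                min_split: int = 3,
                min_discordant: int = 3,
                min_mapq: float = 20.0) -> str:
    """Return "call" only when every threshold is met, otherwise "no_call".

    The thresholds are illustrative assumptions, not values from a real pipeline.
    """
    if ev.mean_mapq < min_mapq:
        return "no_call"
    if ev.split_reads < min_split or ev.discordant_pairs < min_discordant:
        return "no_call"
    return "call"


# Fragmented evidence: some signal exists, but neither evidence type passes.
fragmented = FusionEvidence(split_reads=2, discordant_pairs=2, mean_mapq=35.0)
# Well-supported evidence, for comparison.
supported = FusionEvidence(split_reads=8, discordant_pairs=11, mean_mapq=42.0)

print(call_fusion(fragmented))  # no_call
print(call_fusion(supported))   # call
```

In this toy filter the fragmented candidate is not reported as a wrong fusion; it is simply dropped, which mirrors how a conservative pipeline can stay silent about a real event.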

Example PacBio long-read somatic workflow in Golden Helix’s VSWarehouse NGS platform.

Long-read sequencing fundamentally changes this dynamic. By generating reads that can span fusion breakpoints directly, long-read approaches allow structural variants to be observed rather than inferred. For fusions like NUP98::NSD1, this can mean the difference between an inconclusive rearrangement signal and a definitive fusion call with clearly resolved breakpoints. Importantly, this improvement is not just about finding more variants; it is about increasing diagnostic yield for variants that meaningfully impact patient care. In aggressive malignancies such as AML, resolving that uncertainty early can have downstream implications for both prognosis and treatment strategy.
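A rough way to picture "observed rather than inferred" is to ask whether a single read covers the breakpoint with enough sequence anchored on both sides. The sketch below is a simplified illustration under stated assumptions: the coordinates, the read lengths, the `min_flank` value, and the `spans_junction` helper are all hypothetical, and real callers work from alignments rather than bare intervals.

```python
def spans_junction(read_start: int, read_end: int,
                   junction: int, min_flank: int = 100) -> bool:
    """True if the read covers the junction with min_flank bases on each side.

    Coordinates are 0-based, half-open positions on a hypothetical fused
    sequence; min_flank is an illustrative anchoring requirement.
    """
    left_flank = junction - read_start
    right_flank = read_end - junction
    return left_flank >= min_flank and right_flank >= min_flank


junction_pos = 5_000  # hypothetical breakpoint position

short_read = (4_950, 5_100)   # 150 bases: crosses the junction, 50-base left anchor
long_read = (1_200, 13_700)   # 12,500 bases: thousands of bases on both sides

print(spans_junction(*short_read, junction_pos))  # False
print(spans_junction(*long_read, junction_pos))   # True
```

The short read does touch the junction, but its anchor on one side is too small to place confidently, whereas the long read carries both partner sequences in a single molecule.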

Detection, however, is only the first half of the problem. Once a fusion like NUP98::NSD1 is identified, the immediate question becomes: what does this mean clinically? This is where the distinction between variant catalogs and clinical knowledgebases becomes critically important.

The Mitelman Database of Fusions has long been a foundational resource for cancer cytogenetics. It excels at documenting the existence of chromosomal abnormalities and gene fusions, linking them to tumor types and primary literature. For NUP98::NSD1, Mitelman provides confirmation that the fusion has been observed, identifies the associated disorder, and directs users to relevant publications. This information is invaluable for establishing biological precedent and historical context.

What Mitelman is not designed to do is answer the questions that arise at the point of clinical reporting. It does not assess therapeutic relevance, categorize response types, indicate regulatory approval status, or tier evidence in a way that supports standardized decision-making. In other words, it tells you that a fusion exists, but not necessarily what to do with it.

For a fusion like NUP98::NSD1, this distinction matters. A clinical knowledgebase such as Genomenon CKB enables the fusion to be contextualized within known disease biology and clinical outcomes, supporting consistent interpretation across cases and institutions. Instead of manually synthesizing disparate publications under time pressure, analysts can rely on a curated framework that standardizes how evidence is evaluated and communicated. This does not replace clinical judgment, but it accelerates the path from detection to insight.

Viewed through the lens of Cancer Prevention Month, this alignment of technology and interpretation becomes especially relevant. Awareness campaigns emphasize early detection because delays cost lives. In genomics, delays are not only caused by late testing but also arise when variants are missed due to technical limitations or when results lack sufficient context to prompt action. Long-read sequencing helps address the first problem by improving detection of complex, clinically meaningful structural variants. Clinical knowledgebases like Genomenon CKB address the second by transforming variant calls into interpretable, actionable findings.

NUP98::NSD1 serves as a clear example of how these components work together. Long-read sequencing increases confidence in detecting a challenging fusion that may evade short-read workflows. Clinical curation then translates that detection into meaningful context, clarifying risk, prognosis, and relevance. Together, they ensure that when a high-impact cancer-driving alteration is present, it is not only identified but understood.

As precision oncology continues to evolve, structural variants will increasingly define disease biology and therapeutic direction. Ensuring that our diagnostic workflows can both detect and interpret these variants is not simply a technical optimization; it is a clinical imperative. During Cancer Prevention Month, that imperative is worth underscoring: better outcomes depend not just on awareness of cancer, but on the tools we choose to see it clearly and act on it decisively.
