Clinical validation of AI-based tools in genetics


Artificial intelligence can speed up diagnostic processes. Dr. Ansgar Lange explains the role AI plays in genomics and sequencing.

The cost of sequencing a genome is falling rapidly, so quickly that in recent years the balance in genomics has shifted: for a long time, the equipment and the sequencing itself drove up the price of a sequenced genome, but the main cost is now human resources.

In this article, we look at developments in AI and genetics and at a new generation of software that uses AI and machine learning to address this bottleneck in genomics. We also present a study we recently conducted to validate the performance of AION, our variant interpretation platform.

A bit of background: recent developments in genetic testing and the need for AI

Genetic diseases are caused by mutations, or variants, in the human genome. To diagnose a genetic disease, a geneticist must find the causative (pathogenic) variant. The human genome consists of approximately 3 billion base pairs, and causal variants can be found in any part of it. For a long time, genome sequencing was an expensive process: the first draft of a human genome under the Human Genome Project cost $300 million.


In 2006, the cost of sequencing a human genome was estimated at $20-25 million, although this is a hypothetical, calculated figure. Fortunately, the cost of genome sequencing has fallen steadily since the $150 million Human Genome Project in 2003, with the most recent price in 2022 at $200. It is not hard to imagine what this means for market demand: the steep decline in costs has led to strong growth in the next-generation sequencing (NGS) market, with an estimated compound annual growth rate of more than 18% between 2022 and 2030. In other words, genetic testing is increasingly available.
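To put the growth estimate in perspective, here is a minimal arithmetic sketch of what an 18% compound annual growth rate implies over the eight years from 2022 to 2030. It assumes the rate is exactly 18% and constant every year, so it is a lower bound on the "more than 18%" estimate; the function name is ours, chosen only for illustration.

```python
def compound_growth_factor(annual_rate: float, years: int) -> float:
    """Total growth multiple after compounding `annual_rate` for `years` years."""
    return (1.0 + annual_rate) ** years


# Assumption: a constant 18% rate applied over 2022-2030 (8 years).
factor = compound_growth_factor(0.18, 2030 - 2022)
print(f"Market size multiple by 2030: about {factor:.2f}x")
```

Under these assumptions, the market would end the period a bit under four times its 2022 size.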

These developments mean that the cost driver of genetic sequencing is no longer the sequencing itself. Increasingly, the analysis of all this data is becoming a bottleneck for laboratories. Sequencing a genome or an exome generates a long list of variants that must be interpreted by a specialist, since many of these variants are benign (not harmful). So, in simple terms, generating the data gets cheaper every year, while interpreting that data – finding the pathogenic variant – is just as laborious and expensive as ever.

This is where a new generation of AI-driven software comes in, helping the people who interpret this data, known as variant scientists, to analyze sequenced genomes or exomes more quickly. As more and more countries introduce large-scale genetic testing, for example through newborn screening programs, AI-based tools that address the current interpretation bottleneck are required.

Clinical validation of AI-based variant interpretation tools: our study

AION is one such platform: a rare disease variant interpretation platform supported by machine learning algorithms. It aims to support variant scientists in the interpretation process, which is currently the bottleneck of the rapidly growing sequencing market. Before diving into the validation study, let's take a quick look at how it works.

AION comprises two main components. The first deals with classification. Ideally, every mutation or variant would be classified as pathogenic or benign, but the causal impact of less than 1% of all mutations is known. Most variants are therefore labeled VUS: variants of uncertain significance. The first algorithm supports this categorization by extrapolating knowledge from public datasets and the latest research to identify potentially pathogenic variants.



The second algorithm matches these potentially harmful variants to the symptoms experienced by a patient and ranks the most relevant variants for the human expert to review. This is of particular interest because the vast majority of variants are VUS, so the ranking provides direction for further clinical investigation by a human expert.
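The general idea of the two steps can be sketched in a few lines of Python. This is not AION's actual algorithm, which is not described here; it is a toy illustration under our own assumptions: each variant carries a hypothetical pathogenicity score between 0 and 1 from the first step, a hypothetical set of phenotype terms associated with it, and the ranking score is simply the pathogenicity score multiplied by the Jaccard overlap with the patient's symptoms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    name: str
    pathogenicity: float   # hypothetical score in [0, 1] from the classification step
    phenotypes: frozenset  # hypothetical phenotype terms linked to this variant


def phenotype_overlap(variant_terms: frozenset, patient_terms: frozenset) -> float:
    """Jaccard similarity between a variant's phenotype terms and the patient's."""
    union = variant_terms | patient_terms
    if not union:
        return 0.0
    return len(variant_terms & patient_terms) / len(union)


def rank_variants(variants, patient_terms):
    """Sort variants so the best combined score comes first (illustrative scoring)."""
    patient = frozenset(patient_terms)
    return sorted(
        variants,
        key=lambda v: v.pathogenicity * phenotype_overlap(v.phenotypes, patient),
        reverse=True,
    )


# Made-up example data, not taken from the study.
variants = [
    Variant("variant_A", 0.90, frozenset({"seizures"})),
    Variant("variant_B", 0.60, frozenset({"hearing loss", "short stature"})),
    Variant("variant_C", 0.95, frozenset({"cardiomyopathy"})),
]
ranked = rank_variants(variants, {"hearing loss", "short stature", "seizures"})
print([v.name for v in ranked])
```

In this toy example, the variant with the highest pathogenicity score ends up last because it has nothing in common with the patient's symptoms, which is the point of combining the two signals.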

The study aimed to validate the clinical performance of our platform. To do this, the Nostos Genomics computational genomics team ran rare disease cases from the Genomics England 100,000 Genomes Project through AION and measured its performance. The goal here was not to find new diagnoses but to measure the performance of the tool on already diagnosed cases, to see how an AI-based tool performs compared to a human expert.

The results are impressive: AION identified the causative pathogenic variants in over 91% of cases, and in over 93% when parental data was available. This means that in more than 9 cases out of 10, the causal variant was found in the ranked list of priority variants, according to age and ethnicity. Automated variant interpretation, driven by AI, provides clinical performance comparable to that of a human expert. It can support the analysis of clinical genetic tests and so reduce the time and costs that the interpretation bottleneck described earlier adds to this crucial process.
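A common way to express this kind of result is top-k recall: the share of already diagnosed cases in which the known causal variant appears among the first k entries of the ranked list. The sketch below shows the calculation on made-up data. The cutoff k, the function name, and the case identifiers are our assumptions for illustration; the cutoff used in the study is not stated here.

```python
def top_k_recall(cases, k: int) -> float:
    """Fraction of cases whose known causal variant is within the top k ranked variants.

    `cases` is a list of (ranked_variant_names, known_causal_variant) pairs.
    """
    if not cases:
        raise ValueError("at least one diagnosed case is required")
    hits = sum(1 for ranked, causal in cases if causal in ranked[:k])
    return hits / len(cases)


# Made-up example: four diagnosed cases with their ranked variant lists.
cases = [
    (["v1", "v2", "v3"], "v1"),   # causal variant ranked first
    (["v4", "v5", "v6"], "v6"),   # causal variant ranked third
    (["v7", "v8", "v9"], "v8"),   # causal variant ranked second
    (["v10", "v11"], "v99"),      # causal variant not in the list at all
]
recall_at_2 = top_k_recall(cases, k=2)
recall_at_3 = top_k_recall(cases, k=3)
print(f"top-2 recall: {recall_at_2:.2f}, top-3 recall: {recall_at_3:.2f}")
```

Reporting the figure separately for cases with and without parental data, as in the results above, only requires splitting the case list before calling the function.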

Future potential of AI in genetics

In the previous paragraphs, we have covered the need for AI-based decision support tools with the rapid growth of the sequencing industry and the validation of their performance. Further adoption of these tools may lead to greater availability of genetic testing globally. Falling sequencing costs may open the door for global health markets to invest in rare disease diagnostics.

However, many low- and middle-income countries do not yet have the clinical genetic expertise to analyze the multitude of variants that would result from sequencing their populations. AI-based tools need to be simple to integrate, given that many labs lack the technical expertise to implement complex computational pipelines and interfaces. In other words: AI decision support tools are democratizing genetic expertise, supporting laboratories around the world with equitable access to high-quality genome interpretation.

AI algorithms also allow labs to push the boundaries of their diagnostic practice by giving variant scientists more time to focus on complex cases. Although variant scientists supported by AI-based interpretation can resolve rare disease cases faster, some patients may still not have a clear causative variant.

These complex cases are often characterized by numerous variants of uncertain significance (VUS). Further investigation of these VUS is carried out through functional screening or additional family sequencing, which may provide sufficient evidence for reclassification. By prioritizing these VUS with a pathogenicity score, AI tools add nuance to the VUS category and point to the most valuable variants for further investigation, allowing genetics services to make the best use of their resources. The better trained and more sophisticated the algorithm, the better those rankings will be, allowing variant scientists to make even more diagnoses in the future, backed by AI.


  • Berger, Bonnie, and Yun William Yu. 2022. “Navigating Bottlenecks and Tradeoffs in Genomic Data Analysis.” Nature Reviews Genetics, December, 1–16.
  • Wetterstrand, Kris A. 2019. “The Cost of Sequencing a Human Genome.” NHGRI. March 13, 2019.
  • “Next Generation Sequencing Services Market Report, 2030.” n.d. Accessed January 9, 2023.
  • “Press release.” n.d. Accessed January 9, 2023.
  • Richards, S., N. Aziz, S. Bale, D. Bick, S. Das, J. Gastier-Foster, W. W. Grody, et al. 2015. “Standards and Guidelines for the Interpretation of Sequence Variants: A Joint Consensus Recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology.” Genetics in Medicine: Official Journal of the American College of Medical Genetics 17 (5).
