Diagnostics World – Comparison of AI Digital Pathology Tools Finds Means to Measure Performance

In a research project comparing the performance of 10 digital pathology tools capable of evaluating HER2 status from a common set of about 1,100 breast cancer samples, a key finding was the high level of agreement between their results and those of expert human pathologists—at least when the tumor marker was highly expressed. The greatest variability of results was seen at the non- and low (1+) levels of expression, according to Jeff Allen, Ph.D., president and CEO of Friends of Cancer Research, which sponsored and led the Digital PATH Project.

Each of the participating digital pathology tools uses artificial intelligence (AI), which can recognize patterns on digitized slides and be used to indicate the extent of expression of biomarkers like HER2, says Allen. He’ll be presenting preliminary results of the study at the upcoming Next Generation Dx Summit.

HER2—human epidermal growth factor receptor 2—has been the target of multiple drugs for more than 25 years, but only recently has the concept of “HER2-low” breast cancer gained recognition, Allen says. Since then, three antibody-drug conjugates (ADCs) have been approved to treat it.

AI-enabled digital pathology tools could greatly aid in identifying patients with lower expression of the HER2 receptors, he continues. But the regulatory framework supporting their use as a diagnostic test “isn’t entirely clear,” despite many such technology platforms emerging to identify the molecular marker.

So, as with other projects it has run in the molecular diagnostics space, Friends of Cancer Research teamed up with technology developers to assess variability between different digital pathology tools. The intent wasn’t to declare one technology better than another, says Allen, but to learn if they could produce consistent and accurate results—and, importantly, to assess the potential for using an independent reference set of samples to characterize test performance.
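As a rough illustration of how agreement against an independent reference set might be broken down by expression level, here is a minimal Python sketch. It is not the Digital PATH Project's actual analysis: the function name, the tool names, the sample IDs, and every score below are invented toy values, and exact-match agreement is just one simple metric among many that could be used.

```python
from collections import defaultdict


def agreement_by_reference_score(reference, tool_calls):
    """Fraction of tool calls that exactly match the reference score,
    grouped by the reference score category.

    reference:  dict mapping sample ID -> reference HER2 score (e.g. "0", "1+")
    tool_calls: dict mapping tool name -> dict of sample ID -> that tool's score
    """
    matches = defaultdict(int)
    totals = defaultdict(int)
    for calls in tool_calls.values():
        for sample_id, call in calls.items():
            ref = reference.get(sample_id)
            if ref is None:
                continue  # skip samples that have no reference score
            totals[ref] += 1
            if call == ref:
                matches[ref] += 1
    return {score: matches[score] / totals[score] for score in totals}


# Hypothetical toy data, NOT results from the study.
reference = {"s1": "0", "s2": "1+", "s3": "2+", "s4": "3+"}
tool_calls = {
    "tool_a": {"s1": "0", "s2": "1+", "s3": "2+", "s4": "3+"},
    "tool_b": {"s1": "1+", "s2": "0", "s3": "2+", "s4": "3+"},
}

rates = agreement_by_reference_score(reference, tool_calls)
for score in ("0", "1+", "2+", "3+"):
    print(f"reference {score}: {rates[score]:.0%} of tool calls agree")
```

In this toy data the two hypothetical tools agree with the reference on every 2+ and 3+ sample but on only half of the 0 and 1+ samples. That is the sort of pattern a per-category breakdown is meant to expose.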

The study involved 31 contributing partners, including several big-pharma companies and major universities, and an assortment of health technology companies (BostonGene, Caris Life Sciences, Daiichi Sankyo, Indica Labs, Lunit, Nucleai, Panakeia, PathAI, and 4D Path), as well as the U.S. Food and Drug Administration (FDA) and National Cancer Institute. The analyses are complete, and a manuscript is being written for publication with identities of the platforms associated with the results anonymized, Allen says. Some of the initial results were also presented at the San Antonio Breast Cancer Symposium last December and at a Friends of Cancer Research public meeting in early 2025.

A new initiative is now being planned involving AI-enabled radiographic imaging tools, applying a similar approach to the Digital PATH Project, he notes. This Friends of Cancer Research partnership was recently launched with developers of tools able to measure changes in a tumor following treatment.

The approach of having multiple tool developers evaluate a common set of samples, for HER2 in the case of the Digital PATH Project, could also be applicable to companies developing ADCs leveraging digital pathology capabilities “to more sensitively classify different molecular alterations of patients that could benefit,” says Allen.

Addressing the Complexity

Precision medicine has been a high priority for Friends of Cancer Research for at least the last 15 years, says Allen. It is especially interested in ensuring the policies of the FDA reflect data and science to accelerate and improve cancer research, drug development, and treatment.

Targeted therapies have become a cornerstone of modern oncology, a significant advance in cancer treatment that offers more precise and often less toxic options than traditional chemotherapy. But they have at the same time added “some degree of complexity” to regulatory and development processes, says Allen, referencing the importance of patients getting the same true outcome no matter what type of technology or test may be used to evaluate their cancer.

For many years now, the diagnostic journey for patients with solid tumors begins with a biopsy where the sample tissue gets stained with H&E (hematoxylin and eosin) to enhance its visualization under a microscope and highlight specific cellular morphologies or structures that are difficult to see otherwise, he explains. Certain types of cancer, including HER2-positive breast cancers, may be subsequently stained using different antibodies and immunofluorescence to be able to further characterize the presence of molecular alterations.

For the Digital PATH Project, technology partners evaluated biopsies that had been stained with H&E as well as for HER2 expression, he says. Each of the slides was digitized using a specialized computer scanner, enabling them to be analyzed by the various digital pathology tools.

Those used for the study all had an algorithmic component to assess and quantify HER2 expression, “the thought being, particularly with very low levels of HER2 expression, that human assessment may not always detect expression of HER2 with the same level of sensitivity as an AI-enabled pattern recognition tool might be able to,” says Allen. “That’s where these tools are helping to inform the ultimate conclusions made by the expert human pathologists.”

‘Transformative’ Technology

The variability seen between the tools when it came to evaluating low-HER2 breast cancers was not particularly surprising, Allen says. This is because their embedded AI models were trained before the need to score HER2 at low levels was widely recognized, when low expression was not yet an “actionable classification.”

It also speaks to the need for transparency about the performance of these types of technologies, “to understand where they should best be applied,” he adds. The speed at which the 1,100 breast cancer samples were evaluated, “in a matter of days and weeks,” highlights how the use of an independent reference set can make clinical validation a highly efficient process.

“We’re very interested in exploring what the policy implications may be and how the use of [an] independent reference set could support the validation of these types of technologies and support validation of tools for evaluating other biomarkers in the future,” says Allen. Beyond the HER2 space, pattern recognition tools are emerging that may be able to identify molecular alterations, particularly rare biomarkers, from traditional morphology slides.

AI-powered digital pathology could enable more comprehensive assessment of tumor alterations and identify potential treatments from which a patient could benefit. “On top of that,” Allen says, “the idea that these samples can be computerized, stored, and sent for evaluation could greatly expand access and the information could be more portable than having to distribute and relocate actual human tissue samples… for purposes of drug discovery as part of clinical trials, or even for routine treatment.”

The technology could be “transformative” for the future of oncology and potentially other disease areas as well, “but we also need to think about the policies that support ensuring the accuracy of these types of tools,” he concludes. “We are just at the beginning of this technological journey, but all indicators are that there is good reason to be excited.”