NEW YORK – Friends of Cancer Research will soon release a tumor mutational burden calibration tool that developers of next-generation sequencing assays can use to correct the variability in how TMB is calculated across different gene panels and analysis pipelines.
Although this inter-laboratory variability has been expected and observed for some time, FOCR’s TMB harmonization project — involving commercial and academic NGS test labs, reagent and reference materials providers, and academic investigators — has been the first to try to systematically characterize the factors that lead to differences in TMB readout and to develop strategies whereby tests can be harmonized to a common standard.
Interest in using TMB to predict which patients will respond to cancer immunotherapies has expanded rapidly in recent years, culminating in the US Food and Drug Administration’s tissue-agnostic approval of pembrolizumab (Merck’s Keytruda) last month as a treatment for patients with unresectable or metastatic solid tumors with a high mutational burden. At the same time, the agency approved Foundation Medicine’s FoundationOne CDx to identify TMB-high patients who may be eligible for treatment.
Despite the excitement around TMB, stakeholders have acknowledged that it is a complex biomarker that labs gauge differently, which will make it challenging for oncologists to interpret their patients’ TMB results and base treatment decisions on them.
For example, the latest tissue-agnostic approval for pembrolizumab specifies a TMB cutoff of 10 mutations per megabase using Foundation’s test. But Foundation’s NGS panel is not the only TMB-reporting test that oncologists order. Absent a system for harmonizing assays across labs, a clinician can’t be sure whether a TMB score of 10 mutations per megabase from another test is the same as a 10 from FoundationOne CDx.
This distinction matters for patient care as doctors rely on that TMB score to decide whether to give immunotherapy or another drug. Being able to rely on that TMB value is also crucial as oncologists must contend with an evolving landscape of markers, like PD-L1 expression status, microsatellite instability (MSI), and mismatch repair deficiency (dMMR), when deciding whether to prescribe immunotherapies to their patients.
In the eyes of the FOCR consortium, the gold standard for TMB is to calculate it from whole-exome sequencing data, but targeted gene panels, which provide a more cost- and time-effective solution, have become the most frequently used approach in research and clinical TMB testing. The result is a landscape of tests that query different sections of the genome, use different mutation filtering and bioinformatic pipelines, and were validated against different references.
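To see why those differences matter, it helps to recall that a score in mutations per megabase is, at its simplest, a count of qualifying mutations divided by the amount of sequence examined. The short sketch below illustrates that arithmetic only; the function name and the counts and territory sizes are hypothetical, and real pipelines differ in which mutations they count and which bases they treat as covered.

```python
def tmb_per_megabase(mutation_count, covered_bases):
    """Mutations per megabase: counted mutations divided by territory in Mb."""
    if covered_bases <= 0:
        raise ValueError("covered_bases must be positive")
    return mutation_count / (covered_bases / 1_000_000)

# Hypothetical numbers: one tumor scored over an exome-sized territory
# and over a much smaller panel-sized territory.
print(tmb_per_megabase(300, 30_000_000))  # 10.0
print(tmb_per_megabase(12, 1_000_000))    # 12.0
```

With a small territory, each additional counted mutation moves the score a long way, which is one reason panel scores can drift from the exome-wide value.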
In a publication describing the first phase of their effort this March, FOCR investigators reported on a cross-comparison in which 11 labs compared their tests’ TMB calls across 32 tumor types using whole-exome sequencing as a reference. The team found that while there was a strong overall correlation between the TMB calls of most individual tests, there were instances of significant variation.
And at the American Association for Cancer Research’s virtual annual meeting, the team followed this up with additional data on the sources of variation that contribute to the observed discordance between TMB assays and introduced an informatics solution the group is creating that might help labs reduce, or correct for, this divergence.
In a webinar this week, investigators provided new details about this correction strategy and said they have built a software tool in the widely used programming language R that will allow labs to implement the calibration method in-house. They hope to release it soon.
Lisa McShane, an associate director in the division of cancer treatment and diagnosis at the National Cancer Institute, explained during the webinar that the process works by using information about how a particular NGS panel’s TMB scores differ from WES-TMB scores for the same samples. The software reverse engineers and corrects TMB scores from NGS panel tests based on that established divergence pattern.
“The basic idea is that if we have information about what range of panel-TMB values tend to result from specimens having a known WES-TMB value, which is serving as the reference standard here, then we should be able to put things in reverse to obtain an estimate of the unknown WES-TMB value based on an observed panel value,” she said.
To make this work, the group also had to develop a strategy for quantifying the uncertainty in the calibration. As a result, the method returns both a recalibrated TMB value and an interval describing the uncertainty around it.
“If you would like to perform calibration for your laboratory’s assay … you [would] generate a set of paired measurements of WES-TMB reference values coupled with TMB values measured by your assay,” McShane explained. Although different data or sample sets could be used, one option would be to do this in silico using the same TCGA data that FOCR used in the first phase of its comparison project.
The resulting set of paired measurements then serves as training data for reverse-calculating WES-TMB, she added.
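The general idea of fitting on paired values and then running the relationship in reverse can be illustrated with a minimal sketch. The version below assumes a straight-line relationship between panel-TMB and WES-TMB with constant scatter, which is a simplification chosen for illustration and not the consortium's method or its R tool. The function names, the paired values, and the 95 percent level are all hypothetical.

```python
from statistics import NormalDist, mean

def fit_panel_on_wes(wes, panel):
    """Least-squares fit of panel = a + b * wes; returns (a, b, residual_sd)."""
    n = len(wes)
    mw, mp = mean(wes), mean(panel)
    sxx = sum((w - mw) ** 2 for w in wes)
    sxy = sum((w - mw) * (p - mp) for w, p in zip(wes, panel))
    b = sxy / sxx
    a = mp - b * mw
    resid = [p - (a + b * w) for w, p in zip(wes, panel)]
    sd = (sum(r * r for r in resid) / (n - 2)) ** 0.5
    return a, b, sd

def calibrate(panel_value, a, b, sd, level=0.95):
    """Invert the fit to estimate WES-TMB, with a rough uncertainty interval.

    The interval reflects only the residual scatter of panel values; it
    ignores uncertainty in a and b, so it is an approximation.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)
    estimate = (panel_value - a) / b
    lo = (panel_value - z * sd - a) / b
    hi = (panel_value + z * sd - a) / b
    return estimate, min(lo, hi), max(lo, hi)

# Hypothetical paired training data (mutations per megabase), not FOCR data.
wes_tmb = [2.0, 5.0, 8.0, 12.0, 20.0, 35.0]
panel_tmb = [3.1, 6.2, 9.8, 14.5, 23.0, 41.0]

a, b, sd = fit_panel_on_wes(wes_tmb, panel_tmb)
estimate, low, high = calibrate(15.0, a, b, sd)
print(f"calibrated WES-TMB ~ {estimate:.1f} (interval {low:.1f} to {high:.1f})")
```

A lab following the approach described in the webinar would replace the hypothetical lists with its own paired measurements, and a production method would likely need to handle scatter that grows at higher TMB values, which this sketch does not.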
Among suggestions for applications of the calibration tool, McShane said that the software could help facilitate combining panel-TMB data across different studies employing different panels to correlate TMB with tumor response or survival outcomes.
McShane also said that being able to quantify the imprecision in panel-TMB values relative to WES-TMB could be useful clinically, “especially when values are near clinical cut points and when other factors may be brought to bear in making a decision about treatment with immunotherapy.”
But despite the value demonstrated in the group’s experiments, assay calibration may be an imperfect clinical solution. The FOCR team didn’t discuss, for example, how labs might use the generated uncertainty intervals in reporting individual TMB test results that will be used to guide treatment decisions.
McShane and her colleague Mickey Williams, director of the molecular characterization laboratory at the Frederick National Laboratory for Cancer Research, noted, though, that instances where this poses a clinical reporting dilemma may be infrequent. The data from the first phase of the harmonization project indicated that at lower TMB values, close to the 10-mutation-per-megabase cutoff approved by the FDA, the variation among existing panel assays, and between assays and WES, is relatively small.
It was at higher TMB levels, around 30 or 40 mutations per megabase and more, that TMB calls appeared to diverge more widely between tests.
That doesn’t mean, however, that there won’t be instances that pose clinical interpretation issues. For example, in one data point from the consortium’s tests that McShane highlighted, an observed panel-TMB value of 15 was translated to a calibration estimate of 13, with an interval of uncertainty ranging from about 7 to 20.
While both 15 and 13 are above the cutoff of 10 mutations per megabase approved by the FDA for Foundation’s test, an interval of uncertainty that extends down to 7 falls below that threshold and raises doubts about whether immunotherapy would benefit the patient in such a case.
In another example from the application of the calibration tool to correct calls from a particular laboratory, a panel-TMB value of 10.2 was calibrated to 9.9, with an interval of uncertainty of 5.9 to 16.5.
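The comparison at work in these two examples, checking where an uncertainty interval sits relative to a cutoff, can be written down as a small sketch. The function name and labels are hypothetical, and treating a value exactly equal to the cutoff as meeting it is an assumption made here; the FOCR team did not describe how labs should report such cases.

```python
def classify_against_cutoff(lower, upper, cutoff=10.0):
    """Return where an uncertainty interval sits relative to a TMB cutoff.

    Assumption: a value equal to the cutoff counts as meeting it.
    """
    if lower >= cutoff:
        return "above cutoff"
    if upper < cutoff:
        return "below cutoff"
    return "straddles cutoff"

# Intervals from the two webinar examples (the first is approximate).
print(classify_against_cutoff(7.0, 20.0))   # straddles cutoff
print(classify_against_cutoff(5.9, 16.5))   # straddles cutoff
```

Both intervals span the 10-mutation-per-megabase mark, so the calibrated point estimate alone would not settle which side of the cutoff the patient falls on.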
As noted by McShane, Williams, and other experts during the webinar, harmonization and calibration unfortunately can’t address other lingering issues for the future evolution of TMB as a predictive biomarker, including continued uncertainty about how to optimize cutoff points to best select patients for treatment.
Participants agreed that TMB cutoffs may be different for different immunotherapy drugs, indications, and across different cancer types or subsets. They also recognized that the implementation of TMB in precision oncology will continue to be complicated by other existing immunotherapy biomarkers, like PD-L1 status, and likely by new predictors that emerge in areas like T-cell activity and the tumor microenvironment.