On February 5th, Friends of Cancer Research (Friends) hosted a public meeting, “Modernizing Oncology Endpoints: Pathways for Evidence and Policy,” to examine how emerging early endpoints, including circulating tumor DNA (ctDNA) and artificial intelligence (AI)-enabled tumor assessments, may improve trial efficiency and inform regulatory decision-making. Realizing this potential will require convergence around evidentiary standards, scalable validation frameworks, and regulatory pathways that enable innovation to move beyond pilots and into predictable, sustainable adoption. The meeting featured a keynote address from Representative Diana DeGette (CO-1) and a fireside discussion with former FDA leaders Drs. Rick Pazdur and Janet Woodcock, followed by focused sessions on ctDNA, AI-enabled tumor assessment tools, and the regulatory infrastructure needed to support endpoint modernization.
Key Takeaways
- Sustained investment in drug development tools is critical. Former FDA leadership identified a systemic translational gap where foundational drug development tools, particularly for rare diseases, remain underfunded. Maintaining scientific integrity, workforce capacity, and predictable regulatory processes is essential to preserving regulatory excellence.
- ctDNA shows strong patient-level associations but requires prospective integration into clinical trials. The ctMoniTR Project demonstrated that while individual-level associations between early ctDNA reductions and survival are robust, trial-level correlations with overall survival (OS) can be attenuated by treatment crossover effects in randomized controlled trials. Advancing ctDNA toward formal qualification will require moving from exploratory analyses to a prespecified secondary endpoint in prospective trials, using standardized timepoints and assays with consistent analytical validity.
- AI-enabled tools need to be validated according to their intended context of use (CoU). Performance expectations should scale with how the tool will be used in clinical trials, ensuring analytical validity and generalizability. Transparent training data, defined reference standards, and measurable performance metrics can help support confidence and scalable validation.
- Regulatory modernization requires modular and resourced frameworks. A modular qualification approach that decouples analytic method validation from disease-specific clinical validation may help biomarker development progress through more predictable and scalable regulatory pathways. Securing dedicated resources through PDUFA (Prescription Drug User Fee Act) negotiations and advancing cross-sector data harmonization will be essential to prevent fragmentation and streamline evaluation of rapidly evolving technologies.
Keynote Fireside Chat with Kate Rawson (Prevision Policy), Richard Pazdur, and Janet Woodcock (Former FDA)
Kate Rawson noted that Dr. Woodcock and Dr. Pazdur are no longer at the FDA and were sharing perspectives based on their experience and current work.
Janet Woodcock: Drug Development Tools are Critical and Underfunded
Dr. Woodcock (Former Principal Deputy Commissioner, FDA) reflected on her work since leaving the FDA, particularly her engagement with rare disease and patient communities. She described a recurring challenge: communities are attempting to develop biomarkers and drug development tools with limited resources, even though these tools are foundational to clinical development.
Dr. Woodcock pointed to ctDNA as one example of a modern tool that offers improved methods to assess tumor activity, size, and volumetric change. She emphasized that these tools are critical to medicine and yet are not consistently funded at a level commensurate with their importance. She framed this as a translational gap, where families are often told that trials cannot proceed because measurement tools, endpoints, or biomarkers are not sufficiently established, and she emphasized her intent to continue working on solutions.
Richard Pazdur: Preserving Scientific Integrity and Workforce Capacity at FDA
Dr. Pazdur (Former Director, Oncology Center of Excellence, FDA) focused on institutional considerations affecting FDA’s ability to engage emerging science. He expressed concerns about political influence and described the erosion of a long-standing “firewall” that historically insulated scientific staff from political pressure. He also highlighted workforce attrition, emphasizing that rebuilding capacity is not simply a matter of replacing staff, but of restoring and retaining the qualifications, training, and experience needed to evaluate novel endpoints and evolving scientific approaches.
Building Trust: Governance and Structural Considerations
In discussing governance and trust, both speakers underscored the importance of protecting scientific integrity. Dr. Pazdur noted that credibility depends on strong safeguards against inappropriate influence, while Dr. Woodcock observed that although some interface with the political sphere is inevitable, scientific decision-making must remain insulated from undue pressure. Both emphasized that staff must “hold the line” against decisions that are scientifically inappropriate.
New FDA Programs: Emphasize Efficiency, Not Speed
The discussion turned to recently announced FDA initiatives, including the Commissioner’s National Priority Voucher Pilot and the Plausible Mechanism Pathway.
Dr. Woodcock cautioned that voucher-like programs may shift burden onto review staff and disrupt other functions if not carefully structured. She suggested that FDA modernization should focus not only on speed but also on improving documentation practices and workload sustainability.
Dr. Pazdur echoed this sentiment, emphasizing that the goal should be efficiency rather than speed. He noted that FDA has previously implemented approaches intended to improve review efficiency (e.g., Real-Time Oncology Review and Assessment Aid) and argued that any new initiative should account for variation across applications (e.g., differences in risk profiles and patient populations), incorporate staff input, and be designed transparently.
“Speed is a very sophomoric aim. It should be about efficiency. When you focus only on speed, you don’t understand the process—and it’s the process that determines whether reviews are rigorous, sustainable, and trustworthy.”
- Richard Pazdur, Former Director, Oncology Center of Excellence, FDA
Biomarker Qualification and Novel Endpoints: Evidence is the Constraint
The speakers returned to the meeting’s core theme—early endpoints and novel biomarker measurement tools.
Dr. Woodcock emphasized that when biomarkers or novel tools are intended to support regulatory or clinical decision-making, evidence is the limiting factor. Variability across tools, limited characterization of analytical performance, lack of harmonized data across trials, and the cost of prospective and meta-analytical validation all contribute to the challenge. She argued that funding for this translational work remains insufficient and suggested that broader investment, potentially including expanded NIH support, would be necessary to build the evidentiary base. Dr. Pazdur reinforced that context of use drives evidentiary expectations and noted that biomarker development is inherently iterative, requiring sustained community engagement and prospective evaluation. Both speakers noted that current workforce constraints may limit FDA’s capacity to lead tool development to the extent stakeholders have historically expected, suggesting that non-government stakeholders may increasingly need to drive evidence generation in the near term.
“Many stakeholders assume that merely exerting pressure on the FDA will lead to the automatic acceptance of novel technologies, such as volumetric CT or AI-enabled tumor assessments, solely on principle. There is a formalized regulatory process in place; fundamentally, if one intends to base regulatory or clinical decisions on a new biomarker or diagnostic tool, that tool must be supported by substantial evidence. This necessity for rigorous data is frequently misunderstood, yet the reality remains that generating such evidence is an expensive and inherently arduous endeavor.”
– Janet Woodcock, Former Principal Deputy Commissioner, FDA
Session 1 – ctDNA: Lessons Learned and Future Directions
Session 1 focused on the evidentiary requirements for qualifying ctDNA as an early endpoint in oncology trials and reviewed multi-year analyses from the Friends ctMoniTR Project. The session was moderated by Névine Zariffa (NMD Group), with panelists Jon Baden (Bristol Myers Squibb), Shibing Deng (Pfizer Inc.), Minetta Liu (Natera Inc.), and Paz Vellanki (FDA).
Presentation by Névine Zariffa: Lessons Learned from ctMoniTR and Future Directions
The Evidentiary Framework for Validation
Zariffa established that validating an early endpoint requires a bifurcated evidentiary framework. First, individual patient-level associations must demonstrate that a change in the biomarker (e.g., molecular response or ctDNA clearance) is associated with the patient’s longer-term clinical outcomes, such as overall survival (OS) and progression-free survival (PFS). Second, trial-level associations must demonstrate, through randomized controlled trials (RCTs), that the treatment effect on the early endpoint consistently aligns with the treatment effect on OS and PFS. For regulatory surrogacy, the field generally seeks trial-level R² values approaching 0.7, with a lower confidence interval bound exceeding 0.5.
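To make the trial-level criterion concrete, the sketch below regresses per-trial OS treatment effects on per-trial early-endpoint treatment effects on the log hazard ratio scale. The numbers are purely illustrative (not ctMoniTR data), and real surrogacy analyses would generally weight trials by size and use more formal meta-analytic models.

```python
# Minimal sketch of a trial-level surrogacy analysis.
# The per-trial log hazard ratios below are hypothetical, not ctMoniTR results.
import numpy as np
from scipy import stats

# Treatment effects from several hypothetical randomized trials:
# log HR for the early endpoint (e.g., ctDNA molecular response) and for OS.
log_hr_early = np.array([-0.45, -0.30, -0.60, -0.10, -0.50, -0.25])
log_hr_os    = np.array([-0.35, -0.22, -0.48, -0.05, -0.33, -0.20])

# Trial-level association: regress the OS effect on the early-endpoint effect.
fit = stats.linregress(log_hr_early, log_hr_os)
r2 = fit.rvalue ** 2

# Rough 95% CI for the correlation via Fisher's z transform
# (real analyses would also account for estimation error within each trial).
n = len(log_hr_early)
z = np.arctanh(fit.rvalue)
se = 1.0 / np.sqrt(n - 3)
r_lo, r_hi = np.tanh(z - 1.96 * se), np.tanh(z + 1.96 * se)

print(f"Trial-level R² = {r2:.2f}; correlation 95% CI ({r_lo:.2f}, {r_hi:.2f})")
```

Under the criterion described above, one would look for trial-level R² approaching 0.7 with the lower bound of its confidence interval above 0.5; the interval on the correlation in this sketch is only a rough stand-in for the more formal interval estimation used in practice.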
Key Findings from the ctMoniTR Project and Takeaways
The ctMoniTR Project aggregated data from over 20 trials to test these associations, using a stepwise approach to address feasibility, heterogeneity, and generalizability across treatment classes and advanced tumor types. Across multiple modules and datasets, analyses consistently showed that reductions in ctDNA levels relative to baseline were associated with improved survival at the individual patient level. Responders generally experienced two- to three-fold improvements in survival outcomes across various solid tumors and treatment modalities.
While the individual-level associations were robust, trial-level correlations with OS were initially weak. This was largely attributed to treatment crossover, where control-arm patients receive the investigational drug upon progression, thereby dampening the observable OS treatment effect and weakening the biomarker’s correlation. In contrast, ctDNA clearance showed a much stronger trial-level correlation with PFS (R² ≈ 0.77).
“We need circulating tumor DNA to be a declared endpoint of the trial, to be collected with as much care and quality as the other important endpoints of the trial—completeness, full specifications, and an understanding that assays may change. That means bringing ctDNA out of the exploratory phase and into a formal role, for example as a secondary endpoint.”
- Névine Zariffa, NMD Group
What We Heard from the Panel
Technical Rigor and Assay Considerations
The panel delved into the technical requirements for implementing ctDNA in both advanced and early-stage, minimal residual disease (MRD) settings:
Shibing Deng and Jon Baden emphasized that in early-stage disease, ctDNA levels are often at the limit of detection (LOD), requiring assays that are analytically precise and reproducible down to one part per million; a simple detection-probability sketch following these panel points illustrates the scale of that challenge.
Minetta Liu highlighted that tumor-informed assays currently offer superior sensitivity and specificity for detecting MRD, though they require successful tissue acquisition, which can be logistically challenging in clinical practice.
The panelists agreed on the necessity of harmonized collection time points and sampling schedules (e.g., pre-surgery, post-surgery, and early on-treatment) to allow for data pooling and cross-trial comparisons.
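To illustrate the detection challenge Deng and Baden described, the sketch below uses a simple binomial model of how detection probability at a one-part-per-million tumor fraction depends on the amount of input cell-free DNA and the number of patient-specific variants an assay tracks. The input amounts and variant counts are illustrative assumptions, not specifications of any particular assay.

```python
# Minimal sketch: probability of seeing at least one tumor-derived fragment
# at a given tumor fraction, under a simple binomial sampling model.
# The genome-equivalent input and variant counts are illustrative assumptions.

def detection_probability(tumor_fraction: float, genome_equivalents: int, n_variants: int) -> float:
    """P(observe >= 1 tumor-derived fragment) across all tracked variant sites."""
    informative_draws = genome_equivalents * n_variants
    return 1.0 - (1.0 - tumor_fraction) ** informative_draws

tumor_fraction = 1e-6          # one part per million
genome_equivalents = 10_000    # assumed cell-free DNA input from a plasma draw

for n_variants in (1, 16, 50):  # patient-specific variants tracked by the assay
    p = detection_probability(tumor_fraction, genome_equivalents, n_variants)
    print(f"{n_variants:>3} tracked variants -> detection probability ≈ {p:.1%}")
```

Under these assumptions, tracking a single variant yields only about a 1% chance of detection, which is one way to understand why tumor-informed assays that follow many variants per patient, together with careful control of DNA input, matter so much at MRD-level tumor fractions.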
Regulatory and Clinical Implications and the Need for Collaboration
Paz Vellanki provided a regulatory perspective clarifying that while ctDNA is a biologically meaningful marker, it is not yet a validated endpoint for standalone drug approval. She encouraged sponsors to move ctDNA from an exploratory status to a prespecified secondary endpoint in prospective trials to build the necessary evidence that could inform the use of ctDNA for accelerated approval with a subsequent OS confirmatory endpoint.
Baden noted that ctDNA is already instrumental in internal go/no-go decision-making and dose optimization during early drug development.
From a forward-looking perspective, Vellanki and Liu noted that ctDNA may provide essential efficacy data for patients with non-measurable disease on standard imaging and offer earlier detection of recurrence compared to radiographic assessment. Concluding the discussion on ctDNA, the panelists agreed that the transition of ctDNA into a qualified biomarker requires a shift from exploratory pilots to standardized, prospective data collection, supported by cross-sector collaboration to harmonize analytical methods and clinical validation frameworks.
Session 2 – AI-Enabled Tumor Assessments: Opportunities and Challenges for Incorporation into Clinical Trials
Session 2 focused on the opportunities and challenges of incorporating AI-enabled tumor assessment tools into clinical trials to modernize radiographic measurements and develop novel volumetric endpoints. The session was moderated by Ariel Bourla (Johnson & Johnson), with panelists Felix Baldauf-Lenschen (Altis Labs), Lauren Brady (Genmab), Nathaniel Braman (Picture Health), Lia Ridout (Friends Advisory Advocate), Larry Schwartz (Memorial Sloan Kettering Cancer Center), and Alain Silk (Tempus AI).
Presentation by Larry Schwartz: AI-Enabled Tumor Assessment Tools
Limitations of Traditional RECIST and the Shift to AI-Enabled Assessments
Larry Schwartz detailed the evolution of the Response Evaluation Criteria in Solid Tumors (RECIST) using a “Princess and the Pea” analogy, explaining the historical necessity of RECIST for ensuring reproducibility while arguing that its unidimensional focus is insufficient for assessing the efficacy of modern therapies. He highlighted that simple measurements fail to distinguish pseudo-progression or to capture the phenotypic changes (e.g., vascularity and consistency) seen with immunotherapies and targeted agents. Schwartz observed that while a greater magnitude of tumor shrinkage has historically correlated with improved prognosis, the trial-level association between objective response and OS remains weaker than often appreciated. The limitations of manual lesion selection and unidimensional metrics have been exacerbated by the atypical response patterns of current treatment modalities, including antibody-drug conjugates (ADCs) and cell therapies, which call for more technologically robust volumetric assessments capable of capturing total tumor burden.
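As a simple illustration of why volumetric assessment can be more sensitive than a unidimensional sum of diameters, the sketch below compares the two readouts on hypothetical lesion measurements; the spherical approximation and the numbers are illustrative only, and AI-enabled tools would derive volumes from full segmentations rather than single diameters.

```python
# Minimal sketch contrasting a RECIST-style sum of longest diameters with a
# volumetric total-tumor-burden readout. Lesion measurements are hypothetical,
# and each lesion is approximated as a sphere for illustration only.
import numpy as np

baseline_diam_mm = np.array([32.0, 21.0, 15.0])   # longest diameters at baseline
followup_diam_mm = np.array([28.0, 20.0, 14.0])   # longest diameters at follow-up

# Unidimensional readout: percent change in the sum of longest diameters.
sld_change = (followup_diam_mm.sum() - baseline_diam_mm.sum()) / baseline_diam_mm.sum()

# Volumetric readout: percent change in total tumor volume (sphere approximation).
volume = lambda d: (np.pi / 6.0) * d ** 3
vol_change = (volume(followup_diam_mm).sum() - volume(baseline_diam_mm).sum()) \
             / volume(baseline_diam_mm).sum()

print(f"Sum-of-diameters change: {sld_change:+.1%}")  # about -9%
print(f"Total volume change:     {vol_change:+.1%}")  # about -28%, a larger signal
```

Because volume scales roughly with the cube of diameter, modest diameter changes translate into larger volumetric changes, which is part of the argument for volumetric and total-burden endpoints; whether that added sensitivity translates into better surrogacy is exactly what validation efforts would need to show.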
AI-Enabled Radiomics: Quantifying Sub-Visual Biological Phenotypes
AI-enabled tools offer the capability to perform automated, multi-lesion assessments and analyze radiomic features (e.g., internal consistency) that reflect biological changes occurring independently of physical tumor dimensions. For example, in gastrointestinal stromal tumors, a properly treated lesion may change in internal consistency without a corresponding change in physical size, representing a nuanced biological signal that traditional radiology reports cannot adequately quantify as a biomarker.
Kinetic Modeling and the G-Value: Enhancing Sensitivity in Efficacy Readouts
Schwartz presented evidence that tumor growth kinetics, summarized by the growth rate constant (the G-value), can provide a more sensitive supplemental measure of clinical activity than traditional PFS. This kinetic approach could permit the detection of therapeutic effects in substantially smaller patient cohorts.
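As a rough illustration of the kind of kinetic modeling referenced here, the sketch below fits a simple regression-growth model in which tumor burden is described by a decaying and a growing exponential, with the growth rate constant g serving as the kinetic readout. The serial measurements and starting values are hypothetical, and this is not the presenter’s implementation.

```python
# Minimal sketch of fitting a regression-growth model to serial tumor burden,
# f(t) = exp(-d*t) + exp(g*t) - 1, normalized to baseline burden.
# The measurements and starting values are hypothetical illustrations.
import numpy as np
from scipy.optimize import curve_fit

def regression_growth(t, d, g):
    """Fraction of baseline tumor burden at time t (days): decay plus regrowth."""
    return np.exp(-d * t) + np.exp(g * t) - 1.0

# Hypothetical serial tumor-burden measurements, normalized to baseline.
t_days = np.array([0.0, 30.0, 60.0, 90.0, 120.0, 180.0, 240.0])
burden = np.array([1.00, 0.72, 0.58, 0.55, 0.57, 0.68, 0.86])

(d_hat, g_hat), _ = curve_fit(regression_growth, t_days, burden,
                              p0=[0.01, 0.001], bounds=(0.0, 1.0))
print(f"Estimated decay rate d ≈ {d_hat:.4f}/day, growth rate g ≈ {g_hat:.4f}/day")
```

The appeal described in the session is that a per-patient growth rate can be estimated from relatively few on-treatment scans and compared across arms, potentially revealing a drug effect earlier and in smaller cohorts than waiting for progression events, though that claim would require the same kind of prospective validation discussed for other early endpoints.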
A Stepwise Roadmap for Regulatory Integration and AI Evolution
To facilitate regulatory adoption, Schwartz proposed a stepwise roadmap for AI integration, transitioning from human-in-the-loop assistance and adjudication to the independent assessment of raw imaging data from real-world sources. He concluded that while AI is a disruptive technology, it will not replace the radiologist. Rather, AI will evolve the radiologist’s role by providing earlier, more precise, and more comparable results across diverse clinical settings.
“Many of us recognize there are real challenges and deficiencies in RECIST as it stands today. The goal with AI tools is to address some of those challenges and develop something that’s more comprehensive, more reproducible, and ultimately more meaningful.”
- Larry Schwartz, Memorial Sloan Kettering Cancer Center
What We Heard from the Panel
Scaling Validation to the Context of Use (CoU)
Alain Silk and Felix Baldauf-Lenschen noted that regulatory acceptance is contingent upon transparency regarding how a tool is developed and how it performs compared to existing standards. Baldauf-Lenschen highlighted the FDA’s qualification of the first AI-based drug development tool, the AI-Based Histologic Measurement of NASH (AIM-NASH), which helps pathologists assess MASH (metabolic dysfunction-associated steatohepatitis). The AI-enabled tool functioned as a secondary reader, demonstrating that AI can reduce inter-reader variability and support more standardized assessment in clinical trials.
The panel agreed that AI should currently function in a human-in-the-loop capacity. Lia Ridout and Nathaniel Braman stressed that AI must be designed to complement clinical judgment rather than replace it, ensuring accountability and avoiding the workflow disruptions seen in early computer-aided diagnosis (CAD) attempts.
Radiology as a Source of Phenotypic Biomarkers
A major discussion theme was the shift in perception from radiology as a diagnostic monitoring tool to radiology as a rich source of phenotypic biomarkers.
Baldauf-Lenschen and Lauren Brady discussed how AI can quantify “sub-visual” phenotypic information, such as tumor vascularity, heterogeneity, and the tumor-associated microenvironment, that is difficult for the human eye to discern. Panelists noted that while RECIST is limited by its unidimensional focus and subjectivity in lesion selection, AI enables volumetric assessments and more comprehensive tracking of total tumor burden. This is particularly relevant for modern therapies (e.g., immunotherapies), where traditional metrics can struggle to distinguish treatment-related effects such as pseudo-progression.
Hurdles to Generalizability and Standardization
Despite the promise of AI, the panel identified several hurdles related to generalizability and standardization of these tools.
Brady raised concerns regarding whether an AI model validated for one disease setting (e.g., lung cancer) can maintain accuracy in another (e.g., pancreatic cancer) or across different mechanisms of action (MOA), such as ADCs versus cell therapies.
Silk and Baldauf-Lenschen advocated for the creation of harmonized, publicly available reference datasets to benchmark diverse AI tools against a common ground truth. Such transparency is necessary to move past the black box problem seen in AI models and ensure that both regulators and patients understand the basis for therapeutic decisions.
Vision for the Future: Multimodal Integration
Looking ahead, Schwartz and Brady envisioned a multimodal approach to precision medicine. By combining AI-enabled imaging with other biomarkers, such as ctDNA or digital pathology, the panel believes the industry can develop more sensitive, earlier readouts of clinical activity, ultimately reducing the necessary sample sizes for late-stage trials and accelerating patient access to promising treatments.
Session 3 – Modernizing Regulatory Frameworks: Future Policies & Priorities
Session 3 examined the intersection of rapid therapeutic innovation and the regulatory policy structures required to ensure scientific rigor, predictability, and patient access. The discussion took place against the backdrop of the upcoming PDUFA reauthorization cycle, which is a critical legislative cycle for advancing drug development regulatory policy and FDA’s capacity. The session was moderated by Jeff Allen (Friends of Cancer Research), with panelists Tala Fakhouri (Parexel), Michael Montalto (Amgen Inc.), John Stone (BGR Group), and Lowell Zeta (FDA).
FDA Strategic Priorities and Infrastructure
Lowell Zeta outlined the FDA’s current focus on delivering infrastructure guidances and programmatic updates intended to accelerate drug discovery and development within existing authorities. Key priorities include:
- Manufacturing Flexibility – The agency is exploring increased flexibility for small sub-populations (e.g., N=1 therapies) and the implementation of advanced manufacturing technologies to provide greater predictability for sponsors. This includes potential updates to Part 211 (Good Manufacturing Practice regulations) to modernize the “backbone framework” of pharmaceutical production.
- Operational Efficiencies – Internal efforts are focused on leveraging technology to support reviewers and strengthen post-market surveillance.
- Access and Affordability – Strategic focus remains on the ACNU (Additional Condition for Nonprescription Use) pathway and biosimilar guidance documents to foster market competition.
The PDUFA Reauthorization and Legislative Landscape
John Stone provided a legislative perspective on the upcoming 2027 PDUFA reauthorization, noting that draft commitment letters typically prioritize real-world evidence, adaptive clinical trial designs, biomarkers, and domestic manufacturing incentives. Stone observed that while industry and the FDA negotiate within current authorities, Congress often uses the reauthorization as a vehicle for “riders,” ancillary legislative provisions intended to advance more aggressive regulatory reforms that may have fallen short in the commitment letter negotiations.
The Biomarker Qualification Program (BQP) and Institutional Friction
Underutilization and Resource Gaps
The panel assessed the BQP established under the 21st Century Cures Act. Stone noted the BQP has been relatively underutilized, with only a small fraction of accepted projects (approximately 13-16%) reaching qualification. A primary historical friction point was the transition from a proprietary pathway, which offered high incentives for the industry, to a public domain framework that may lack the necessary proprietary protections to drive broad private-sector investment.
Operational Challenges
Michael Montalto highlighted that the BQP lacks dedicated resources and user-fee-tied timelines, often leading to protracted five-year qualification cycles. Moreover, sponsors often hesitate to include exploratory markers as secondary endpoints in pivotal trials for fear that failing exploratory data could negatively impact the regulatory assessment of the primary drug application.
Addressing “Death by Pilots” and AI Validation
Tala Fakhouri and Montalto addressed the “tsunami of innovation” in oncology platforms, such as ADCs, molecular glues, and degraders, which are outpacing traditional policy development. Fakhouri advocated for a modular qualification program that decouples the analytic validation of a method from the clinical validation for a specific disease. She noted that this would allow a qualified method to be plugged into various drug development programs, reducing the repetitive errors and resource depletion currently seen in “death by pilots,” where innovation occurs in isolated pockets, but fails to scale across the industry.
Multidisciplinary Expertise and Global Harmonization
The panel noted that AI-enabled tools often fall into a “no-man’s land” between the Center for Drug Evaluation and Research (CDER) and the Center for Devices and Radiological Health (CDRH), pointing to a need for a more integrated, multidisciplinary review team at the FDA.
Given that drug development is a global enterprise, Fakhouri emphasized that the FDA and EMA (European Medicines Agency) must find consistency through ICH (International Council for Harmonization) to reduce the uncertainty that discordant requirements create for multiregional clinical trials.
Enhancing Predictability and Consistency
The panel concluded that while flexible policies and guidance are a necessary first step, the next phase is ensuring predictability and consistency during implementation, particularly among different review divisions and during inspections. This requires sustained investment in upskilling the regulatory workforce and fostering pre-competitive spaces for data sharing to ensure that patient-centered, innovative endpoints can be successfully translated into regulatory use, thereby accelerating patient access to promising, potentially life-saving therapies.
“There’s this tension where you don’t want regulators to overregulate, because they should be promoting innovation, but at the same time there’s a need for concrete information on what validation would actually look like for AI-enabled outcomes. That tension is still there, and it’s going to require much more collaboration across the ecosystem.”
– Tala Fakhouri, Parexel
